Automated Publisher
1c09b30d22
📚 Auto-publish: Add/update 3 blog posts
Hugo Publish CI / build-and-deploy (push) Has started running
Generated on: Sun Aug 3 02:45:17 UTC 2025
Source: md-personal repository
2025-08-03 02:45:17 +00:00
Automated Publisher
73f53ff6b9
📚 Auto-publish: Add/update 2 blog posts
Hugo Publish CI / build-and-deploy (push) Successful in 3m10s
Generated on: Sun Aug 3 02:37:55 UTC 2025
Source: md-personal repository
2025-08-03 02:37:56 +00:00
38bbe8cbae
🗑️ (posts): remove unused image and its reference in markdown file
Hugo Publish CI / build-and-deploy (push) Successful in 42s
2025-08-02 19:24:41 -07:00
Automated Publisher
b6192ca3ca
📚 Auto-publish: Add/update 2 blog posts
Hugo Publish CI / build-and-deploy (push) Successful in 3m31s
Generated on: Sun Aug 3 01:47:39 UTC 2025
Source: md-personal repository
2025-08-03 01:47:39 +00:00
Automated Publisher
0b377b2189
📚 Auto-publish: Add/update 2 blog posts
Hugo Publish CI / build-and-deploy (push) Failing after 11m2s
Generated on: Sat Aug 2 18:07:06 PDT 2025
Source: md-personal repository
2025-08-02 18:07:06 -07:00
a3ccac4cd2
✨ (content): add new image file to posts directory
Hugo Publish CI / build-and-deploy (push) Successful in 16s
2025-08-02 15:49:50 -07:00
88cbb7efd5
✨ (posts): add deep dive into PPO for language models post
Hugo Publish CI / build-and-deploy (push) Successful in 14s
This commit introduces a new blog post detailing the Proximal Policy Optimization (PPO) algorithm as used in Reinforcement Learning from Human Feedback (RLHF) for Large Language Models (LLMs).
The post covers:
- The mapping of RL concepts to text generation.
- The roles of the Actor, Critic, and Reward Model.
- The use of Generalized Advantage Estimation (GAE) for stable credit assignment.
- The PPO clipped surrogate objective for safe policy updates.
- The importance of pretraining loss to prevent catastrophic forgetting.
- The full iterative training loop.
2025-08-02 15:46:24 -07:00
291f598d8c
Delete content/posts/credit_card.html
continuous-integration/drone Build is passing
2023-09-24 05:25:39 +00:00
Eric Liu
0794ee0bce
Add rootCA
2020-10-26 04:47:36 +00:00
Eric Liu
74b5002bff
Add credit card spending
2020-06-16 23:30:17 -07:00
Eric Liu
2f0990f161
initial commit
2019-02-05 05:18:26 +00:00