📚 Auto-publish: Add/update 3 blog posts

Generated on: Sun Aug  3 02:45:17 UTC 2025
Source: md-personal repository
Automated Publisher
2025-08-03 02:45:17 +00:00
parent 73f53ff6b9
commit 1c09b30d22
3 changed files with 3 additions and 3 deletions


@@ -1,6 +1,6 @@
---
title: "A Deep Dive into PPO for Language Models"
-date: 2025-08-03T02:36:44
+date: 2025-08-03T02:45:10
draft: false
---
@@ -9,7 +9,7 @@ Large Language Models (LLMs) have demonstrated astonishing capabilities, but out
You may have seen diagrams like the one below, which outlines the RLHF training process. It can look intimidating, with a web of interconnected models, losses, and data flows.
-![[Pasted image 20250730232756.png]]
+![](/images/a-deep-dive-into-ppo-for-language-models/.png)
This post will decode that diagram, piece by piece. We'll explore the "why" behind each component, moving from high-level concepts to the deep technical reasoning that makes this process work.


@@ -1,6 +1,6 @@
---
title: "T5 - The Transformer That Zigged When Others Zagged - An Architectural Deep Dive"
-date: 2025-08-03T02:36:44
+date: 2025-08-03T02:45:10
draft: false
---

Binary file not shown. Size: 1.2 MiB