📚 Auto-publish: Add/update 3 blog posts

Generated on: Sun Aug  3 02:45:17 UTC 2025
Source: md-personal repository
Automated Publisher
2025-08-03 02:45:17 +00:00
parent 73f53ff6b9
commit 1c09b30d22
3 changed files with 3 additions and 3 deletions

@@ -1,6 +1,6 @@
 ---
 title: "A Deep Dive into PPO for Language Models"
-date: 2025-08-03T02:36:44
+date: 2025-08-03T02:45:10
 draft: false
 ---
@@ -9,7 +9,7 @@ Large Language Models (LLMs) have demonstrated astonishing capabilities, but out
 You may have seen diagrams like the one below, which outlines the RLHF training process. It can look intimidating, with a web of interconnected models, losses, and data flows.
-![[Pasted image 20250730232756.png]]
+![](/images/a-deep-dive-into-ppo-for-language-models/.png)
 This post will decode that diagram, piece by piece. We'll explore the "why" behind each component, moving from high-level concepts to the deep technical reasoning that makes this process work.

@@ -1,6 +1,6 @@
 ---
 title: "T5 - The Transformer That Zigged When Others Zagged - An Architectural Deep Dive"
-date: 2025-08-03T02:36:44
+date: 2025-08-03T02:45:10
 draft: false
 ---

Binary file not shown. Size: 1.2 MiB