📚 Auto-publish: Add/update 4 blog posts
All checks were successful
Hugo Publish CI / build-and-deploy (push) Successful in 37s

Generated on: Sun Aug  3 03:10:45 UTC 2025
Source: md-personal repository
This commit is contained in:
Automated Publisher
2025-08-03 03:10:45 +00:00
parent 38bbe8cbae
commit ec6f60a996
4 changed files with 4 additions and 3 deletions

View File

@@ -0,0 +1 @@
Pasted image 20250730232756.png|.png

View File

@@ -1,6 +1,6 @@
 ---
 title: "A Deep Dive into PPO for Language Models"
-date: 2025-08-03T01:47:10
+date: 2025-08-03T03:10:41
 draft: false
 ---
@@ -9,7 +9,7 @@ Large Language Models (LLMs) have demonstrated astonishing capabilities, but out
 You may have seen diagrams like the one below, which outlines the RLHF training process. It can look intimidating, with a web of interconnected models, losses, and data flows.
-![[Pasted image 20250730232756.png]]
+![](/images/a-deep-dive-into-ppo-for-language-models/.png)
 This post will decode that diagram, piece by piece. We'll explore the "why" behind each component, moving from high-level concepts to the deep technical reasoning that makes this process work.
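The hunk above shows the publisher rewriting an Obsidian wiki-style embed (`![[Pasted image 20250730232756.png]]`) into a Hugo markdown image link under `/images/<post-slug>/` — and the resulting target is the degenerate `.png`, suggesting the filename is lost during the rewrite. The actual publisher script is not part of this commit; the following is a hypothetical sketch of such a rewrite (the `slugify` and `rewrite_embeds` names are illustrative, not from the source), with the filename preserved:

```python
import re

def slugify(name: str) -> str:
    # Lowercase and collapse non-alphanumeric runs into single hyphens.
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

def rewrite_embeds(markdown: str, post_slug: str) -> str:
    """Rewrite Obsidian embeds ![[file.png]] to Hugo image links.

    Keeping the slugified file stem avoids the empty-name bug visible
    in this diff, where the target collapses to just ".png".
    """
    def repl(m: re.Match) -> str:
        stem, _, ext = m.group(1).rpartition(".")
        return f"![](/images/{post_slug}/{slugify(stem)}.{ext})"
    return re.sub(r"!\[\[([^\]]+)\]\]", repl, markdown)

src = "![[Pasted image 20250730232756.png]]"
print(rewrite_embeds(src, "a-deep-dive-into-ppo-for-language-models"))
# -> ![](/images/a-deep-dive-into-ppo-for-language-models/pasted-image-20250730232756.png)
```

Under this reading, the `Pasted image 20250730232756.png|.png` mapping file added earlier in the commit records the original-to-published filename pair, with the published half empty.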

View File

@@ -1,6 +1,6 @@
 ---
 title: "T5 - The Transformer That Zigged When Others Zagged - An Architectural Deep Dive"
-date: 2025-08-03T01:47:10
+date: 2025-08-03T03:10:41
 draft: false
 ---

Binary file not shown.

After: Size: 1.2 MiB