📚 Auto-publish: Add/update 6 blog posts
All checks were successful
Hugo Publish CI / build-and-deploy (push) Successful in 17s

Generated on: Sat Jan 10 20:10:48 UTC 2026
Source: md-personal repository
Automated Publisher
2026-01-10 20:10:48 +00:00
parent f7528b364e
commit 13abf5792b
6 changed files with 9 additions and 9 deletions

@@ -77,7 +77,7 @@ It turned out to be a syntax error in my arguments passed to the `Trainer` (or r
### Pitfall #2: Stability vs. Noise
The loss curve was initially extremely erratic. The batch size on my GPU was limited (Physical Batch Size = 4).
**The Fix**: I implemented **Gradient Accumulation** (accumulating over 8 steps) to simulate an effective batch size of 32 (4 × 8). This reduced the noise in the gradient estimates and smoothed the loss curve significantly.
-![S3 File](http://localhost:4998/attachments/image-1b23344ea5541d156e5ac20823d12d7c6723b691.png?client=default&bucket=obsidian)
+![S3 File](/images/technical-deep-dive-llm-categorization/eedb3be8259a4a70aa7029b78a029364.png)
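The arithmetic behind gradient accumulation can be sketched in plain Python. This is a minimal illustration, not the post's training code: the one-parameter linear model, the synthetic data, and the helper names `grad_mse` and `accumulated_grad` are all assumptions made for the example. It checks that averaging the gradients of 8 micro-batches of 4 samples gives the same gradient as a single batch of 32.

```python
import random


def grad_mse(w, batch):
    """Gradient of the mean squared error for the toy model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size, accum_steps):
    """Accumulate micro-batch gradients, scaling each by 1 / accum_steps."""
    total = 0.0
    for step in range(accum_steps):
        start = step * micro_batch_size
        micro_batch = data[start:start + micro_batch_size]
        # Dividing by accum_steps turns the running sum into an average,
        # so the result matches one large batch of equal-sized micro-batches.
        total += grad_mse(w, micro_batch) / accum_steps
    return total


# Hypothetical synthetic data: y = 3x plus a little Gaussian noise.
random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(32)]
data = [(x, 3.0 * x + random.gauss(0.0, 0.1)) for x in xs]

w = 0.5
full_grad = grad_mse(w, data)  # one batch of 32
accum_grad = accumulated_grad(w, data, micro_batch_size=4, accum_steps=8)

print("full batch gradient: ", full_grad)
print("accumulated gradient:", accum_grad)
print("match:", abs(full_grad - accum_grad) < 1e-9)
```

With the Hugging Face `Trainer`, the corresponding settings would presumably be `per_device_train_batch_size=4` and `gradient_accumulation_steps=8` in `TrainingArguments`. The post does not show its exact configuration, so treat those values as an inference from the numbers quoted above.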
### Pitfall #3: Overfitting
With a small dataset (~2k samples), overfitting is a real risk. I employed a multi-layered defense strategy: