📚 Auto-publish: Add/update 6 blog posts
All checks were successful
Hugo Publish CI / build-and-deploy (push) Successful in 58s

Generated on: Thu Jan  8 18:13:13 UTC 2026
Source: md-personal repository
Automated Publisher
2026-01-08 18:13:13 +00:00
parent 3b1396d814
commit f7528b364e
6 changed files with 9 additions and 9 deletions

@@ -77,7 +77,7 @@ It turned out to be a syntax error in my arguments passed to the `Trainer` (or r
### Pitfall #2: Stability vs. Noise
The loss curve was initially extremely erratic. The batch size on my GPU was limited (Physical Batch Size = 4).
**The Fix**: I implemented **Gradient Accumulation** (accumulating gradients over 8 steps) to simulate an effective batch size of 32 (4 × 8). This averaged out much of the gradient noise and made the loss curve considerably smoother.
![S3 File](/images/technical-deep-dive-llm-categorization/eedb3be8259a4a70aa7029b78a029364.png)
![S3 File](http://localhost:4998/attachments/image-1b23344ea5541d156e5ac20823d12d7c6723b691.png?client=default&bucket=obsidian)
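The arithmetic behind the fix can be checked with a small sketch. This is not the post's training code: it is a toy in plain Python, assuming a hypothetical one-parameter model `y_hat = w * x` with mean squared error and random illustrative data. It shows that averaging the gradients of 8 micro-batches of 4 gives the same gradient as one batch of 32, provided each micro-batch gradient is scaled by the number of accumulation steps.

```python
import random

PHYSICAL_BATCH_SIZE = 4   # from the post: what fits on the GPU
ACCUMULATION_STEPS = 8    # from the post: micro-batches per optimizer step
EFFECTIVE_BATCH_SIZE = PHYSICAL_BATCH_SIZE * ACCUMULATION_STEPS  # 32


def mse_gradient(w, batch):
    """Gradient of mean squared error for the toy model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_gradient(w, data):
    """Sum scaled micro-batch gradients, as an accumulating loop would."""
    grad = 0.0
    for step in range(ACCUMULATION_STEPS):
        start = step * PHYSICAL_BATCH_SIZE
        micro_batch = data[start:start + PHYSICAL_BATCH_SIZE]
        # Divide by the step count so the running sum is a mean, not a total.
        grad += mse_gradient(w, micro_batch) / ACCUMULATION_STEPS
    return grad


rng = random.Random(0)
# Hypothetical toy data: 32 (x, y) pairs, used only for illustration.
data = [(rng.uniform(-1, 1), rng.uniform(-1, 1))
        for _ in range(EFFECTIVE_BATCH_SIZE)]
w = 0.5  # arbitrary starting weight

full_batch = mse_gradient(w, data)          # one pass over all 32 samples
accumulated = accumulated_gradient(w, data)  # 8 passes over 4 samples each
```

Up to floating-point rounding, `full_batch` and `accumulated` agree, which is why the optimizer sees the less noisy gradient of a 32-sample batch while only 4 samples are in memory at a time. If the `Trainer` mentioned earlier is Hugging Face's (an assumption; this excerpt does not say), the same setup is usually expressed as `TrainingArguments(per_device_train_batch_size=4, gradient_accumulation_steps=8)`.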
### Pitfall #3: Overfitting
With a small dataset (~2k samples), overfitting is a real risk. I employed a multi-layered defense strategy: