📚 Auto-publish: Add/update 6 blog posts
All checks were successful
Hugo Publish CI / build-and-deploy (push) Successful in 17s
Generated on: Sat Jan 10 20:10:48 UTC 2026 Source: md-personal repository
@@ -77,7 +77,7 @@ It turned out to be a syntax error in my arguments passed to the `Trainer` (or r
### Pitfall #2: Stability vs. Noise
The loss curve was initially extremely erratic. GPU memory limited the physical batch size to 4.

**The Fix**: I implemented **Gradient Accumulation** (accumulating over 8 steps) to simulate an effective batch size of 32 (4 × 8). Averaging gradients over more samples per update reduced the gradient noise and smoothed the loss curve significantly.
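
To make the arithmetic concrete, here is a minimal pure-Python sketch of gradient accumulation. It is not the training code from this post: the 1-D linear model, the toy data, and the names `grad_mse`, `micro_batch_size`, and `accumulation_steps` are assumptions chosen for illustration. It shows that averaging the gradients of 8 micro-batches of 4 samples gives the same gradient as one batch of 32.

```python
import random


def grad_mse(w, batch):
    """Gradient of mean squared error for the 1-D model y = w * x over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


random.seed(0)

# Hypothetical toy data: 32 (x, y) pairs scattered around y = 3x.
xs = [random.uniform(-1, 1) for _ in range(32)]
data = [(x, 3.0 * x + random.gauss(0, 0.1)) for x in xs]

w = 0.5                   # current weight (arbitrary starting point)
micro_batch_size = 4      # physical batch size mentioned in the post
accumulation_steps = 8    # accumulation steps mentioned in the post

# Accumulate scaled micro-batch gradients; the optimizer would step only
# once, after the final micro-batch.
accumulated = 0.0
for step in range(accumulation_steps):
    start = step * micro_batch_size
    micro_batch = data[start:start + micro_batch_size]
    accumulated += grad_mse(w, micro_batch) / accumulation_steps

# Gradient computed in one pass over all 32 samples, for comparison.
full_batch = grad_mse(w, data)
effective_batch_size = micro_batch_size * accumulation_steps

print(effective_batch_size, abs(accumulated - full_batch) < 1e-9)
```

If the `Trainer` mentioned above is the Hugging Face one (an assumption on my part), the same effect is usually configured through the `gradient_accumulation_steps` argument of `TrainingArguments` rather than written by hand.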

### Pitfall #3: Overfitting
With a small dataset (~2k samples), overfitting is a real risk. I employed a multi-layered defense strategy: