Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6, 2024
Defeating the trainer-generator precision mismatch in TRL
Embarrassingly Simple Self-Distillation Improves Code Generation Paper • 2604.01193 • Published 22 days ago