view article Article The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix Nov 3, 2025 • 53
Running 3.61k The Ultra-Scale Playbook 🌌 3.61k The ultimate guide to training LLM on large GPU Clusters
Running Featured 1.24k FineWeb: decanting the web for the finest text data at scale 🍷 1.24k Generate high-quality text data for LLMs using FineWeb
Running on CPU Upgrade Featured 2.75k The Smol Training Playbook 📚 2.75k The secrets to building world-class LLMs
DynaVis: Dynamically Synthesized UI Widgets for Visualization Editing Paper • 2401.10880 • Published Jan 19, 2024 • 1
SliceGPT: Compress Large Language Models by Deleting Rows and Columns Paper • 2401.15024 • Published Jan 26, 2024 • 73