view article Article Community Evals: Because we're done trusting black-box leaderboards over the community +5 30 days ago • 85
Flan-T5 release Collection The Flan-T5 covers 4 checkpoints of different sizes each time. It also includes upgrades versions trained using Universal sampling • 7 items • Updated Jul 10, 2025 • 33
Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intellegence • 5 items • Updated Jan 27 • 172
gpt-oss-safeguard Collection gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are safety reasoning models built-upon gpt-oss • 2 items • Updated Oct 29, 2025 • 64
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 43 items • Updated 3 days ago • 697