Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond Paper • 2503.10460 • Published Mar 13, 2025 • 30
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated Dec 2, 2025 • 149
Article The Engineering Handbook for GRPO + LoRA with Verl: Training Qwen2.5 on Multi-GPU Jan 2 • 12
LLaVA-Video Collection Models focused on video understanding (previously known as LLaVA-NeXT-Video). • 8 items • Updated Feb 21, 2025 • 65
SmolLM3 evaluation datasets Collection Datasets to decontaminate the post-training mixtures against. Use the subset and column values described per entry • 13 items • Updated Jul 8, 2025 • 8
SmolLM3 pretraining datasets Collection datasets used in SmolLM3 pretraining • 15 items • Updated Aug 12, 2025 • 45
INTELLECT-2 Collection INTELLECT-2 is a 32 billion parameter language model with globally distributed reinforcement learning. • 3 items • Updated Oct 7, 2025 • 26
Article Preserving Agency: Why AI Safety Needs Community, Not Corporate Control Sep 29, 2025 • 10
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published Jun 12, 2024 • 71
Cosmos-Reason1 Collection ⚠️ The latest version of Cosmos Reason is now live! 👉 https://huggingface.co/collections/nvidia/cosmos-reason2 • 8 items • Updated 3 days ago • 40
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9, 2025 • 779
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics Paper • 2506.04308 • Published Jun 4, 2025 • 43
Article Interactive Tools for machine learning, deep learning, and math May 26, 2025 • 47
AdaptThink: Reasoning Models Can Learn When to Think Paper • 2505.13417 • Published May 19, 2025 • 83
AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning Paper • 2505.11896 • Published May 17, 2025 • 58
Physical AI Collection Collection of open, commercial-grade datasets for physical AI developers • 25 items • Updated 3 days ago • 117
Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents Paper • 2505.02156 • Published May 4, 2025 • 18