view article Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 12 days ago • 838
Emergent Social Intelligence Risks in Generative Multi-Agent Systems Paper • 2603.27771 • Published 15 days ago • 51
Gen-Searcher: Reinforcing Agentic Search for Image Generation Paper • 2603.28767 • Published 14 days ago • 57
Vega: Learning to Drive with Natural Language Instructions Paper • 2603.25741 • Published 18 days ago • 6
Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought Paper • 2603.22847 • Published 21 days ago • 26
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published 27 days ago • 94
EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation Paper • 2603.12267 • Published Mar 12 • 13
Beyond Imitation: Reinforcement Learning for Active Latent Planning Paper • 2601.21598 • Published Jan 29 • 10
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published Dec 23, 2025 • 62
Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies Paper • 2512.19673 • Published Dec 22, 2025 • 66
Reinforcement Learning for Self-Improving Agent with Skill Library Paper • 2512.17102 • Published Dec 18, 2025 • 42
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding Paper • 2512.13586 • Published Dec 15, 2025 • 93
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated Dec 2, 2025 • 164
T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground Paper • 2512.10430 • Published Dec 11, 2025 • 119
Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO Paper • 2511.13288 • Published Nov 17, 2025 • 19
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs Paper • 2506.14245 • Published Jun 17, 2025 • 45