MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping • Paper • 2604.08364 • Published 6 days ago • 95
When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models • Paper • 2604.08546 • Published 6 days ago • 112
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents • Paper • 2604.07430 • Published 7 days ago • 175
ClawBench: Can AI Agents Complete Everyday Online Tasks? • Paper • 2604.08523 • Published 6 days ago • 253
Training a Student Expert via Semi-Supervised Foundation Model Distillation • Paper • 2604.03841 • Published 11 days ago • 10
ViVa: A Video-Generative Value Model for Robot Reinforcement Learning • Paper • 2604.08168 • Published 6 days ago • 17
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver • Paper • 2604.08377 • Published 6 days ago • 273
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability • Paper • 2604.06628 • Published 7 days ago • 309
CUE-R: Beyond the Final Answer in Retrieval-Augmented Generation • Paper • 2604.05467 • Published 8 days ago • 7
Can Natural Image Autoencoders Compactly Tokenize fMRI Volumes for Long-Range Dynamics Modeling? • Paper • 2604.03619 • Published 11 days ago • 8
Watch Before You Answer: Learning from Visually Grounded Post-Training • Paper • 2604.05117 • Published 9 days ago • 35
Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision • Paper • 2604.04934 • Published 9 days ago • 42
Beyond Accuracy: Unveiling Inefficiency Patterns in Tool-Integrated Reasoning • Paper • 2604.05404 • Published 8 days ago • 41
ThinkTwice: Jointly Optimizing Large Language Models for Reasoning and Self-Refinement • Paper • 2604.01591 • Published 13 days ago • 40
GBQA: A Game Benchmark for Evaluating LLMs as Quality Assurance Engineers • Paper • 2604.02648 • Published 12 days ago • 45
ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation • Paper • 2604.03922 • Published 10 days ago • 53
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents • Paper • 2604.06132 • Published 8 days ago • 114
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding • Paper • 2604.05015 • Published 9 days ago • 232
BidirLM: From Text to Omnimodal Bidirectional Encoders by Adapting and Composing Causal LLMs • Paper • 2604.02045 • Published 13 days ago • 33