MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding Paper • 2603.22458 • Published 15 days ago • 132
MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale Paper • 2604.04771 • Published 1 day ago • 82
DocDancer: Towards Agentic Document-Grounded Information Seeking Paper • 2601.05163 • Published Jan 8 • 7
The Trinity of Consistency as a Defining Principle for General World Models Paper • 2602.23152 • Published Feb 26 • 201
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery Paper • 2602.08990 • Published Feb 9 • 78
OpenDataArena/MMFineReason-SFT-586K-Qwen3-VL-235B-Thinking Viewer • Updated Feb 3 • 586k • 749 • 6
MMFineReason Collection High-quality STEM reasoning dataset for Multimodal LLM post-training. • 8 items • Updated 7 days ago • 22
MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods Paper • 2601.21821 • Published Jan 29 • 62
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing Paper • 2509.22186 • Published Sep 26, 2025 • 155
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing Paper • 2509.22186 • Published Sep 26, 2025 • 155