Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models Paper • 2603.25750 • Published 15 days ago • 33
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference Paper • 2603.25730 • Published 8 days ago • 48
RealMaster: Lifting Rendered Scenes into Photorealistic Video Paper • 2603.23462 • Published 10 days ago • 32
WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG Paper • 2603.23497 • Published 10 days ago • 90