ToolOmni: Enabling Open-World Tool Use via Agentic learning with Proactive Retrieval and Grounded Execution Paper • 2604.13787 • Published Apr 15 • 1
jina-embeddings-v5-omni: Text-Geometry-Preserving Multimodal Embeddings via Frozen-Tower Composition Paper • 2605.08384 • Published 15 days ago • 10
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction Paper • 2605.05242 • Published 20 days ago • 113
OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories Paper • 2605.04036 • Published 18 days ago • 66
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published 17 days ago • 99
WindowsWorld: A Process-Centric Benchmark of Autonomous GUI Agents in Professional Cross-Application Environments Paper • 2604.27776 • Published 23 days ago • 15
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence Paper • 2603.13398 • Published Mar 11 • 155
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published Mar 17 • 98
F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World Paper • 2603.19223 • Published Mar 19 • 33
Supervised Fine-Tuning or Contrastive Learning? Towards Better Multimodal LLM Reranking Paper • 2510.14824 • Published Oct 16, 2025 • 2
Encoders vs Decoders: the Ettin Suite Collection A collection of SOTA, open-data, paired encoder-only and decoder only models ranging from 17M params to 1B. See the paper at https://arxiv.org/abs/250 • 30 items • Updated Mar 2 • 29
Scaling Language-Centric Omnimodal Representation Learning Paper • 2510.11693 • Published Oct 13, 2025 • 108
MM-CondChain: A Programmatically Verified Benchmark for Visually Grounded Deep Compositional Reasoning Paper • 2603.12266 • Published Mar 12 • 19