VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice Paper • 2601.05175 • Published 21 days ago • 34
AraLingBench A Human-Annotated Benchmark for Evaluating Arabic Linguistic Capabilities of Large Language Models Paper • 2511.14295 • Published Nov 18, 2025 • 72
Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine Paper • 2510.21614 • Published Oct 24, 2025 • 22
Multimodal Safety Evaluation in Generative Agent Social Simulations Paper • 2510.07709 • Published Oct 9, 2025 • 13
Mind-the-Glitch: Visual Correspondence for Detecting Inconsistencies in Subject-Driven Generation Paper • 2509.21989 • Published Sep 26, 2025 • 23
Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale Paper • 2509.14008 • Published Sep 17, 2025 • 88
Hala Collection A series of light-weight Arabic language models (instruction following + translation) and Arabic instruction dataset. • 8 items • Updated Sep 18, 2025 • 7