JudgeBoard: Benchmarking and Enhancing Small Language Models for Reasoning Evaluation Paper • 2511.15958 • Published Nov 20, 2025 • 1
VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks Paper • 2511.04662 • Published Nov 6, 2025 • 34
Building the Open Agent Ecosystem Together: Introducing OpenEnv Article • Oct 23, 2025 • 139
Consciousness in Artificial Intelligence: Insights from the Science of Consciousness Paper • 2308.08708 • Published Aug 17, 2023 • 5
Multi-Level Aware Preference Learning: Enhancing RLHF for Complex Multi-Instruction Tasks Paper • 2505.12845 • Published May 19, 2025 • 1
Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus Paper • 2411.12498 • Published Nov 19, 2024 • 2
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning Paper • 2410.02884 • Published Oct 3, 2024 • 54
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2, 2024 • 68
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model Paper • 2312.11370 • Published Dec 18, 2023 • 20
Prompting Is Programming: A Query Language for Large Language Models Paper • 2212.06094 • Published Dec 12, 2022 • 1
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies Paper • 2308.03188 • Published Aug 6, 2023 • 2