Wikontic: Constructing Wikidata-Aligned, Ontology-Aware Knowledge Graphs with Large Language Models Paper • 2512.00590 • Published Nov 29, 2025 • 45
Multimodal Evaluation of Russian-language Architectures Paper • 2511.15552 • Published Nov 19, 2025 • 78
Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures Paper • 2510.24081 • Published Oct 28, 2025 • 18
GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms Paper • 2511.17592 • Published Nov 17, 2025 • 118
Emergent Misalignment via In-Context Learning: Narrow in-context examples can produce broadly misaligned LLMs Paper • 2510.11288 • Published Oct 13, 2025 • 48
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA Paper • 2510.04849 • Published Oct 6, 2025 • 114
OrtSAE: Orthogonal Sparse Autoencoders Uncover Atomic Features Paper • 2509.22033 • Published Sep 26, 2025 • 18
The Rogue Scalpel: Activation Steering Compromises LLM Safety Paper • 2509.22067 • Published Sep 26, 2025 • 27
When Punctuation Matters: A Large-Scale Comparison of Prompt Robustness Methods for LLMs Paper • 2508.11383 • Published Aug 15, 2025 • 41
RiemannLoRA: A Unified Riemannian Framework for Ambiguity-Free LoRA Optimization Paper • 2507.12142 • Published Jul 16, 2025 • 36
T-LoRA: Single Image Diffusion Model Customization Without Overfitting Paper • 2507.05964 • Published Jul 8, 2025 • 119
Inverse-and-Edit: Effective and Fast Image Editing by Cycle Consistency Models Paper • 2506.19103 • Published Jun 23, 2025 • 42
Geopolitical biases in LLMs: what are the "good" and the "bad" countries according to contemporary language models Paper • 2506.06751 • Published Jun 7, 2025 • 71
Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA Paper • 2505.21115 • Published May 27, 2025 • 140
Through the Looking Glass: Common Sense Consistency Evaluation of Weird Images Paper • 2505.07704 • Published May 12, 2025 • 29