DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research • Paper • 2511.19399 • Published Nov 24, 2025 • 60
WaterDrum: Watermarking for Data-centric Unlearning Metric • Paper • 2505.05064 • Published May 8, 2025 • 8
Understanding Domain Generalization: A Noise Robustness Perspective • Paper • 2401.14846 • Published Jan 26, 2024
Group-robust Sample Reweighting for Subpopulation Shifts via Influence Functions • Paper • 2503.07315 • Published Mar 10, 2025
Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions • Paper • 2406.04606 • Published Jun 7, 2024
Reinforcement Learning for Reasoning in Large Language Models with One Training Example • Paper • 2504.20571 • Published Apr 29, 2025 • 98