Open to Work

Baohao Liao

baohao

https://baohaoliao.github.io/

AI & ML interests

NLP

Recent Activity

updated a model 2 days ago

baohao/agentic_opd_data

published a model 2 days ago

baohao/agentic_opd_data

published a dataset 2 days ago

baohao/agentic_opd_data

Organizations

upvoted a collection 3 months ago

SAGE

Collection

Self-Hinting Language Models Enhance Reinforcement Learning • 23 items • Updated Mar 27 • 3

upvoted a paper 3 months ago

Self-Hinting Language Models Enhance Reinforcement Learning

Paper • 2602.03143 • Published Feb 3 • 31

upvoted a paper 6 months ago

3-in-1: 2D Rotary Adaptation for Efficient Finetuning, Efficient Batching and Composability

Paper • 2409.00119 • Published Aug 28, 2024 • 1

upvoted a collection 7 months ago

Reinforce-Ada

Collection

Training & test sets and finetuned models • 19 items • Updated Oct 26, 2025 • 3

upvoted a paper 7 months ago

Reinforce-Ada: An Adaptive Sampling Framework for Reinforce-Style LLM Training

Paper • 2510.04996 • Published Oct 6, 2025 • 16

upvoted an article 8 months ago

Article

Gaia2 and ARE: Empowering the community to study agents

clefourrier, gregmialz, mlcu, mortimerp9, XciD, tfrere, evijit, RomainFroger, dheeraj7596, CarolinePascal, upiter • Sep 22, 2025 • 134

upvoted a paper 11 months ago

Lost at the Beginning of Reasoning

Paper • 2506.22058 • Published Jun 27, 2025 • 1

upvoted a paper 12 months ago

Fractured Chain-of-Thought Reasoning

Paper • 2505.12992 • Published May 19, 2025 • 23

upvoted a paper about 1 year ago

Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation

Paper • 2505.06027 • Published May 9, 2025 • 18

upvoted an article about 1 year ago

Article

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

NormalUhr • Feb 11, 2025 • 121

upvoted a paper over 1 year ago

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Paper • 2501.19324 • Published Jan 31, 2025 • 39
