sian cao

sonald

AI & ML interests

AI, big data, OS

Recent Activity

upvoted an article 8 days ago

Article

Deriving the DPO Loss from First Principles

9 days ago

upvoted an article 10 days ago

Article

Deriving the PPO Loss from First Principles

14 days ago

upvoted an article 13 days ago

Article

From GRPO to DAPO and GSPO: What, Why, and How

Aug 9, 2025

liked a Space 14 days ago

The Ultra-Scale Playbook

🌌

3.63k

The ultimate guide to training LLM on large GPU Clusters

upvoted 3 articles 15 days ago

Article

Efficient MultiModal Data Pipeline

Jul 8, 2025

Article

KV Cache from scratch in nanoVLM

Jun 4, 2025 • 108

Article

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers

Sep 11, 2025 • 177

upvoted an article 20 days ago

Article

Efficient LLM Pretraining: Packed Sequences and Masked Attention

Oct 7, 2024

liked a Space 20 days ago

The Smol Training Playbook

📚

2.81k

The secrets to building world-class LLMs

upvoted an article 21 days ago

Article

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

24 days ago • 104

upvoted an article 27 days ago

Article

Putting RL back in RLHF

Jun 12, 2024 • 109

upvoted a paper about 1 month ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23, 2025 • 283

liked a dataset about 1 month ago

PleIAs/SYNTH

Viewer • Updated Nov 11, 2025 • 68M • 23.2k • 218

upvoted 3 articles 2 months ago

Article

Why Did MiniMax M2 End Up as a Full Attention Model?

Oct 30, 2025

Article

Aligning to What? Rethinking Agent Generalization in MiniMax M2

Oct 30, 2025

Article

nanoVLM: The simplest repository to train your VLM in pure PyTorch

May 21, 2025 • 247

upvoted 4 articles 3 months ago

Article

Welcome GPT OSS, the new open-source model family from OpenAI!

Aug 5, 2025 • 508

Article

Learn the Hugging Face Kernel Hub in 5 Minutes

Jun 12, 2025 • 151

Article

Vision Language Models Explained

Apr 11, 2024 • 505

Article

You could have designed state of the art positional encoding

Nov 25, 2024 • 429
