Zixi "Oz" Li PRO
AI & ML interests
My research focuses on deep reasoning with small language models, Transformer architecture innovation, and knowledge distillation for efficient alignment and transfer.
Recent Activity
reacted to danielhanchen's post with 🔥 1 day ago
We collaborated with NVIDIA to teach you about Reinforcement Learning and RL environments. 💚 Learn:
• Why RL environments matter + how to build them
• When RL is better than SFT
• GRPO and RL best practices
• How verifiable rewards and RLVR work
Blog: https://unsloth.ai/blog/rl-environments
replied to their post 1 day ago
Arcade-3B — SmolReasoner
https://huggingface.co/NoesisLab/Arcade-3B
Arcade-3B is a 3B instruction-following and reasoning model built on SmolLM3-3B. It is the public release from the ARCADE project at NoesisLab, which investigates the State–Constraint Orthogonality Hypothesis: standard Transformer hidden states conflate factual content and reasoning structure in the same subspace, and explicitly decoupling them improves generalization.
updated a model 1 day ago
NoesisLab/Arcade-3B