Zixi "Oz" Li PRO

OzTianlu

https://github.com/lizixi-0x2F

lizixi-0x2F

AI & ML interests

My research focuses on deep reasoning with small language models, Transformer architecture innovation, and knowledge distillation for efficient alignment and transfer.

Recent Activity

reacted to danielhanchen's post with 🔥 1 day ago

We collaborated with NVIDIA to teach you about Reinforcement Learning and RL environments. 💚

Learn:
• Why RL environments matter + how to build them
• When RL is better than SFT
• GRPO and RL best practices
• How verifiable rewards and RLVR work

Blog: https://unsloth.ai/blog/rl-environments

replied to their post 1 day ago

Arcade-3B: SmolReasoner
https://huggingface.co/NoesisLab/Arcade-3B

Arcade-3B is a 3B instruction-following and reasoning model built on SmolLM3-3B. It is the public release from the ARCADE project at NoesisLab, which investigates the State–Constraint Orthogonality Hypothesis: standard Transformer hidden states conflate factual content and reasoning structure in the same subspace, and explicitly decoupling them improves generalization.

updated a model 1 day ago

NoesisLab/Arcade-3B

OzTianlu's models

None public yet