Li Dong
unilm
AI & ML interests
Language Model Pre-Training
Recent Activity
authored a paper about 10 hours ago
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems
authored a paper about 10 hours ago
Towards Stable and Effective Reinforcement Learning for Mixture-of-Experts
authored a paper about 10 hours ago
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge