Shariar Kabir
shariar076
AI & ML interests
NLP, Mech Interp, Data Science
Organizations
Stats and LLM
- Statistical Methods in Generative AI
  Paper • 2509.07054 • Published • 11
- Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
  Paper • 2509.07980 • Published • 105
- Agent Learning via Early Experience
  Paper • 2510.08558 • Published • 276
- GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms
  Paper • 2511.17592 • Published • 121
models 48
shariar076/Llama-3.1-8B-DPO-0R100L
Text Generation • 8B • Updated
shariar076/Llama-3.1-8B-DPO-25R75L
Text Generation • 8B • Updated • 1
shariar076/Llama-3.1-8B-DPO-50R50L
Text Generation • 8B • Updated
shariar076/Llama-3.1-8B-DPO-75R25L
Text Generation • 8B • Updated
shariar076/Llama-3.1-8B-DPO-100R0L
Text Generation • 8B • Updated • 2
shariar076/Llama-3.1-8B-Instruct-DPO-0R100L-PoliTune
Text Generation • 8B • Updated • 1
shariar076/Llama-3.1-8B-Instruct-DPO-25R75L-PoliTune
Text Generation • 8B • Updated
shariar076/Llama-3.1-8B-Instruct-DPO-50R50L-PoliTune
Text Generation • 8B • Updated
shariar076/Llama-3.1-8B-Instruct-DPO-75R25L-PoliTune
Text Generation • 8B • Updated
shariar076/Llama-3.1-8B-Instruct-DPO-100R0L-PoliTune
Text Generation • 8B • Updated • 1
datasets 0
None public yet