John Smith PRO
John6666
AI & ML interests
None yet
Recent Activity
reacted to sergiopaniego's post about 4 hours ago
We just released TRL v0.26.0!
It comes packed with updates:
> Agent training with tools in GRPO
> New CISPO & SAPO losses + reasoning rewards
> vLLM quantization in colocate mode
> Dataset shuffling in SFT
> Lots of NEW examples
> Tons of fixes and documentation improvements
updated a dataset about 19 hours ago
John6666/forum3
updated a dataset about 19 hours ago
John6666/knowledge_base_md_for_rag_1