Running RL 2 Office Document Task Environment • Explore office document RL tasks and model performance
bpHigh/gstar_assignment_2_prompt_modified_grpo_max_tokens_256_steps_120 2B • Updated Oct 12, 2025 • 1
bpHigh/CE_TOTAL_TRANSLATED_EXCEPT_MARATHI_SEMREL Text Classification • 0.3B • Updated Jan 31, 2024 • 5
bpHigh/CE_TOTAL_TRANSLATED_AUGMENTED_SEMREL Text Classification • 0.3B • Updated Jan 28, 2024 • 4