koutch/short_paper_llama_1.json_train_dpo_v3_train_no_think Text Generation • 8B • Updated about 3 hours ago
koutch/short_paper_llama_1.json_train_dpo_v2_train_no_think Text Generation • 8B • Updated about 3 hours ago
koutch/short_paper_qwen_1.json_train_dpo_v2_train_no_think Text Generation • 4B • Updated about 4 hours ago
koutch/short_paper_llama_1.json_train_dpo_v4_train_no_think Text Generation • 8B • Updated about 4 hours ago
koutch/short_paper_qwen_1.json_train_dpo_v4_train_no_think Text Generation • 4B • Updated about 5 hours ago
koutch/short_paper_qwen_1.json_train_dpo_v3_train_no_think Text Generation • 4B • Updated about 5 hours ago
koutch/short_paper_smol_1.json_train_dpo_v3_train_no_think Text Generation • 3B • Updated about 5 hours ago
koutch/short_paper_smol_1.json_train_dpo_v4_train_no_think Text Generation • 3B • Updated about 5 hours ago • 16
koutch/short_paper_smol_1.json_train_dpo_v2_train_no_think Text Generation • 3B • Updated about 5 hours ago
koutch/short_paper_llama_llama3.1-8b_train_sft_train_no_think Text Generation • 8B • Updated about 5 hours ago • 149