BartBanaFinal

This model is a fine-tuned version of IAmSkyDra/BARTBana_v5 on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9120
  • Sacrebleu: 5.0981
  • Chrf++: 18.4403

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 256
  • eval_batch_size: 256
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 512
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • training_steps: 2000
  • mixed_precision_training: Native AMP
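The effective batch size and learning-rate decay implied by the settings above can be sketched in a few lines. This is an illustration only, not the actual training script; it assumes a linear schedule with no warmup that decays to zero at the final step logged in the results table (74268), and the helper names are hypothetical.

```python
# Sketch (assumed, not the actual training code): effective batch size and
# the linear LR schedule implied by lr_scheduler_type="linear" with no warmup.

BASE_LR = 2e-05
TRAIN_BATCH_SIZE = 256
GRAD_ACCUM_STEPS = 2
TOTAL_STEPS = 74268  # assumption: final step reached in the results table


def effective_batch_size(per_device: int, accum: int) -> int:
    """Optimizer-step batch size = per-device batch * accumulation steps."""
    return per_device * accum


def linear_lr(step: int, base_lr: float = BASE_LR, total: int = TOTAL_STEPS) -> float:
    """Decay linearly from base_lr at step 0 down to 0 at the final step."""
    return base_lr * max(0.0, (total - step) / total)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))  # 512
print(linear_lr(0))  # 2e-05
print(linear_lr(TOTAL_STEPS))  # 0.0
```

This reproduces the listed total_train_batch_size of 512 (256 × 2 gradient-accumulation steps).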

Training results

| Training Loss | Epoch  | Step  | Chrf++  | Validation Loss | Sacrebleu |
|:-------------:|:------:|:-----:|:-------:|:---------------:|:---------:|
| No log        | 0      | 0     | 11.1997 | 74.5460         | 0.9579    |
| 0.1371        | 0.0269 | 2000  | 7.5555  | 1.4283          | 0.1498    |
| 0.1246        | 0.0539 | 4000  | 8.2284  | 1.4896          | 0.3342    |
| 0.1192        | 0.0808 | 6000  | 10.0258 | 1.4069          | 0.5165    |
| 0.1232        | 0.1077 | 8000  | 9.0570  | 1.2832          | 0.4636    |
| 0.1281        | 0.1346 | 10000 | 9.6671  | 1.1653          | 0.8182    |
| 0.1166        | 0.1616 | 12000 | 11.0435 | 1.2052          | 0.8000    |
| 0.1365        | 0.1885 | 14000 | 11.6117 | 1.2694          | 1.0365    |
| 0.1369        | 0.2154 | 16000 | 11.8559 | 1.1566          | 1.5111    |
| 0.1435        | 0.2424 | 18000 | 10.1954 | 1.1604          | 1.1035    |
| 0.1309        | 0.2693 | 20000 | 12.1724 | 1.1573          | 1.8697    |
| 0.1429        | 0.2962 | 22000 | 11.8521 | 1.1820          | 1.3905    |
| 0.1479        | 0.3232 | 24000 | 12.8547 | 1.0645          | 1.9644    |
| 0.1033        | 0.3501 | 26000 | 12.3747 | 1.1773          | 1.7998    |
| 0.1068        | 0.3770 | 28000 | 11.6427 | 1.2409          | 1.3029    |
| 0.1139        | 0.4039 | 30000 | 13.7875 | 1.1162          | 2.3469    |
| 0.1287        | 0.4309 | 32000 | 12.3235 | 1.1047          | 1.5293    |
| 0.1369        | 0.4578 | 34000 | 12.0428 | 1.1774          | 1.7930    |
| 0.1338        | 0.4847 | 36000 | 12.5997 | 1.1800          | 1.8025    |
| 0.1398        | 0.5117 | 38000 | 13.9852 | 1.1218          | 2.1374    |
| 0.1793        | 0.5386 | 40000 | 14.9792 | 1.0166          | 2.8086    |
| 0.1556        | 0.5655 | 42000 | 14.8273 | 1.0475          | 2.8293    |
| 0.1563        | 0.5924 | 44000 | 14.8104 | 1.0428          | 2.5901    |
| 0.1373        | 0.6194 | 46000 | 14.0005 | 1.0972          | 2.3887    |
| 0.1686        | 0.6463 | 48000 | 14.6327 | 1.0907          | 2.5002    |
| 0.2061        | 0.6732 | 50000 | 17.3738 | 0.9981          | 3.1007    |
| 0.175         | 0.7002 | 52000 | 17.6129 | 0.9794          | 4.3472    |
| 0.2017        | 0.7271 | 54000 | 15.4695 | 0.9891          | 3.2719    |
| 0.1949        | 0.7540 | 56000 | 16.5992 | 0.9531          | 4.4438    |
| 0.2277        | 0.7810 | 58000 | 18.9529 | 0.9621          | 4.6969    |
| 0.2273        | 0.8079 | 60000 | 17.1567 | 0.9442          | 4.5503    |
| 0.2572        | 0.8348 | 62000 | 17.4794 | 0.9348          | 4.5176    |
| 0.2874        | 0.8617 | 64000 | 17.2450 | 0.9010          | 4.5275    |
| 0.2548        | 0.8887 | 66000 | 18.4403 | 0.9120          | 5.0981    |
| 0.3086        | 0.9156 | 68000 | 16.9905 | 0.9026          | 4.4480    |
| 0.4937        | 0.9425 | 70000 | 16.6739 | 0.8929          | 4.3731    |
| 0.6425        | 0.9695 | 72000 | 17.5479 | 0.8724          | 4.6322    |
| 0.8591        | 0.9964 | 74000 | 15.8401 | 0.8449          | 4.6105    |
| 0.9277        | 1.0    | 74268 | 15.9720 | 0.8437          | 4.6162    |
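The headline evaluation numbers (Loss 0.9120, SacreBLEU 5.0981, Chrf++ 18.4403) correspond to the step-66000 evaluation rather than the final checkpoint. A minimal sketch of selecting the best row by Chrf++ from the log (only the last few evaluations reproduced; the tuple layout is an assumption for illustration):

```python
# Sketch: pick the evaluation with the best Chrf++ from the training log.
# Rows are (step, chrf_pp, val_loss, sacrebleu), copied from the table above
# (last few evaluations only, for brevity).
rows = [
    (64000, 17.2450, 0.9010, 4.5275),
    (66000, 18.4403, 0.9120, 5.0981),
    (68000, 16.9905, 0.9026, 4.4480),
    (70000, 16.6739, 0.8929, 4.3731),
    (74268, 15.9720, 0.8437, 4.6162),
]

best = max(rows, key=lambda r: r[1])  # maximize Chrf++
print(best)  # (66000, 18.4403, 0.9120, 5.0981)
```

Note that the final checkpoint has the lowest validation loss (0.8437) but not the best Chrf++ or SacreBLEU, so which checkpoint counts as "best" depends on the metric chosen.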

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.9.0+cu128
  • Datasets 4.4.1
  • Tokenizers 0.22.1
