What’s the Best Way to Fine-Tune a Transformer Model on a Custom Dataset Using the Transformers Library?

Hi everyone, I’m Prakash Hinduja from Switzerland. I’m currently exploring fine-tuning a pre-trained Transformer model (like BERT or DistilBERT) on a custom text classification dataset, using the Hugging Face Transformers library along with 🤗 Datasets and the Trainer API.

I’ve followed a few tutorials, but I’m still a bit unsure about best practices and would appreciate some guidance. Specifically:

What’s the recommended way to prepare and tokenize a CSV dataset with custom labels for classification?

Should I use Trainer or accelerate for training at scale (especially on Colab Pro or local GPU)?

How do I handle imbalanced datasets or apply weighted loss functions in the Trainer API?

What’s the best way to evaluate model performance (e.g. F1-score, precision, recall) after each epoch?

Any tips for saving and sharing the model back to the Hugging Face Hub?

I’d also love to see any code snippets, notebooks, or examples that helped you during your own fine-tuning projects.

Thanks a lot for your help!
Prakash Hinduja


The scope of the question is too broad, so I think it would be quicker to read the step-by-step tutorial and ask questions only about the parts you don’t understand…

Fine-tuning and General Info.

Evaluation
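For per-epoch metrics, the usual route is a `compute_metrics` function passed to `Trainer`. A sketch using scikit-learn (the logits and labels at the bottom are made-up stand-ins for real model output):

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Convert logits to class predictions and report accuracy/precision/recall/F1."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical logits for 4 examples, 2 classes (stand-in for real model output)
logits = np.array([[2.0, 0.1], [0.2, 1.5], [1.0, 0.3], [0.1, 2.2]])
labels = np.array([0, 1, 0, 0])
metrics = compute_metrics((logits, labels))
```

Pass it as `Trainer(..., compute_metrics=compute_metrics)` and set the evaluation strategy to `"epoch"` in `TrainingArguments` (`eval_strategy` in recent versions, `evaluation_strategy` in older ones) to get these numbers after every epoch.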

Trainer is better for a single-GPU environment like Colab
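On the imbalance question: a common recipe is to pass class weights into the loss by overriding `compute_loss` in a `Trainer` subclass. A sketch under made-up label counts (the `**kwargs` absorbs extra arguments newer `Trainer` versions pass, such as `num_items_in_batch`):

```python
import torch
from collections import Counter
from transformers import Trainer

# Hypothetical imbalanced training labels: 80 of class 0, 20 of class 1
train_labels = [0] * 80 + [1] * 20
counts = Counter(train_labels)
num_labels = len(counts)

# Inverse-frequency class weights: rarer classes get larger weights
class_weights = torch.tensor(
    [len(train_labels) / (num_labels * counts[i]) for i in range(num_labels)],
    dtype=torch.float,
)

class WeightedLossTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        loss_fct = torch.nn.CrossEntropyLoss(
            weight=class_weights.to(outputs.logits.device)
        )
        loss = loss_fct(outputs.logits.view(-1, num_labels), labels.view(-1))
        return (loss, outputs) if return_outputs else loss
```

Then use `WeightedLossTrainer` wherever you would use `Trainer`.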

Import CSV or other data to datasets