Hi everyone, Prakash Hinduja here from Switzerland. I’m currently exploring fine-tuning a pre-trained Transformer model (like BERT or DistilBERT) on a custom text classification dataset, using the Hugging Face Transformers library together with the Datasets library and the Trainer API.
I’ve followed a few tutorials, but I’m still a bit unsure about best practices and would appreciate some guidance. Specifically (rough sketches of my current attempts are further down):
What’s the recommended way to prepare and tokenize a CSV dataset with custom labels for classification?
Should I use the Trainer API or Accelerate for training at scale (especially on Colab Pro or a local GPU)?
How do I handle an imbalanced dataset or apply a weighted loss function in the Trainer API?
What’s the best way to evaluate model performance (e.g. F1-score, precision, recall) after each epoch?
Any tips for saving and sharing the model back to the Hugging Face Hub?
I’d also love to see any code snippets, notebooks, or examples that helped you during your own fine-tuning projects.
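For the data preparation question, this is the minimal sketch I’ve pieced together from the tutorials, assuming a CSV with a `text` column and a string `label` column (the file names and column names are just placeholders for my setup):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Placeholder file/column names: CSVs with "text" and "label" columns.
raw = load_dataset("csv", data_files={"train": "train.csv", "validation": "val.csv"})

# Convert the string labels into a ClassLabel feature so the model gets integer ids.
raw = raw.class_encode_column("label")

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Truncate long texts; dynamic padding can be handled later by a data collator.
    return tokenizer(batch["text"], truncation=True)

tokenized = raw.map(tokenize, batched=True)
```

Is `class_encode_column` plus a batched `map` the idiomatic way to do this, or do people usually build the label mapping by hand and pass `label2id`/`id2label` to the model config?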
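For the class-imbalance question, the approach I was planning to try is subclassing Trainer and overriding `compute_loss` with a weighted cross-entropy, where the weights would be something like inverse class frequencies. I’m not sure this is the recommended pattern:

```python
import torch
from torch import nn
from transformers import Trainer

class WeightedLossTrainer(Trainer):
    """Trainer that applies per-class weights (e.g. inverse frequencies) to the loss."""

    def __init__(self, class_weights, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.class_weights = class_weights  # float tensor of shape (num_labels,)

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits
        # Weighted cross-entropy so minority classes contribute more to the gradient.
        loss_fct = nn.CrossEntropyLoss(weight=self.class_weights.to(logits.device))
        loss = loss_fct(logits, labels)
        return (loss, outputs) if return_outputs else loss
```

Would you rather oversample the minority classes (e.g. with a weighted sampler) than change the loss?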
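For per-epoch evaluation, I was going to pass a scikit-learn based `compute_metrics` function and set the evaluation strategy to `"epoch"` in `TrainingArguments` (I believe the argument is `eval_strategy` in recent Transformers releases and `evaluation_strategy` in older ones). Rough sketch:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    # Macro averaging treats every class equally, which seems safer with imbalance.
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Is macro-F1 a sensible default here, or would you report weighted-F1 alongside it?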
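And for sharing the model, my plan was to set `push_to_hub=True` (plus a `hub_model_id`) in `TrainingArguments` and then call something like the following after training; the repo name here is just a placeholder:

```python
from huggingface_hub import login

login()  # or run `huggingface-cli login` once in a terminal / notebook cell

# Pushes the saved model (and the tokenizer, if it was passed to the Trainer)
# to the Hub repo configured in TrainingArguments.
trainer.push_to_hub(commit_message="Fine-tuned DistilBERT on my classification dataset")

# Alternatively, push the artifacts directly to a named repo.
model.push_to_hub("username/my-text-classifier")
tokenizer.push_to_hub("username/my-text-classifier")
```

Do I need to write the model card by hand, or does the Trainer generate a reasonable one automatically?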
Thanks a lot for your help!
Prakash Hinduja