# DistilBERT IMDb DAPT Sentiment Classifier
This is a binary sentiment classifier built on a Domain-Adaptive Pretrained (DAPT) DistilBERT encoder and fine-tuned on the IMDb movie-review sentiment classification task.
The encoder was first adapted to the movie review domain using masked language modeling (MLM) on IMDb reviews, and then a linear classification head was trained for binary sentiment prediction.
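The adaptation script itself is not part of this card; purely as a rough sketch under common settings (all hyperparameters below are illustrative assumptions, not the ones used for this model), MLM-based DAPT can be run with the `transformers` Trainer:

```python
# Minimal sketch of MLM-based domain-adaptive pretraining (DAPT) on IMDb.
# Hyperparameters are illustrative, not the ones used for this model.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("distilbert/distilbert-base-uncased")

imdb = load_dataset("stanford/imdb", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = imdb.map(tokenize, batched=True, remove_columns=imdb.column_names)

# 15% of tokens are masked for the MLM objective
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dapt-distilbert-imdb", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
model.distilbert.save_pretrained("dapt-encoder")  # keep only the adapted encoder
```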
The model predicts one of two labels for English movie reviews:
- Positive
- Negative
## Model architecture
- Base model: DistilBERT (distilbert/distilbert-base-uncased) domain-adapted on IMDb
- Classifier head: Linear layer on top of the `[CLS]` token (see the sketch below)
- Pre-classifier: Not used in this implementation
- Labels: 2
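The actual implementation ships as `model.py` in the repository (see Usage); purely as an illustration of the layout above, a minimal PyTorch sketch might look like this (class internals, dropout, and defaults are assumptions, not the shipped code):

```python
# Rough sketch of the described architecture: DAPT DistilBERT encoder plus a
# single linear layer over the [CLS] token, with no pre-classifier.
# Illustration only; not the model.py shipped in the repository.
import torch.nn as nn
from transformers import AutoModel

class DistilBERTClassifier(nn.Module):
    def __init__(self, encoder_name="distilbert/distilbert-base-uncased", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        self.dropout = nn.Dropout(0.1)  # assumed regularization
        # Single linear head: hidden_size -> num_labels (no 768x768 pre-classifier)
        self.classifier = nn.Linear(self.encoder.config.dim, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        cls = hidden[:, 0]  # [CLS] token representation
        return self.classifier(self.dropout(cls))
```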
## Fine-tuning details
- Dataset: stanford/imdb
- Loss: Cross-entropy loss
- Tokenization: WordPiece
- Padding: Dynamic, per batch (see the sketch below)
- Train samples: 25k
- Validation samples: 5k
- Test samples: 20k
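Dynamic padding pads each batch to its longest sequence rather than to a fixed global length, which reduces wasted computation. A minimal sketch with `DataCollatorWithPadding` (the batch size matches the card; other details are illustrative):

```python
# Sketch of tokenization with dynamic padding: each batch is padded
# to the length of its longest sequence instead of a fixed max length.
from datasets import load_dataset
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
dataset = load_dataset("stanford/imdb", split="train")

def tokenize(batch):
    # No padding here; the collator pads per batch at load time
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorWithPadding(tokenizer=tokenizer)
loader = DataLoader(tokenized, batch_size=64, shuffle=True, collate_fn=collator)
```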
## Intended Uses
### Supported Use Cases
- Sentiment analysis of movie reviews
- Opinion mining on review-style English text
- NLP research on domain adaptation
### Out-of-Scope Uses
- Medical, legal, or financial decision-making
- Safety-critical systems
- Content moderation or profiling of individuals
- Non-English or highly technical text
## Usage
### Loading the model
```python
from transformers import AutoTokenizer
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file as safe_load_file

REPO_ID = "ayushshah/distilbert-dapt-imdb-sentiment"

# Download the model definition shipped with the repository
hf_hub_download(repo_id=REPO_ID, filename="model.py", local_dir=".")
from model import DistilBERTClassifier, infer_reviews

classifier = DistilBERTClassifier()
tokenizer = AutoTokenizer.from_pretrained(REPO_ID)

# Load the fine-tuned weights from safetensors and switch to inference mode
model_path = hf_hub_download(repo_id=REPO_ID, filename="model.safetensors")
classifier.load_state_dict(safe_load_file(model_path))
classifier.eval()
```
### Inference
```python
# Single-review inference
sample_review = "Great movie"
label_class, label, conf = infer_reviews(sample_review, classifier, tokenizer)
print(f"Predicted Sentiment: {label} (Confidence: {conf:.4f})\n")

# Batch inference
reviews = ["I loved this film!", "This was a terrible movie."]
results = infer_reviews(reviews, classifier, tokenizer)
for review, (label_class, label, conf) in zip(reviews, results):
    print(f"{review} -> Predicted Sentiment: {label} (Confidence: {conf:.4f})")
```
## Evaluation
- Faster convergence during fine-tuning
- DAPT improves accuracy compared to DistilBERT base
- Omitting the pre-classifier gives marginally better results (~0.5%) with fewer parameters (see the count below)
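For a sense of scale: the standard DistilBERT sequence-classification head includes a 768x768 pre-classifier before the final linear layer, so dropping it removes almost all head parameters.

```python
# Head parameter counts with hidden size 768 and 2 labels
hidden, labels = 768, 2
pre_classifier = hidden * hidden + hidden  # 590,592 weights + biases
classifier = hidden * labels + labels      # 1,538 weights + biases
print(pre_classifier + classifier)  # 592,130 params with a pre-classifier
print(classifier)                   # 1,538 params with the linear head alone
```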
## Limitations
- Inherits biases present in IMDb reviews
- Performance may degrade on non-review text
- Not suitable for sarcasm-heavy or highly nuanced sentiment
- Not designed for multilingual sentiment analysis
## Ethical Considerations
- IMDb reviews may reflect demographic and cultural biases
- Outputs should not be used to infer personal traits
- The model should not be used for automated moderation or sensitive decision-making
## Training Hyperparameters
- Encoder LR: 1e-5
- Classifier LR: 5e-5
- Optimizer: AdamW
- Epochs: 3
- Batch size: 64
- Scheduler: Linear decay (min lr: 1e-8) with 10% warmup
- Automatic mixed precision (AMP) training (see the sketch below)
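A minimal sketch of this setup, with per-group learning rates, a linear warmup-decay schedule, and AMP. The step counts are illustrative, and the schedule shown decays to zero rather than the card's 1e-8 floor:

```python
# Sketch of the optimization setup: per-group learning rates,
# linear schedule with 10% warmup, and automatic mixed precision.
import torch
import torch.nn as nn
from transformers import AutoModel, get_linear_schedule_with_warmup

# Stand-ins for the fine-tuned model's two parameter groups
encoder = AutoModel.from_pretrained("distilbert/distilbert-base-uncased")
head = nn.Linear(768, 2)

optimizer = torch.optim.AdamW([
    {"params": encoder.parameters(), "lr": 1e-5},  # encoder LR
    {"params": head.parameters(), "lr": 5e-5},     # classifier LR
])

# ~25k samples at batch size 64, for 3 epochs; 10% of steps as warmup
num_training_steps = (25_000 // 64 + 1) * 3
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * num_training_steps),
    num_training_steps=num_training_steps,
)

scaler = torch.cuda.amp.GradScaler()
for batch in []:  # stand-in for the training DataLoader
    optimizer.zero_grad()
    with torch.autocast(device_type="cuda"):
        cls = encoder(batch["input_ids"],
                      batch["attention_mask"]).last_hidden_state[:, 0]
        loss = nn.functional.cross_entropy(head(cls), batch["labels"])
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    scheduler.step()
```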
## Training Stats

| Epoch | Train Loss | Train Acc | Val Loss | Val Acc |
|-------|------------|-----------|----------|---------|
| 1     | 0.3159     | 0.8533    | 0.2046   | 0.9160  |
| 2     | 0.1744     | 0.9341    | 0.1986   | 0.9220  |
| 3     | 0.1400     | 0.9507    | 0.2009   | 0.9230  |
## License
This model follows the license of the base DistilBERT model. Please refer to the Hugging Face repository for license details.
## Evaluation results

All metrics are self-reported.

| Metric    | IMDb test set | SST-2 validation set |
|-----------|---------------|----------------------|
| Accuracy  | 0.932         | 0.890                |
| F1 Score  | 0.932         | 0.893                |
| Precision | 0.930         | 0.887                |
| Recall    | 0.934         | 0.899                |
| ROC AUC   | 0.981         | 0.945                |
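As a sketch, the same metrics could be recomputed from collected predictions with scikit-learn (the label and probability lists below are stand-ins; in practice they would be gathered by running the model over the test split):

```python
# Sketch: recompute the reported metrics with scikit-learn.
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

y_true = [1, 0, 1, 0]              # stand-in gold labels
y_prob = [0.91, 0.23, 0.68, 0.45]  # stand-in positive-class probabilities
y_pred = [int(p >= 0.5) for p in y_prob]

print("Accuracy :", accuracy_score(y_true, y_pred))
print("F1       :", f1_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("ROC AUC  :", roc_auc_score(y_true, y_prob))
```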