# DistilBERT IMDb DAPT Sentiment Classifier
This is a binary sentiment classifier built on a Domain-Adaptive Pretrained (DAPT) DistilBERT encoder and fine-tuned on the IMDb movie-review sentiment classification task.
The encoder was first adapted to the movie review domain using masked language modeling (MLM) on IMDb reviews, and then a linear classification head was trained for binary sentiment prediction.
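The adaptation script itself is not part of this card; purely as a rough sketch under common settings (all hyperparameters below are illustrative assumptions, not the ones used for this model), MLM-based DAPT can be run with the `transformers` Trainer:

```python
# Minimal sketch of MLM-based domain-adaptive pretraining (DAPT) on IMDb.
# Hyperparameters are illustrative, not the ones used for this model.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("distilbert/distilbert-base-uncased")

imdb = load_dataset("stanford/imdb", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = imdb.map(tokenize, batched=True, remove_columns=imdb.column_names)

# 15% of tokens are masked for the MLM objective
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dapt-distilbert-imdb", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
model.distilbert.save_pretrained("dapt-encoder")  # keep only the adapted encoder
```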
The model predicts one of two labels for English movie reviews:
- Positive
- Negative
## Model architecture
- Base model: DistilBERT (distilbert/distilbert-base-uncased) domain-adapted on IMDb
- Classifier head: Linear layer on top of the `[CLS]` token (see the sketch below)
- Pre-classifier: Not used in this implementation
- Labels: 2
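The actual implementation ships as `model.py` in the repository (see Usage); purely as an illustration of the layout above, a minimal PyTorch sketch might look like this (class internals, dropout, and defaults are assumptions, not the shipped code):

```python
# Rough sketch of the described architecture: DAPT DistilBERT encoder plus a
# single linear layer over the [CLS] token, with no pre-classifier.
# Illustration only; not the model.py shipped in the repository.
import torch.nn as nn
from transformers import AutoModel

class DistilBERTClassifier(nn.Module):
    def __init__(self, encoder_name="distilbert/distilbert-base-uncased", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        self.dropout = nn.Dropout(0.1)  # assumed regularization
        # Single linear head: hidden_size -> num_labels (no 768x768 pre-classifier)
        self.classifier = nn.Linear(self.encoder.config.dim, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        cls = hidden[:, 0]  # [CLS] token representation
        return self.classifier(self.dropout(cls))
```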
## Fine-tuning details
- Dataset: stanford/imdb
- Loss: Cross-entropy loss
- Tokenization: WordPiece
- Padding: Dynamic, per batch (see the sketch below)
- Train samples: 25k
- Validation samples: 5k
- Test samples: 20k
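Dynamic padding pads each batch to its longest sequence rather than to a fixed global length, which reduces wasted computation. A minimal sketch with `DataCollatorWithPadding` (the batch size matches the card; other details are illustrative):

```python
# Sketch of tokenization with dynamic padding: each batch is padded
# to the length of its longest sequence instead of a fixed max length.
from datasets import load_dataset
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
dataset = load_dataset("stanford/imdb", split="train")

def tokenize(batch):
    # No padding here; the collator pads per batch at load time
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorWithPadding(tokenizer=tokenizer)
loader = DataLoader(tokenized, batch_size=64, shuffle=True, collate_fn=collator)
```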
## Intended Uses
### Supported Use Cases
- Sentiment analysis of movie reviews
- Opinion mining on review-style English text
- NLP research on domain adaptation
### Out-of-Scope Uses
- Medical, legal, or financial decision-making
- Safety-critical systems
- Content moderation or profiling of individuals
- Non-English or highly technical text
## Usage
### Loading the model
```python
from transformers import AutoTokenizer
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file as safe_load_file

REPO_ID = "ayushshah/distilbert-dapt-imdb-sentiment"

# Download the model definition shipped with the repository
hf_hub_download(repo_id=REPO_ID, filename="model.py", local_dir=".")
from model import DistilBERTClassifier, infer_reviews

classifier = DistilBERTClassifier()
tokenizer = AutoTokenizer.from_pretrained(REPO_ID)

# Load the fine-tuned weights from safetensors and switch to inference mode
model_path = hf_hub_download(repo_id=REPO_ID, filename="model.safetensors")
classifier.load_state_dict(safe_load_file(model_path))
classifier.eval()
```
### Inference
```python
# Single-review inference
sample_review = "Great movie"
label_class, label, conf = infer_reviews(sample_review, classifier, tokenizer)
print(f"Predicted Sentiment: {label} (Confidence: {conf:.4f})\n")

# Batch inference
reviews = ["I loved this film!", "This was a terrible movie."]
results = infer_reviews(reviews, classifier, tokenizer)
for review, (label_class, label, conf) in zip(reviews, results):
    print(f"{review} -> Predicted Sentiment: {label} (Confidence: {conf:.4f})")
```
## Evaluation
- Faster convergence during fine-tuning
- DAPT improves accuracy compared to DistilBERT base
- Omitting the pre-classifier gives marginally better results (~0.5%) with fewer parameters (see the count below)
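For a sense of scale: the standard DistilBERT sequence-classification head includes a 768x768 pre-classifier before the final linear layer, so dropping it removes almost all head parameters.

```python
# Head parameter counts with hidden size 768 and 2 labels
hidden, labels = 768, 2
pre_classifier = hidden * hidden + hidden  # 590,592 weights + biases
classifier = hidden * labels + labels      # 1,538 weights + biases
print(pre_classifier + classifier)  # 592,130 params with a pre-classifier
print(classifier)                   # 1,538 params with the linear head alone
```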
## Limitations
- Inherits biases present in IMDb reviews
- Performance may degrade on non-review text
- Not suitable for sarcasm-heavy or highly nuanced sentiment
- Not designed for multilingual sentiment analysis
## Ethical Considerations
- IMDb reviews may reflect demographic and cultural biases
- Outputs should not be used to infer personal traits
- The model should not be used for automated moderation or sensitive decision-making
## Training Hyperparameters
- Encoder LR: 1e-5
- Classifier LR: 5e-5
- Optimizer: AdamW
- Epochs: 3
- Batch size: 64
- Scheduler: Linear decay (min lr: 1e-8) with 10% warmup
- Automatic mixed precision (AMP) training (see the sketch below)
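A minimal sketch of this setup, with per-group learning rates, a linear warmup-decay schedule, and AMP. The step counts are illustrative, and the schedule shown decays to zero rather than the card's 1e-8 floor:

```python
# Sketch of the optimization setup: per-group learning rates,
# linear schedule with 10% warmup, and automatic mixed precision.
import torch
import torch.nn as nn
from transformers import AutoModel, get_linear_schedule_with_warmup

# Stand-ins for the fine-tuned model's two parameter groups
encoder = AutoModel.from_pretrained("distilbert/distilbert-base-uncased")
head = nn.Linear(768, 2)

optimizer = torch.optim.AdamW([
    {"params": encoder.parameters(), "lr": 1e-5},  # encoder LR
    {"params": head.parameters(), "lr": 5e-5},     # classifier LR
])

# ~25k samples at batch size 64, for 3 epochs; 10% of steps as warmup
num_training_steps = (25_000 // 64 + 1) * 3
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * num_training_steps),
    num_training_steps=num_training_steps,
)

scaler = torch.cuda.amp.GradScaler()
for batch in []:  # stand-in for the training DataLoader
    optimizer.zero_grad()
    with torch.autocast(device_type="cuda"):
        cls = encoder(batch["input_ids"],
                      batch["attention_mask"]).last_hidden_state[:, 0]
        loss = nn.functional.cross_entropy(head(cls), batch["labels"])
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    scheduler.step()
```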
## Training Stats

| Epoch | Train Loss | Train Acc | Val Loss | Val Acc |
|-------|------------|-----------|----------|---------|
| 1     | 0.3159     | 0.8533    | 0.2046   | 0.9160  |
| 2     | 0.1744     | 0.9341    | 0.1986   | 0.9220  |
| 3     | 0.1400     | 0.9507    | 0.2009   | 0.9230  |
## License
This model follows the license of the base DistilBERT model. Please refer to the Hugging Face repository for license details.
## Evaluation results

All metrics are self-reported.

| Metric    | IMDb test set | SST-2 validation set |
|-----------|---------------|----------------------|
| Accuracy  | 0.932         | 0.890                |
| F1 Score  | 0.932         | 0.893                |
| Precision | 0.930         | 0.887                |
| Recall    | 0.934         | 0.899                |
| ROC AUC   | 0.981         | 0.945                |
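As a sketch, the same metrics could be recomputed from collected predictions with scikit-learn (the label and probability lists below are stand-ins; in practice they would be gathered by running the model over the test split):

```python
# Sketch: recompute the reported metrics with scikit-learn.
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

y_true = [1, 0, 1, 0]              # stand-in gold labels
y_prob = [0.91, 0.23, 0.68, 0.45]  # stand-in positive-class probabilities
y_pred = [int(p >= 0.5) for p in y_prob]

print("Accuracy :", accuracy_score(y_true, y_pred))
print("F1       :", f1_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("ROC AUC  :", roc_auc_score(y_true, y_prob))
```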