Food Classification with ViT 🥗🍣

Explore Food Classification with Vision Transformers (ViT) 🔍

This application uses a Vision Transformer (ViT) to classify food images. It runs the pre-trained model vit_base_patch16_224.augreg2_in21k_ft_in1k.ft_food101, which is fine-tuned on the Food-101 dataset (101 food categories). With the Hugging Face pipeline API, loading and running an image classification model like this one takes only a few lines of code.
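As a rough illustration of the pipeline usage mentioned above, here is a minimal sketch. It assumes the `transformers` library is installed and that the checkpoint is published on the Hugging Face Hub in a format the `image-classification` pipeline can load. The source names the checkpoint but not its Hub repository, so the caller must supply `model_id`. The helper names `format_predictions` and `classify_food` are hypothetical and not part of this application's code.

```python
from typing import Dict, List


def format_predictions(predictions: List[Dict], top_k: int = 3) -> Dict[str, float]:
    """Convert pipeline output ([{'label': ..., 'score': ...}, ...])
    into a {label: score} mapping holding the top_k highest scores."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return {p["label"]: round(float(p["score"]), 4) for p in ranked[:top_k]}


def classify_food(image_path: str, model_id: str, top_k: int = 3) -> Dict[str, float]:
    """Classify one image file and return its top_k labels with scores.

    model_id is the Hub repository ID of the fine-tuned checkpoint
    (an assumption: it is not given in this document).
    """
    # Imported here so the formatting helper can be used without transformers.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return format_predictions(classifier(image_path, top_k=top_k), top_k=top_k)
```

Calling `classify_food("sushi.jpg", model_id=...)` with the real repository ID should return a small dictionary that maps each predicted label to its confidence score, which is the kind of output the application displays.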

How to Use:

Upload an image of food (e.g., sushi, pizza, or a burger).
The model classifies the image and returns the predicted labels along with their confidence scores.
Try the provided example for a quick start, or test your own food images!

Examples