Model Card for Vaani-LID_v0
Indic Spoken Language Classifier supporting 42 languages.
Model Details
The model consists of a whisper-large-v3-turbo encoder followed by a pooled-attention module and a final linear classifier layer. It is trained end-to-end on the ARTPARK-IISc/Vaani dataset and classifies speech into one of 42 supported languages. The supported languages are:
["Surjapuri",
"Marathi",
"Assamese",
"Haryanvi",
"Halbi",
"Malayalam",
"Maithili",
"Wancho",
"Chhattisgarhi",
"Punjabi",
"Magahi",
"Nepali",
"Garhwali",
"Garo",
"Khortha",
"Sumi",
"Bajjika",
"Marwari",
"Telugu",
"Nagamese",
"Tulu",
"Odia",
"Urdu",
"Kumaoni",
"Kannada",
"Tamil",
"Sambalpuri",
"Bengali",
"Rajasthani",
"English",
"Malvani",
"Chakma",
"Surgujia",
"Kokborok",
"Khariboli",
"Hindi",
"Kurukh",
"Angika",
"Sadri",
"Bhojpuri",
"Konkani",
"Gujarati"]
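The card does not spell out how the pooled-attention module works, so the following is only a minimal sketch of one common form of attention pooling followed by a linear classifier. It assumes a single learned query vector that scores each encoder frame, a softmax over those scores, and a weighted sum of the frames. The names `attention_pool`, `classify`, `query`, `weight_rows`, and `biases` are hypothetical, the weights are random, and the toy feature width is far smaller than a real encoder's.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Collapse T frame vectors of width D into one D-dim vector.

    Each frame is scored by its dot product with the query, the scores
    are normalised with softmax, and the frames are averaged with those
    weights. This is an assumed design, not the model's confirmed one.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    width = len(frames[0])
    return [sum(w * frame[d] for w, frame in zip(weights, frames))
            for d in range(width)]


def classify(pooled, weight_rows, biases):
    """Linear layer followed by softmax: one probability per language."""
    logits = [sum(w * x for w, x in zip(row, pooled)) + b
              for row, b in zip(weight_rows, biases)]
    return softmax(logits)


random.seed(0)
T, D, NUM_LANGUAGES = 50, 8, 42  # toy sizes chosen for illustration only
frames = [[random.gauss(0, 1) for _ in range(D)] for _ in range(T)]
query = [random.gauss(0, 1) for _ in range(D)]
weight_rows = [[random.gauss(0, 1) for _ in range(D)]
               for _ in range(NUM_LANGUAGES)]
biases = [0.0] * NUM_LANGUAGES

pooled = attention_pool(frames, query)
probs = classify(pooled, weight_rows, biases)
print(len(pooled), len(probs))
```

The point of the pooling step is that utterances of any length map to a fixed-size vector, so the classifier always sees the same input shape.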
How to Get Started with the Model
from transformers import pipeline

pipe = pipeline(
    "audio-classification",
    model="ARTPARK-IISc/Vaani-LID_v0",
    trust_remote_code=True,
    # device=-1,  # uncomment this line to run on CPU
)
out = pipe("path/to/16kHz/mono-channel/wav/file")
print(out)
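Because the example path asks for a 16 kHz mono WAV file, it may help to check a file before passing it to the pipeline. The sketch below uses only the standard-library `wave` module. The helpers `check_wav` and `top_prediction` are hypothetical, and `top_prediction` assumes the output is a list of dicts with "label" and "score" keys, as the standard audio-classification pipeline returns. Since this model loads remote code, its actual output format may differ.

```python
import os
import tempfile
import wave


def check_wav(path, expected_rate=16000, expected_channels=1):
    """Return a list of problems; an empty list means the file matches."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate}")
    if channels != expected_channels:
        problems.append(f"{channels} channels, expected {expected_channels}")
    return problems


def top_prediction(pipeline_output):
    """Pick the highest-scoring entry from a list of label/score dicts."""
    return max(pipeline_output, key=lambda item: item["score"])


# Write one second of 16-bit silence so the check can run without real audio.
demo_path = os.path.join(tempfile.mkdtemp(), "silence.wav")
with wave.open(demo_path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(16000)
    wav.writeframes(b"\x00\x00" * 16000)

print(check_wav(demo_path))

# Made-up scores, shown only to illustrate the assumed output structure.
sample_output = [
    {"label": "Hindi", "score": 0.9},
    {"label": "Urdu", "score": 0.1},
]
print(top_prediction(sample_output)["label"])
```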
Evaluation
Accuracy on the test split of each dataset (task type: speech-classification):
- FLEURS: 0.71
- Kathbath: 0.63
- Vaani: 0.77
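The card does not state how accuracy was computed. A plausible reading is utterance-level top-1 accuracy: the fraction of test clips whose predicted language matches the reference label. The sketch below shows that calculation under this assumption. The function name `accuracy` and the example labels are illustrative only and are not drawn from any of the evaluation sets.

```python
def accuracy(predicted_labels, true_labels):
    """Fraction of positions where the predicted label equals the reference."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("prediction and reference lists differ in length")
    if not true_labels:
        raise ValueError("need at least one example")
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


# Invented labels for illustration; three of four predictions are correct.
predicted = ["Hindi", "Tamil", "Urdu", "Odia"]
reference = ["Hindi", "Tamil", "Hindi", "Odia"]
print(accuracy(predicted, reference))
```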
Model tree for ARTPARK-IISc/Vaani-LID_v0
Base model: openai/whisper-large-v3
Fine-tuned from: openai/whisper-large-v3-turbo