---
license: apache-2.0
---

## Lutech-AI/I-SPIn

**I**talian-**S**entence **P**air **In**ference, AKA **I-SPIn**.
This is a fine-tuned version of the model [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2).
Its main task is [Natural Language Inference (NLI)](https://nlp.stanford.edu/projects/snli/) in the Italian language.
The prediction labels may assume three possible values:

1. 1 means the model predicts entailment;
2. 0 represents the neutral case;
3. -1 corresponds to contradiction.

## How it was trained

1. Train [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) on the NLI task;
2. Apply Knowledge Distillation to the output of (1.) with an IT-EN translation dataset, to retain NLI knowledge and improve Italian language comprehension.

More details are available in the paper: https://arxiv.org/abs/2309.02887

# Usage #1 (HuggingFace Transformers)

In the environment in which you want to run the project, type:

```bash
pip install --extra-index-url https://test.pypi.org/simple/ ispin
```

NOTE: during the first execution, a total of two different models will be downloaded:

1. I-SPIn;
2. paraphrase-multilingual-mpnet-base-v2.

Each is roughly 1 GB in size.

## Retrieve embeddings

If you installed the package correctly, you can retrieve embeddings in the following way:

```python
from ispin.ISPIn import ISPIn

model = ISPIn.from_pretrained('Lutech-AI/I-SPIn')

sentences = ['Questa è una frase di prova', 'Testando il funzionamento del modello']
sentence_embeddings = model(sentences)

print(sentence_embeddings.shape)  # -> torch.Size([2, 768])
```

## Retrieve labels

If you installed the package correctly, you can retrieve labels in the following way:

```python
from ispin.ISPIn import ISPIn

model = ISPIn.from_pretrained('Lutech-AI/I-SPIn')

premises = ['Il modello sta funzionando correttamente', 'Il modello non funziona correttamente']
hypothesis = ['Testando il funzionamento del modello']

premises_embeddings = model(premises)
hypothesis_embeddings = model(hypothesis)

predictions = model.predict(
    premises_embeddings,
    hypothesis_embeddings,
    one_to_many=False
)

print(predictions)  # -> [0 -1]
```

The computation is subdivided into two tasks (embedding, classification) to simplify a custom fine-tuning process.

If you want to further optimize this classification head, you might want to deepcopy the layers and continue training (one can choose which layers by slicing the list):

```python
import copy

import torch

# 'start' and 'end' select which layers of the head to copy
module_list = torch.nn.ModuleList(list(copy.deepcopy(model.layers))[start:end])
```
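Building on that, here is a minimal continued-training sketch. It is an illustration under stated assumptions, not the package's API: the `head_forward` helper, the optimizer settings, and the gold class index are hypothetical, and it assumes the head consumes the concatenation of the two 768-dimensional sentence embeddings (suggested by the 1536 input features in the architecture below) with GELU between the linear layers.

```python
import copy

import torch
import torch.nn.functional as F

# Copy the whole classification head and train only the copy,
# leaving the original model untouched.
head = torch.nn.ModuleList(copy.deepcopy(model.layers))
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)  # hypothetical settings

def head_forward(x):
    # Assumption: GELU between the linear layers, raw logits from the last one.
    for layer in head[:-1]:
        x = F.gelu(layer(x))
    return head[-1](x)

# Embed one labeled premise/hypothesis pair with the snippets above.
premise_emb = model(['Il modello sta funzionando correttamente'])
hypothesis_emb = model(['Testando il funzionamento del modello'])
features = torch.cat([premise_emb, hypothesis_emb], dim=-1).detach()  # (1, 1536)

gold = torch.tensor([2])  # hypothetical class index for 'entailment'
loss = F.cross_entropy(head_forward(features), gold)
loss.backward()
optimizer.step()
```

Training only the copied layers keeps the encoder frozen, which is cheap and avoids degrading the multilingual embeddings.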
# Usage #2 (cloning repo) (will be deleted)

In a terminal located in your project folder, type:

```bash
git clone https://huggingface.co/Lutech-AI/I-SPIn/ ISPIn
```

Please specify the final 'ISPIn' argument, to avoid complications when calling the Python module.
Then, in the code where you call the model, substitute the line:

```python
model = ISPIn.from_pretrained('Lutech-AI/I-SPIn')
```

with:

```python
model = ISPIn.from_pretrained('[your/path]/I-SPIn')
```

## Full model architecture

```text
ISPIn(
  (encoder): XLMRobertaModel(...)  # transformers internal implementation of 'paraphrase-multilingual-mpnet-base-v2'
  (layers): ModuleList(
    (0): Linear(in_features=1536, out_features=1024, bias=True)
    (1): Linear(in_features=1024, out_features=512, bias=True)
    (2): Linear(in_features=512, out_features=256, bias=True)
    (3): Linear(in_features=256, out_features=128, bias=True)
    (4): Linear(in_features=128, out_features=64, bias=True)
    (5): Linear(in_features=64, out_features=3, bias=True)
  )
  (activation): GELU()
)
```

## Evaluation results

| Dataset | Metric | Performance |
|:--------------------------------------:|--------------|-------------|
| [RTE3-ITA](https://github.com/gilnoh/RTEFormatWork/tree/master/RTE3-ITdata-original-format) | Accuracy | 68% |
| [RTE3-ITA](https://github.com/gilnoh/RTEFormatWork/tree/master/RTE3-ITdata-original-format) | Min F1-Score | 60% |
| [RTE-2009-ITA](https://live.european-language-grid.eu/catalogue/corpus/8121/download/) | Accuracy | 59% |
| [RTE-2009-ITA](https://live.european-language-grid.eu/catalogue/corpus/8121/download/) | Min F1-Score | 31% |
| [SNLI](https://nlp.stanford.edu/projects/snli/) (IT), translated w/ [NLLB-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) | Accuracy | 74% |
| [MNLI-Matched](https://cims.nyu.edu/~sbowman/multinli/) (IT), translated w/ [NLLB-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) | Accuracy | 72% |
| [MNLI-Mismatched](https://cims.nyu.edu/~sbowman/multinli/) (IT), translated w/ [NLLB-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) | Accuracy | 73% |

NOTE: [RTE3-ITA](https://github.com/gilnoh/RTEFormatWork/tree/master/RTE3-ITdata-original-format) and [RTE-2009-ITA](https://live.european-language-grid.eu/catalogue/corpus/8121/download/) have no 'neutral' class. Hence, during testing on those datasets, whenever the model classified a sentence pair as 'neutral', the prediction was relabeled as 'contradiction'.
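A minimal sketch of that relabeling, assuming `model.predict` returns a NumPy array over {-1, 0, 1} as in the example above:

```python
import numpy as np

# Hypothetical model outputs: 1=entailment, 0=neutral, -1=contradiction
predictions = np.array([1, 0, -1, 0])

# Two-class RTE datasets have no 'neutral', so fold 0 into -1 before scoring
two_class_predictions = np.where(predictions == 0, -1, predictions)

print(two_class_predictions)  # -> [ 1 -1 -1 -1]
```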