---
license: apache-2.0
---

## Lutech-AI/I-SPIn

**I**talian-**S**entence **P**air **In**ference, AKA **I-SPIn**.
This is a fine-tuned version of the model [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2).
Its main task is [Natural Language Inference (NLI)](https://nlp.stanford.edu/projects/snli/) in the Italian language.
The prediction labels may assume three possible values:

1. 1 means the model predicts entailment;
2. 0 represents the neutral case;
3. -1 corresponds to contradiction.

## How it was trained

1. Train [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) on the NLI task;
2. Apply Knowledge Distillation to the output of (1.) with an IT-EN translation dataset, to retain NLI knowledge and improve Italian language comprehension.

More details are available in the paper: https://arxiv.org/abs/2309.02887

# Usage #1 (HuggingFace Transformers)

In the environment in which you want to run the project, type:

```bash
pip install --extra-index-url https://test.pypi.org/simple/ ispin
```

NOTE: during the first execution, a total of two different models will be downloaded:

1. I-SPIn;
2. paraphrase-multilingual-mpnet-base-v2.

Each is roughly 1 GB in size.

## Retrieve embeddings

If you installed the package correctly, you can retrieve embeddings in the following way:

```python
from ispin.ISPIn import ISPIn

model = ISPIn.from_pretrained('Lutech-AI/I-SPIn')

sentences = ['Questa è una frase di prova', 'Testando il funzionamento del modello']
sentence_embeddings = model(sentences)

print(sentence_embeddings.shape)  # -> torch.Size([2, 768])
```

## Retrieve labels

If you installed the package correctly, you can retrieve labels in the following way:

```python
from ispin.ISPIn import ISPIn

model = ISPIn.from_pretrained('Lutech-AI/I-SPIn')

premises = ['Il modello sta funzionando correttamente', 'Il modello non funziona correttamente']
hypothesis = ['Testando il funzionamento del modello']

premises_embeddings = model(premises)
hypothesis_embeddings = model(hypothesis)

predictions = model.predict(
    premises_embeddings,
    hypothesis_embeddings,
    one_to_many=False
)

print(predictions)  # -> [0 -1]
```

The computation is subdivided into two tasks (embedding, classification) to simplify a custom fine-tuning process.

If you want to further optimize this classification head, you might want to deepcopy the layers and continue training (one can choose which layers by slicing the list):

```python
import copy

import torch

# 'start' and 'end' select which layers of the head to copy
module_list = torch.nn.ModuleList(list(copy.deepcopy(model.layers))[start:end])
```
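Building on that, here is a minimal continued-training sketch. It is an illustration under stated assumptions, not the package's API: the `head_forward` helper, the optimizer settings, and the gold class index are hypothetical, and it assumes the head consumes the concatenation of the two 768-dimensional sentence embeddings (suggested by the 1536 input features in the architecture below) with GELU between the linear layers.

```python
import copy

import torch
import torch.nn.functional as F

# Copy the whole classification head and train only the copy,
# leaving the original model untouched.
head = torch.nn.ModuleList(copy.deepcopy(model.layers))
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)  # hypothetical settings

def head_forward(x):
    # Assumption: GELU between the linear layers, raw logits from the last one.
    for layer in head[:-1]:
        x = F.gelu(layer(x))
    return head[-1](x)

# Embed one labeled premise/hypothesis pair with the snippets above.
premise_emb = model(['Il modello sta funzionando correttamente'])
hypothesis_emb = model(['Testando il funzionamento del modello'])
features = torch.cat([premise_emb, hypothesis_emb], dim=-1).detach()  # (1, 1536)

gold = torch.tensor([2])  # hypothetical class index for 'entailment'
loss = F.cross_entropy(head_forward(features), gold)
loss.backward()
optimizer.step()
```

Training only the copied layers keeps the encoder frozen, which is cheap and avoids degrading the multilingual embeddings.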
# Usage #2 (cloning repo) (will be deleted)

In a terminal located in your project folder, type:

```bash
git clone https://huggingface.co/Lutech-AI/I-SPIn/ ISPIn
```

Please specify the final 'ISPIn' argument, to avoid complications when calling the Python module.
Then, in the code where you call the model, substitute the line:

```python
model = ISPIn.from_pretrained('Lutech-AI/I-SPIn')
```

with:

```python
model = ISPIn.from_pretrained('[your/path]/I-SPIn')
```

## Full model architecture

```text
ISPIn(
  (encoder): XLMRobertaModel(...)  # transformers internal implementation of 'paraphrase-multilingual-mpnet-base-v2'
  (layers): ModuleList(
    (0): Linear(in_features=1536, out_features=1024, bias=True)
    (1): Linear(in_features=1024, out_features=512, bias=True)
    (2): Linear(in_features=512, out_features=256, bias=True)
    (3): Linear(in_features=256, out_features=128, bias=True)
    (4): Linear(in_features=128, out_features=64, bias=True)
    (5): Linear(in_features=64, out_features=3, bias=True)
  )
  (activation): GELU()
)
```

## Evaluation results

| Dataset | Metric | Performance |
|:--------------------------------------:|--------------|-------------|
| [RTE3-ITA](https://github.com/gilnoh/RTEFormatWork/tree/master/RTE3-ITdata-original-format) | Accuracy | 68% |
| [RTE3-ITA](https://github.com/gilnoh/RTEFormatWork/tree/master/RTE3-ITdata-original-format) | Min F1-Score | 60% |
| [RTE-2009-ITA](https://live.european-language-grid.eu/catalogue/corpus/8121/download/) | Accuracy | 59% |
| [RTE-2009-ITA](https://live.european-language-grid.eu/catalogue/corpus/8121/download/) | Min F1-Score | 31% |
| [SNLI](https://nlp.stanford.edu/projects/snli/) (IT), translated w/ [NLLB-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) | Accuracy | 74% |
| [MNLI-Matched](https://cims.nyu.edu/~sbowman/multinli/) (IT), translated w/ [NLLB-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) | Accuracy | 72% |
| [MNLI-Mismatched](https://cims.nyu.edu/~sbowman/multinli/) (IT), translated w/ [NLLB-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) | Accuracy | 73% |

NOTE: [RTE3-ITA](https://github.com/gilnoh/RTEFormatWork/tree/master/RTE3-ITdata-original-format) and [RTE-2009-ITA](https://live.european-language-grid.eu/catalogue/corpus/8121/download/) have no 'neutral' class. Hence, during testing on those datasets, whenever the model classified a sentence pair as 'neutral', the prediction was relabeled as 'contradiction'.
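A minimal sketch of that relabeling, assuming `model.predict` returns a NumPy array over {-1, 0, 1} as in the example above:

```python
import numpy as np

# Hypothetical model outputs: 1=entailment, 0=neutral, -1=contradiction
predictions = np.array([1, 0, -1, 0])

# Two-class RTE datasets have no 'neutral', so fold 0 into -1 before scoring
two_class_predictions = np.where(predictions == 0, -1, predictions)

print(two_class_predictions)  # -> [ 1 -1 -1 -1]
```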