Initialize project; model provided by the ModelHub XC community

Model: eduardofv/stsb-m-mt-es-distiluse-base-multilingual-cased-v1
Source: Original Platform
ModelHub XC
2026-05-13 17:07:32 +08:00
commit cace405113
14 changed files with 119690 additions and 0 deletions

16
.gitattributes vendored Normal file

@@ -0,0 +1,16 @@
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tar.gz filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
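
The patterns above decide which files are stored through Git LFS rather than as regular blobs. A minimal sketch of that matching, assuming Python's `fnmatch` is a close enough stand-in for gitattributes pattern rules when only bare file names are tested (the helper name `is_lfs_tracked` is hypothetical):

```python
from fnmatch import fnmatch

# Patterns copied from the .gitattributes hunk above.
LFS_PATTERNS = [
    "*.bin.*", "*.lfs.*", "*.bin", "*.h5", "*.tflite", "*.tar.gz",
    "*.ot", "*.onnx", "*.arrow", "*.ftz", "*.joblib", "*.model",
    "*.msgpack", "*.pb", "*.pt", "*.pth",
]


def is_lfs_tracked(filename: str) -> bool:
    """Approximate check: does a bare file name match any LFS pattern?

    fnmatch only approximates gitattributes matching; it is used here
    as an illustration, not as a replacement for `git check-attr`.
    """
    return any(fnmatch(filename, pattern) for pattern in LFS_PATTERNS)


# Files named in this commit: only the weights file should match.
candidates = ["pytorch_model.bin", "vocab.txt", "tokenizer.json", "config.json"]
tracked = [name for name in candidates if is_lfs_tracked(name)]
```

Under these patterns `pytorch_model.bin` goes through LFS, while the tokenizer and config files remain ordinary text files in the repository.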

7
1_Pooling/config.json Normal file

@@ -0,0 +1,7 @@
{
"word_embedding_dimension": 768,
"pooling_mode_cls_token": false,
"pooling_mode_mean_tokens": true,
"pooling_mode_max_tokens": false,
"pooling_mode_mean_sqrt_len_tokens": false
}
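
With only `pooling_mode_mean_tokens` enabled, the sentence embedding is the average of the token embeddings. A rough pure-Python sketch of that idea, assuming padding positions are excluded through an attention mask (the function `mean_pool` and the toy vectors are illustrative, not the library's implementation):

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # avoid dividing by zero for an all-padding input
    return [value / count for value in total]


# Toy example with 2-dimensional vectors; the real model uses 768 dimensions.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]  # the third position is padding and is ignored
sentence_vector = mean_pool(tokens, mask)
```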

59
README.md Normal file

@@ -0,0 +1,59 @@
---
language: es
datasets:
- stsb_multi_mt
tags:
- sentence-similarity
- sentence-transformers
---
This is a test model that was fine-tuned using the Spanish datasets from [stsb_multi_mt](https://huggingface.co/datasets/stsb_multi_mt) in order to understand and benchmark STS models.
## Model and training data description
This model was built by taking `distiluse-base-multilingual-cased-v1` and fine-tuning it on a Semantic Textual Similarity task with a modified version of the STS training script from Sentence Transformers (the modified script is included in the repo). It was trained on the Spanish datasets from [stsb_multi_mt](https://huggingface.co/datasets/stsb_multi_mt), which are the STSBenchmark datasets automatically translated to other languages using deepl.com. Refer to the dataset repository for more details.
## Intended uses & limitations
This model was built only as a proof of concept for STS fine-tuning on Spanish data. It has no specific intended use other than giving a sense of how this training works.
## How to use
You can use it like any other STS-trained Sentence Transformers model to extract sentence embeddings. See the Sentence Transformers documentation for details.
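
A minimal sketch of such usage, assuming the `sentence-transformers` package is installed and the model can be downloaded under the ID shown in this commit. The helpers `embed`, `cosine`, and `similarity` are hypothetical names introduced for illustration:

```python
import math
from typing import List, Sequence

# Model ID as listed in this commit; assumed to be resolvable by the library.
MODEL_ID = "eduardofv/stsb-m-mt-es-distiluse-base-multilingual-cased-v1"


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def embed(sentences: List[str]) -> List[List[float]]:
    """Encode sentences into embedding vectors."""
    # Imported lazily so the helpers above work without the package installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_ID)
    return [[float(v) for v in row] for row in model.encode(sentences)]


def similarity(first: str, second: str) -> float:
    """Cosine similarity between the embeddings of two sentences."""
    vec_first, vec_second = embed([first, second])
    return cosine(vec_first, vec_second)
```

Calling `similarity("El gato duerme.", "Un gato está durmiendo.")` downloads the weights on first use and returns a score between -1 and 1, with higher values indicating closer meaning.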
## Training procedure
This model was trained using this [Colab Notebook](https://colab.research.google.com/drive/1ZNjDMFdy_lKhnD9BtbqzSbQ4LNz638ZA?usp=sharing)
## Evaluation results
Evaluating `distiluse-base-multilingual-cased-v1` on the Spanish test dataset before training results in:
```
2021-07-06 17:44:46 - EmbeddingSimilarityEvaluator: Evaluating the model on dataset:
2021-07-06 17:45:00 - Cosine-Similarity : Pearson: 0.7662 Spearman: 0.7583
2021-07-06 17:45:00 - Manhattan-Distance: Pearson: 0.7805 Spearman: 0.7772
2021-07-06 17:45:00 - Euclidean-Distance: Pearson: 0.7816 Spearman: 0.7778
2021-07-06 17:45:00 - Dot-Product-Similarity: Pearson: 0.6610 Spearman: 0.6536
```
The fine-tuned version, trained with the training script's defaults on the Spanish training dataset, results in:
```
2021-07-06 17:49:22 - EmbeddingSimilarityEvaluator: Evaluating the model on stsb-multi-mt-test dataset:
2021-07-06 17:49:24 - Cosine-Similarity : Pearson: 0.8265 Spearman: 0.8207
2021-07-06 17:49:24 - Manhattan-Distance: Pearson: 0.8131 Spearman: 0.8190
2021-07-06 17:49:24 - Euclidean-Distance: Pearson: 0.8129 Spearman: 0.8190
2021-07-06 17:49:24 - Dot-Product-Similarity: Pearson: 0.7773 Spearman: 0.7692
```
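
The Pearson and Spearman figures in the logs above are correlations between the model's similarity scores and the gold similarity labels. A rough pure-Python sketch of how such correlations can be computed, using made-up toy data (the values in `gold` and `predicted` are hypothetical and unrelated to the reported results):

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical gold labels (0-5 scale) and model similarity scores.
gold = [0.0, 1.5, 3.0, 4.2, 5.0]
predicted = [0.10, 0.35, 0.30, 0.80, 0.95]
rho = spearman(gold, predicted)
r = pearson(gold, predicted)
```

Spearman only depends on the ordering of the scores, which is why it is often preferred for STS: a model can rank sentence pairs correctly even if its raw scores are not linearly related to the labels.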
In our [STS Evaluation repository](https://github.com/eduardofv/sts_eval) we compare the performance of this model with other models from Sentence Transformers and TensorFlow Hub using the standard STSBenchmark and the 2017 STSBenchmark Task 3 for Spanish.
## Resources
- Training dataset [stsb_multi_mt](https://huggingface.co/datasets/stsb_multi_mt)
- Sentence Transformers [Semantic Textual Similarity](https://www.sbert.net/examples/training/sts/README.html)
- Check [sts_eval](https://github.com/eduardofv/sts_eval) for a comparison with TensorFlow and Sentence Transformers models
- Check the [development environment to run the scripts and evaluation](https://github.com/eduardofv/ai-denv)

23
config.json Normal file

@@ -0,0 +1,23 @@
{
"_name_or_path": "sentence-transformers/distiluse-base-multilingual-cased-v1",
"activation": "gelu",
"architectures": [
"DistilBertModel"
],
"attention_dropout": 0.1,
"dim": 768,
"dropout": 0.1,
"hidden_dim": 3072,
"initializer_range": 0.02,
"max_position_embeddings": 512,
"model_type": "distilbert",
"n_heads": 12,
"n_layers": 6,
"pad_token_id": 0,
"qa_dropout": 0.1,
"seq_classif_dropout": 0.2,
"sinusoidal_pos_embds": false,
"tie_weights_": true,
"transformers_version": "4.8.2",
"vocab_size": 119547
}


@@ -0,0 +1,7 @@
{
"__version__": {
"sentence_transformers": "2.0.0",
"transformers": "4.8.2",
"pytorch": "1.9.0+cu102"
}
}


@@ -0,0 +1,5 @@
epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
0,-1,0.8652665968345811,0.8603021160706319,0.8497111617616812,0.8557700577868964,0.8491194556237853,0.8549563009982794,0.8385542095072176,0.8388913913494812
1,-1,0.8645114658595832,0.860518712305324,0.8489194350105598,0.8550638501344121,0.8481446637438415,0.8541834331582855,0.8351583718649971,0.836709131838303
2,-1,0.867030445948165,0.8634980335095029,0.8506727168486387,0.8564475574925573,0.8501374118495724,0.8558676389439143,0.8365838992705756,0.8392035046976417
3,-1,0.8661164372932091,0.8624662517807953,0.8493156802662722,0.8550037227391485,0.848672351002434,0.854242569948961,0.8308664780710621,0.8349043949768572
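
The per-epoch rows above can be compared programmatically. A small sketch, assuming the goal is to pick the epoch with the highest `cosine_spearman`; the CSV is inlined because the file name is not shown in this view, and the variable names are illustrative:

```python
import csv
import io

# Rows copied verbatim from the hunk above.
CSV_TEXT = """
epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
0,-1,0.8652665968345811,0.8603021160706319,0.8497111617616812,0.8557700577868964,0.8491194556237853,0.8549563009982794,0.8385542095072176,0.8388913913494812
1,-1,0.8645114658595832,0.860518712305324,0.8489194350105598,0.8550638501344121,0.8481446637438415,0.8541834331582855,0.8351583718649971,0.836709131838303
2,-1,0.867030445948165,0.8634980335095029,0.8506727168486387,0.8564475574925573,0.8501374118495724,0.8558676389439143,0.8365838992705756,0.8392035046976417
3,-1,0.8661164372932091,0.8624662517807953,0.8493156802662722,0.8550037227391485,0.848672351002434,0.854242569948961,0.8308664780710621,0.8349043949768572
""".strip()

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Select the row with the highest cosine Spearman correlation.
best = max(rows, key=lambda row: float(row["cosine_spearman"]))
best_epoch = int(best["epoch"])
best_score = float(best["cosine_spearman"])
```

The differences between epochs are small (all cosine Spearman values fall between roughly 0.860 and 0.864), so the choice of metric for model selection matters less here than it might in a longer training run.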

14
modules.json Normal file

@@ -0,0 +1,14 @@
[
{
"idx": 0,
"name": "0",
"path": "",
"type": "sentence_transformers.models.Transformer"
},
{
"idx": 1,
"name": "1",
"path": "1_Pooling",
"type": "sentence_transformers.models.Pooling"
}
]

3
pytorch_model.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0f7593234a63463168275846bce5bfe5989641084ab092f4867a8d48a9fc1337
size 538972985
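
The three lines above are a Git LFS pointer, not the weights themselves: they record the spec version, the SHA-256 of the real file, and its size in bytes. A minimal parsing sketch, assuming the simple `key value` layout shown here (the function `parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling):

```python
# Pointer text copied from the hunk above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:0f7593234a63463168275846bce5bfe5989641084ab092f4867a8d48a9fc1337
size 538972985
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line and break the oid into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
size_mib = pointer["size"] / (1024 * 1024)  # roughly 514 MiB of model weights
```

After downloading the actual file, its SHA-256 can be compared against `pointer["digest"]` to confirm the download is intact.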


@@ -0,0 +1,4 @@
{
"max_seq_length": 512,
"do_lower_case": false
}


@@ -0,0 +1,2 @@
epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
-1,-1,0.8273478826973636,0.8215128959042091,0.8137184308367987,0.8197566352546802,0.8138222620547743,0.8195473589233008,0.7755088639539296,0.7692474708707059

1
special_tokens_map.json Normal file

@@ -0,0 +1 @@
{"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}

1
tokenizer.json Normal file

File diff suppressed because one or more lines are too long

1
tokenizer_config.json Normal file

@@ -0,0 +1 @@
{"do_lower_case": false, "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "tokenize_chinese_chars": true, "strip_accents": null, "max_len": 512, "special_tokens_map_file": "old_models/distiluse-base-multilingual-cased-v1/0_Transformer/special_tokens_map.json", "name_or_path": "sentence-transformers/distiluse-base-multilingual-cased-v1", "do_basic_tokenize": true, "never_split": null, "tokenizer_class": "DistilBertTokenizer"}

119547
vocab.txt Normal file

File diff suppressed because it is too large