Initialize project; model provided by the ModelHub XC community

Model: eduardofv/stsb-m-mt-es-distiluse-base-multilingual-cased-v1 Source: Original Platform
2026-05-13 17:07:32 +08:00
commit cace405113
14 changed files with 119690 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,16 @@
 *.bin.* filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tar.gz filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
--- a/1_Pooling/config.json
+++ b/1_Pooling/config.json
@@ -0,0 +1,7 @@
 {
  "word_embedding_dimension": 768,
  "pooling_mode_cls_token": false,
  "pooling_mode_mean_tokens": true,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false
 }
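With `pooling_mode_mean_tokens` set to `true`, the sentence embedding is the average of the token embeddings. The following is a minimal pure-Python sketch of that idea, not the library's own implementation. The function name `mean_pool`, the `attention_mask` argument, and the toy 3-dimensional vectors are assumptions for illustration (the config above uses 768 dimensions).

```python
# Minimal sketch of mean pooling over token embeddings (pure Python).
# Assumption: `token_embeddings` is a list of per-token vectors and
# `attention_mask` marks real tokens with 1 and padding with 0.

def mean_pool(token_embeddings, attention_mask):
    """Average the vectors of non-padding tokens into one sentence vector."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in summed]


# Toy example with 3-dimensional vectors instead of the 768 in the config.
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]  # the last position is padding and is ignored
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 3.0, 4.0]
```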
--- a/README.md
+++ b/README.md
@@ -0,0 +1,59 @@
 ---
 language: es
 datasets:
 - stsb_multi_mt
 tags:
 - sentence-similarity
 - sentence-transformers
 ---
 This is a test model that was fine-tuned using the Spanish datasets from [stsb_multi_mt](https://huggingface.co/datasets/stsb_multi_mt) in order to understand and benchmark STS models.
 ## Model and training data description
 This model was built by taking `distiluse-base-multilingual-cased-v1` and training it on a Semantic Textual Similarity task using a modified version of the STS training script from Sentence Transformers (the modified script is included in the repo). It was trained using the Spanish datasets from [stsb_multi_mt](https://huggingface.co/datasets/stsb_multi_mt), which are the STSBenchmark datasets automatically translated to other languages using deepl.com. Refer to the dataset repository for more details.
 ## Intended uses & limitations
 This model was built only as a proof of concept for STS fine-tuning on Spanish data. It has no intended use other than giving a sense of how this training works.
 ## How to use
 You may use it like any other STS-trained model to extract sentence embeddings. Check the Sentence Transformers documentation.
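As a rough sketch of extracting embeddings and comparing two sentences, assuming the `sentence-transformers` package is installed and the model files can be resolved under the identifier from the commit header: the helper names `cosine_similarity`, `embed`, and `compare` are hypothetical and not part of this repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embed(sentences, model_id="eduardofv/stsb-m-mt-es-distiluse-base-multilingual-cased-v1"):
    """Encode sentences; needs sentence-transformers and access to the model files."""
    # Imported lazily so cosine_similarity stays usable without the library.
    from sentence_transformers import SentenceTransformer

    # Assumption: model_id resolves to this repository's files (hub or local path).
    model = SentenceTransformer(model_id)
    return model.encode(sentences)  # one embedding per input sentence


def compare(sentence_a, sentence_b):
    """Similarity of two sentences; higher means more similar."""
    vec_a, vec_b = embed([sentence_a, sentence_b])
    return cosine_similarity(vec_a.tolist(), vec_b.tolist())
```

Calling `compare("El gato duerme.", "Un gato está durmiendo.")` would download or load the model and return a single similarity score.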
 ## Training procedure
 This model was trained using this [Colab Notebook](https://colab.research.google.com/drive/1ZNjDMFdy_lKhnD9BtbqzSbQ4LNz638ZA?usp=sharing)
 ## Evaluation results
 Evaluating `distiluse-base-multilingual-cased-v1` on the Spanish test dataset before training results in:
 ```
 2021-07-06 17:44:46 - EmbeddingSimilarityEvaluator: Evaluating the model on  dataset:
 2021-07-06 17:45:00 - Cosine-Similarity :	Pearson: 0.7662	Spearman: 0.7583
 2021-07-06 17:45:00 - Manhattan-Distance:	Pearson: 0.7805	Spearman: 0.7772
 2021-07-06 17:45:00 - Euclidean-Distance:	Pearson: 0.7816	Spearman: 0.7778
 2021-07-06 17:45:00 - Dot-Product-Similarity:	Pearson: 0.6610	Spearman: 0.6536
 ```
 Fine-tuning with the defaults of the training script and the Spanish training dataset results in:
 ```
 2021-07-06 17:49:22 - EmbeddingSimilarityEvaluator: Evaluating the model on stsb-multi-mt-test dataset:
 2021-07-06 17:49:24 - Cosine-Similarity :	Pearson: 0.8265	Spearman: 0.8207
 2021-07-06 17:49:24 - Manhattan-Distance:	Pearson: 0.8131	Spearman: 0.8190
 2021-07-06 17:49:24 - Euclidean-Distance:	Pearson: 0.8129	Spearman: 0.8190
 2021-07-06 17:49:24 - Dot-Product-Similarity:	Pearson: 0.7773	Spearman: 0.7692
 ```
 In our [STS Evaluation repository](https://github.com/eduardofv/sts_eval) we compare the performance of this model with other models from Sentence Transformers and TensorFlow Hub using the standard STSBenchmark and the 2017 STSBenchmark Task 3 for Spanish.
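The Pearson and Spearman figures in the logs above are correlations between predicted similarity scores and gold annotations. The sketch below shows how such correlations can be computed in pure Python. It is not the evaluator's own code, the function names are hypothetical, and the sample scores are made up for illustration rather than taken from the evaluation.

```python
import math


def pearson(xs, ys):
    """Linear correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over all items tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            result[order[position]] = average_rank
        start = end + 1
    return result


def spearman(xs, ys):
    """Rank correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical scores, not taken from the evaluation logs.
gold = [0.0, 1.0, 2.0, 3.0]
predicted = [0.1, 0.2, 0.4, 0.9]
print(round(spearman(gold, predicted), 4))  # 1.0
```

Spearman only depends on the ordering of the scores, which is why a model can reach a perfect rank correlation on this toy data even though the predicted values are not linearly related to the gold ones.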
 ## Resources
 - Training dataset [stsb_multi_mt](https://huggingface.co/datasets/stsb_multi_mt)
 - Sentence Transformers [Semantic Textual Similarity](https://www.sbert.net/examples/training/sts/README.html)
 - Check [sts_eval](https://github.com/eduardofv/sts_eval) for a comparison with TensorFlow and Sentence Transformers models
 - Check the [development environment to run the scripts and evaluation](https://github.com/eduardofv/ai-denv)
--- a/config.json
+++ b/config.json
@@ -0,0 +1,23 @@
 {
  "_name_or_path": "sentence-transformers/distiluse-base-multilingual-cased-v1",
  "activation": "gelu",
  "architectures": [
    "DistilBertModel"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.8.2",
  "vocab_size": 119547
 }
--- a/config_sentence_transformers.json
+++ b/config_sentence_transformers.json
@@ -0,0 +1,7 @@
 {
  "__version__": {
    "sentence_transformers": "2.0.0",
    "transformers": "4.8.2",
    "pytorch": "1.9.0+cu102"
  }
 }
--- a/eval/similarity_evaluation_sts-dev_results.csv
+++ b/eval/similarity_evaluation_sts-dev_results.csv
@@ -0,0 +1,5 @@
 epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
 0,-1,0.8652665968345811,0.8603021160706319,0.8497111617616812,0.8557700577868964,0.8491194556237853,0.8549563009982794,0.8385542095072176,0.8388913913494812
 1,-1,0.8645114658595832,0.860518712305324,0.8489194350105598,0.8550638501344121,0.8481446637438415,0.8541834331582855,0.8351583718649971,0.836709131838303
 2,-1,0.867030445948165,0.8634980335095029,0.8506727168486387,0.8564475574925573,0.8501374118495724,0.8558676389439143,0.8365838992705756,0.8392035046976417
 3,-1,0.8661164372932091,0.8624662517807953,0.8493156802662722,0.8550037227391485,0.848672351002434,0.854242569948961,0.8308664780710621,0.8349043949768572
--- a/modules.json
+++ b/modules.json
@@ -0,0 +1,14 @@
 [
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  }
 ]
--- a/pytorch_model.bin
+++ b/pytorch_model.bin
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:0f7593234a63463168275846bce5bfe5989641084ab092f4867a8d48a9fc1337
 size 538972985
--- a/sentence_bert_config.json
+++ b/sentence_bert_config.json
@@ -0,0 +1,4 @@
 {
  "max_seq_length": 512,
  "do_lower_case": false
 }
--- a/similarity_evaluation_stsb-multi-mt-test_results.csv
+++ b/similarity_evaluation_stsb-multi-mt-test_results.csv
@@ -0,0 +1,2 @@
 epoch,steps,cosine_pearson,cosine_spearman,euclidean_pearson,euclidean_spearman,manhattan_pearson,manhattan_spearman,dot_pearson,dot_spearman
 -1,-1,0.8273478826973636,0.8215128959042091,0.8137184308367987,0.8197566352546802,0.8138222620547743,0.8195473589233008,0.7755088639539296,0.7692474708707059
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1 @@
 {"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
--- a/tokenizer.json
+++ b/tokenizer.json
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1 @@
 {"do_lower_case": false, "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "tokenize_chinese_chars": true, "strip_accents": null, "max_len": 512, "special_tokens_map_file": "old_models/distiluse-base-multilingual-cased-v1/0_Transformer/special_tokens_map.json", "name_or_path": "sentence-transformers/distiluse-base-multilingual-cased-v1", "do_basic_tokenize": true, "never_split": null, "tokenizer_class": "DistilBertTokenizer"}
--- a/vocab.txt
+++ b/vocab.txt