---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:4992
- loss:MultipleNegativesRankingLoss
base_model: google/embeddinggemma-300m
widget:
- source_sentence: Client onboarding, implementation, project management, communication,
    Salesforce, G-suite, Asana, Single Sign-On (SSO), SFTP, data analysis
  sentences:
  - A software engineer uses Python and GitHub to automate testing processes.
  - Setting up clients in Salesforce and G-suite efficiently requires strong project
    management and clear communication.
  - Choosing between cloud storage solutions like Dropbox and Google Drive can be
    challenging.
- source_sentence: SQL, Excel, stakeholder management, product management
  sentences:
  - FastAPI and Flask both enable developers to build robust RESTful APIs efficiently.
  - Data analysis using SQL and Excel for stakeholder updates in product management.
  - Project scheduling and Gantt charts for timeline tracking.
- source_sentence: Power Platform,Robotic Process Automation,Power Automate Cloud
    & Desktop,Automation Anywhere,PL900,SAP ECC,Generative AI,Power BI
  sentences:
  - SAP ECC and PL900 are essential for financial management systems.
  - Automation Anywhere, Power Platform, and Power Automate Cloud & Desktop are key
    tools for streamlining business processes.
  - Guidewire uses a test automation framework to ensure continuous integration and
    security testing.
- source_sentence: Critical Care,ICU
  sentences:
  - Guiding students through Java programming basics is crucial for their computer
    engineering education.
  - Intensive Care, ICU unit
  - Pediatric Clinic, outpatient
- source_sentence: successfactors,algorithms,sap,data analysis,natural language processing,software
    testing,neural networks,development methodologies
  sentences:
  - successfactors offers travel packages and vacation deals through its partnership
    with various hotels.
  - Azure Data Lake and Cosmos DB are key components of the Microsoft data ecosystem.
  - successfactors uses advanced algorithms to enhance sap software testing and improve
    data analysis accuracy.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on google/embeddinggemma-300m

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [google/embeddinggemma-300m](https://huggingface.co/google/embeddinggemma-300m) on the csv dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [google/embeddinggemma-300m](https://huggingface.co/google/embeddinggemma-300m) <!-- at revision 57c266a740f537b4dc058e1b0cda161fd15afa75 -->
- **Maximum Sequence Length:** 2048 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - csv
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
```
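
The data flow through modules (1) to (4) can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function names, the random weights, and the tiny dimensions (4 and 16 standing in for 768 and 3072) are all assumptions made for readability, and the transformer in module (0) is replaced by random token vectors.

```python
import math
import random


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def dense(vector, weight):
    """Bias-free linear layer with identity activation: weight @ vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weight]


def l2_normalize(vector):
    """Scale the vector to unit length, as the Normalize module does."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def embed(token_embeddings, attention_mask, w_up, w_down):
    """Pooling -> Dense (up) -> Dense (down) -> Normalize."""
    pooled = mean_pool(token_embeddings, attention_mask)
    return l2_normalize(dense(dense(pooled, w_up), w_down))


rng = random.Random(0)
HIDDEN, WIDE = 4, 16  # hypothetical stand-ins for 768 and 3072
w_up = [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(WIDE)]
w_down = [[rng.uniform(-1, 1) for _ in range(WIDE)] for _ in range(HIDDEN)]
tokens = [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(5)]
mask = [1, 1, 1, 0, 0]  # the last two positions are padding

embedding = embed(tokens, mask, w_up, w_down)
print(len(embedding), round(sum(x * x for x in embedding), 6))
```

Because the final step normalizes to unit length, the cosine similarity between two embeddings reduces to their dot product.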

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("AROY76/Embedding-gemma-300M-skills")
# Run inference
queries = [
    "successfactors,algorithms,sap,data analysis,natural language processing,software testing,neural networks,development methodologies",
]
documents = [
    'successfactors uses advanced algorithms to enhance sap software testing and improve data analysis accuracy.',
    'successfactors offers travel packages and vacation deals through its partnership with various hotels.',
    'Azure Data Lake and Cosmos DB are key components of the Microsoft data ecosystem.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.7328, 0.0418, 0.0872]])
```
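
To turn a row of similarity scores into a ranked list, sort the documents by score. The sketch below is a minimal illustration: `rank_documents` is a hypothetical helper, not part of Sentence Transformers, and the scores are copied from the example output above so the snippet runs without downloading the model.

```python
def rank_documents(documents, scores, top_k=None):
    """Return (document, score) pairs sorted from most to least similar."""
    ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


documents = [
    "successfactors uses advanced algorithms to enhance sap software testing and improve data analysis accuracy.",
    "successfactors offers travel packages and vacation deals through its partnership with various hotels.",
    "Azure Data Lake and Cosmos DB are key components of the Microsoft data ecosystem.",
]
# Scores for the single query, copied from the printed tensor above.
scores = [0.7328, 0.0418, 0.0872]

ranked = rank_documents(documents, scores)
for position, (document, score) in enumerate(ranked, start=1):
    print(f"{position}. {score:.4f}  {document}")
```

With a real model, `similarities[0].tolist()` would supply the `scores` list for the first query.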

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### csv

* Dataset: csv
* Size: 4,992 training samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor                                                                             | positive                                                                         | negative                                                                         |
  |:--------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
  | type    | string                                                                             | string                                                                           | string                                                                           |
  | details | <ul><li>min: 3 tokens</li><li>mean: 32.16 tokens</li><li>max: 121 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 18.5 tokens</li><li>max: 97 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 15.5 tokens</li><li>max: 76 tokens</li></ul> |
* Samples:
  | anchor                                                                                                                                                                                                                        | positive                                                                                                                                                                                                                                                               | negative                                                                                                                                                                                                                                                                   |
  |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>Statistical analysis, SQL, Scripting (Ruby, Python, etc.), Version control (git), Web design/UX, Monte Carlo simulations, RoR, Front-end JS, Growth hacking, Airflow, Pandas</code>                                     | <code>Data analysis, databases, programming languages like Ruby or Python, software versioning, user interface design, probability modeling, Ruby on Rails, JavaScript for interfaces, customer growth strategies, workflow automation, data manipulation tools</code> | <code>Cloud storage, hardware configuration, network security, project management methodologies, graphic design software, database normalization techniques, agile development practices, server administration, marketing analytics, containerization technologies</code> |
  | <code>Graphic Design, digital design, print design, web design, environmental/experiential design, interaction design, brand design, visual design, communication, user research, illustration, digital design systems</code> | <code>Visual design, graphic design, communication, user research, illustration, digital design systems, web design, brand design, interaction design, print design, environmental/experiential design</code>                                                          | <code>Project management, software development, networking, cybersecurity, database administration, IT infrastructure, agile methodologies, cloud computing, hardware engineering, quality assurance</code>                                                                |
  | <code>problem solving, customer support, writing, grammar</code>                                                                                                                                                              | <code>improving writing skills to enhance clarity and grammar in customer support communications</code>                                                                                                                                                                | <code>designing website layouts for better user experience</code>                                                                                                                                                                                                          |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```
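
As a rough illustration of what this loss computes with `scale = 20.0` and cosine similarity, the sketch below scores each anchor against every positive and negative in the batch and applies cross-entropy with the anchor's own positive as the target. It is a plain-Python approximation under stated assumptions, not the library's code: `cos_sim`, `mnr_loss`, and the two-dimensional toy vectors are hypothetical.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i must pick positive i among all candidates."""
    # Every positive and every explicit negative in the batch is a candidate,
    # so the other anchors' positives act as in-batch negatives.
    candidates = positives + negatives
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the maximum for a stable log-sum-exp
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_norm - logits[i]
    return total / len(anchors)


# Toy batch of two triplets (made-up vectors, for illustration only).
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[-1.0, 0.2], [0.3, -1.0]]

aligned = mnr_loss(anchors, positives, negatives)
swapped = mnr_loss(anchors, positives[::-1], negatives)  # positives mismatched
print(aligned < swapped)
```

The loss is near zero when each anchor is closest to its own positive and grows sharply when the pairing is wrong, which is why a larger batch (more in-batch negatives) makes the task harder.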

### Training Hyperparameters
#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 2
- `warmup_ratio`: 0.1
- `prompts`: task: sentence similarity | query: 

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: task: sentence similarity | query: 
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>

### Framework Versions
- Python: 3.12.12
- Sentence Transformers: 5.2.0
- Transformers: 4.57.0.dev0
- PyTorch: 2.9.0+cu126
- Accelerate: 1.12.0
- Datasets: 4.0.0
- Tokenizers: 0.22.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->