---
library_name: transformers
tags:
- generated_from_trainer
model-index:
- name: cds_sh1_fr_30
  results: []
---

# cds_sh1_fr_30

This model is a fine-tuned version of an unspecified base model on an unknown dataset.
It achieves the following results on the evaluation set after the final epoch (epoch 20):
- Loss: 3.3099

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 256
- eval_batch_size: 256
- seed: 30
- optimizer: adamw_torch_fused with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 20
- mixed_precision_training: Native AMP
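
The `linear` schedule with 500 warmup steps can be illustrated in plain Python. This is a minimal sketch, assuming the usual shape of linear warmup from 0 to the peak learning rate followed by linear decay to 0, and assuming 2900 total optimizer steps (the last step listed in the training results). The helper name `linear_lr` is hypothetical and is not taken from the training code.

```python
def linear_lr(step, base_lr=1e-4, warmup_steps=500, total_steps=2900):
    """Learning rate at a given optimizer step (hypothetical helper).

    Assumes linear warmup from 0 to base_lr over warmup_steps,
    then linear decay to 0 at total_steps. total_steps=2900 is an
    assumption taken from the last row of the training results.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at a few representative steps.
    for step in (0, 250, 500, 1700, 2900):
        print(step, linear_lr(step))
```

Under these assumptions, warmup spans roughly the first 3.4 epochs (500 warmup steps at 145 steps per epoch), so the peak learning rate of 0.0001 is only reached during epoch 4.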

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 6.0017        | 1.0   | 145  | 4.5165          |
| 4.1747        | 2.0   | 290  | 4.0281          |
| 3.8777        | 3.0   | 435  | 3.7704          |
| 3.6723        | 4.0   | 580  | 3.6043          |
| 3.5366        | 5.0   | 725  | 3.5114          |
| 3.4452        | 6.0   | 870  | 3.4480          |
| 3.3743        | 7.0   | 1015 | 3.4009          |
| 3.3153        | 8.0   | 1160 | 3.3667          |
| 3.2657        | 9.0   | 1305 | 3.3414          |
| 3.2193        | 10.0  | 1450 | 3.3189          |
| 3.1773        | 11.0  | 1595 | 3.3084          |
| 3.1366        | 12.0  | 1740 | 3.2952          |
| 3.0985        | 13.0  | 1885 | 3.2936          |
| 3.0610        | 14.0  | 2030 | 3.2873          |
| 3.0252        | 15.0  | 2175 | 3.2956          |
| 2.9897        | 16.0  | 2320 | 3.2938          |
| 2.9574        | 17.0  | 2465 | 3.2979          |
| 2.9302        | 18.0  | 2610 | 3.3015          |
| 2.9063        | 19.0  | 2755 | 3.3071          |
| 2.8895        | 20.0  | 2900 | 3.3099          |
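
Validation loss reaches its minimum at epoch 14 and drifts upward afterwards while training loss keeps falling, which may indicate mild overfitting in the later epochs. The sketch below locates the best epoch from the table and converts a loss to perplexity. It assumes the reported loss is a mean cross-entropy in nats, which the card does not state; the names `VAL_LOSS`, `best_epoch`, and `perplexity` are hypothetical.

```python
import math

# (epoch, validation loss) pairs copied from the training results table.
VAL_LOSS = [
    (1, 4.5165), (2, 4.0281), (3, 3.7704), (4, 3.6043), (5, 3.5114),
    (6, 3.4480), (7, 3.4009), (8, 3.3667), (9, 3.3414), (10, 3.3189),
    (11, 3.3084), (12, 3.2952), (13, 3.2936), (14, 3.2873), (15, 3.2956),
    (16, 3.2938), (17, 3.2979), (18, 3.3015), (19, 3.3071), (20, 3.3099),
]


def best_epoch(history):
    """Return the (epoch, loss) pair with the lowest validation loss."""
    return min(history, key=lambda row: row[1])


def perplexity(loss):
    """Convert a loss to perplexity, assuming mean cross-entropy in nats."""
    return math.exp(loss)


if __name__ == "__main__":
    epoch, loss = best_epoch(VAL_LOSS)
    print(f"best epoch: {epoch}, loss: {loss}, perplexity: {perplexity(loss):.2f}")
    final_loss = VAL_LOSS[-1][1]
    print(f"final epoch loss: {final_loss}, perplexity: {perplexity(final_loss):.2f}")
```

If intermediate checkpoints were saved, the epoch 14 checkpoint would be the natural candidate for evaluation; the card does not say which checkpoint was published.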


### Framework versions

- Transformers 4.56.1
- Pytorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.0
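
To check whether a local environment matches the versions listed above, a small standard-library script can compare installed package versions against them. This is a sketch only: the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names, and the helpers `installed_versions` and `mismatches` are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this model card, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.56.1",
    "torch": "2.8.0+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.22.0",
}


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def mismatches(expected, found):
    """Return {name: (expected, found)} for every package that differs."""
    return {
        name: (want, found.get(name))
        for name, want in expected.items()
        if found.get(name) != want
    }


if __name__ == "__main__":
    diff = mismatches(EXPECTED, installed_versions(EXPECTED))
    for name, (want, have) in diff.items():
        print(f"{name}: expected {want}, found {have}")
```

An exact string match is deliberately strict; a different CUDA build tag on the PyTorch version (for example `+cu126` instead of `+cu128`) will be reported as a mismatch even if the results are otherwise reproducible.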

Initial project import; model provided by the ModelHub XC community. Model: fpadovani/cds_sh1_fr_30. Source: Original Platform. 2026-06-04 10:55:18 +08:00