Initialize project; model provided by the ModelHub XC community

Model: donoway/ARC-Easy_Llama-3.2-1B-oqrx1b71
Source: Original Platform
ModelHub XC
2026-05-25 22:34:24 +08:00
commit 540848bd33
20 changed files with 5563 additions and 0 deletions

37
.gitattributes vendored Normal file

@@ -0,0 +1,37 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
checkpoint-21/tokenizer.json filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

145
README.md Normal file

@@ -0,0 +1,145 @@
---
library_name: transformers
license: llama3.2
base_model: meta-llama/Llama-3.2-1B
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: ARC-Easy_Llama-3.2-1B-oqrx1b71
results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# ARC-Easy_Llama-3.2-1B-oqrx1b71
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on a dataset that the Trainer did not record; the model name suggests ARC-Easy, but this card does not confirm it.
It achieves the following results on the evaluation set (the values match the step 21 row of the training results table):
- Loss: 3.7385
- Model Preparation Time: 0.0069
- Mdl: 3074.3332
- Accumulated Loss: 2130.9654
- Correct Preds: 371.0
- Total Preds: 570.0
- Accuracy: 0.6509
- Correct Gen Preds: 326.0
- Gen Accuracy: 0.5719
- Correct Gen Preds 32: 79.0
- Correct Preds 32: 103.0
- Total Labels 32: 158.0
- Accuracy 32: 0.6519
- Gen Accuracy 32: 0.5
- Correct Gen Preds 33: 110.0
- Correct Preds 33: 113.0
- Total Labels 33: 152.0
- Accuracy 33: 0.7434
- Gen Accuracy 33: 0.7237
- Correct Gen Preds 34: 91.0
- Correct Preds 34: 99.0
- Total Labels 34: 142.0
- Accuracy 34: 0.6972
- Gen Accuracy 34: 0.6408
- Correct Gen Preds 35: 46.0
- Correct Preds 35: 56.0
- Total Labels 35: 118.0
- Accuracy 35: 0.4746
- Gen Accuracy 35: 0.3898
- Correct Gen Preds 36: 0.0
- Correct Preds 36: 0.0
- Total Labels 36: 0.0
- Accuracy 36: 0.0
- Gen Accuracy 36: 0.0
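The summary figures above can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check, not part of the training code. It assumes that the accumulated loss is the mean loss summed over all evaluation examples (in nats) and that Mdl is the same quantity expressed in bits; the card does not state either relationship, but the reported numbers are consistent with both. The suffixes 32 to 36 are treated as opaque label identifiers.

```python
import math

# Values copied from the evaluation summary above.
loss = 3.7385
accumulated_loss = 2130.9654
mdl = 3074.3332
correct_preds = 371
correct_gen_preds = 326
total_preds = 570

# Per-label counts: label -> (correct gen preds, correct preds, total labels).
per_label = {
    32: (79, 103, 158),
    33: (110, 113, 152),
    34: (91, 99, 142),
    35: (46, 56, 118),
    36: (0, 0, 0),
}

accuracy = correct_preds / total_preds
gen_accuracy = correct_gen_preds / total_preds

# Assumption: accumulated loss = mean loss * number of examples (nats),
# and Mdl = accumulated loss converted from nats to bits.
loss_from_accumulated = accumulated_loss / total_preds
mdl_from_accumulated = accumulated_loss / math.log(2)

# The per-label counts should add up to the overall totals.
sum_gen = sum(v[0] for v in per_label.values())
sum_correct = sum(v[1] for v in per_label.values())
sum_total = sum(v[2] for v in per_label.values())

print(round(accuracy, 4), round(gen_accuracy, 4))
print(round(loss_from_accumulated, 4), round(mdl_from_accumulated, 2))
print(sum_gen, sum_correct, sum_total)
```

Label 36 has no examples in the evaluation set, so its accuracy of 0.0 reflects an empty class rather than a failure on that class.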
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 112
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
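To make the `cosine` scheduler and the 0.01 warmup ratio concrete, here is a minimal pure-Python sketch of a linear-warmup, cosine-decay learning-rate curve. It is an illustration under stated assumptions, not the Trainer's own code: the function name `cosine_lr` is hypothetical, and the total of 100 optimizer steps is assumed from `num_epochs: 100` combined with the one-step-per-epoch pattern visible in the training results.

```python
import math


def cosine_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.01):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumed total number of optimizer steps (see the lead-in).
TOTAL_STEPS = 100

for s in (0, 1, 25, 50, 100):
    print(s, cosine_lr(s, TOTAL_STEPS))
```

With these assumptions the warmup lasts a single step, so the peak rate of 2e-05 is reached at step 1 and decays from there.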
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 32 | Correct Preds 32 | Total Labels 32 | Accuracy 32 | Gen Accuracy 32 | Correct Gen Preds 33 | Correct Preds 33 | Total Labels 33 | Accuracy 33 | Gen Accuracy 33 | Correct Gen Preds 34 | Correct Preds 34 | Total Labels 34 | Accuracy 34 | Gen Accuracy 34 | Correct Gen Preds 35 | Correct Preds 35 | Total Labels 35 | Accuracy 35 | Gen Accuracy 35 | Correct Gen Preds 36 | Correct Preds 36 | Total Labels 36 | Accuracy 36 | Gen Accuracy 36 |
|:-------------:|:-----:|:----:|:---------------:|:----------------------:|:---------:|:----------------:|:-------------:|:-----------:|:--------:|:-----------------:|:------------:|:--------------------:|:----------------:|:---------------:|:-----------:|:---------------:|:--------------------:|:----------------:|:---------------:|:-----------:|:---------------:|:--------------------:|:----------------:|:---------------:|:-----------:|:---------------:|:--------------------:|:----------------:|:---------------:|:-----------:|:---------------:|:--------------------:|:----------------:|:---------------:|:-----------:|:---------------:|
| No log | 0 | 0 | 1.5354 | 0.0069 | 1262.6022 | 875.1692 | 172.0 | 570.0 | 0.3018 | 170.0 | 0.2982 | 154.0 | 154.0 | 158.0 | 0.9747 | 0.9747 | 0.0 | 0.0 | 152.0 | 0.0 | 0.0 | 15.0 | 17.0 | 142.0 | 0.1197 | 0.1056 | 1.0 | 1.0 | 118.0 | 0.0085 | 0.0085 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1.4592 | 1.0 | 1 | 1.5354 | 0.0069 | 1262.6022 | 875.1692 | 172.0 | 570.0 | 0.3018 | 170.0 | 0.2982 | 154.0 | 154.0 | 158.0 | 0.9747 | 0.9747 | 0.0 | 0.0 | 152.0 | 0.0 | 0.0 | 15.0 | 17.0 | 142.0 | 0.1197 | 0.1056 | 1.0 | 1.0 | 118.0 | 0.0085 | 0.0085 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1.4587 | 2.0 | 2 | 2.8572 | 0.0069 | 2349.6051 | 1628.6221 | 188.0 | 570.0 | 0.3298 | 188.0 | 0.3298 | 0.0 | 0.0 | 158.0 | 0.0 | 0.0 | 46.0 | 46.0 | 152.0 | 0.3026 | 0.3026 | 141.0 | 141.0 | 142.0 | 0.9930 | 0.9930 | 1.0 | 1.0 | 118.0 | 0.0085 | 0.0085 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1.7615 | 3.0 | 3 | 1.5150 | 0.0069 | 1245.8152 | 863.5333 | 173.0 | 570.0 | 0.3035 | 173.0 | 0.3035 | 0.0 | 0.0 | 158.0 | 0.0 | 0.0 | 151.0 | 151.0 | 152.0 | 0.9934 | 0.9934 | 8.0 | 8.0 | 142.0 | 0.0563 | 0.0563 | 14.0 | 14.0 | 118.0 | 0.1186 | 0.1186 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.683 | 4.0 | 4 | 1.3806 | 0.0069 | 1135.3136 | 786.9394 | 307.0 | 570.0 | 0.5386 | 254.0 | 0.4456 | 39.0 | 59.0 | 158.0 | 0.3734 | 0.2468 | 107.0 | 126.0 | 152.0 | 0.8289 | 0.7039 | 76.0 | 84.0 | 142.0 | 0.5915 | 0.5352 | 32.0 | 38.0 | 118.0 | 0.3220 | 0.2712 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0591 | 5.0 | 5 | 1.8312 | 0.0069 | 1505.8471 | 1043.7737 | 356.0 | 570.0 | 0.6246 | 262.0 | 0.4596 | 53.0 | 98.0 | 158.0 | 0.6203 | 0.3354 | 98.0 | 115.0 | 152.0 | 0.7566 | 0.6447 | 69.0 | 92.0 | 142.0 | 0.6479 | 0.4859 | 42.0 | 51.0 | 118.0 | 0.4322 | 0.3559 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0003 | 6.0 | 6 | 2.3233 | 0.0069 | 1910.5098 | 1324.2645 | 353.0 | 570.0 | 0.6193 | 288.0 | 0.5053 | 64.0 | 97.0 | 158.0 | 0.6139 | 0.4051 | 103.0 | 113.0 | 152.0 | 0.7434 | 0.6776 | 75.0 | 90.0 | 142.0 | 0.6338 | 0.5282 | 46.0 | 53.0 | 118.0 | 0.4492 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 7.0 | 7 | 2.6634 | 0.0069 | 2190.2029 | 1518.1330 | 366.0 | 570.0 | 0.6421 | 306.0 | 0.5368 | 66.0 | 101.0 | 158.0 | 0.6392 | 0.4177 | 108.0 | 115.0 | 152.0 | 0.7566 | 0.7105 | 81.0 | 95.0 | 142.0 | 0.6690 | 0.5704 | 51.0 | 55.0 | 118.0 | 0.4661 | 0.4322 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 8.0 | 8 | 2.9283 | 0.0069 | 2408.0786 | 1669.1529 | 365.0 | 570.0 | 0.6404 | 313.0 | 0.5491 | 67.0 | 101.0 | 158.0 | 0.6392 | 0.4241 | 109.0 | 113.0 | 152.0 | 0.7434 | 0.7171 | 87.0 | 96.0 | 142.0 | 0.6761 | 0.6127 | 50.0 | 55.0 | 118.0 | 0.4661 | 0.4237 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 9.0 | 9 | 3.1394 | 0.0069 | 2581.6660 | 1789.4745 | 370.0 | 570.0 | 0.6491 | 318.0 | 0.5579 | 70.0 | 104.0 | 158.0 | 0.6582 | 0.4430 | 110.0 | 115.0 | 152.0 | 0.7566 | 0.7237 | 88.0 | 97.0 | 142.0 | 0.6831 | 0.6197 | 50.0 | 54.0 | 118.0 | 0.4576 | 0.4237 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 10.0 | 10 | 3.2952 | 0.0069 | 2709.7573 | 1878.2607 | 368.0 | 570.0 | 0.6456 | 314.0 | 0.5509 | 73.0 | 101.0 | 158.0 | 0.6392 | 0.4620 | 109.0 | 114.0 | 152.0 | 0.75 | 0.7171 | 86.0 | 98.0 | 142.0 | 0.6901 | 0.6056 | 46.0 | 55.0 | 118.0 | 0.4661 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 11.0 | 11 | 3.4102 | 0.0069 | 2804.3076 | 1943.7979 | 366.0 | 570.0 | 0.6421 | 318.0 | 0.5579 | 74.0 | 100.0 | 158.0 | 0.6329 | 0.4684 | 109.0 | 114.0 | 152.0 | 0.75 | 0.7171 | 89.0 | 98.0 | 142.0 | 0.6901 | 0.6268 | 46.0 | 54.0 | 118.0 | 0.4576 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 12.0 | 12 | 3.4933 | 0.0069 | 2872.6570 | 1991.1741 | 366.0 | 570.0 | 0.6421 | 320.0 | 0.5614 | 74.0 | 100.0 | 158.0 | 0.6329 | 0.4684 | 110.0 | 114.0 | 152.0 | 0.75 | 0.7237 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 46.0 | 54.0 | 118.0 | 0.4576 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 13.0 | 13 | 3.5624 | 0.0069 | 2929.5074 | 2030.5798 | 364.0 | 570.0 | 0.6386 | 318.0 | 0.5579 | 75.0 | 100.0 | 158.0 | 0.6329 | 0.4747 | 109.0 | 113.0 | 152.0 | 0.7434 | 0.7171 | 88.0 | 97.0 | 142.0 | 0.6831 | 0.6197 | 46.0 | 54.0 | 118.0 | 0.4576 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 14.0 | 14 | 3.6095 | 0.0069 | 2968.2088 | 2057.4056 | 365.0 | 570.0 | 0.6404 | 318.0 | 0.5579 | 71.0 | 100.0 | 158.0 | 0.6329 | 0.4494 | 111.0 | 114.0 | 152.0 | 0.75 | 0.7303 | 90.0 | 97.0 | 142.0 | 0.6831 | 0.6338 | 46.0 | 54.0 | 118.0 | 0.4576 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 15.0 | 15 | 3.6542 | 0.0069 | 3004.9703 | 2082.8867 | 368.0 | 570.0 | 0.6456 | 319.0 | 0.5596 | 73.0 | 102.0 | 158.0 | 0.6456 | 0.4620 | 110.0 | 113.0 | 152.0 | 0.7434 | 0.7237 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 46.0 | 55.0 | 118.0 | 0.4661 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 16.0 | 16 | 3.6640 | 0.0069 | 3013.0252 | 2088.4699 | 367.0 | 570.0 | 0.6439 | 321.0 | 0.5632 | 73.0 | 101.0 | 158.0 | 0.6392 | 0.4620 | 111.0 | 114.0 | 152.0 | 0.75 | 0.7303 | 91.0 | 98.0 | 142.0 | 0.6901 | 0.6408 | 46.0 | 54.0 | 118.0 | 0.4576 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 17.0 | 17 | 3.6991 | 0.0069 | 3041.8667 | 2108.4613 | 369.0 | 570.0 | 0.6474 | 321.0 | 0.5632 | 76.0 | 103.0 | 158.0 | 0.6519 | 0.4810 | 109.0 | 113.0 | 152.0 | 0.7434 | 0.7171 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 46.0 | 55.0 | 118.0 | 0.4661 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 18.0 | 18 | 3.7206 | 0.0069 | 3059.5550 | 2120.7219 | 370.0 | 570.0 | 0.6491 | 321.0 | 0.5632 | 77.0 | 104.0 | 158.0 | 0.6582 | 0.4873 | 109.0 | 113.0 | 152.0 | 0.7434 | 0.7171 | 89.0 | 97.0 | 142.0 | 0.6831 | 0.6268 | 46.0 | 56.0 | 118.0 | 0.4746 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 19.0 | 19 | 3.7281 | 0.0069 | 3065.7453 | 2125.0127 | 368.0 | 570.0 | 0.6456 | 320.0 | 0.5614 | 74.0 | 102.0 | 158.0 | 0.6456 | 0.4684 | 110.0 | 114.0 | 152.0 | 0.75 | 0.7237 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 46.0 | 54.0 | 118.0 | 0.4576 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 20.0 | 20 | 3.7380 | 0.0069 | 3073.8754 | 2130.6481 | 369.0 | 570.0 | 0.6474 | 321.0 | 0.5632 | 77.0 | 102.0 | 158.0 | 0.6456 | 0.4873 | 109.0 | 114.0 | 152.0 | 0.75 | 0.7171 | 90.0 | 97.0 | 142.0 | 0.6831 | 0.6338 | 45.0 | 56.0 | 118.0 | 0.4746 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 21.0 | 21 | 3.7385 | 0.0069 | 3074.3332 | 2130.9654 | 371.0 | 570.0 | 0.6509 | 326.0 | 0.5719 | 79.0 | 103.0 | 158.0 | 0.6519 | 0.5 | 110.0 | 113.0 | 152.0 | 0.7434 | 0.7237 | 91.0 | 99.0 | 142.0 | 0.6972 | 0.6408 | 46.0 | 56.0 | 118.0 | 0.4746 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 22.0 | 22 | 3.7640 | 0.0069 | 3095.2993 | 2145.4980 | 366.0 | 570.0 | 0.6421 | 319.0 | 0.5596 | 75.0 | 101.0 | 158.0 | 0.6392 | 0.4747 | 108.0 | 113.0 | 152.0 | 0.7434 | 0.7105 | 90.0 | 97.0 | 142.0 | 0.6831 | 0.6338 | 46.0 | 55.0 | 118.0 | 0.4661 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 23.0 | 23 | 3.7726 | 0.0069 | 3102.3483 | 2150.3840 | 367.0 | 570.0 | 0.6439 | 325.0 | 0.5702 | 81.0 | 102.0 | 158.0 | 0.6456 | 0.5127 | 108.0 | 112.0 | 152.0 | 0.7368 | 0.7105 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 46.0 | 55.0 | 118.0 | 0.4661 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 24.0 | 24 | 3.7770 | 0.0069 | 3105.9931 | 2152.9104 | 368.0 | 570.0 | 0.6456 | 321.0 | 0.5632 | 76.0 | 102.0 | 158.0 | 0.6456 | 0.4810 | 109.0 | 112.0 | 152.0 | 0.7368 | 0.7171 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 46.0 | 56.0 | 118.0 | 0.4746 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 25.0 | 25 | 3.7813 | 0.0069 | 3109.4856 | 2155.3312 | 367.0 | 570.0 | 0.6439 | 322.0 | 0.5649 | 78.0 | 103.0 | 158.0 | 0.6519 | 0.4937 | 109.0 | 112.0 | 152.0 | 0.7368 | 0.7171 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 45.0 | 54.0 | 118.0 | 0.4576 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 26.0 | 26 | 3.7764 | 0.0069 | 3105.4752 | 2152.5513 | 367.0 | 570.0 | 0.6439 | 322.0 | 0.5649 | 79.0 | 103.0 | 158.0 | 0.6519 | 0.5 | 109.0 | 112.0 | 152.0 | 0.7368 | 0.7171 | 89.0 | 97.0 | 142.0 | 0.6831 | 0.6268 | 45.0 | 55.0 | 118.0 | 0.4661 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 27.0 | 27 | 3.7828 | 0.0069 | 3110.7617 | 2156.2157 | 365.0 | 570.0 | 0.6404 | 321.0 | 0.5632 | 79.0 | 102.0 | 158.0 | 0.6456 | 0.5 | 108.0 | 112.0 | 152.0 | 0.7368 | 0.7105 | 89.0 | 97.0 | 142.0 | 0.6831 | 0.6268 | 45.0 | 54.0 | 118.0 | 0.4576 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 28.0 | 28 | 3.7887 | 0.0069 | 3115.5792 | 2159.5549 | 367.0 | 570.0 | 0.6439 | 321.0 | 0.5632 | 78.0 | 102.0 | 158.0 | 0.6456 | 0.4937 | 108.0 | 111.0 | 152.0 | 0.7303 | 0.7105 | 90.0 | 99.0 | 142.0 | 0.6972 | 0.6338 | 45.0 | 55.0 | 118.0 | 0.4661 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 29.0 | 29 | 3.7980 | 0.0069 | 3123.2559 | 2164.8760 | 366.0 | 570.0 | 0.6421 | 324.0 | 0.5684 | 79.0 | 102.0 | 158.0 | 0.6456 | 0.5 | 109.0 | 112.0 | 152.0 | 0.7368 | 0.7171 | 91.0 | 98.0 | 142.0 | 0.6901 | 0.6408 | 45.0 | 54.0 | 118.0 | 0.4576 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 30.0 | 30 | 3.7857 | 0.0069 | 3113.0984 | 2157.8354 | 369.0 | 570.0 | 0.6474 | 325.0 | 0.5702 | 79.0 | 102.0 | 158.0 | 0.6456 | 0.5 | 110.0 | 114.0 | 152.0 | 0.75 | 0.7237 | 92.0 | 99.0 | 142.0 | 0.6972 | 0.6479 | 44.0 | 54.0 | 118.0 | 0.4576 | 0.3729 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 31.0 | 31 | 3.8105 | 0.0069 | 3133.4788 | 2171.9620 | 367.0 | 570.0 | 0.6439 | 323.0 | 0.5667 | 78.0 | 103.0 | 158.0 | 0.6519 | 0.4937 | 109.0 | 111.0 | 152.0 | 0.7303 | 0.7171 | 91.0 | 98.0 | 142.0 | 0.6901 | 0.6408 | 45.0 | 55.0 | 118.0 | 0.4661 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 32.0 | 32 | 3.7983 | 0.0069 | 3123.4747 | 2165.0277 | 368.0 | 570.0 | 0.6456 | 323.0 | 0.5667 | 80.0 | 103.0 | 158.0 | 0.6519 | 0.5063 | 109.0 | 112.0 | 152.0 | 0.7368 | 0.7171 | 89.0 | 98.0 | 142.0 | 0.6901 | 0.6268 | 45.0 | 55.0 | 118.0 | 0.4661 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 33.0 | 33 | 3.7838 | 0.0069 | 3111.5678 | 2156.7744 | 368.0 | 570.0 | 0.6456 | 320.0 | 0.5614 | 77.0 | 102.0 | 158.0 | 0.6456 | 0.4873 | 108.0 | 112.0 | 152.0 | 0.7368 | 0.7105 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 45.0 | 56.0 | 118.0 | 0.4746 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 34.0 | 34 | 3.8000 | 0.0069 | 3124.8632 | 2165.9901 | 366.0 | 570.0 | 0.6421 | 320.0 | 0.5614 | 79.0 | 103.0 | 158.0 | 0.6519 | 0.5 | 108.0 | 111.0 | 152.0 | 0.7303 | 0.7105 | 88.0 | 97.0 | 142.0 | 0.6831 | 0.6197 | 45.0 | 55.0 | 118.0 | 0.4661 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 35.0 | 35 | 3.8131 | 0.0069 | 3135.6651 | 2173.4774 | 363.0 | 570.0 | 0.6368 | 320.0 | 0.5614 | 79.0 | 102.0 | 158.0 | 0.6456 | 0.5 | 107.0 | 110.0 | 152.0 | 0.7237 | 0.7039 | 89.0 | 97.0 | 142.0 | 0.6831 | 0.6268 | 45.0 | 54.0 | 118.0 | 0.4576 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 36.0 | 36 | 3.7965 | 0.0069 | 3121.9935 | 2164.0010 | 367.0 | 570.0 | 0.6439 | 322.0 | 0.5649 | 79.0 | 102.0 | 158.0 | 0.6456 | 0.5 | 109.0 | 113.0 | 152.0 | 0.7434 | 0.7171 | 89.0 | 98.0 | 142.0 | 0.6901 | 0.6268 | 45.0 | 54.0 | 118.0 | 0.4576 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 37.0 | 37 | 3.7996 | 0.0069 | 3124.5287 | 2165.7583 | 368.0 | 570.0 | 0.6456 | 322.0 | 0.5649 | 79.0 | 103.0 | 158.0 | 0.6519 | 0.5 | 109.0 | 113.0 | 152.0 | 0.7434 | 0.7171 | 89.0 | 97.0 | 142.0 | 0.6831 | 0.6268 | 45.0 | 55.0 | 118.0 | 0.4661 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 38.0 | 38 | 3.7855 | 0.0069 | 3112.9473 | 2157.7307 | 369.0 | 570.0 | 0.6474 | 324.0 | 0.5684 | 79.0 | 103.0 | 158.0 | 0.6519 | 0.5 | 110.0 | 113.0 | 152.0 | 0.7434 | 0.7237 | 91.0 | 99.0 | 142.0 | 0.6972 | 0.6408 | 44.0 | 54.0 | 118.0 | 0.4576 | 0.3729 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 39.0 | 39 | 3.7976 | 0.0069 | 3122.9250 | 2164.6467 | 366.0 | 570.0 | 0.6421 | 322.0 | 0.5649 | 79.0 | 102.0 | 158.0 | 0.6456 | 0.5 | 108.0 | 111.0 | 152.0 | 0.7303 | 0.7105 | 89.0 | 97.0 | 142.0 | 0.6831 | 0.6268 | 46.0 | 56.0 | 118.0 | 0.4746 | 0.3898 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 40.0 | 40 | 3.7913 | 0.0069 | 3117.7141 | 2161.0348 | 366.0 | 570.0 | 0.6421 | 323.0 | 0.5667 | 79.0 | 102.0 | 158.0 | 0.6456 | 0.5 | 109.0 | 112.0 | 152.0 | 0.7368 | 0.7171 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 45.0 | 54.0 | 118.0 | 0.4576 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 41.0 | 41 | 3.8044 | 0.0069 | 3128.4659 | 2168.4873 | 370.0 | 570.0 | 0.6491 | 321.0 | 0.5632 | 78.0 | 103.0 | 158.0 | 0.6519 | 0.4937 | 107.0 | 112.0 | 152.0 | 0.7368 | 0.7039 | 91.0 | 99.0 | 142.0 | 0.6972 | 0.6408 | 45.0 | 56.0 | 118.0 | 0.4746 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 42.0 | 42 | 3.8034 | 0.0069 | 3127.6742 | 2167.9386 | 367.0 | 570.0 | 0.6439 | 323.0 | 0.5667 | 79.0 | 103.0 | 158.0 | 0.6519 | 0.5 | 110.0 | 113.0 | 152.0 | 0.7434 | 0.7237 | 89.0 | 96.0 | 142.0 | 0.6761 | 0.6268 | 45.0 | 55.0 | 118.0 | 0.4661 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 43.0 | 43 | 3.7958 | 0.0069 | 3121.3983 | 2163.5884 | 367.0 | 570.0 | 0.6439 | 323.0 | 0.5667 | 80.0 | 102.0 | 158.0 | 0.6456 | 0.5063 | 108.0 | 112.0 | 152.0 | 0.7368 | 0.7105 | 90.0 | 97.0 | 142.0 | 0.6831 | 0.6338 | 45.0 | 56.0 | 118.0 | 0.4746 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 44.0 | 44 | 3.8052 | 0.0069 | 3129.1145 | 2168.9369 | 366.0 | 570.0 | 0.6421 | 321.0 | 0.5632 | 77.0 | 101.0 | 158.0 | 0.6392 | 0.4873 | 108.0 | 112.0 | 152.0 | 0.7368 | 0.7105 | 91.0 | 98.0 | 142.0 | 0.6901 | 0.6408 | 45.0 | 55.0 | 118.0 | 0.4661 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 45.0 | 45 | 3.7989 | 0.0069 | 3123.9498 | 2165.3570 | 367.0 | 570.0 | 0.6439 | 324.0 | 0.5684 | 79.0 | 102.0 | 158.0 | 0.6456 | 0.5 | 110.0 | 113.0 | 152.0 | 0.7434 | 0.7237 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 45.0 | 54.0 | 118.0 | 0.4576 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 46.0 | 46 | 3.8045 | 0.0069 | 3128.6029 | 2168.5823 | 369.0 | 570.0 | 0.6474 | 322.0 | 0.5649 | 76.0 | 102.0 | 158.0 | 0.6456 | 0.4810 | 109.0 | 113.0 | 152.0 | 0.7434 | 0.7171 | 92.0 | 99.0 | 142.0 | 0.6972 | 0.6479 | 45.0 | 55.0 | 118.0 | 0.4661 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 47.0 | 47 | 3.8041 | 0.0069 | 3128.2620 | 2168.3460 | 367.0 | 570.0 | 0.6439 | 324.0 | 0.5684 | 81.0 | 103.0 | 158.0 | 0.6519 | 0.5127 | 108.0 | 112.0 | 152.0 | 0.7368 | 0.7105 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 45.0 | 54.0 | 118.0 | 0.4576 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 48.0 | 48 | 3.7913 | 0.0069 | 3117.7010 | 2161.0257 | 367.0 | 570.0 | 0.6439 | 321.0 | 0.5632 | 77.0 | 102.0 | 158.0 | 0.6456 | 0.4873 | 108.0 | 112.0 | 152.0 | 0.7368 | 0.7105 | 91.0 | 99.0 | 142.0 | 0.6972 | 0.6408 | 45.0 | 54.0 | 118.0 | 0.4576 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 49.0 | 49 | 3.8082 | 0.0069 | 3131.6407 | 2170.6880 | 365.0 | 570.0 | 0.6404 | 321.0 | 0.5632 | 78.0 | 101.0 | 158.0 | 0.6392 | 0.4937 | 108.0 | 112.0 | 152.0 | 0.7368 | 0.7105 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 45.0 | 54.0 | 118.0 | 0.4576 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 50.0 | 50 | 3.8087 | 0.0069 | 3132.0417 | 2170.9659 | 367.0 | 570.0 | 0.6439 | 322.0 | 0.5649 | 78.0 | 103.0 | 158.0 | 0.6519 | 0.4937 | 109.0 | 112.0 | 152.0 | 0.7368 | 0.7171 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 45.0 | 54.0 | 118.0 | 0.4576 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 51.0 | 51 | 3.7933 | 0.0069 | 3119.3595 | 2162.1753 | 368.0 | 570.0 | 0.6456 | 321.0 | 0.5632 | 77.0 | 102.0 | 158.0 | 0.6456 | 0.4873 | 109.0 | 112.0 | 152.0 | 0.7368 | 0.7171 | 90.0 | 98.0 | 142.0 | 0.6901 | 0.6338 | 45.0 | 56.0 | 118.0 | 0.4746 | 0.3814 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
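The table shows validation loss and accuracy pulling in different directions: the loss is lowest early in training, while accuracy peaks much later. The sketch below picks out both points from a handful of rows copied from the table. The row subset and the tuple layout are choices made for this illustration; which criterion the run actually used to select a checkpoint is not stated in the card.

```python
# (step, validation_loss, accuracy) for selected rows of the table above.
rows = [
    (0, 1.5354, 0.3018),
    (3, 1.5150, 0.3035),
    (4, 1.3806, 0.5386),
    (9, 3.1394, 0.6491),
    (21, 3.7385, 0.6509),
    (41, 3.8044, 0.6491),
    (51, 3.7933, 0.6456),
]

best_by_accuracy = max(rows, key=lambda r: r[2])
best_by_loss = min(rows, key=lambda r: r[1])

print("highest accuracy at step", best_by_accuracy[0])
print("lowest validation loss at step", best_by_loss[0])
```

Step 21 has the highest accuracy in the full table (371 of 570 correct), and it is also the step whose metrics are reported in the summary at the top of this card.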
### Framework versions
- Transformers 4.51.3
- PyTorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1

36
checkpoint-21/config.json Normal file

@@ -0,0 +1,36 @@
{
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"eos_token_id": 128001,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 8192,
"max_position_embeddings": 131072,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 16,
"num_key_value_heads": 8,
"pad_token_id": 128004,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": {
"factor": 32.0,
"high_freq_factor": 4.0,
"low_freq_factor": 1.0,
"original_max_position_embeddings": 8192,
"rope_type": "llama3"
},
"rope_theta": 500000.0,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.51.3",
"use_cache": true,
"vocab_size": 128256
}


@@ -0,0 +1,9 @@
{
"_from_model_config": true,
"bos_token_id": 128000,
"do_sample": true,
"eos_token_id": 128001,
"temperature": 0.6,
"top_p": 0.9,
"transformers_version": "4.51.3"
}


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:adb8d0cd79d7d8e1a9a997977ba53da9160ece736a92782a441115fced4ef446
size 2471645608


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6c90d21cd1e048399239d8e35c308b3c9bdd52d83bde5e380fa277bb0adef273
size 4943382114


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bdc6b9e266079af4905e788266ab5911a661673579466ce5a178ea07f97617a8
size 14244


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d84bfb48e0636c1b417b109cff164d10bcd741414d8fb3a1492190b54d573f8c
size 1064


@@ -0,0 +1,23 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|end_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|finetune_right_pad_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

BIN
checkpoint-21/tokenizer.json (Stored with Git LFS) Normal file

Binary file not shown.

File diff suppressed because it is too large.

File diff suppressed because it is too large.


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8f9cc115c0a2df7f95373e88d223e7fe0c5fe2b1804d07650e45c90766f827c8
size 5432

36
config.json Normal file

@@ -0,0 +1,36 @@
{
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"eos_token_id": 128001,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 8192,
"max_position_embeddings": 131072,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 16,
"num_key_value_heads": 8,
"pad_token_id": 128004,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": {
"factor": 32.0,
"high_freq_factor": 4.0,
"low_freq_factor": 1.0,
"original_max_position_embeddings": 8192,
"rope_type": "llama3"
},
"rope_theta": 500000.0,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.51.3",
"use_cache": true,
"vocab_size": 128256
}
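The fields in this config are enough to estimate the model's parameter count and on-disk size. The sketch below is a rough check, assuming the usual Llama decoder layout (bias-free q/k/v/o and gate/up/down projections, two RMSNorm weights per layer, one final norm, and a tied output head); the helper name `count_parameters` is hypothetical. At 2 bytes per bfloat16 parameter the estimate falls slightly below the 2471645608 bytes recorded in the LFS pointer for model.safetensors, and the small remainder is plausibly file header metadata.

```python
# Fields copied from config.json above.
config = {
    "hidden_size": 2048,
    "intermediate_size": 8192,
    "num_hidden_layers": 16,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "head_dim": 64,
    "vocab_size": 128256,
    "tie_word_embeddings": True,
}


def count_parameters(cfg):
    """Estimate the parameter count of a Llama-style decoder from its config."""
    h = cfg["hidden_size"]
    q_dim = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]
    # q and o projections are h x q_dim; k and v are h x kv_dim (grouped-query attention).
    attention = 2 * h * q_dim + 2 * h * kv_dim
    # gate, up, and down projections.
    mlp = 3 * h * cfg["intermediate_size"]
    # Two RMSNorm weight vectors per layer.
    norms = 2 * h
    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * h
    # A tied output head reuses the embedding matrix and adds no parameters.
    lm_head = 0 if cfg["tie_word_embeddings"] else embeddings
    final_norm = h
    return cfg["num_hidden_layers"] * per_layer + embeddings + lm_head + final_norm


params = count_parameters(config)
size_bytes = params * 2  # bfloat16 uses 2 bytes per parameter

print(params, size_bytes)
```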

9
generation_config.json Normal file

@@ -0,0 +1,9 @@
{
"_from_model_config": true,
"bos_token_id": 128000,
"do_sample": true,
"eos_token_id": 128001,
"temperature": 0.6,
"top_p": 0.9,
"transformers_version": "4.51.3"
}

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:adb8d0cd79d7d8e1a9a997977ba53da9160ece736a92782a441115fced4ef446
size 2471645608

23
special_tokens_map.json Normal file

@@ -0,0 +1,23 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|end_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|finetune_right_pad_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

BIN
tokenizer.json (Stored with Git LFS) Normal file

Binary file not shown.

2063
tokenizer_config.json Normal file

File diff suppressed because it is too large.

3
training_args.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8f9cc115c0a2df7f95373e88d223e7fe0c5fe2b1804d07650e45c90766f827c8
size 5432