Initialize project; model provided by the ModelHub XC community
Model: shopifyinterngrinder/sidekick-autocomplete-06b-clm-shopping Source: Original Platform
README.md
---
base_model: Qwen/Qwen3-0.6B
language:
- en
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
tags:
- sidekick
- sft
- chat
- shopify
datasets:
- shopifyinterngrinder/sidekick-autocomplete-data-shopping
---

# shopifyinterngrinder/sidekick-autocomplete-06b-clm-shopping

Fine-tuned from [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) using [TRL](https://github.com/huggingface/trl) SFT.

## Training Details

| Parameter | Value |
|---|---|
| Base Model | [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) |
| Dataset | [shopifyinterngrinder/sidekick-autocomplete-data-shopping](https://huggingface.co/datasets/shopifyinterngrinder/sidekick-autocomplete-data-shopping) @ `main` |
| Training Examples | 69,780 |
| Validation Examples | 7,754 |
| Epochs | 3 |
| Learning Rate | 2e-05 |
| Batch Size (per device) | 1 |
| Gradient Accumulation | 4 |
| Max Sequence Length | 512 |
| Precision | bf16 |
| Optimizer | adamw_torch_fused |
| Warmup Steps | 100 |
| Weight Decay | 0.01 |
| LR Scheduler | cosine |
| Packing | Enabled |
| Dataset Format | prompt_completion |
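
As a rough illustration of how the learning rate, warmup, scheduler, and batch rows interact, the sketch below computes the learning rate at a given optimizer step under linear warmup followed by cosine decay, plus the number of sequences consumed per optimizer step. It follows the usual shape of a cosine schedule with warmup rather than reproducing the exact training run. The total number of optimizer steps is not stated in this card (it depends on packing and on the number of devices), so `TOTAL_STEPS` is a placeholder assumption, and the function names are illustrative only.

```python
import math

# Values taken from the Training Details table.
PEAK_LR = 2e-05        # Learning Rate
WARMUP_STEPS = 100     # Warmup Steps
PER_DEVICE_BATCH = 1   # Batch Size (per device)
GRAD_ACCUM = 4         # Gradient Accumulation

# ASSUMPTION: the card does not give the total optimizer step count;
# this value is a placeholder chosen only to make the sketch runnable.
TOTAL_STEPS = 10_000


def sequences_per_optimizer_step(num_devices: int = 1) -> int:
    """Sequences contributing to one optimizer update (device count is an assumption)."""
    return PER_DEVICE_BATCH * GRAD_ACCUM * num_devices


def lr_at_step(step: int,
               peak_lr: float = PEAK_LR,
               warmup_steps: int = WARMUP_STEPS,
               total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup to peak_lr, then half-cosine decay towards zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)  # clamp in case step exceeds total_steps
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Print the schedule at a few illustrative steps.
for s in (0, 50, 100, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(f"step {s:>6}: lr = {lr_at_step(s):.3e}")
```

With a per-device batch size of 1 and 4 gradient accumulation steps, each optimizer update sees 4 sequences per device; because packing is enabled, each of those sequences may contain several training examples concatenated up to the 512-token limit.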

## Framework Versions

| Library | Version |
|---|---|
| Transformers | 4.57.6 |
| TRL | 0.29.0 |
| PyTorch | 2.8.0+cu128 |
| Datasets | 3.6.0 |
| Accelerate | 1.13.0 |
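
When trying to reproduce this environment, it can help to compare locally installed package versions against the table above. The sketch below is one way to do that with only the standard library. The distribution names (`transformers`, `trl`, `torch`, `datasets`, `accelerate`) are assumed to be the PyPI names for the libraries listed, and `installed_versions` and `find_mismatches` are illustrative helpers, not part of any of these libraries.

```python
from importlib import metadata

# Versions from the Framework Versions table.
# ASSUMPTION: keys are the PyPI distribution names for the listed libraries.
EXPECTED = {
    "transformers": "4.57.6",
    "trl": "0.29.0",
    "torch": "2.8.0+cu128",
    "datasets": "3.6.0",
    "accelerate": "1.13.0",
}


def installed_versions(names):
    """Return {name: version or None} for each distribution name."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None  # not installed in this environment
    return found


def find_mismatches(expected, installed):
    """Return {name: (expected, installed)} for every version that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    mismatches = find_mismatches(EXPECTED, installed_versions(EXPECTED))
    for name, (want, have) in mismatches.items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
```

The comparison is an exact string match, so a different CUDA build of PyTorch (for example a `+cpu` wheel) is reported as a mismatch even when the base version is the same.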