---
base_model: Qwen/Qwen3-0.6B
language:
- en
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
tags:
- sidekick
- sft
- chat
- shopify
datasets:
- shopifyinterngrinder/sidekick-autocomplete-data-shopping
---

# shopifyinterngrinder/sidekick-autocomplete-06b-clm-shopping

Fine-tuned from [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) with supervised fine-tuning (SFT) via [TRL](https://github.com/huggingface/trl).

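A minimal inference sketch for this checkpoint is shown below. The repo id comes from this card; the helper names and generation parameters (e.g. `max_new_tokens=32`) are illustrative assumptions, not part of the released training code.

```python
MODEL_ID = "shopifyinterngrinder/sidekick-autocomplete-06b-clm-shopping"


def build_prompt(prefix: str) -> str:
    """Normalize a text prefix before handing it to the model."""
    return prefix.rstrip()


def complete(prefix: str, max_new_tokens: int = 32) -> str:
    # transformers is imported lazily so the helpers above can be read
    # and tested without downloading the checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
    inputs = tokenizer(build_prompt(prefix), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Because the dataset format is `prompt_completion` (see Training Details), plain text continuation is the natural interface here rather than a chat template.
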
## Training Details

| Parameter | Value |
|---|---|
| Base Model | [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) |
| Dataset | [shopifyinterngrinder/sidekick-autocomplete-data-shopping](https://huggingface.co/datasets/shopifyinterngrinder/sidekick-autocomplete-data-shopping) @ `main` |
| Training Examples | 69,780 |
| Validation Examples | 7,754 |
| Epochs | 3 |
| Learning Rate | 2e-05 |
| Batch Size (per device) | 1 |
| Gradient Accumulation | 4 |
| Max Sequence Length | 512 |
| Precision | bf16 |
| Optimizer | adamw_torch_fused |
| Warmup Steps | 100 |
| Weight Decay | 0.01 |
| LR Scheduler | cosine |
| Packing | Enabled |
| Dataset Format | prompt_completion |

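The hyperparameters above can be restated as keyword arguments for reproduction. The values come from the table; the exact TRL config class and field names are assumptions, so check them against the TRL version listed below.

```python
# Hyperparameters from the Training Details table, as plain kwargs.
SFT_KWARGS = dict(
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    max_length=512,
    bf16=True,
    optim="adamw_torch_fused",
    warmup_steps=100,
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    packing=True,
)


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    # Each optimizer step accumulates gradients over this many examples.
    return per_device * grad_accum * num_devices
```

With per-device batch size 1 and gradient accumulation 4, each optimizer step sees 4 examples per device; in TRL these kwargs would typically be passed to an `SFTConfig`.
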
## Framework Versions

| Library | Version |
|---|---|
| Transformers | 4.57.6 |
| TRL | 0.29.0 |
| PyTorch | 2.8.0+cu128 |
| Datasets | 3.6.0 |
| Accelerate | 1.13.0 |