Model: kalistratov/qwen3-1.8b-semantic-ids Source: Original Platform
language, tags, license, base_model, datasets
| language | tags | license | base_model | datasets | ||||||
|---|---|---|---|---|---|---|---|---|---|---|
| en |
|
apache-2.0 | Qwen/Qwen3-1.7B |
|
Qwen3-1.8B Fine-tuned for Semantic ID Recommendation
Overview
Qwen3-1.8B fine-tuned for generative product recommendation via hierarchical semantic identifiers. The model generates 4-level Semantic IDs (<|sid_start|><|A#|><|B#|><|C#|><|D#|><|sid_end|>) given product descriptions, purchase histories, or co-purchase contexts.
This is the smaller model in a controlled comparison experiment (1.8B vs 8B) conducted under identical training conditions.
Training
Stage 1: Vocabulary Expansion
- Added 1,027 special tokens (3 structural + 4×256 codebook tokens)
- Trained only embedding matrices (0.3% of parameters)
- 2,000 steps, LR 1×10⁻³, batch 64
Stage 2: Full Fine-tuning
- Dataset: 4,719,994 instruction-formatted conversations (Amazon Pet Supplies)
- Task types: text→SID, sequential recommendation, co-purchase prediction
- Optimizer: AdamW 8-bit, LR 2×10⁻⁵, cosine with min LR (0.2×peak)
- Warmup: 3%, weight decay 0.01
- Batch: 64 × 2 = 128 effective, 3 epochs
- Techniques: Custom instruction masking, greedy sequence packing (~3× throughput)
- Hardware: NVIDIA H100 80GB (vast.ai)
Results
Hierarchical SID prediction accuracy (A-level match, greedy decoding):
| Task | Accuracy |
|---|---|
| Text → SID | 59.9% |
| Sequential recommendation | 7.0% |
| Co-purchase prediction | 5.5% |
Evaluation: 3,000 samples per task, 11 task types.
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("kalistratov/qwen3-1.8b-semantic-ids")
tokenizer = AutoTokenizer.from_pretrained("kalistratov/qwen3-1.8b-semantic-ids")
Citation
Master's thesis, Moscow Institute of Physics and Technology (MIPT), 2026.
References
- Y. Sun et al. "OpenOneRec," arXiv:2502.18851, 2025.
- J. Liu et al. "PLUM," arXiv:2406.12346, 2024.
- E. Yan. "semantic-ids-llm," GitHub, 2024.
Description
Languages
Jinja
100%