---
language: en
tags:
- semantic-ids
- recommendation
- generative-retrieval
- qwen3
- fine-tuned
license: apache-2.0
base_model: Qwen/Qwen3-1.7B
datasets:
- amazon-pet-supplies
---

# Qwen3-1.7B Fine-tuned for Semantic ID Recommendation

## Overview

Qwen3-1.7B fine-tuned for generative product recommendation via hierarchical semantic identifiers. Given a product description, a purchase history, or a co-purchase context, the model generates a 4-level Semantic ID of the form `<|sid_start|><|A#|><|B#|><|C#|><|D#|><|sid_end|>`.

This is the smaller model in a controlled comparison experiment (1.7B vs 8B) conducted under identical training conditions.

## Training

### Stage 1: Vocabulary Expansion

- Added 1,027 special tokens (3 structural + 4×256 codebook tokens)
- Trained only the embedding matrices (0.3% of parameters)
- 2,000 steps, LR 1×10⁻³, batch 64

### Stage 2: Full Fine-tuning

- **Dataset**: 4,719,994 instruction-formatted conversations (Amazon Pet Supplies)
- **Task types**: text→SID, sequential recommendation, co-purchase prediction
- **Optimizer**: AdamW 8-bit, LR 2×10⁻⁵, cosine schedule with a minimum LR of 0.2× peak
- **Warmup**: 3%, weight decay 0.01
- **Batch**: 64 × 2 = 128 effective, 3 epochs
- **Techniques**: Custom instruction masking, greedy sequence packing (~3× throughput)
- **Hardware**: NVIDIA H100 80GB (vast.ai)

## Results

Hierarchical SID prediction accuracy (A-level match, greedy decoding):

| Task | Accuracy |
|------|----------|
| Text → SID | 59.9% |
| Sequential recommendation | 7.0% |
| Co-purchase prediction | 5.5% |

Evaluation: 3,000 samples per task, 11 task types.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("kalistratov/qwen3-1.8b-semantic-ids")
tokenizer = AutoTokenizer.from_pretrained("kalistratov/qwen3-1.8b-semantic-ids")
```

## Citation

Master's thesis, Moscow Institute of Physics and Technology (MIPT), 2026.

## References

1. Y. Sun et al. "OpenOneRec," arXiv:2502.18851, 2025.
2. J. Liu et al. "PLUM," arXiv:2406.12346, 2024.
3. E. Yan. "semantic-ids-llm," GitHub, 2024.
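
## Example: Parsing Semantic IDs

The Overview describes 4-level Semantic IDs, and the Results section reports A-level match accuracy. The sketch below shows one way generated text could be parsed into its four codes and scored at the A level. It rests on assumptions not confirmed by this card: that `#` in `<|A#|>` is a decimal index from 0 to 255 (suggested by the 4×256 codebook tokens), and that an A-level match means the first code of prediction and target agree. The names `parse_sid`, `match_depth`, and `a_level_accuracy` are hypothetical helpers, not part of the released code.

```python
from __future__ import annotations

import re
from typing import Optional

LEVELS = "ABCD"        # four hierarchy levels, per the SID format above
CODEBOOK_SIZE = 256    # 4 x 256 codebook tokens, per Stage 1

# Assumed literal form: <|sid_start|><|A12|><|B7|><|C200|><|D3|><|sid_end|>
SID_PATTERN = re.compile(
    r"<\|sid_start\|>"
    + "".join(rf"<\|{lvl}(\d+)\|>" for lvl in LEVELS)
    + r"<\|sid_end\|>"
)


def codebook_tokens() -> list[str]:
    """Enumerate the codebook tokens under the assumed naming scheme."""
    return [f"<|{lvl}{i}|>" for lvl in LEVELS for i in range(CODEBOOK_SIZE)]


def parse_sid(text: str) -> Optional[tuple[int, ...]]:
    """Return the (A, B, C, D) codes of the first well-formed SID, else None."""
    match = SID_PATTERN.search(text)
    if match is None:
        return None
    codes = tuple(int(group) for group in match.groups())
    if any(code >= CODEBOOK_SIZE for code in codes):
        return None  # index outside the assumed 0..255 range
    return codes


def match_depth(pred: tuple[int, ...], target: tuple[int, ...]) -> int:
    """Count how many leading levels agree (0 means even level A differs)."""
    depth = 0
    for p, t in zip(pred, target):
        if p != t:
            break
        depth += 1
    return depth


def a_level_accuracy(outputs: list[str], targets: list[str]) -> float:
    """Fraction of outputs whose A-level code matches the target's."""
    hits = 0
    for out, tgt in zip(outputs, targets):
        pred, gold = parse_sid(out), parse_sid(tgt)
        if pred is not None and gold is not None and match_depth(pred, gold) >= 1:
            hits += 1
    return hits / len(targets) if targets else 0.0
```

Unparseable generations count as misses here, which is a design choice for this sketch; the evaluation behind the reported numbers may treat them differently.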