Initialize project; model provided by the ModelHub XC community

Model: kalistratov/qwen3-1.8b-semantic-ids Source: Original Platform
2026-05-10 07:18:20 +08:00
commit aa99d04387
12 changed files with 161214 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,68 @@
+---
+language: en
+tags:
+  - semantic-ids
+  - recommendation
+  - generative-retrieval
+  - qwen3
+  - fine-tuned
+license: apache-2.0
+base_model: Qwen/Qwen3-1.7B
+datasets:
+  - amazon-pet-supplies
+---
+
+# Qwen3-1.8B Fine-tuned for Semantic ID Recommendation
+
+## Overview
+
+A fine-tune of `Qwen/Qwen3-1.7B` (published here under the name Qwen3-1.8B) for generative product recommendation via hierarchical semantic identifiers. Given a product description, a purchase history, or a co-purchase context, the model generates a 4-level Semantic ID of the form `<|sid_start|><|A#|><|B#|><|C#|><|D#|><|sid_end|>`.
+
+This is the smaller of the two models in a controlled comparison (1.8B vs. 8B) trained under identical conditions.
+
+## Training
+
+### Stage 1: Vocabulary Expansion
+- Added 1,027 special tokens (3 structural + 4×256 codebook tokens)
+- Trained only embedding matrices (0.3% of parameters)
+- 2,000 steps, LR 1×10⁻³, batch 64
+
+### Stage 2: Full Fine-tuning
+- **Dataset**: 4,719,994 instruction-formatted conversations (Amazon Pet Supplies)
+- **Task types**: text→SID, sequential recommendation, co-purchase prediction
+- **Optimizer**: 8-bit AdamW, peak LR 2×10⁻⁵, cosine decay to a minimum LR of 0.2× peak
+- **Warmup**: 3%, weight decay 0.01
+- **Batch**: 64 × 2 = 128 effective, 3 epochs
+- **Techniques**: Custom instruction masking, greedy sequence packing (~3× throughput)
+- **Hardware**: NVIDIA H100 80GB (vast.ai)
+
+## Results
+
+Hierarchical SID prediction accuracy under greedy decoding, measured as A-level match (the first-level `A` token of the generated SID matches the target):
+
+| Task | Accuracy |
+|------|----------|
+| Text → SID | 59.9% |
+| Sequential recommendation | 7.0% |
+| Co-purchase prediction | 5.5% |
+
+Evaluation: 3,000 samples per task, 11 task types.
+
+## Usage
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model = AutoModelForCausalLM.from_pretrained("kalistratov/qwen3-1.8b-semantic-ids")
+tokenizer = AutoTokenizer.from_pretrained("kalistratov/qwen3-1.8b-semantic-ids")
+```
+
+## Citation
+
+Master's thesis, Moscow Institute of Physics and Technology (MIPT), 2026.
+
+## References
+
+1. Y. Sun et al. "OpenOneRec," arXiv:2502.18851, 2025.
+2. J. Liu et al. "PLUM," arXiv:2406.12346, 2024.
+3. E. Yan. "semantic-ids-llm," GitHub, 2024.