Initialize project; model provided by the ModelHub XC community

Model: distil-labs/distil-qwen3-0.6b-text2sql Source: Original Platform
2026-06-03 02:12:49 +08:00
commit 7fcf2966fd
18 changed files with 152545 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,193 @@
+---
+library_name: transformers
+license: apache-2.0
+base_model: Qwen/Qwen3-0.6B
+tags:
+  - text2sql
+  - sql
+  - nlp
+  - distillation
+  - qwen3
+datasets:
+  - distil-labs/text2sql-synthetic
+language:
+  - en
+pipeline_tag: text-generation
+---
+
+# Distil-Qwen3-0.6B-Text2SQL
+
+A fine-tuned Qwen3-0.6B model for converting natural language questions into SQL queries. Trained using knowledge distillation from DeepSeek-V3, this compact 0.6B parameter model delivers strong Text2SQL performance while being extremely lightweight and fast for local deployment.
+
+## Results
+
+| Metric | DeepSeek-V3 (Teacher) | Qwen3-0.6B (Base) | **This Model** |
+|--------|:---------------------:|:-----------------:|:--------------:|
+| LLM-as-a-Judge | 76% | 36% | **74%** |
+| Exact Match | 38% | 24% | **40%** |
+| ROUGE | 88.6% | 69.3% | **88.5%** |
+| METEOR | 90.4% | 65.3% | **88.5%** |
+
+The fine-tuned model reaches **74% LLM-as-a-Judge accuracy** with only 0.6B parameters. That is roughly a **2x improvement** over the base model (36%) and within two points of the 685B-parameter teacher (76%), at a fraction of the size.
+
+## Quick Start
+
+### Using Transformers
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model = AutoModelForCausalLM.from_pretrained("distil-labs/distil-qwen3-0.6b-text2sql")
+tokenizer = AutoTokenizer.from_pretrained("distil-labs/distil-qwen3-0.6b-text2sql")
+
+schema = """CREATE TABLE employees (
+  id INTEGER PRIMARY KEY,
+  name TEXT NOT NULL,
+  department TEXT,
+  salary INTEGER
+);"""
+
+question = "How many employees earn more than 50000?"
+
+messages = [
+    {
+        "role": "system",
+        "content": """You are a problem solving model working on task_description XML block:
+<task_description>You are given a database schema and a natural language question. Generate the SQL query that answers the question.
+
+Input:
+- Schema: One or two table definitions in SQL DDL format
+- Question: Natural language question about the data
+
+Output:
+- A single SQL query that answers the question
+- No explanations, comments, or additional text
+
+Rules:
+- Use only tables and columns from the provided schema
+- Use uppercase SQL keywords (SELECT, FROM, WHERE, etc.)
+- Use SQLite-compatible syntax</task_description>
+You will be given a single task in the question XML block
+Solve only the task in question block.
+Generate only the answer, do not generate anything else"""
+    },
+    {
+        "role": "user",
+        "content": f"""Now for the real task, solve the task in question block.
+Generate only the solution, do not generate anything else
+<question>Schema:
+{schema}
+
+Question: {question}</question>"""
+    }
+]
+
+text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = tokenizer(text, return_tensors="pt")
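+# Greedy decoding for deterministic output; do_sample=False is used instead of
+# temperature=0 because a sampling temperature must be strictly positive
+outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
+# Decode only the newly generated tokens so the prompt is not echoed back
+generated = outputs[0][inputs["input_ids"].shape[1]:]
+print(tokenizer.decode(generated, skip_special_tokens=True))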
+```
+
+### Using Ollama (GGUF version)
+
+For local inference, use the quantized GGUF version included in this repository:
+
+```bash
+# Create the Ollama model from the local Modelfile
+ollama create distil-qwen3-0.6b-text2sql -f Modelfile
+
+# Run inference
+ollama run distil-qwen3-0.6b-text2sql
+```
+
+## Model Details
+
+| Property | Value |
+|----------|-------|
+| Base Model | [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) |
+| Parameters | 0.6 billion |
+| Architecture | Qwen3ForCausalLM |
+| Context Length | 40,960 tokens |
+| Precision | bfloat16 |
+| Training Data | ~10,000 synthetic examples |
+| Teacher Model | DeepSeek-V3 |
+
+## Training
+
+This model was trained using the [Distil Labs](https://distillabs.ai) platform:
+
+1. **Seed Data**: 50 hand-validated Text2SQL examples covering various SQL complexities
+2. **Synthetic Generation**: Expanded to ~10,000 examples using DeepSeek-V3
+3. **Fine-tuning**: 4 epochs on the synthetic dataset
+4. **Evaluation**: LLM-as-a-Judge with semantic equivalence checking
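+
+The gap between Exact Match (40%) and LLM-as-a-Judge (74%) in the Results table is plausible because two queries can differ textually yet answer the same question. The sketch below illustrates this with an execution-based comparison in SQLite. It is not the evaluation used for this model, which relies on LLM-as-a-Judge. The helper names `normalize`, `exact_match`, and `same_result`, as well as the sample rows, are hypothetical and chosen only for illustration.
+
+```python
+import sqlite3
+
+
+def normalize(sql: str) -> str:
+    """Collapse whitespace, drop a trailing semicolon, and lowercase."""
+    return " ".join(sql.strip().rstrip(";").split()).lower()
+
+
+def exact_match(pred: str, ref: str) -> bool:
+    """String-level comparison after light normalization."""
+    return normalize(pred) == normalize(ref)
+
+
+def same_result(setup_sql: str, pred: str, ref: str) -> bool:
+    """Run both queries on a throwaway in-memory database and compare rows."""
+    conn = sqlite3.connect(":memory:")
+    try:
+        conn.executescript(setup_sql)
+        pred_rows = conn.execute(pred).fetchall()
+        ref_rows = conn.execute(ref).fetchall()
+        # Sort by repr so row order and NULL values do not affect the comparison
+        return sorted(pred_rows, key=repr) == sorted(ref_rows, key=repr)
+    finally:
+        conn.close()
+
+
+# Hypothetical sample data for the employees schema used in Quick Start
+setup = """
+CREATE TABLE employees (
+  id INTEGER PRIMARY KEY,
+  name TEXT NOT NULL,
+  department TEXT,
+  salary INTEGER
+);
+INSERT INTO employees VALUES
+  (1, 'Ann', 'Eng', 60000),
+  (2, 'Bob', 'Ops', 40000),
+  (3, 'Cy', 'Eng', 75000);
+"""
+ref = "SELECT COUNT(*) FROM employees WHERE salary > 50000;"
+pred = "SELECT COUNT(id) FROM employees WHERE salary > 50000;"
+
+print(exact_match(pred, ref))         # False
+print(same_result(setup, pred, ref))  # True
+```
+
+Agreement on one small database does not prove that two queries are equivalent in general, so a check like this is best treated as a complement to a judge-based metric rather than a replacement for it.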
+
+### Training Hyperparameters
+
+- Epochs: 4
+- Learning Rate: 5e-5 (cosine schedule)
+- Batch Size: 1 (with gradient accumulation)
+- Total Steps: ~40,000
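+
+The step count is consistent with the other figures if each step processes one example. The arithmetic below is a sanity check under that assumption. The gradient accumulation factor is not stated here, so the number of optimizer updates is not derived.
+
+```python
+# Approximate figures from the hyperparameter list above
+num_examples = 10_000
+epochs = 4
+batch_size = 1
+
+# Assumption: "Total Steps" counts one step per batch of size 1
+total_steps = num_examples * epochs // batch_size
+print(total_steps)  # 40000
+```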
+
+## Task Format
+
+### Input Format
+
+```
+Schema:
+CREATE TABLE table_name (
+  column_name DATA_TYPE [CONSTRAINTS],
+  ...
+);
+
+Question: Natural language question about the data
+```
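+
+A small helper can assemble this input into the user message used in Quick Start. This is a minimal sketch. `build_user_message` is a hypothetical name, not part of any library, and the wrapper text is copied from the Quick Start example.
+
+```python
+def build_user_message(schema: str, question: str) -> str:
+    """Wrap a schema and a question in the <question> block from Quick Start."""
+    return (
+        "Now for the real task, solve the task in question block.\n"
+        "Generate only the solution, do not generate anything else\n"
+        f"<question>Schema:\n{schema.strip()}\n\n"
+        f"Question: {question.strip()}</question>"
+    )
+
+
+schema = """CREATE TABLE employees (
+  id INTEGER PRIMARY KEY,
+  name TEXT NOT NULL,
+  department TEXT,
+  salary INTEGER
+);"""
+
+print(build_user_message(schema, "How many employees earn more than 50000?"))
+```
+
+The returned string can be passed as the `content` of the `user` message in the `messages` list.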
+
+### Output Format
+
+A single SQL query with:
+- Uppercase SQL keywords (SELECT, FROM, WHERE, etc.)
+- SQLite-compatible syntax
+- No explanations or additional text
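+
+Because the output is meant to be SQLite-compatible, a generated query can be checked against the schema before it is run on real data. The sketch below uses Python's standard `sqlite3` module and an in-memory database. `check_sql` is a hypothetical helper name, and the example queries are illustrative rather than actual model output.
+
+```python
+import sqlite3
+
+
+def check_sql(schema: str, query: str):
+    """Return (True, None) if SQLite can compile the query against the schema,
+    otherwise (False, error message)."""
+    conn = sqlite3.connect(":memory:")
+    try:
+        conn.executescript(schema)
+        # EXPLAIN compiles the statement without running the query itself
+        conn.execute(f"EXPLAIN {query}")
+        return True, None
+    except (sqlite3.Error, sqlite3.Warning) as exc:
+        return False, str(exc)
+    finally:
+        conn.close()
+
+
+schema = """CREATE TABLE employees (
+  id INTEGER PRIMARY KEY,
+  name TEXT NOT NULL,
+  department TEXT,
+  salary INTEGER
+);"""
+
+print(check_sql(schema, "SELECT COUNT(*) FROM employees WHERE salary > 50000;"))
+print(check_sql(schema, "SELECT bonus FROM employees;"))
+```
+
+The first call should succeed, while the second should fail because `bonus` is not a column in the schema. A check like this catches syntax errors and unknown tables or columns, but it does not confirm that the query answers the question.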
+
+### Supported SQL Features
+
+- **Simple**: SELECT, WHERE, COUNT, SUM, AVG, MAX, MIN
+- **Medium**: JOIN, GROUP BY, HAVING, ORDER BY, LIMIT
+- **Complex**: Subqueries, multiple JOINs, UNION
+
+## Use Cases
+
+- Natural language interfaces to databases
+- SQL query assistance and autocompletion
+- Database chatbots and conversational BI
+- Educational tools for learning SQL
+- Edge deployment and mobile applications
+
+## Limitations
+
+- Optimized for SQLite syntax
+- Best with 1-2 table schemas
+- May struggle with highly complex nested subqueries
+- Trained on English questions only
+
+## License
+
+This model is released under the Apache 2.0 license.
+
+## Links
+
+- [Distil Labs Website](https://distillabs.ai)
+- [GitHub](https://github.com/distil-labs)
+- [Hugging Face](https://huggingface.co/distil-labs)
+
+## Citation
+
+```bibtex
+@misc{distil-qwen3-0.6b-text2sql,
+  author = {Distil Labs},
+  title = {Distil-Qwen3-0.6B-Text2SQL: A Compact Fine-tuned Model for Natural Language to SQL},
+  year = {2025},
+  publisher = {Hugging Face},
+  url = {https://huggingface.co/distil-labs/distil-qwen3-0.6b-text2sql}
+}
+```