---
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen3-0.6B
tags:
  - text2sql
  - sql
  - nlp
  - distillation
  - qwen3
datasets:
  - distil-labs/text2sql-synthetic
language:
  - en
pipeline_tag: text-generation
---

# Distil-Qwen3-0.6B-Text2SQL

A fine-tuned Qwen3-0.6B model that converts natural language questions into SQL queries. Trained via knowledge distillation from DeepSeek-V3, this compact 0.6B-parameter model comes close to its teacher on Text2SQL while staying small enough for fast local deployment.

## Results

| Metric | DeepSeek-V3 (Teacher) | Qwen3-0.6B (Base) | **This Model** |
|--------|:---------------------:|:-----------------:|:--------------:|
| LLM-as-a-Judge | 76% | 36% | **74%** |
| Exact Match | 38% | 24% | **40%** |
| ROUGE | 88.6% | 69.3% | **88.5%** |
| METEOR | 90.4% | 65.3% | **88.5%** |

The fine-tuned model scores **74%** on LLM-as-a-Judge accuracy with only 0.6B parameters. That is roughly a **2x improvement** over the base model (36%) and within two points of the 685B-parameter teacher (76%). On Exact Match it slightly exceeds the teacher (40% vs. 38%).

## Quick Start

### Using Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("distil-labs/distil-qwen3-0.6b-text2sql")
tokenizer = AutoTokenizer.from_pretrained("distil-labs/distil-qwen3-0.6b-text2sql")

schema = """CREATE TABLE employees (
  id INTEGER PRIMARY KEY,
  name TEXT NOT NULL,
  department TEXT,
  salary INTEGER
);"""

question = "How many employees earn more than 50000?"

messages = [
    {
        "role": "system",
        "content": """You are a problem solving model working on task_description XML block:
<task_description>You are given a database schema and a natural language question. Generate the SQL query that answers the question.

Input:
- Schema: One or two table definitions in SQL DDL format
- Question: Natural language question about the data

Output:
- A single SQL query that answers the question
- No explanations, comments, or additional text

Rules:
- Use only tables and columns from the provided schema
- Use uppercase SQL keywords (SELECT, FROM, WHERE, etc.)
- Use SQLite-compatible syntax</task_description>
You will be given a single task in the question XML block
Solve only the task in question block.
Generate only the answer, do not generate anything else"""
    },
    {
        "role": "user",
        "content": f"""Now for the real task, solve the task in question block.
Generate only the solution, do not generate anything else
<question>Schema:
{schema}

Question: {question}</question>"""
    }
]

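text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")

# Greedy decoding: pass do_sample=False instead of temperature=0,
# which is rejected when sampling is enabled.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, not the echoed prompt.
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))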
```
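
The decoded text is expected to be a bare SQL query, but small models occasionally wrap their answer. The helper below is a defensive sketch, not part of the released code: `extract_sql` is a hypothetical name, and it assumes the only wrappers to remove are a Qwen3-style `<think>…</think>` block and a Markdown code fence. Whether this fine-tune ever emits either is not stated in this card.

```python
import re

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def extract_sql(raw: str) -> str:
    """Strip an optional <think> block and an optional code fence from model output."""
    # Drop any reasoning block the chat template may have produced.
    text = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)

    # If the query is wrapped in a fenced block, keep only the fenced body.
    fenced = re.search(FENCE + r"[a-zA-Z]*\n(.*?)" + FENCE, text, flags=re.DOTALL)
    if fenced:
        text = fenced.group(1)

    return text.strip()


print(extract_sql("<think>\n\n</think>\n\nSELECT COUNT(*) FROM employees WHERE salary > 50000;"))
```

Output that is already a bare query passes through unchanged apart from whitespace trimming.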

### Using Ollama (GGUF version)

For local inference, use the quantized GGUF version included in this repository:

```bash
# Create the Ollama model from the Modelfile (run from the directory that contains it)
ollama create distil-qwen3-0.6b-text2sql -f Modelfile

# Run inference
ollama run distil-qwen3-0.6b-text2sql
```
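
Once the model is created, it can also be queried programmatically through Ollama's local REST API. The following is a minimal sketch using only the standard library. It assumes an Ollama server is running on its default port (11434) and that the model was created under the name shown above. `build_chat_payload` and `ask_ollama` are hypothetical helper names. Because this card does not say whether the Modelfile already embeds the system prompt, the sketch passes it explicitly.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumed: Ollama's default local endpoint
MODEL_NAME = "distil-qwen3-0.6b-text2sql"       # name given to `ollama create`


def build_chat_payload(system_prompt: str, user_prompt: str) -> dict:
    """Build a non-streaming request body for Ollama's /api/chat endpoint."""
    return {
        "model": MODEL_NAME,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "stream": False,                # one JSON object instead of a token stream
        "options": {"temperature": 0},  # deterministic decoding
    }


def ask_ollama(system_prompt: str, user_prompt: str) -> str:
    """Send the prompts to the local Ollama server and return the reply text."""
    body = json.dumps(build_chat_payload(system_prompt, user_prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"].strip()
```

Use the same system and user messages as in the Transformers example so the model sees the prompt format it was trained on.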

## Model Details

| Property | Value |
|----------|-------|
| Base Model | [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) |
| Parameters | 0.6 billion |
| Architecture | Qwen3ForCausalLM |
| Context Length | 40,960 tokens |
| Precision | bfloat16 |
| Training Data | ~10,000 synthetic examples |
| Teacher Model | DeepSeek-V3 |

## Training

This model was trained using the [Distil Labs](https://distillabs.ai) platform:

1. **Seed Data**: 50 hand-validated Text2SQL examples covering various SQL complexities
2. **Synthetic Generation**: Expanded to ~10,000 examples using DeepSeek-V3
3. **Fine-tuning**: 4 epochs on the synthetic dataset
4. **Evaluation**: LLM-as-a-Judge with semantic equivalence checking
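
The gap between Exact Match and LLM-as-a-Judge scores exists because two differently written queries can return the same answer. The sketch below illustrates that idea with a simple execution-based comparison in SQLite. It is a cheap stand-in chosen for illustration, not the LLM judge used to produce the reported numbers, and `execution_match` is a hypothetical helper.

```python
import sqlite3
from collections import Counter


def execution_match(setup_sql: str, predicted: str, reference: str) -> bool:
    """Return True if both queries yield the same rows (ignoring row order)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(setup_sql)  # create tables and load sample rows
        try:
            predicted_rows = Counter(conn.execute(predicted).fetchall())
        except (sqlite3.Error, sqlite3.Warning):
            return False  # a query that fails to run cannot match
        reference_rows = Counter(conn.execute(reference).fetchall())
        return predicted_rows == reference_rows
    finally:
        conn.close()


setup = """
CREATE TABLE employees (id INTEGER PRIMARY KEY, salary INTEGER);
INSERT INTO employees VALUES (1, 72000), (2, 48000), (3, 55000);
"""

# Different text, same meaning: an exact string match fails, execution match succeeds.
print(execution_match(
    setup,
    "SELECT COUNT(*) FROM employees WHERE salary > 50000",
    "SELECT COUNT(id) FROM employees WHERE NOT salary <= 50000",
))
```

A check like this depends on the sample rows: two queries can agree on one dataset and differ on another, which is one reason a judge model may be preferred.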

### Training Hyperparameters

- Epochs: 4
- Learning Rate: 5e-5 (cosine schedule)
- Batch Size: 1 (with gradient accumulation)
- Total Steps: ~40,000
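
As a quick consistency check, the step count follows from the other figures if a "step" is taken to mean one batch of size 1. That reading is an assumption: the card does not say whether steps are counted before or after gradient accumulation.

```python
examples = 10_000  # approximate size of the synthetic dataset
epochs = 4
batch_size = 1     # per-step batch size listed above

steps_per_epoch = examples // batch_size
total_steps = steps_per_epoch * epochs
print(total_steps)  # 40000
```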

## Task Format

### Input Format

```
Schema:
CREATE TABLE table_name (
  column_name DATA_TYPE [CONSTRAINTS],
  ...
);

Question: Natural language question about the data
```
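
To keep prompts consistent across calls, the input format can be wrapped in a small function. This sketch reproduces the user message from the Quick Start example. `build_user_prompt` is a hypothetical helper name, and the table in the usage line is only an illustration.

```python
def build_user_prompt(schema: str, question: str) -> str:
    """Format a schema and question into the user message shown in Quick Start."""
    return (
        "Now for the real task, solve the task in question block.\n"
        "Generate only the solution, do not generate anything else\n"
        f"<question>Schema:\n{schema.strip()}\n\n"
        f"Question: {question.strip()}</question>"
    )


print(build_user_prompt(
    "CREATE TABLE orders (\n  id INTEGER PRIMARY KEY,\n  total REAL\n);",
    "What is the average order total?",
))
```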

### Output Format

A single SQL query with:
- Uppercase SQL keywords (SELECT, FROM, WHERE, etc.)
- SQLite-compatible syntax
- No explanations or additional text
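
Because the output is meant to be SQLite-compatible and restricted to the given schema, both properties can be checked before a query is run against real data. The sketch below asks SQLite to plan the query against empty tables built from the schema. `check_sql` is a hypothetical helper, and the approach assumes the schema itself is valid SQLite DDL.

```python
import sqlite3


def check_sql(schema: str, query: str) -> tuple[bool, str]:
    """Return (ok, message) after asking SQLite to plan `query` against `schema`."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)  # create empty tables from the DDL
        # EXPLAIN QUERY PLAN compiles the statement (resolving tables and
        # columns) without executing it.
        conn.execute(f"EXPLAIN QUERY PLAN {query}")
        return True, "ok"
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return False, str(exc)
    finally:
        conn.close()


schema = """CREATE TABLE employees (
  id INTEGER PRIMARY KEY,
  name TEXT NOT NULL,
  department TEXT,
  salary INTEGER
);"""

print(check_sql(schema, "SELECT COUNT(*) FROM employees WHERE salary > 50000;"))
print(check_sql(schema, "SELECT COUNT(*) FROM employees WHERE wage > 50000;"))
```

This catches syntax errors and references to tables or columns that are not in the schema. It does not tell you whether the query answers the question.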

### Supported SQL Features

- **Simple**: SELECT, WHERE, COUNT, SUM, AVG, MAX, MIN
- **Medium**: JOIN, GROUP BY, HAVING, ORDER BY, LIMIT
- **Complex**: Subqueries, multiple JOINs, UNION
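
To make the three tiers concrete, here is one hand-written query per tier, run against a small in-memory SQLite database. The queries and sample rows are illustrations only, not model output or training data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
  id INTEGER PRIMARY KEY,
  name TEXT NOT NULL,
  department TEXT,
  salary INTEGER
);
INSERT INTO employees VALUES
  (1, 'Ana', 'Engineering', 72000),
  (2, 'Ben', 'Engineering', 48000),
  (3, 'Cy',  'Sales',       55000),
  (4, 'Di',  'Sales',       39000);
""")

examples = {
    # Simple: filter plus aggregate
    "simple": "SELECT COUNT(*) FROM employees WHERE salary > 50000",
    # Medium: grouping with a HAVING filter and ordering
    "medium": (
        "SELECT department, AVG(salary) FROM employees "
        "GROUP BY department HAVING AVG(salary) > 50000 ORDER BY department"
    ),
    # Complex: scalar subquery in the WHERE clause
    "complex": (
        "SELECT name FROM employees "
        "WHERE salary > (SELECT AVG(salary) FROM employees) ORDER BY name"
    ),
}

results = {tier: conn.execute(sql).fetchall() for tier, sql in examples.items()}
for tier, rows in results.items():
    print(tier, rows)
conn.close()
```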

## Use Cases

- Natural language interfaces to databases
- SQL query assistance and autocompletion
- Database chatbots and conversational BI
- Educational tools for learning SQL
- Edge deployment and mobile applications

## Limitations

- Optimized for SQLite syntax
- Best with 1-2 table schemas
- May struggle with highly complex nested subqueries
- Trained on English questions only

## License

This model is released under the Apache 2.0 license.

## Links

- [Distil Labs Website](https://distillabs.ai)
- [GitHub](https://github.com/distil-labs)
- [Hugging Face](https://huggingface.co/distil-labs)

## Citation

```bibtex
@misc{distil-qwen3-0.6b-text2sql,
  author = {Distil Labs},
  title = {Distil-Qwen3-0.6B-Text2SQL: A Compact Fine-tuned Model for Natural Language to SQL},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/distil-labs/distil-qwen3-0.6b-text2sql}
}
```