---
language:
- en
license: gemma
base_model: unsloth/gemma-3-1b-it
tags:
- text-to-sql
- finetuning
datasets:
- gretelai/synthetic_text_to_sql
pipeline_tag: text-generation
---

# SQL-Gemma3

`SQL-Gemma3` is a fine-tuned version of `Gemma 3 1B Instruct` for text-to-SQL generation. It was trained on a balanced subset sampled from the [Gretel synthetic_text_to_sql dataset](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql) to improve SQL generation from a table schema and a natural-language question.

## Model Details

- Base model: `unsloth/gemma-3-1b-it`
- Task: Natural language to SQL
- Training data: balanced subset sampled from `gretelai/synthetic_text_to_sql`
- Reported training loss: `0.201`
- Reported test loss: `0.21`

## Intended Use

This model is intended for:

- Generating SQL queries from schema-aware prompts
- Learning and experimentation with text-to-SQL workflows
- Prototyping NL-to-SQL assistants

It is not guaranteed to produce correct, executable, or secure SQL for every prompt. Review generated queries before using them in production systems.
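
One lightweight way to review a generated query is to ask a database engine to parse and plan it against the schema before anything is executed. The sketch below uses Python's built-in `sqlite3` module for this. The helper name `check_sql` is hypothetical and not part of this repository, and the check assumes the schema and query are valid in SQLite's dialect, so queries written for another dialect may be rejected even when they are correct there.

```python
import sqlite3


def check_sql(schema: str, query: str) -> tuple[bool, str]:
    """Return (ok, message) after planning `query` against an empty in-memory database.

    Hypothetical helper, not part of this repository. It only confirms that
    SQLite can parse and plan the query against the schema. It does not
    confirm that the query answers the question.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)  # create the tables described in the prompt
        # EXPLAIN QUERY PLAN parses and plans the statement without running it.
        conn.execute(f"EXPLAIN QUERY PLAN {query}")
        return True, "ok"
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return False, str(exc)
    finally:
        conn.close()


schema = "CREATE TABLE employees(id INT, name TEXT, salary INT);"
print(check_sql(schema, "SELECT AVG(salary) FROM employees;"))
print(check_sql(schema, "SELECT AVG(wage) FROM employees;"))
```

A passing check is a necessary condition rather than a sufficient one: a query can plan cleanly and still return the wrong rows.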

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "vishnurchityala/sql-gemma3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {
        "role": "user",
        "content": (
            "CREATE TABLE employees(id INT, name TEXT, salary INT);\n\n"
            "Find the average salary of all employees."
        ),
    }
]

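# Let the chat template tokenize directly. Tokenizing the rendered string a
# second time can prepend a duplicate BOS token with Gemma chat templates.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)

outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens so the prompt is not echoed back.
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))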
```
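
To keep prompts consistent across calls, the message construction can be wrapped in a small function. This is a minimal sketch: `build_messages` is a hypothetical helper, and the layout it produces (schema, blank line, question) simply mirrors the example above. This card does not confirm that it matches the exact format used during fine-tuning.

```python
def build_messages(schema: str, question: str) -> list[dict[str, str]]:
    """Build a single-turn chat message: schema, blank line, then the question.

    Hypothetical helper. The layout is an assumption taken from the usage
    example, not a documented requirement of the model.
    """
    content = f"{schema.strip()}\n\n{question.strip()}"
    return [{"role": "user", "content": content}]


messages = build_messages(
    "CREATE TABLE employees(id INT, name TEXT, salary INT);",
    "Find the average salary of all employees.",
)
print(messages[0]["content"])
```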

## Limitations

- Performance is summarized here using loss only, not execution accuracy
- Output quality depends heavily on schema clarity and prompt format
- The model may generate dialect-specific or invalid SQL in some cases
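
Since only loss is reported, readers who want a rough execution-accuracy signal can compare the rows returned by a generated query with those returned by a reference query. The sketch below does this with an in-memory SQLite database. The function name `execution_match` and the `setup_sql` argument (the `CREATE TABLE` and `INSERT` statements for one test case) are hypothetical, and the approach assumes both queries run under SQLite's dialect.

```python
import sqlite3
from collections import Counter


def execution_match(setup_sql: str, predicted: str, reference: str) -> bool:
    """Return True if both queries return the same rows, ignoring row order.

    Hypothetical helper, not part of this repository. Each query runs in its
    own fresh in-memory database so one cannot affect the other.
    """

    def run(query: str) -> Counter:
        conn = sqlite3.connect(":memory:")
        try:
            conn.executescript(setup_sql)  # build tables and sample rows
            return Counter(conn.execute(query).fetchall())
        finally:
            conn.close()

    try:
        predicted_rows = run(predicted)
    except (sqlite3.Error, sqlite3.Warning):
        return False  # a predicted query that fails to run counts as a miss
    return predicted_rows == run(reference)


setup = (
    "CREATE TABLE employees(id INT, name TEXT, salary INT);"
    "INSERT INTO employees VALUES (1, 'a', 100), (2, 'b', 300);"
)
print(execution_match(
    setup,
    "SELECT AVG(salary) FROM employees",
    "SELECT SUM(salary) * 1.0 / COUNT(*) FROM employees",
))
```

Averaging this boolean over a held-out set gives an execution-match rate. Matching rows on a small sample database can still hide semantic differences, so the figure is best read as an upper bound on correctness.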

## Acknowledgements

- Base model: [Gemma 3](https://huggingface.co/google)
- Dataset: [Gretel AI synthetic_text_to_sql](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql)