Initialize project; model provided by the ModelHub XC community

Model: vishnurchityala/sql-gemma3 Source: Original Platform
2026-06-04 12:18:16 +08:00
commit 19e5324330
8 changed files with 281 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,78 @@
+---
+language:
+- en
+license: gemma
+base_model: unsloth/gemma-3-1b-it
+tags:
+- text-to-sql
+- finetuning
+datasets:
+- gretelai/synthetic_text_to_sql
+pipeline_tag: text-generation
+---
+
+# SQL-Gemma3
+
+`SQL-Gemma3` is a fine-tuned version of `Gemma 3 1B Instruct` for text-to-SQL generation. It was trained on a balanced sampled subset of the [Gretel synthetic_text_to_sql dataset](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql) to improve SQL generation from a table schema and a natural-language question.
+
+## Model Details
+
+- Base model: `unsloth/gemma-3-1b-it`
+- Task: Natural language to SQL
+- Training data: balanced sampled subset of `gretelai/synthetic_text_to_sql`
+- Reported training loss: `0.201`
+- Reported test loss: `0.21`
+
+## Intended Use
+
+This model is intended for:
+
+- Generating SQL queries from schema-aware prompts
+- Learning and experimentation with text-to-SQL workflows
+- Prototyping NL-to-SQL assistants
+
+It is not guaranteed to produce correct, executable, or secure SQL for every prompt. Review generated queries before using them in production systems.
+
+## Usage
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+model_id = "vishnurchityala/sql-gemma3"
+
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(model_id)
+
+messages = [
+    {
+        "role": "user",
+        "content": (
+            "CREATE TABLE employees(id INT, name TEXT, salary INT);\n\n"
+            "Find the average salary of all employees."
+        ),
+    }
+]
+
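+# Tokenize via the chat template directly so the BOS token is not added twice.
+inputs = tokenizer.apply_chat_template(
+    messages,
+    add_generation_prompt=True,
+    return_dict=True,
+    return_tensors="pt",
+)
+
+outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
+# Decode only the newly generated tokens, not the echoed prompt.
+print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))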
+```
+
+## Limitations
+
+- Performance is summarized here using loss only, not execution accuracy
+- Output quality depends heavily on schema clarity and prompt format
+- The model may generate dialect-specific or invalid SQL in some cases
+
+## Acknowledgements
+
+- Base model: [Gemma 3](https://huggingface.co/google)
+- Dataset: [Gretel AI synthetic_text_to_sql](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql)
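
The README's Intended Use and Limitations sections say generated SQL may be invalid and should be reviewed before use. One possible first check, sketched here and not part of the commit, is to run the generated query against an empty in-memory SQLite database built from the same schema that was placed in the prompt. The helper name `check_sql_executes` is hypothetical, and the choice of SQLite is an assumption: a query that SQLite rejects may still be valid in another dialect, and a query that runs is not necessarily correct.

```python
import sqlite3
from typing import Tuple


def check_sql_executes(schema: str, query: str) -> Tuple[bool, str]:
    """Run `query` against an empty in-memory SQLite database built from `schema`.

    Hypothetical helper for illustration. Returns (True, "") if SQLite accepts
    and runs the query, otherwise (False, error message). This only tests
    executability under SQLite's dialect, not whether the query answers the
    question correctly.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Create the tables from the same DDL that was given to the model.
        conn.executescript(schema)
        # Execute the generated statement; the tables are empty, so this is cheap.
        conn.execute(query).fetchall()
        return True, ""
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return False, str(exc)
    finally:
        conn.close()


schema = "CREATE TABLE employees(id INT, name TEXT, salary INT);"

# A query that matches the schema, and one that references a missing column.
ok, err = check_sql_executes(schema, "SELECT AVG(salary) FROM employees;")
bad_ok, bad_err = check_sql_executes(schema, "SELECT AVG(wage) FROM employees;")

print(ok, bad_ok)
print(bad_err)
```

Because the database lives only in memory and is discarded after each call, a destructive statement in the model output cannot touch real data. A check like this complements, but does not replace, human review or an execution-accuracy evaluation.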