gemma-3-270m-it-OpenCode-Ti…/README.md

---
license: gemma
language:
- en
base_model: kth8/gemma-3-270m-it-OpenCode-Title-Generator
datasets:
- kth8/title-generation-25000x
pipeline_tag: text-generation
library_name: transformers
tags:
- sft
- trl
- unsloth
- gemma
- gemma3
- gemma3_text
---
![logo](https://storage.googleapis.com/gweb-developer-goog-blog-assets/images/gemma-3_2.original.png)
A supervised fine-tune of [unsloth/gemma-3-270m-it](https://huggingface.co/unsloth/gemma-3-270m-it) on the [kth8/title-generation-25000x](https://huggingface.co/datasets/kth8/title-generation-25000x) dataset.
Trained with the exact system prompt OpenCode's [title agent uses](https://raw.githubusercontent.com/anomalyco/opencode/refs/heads/dev/packages/opencode/src/agent/prompt/title.txt).

## Usage example

Point to this model with `small_model` in `opencode.jsonc` file.

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "title": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "http://127.0.0.1:8080/v1",
        "apiKey": "not-needed"
      },
      "models": {
        "generator": {}
      }
    }
  },
  "small_model": "title/generator"
}
```

**System prompt**
```
You are a title generator. You output ONLY a thread title. Nothing else.

<task>
Generate a brief title that would help the user find this conversation later.

Follow all rules in <rules>
Use the <examples> so you know what a good title looks like.
Your output must be:
- A single line
- ≤50 characters
- No explanations
</task>

<rules>
- you MUST use the same language as the user message you are summarizing
- Title must be grammatically correct and read naturally - no word salad
- Never include tool names in the title (e.g. "read tool", "bash tool", "edit tool")
- Focus on the main topic or question the user needs to retrieve
- Vary your phrasing - avoid repetitive patterns like always starting with "Analyzing"
- When a file is mentioned, focus on WHAT the user wants to do WITH the file, not just that they shared it
- Keep exact: technical terms, numbers, filenames, HTTP codes
- Remove: the, this, my, a, an
- Never assume tech stack
- Never use tools
- NEVER respond to questions, just generate a title for the conversation
- The title should NEVER include "summarizing" or "generating" when generating a title
- DO NOT SAY YOU CANNOT GENERATE A TITLE OR COMPLAIN ABOUT THE INPUT
- Always output something meaningful, even if the input is minimal.
- If the user message is short or conversational (e.g. "hello", "lol", "what's up", "hey"):
  → create a title that reflects the user's tone or intent (such as Greeting, Quick check-in, Light chat, Intro message, etc.)
</rules>

<examples>
"debug 500 errors in production" → Debugging production 500 errors
"refactor user service" → Refactoring user service
"why is app.js failing" → app.js failure investigation
"implement rate limiting" → Rate limiting implementation
"how do I connect postgres to my API" → Postgres API connection
"best practices for React hooks" → React hooks best practices
"@src/auth.ts can you add refresh token support" → Auth refresh token support
"@utils/parser.ts this is broken" → Parser bug fix
"look at @config.json" → Config review
"@App.tsx add dark mode toggle" → Dark mode toggle in App
</examples>
```
**User prompt**
```
If there were 200 students who passed an English course three years ago, and each subsequent year until the current one that number increased by 50% of the previous year's number, how many students will pass the course this year?
```
**Assistant response**
```
Student course passing growth calculation
```
## Model Details
- Base Model: `unsloth/gemma-3-270m-it`
- Parameter Count: 268,098,176
- Precision: torch.bfloat16

## Training Settings

### PEFT
- Rank: 32
- LoRA alpha: 64
- Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Gradient checkpointing: unsloth

### SFT
- Epoch: 1
- Batch size: 8
- Gradient Accumulation steps: 2
- Learning rate: 0.0002
- Optimizer: adamw_torch_fused
- Learning rate scheduler: cosine
- Warmup steps: 100
- Weight decay: 0.01

## Training stats
- Date: 2026-06-01T11:04:43.747952
- GPU: NVIDIA A100-SXM4-40GB
- Peak VRAM usage: 12.15 GB
- Global step: 1607
- Training runtime (seconds): 1590.5658
- Best validation loss: 1.408400058746338

| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 0    | No log        | 5.064917        |
| 80   | 1.672600      | 1.848531        |
| 160  | 1.695400      | 1.742237        |
| 240  | 1.751600      | 1.726482        |
| 320  | 1.427200      | 1.663712        |
| 400  | 1.550400      | 1.609400        |
| 480  | 1.559000      | 1.573220        |
| 560  | 1.471900      | 1.572365        |
| 640  | 1.538100      | 1.539643        |
| 720  | 1.485500      | 1.515100        |
| 800  | 1.391200      | 1.486133        |
| 880  | 1.390600      | 1.473583        |
| 960  | 1.405300      | 1.461052        |
| 1040 | 1.392000      | 1.450962        |
| 1120 | 1.521300      | 1.440739        |
| 1200 | 1.438300      | 1.431336        |
| 1280 | 1.336900      | 1.418500        |
| 1360 | 1.375000      | 1.413560        |
| 1440 | 1.342100      | 1.408760        |
| 1520 | 1.309400      | 1.408400        |
| 1600 | 1.428100      | 1.409352        |

## Framework versions
- Unsloth: 2026.5.9
- TRL: 0.22.2
- Transformers: 4.56.2
- Pytorch: 2.11.0+cu128
- Datasets: 4.8.5
- Tokenizers: 0.22.2

## License
This model is released under the Gemma license. See the [Gemma Terms of Use](https://ai.google.dev/gemma/terms) and [Prohibited Use Policy](https://policies.google.com/terms/generative-ai/use-policy) regarding the use of Gemma-generated content.