初始化项目,由ModelHub XC社区提供模型
Model: kth8/gemma-3-270m-it-OpenCode-Title-Generator-GGUF Source: Original Platform
This commit is contained in:
37
.gitattributes
vendored
Normal file
37
.gitattributes
vendored
Normal file
@@ -0,0 +1,37 @@
|
||||
*.7z filter=lfs diff=lfs merge=lfs -text
|
||||
*.arrow filter=lfs diff=lfs merge=lfs -text
|
||||
*.bin filter=lfs diff=lfs merge=lfs -text
|
||||
*.bz2 filter=lfs diff=lfs merge=lfs -text
|
||||
*.ckpt filter=lfs diff=lfs merge=lfs -text
|
||||
*.ftz filter=lfs diff=lfs merge=lfs -text
|
||||
*.gz filter=lfs diff=lfs merge=lfs -text
|
||||
*.h5 filter=lfs diff=lfs merge=lfs -text
|
||||
*.joblib filter=lfs diff=lfs merge=lfs -text
|
||||
*.lfs.* filter=lfs diff=lfs merge=lfs -text
|
||||
*.mlmodel filter=lfs diff=lfs merge=lfs -text
|
||||
*.model filter=lfs diff=lfs merge=lfs -text
|
||||
*.msgpack filter=lfs diff=lfs merge=lfs -text
|
||||
*.npy filter=lfs diff=lfs merge=lfs -text
|
||||
*.npz filter=lfs diff=lfs merge=lfs -text
|
||||
*.onnx filter=lfs diff=lfs merge=lfs -text
|
||||
*.ot filter=lfs diff=lfs merge=lfs -text
|
||||
*.parquet filter=lfs diff=lfs merge=lfs -text
|
||||
*.pb filter=lfs diff=lfs merge=lfs -text
|
||||
*.pickle filter=lfs diff=lfs merge=lfs -text
|
||||
*.pkl filter=lfs diff=lfs merge=lfs -text
|
||||
*.pt filter=lfs diff=lfs merge=lfs -text
|
||||
*.pth filter=lfs diff=lfs merge=lfs -text
|
||||
*.rar filter=lfs diff=lfs merge=lfs -text
|
||||
*.safetensors filter=lfs diff=lfs merge=lfs -text
|
||||
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
||||
*.tar.* filter=lfs diff=lfs merge=lfs -text
|
||||
*.tar filter=lfs diff=lfs merge=lfs -text
|
||||
*.tflite filter=lfs diff=lfs merge=lfs -text
|
||||
*.tgz filter=lfs diff=lfs merge=lfs -text
|
||||
*.wasm filter=lfs diff=lfs merge=lfs -text
|
||||
*.xz filter=lfs diff=lfs merge=lfs -text
|
||||
*.zip filter=lfs diff=lfs merge=lfs -text
|
||||
*.zst filter=lfs diff=lfs merge=lfs -text
|
||||
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
||||
gemma-3-270m-it-OpenCode-Title-Generator-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
|
||||
gemma-3-270m-it-OpenCode-Title-Generator-bf16.gguf filter=lfs diff=lfs merge=lfs -text
|
||||
164
README.md
Normal file
164
README.md
Normal file
@@ -0,0 +1,164 @@
|
||||
---
|
||||
license: gemma
|
||||
language:
|
||||
- en
|
||||
base_model: kth8/gemma-3-270m-it-OpenCode-Title-Generator
|
||||
datasets:
|
||||
- kth8/title-generation-25000x
|
||||
pipeline_tag: text-generation
|
||||
library_name: transformers
|
||||
tags:
|
||||
- sft
|
||||
- trl
|
||||
- unsloth
|
||||
- gemma
|
||||
- gemma3
|
||||
- gemma3_text
|
||||
---
|
||||

|
||||
A supervised fine-tune of [unsloth/gemma-3-270m-it](https://huggingface.co/unsloth/gemma-3-270m-it) on the [kth8/title-generation-25000x](https://huggingface.co/datasets/kth8/title-generation-25000x) dataset.
|
||||
Trained with the exact system prompt OpenCode's [title agent uses](https://raw.githubusercontent.com/anomalyco/opencode/refs/heads/dev/packages/opencode/src/agent/prompt/title.txt).
|
||||
|
||||
## Usage example
|
||||
|
||||
Point to this model with `small_model` in `opencode.jsonc` file.
|
||||
|
||||
```json
|
||||
{
|
||||
"$schema": "https://opencode.ai/config.json",
|
||||
"provider": {
|
||||
"title": {
|
||||
"npm": "@ai-sdk/openai-compatible",
|
||||
"options": {
|
||||
"baseURL": "http://127.0.0.1:8080/v1",
|
||||
"apiKey": "not-needed"
|
||||
},
|
||||
"models": {
|
||||
"generator": {}
|
||||
}
|
||||
}
|
||||
},
|
||||
"small_model": "title/generator"
|
||||
}
|
||||
```
|
||||
|
||||
**System prompt**
|
||||
```
|
||||
You are a title generator. You output ONLY a thread title. Nothing else.
|
||||
|
||||
<task>
|
||||
Generate a brief title that would help the user find this conversation later.
|
||||
|
||||
Follow all rules in <rules>
|
||||
Use the <examples> so you know what a good title looks like.
|
||||
Your output must be:
|
||||
- A single line
|
||||
- ≤50 characters
|
||||
- No explanations
|
||||
</task>
|
||||
|
||||
<rules>
|
||||
- you MUST use the same language as the user message you are summarizing
|
||||
- Title must be grammatically correct and read naturally - no word salad
|
||||
- Never include tool names in the title (e.g. "read tool", "bash tool", "edit tool")
|
||||
- Focus on the main topic or question the user needs to retrieve
|
||||
- Vary your phrasing - avoid repetitive patterns like always starting with "Analyzing"
|
||||
- When a file is mentioned, focus on WHAT the user wants to do WITH the file, not just that they shared it
|
||||
- Keep exact: technical terms, numbers, filenames, HTTP codes
|
||||
- Remove: the, this, my, a, an
|
||||
- Never assume tech stack
|
||||
- Never use tools
|
||||
- NEVER respond to questions, just generate a title for the conversation
|
||||
- The title should NEVER include "summarizing" or "generating" when generating a title
|
||||
- DO NOT SAY YOU CANNOT GENERATE A TITLE OR COMPLAIN ABOUT THE INPUT
|
||||
- Always output something meaningful, even if the input is minimal.
|
||||
- If the user message is short or conversational (e.g. "hello", "lol", "what's up", "hey"):
|
||||
→ create a title that reflects the user's tone or intent (such as Greeting, Quick check-in, Light chat, Intro message, etc.)
|
||||
</rules>
|
||||
|
||||
<examples>
|
||||
"debug 500 errors in production" → Debugging production 500 errors
|
||||
"refactor user service" → Refactoring user service
|
||||
"why is app.js failing" → app.js failure investigation
|
||||
"implement rate limiting" → Rate limiting implementation
|
||||
"how do I connect postgres to my API" → Postgres API connection
|
||||
"best practices for React hooks" → React hooks best practices
|
||||
"@src/auth.ts can you add refresh token support" → Auth refresh token support
|
||||
"@utils/parser.ts this is broken" → Parser bug fix
|
||||
"look at @config.json" → Config review
|
||||
"@App.tsx add dark mode toggle" → Dark mode toggle in App
|
||||
</examples>
|
||||
```
|
||||
**User prompt**
|
||||
```
|
||||
If there were 200 students who passed an English course three years ago, and each subsequent year until the current one that number increased by 50% of the previous year's number, how many students will pass the course this year?
|
||||
```
|
||||
**Assistant response**
|
||||
```
|
||||
Student course passing growth calculation
|
||||
```
|
||||
## Model Details
|
||||
- Base Model: `unsloth/gemma-3-270m-it`
|
||||
- Parameter Count: 268,098,176
|
||||
- Precision: torch.bfloat16
|
||||
|
||||
## Training Settings
|
||||
|
||||
### PEFT
|
||||
- Rank: 32
|
||||
- LoRA alpha: 64
|
||||
- Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
|
||||
- Gradient checkpointing: unsloth
|
||||
|
||||
### SFT
|
||||
- Epoch: 1
|
||||
- Batch size: 8
|
||||
- Gradient Accumulation steps: 2
|
||||
- Learning rate: 0.0002
|
||||
- Optimizer: adamw_torch_fused
|
||||
- Learning rate scheduler: cosine
|
||||
- Warmup steps: 100
|
||||
- Weight decay: 0.01
|
||||
|
||||
## Training stats
|
||||
- Date: 2026-06-01T11:04:43.747952
|
||||
- GPU: NVIDIA A100-SXM4-40GB
|
||||
- Peak VRAM usage: 12.15 GB
|
||||
- Global step: 1607
|
||||
- Training runtime (seconds): 1590.5658
|
||||
- Best validation loss: 1.408400058746338
|
||||
|
||||
| Step | Training Loss | Validation Loss |
|
||||
|------|---------------|-----------------|
|
||||
| 0 | No log | 5.064917 |
|
||||
| 80 | 1.672600 | 1.848531 |
|
||||
| 160 | 1.695400 | 1.742237 |
|
||||
| 240 | 1.751600 | 1.726482 |
|
||||
| 320 | 1.427200 | 1.663712 |
|
||||
| 400 | 1.550400 | 1.609400 |
|
||||
| 480 | 1.559000 | 1.573220 |
|
||||
| 560 | 1.471900 | 1.572365 |
|
||||
| 640 | 1.538100 | 1.539643 |
|
||||
| 720 | 1.485500 | 1.515100 |
|
||||
| 800 | 1.391200 | 1.486133 |
|
||||
| 880 | 1.390600 | 1.473583 |
|
||||
| 960 | 1.405300 | 1.461052 |
|
||||
| 1040 | 1.392000 | 1.450962 |
|
||||
| 1120 | 1.521300 | 1.440739 |
|
||||
| 1200 | 1.438300 | 1.431336 |
|
||||
| 1280 | 1.336900 | 1.418500 |
|
||||
| 1360 | 1.375000 | 1.413560 |
|
||||
| 1440 | 1.342100 | 1.408760 |
|
||||
| 1520 | 1.309400 | 1.408400 |
|
||||
| 1600 | 1.428100 | 1.409352 |
|
||||
|
||||
## Framework versions
|
||||
- Unsloth: 2026.5.9
|
||||
- TRL: 0.22.2
|
||||
- Transformers: 4.56.2
|
||||
- Pytorch: 2.11.0+cu128
|
||||
- Datasets: 4.8.5
|
||||
- Tokenizers: 0.22.2
|
||||
|
||||
## License
|
||||
This model is released under the Gemma license. See the [Gemma Terms of Use](https://ai.google.dev/gemma/terms) and [Prohibited Use Policy](https://policies.google.com/terms/generative-ai/use-policy) regarding the use of Gemma-generated content.
|
||||
3
gemma-3-270m-it-OpenCode-Title-Generator-Q8_0.gguf
Normal file
3
gemma-3-270m-it-OpenCode-Title-Generator-Q8_0.gguf
Normal file
@@ -0,0 +1,3 @@
|
||||
version https://git-lfs.github.com/spec/v1
|
||||
oid sha256:42456cf7e3b1ed3a61b25ed3ae16e1386af6be501e92e8de2813d70c4a7f74a1
|
||||
size 291545920
|
||||
3
gemma-3-270m-it-OpenCode-Title-Generator-bf16.gguf
Normal file
3
gemma-3-270m-it-OpenCode-Title-Generator-bf16.gguf
Normal file
@@ -0,0 +1,3 @@
|
||||
version https://git-lfs.github.com/spec/v1
|
||||
oid sha256:358dcad968d2a538582c8f089d9447d22cf58b770eabf8390ad8775cdd2e1cfc
|
||||
size 542835520
|
||||
Reference in New Issue
Block a user