Model: kth8/gemma-3-1b-it-Conversation-GGUF
Source: Original Platform (mirror initialized by the ModelHub XC community)
2026-04-12 12:55:58 +08:00
  • License: gemma
  • Language: en
  • Base model: kth8/gemma-3-1b-it-Conversation
  • Dataset: kth8/multi-turn-conversation-50000x
  • Pipeline tag: text-generation
  • Library: transformers
  • Tags: sft, trl, unsloth, google, gemma, gemma3, gemma3_text

A fine-tune of unsloth/gemma-3-1b-it on the kth8/multi-turn-conversation-50000x dataset.

Usage example

System prompt

You are a helpful assistant.

User prompt

Hey there! How's it going?

Assistant response

Hey! I'm doing great, thanks for asking! I'm here and ready to help with whatever you need. What's on your mind today?
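
The turns above can be assembled into a Gemma-style prompt string by hand, which is useful when driving the GGUF through a raw text-completion endpoint. This is a minimal sketch: Gemma's chat template has no dedicated system role, so the system prompt is folded into the first user turn, which is the common convention (the function name is illustrative, not from any library):

```python
def build_gemma_prompt(system: str, user: str) -> str:
    """Format one user turn in the Gemma chat template.

    Gemma's template has no system role, so the system prompt is
    prepended to the first user message (a common convention).
    """
    first_user = f"{system}\n\n{user}" if system else user
    return (
        f"<start_of_turn>user\n{first_user}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = build_gemma_prompt(
    "You are a helpful assistant.",
    "Hey there! How's it going?",
)
print(prompt)
```

With a chat-aware runtime (e.g. a transformers tokenizer's apply_chat_template), the bundled template handles this formatting for you.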

Model Details

  • Base Model: unsloth/gemma-3-1b-it
  • Parameter Count: 999,885,952 (~1.0 B)
  • Precision: torch.bfloat16
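
At bfloat16 (2 bytes per parameter), the raw weights of a model this size occupy about 1.86 GiB before any GGUF quantization; quantized GGUF variants will be smaller. A quick back-of-the-envelope estimate:

```python
params = 999_885_952        # parameter count from the model details above
bytes_per_param = 2         # bfloat16 = 16 bits per weight
size_gib = params * bytes_per_param / 2**30
print(f"{size_gib:.2f} GiB")
```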

Training Settings

Hardware

  • GPU: NVIDIA RTX PRO 6000 Blackwell Server Edition

PEFT

  • Rank: 32
  • LoRA alpha: 64
  • Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Gradient checkpointing: unsloth
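
To make the rank and alpha numbers concrete: a LoRA adapter of rank r on a weight of shape (out, in) adds r·(in + out) trainable parameters, and the low-rank update is scaled by alpha / r before being added to the frozen weight. A small illustration (the 1152×1152 projection shape is a placeholder for exposition, not the model's actual dimensions):

```python
def lora_extra_params(in_features: int, out_features: int, rank: int) -> int:
    # LoRA factorizes the weight update as B @ A,
    # with A of shape (rank, in) and B of shape (out, rank).
    return rank * (in_features + out_features)

rank, alpha = 32, 64            # values from the PEFT settings above
scaling = alpha / rank          # multiplier applied to the B @ A update
extra = lora_extra_params(1152, 1152, rank)  # placeholder projection shape
print(scaling, extra)
```

With alpha at twice the rank, each adapter's contribution is scaled by 2.0, a common default in LoRA fine-tunes.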

SFT

  • Epochs: 2
  • Batch size: 48
  • Gradient Accumulation steps: 1
  • Warmup ratio: 0.1
  • Learning rate: 0.0002
  • Optimizer: adamw_torch_fused
  • Learning rate scheduler: cosine
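
These settings imply an effective batch of 48 × 1 = 48 sequences per optimizer step, and with a warmup ratio of 0.1 over the 1,996 global steps reported below, the cosine schedule warms up for roughly 200 steps before decaying. A quick sketch of that arithmetic:

```python
batch_size, grad_accum = 48, 1
effective_batch = batch_size * grad_accum   # sequences per optimizer step

warmup_ratio, global_steps = 0.1, 1996
# linear warmup phase before cosine decay begins
warmup_steps = round(warmup_ratio * global_steps)
print(effective_batch, warmup_steps)
```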

Training stats

  • Global step: 1996
  • Training runtime (seconds): 6834.1445
  • Average training loss: 1.1743
  • Final validation loss: 1.1191
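
As a consistency check, the step count lines up with the dataset size: 1,996 steps over 2 epochs at 48 examples per step works out to roughly 47.9 k training examples per epoch, i.e. the ~50 k-row dataset minus a held-out validation split (the exact split size is an assumption here, not stated in the card):

```python
global_steps, epochs, batch_size = 1996, 2, 48
steps_per_epoch = global_steps // epochs          # 998 optimizer steps per epoch
train_examples = steps_per_epoch * batch_size     # approx. examples seen per epoch
print(steps_per_epoch, train_examples)
```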

Framework versions

  • Unsloth: 2026.3.8
  • TRL: 0.22.2
  • Transformers: 4.56.2
  • PyTorch: 2.10.0+cu128
  • Datasets: 4.8.3
  • Tokenizers: 0.22.2

License

This model is released under the Gemma license. See the Gemma Terms of Use and Prohibited Use Policy regarding the use of Gemma-generated content.