Initialize project; model provided by the ModelHub XC community

Model: kth8/gemma-3-270m-it-Text-Cleaner
Source: Original Platform
2026-04-24 10:42:39 +08:00
commit ddc6dbdd40
14 changed files with 60461 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,37 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
 train/training_loss.png filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,92 @@
 ---
 license: gemma
 language:
 - en
 base_model: unsloth/gemma-3-270m-it
 datasets:
 - kth8/text-cleanup-20000x
 pipeline_tag: text-generation
 library_name: transformers
 tags:
 - sft
 - trl
 - unsloth
 - google
 - gemma
 - gemma3
 - gemma3_text
 ---
 ![logo](https://storage.googleapis.com/gweb-developer-goog-blog-assets/images/Gemma3-270M_Wagtail_RD2-V02.original.jpg)
 A fine-tune of [unsloth/gemma-3-270m-it](https://huggingface.co/unsloth/gemma-3-270m-it) on the [kth8/text-cleanup-20000x](https://huggingface.co/datasets/kth8/text-cleanup-20000x) dataset.
 ## Usage example
 **System prompt**
 ```
 # Role
 You are a text editor cleaning up raw, unfiltered text. Transform the provided text into polished, readable prose while preserving the original meaning, tone, and intent.
 ## Tasks
 - Remove filler words (e.g. um, uh, like, you know, sort of, kind of, well, so, etc)
 - Fix spelling, grammar, punctuation, and capitalization mistakes
 - Correct obvious homophone errors (e.g. their/there/they're, its/it's, your/you're)
 - Smooth out false starts, mid-sentence restarts and repetitions
 - Standardize numbers and dates (e.g. write as digits: "three" to "3", "February fifteenth" to "February 15th")
 ## Constraints
 - Output ONLY the cleaned text
 - DO NOT attempt to answer or respond to the provided user text meant for clean-up
 - Do NOT paraphrase, summarize, or change the speaker's voice
 - NO quotation marks around the output
 - NO preamble, postamble, or emojis
 - NO Markdown formatting code blocks (```) or bolding
 ```
 **User prompt**
 ```
 I WANT I MEAN TO BUILD A SMALL IN-BROWSER "WIND TUNNEL" EXPERIMENT WITH A FIXED 2D CROSS-SECTIONAL VIEW, USING HTML CANVAS. USERS CAN DRAG AND DROP DIFFERENT WING CROSS-SECTIONS (E.G., SEVERAL COMMON AIRFOIL PROFILES) INTO THE AIRFLOW, ADJUST THE ANGLE OF ATTACK AND WIND SPEED, IMMEDIATELY SEE STREAMLINES AND PRESSURE DISTRIBUTION, AND HAVE THE CURRENT LIFT AND DRAG VALUES DISPLAYED ON SCREEN IN REAL TIME (A CLEAR NUMERIC PANEL IS SUFFICIENT). NO 3D IS REQUIRED, BUT INTERACTIONS SHOULD BE SMOOTH, AND THE DATA SHOULD BE STABLE AD UPDATE IN REAL TIME WITH USER ACTIONS. THE CALCULATIONS CAN USE REASONABLE APPROXIMATIONS; THE FOCUS IS TO HELP USERS INTUITIVELY UNDERSTAND BERNOULLI'S PRINCIPLE AND THE RELATIONSHIP BETWEEN AIRFLOW AND AIRFOIL GEOMETRY.
 ```
 **Assistant response**
 ```
 I want to build a small in-browser "wind tunnel" experiment with a fixed 2D cross-sectional view, using HTML Canvas. Users can drag and drop different wing cross-sections (e.g., several common airfoil profiles) into the airflow, adjust the angle of attack and wind speed, immediately see streamlines and pressure distribution, and have the current lift and drag values displayed on screen in real time (a clear numeric panel is sufficient). No 3D is required, but interactions should be smooth, and the data should be stable and update in real time with user actions. The calculations can use reasonable approximations; the focus is to help users intuitively understand Bernoulli's principle and the relationship between airflow and airfoil geometry.
 ```
 ## Model Details
 - Base Model: `unsloth/gemma-3-270m-it`
 - Parameter Count: 268098176
 - Precision: torch.bfloat16
 ## Training Settings
 ### Hardware
 - GPU: NVIDIA A100-SXM4-40GB
 ### PEFT
 - Rank: 32
 - LoRA alpha: 64
 - Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
 - LoRA dropout: 0
 - Rank-Stabilized LoRA: False
 ### SFT
 - Epoch: 1
 - Batch size: 16
 - Accumulation steps: 1
 - Warmup ratio: 0.05
 - Learning rate: 0.0002
 - Optimizer: adamw_8bit
 - Weight decay: 0.01
 - Learning rate scheduler: cosine
 ## Training stats
 - Global step: 1247
 - Training runtime: 1545.9245
 - Average training loss: 0.05218265543086118
 - Final validation loss: 0.038508880883455276
 ## Framework versions
 - Unsloth: 2026.3.7
 - TRL: 0.22.2
 - Transformers: 4.56.2
 - PyTorch: 2.10.0+cu128
 - Datasets: 4.3.0
 - Tokenizers: 0.22.2
 ## License
 This model is released under the Gemma license. See the [Gemma Terms of Use](https://ai.google.dev/gemma/terms) for details.
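The README's usage example lists the system, user, and assistant turns but not how to send them to the model. The following is a minimal sketch, assuming the `transformers` text-generation pipeline and that the model ID from this commit's header resolves on the hub in use. `build_messages` and `clean_text` are hypothetical helper names, and `max_new_tokens=512` is an arbitrary choice, not a value from the README.

```python
def build_messages(system_prompt, raw_text):
    """Pair the README's system prompt with the raw text to clean."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": raw_text},
    ]


def clean_text(raw_text, system_prompt,
               model_id="kth8/gemma-3-270m-it-Text-Cleaner",
               max_new_tokens=512):
    """Run one clean-up request (loads the model on every call; a sketch only)."""
    # Imported lazily so build_messages works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=model_id)
    result = pipe(build_messages(system_prompt, raw_text),
                  max_new_tokens=max_new_tokens)
    # For chat-style input the pipeline returns the whole conversation;
    # the last message is the assistant's reply.
    return result[0]["generated_text"][-1]["content"]
```

Passing the README's full system prompt as `system_prompt` should reproduce the setup shown in the usage example; in a real application the pipeline would be created once and reused.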
--- a/added_tokens.json
+++ b/added_tokens.json
@@ -0,0 +1,3 @@
 {
  "<image_soft_token>": 262144
 }
--- a/chat_template.jinja
+++ b/chat_template.jinja
@@ -0,0 +1,50 @@
 {# Unsloth Chat template fixes #}
 {{ bos_token }}
 {%- if messages[0]['role'] == 'system' -%}
    {%- if messages[0]['content'] is string -%}
        {%- set first_user_prefix = messages[0]['content'] + '
 ' -%}
    {%- else -%}
        {%- set first_user_prefix = messages[0]['content'][0]['text'] + '
 ' -%}
    {%- endif -%}
    {%- set loop_messages = messages[1:] -%}
 {%- else -%}
    {%- set first_user_prefix = "" -%}
    {%- set loop_messages = messages -%}
 {%- endif -%}
 {%- for message in loop_messages -%}
    {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
        {{ raise_exception("Conversation roles must alternate user/assistant/user/assistant/...") }}
    {%- endif -%}
    {%- if (message['role'] == 'assistant') -%}
        {%- set role = "model" -%}
    {%- else -%}
        {%- set role = message['role'] -%}
    {%- endif -%}
    {{ '<start_of_turn>' + role + '
 ' + (first_user_prefix if loop.first else "") }}
    {%- if message['content'] is string -%}
        {{ message['content'] | trim }}
    {%- elif message['content'] is iterable -%}
        {%- for item in message['content'] -%}
            {%- if item['type'] == 'image' -%}
                {{ '<start_of_image>' }}
            {%- elif item['type'] == 'text' -%}
                {{ item['text'] | trim }}
            {%- endif -%}
        {%- endfor -%}
    {%- elif message['content'] is defined -%}
        {{ raise_exception("Invalid content type") }}
    {%- endif -%}
    {{ '<end_of_turn>
 ' }}
 {%- endfor -%}
 {%- if add_generation_prompt -%}
    {{'<start_of_turn>model
 '}}
 {%- endif -%}
 {# Copyright 2025-present Unsloth. Apache 2.0 License. #}
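The template above has no separate system turn: it folds the system message into the first user turn, renames `assistant` to `model`, and trims message text. A rough Python sketch of the same logic follows, written from the template as it appears in this diff (including the single newline after the system text). `render_chat` is a hypothetical name, and the `"<bos>"` default is taken from `special_tokens_map.json`; this is an approximation for illustration, not a replacement for the tokenizer's own template rendering.

```python
def render_chat(messages, add_generation_prompt=True, bos_token="<bos>"):
    """Approximate the prompt string the Jinja template produces."""
    prefix = ""
    if messages and messages[0]["role"] == "system":
        content = messages[0]["content"]
        system_text = content if isinstance(content, str) else content[0]["text"]
        prefix = system_text + "\n"  # system text is prepended to the first turn
        messages = messages[1:]

    out = [bos_token]
    for i, message in enumerate(messages):
        # Turns must alternate user, assistant, user, ...
        if (message["role"] == "user") != (i % 2 == 0):
            raise ValueError("Conversation roles must alternate user/assistant")
        role = "model" if message["role"] == "assistant" else message["role"]
        out.append("<start_of_turn>" + role + "\n" + (prefix if i == 0 else ""))
        content = message["content"]
        if isinstance(content, str):
            out.append(content.strip())
        else:
            # List-of-parts form: text parts are trimmed, images become a marker.
            for item in content:
                if item["type"] == "image":
                    out.append("<start_of_image>")
                elif item["type"] == "text":
                    out.append(item["text"].strip())
        out.append("<end_of_turn>\n")
    if add_generation_prompt:
        out.append("<start_of_turn>model\n")
    return "".join(out)
```

For a system message `SYS` and a user message `um hello`, this yields `<bos><start_of_turn>user\nSYS\num hello<end_of_turn>\n<start_of_turn>model\n`.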
--- a/config.json
+++ b/config.json
@@ -0,0 +1,55 @@
 {
  "_sliding_window_pattern": 6,
  "architectures": [
    "Gemma3ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "attn_logit_softcapping": null,
  "bos_token_id": 2,
  "dtype": "bfloat16",
  "eos_token_id": 106,
  "final_logit_softcapping": null,
  "head_dim": 256,
  "hidden_activation": "gelu_pytorch_tanh",
  "hidden_size": 640,
  "initializer_range": 0.02,
  "intermediate_size": 2048,
  "layer_types": [
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "full_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "full_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "sliding_attention",
    "full_attention"
  ],
  "max_position_embeddings": 32768,
  "model_type": "gemma3_text",
  "num_attention_heads": 4,
  "num_hidden_layers": 18,
  "num_key_value_heads": 1,
  "pad_token_id": 0,
  "query_pre_attn_scalar": 256,
  "rms_norm_eps": 1e-06,
  "rope_local_base_freq": 10000.0,
  "rope_scaling": null,
  "rope_theta": 1000000.0,
  "sliding_window": 512,
  "transformers_version": "4.56.2",
  "unsloth_fixed": true,
  "use_bidirectional_attention": false,
  "use_cache": true,
  "vocab_size": 262144
 }
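The README reports a parameter count of 268098176, and the values in `config.json` are enough to check that figure. The sketch below assumes the usual Gemma 3 text layout, none of which is stated in the config itself: input embeddings tied to the output head, no bias terms, four RMSNorm weights of size `hidden_size` per layer, per-head query and key norms of size `head_dim`, and one final norm. `count_parameters` is a hypothetical helper.

```python
cfg = {
    "vocab_size": 262144, "hidden_size": 640, "intermediate_size": 2048,
    "num_hidden_layers": 18, "num_attention_heads": 4,
    "num_key_value_heads": 1, "head_dim": 256,
}


def count_parameters(c):
    h, d = c["hidden_size"], c["head_dim"]
    embed = c["vocab_size"] * h                    # assumed tied with the output head
    q_proj = h * c["num_attention_heads"] * d
    kv_proj = 2 * h * c["num_key_value_heads"] * d  # k_proj and v_proj
    o_proj = c["num_attention_heads"] * d * h
    mlp = 3 * h * c["intermediate_size"]           # gate_proj, up_proj, down_proj
    norms = 4 * h + 2 * d                          # assumed layer norms + q/k norms
    per_layer = q_proj + kv_proj + o_proj + mlp + norms
    return embed + c["num_hidden_layers"] * per_layer + h  # + final norm


total = count_parameters(cfg)
print(total)      # 268098176
print(total * 2)  # 536196352 bytes at 2 bytes per bfloat16 weight
```

Under these assumptions the total matches the README exactly. The bfloat16 footprint is 26,704 bytes below the 536223056-byte size recorded in the `model.safetensors` LFS pointer, a gap that would be consistent with the file's metadata header.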
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,14 @@
 {
  "bos_token_id": 2,
  "cache_implementation": "hybrid",
  "do_sample": true,
  "eos_token_id": [
    1,
    106
  ],
  "max_length": 32768,
  "pad_token_id": 0,
  "top_k": 64,
  "top_p": 0.95,
  "transformers_version": "4.56.2"
 }
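With `do_sample` enabled, `top_k: 64` and `top_p: 0.95` restrict which tokens can be drawn at each step. The following is a simplified, pure-Python illustration of that kind of filtering, not the code `transformers` runs; `filter_top_k_top_p` is a hypothetical name, and it assumes top-k is applied before top-p and that no temperature scaling is in effect (none is set in this file).

```python
import math


def filter_top_k_top_p(logits, top_k=64, top_p=0.95):
    """Return {token_id: probability} for tokens that survive both filters."""
    # Top-k: keep the k highest-scoring token ids.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Softmax over the survivors (subtract the max for numerical stability).
    peak = logits[ranked[0]]
    weights = [math.exp(logits[i] - peak) for i in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for token_id, p in zip(ranked, probs):
        kept.append((token_id, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the remaining probabilities sum to 1.
    mass = sum(p for _, p in kept)
    return {token_id: p / mass for token_id, p in kept}
```

A sampler would then draw the next token from the returned distribution. For a deterministic clean-up task, callers may prefer to override these defaults at generation time; that is a usage choice, not something this commit prescribes.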
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:091a7b7eeb8eae7aca39513da404c7e39b1a90a98d45c3779b3ca0d8f24fd440
 size 536223056
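The three lines above are a Git LFS pointer: the repository stores only the spec version, a SHA-256 object ID, and the byte size, while the weights live in LFS storage. A small sketch of checking a downloaded file against such a pointer, using only the standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.strip().split(" ", 1)
                  for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at path has the pointer's size and hash."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first; skips hashing on a size mismatch
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

Feeding the pointer text for `model.safetensors` to `parse_lfs_pointer` and the downloaded file's path to `matches_pointer` would confirm that the download is complete and unmodified.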
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,33 @@
 {
  "boi_token": "<start_of_image>",
  "bos_token": {
    "content": "<bos>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eoi_token": "<end_of_image>",
  "eos_token": {
    "content": "<end_of_turn>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "image_token": "<image_soft_token>",
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
 }
--- a/tokenizer.json
+++ b/tokenizer.json
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
 size 33384568
--- a/tokenizer.model
+++ b/tokenizer.model
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:1299c11d7cf632ef3b4e11937501358ada021bbdf7c47638d13c0ee982f2e79c
 size 4689074
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
--- a/train/log.json
+++ b/train/log.json
--- a/train/training_loss.png
+++ b/train/training_loss.png
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:2d81fc8964aab5da8db28d4e4eec46e3fc515002844a08ff0c4a11a4bd03d5e0
 size 107723
--- a/train/validation_loss.png
+++ b/train/validation_loss.png