Initialize project; model provided by the ModelHub XC community

Model: qingy2024/Formatter-0.6B Source: Original Platform
2026-06-17 09:31:17 +08:00
commit 19bb28e17a
11 changed files with 151896 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,36 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,82 @@
+---
+base_model: unsloth/Qwen3-0.6B-Base
+tags:
+- text-generation-inference
+- transformers
+- unsloth
+- qwen2
+- trl
+- sft
+license: apache-2.0
+language:
+- en
+---
+
+# Formatter 0.6B
+
+- **Developed by:** qingy2024
+- **License:** apache-2.0
+- **Finetuned from model:** Qwen3 0.6B (base), `unsloth/Qwen3-0.6B-Base`
+
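+This model is mainly my experiment in adding special tokens and changing the chat template during fine-tuning. The input problem is wrapped in `<|problem_start|>` / `<|problem_end|>`, and the model writes its reformatted version between `<|formatted_problem_start|>` and `<|formatted_problem_end|>`. The chat template is: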
+
+```jinja
+{%- set last_message = messages[-1] -%}
+{%- if last_message.role == "user" -%}
+{{- '<|problem_start|>\n' + last_message.content + '<|problem_end|>\n' -}}
+{%- elif last_message.role == "assistant" -%}
+{%- for message in messages -%}
+{%- if message.role == "user" -%}
+{{- '<|problem_start|>\n' + message.content + '<|problem_end|>\n' -}}
+{%- elif message.role == "assistant" -%}
+{{- '<|formatted_problem_start|>\n' + message.content + '<|formatted_problem_end|>\n' -}}
+{%- else -%}
+{{- raise('Unknown role: ' + message.role) -}}
+{%- endif -%}
+{%- endfor -%}
+{%- else -%}
+{{- raise('Unsupported role: ' + last_message.role) -}}
+{%- endif -%}
+{%- if add_generation_prompt and last_message.role == "user" -%}
+{{- '<|formatted_problem_start|>\n' -}}
+{%- endif -%}
+```
+
+Example (the model repeats the problem and labels the answer options A to F):
+
+```
+User: Read the excerpt from Dr. Martin Luther King Jr.’s "I Have a Dream" speech.
+
+I am not unmindful that some of you have come here out of great trials and tribulations. Some of you have come fresh from narrow jail cells. Some of you have come from areas where your quest for freedom left you battered by the storms of persecution and staggered by the winds of police brutality. You have been the veterans of creative suffering. Continue to work with the faith that unearned suffering is redemptive. Go back to Mississippi, go back to Alabama, go back to South Carolina, go back to Georgia, go back to Louisiana, go back to the slums and ghettos of our northern cities, knowing that somehow this situation can and will be changed. Let us not wallow in the valley of despair.
+
+Which lines in this paragraph can be used as examples of metaphor? Select 3 options.
+
+great trials and tribulations
+storms of persecution
+winds of police brutality
+go back to Georgia
+this situation can and will be changed
+let us not wallow in the valley of despair
+```
+```
+LLM: Read the excerpt from Dr. Martin Luther King Jr.’s "I Have a Dream" speech.
+
+I am not unmindful that some of you have come here out of great trials and tribulations. Some of you have come fresh from narrow jail cells. Some of you have come from areas where your quest for freedom left you battered by the storms of persecution and staggered by the winds of police brutality. You have been the veterans of creative suffering. Continue to work with the faith that unearned suffering is redemptive. Go back to Mississippi, go back to Alabama, go back to South Carolina, go back to Georgia, go back to Louisiana, go back to the slums and ghettos of our northern cities, knowing that somehow this situation can and will be changed. Let us not wallow in the valley of despair.
+
+Which lines in this paragraph can be used as examples of metaphor? Select 3 options.
+A. great trials and tribulations
+B. storms of persecution
+C. winds of police brutality
+D. go back to Georgia
+E. this situation can and will be changed
+F. let us not wallow in the valley of despair
+```
+
+### Lessons Learned
+
+- When adding new tokens to the model, LoRA gives much worse results. Use full fine-tuning instead.
+- Be very careful with chat templates. Every character, newline, and space matters, and a prompt that deviates from the training format can degrade the model's performance.
+- For Qwen base models, leave `<|endoftext|>` as the EOS token. You can then train the model to use other tokens such as `<|im_end|>`. If you set the EOS token to `<|im_end|>`, the model will get confused.
+- For Qwen models in general, always put `<|endoftext|>` at the end of each training example.
+
+[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
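Because the README stresses that every character of the chat template matters, it can help to see the exact string the template produces. The following is a minimal plain-Python sketch that mirrors the Jinja template from the README; it is not the code used for training. The function names `render_prompt` and `training_text` are hypothetical, and appending `<|endoftext|>` directly after the final newline is an assumption based on the "Lessons Learned" list, not something the commit confirms.

```python
EOS_TOKEN = "<|endoftext|>"  # EOS token listed in special_tokens_map.json


def render_prompt(messages, add_generation_prompt=False):
    """Mirror the README's Jinja chat template in plain Python."""
    last = messages[-1]
    parts = []
    if last["role"] == "user":
        # Only the final user message is rendered in this branch.
        parts.append("<|problem_start|>\n" + last["content"] + "<|problem_end|>\n")
    elif last["role"] == "assistant":
        # The whole conversation is rendered when it ends with an assistant turn.
        for message in messages:
            if message["role"] == "user":
                parts.append(
                    "<|problem_start|>\n" + message["content"] + "<|problem_end|>\n"
                )
            elif message["role"] == "assistant":
                parts.append(
                    "<|formatted_problem_start|>\n"
                    + message["content"]
                    + "<|formatted_problem_end|>\n"
                )
            else:
                raise ValueError("Unknown role: " + message["role"])
    else:
        raise ValueError("Unsupported role: " + last["role"])
    if add_generation_prompt and last["role"] == "user":
        # Open the tag the model is expected to complete.
        parts.append("<|formatted_problem_start|>\n")
    return "".join(parts)


def training_text(problem, formatted):
    """Hypothetical helper: one training example with EOS appended (assumed placement)."""
    messages = [
        {"role": "user", "content": problem},
        {"role": "assistant", "content": formatted},
    ]
    return render_prompt(messages) + EOS_TOKEN


if __name__ == "__main__":
    prompt = render_prompt(
        [{"role": "user", "content": "Pick one.\nred\nblue"}],
        add_generation_prompt=True,
    )
    print(repr(prompt))
    print(repr(training_text("Pick one.\nred\nblue", "Pick one.\nA. red\nB. blue")))
```

Printing with `repr` makes the newlines visible, which is the detail most easily lost when a prompt is built by hand instead of through the tokenizer's template.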
--- a/added_tokens.json
+++ b/added_tokens.json
@@ -0,0 +1,32 @@
+{
+  "</think>": 151668,
+  "</tool_call>": 151658,
+  "</tool_response>": 151666,
+  "<think>": 151667,
+  "<tool_call>": 151657,
+  "<tool_response>": 151665,
+  "<|box_end|>": 151649,
+  "<|box_start|>": 151648,
+  "<|endoftext|>": 151643,
+  "<|file_sep|>": 151664,
+  "<|fim_middle|>": 151660,
+  "<|fim_pad|>": 151662,
+  "<|fim_prefix|>": 151659,
+  "<|fim_suffix|>": 151661,
+  "<|formatted_problem_end|>": 151672,
+  "<|formatted_problem_start|>": 151671,
+  "<|im_end|>": 151645,
+  "<|im_start|>": 151644,
+  "<|image_pad|>": 151655,
+  "<|object_ref_end|>": 151647,
+  "<|object_ref_start|>": 151646,
+  "<|problem_end|>": 151670,
+  "<|problem_start|>": 151669,
+  "<|quad_end|>": 151651,
+  "<|quad_start|>": 151650,
+  "<|repo_name|>": 151663,
+  "<|video_pad|>": 151656,
+  "<|vision_end|>": 151653,
+  "<|vision_pad|>": 151654,
+  "<|vision_start|>": 151652
+}
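The four tokens introduced by this model take the IDs directly after the existing entries in `added_tokens.json`, and `vocab_size` in `config.json` (151673) is exactly one more than the highest of them. A small sketch of that consistency check follows. The helper name `check_token_layout` is hypothetical, and treating `</think>` (151668) as the last pre-existing ID is an assumption read off the ID ordering in this file.

```python
# IDs copied from added_tokens.json in this commit.
NEW_TOKENS = {
    "<|problem_start|>": 151669,
    "<|problem_end|>": 151670,
    "<|formatted_problem_start|>": 151671,
    "<|formatted_problem_end|>": 151672,
}
LAST_EXISTING_ID = 151668  # "</think>", assumed to be the last pre-existing token
VOCAB_SIZE = 151673        # "vocab_size" in config.json


def check_token_layout(new_tokens, last_existing_id, vocab_size):
    """Return True if the new IDs are contiguous and end at vocab_size - 1."""
    ids = sorted(new_tokens.values())
    expected = list(range(last_existing_id + 1, last_existing_id + 1 + len(ids)))
    return ids == expected and ids[-1] == vocab_size - 1


if __name__ == "__main__":
    print(check_token_layout(NEW_TOKENS, LAST_EXISTING_ID, VOCAB_SIZE))
```

A mismatch here would suggest that the embedding matrix and the tokenizer disagree about how many tokens exist.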
--- a/config.json
+++ b/config.json
@@ -0,0 +1,33 @@
+{
+  "architectures": [
+    "Qwen3ForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "chat_template": "{%- set last_message = messages[-1] -%}\n{%- if last_message.role == \"user\" -%}\n{{- '<|problem_start|>\n' + last_message.content + '<|problem_end|>\n' -}}\n{%- elif last_message.role == \"assistant\" -%}\n{%- for message in messages -%}\n{%- if message.role == \"user\" -%}\n{{- '<|problem_start|>\n' + message.content + '<|problem_end|>\n' -}}\n{%- elif message.role == \"assistant\" -%}\n{{- '<|formatted_problem_start|>\n' + message.content + '<|formatted_problem_end|>\n' -}}\n{%- else -%}\n{{- raise('Unknown role: ' + message.role) -}}\n{%- endif -%}\n{%- endfor -%}\n{%- else -%}\n{{- raise('Unsupported role: ' + last_message.role) -}}\n{%- endif -%}\n{%- if add_generation_prompt and last_message.role == \"user\" -%}\n{{- '<|formatted_problem_start|>\n' -}}\n{%- endif -%}",
+  "eos_token_id": 151643,
+  "head_dim": 128,
+  "hidden_act": "silu",
+  "hidden_size": 1024,
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "max_position_embeddings": 32768,
+  "max_window_layers": 28,
+  "model_type": "qwen3",
+  "num_attention_heads": 16,
+  "num_hidden_layers": 28,
+  "num_key_value_heads": 8,
+  "pad_token_id": 151654,
+  "rms_norm_eps": 1e-06,
+  "rope_scaling": null,
+  "rope_theta": 1000000,
+  "sliding_window": null,
+  "tie_word_embeddings": true,
+  "torch_dtype": "bfloat16",
+  "transformers_version": "4.51.3",
+  "unsloth_fixed": true,
+  "unsloth_version": "2025.5.5",
+  "use_cache": true,
+  "use_sliding_window": false,
+  "vocab_size": 151673
+}
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,8 @@
+{
+  "bos_token_id": 151643,
+  "eos_token_id": 151643,
+  "max_length": 32768,
+  "max_new_tokens": 2048,
+  "pad_token_id": 151654,
+  "transformers_version": "4.51.3"
+}
--- a/merges.txt
+++ b/merges.txt
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6bc43db6266963fb81109ae184e8c8c62b1f19c37f3673a6de664dc3d000aa68
+size 1191596472
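As a rough sanity check, the size in the LFS pointer can be compared with a parameter count estimated from `config.json`. The sketch below assumes a particular layer layout that the commit does not spell out: bias-free attention and MLP projections, two RMSNorm weights per layer over `hidden_size`, two per-head query/key norm weights over `head_dim`, and a single embedding matrix shared with the output head (`tie_word_embeddings` is true). The function name `estimate_parameters` is hypothetical.

```python
# Values copied from config.json in this commit.
CONFIG = {
    "vocab_size": 151673,
    "hidden_size": 1024,
    "intermediate_size": 3072,
    "num_hidden_layers": 28,
    "num_attention_heads": 16,
    "num_key_value_heads": 8,
    "head_dim": 128,
}
SAFETENSORS_BYTES = 1191596472  # "size" in the model.safetensors LFS pointer
BYTES_PER_PARAM = 2             # torch_dtype is bfloat16


def estimate_parameters(cfg):
    """Estimate the parameter count under the layout assumed in the lead-in."""
    hidden = cfg["hidden_size"]
    head_dim = cfg["head_dim"]
    q_out = cfg["num_attention_heads"] * head_dim
    kv_out = cfg["num_key_value_heads"] * head_dim
    # q_proj, k_proj, v_proj, o_proj (no biases assumed)
    attention = hidden * q_out + 2 * hidden * kv_out + q_out * hidden
    # gate_proj, up_proj, down_proj
    mlp = 3 * hidden * cfg["intermediate_size"]
    # two hidden-size norms plus query/key norms over head_dim (assumed)
    norms = 2 * hidden + 2 * head_dim
    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * hidden  # shared with the output head
    final_norm = hidden
    return embeddings + cfg["num_hidden_layers"] * per_layer + final_norm


if __name__ == "__main__":
    params = estimate_parameters(CONFIG)
    weight_bytes = params * BYTES_PER_PARAM
    print("estimated parameters:", params)
    print("bytes not explained by weights:", SAFETENSORS_BYTES - weight_bytes)
```

Under these assumptions the estimate is 595,780,608 parameters, which leaves 35,256 bytes of the file unaccounted for, a plausible size for the safetensors header.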
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,46 @@
+{
+  "additional_special_tokens": [
+    {
+      "content": "<|problem_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false
+    },
+    {
+      "content": "<|problem_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false
+    },
+    {
+      "content": "<|formatted_problem_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false
+    },
+    {
+      "content": "<|formatted_problem_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false
+    }
+  ],
+  "eos_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<|vision_pad|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
--- a/tokenizer.json
+++ b/tokenizer.json
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:80fcdc481bbdbbb1e30da04bb78edb9374807f86c83d9895905fdac2f91f0ae8
+size 11423446
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,264 @@
+{
+  "add_bos_token": false,
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "151643": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151644": {
+      "content": "<|im_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151645": {
+      "content": "<|im_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151646": {
+      "content": "<|object_ref_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151647": {
+      "content": "<|object_ref_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151648": {
+      "content": "<|box_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151649": {
+      "content": "<|box_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151650": {
+      "content": "<|quad_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151651": {
+      "content": "<|quad_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151652": {
+      "content": "<|vision_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151653": {
+      "content": "<|vision_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151654": {
+      "content": "<|vision_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151655": {
+      "content": "<|image_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151656": {
+      "content": "<|video_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151657": {
+      "content": "<tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151658": {
+      "content": "</tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151659": {
+      "content": "<|fim_prefix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151660": {
+      "content": "<|fim_middle|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151661": {
+      "content": "<|fim_suffix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151662": {
+      "content": "<|fim_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151663": {
+      "content": "<|repo_name|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151664": {
+      "content": "<|file_sep|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151665": {
+      "content": "<tool_response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151666": {
+      "content": "</tool_response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151667": {
+      "content": "<think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151668": {
+      "content": "</think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151669": {
+      "content": "<|problem_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151670": {
+      "content": "<|problem_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151671": {
+      "content": "<|formatted_problem_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151672": {
+      "content": "<|formatted_problem_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "additional_special_tokens": [
+    "<|problem_start|>",
+    "<|problem_end|>",
+    "<|formatted_problem_start|>",
+    "<|formatted_problem_end|>"
+  ],
+  "bos_token": null,
+  "chat_template": "{%- set last_message = messages[-1] -%}\n{%- if last_message.role == \"user\" -%}\n{{- '<|problem_start|>\n' + last_message.content + '<|problem_end|>\n' -}}\n{%- elif last_message.role == \"assistant\" -%}\n{%- for message in messages -%}\n{%- if message.role == \"user\" -%}\n{{- '<|problem_start|>\n' + message.content + '<|problem_end|>\n' -}}\n{%- elif message.role == \"assistant\" -%}\n{{- '<|formatted_problem_start|>\n' + message.content + '<|formatted_problem_end|>\n' -}}\n{%- else -%}\n{{- raise('Unknown role: ' + message.role) -}}\n{%- endif -%}\n{%- endfor -%}\n{%- else -%}\n{{- raise('Unsupported role: ' + last_message.role) -}}\n{%- endif -%}\n{%- if add_generation_prompt and last_message.role == \"user\" -%}\n{{- '<|formatted_problem_start|>\n' -}}\n{%- endif -%}",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|endoftext|>",
+  "errors": "replace",
+  "extra_special_tokens": {},
+  "model_max_length": 32768,
+  "pad_token": "<|vision_pad|>",
+  "padding_side": "right",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null
+}
--- a/vocab.json
+++ b/vocab.json