Initialize project; model provided by the ModelHub XC community

Model: MihaiPopa-1/Qwen-3-0.6B-Claude-4.7-Opus-Distilled Source: Original Platform
2026-05-08 09:20:48 +08:00
commit 7ebebae426
7 changed files with 672 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,36 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,233 @@
+---
+base_model: Qwen/Qwen3-0.6B
+# base_model: Unsloth/Qwen3-0.6B-Unsloth-bnb-4bit - Variant that I used for fine-tuning (4-bit BNB quant by Unsloth)
+tags:
+- text-generation-inference
+- transformers
+- unsloth
+- qwen3
+- claude
+- reasoning
+- 4.7-opus
+- opus-4.7
+- claude-4.7-opus
+- claude-opus-4.7
+- distilled
+- claude-opus
+license: apache-2.0
+language:
+- en
+datasets:
+- lordx64/reasoning-distill-opus-4-7-max-sft
+---
+
+# Qwen 3 0.6B (Claude 4.7 Opus Distilled)
+
+What happens if you take the reasoning of Claude 4.7 Opus and put it in Qwen 3 0.6B? You'll get Qwen 3 0.6B (Claude 4.7 Opus Distilled)!
+
+Fine-tuned from [Qwen 3 0.6B](https://www.huggingface.co/Qwen/Qwen3-0.6B) (with Unsloth), this model is designed for tackling hard problems on any device!
+
+# Features
+* **Adaptive Reasoning:** Passes the Strawberry test and tackles fairly hard problems by spending many reasoning tokens, with results that can be comparable to larger models!
+* **Tiny Size:** At 0.6B parameters (about 1.2 GB in FP16), it is faster and uses less memory than larger models.
+
+# Issues
+* Don't rely on it for factual accuracy: in the Minecraft tests I tried, it hallucinated wrong answers!
+
+## Example
+```python
+from unsloth import FastLanguageModel
+from transformers import TextStreamer
+import torch
+
+# 1. Configuration
+max_seq_length = 8192 # Your reasoning trace might be long
+dtype = None          # None for auto detection (FP16 for T4, BF16 for Ampere)
+load_in_4bit = False  # Load in 16-bit rather than 4-bit for better logic/SVG stability
+
+# 2. Load the model and tokenizer from HF
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name = "MihaiPopa-1/Qwen-3-0.6B-Claude-4.7-Opus-Distilled",
+    max_seq_length = max_seq_length,
+    dtype = dtype,
+    load_in_4bit = load_in_4bit,
+    # device_map = "auto", # Unsloth handles this, but you can keep it if needed
+)
+
+# 3. Switch to Inference Mode
+FastLanguageModel.for_inference(model)
+
+# 4. Define the Streamer
+# skip_prompt=True ensures you don't print the input prompt twice
+text_streamer = TextStreamer(tokenizer, skip_prompt=True)
+
+# 5. The High-Reasoning Prompt
+prompt = """<|im_start|>system
+You are a helpful AI assistant.<|im_end|>
+<|im_start|>user
+Give 2-digit numbers which the product of each digit equals 24.<|im_end|>
+<|im_start|>assistant"""
+
+inputs = tokenizer([prompt], return_tensors = "pt").to("cuda")
+
+# 6. Generate with the streamer
+_ = model.generate(
+    **inputs,
+    streamer = text_streamer, # This line enables the real-time display
+    max_new_tokens = 8192,
+    do_sample = True,         # Required for temperature/top_p/top_k to take effect
+    temperature = 0.6,
+    top_p = 0.95,
+    top_k = 20,
+    repetition_penalty = 1.15,
+    use_cache = True
+)
+```
+
+This produced the following output:
+```
+<think>
+Okay, so I need to find two-digit numbers where the product of their digits is equal to 24. Let me think about how to approach this.
+
+First, let's recall that a two-digit number has tens and units place. So if the number is AB (where A is the tens digit and B is the units), then the product of its digits would be A*B = 24. My goal is to find all such pairs (A,B) where A and B are single digits (from 0-9) and their product is 24.
+
+Let me start by listing possible factors of 24. The prime factorization of 24 is 2^3 * 3. Therefore, any combination of digits whose product is 24 must include these primes in some way. Since we're dealing with two digits, maybe I can list all possible combinations of two digits from 1 to 9 (since leading zeros aren't allowed for two-digit numbers) that multiply to 24.
+
+Let me try different possibilities:
+
+Starting with 1 as one of the digits. If A=1, then B needs to be 24. But since B is a single digit, there's no solution here because 24 isn't a valid digit. Similarly, if B=1, then A should be 24, again invalid. So 1 cannot be part of either digit.
+
+Next, considering 2 as one of the digits. If A=2, then B=12. Again, 12 isn't a single digit. If B=2, then A=12, also not valid. So 2 doesn't work unless both digits are larger than 2.
+
+What about 3? If A=3, then B=8. Both 3 and 8 are single digits. That works! So 3*8=24. So the number could be 38 or 83.
+
+Wait, but what about other combinations?
+
+How about 4? If A=4, then B=6. Both 4 and 6 are single digits. Product is 24. Number: 46 or 64.
+
+Similarly, 5? Then B would have to be 24/5 = 4.8, which isn't an integer. Not valid.
+
+6? Then B=24/6=4. Valid. So 64 or 46.
+
+7? 24 divided by 7 is approximately 3.428... Not integer.
+
+8? 24 divided by 8 is 3. Which gives us 83 or 38.
+
+Same with 9? 24 divided by 9 is roughly 2.666..., not integer.
+
+So compiling all valid ones:
+
+From 3 and 8: 38, 83
+
+From 4 and 6: 46, 64
+
+From 6 and 4: same as above, already covered.
+
+Also, check if there are more possibilities. For example, 2 and 12 (invalid), 3 and 8, etc.
+
+Is there another pair? Let me see. What about 12 and something else? No, they have to be single digits. How about 24 and something? No, only two digits.
+
+Therefore, the two-digit numbers are 38, 83; 46, 64.
+
+But wait, let me verify once more. Let's take 38: 3*8=24 ✔️. 83: 8*3=24 ✔️. 46: 4*6=24 ✔️. 64: 6*4=24 ✔️. Yes, those are correct.
+
+Are there any others? Let me think. Suppose someone might consider 24 itself as a two-digit number, but it's already considered as 24, but the problem says "two-digit numbers", so 24 is acceptable. However, does 24 count? Wait, the question says "product of each digit". For 24, the digits are 2 and 4, product is 8 ≠ 24. So 24 is excluded. Hence, our previous list is correct.
+
+Another thought: Could there be numbers like 12, 24, etc., but as mentioned earlier, only when digits are higher. So yes, seems like 38, 46, 64, 83 are the solutions.
+
+I don't think there are more possibilities. All other combinations either result in non-integer digits or exceed the maximum single-digit value.
+</think>
+
+To solve the problem of finding **two-digit numbers** where the **product of each digit equals 24**, follow these steps:
+
+---
+
+### Step-by-step Analysis:
+
+We are looking for two-digit numbers $ \text{AB} $, i.e., numbers between 10 and 99 inclusive, such that:
+
+$$
+A \times B = 24
+$$
+
+Where:
+- $ A $ is the tens digit,
+- $ B $ is the units digit,
+- Both $ A $ and $ B $ are integers between 0 and 9 (inclusive).
+
+---
+
+### Finding Valid Combinations:
+
+Start by factoring 24 into two integers between 1 and 9:
+
+- $ 3 \times 8 = 24 $
+- $ 4 \times 6 = 24 $
+
+These give us the following valid two-digit numbers:
+
+- $ 38 $ → $ 3 \times 8 = 24 $
+- $ 83 $ → $ 8 \times 3 = 24 $
+- $ 46 $ → $ 4 \times 6 = 24 $
+- $ 64 $ → $ 6 \times 4 = 24 $
+
+All four combinations satisfy the condition.
+
+---
+
+### Final Answer:
+
+The two-digit numbers are:
+
+$$
+\boxed{38}, \quad \boxed{83}, \quad \boxed{46}, \quad \boxed{64}
+$$<|im_end|>
+```
+
+# Usage
+Code generated by Gemini 3 Flash, with minor modifications by me:
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+
+# 1. Load from your Hugging Face Repo
+model_id = "MihaiPopa-1/Qwen-3-0.6B-Claude-4.7-Opus-Distilled"
+
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.float32, # Standard for CPU
+    device_map="cpu"           # Forces CPU usage
+)
+
+# 2. Generate
+prompt = "<|im_start|>system\nYou are a helpful AI assistant.<|im_end|>\n<|im_start|>user\nGive 2-digit numbers which the product of each digit equals 24<|im_end|>\n<|im_start|>assistant\n"
+inputs = tokenizer(prompt, return_tensors="pt").to("cpu")
+
+with torch.no_grad():
+    outputs = model.generate(
+        **inputs,
+        max_new_tokens = 8192,
+        do_sample = True,  # Required for temperature/top_p/top_k to take effect
+        temperature = 0.6,
+        top_p = 0.95,
+        top_k = 20,
+        repetition_penalty = 1.15,
+        use_cache = True
+    )
+
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+
+# Data Used
+Thanks to [Opus 4.7 Max SFT](https://www.huggingface.co/datasets/lordx64/reasoning-distill-opus-4-7-max-sft) for the amazing dataset!
+
+---
+
+# Uploaded finetuned model
+
+- **Developed by:** MihaiPopa-1
+- **License:** apache-2.0
+- **Finetuned from model :** unsloth/qwen3-0.6b-unsloth-bnb-4bit
+
+This qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
+
+[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
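Both README examples write the ChatML prompt by hand. The sketch below reproduces that layout in plain Python for system and user messages only, mirroring the no-tools branch of `chat_template.jinja`. `build_chatml_prompt` is a hypothetical helper name, not part of this repository or of any library; with transformers installed, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` should render the shipped template directly.

```python
def build_chatml_prompt(messages, add_generation_prompt=True, enable_thinking=True):
    """Render system/user messages in ChatML (simplified, hypothetical helper).

    Assumption: only 'system' and 'user' roles are handled; assistant and tool
    messages need the extra logic in chat_template.jinja and are rejected here.
    """
    parts = []
    for message in messages:
        role = message["role"]
        if role not in ("system", "user"):
            raise ValueError(f"unsupported role in this sketch: {role}")
        parts.append(f"<|im_start|>{role}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
        if not enable_thinking:
            # The template pre-fills an empty think block to skip reasoning.
            parts.append("<think>\n\n</think>\n\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Give 2-digit numbers which the product of each digit equals 24."},
]
prompt = build_chatml_prompt(messages)
print(prompt)
```

Building the prompt from a message list keeps the trailing newline after `<|im_start|>assistant` consistent between examples.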
--- a/chat_template.jinja
+++ b/chat_template.jinja
@@ -0,0 +1,99 @@
+{%- if tools %}
+    {{- '<|im_start|>system\n' }}
+    {%- if messages[0].role == 'system' %}
+        {{- messages[0].content + '\n\n' }}
+    {%- endif %}
+    {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+    {%- for tool in tools %}
+        {{- "\n" }}
+        {{- tool | tojson }}
+    {%- endfor %}
+    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+{%- else %}
+    {%- if messages[0].role == 'system' %}
+        {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
+    {%- endif %}
+{%- endif %}
+{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
+{%- for forward_message in messages %}
+    {%- set index = (messages|length - 1) - loop.index0 %}
+    {%- set message = messages[index] %}
+    {%- set current_content = message.content if message.content is defined and message.content is not none else '' %}
+    {%- set tool_start = '<tool_response>' %}
+    {%- set tool_start_length = tool_start|length %}
+    {%- set start_of_message = current_content[:tool_start_length] %}
+    {%- set tool_end = '</tool_response>' %}
+    {%- set tool_end_length = tool_end|length %}
+    {%- set start_pos = (current_content|length) - tool_end_length %}
+    {%- if start_pos < 0 %}
+        {%- set start_pos = 0 %}
+    {%- endif %}
+    {%- set end_of_message = current_content[start_pos:] %}
+    {%- if ns.multi_step_tool and message.role == "user" and not(start_of_message == tool_start and end_of_message == tool_end) %}
+        {%- set ns.multi_step_tool = false %}
+        {%- set ns.last_query_index = index %}
+    {%- endif %}
+{%- endfor %}
+{%- for message in messages %}
+    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
+        {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
+    {%- elif message.role == "assistant" %}
+        {%- set m_content = message.content if message.content is defined and message.content is not none else '' %}
+        {%- set content = m_content %}
+        {%- set reasoning_content = '' %}
+        {%- if message.reasoning_content is defined and message.reasoning_content is not none %}
+            {%- set reasoning_content = message.reasoning_content %}
+        {%- else %}
+            {%- if '</think>' in m_content %}
+                {%- set content = (m_content.split('</think>')|last).lstrip('\n') %}
+                {%- set reasoning_content = (m_content.split('</think>')|first).rstrip('\n') %}
+                {%- set reasoning_content = (reasoning_content.split('<think>')|last).lstrip('\n') %}
+            {%- endif %}
+        {%- endif %}
+        {%- if loop.index0 > ns.last_query_index %}
+            {%- if loop.last or (not loop.last and (not reasoning_content.strip() == '')) %}
+                {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
+            {%- else %}
+                {{- '<|im_start|>' + message.role + '\n' + content }}
+            {%- endif %}
+        {%- else %}
+            {{- '<|im_start|>' + message.role + '\n' + content }}
+        {%- endif %}
+        {%- if message.tool_calls %}
+            {%- for tool_call in message.tool_calls %}
+                {%- if (loop.first and content) or (not loop.first) %}
+                    {{- '\n' }}
+                {%- endif %}
+                {%- if tool_call.function %}
+                    {%- set tool_call = tool_call.function %}
+                {%- endif %}
+                {{- '<tool_call>\n{"name": "' }}
+                {{- tool_call.name }}
+                {{- '", "arguments": ' }}
+                {%- if tool_call.arguments is string %}
+                    {{- tool_call.arguments }}
+                {%- else %}
+                    {{- tool_call.arguments | tojson }}
+                {%- endif %}
+                {{- '}\n</tool_call>' }}
+            {%- endfor %}
+        {%- endif %}
+        {{- '<|im_end|>\n' }}
+    {%- elif message.role == "tool" %}
+        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+            {{- '<|im_start|>user' }}
+        {%- endif %}
+        {{- '\n<tool_response>\n' }}
+        {{- message.content }}
+        {{- '\n</tool_response>' }}
+        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+            {{- '<|im_end|>\n' }}
+        {%- endif %}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+    {{- '<|im_start|>assistant\n' }}
+    {%- if enable_thinking is defined and enable_thinking is false %}
+        {{- '<think>\n\n</think>\n\n' }}
+    {%- endif %}
+{%- endif %}
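When an assistant message carries no `reasoning_content`, the template above splits it at `</think>` into reasoning and final answer. The same string operations can be sketched in Python, for example to separate the reasoning trace from the answer in decoded output that still contains the tags. `split_reasoning` is a hypothetical helper name used only for illustration.

```python
def split_reasoning(text):
    """Split decoded assistant text into (reasoning, answer).

    Mirrors the split/strip calls in chat_template.jinja; if no closing
    </think> tag is present, the whole text is treated as the answer.
    """
    if "</think>" not in text:
        return "", text
    pieces = text.split("</think>")
    answer = pieces[-1].lstrip("\n")
    reasoning = pieces[0].rstrip("\n")
    reasoning = reasoning.split("<think>")[-1].lstrip("\n")
    return reasoning, answer


reasoning, answer = split_reasoning(
    "<think>\n3*8=24 and 4*6=24.\n</think>\n\nThe numbers are 38, 83, 46, 64."
)
print(reasoning)
print(answer)
```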
--- a/config.json
+++ b/config.json
@@ -0,0 +1,64 @@
+{
+    "architectures": [
+        "Qwen3ForCausalLM"
+    ],
+    "attention_bias": false,
+    "attention_dropout": 0.0,
+    "bos_token_id": null,
+    "torch_dtype": "float16",
+    "eos_token_id": 151645,
+    "head_dim": 128,
+    "hidden_act": "silu",
+    "hidden_size": 1024,
+    "initializer_range": 0.02,
+    "intermediate_size": 3072,
+    "layer_types": [
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention",
+        "full_attention"
+    ],
+    "max_position_embeddings": 40960,
+    "max_window_layers": 28,
+    "model_type": "qwen3",
+    "num_attention_heads": 16,
+    "num_hidden_layers": 28,
+    "num_key_value_heads": 8,
+    "pad_token_id": 151669,
+    "rms_norm_eps": 1e-06,
+    "rope_parameters": {
+        "rope_theta": 1000000,
+        "rope_type": "default"
+    },
+    "sliding_window": null,
+    "tie_word_embeddings": true,
+    "unsloth_fixed": true,
+    "unsloth_version": "2026.4.8",
+    "use_cache": false,
+    "use_sliding_window": false,
+    "vocab_size": 151936
+}
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:358dd093536d80319700990ff72e55acd2fcd24e8c7081a439165982c0d3a053
+size 1192135096
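As a sanity check, the parameter count implied by `config.json` can be compared with the `model.safetensors` size in the pointer above. The sketch below assumes the usual Qwen3 layer layout (bias-free q/k/v/o projections, a gated MLP with three matrices, two RMSNorm weights per layer plus per-head q/k norms of size `head_dim`, and a final norm); that layout is an assumption about the architecture, not something stated in this commit.

```python
# Values copied from config.json in this commit.
cfg = {
    "vocab_size": 151936,
    "hidden_size": 1024,
    "intermediate_size": 3072,
    "num_hidden_layers": 28,
    "num_attention_heads": 16,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "tie_word_embeddings": True,
}


def count_parameters(cfg):
    """Estimate the parameter count under the assumed Qwen3 layer layout."""
    h = cfg["hidden_size"]
    d = cfg["head_dim"]
    q_out = cfg["num_attention_heads"] * d
    kv_out = cfg["num_key_value_heads"] * d
    attention = h * q_out + 2 * h * kv_out + q_out * h  # q, k, v, o (no bias)
    mlp = 3 * h * cfg["intermediate_size"]              # gate, up, down
    norms = 2 * h + 2 * d                               # layer norms + q/k norms
    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else embeddings
    return embeddings + lm_head + cfg["num_hidden_layers"] * per_layer + h


total = count_parameters(cfg)
fp16_bytes = total * 2        # torch_dtype is float16: 2 bytes per weight
lfs_size = 1192135096         # size field of the model.safetensors pointer
print(total, fp16_bytes, lfs_size - fp16_bytes)
```

Under these assumptions the estimate is 596,049,920 parameters, or 1,192,099,840 bytes in FP16. The remaining 35,256 bytes of the file would be consistent with the safetensors header, though this commit does not confirm that.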
--- a/tokenizer.json
+++ b/tokenizer.json
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d7430e9138b76e93fb6f93462394d236b411111aef53cb421ba97d2691040cca
+size 11423114
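The `model.safetensors` and `tokenizer.json` entries in this diff are Git LFS pointers (a `version`, an `oid`, and a `size` line), not the files themselves. A minimal sketch of parsing such a pointer and checking downloaded bytes against it follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and a real check on a 1.2 GB file would hash it in chunks rather than load it into memory.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the size and hash the pointer records."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algorithm"], data).hexdigest() == pointer["digest"]


# Pointer contents for tokenizer.json as shown in this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:d7430e9138b76e93fb6f93462394d236b411111aef53cb421ba97d2691040cca\n"
    "size 11423114\n"
)
print(pointer["algorithm"], pointer["size"])
```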
--- a/tokenizer_config.json
+++ b/tokenizer_config.json