Initialize project; model provided by the ModelHub XC community

Model: MihaiPopa-1/Qwen-3-0.6B-Claude-4.7-Opus-Distilled
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-08 09:20:48 +08:00
commit 7ebebae426
7 changed files with 672 additions and 0 deletions

36
.gitattributes vendored Normal file

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

233
README.md Normal file

@@ -0,0 +1,233 @@
---
base_model: Qwen/Qwen3-0.6B
# base_model: Unsloth/Qwen3-0.6B-Unsloth-bnb-4bit - Variant that I used for fine-tuning (4-bit BNB quant by Unsloth)
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
- claude
- reasoning
- 4.7-opus
- opus-4.7
- claude-4.7-opus
- claude-opus-4.7
- distilled
- claude-opus
license: apache-2.0
language:
- en
datasets:
- lordx64/reasoning-distill-opus-4-7-max-sft
---
# Qwen 3 0.6B (Claude 4.7 Opus Distilled)
What happens if you take the reasoning of Claude 4.7 Opus and put it into Qwen 3 0.6B? You get Qwen 3 0.6B (Claude 4.7 Opus Distilled)!
Fine-tuned from [Qwen 3 0.6B](https://www.huggingface.co/Qwen/Qwen3-0.6B) with Unsloth, this model is designed to tackle hard problems on almost any device!
# Features
* **Adaptive Reasoning:** Passes the Strawberry test and tackles fairly hard problems. It spends a lot of reasoning tokens to do so, and on such tasks its answers can be comparable to those of larger models!
* **Tiny Size:** At 0.6B parameters, it is faster and uses less memory than larger models, so it fits on devices where they do not!
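To put a rough number on the size, the sketch below estimates the parameter count from the values in this repository's `config.json`. It assumes the usual Qwen3 layer layout (bias-free projections, RMSNorm weights, per-head q/k norms, tied embeddings); it is a back-of-the-envelope check, not an official figure.

```python
# Rough parameter count from config.json values.
# Assumption: standard Qwen3 layout with bias-free attention/MLP projections,
# RMSNorm weights, per-head q/k norms, and tied input/output embeddings.
hidden_size = 1024
intermediate_size = 3072
num_hidden_layers = 28
num_attention_heads = 16
num_key_value_heads = 8
head_dim = 128
vocab_size = 151936

q_proj = hidden_size * num_attention_heads * head_dim
k_proj = hidden_size * num_key_value_heads * head_dim
v_proj = k_proj
o_proj = num_attention_heads * head_dim * hidden_size
attention = q_proj + k_proj + v_proj + o_proj

mlp = 3 * hidden_size * intermediate_size  # gate, up and down projections
norms = 2 * hidden_size + 2 * head_dim     # two layer norms plus q/k norms

per_layer = attention + mlp + norms
embeddings = vocab_size * hidden_size      # counted once (tie_word_embeddings)
final_norm = hidden_size

total_params = num_hidden_layers * per_layer + embeddings + final_norm
fp16_bytes = 2 * total_params              # two bytes per float16 weight

print(f"parameters: {total_params:,}")
print(f"float16 weights: {fp16_bytes / 1e9:.2f} GB")
```

Under these assumptions the estimate comes to about 596 million parameters, or roughly 1.19 GB in float16, which is close to the size of `model.safetensors` in this repository.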
# Issues
* Don't rely on it for factual accuracy: in the Minecraft tests I tried, it hallucinated wrong answers!
## Example
```python
from unsloth import FastLanguageModel
from transformers import TextStreamer
import torch
# 1. Configuration
max_seq_length = 8192 # Your reasoning trace might be long
dtype = None # None for auto detection (FP16 for T4, BF16 for Ampere)
load_in_4bit = False # 16-bit weights, chosen over 4-bit for better logic/SVG stability
# 2. Load the model and tokenizer from HF
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "MihaiPopa-1/Qwen-3-0.6B-Claude-4.7-Opus-Distilled",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # device_map = "auto", # Unsloth handles this, but you can keep it if needed
)
# 3. Switch to Inference Mode
FastLanguageModel.for_inference(model)
# 4. Define the streamer
# skip_prompt=True ensures you don't print the input prompt twice
text_streamer = TextStreamer(tokenizer, skip_prompt=True)
# 5. The high-reasoning prompt (ChatML; the assistant header ends with a newline, as in the chat template)
prompt = """<|im_start|>system
You are a helpful AI assistant.<|im_end|>
<|im_start|>user
Give 2-digit numbers which the product of each digit equals 24.<|im_end|>
<|im_start|>assistant
"""
inputs = tokenizer([prompt], return_tensors = "pt").to("cuda")
# 6. Generate with the streamer
_ = model.generate(
    **inputs,
    streamer = text_streamer, # This line enables the real-time display
    max_new_tokens = 8192,
    do_sample = True, # Sampling must be enabled for temperature/top_p/top_k to take effect
    temperature = 0.6,
    top_p = 0.95,
    top_k = 20,
    repetition_penalty = 1.15,
    use_cache = True
)
```
A run of this code produced the following output:
```
<think>
Okay, so I need to find two-digit numbers where the product of their digits is equal to 24. Let me think about how to approach this.
First, let's recall that a two-digit number has tens and units place. So if the number is AB (where A is the tens digit and B is the units), then the product of its digits would be A*B = 24. My goal is to find all such pairs (A,B) where A and B are single digits (from 0-9) and their product is 24.
Let me start by listing possible factors of 24. The prime factorization of 24 is 2^3 * 3. Therefore, any combination of digits whose product is 24 must include these primes in some way. Since we're dealing with two digits, maybe I can list all possible combinations of two digits from 1 to 9 (since leading zeros aren't allowed for two-digit numbers) that multiply to 24.
Let me try different possibilities:
Starting with 1 as one of the digits. If A=1, then B needs to be 24. But since B is a single digit, there's no solution here because 24 isn't a valid digit. Similarly, if B=1, then A should be 24, again invalid. So 1 cannot be part of either digit.
Next, considering 2 as one of the digits. If A=2, then B=12. Again, 12 isn't a single digit. If B=2, then A=12, also not valid. So 2 doesn't work unless both digits are larger than 2.
What about 3? If A=3, then B=8. Both 3 and 8 are single digits. That works! So 3*8=24. So the number could be 38 or 83.
Wait, but what about other combinations?
How about 4? If A=4, then B=6. Both 4 and 6 are single digits. Product is 24. Number: 46 or 64.
Similarly, 5? Then B would have to be 24/5 = 4.8, which isn't an integer. Not valid.
6? Then B=24/6=4. Valid. So 64 or 46.
7? 24 divided by 7 is approximately 3.428... Not integer.
8? 24 divided by 8 is 3. Which gives us 83 or 38.
Same with 9? 24 divided by 9 is roughly 2.666..., not integer.
So compiling all valid ones:
From 3 and 8: 38, 83
From 4 and 6: 46, 64
From 6 and 4: same as above, already covered.
Also, check if there are more possibilities. For example, 2 and 12 (invalid), 3 and 8, etc.
Is there another pair? Let me see. What about 12 and something else? No, they have to be single digits. How about 24 and something? No, only two digits.
Therefore, the two-digit numbers are 38, 83; 46, 64.
But wait, let me verify once more. Let's take 38: 3*8=24 ✔️. 83: 8*3=24 ✔️. 46: 4*6=24 ✔️. 64: 6*4=24 ✔️. Yes, those are correct.
Are there any others? Let me think. Suppose someone might consider 24 itself as a two-digit number, but it's already considered as 24, but the problem says "two-digit numbers", so 24 is acceptable. However, does 24 count? Wait, the question says "product of each digit". For 24, the digits are 2 and 4, product is 8 ≠ 24. So 24 is excluded. Hence, our previous list is correct.
Another thought: Could there be numbers like 12, 24, etc., but as mentioned earlier, only when digits are higher. So yes, seems like 38, 46, 64, 83 are the solutions.
I don't think there are more possibilities. All other combinations either result in non-integer digits or exceed the maximum single-digit value.
</think>
To solve the problem of finding **two-digit numbers** where the **product of each digit equals 24**, follow these steps:
---
### Step-by-step Analysis:
We are looking for two-digit numbers $ \text{AB} $, i.e., numbers between 10 and 99 inclusive, such that:
$$
A \times B = 24
$$
Where:
- $ A $ is the tens digit,
- $ B $ is the units digit,
- Both $ A $ and $ B $ are integers between 0 and 9 (inclusive).
---
### Finding Valid Combinations:
Start by factoring 24 into two integers between 1 and 9:
- $ 3 \times 8 = 24 $
- $ 4 \times 6 = 24 $
These give us the following valid two-digit numbers:
- $ 38 $ → $ 3 \times 8 = 24 $
- $ 83 $ → $ 8 \times 3 = 24 $
- $ 46 $ → $ 4 \times 6 = 24 $
- $ 64 $ → $ 6 \times 4 = 24 $
All four combinations satisfy the condition.
---
### Final Answer:
The two-digit numbers are:
$$
\boxed{38}, \quad \boxed{83}, \quad \boxed{46}, \quad \boxed{64}
$$<|im_end|>
```
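Since the output wraps its reasoning in `<think>...</think>` before the final answer, it can be handy to separate the two. The helper below is a minimal sketch that loosely mirrors how `chat_template.jinja` splits assistant content on `</think>`; the name `split_reasoning` and the sample string are made up for illustration.

```python
from typing import Tuple


def split_reasoning(text: str) -> Tuple[str, str]:
    """Split a raw completion into (reasoning, answer) around </think>."""
    # Drop the end-of-turn marker if it was decoded along with the text.
    text = text.replace("<|im_end|>", "")
    if "</think>" not in text:
        # No reasoning block: treat everything as the answer.
        return "", text.strip()
    reasoning, _, answer = text.partition("</think>")
    # Keep only what follows the opening tag, if one is present.
    reasoning = reasoning.split("<think>")[-1]
    return reasoning.strip(), answer.strip()


# Hypothetical, shortened completion used only to exercise the helper.
sample = "<think>\n3*8=24 and 4*6=24.\n</think>\nThe numbers are 38, 83, 46, 64.<|im_end|>"
reasoning, answer = split_reasoning(sample)
print(answer)
```

Applied to the full output above, `answer` would hold the Markdown write-up and `reasoning` the long trace that precedes it.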
# Usage
The code below was generated by Gemini 3 Flash, with a few small modifications of my own:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# 1. Load from your Hugging Face Repo
model_id = "MihaiPopa-1/Qwen-3-0.6B-Claude-4.7-Opus-Distilled"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32, # Standard for CPU
    device_map="cpu" # Forces CPU usage
)
# 2. Generate
prompt = "<|im_start|>system\nYou are a helpful AI assistant.<|im_end|>\n<|im_start|>user\nGive 2-digit numbers which the product of each digit equals 24<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cpu")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens = 8192,
        do_sample = True, # Sampling must be enabled for temperature/top_p/top_k to take effect
        temperature = 0.6,
        top_p = 0.95,
        top_k = 20,
        repetition_penalty = 1.15,
        use_cache = True
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
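The hand-written prompt above follows the ChatML layout that `chat_template.jinja` produces for a plain system + user conversation. As a minimal sketch, the helper below builds the same string in pure Python; `build_chatml_prompt` is a hypothetical name, and it only covers the simple case without tools or earlier assistant turns. In practice, `tokenizer.apply_chat_template` is the usual way to do this with the real template.

```python
def build_chatml_prompt(user, system="You are a helpful AI assistant.", enable_thinking=True):
    """Build a single-turn ChatML prompt ending with the assistant header."""
    parts = []
    if system:
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user}<|im_end|>\n")
    # Generation prompt: the model continues from here.
    parts.append("<|im_start|>assistant\n")
    if not enable_thinking:
        # The template pre-fills an empty think block to skip reasoning.
        parts.append("<think>\n\n</think>\n\n")
    return "".join(parts)


prompt = build_chatml_prompt(
    "Give 2-digit numbers which the product of each digit equals 24"
)
print(prompt)
```

Passing `enable_thinking=False` appends an empty `<think>` block, which is how the template asks the model to answer without a reasoning trace.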
# Data Used
Thanks to [Opus 4.7 Max SFT](https://www.huggingface.co/datasets/lordx64/reasoning-distill-opus-4-7-max-sft) for the amazing dataset!
---
# Uploaded finetuned model
- **Developed by:** MihaiPopa-1
- **License:** apache-2.0
- **Finetuned from model :** unsloth/qwen3-0.6b-unsloth-bnb-4bit
This qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

99
chat_template.jinja Normal file

@@ -0,0 +1,99 @@
{%- if tools %}
{{- '<|im_start|>system\n' }}
{%- if messages[0].role == 'system' %}
{{- messages[0].content + '\n\n' }}
{%- endif %}
{{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
{%- for tool in tools %}
{{- "\n" }}
{{- tool | tojson }}
{%- endfor %}
{{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
{%- if messages[0].role == 'system' %}
{{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for forward_message in messages %}
{%- set index = (messages|length - 1) - loop.index0 %}
{%- set message = messages[index] %}
{%- set current_content = message.content if message.content is defined and message.content is not none else '' %}
{%- set tool_start = '<tool_response>' %}
{%- set tool_start_length = tool_start|length %}
{%- set start_of_message = current_content[:tool_start_length] %}
{%- set tool_end = '</tool_response>' %}
{%- set tool_end_length = tool_end|length %}
{%- set start_pos = (current_content|length) - tool_end_length %}
{%- if start_pos < 0 %}
{%- set start_pos = 0 %}
{%- endif %}
{%- set end_of_message = current_content[start_pos:] %}
{%- if ns.multi_step_tool and message.role == "user" and not(start_of_message == tool_start and end_of_message == tool_end) %}
{%- set ns.multi_step_tool = false %}
{%- set ns.last_query_index = index %}
{%- endif %}
{%- endfor %}
{%- for message in messages %}
{%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
{{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
{%- elif message.role == "assistant" %}
{%- set m_content = message.content if message.content is defined and message.content is not none else '' %}
{%- set content = m_content %}
{%- set reasoning_content = '' %}
{%- if message.reasoning_content is defined and message.reasoning_content is not none %}
{%- set reasoning_content = message.reasoning_content %}
{%- else %}
{%- if '</think>' in m_content %}
{%- set content = (m_content.split('</think>')|last).lstrip('\n') %}
{%- set reasoning_content = (m_content.split('</think>')|first).rstrip('\n') %}
{%- set reasoning_content = (reasoning_content.split('<think>')|last).lstrip('\n') %}
{%- endif %}
{%- endif %}
{%- if loop.index0 > ns.last_query_index %}
{%- if loop.last or (not loop.last and (not reasoning_content.strip() == '')) %}
{{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- if message.tool_calls %}
{%- for tool_call in message.tool_calls %}
{%- if (loop.first and content) or (not loop.first) %}
{{- '\n' }}
{%- endif %}
{%- if tool_call.function %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '<tool_call>\n{"name": "' }}
{{- tool_call.name }}
{{- '", "arguments": ' }}
{%- if tool_call.arguments is string %}
{{- tool_call.arguments }}
{%- else %}
{{- tool_call.arguments | tojson }}
{%- endif %}
{{- '}\n</tool_call>' }}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "tool" %}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|im_start|>user' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{{- message.content }}
{{- '\n</tool_response>' }}
{%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- if enable_thinking is defined and enable_thinking is false %}
{{- '<think>\n\n</think>\n\n' }}
{%- endif %}
{%- endif %}

64
config.json Normal file

@@ -0,0 +1,64 @@
{
"architectures": [
"Qwen3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": null,
"torch_dtype": "float16",
"eos_token_id": 151645,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 40960,
"max_window_layers": 28,
"model_type": "qwen3",
"num_attention_heads": 16,
"num_hidden_layers": 28,
"num_key_value_heads": 8,
"pad_token_id": 151669,
"rms_norm_eps": 1e-06,
"rope_parameters": {
"rope_theta": 1000000,
"rope_type": "default"
},
"sliding_window": null,
"tie_word_embeddings": true,
"unsloth_fixed": true,
"unsloth_version": "2026.4.8",
"use_cache": false,
"use_sliding_window": false,
"vocab_size": 151936
}

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:358dd093536d80319700990ff72e55acd2fcd24e8c7081a439165982c0d3a053
size 1192135096

3
tokenizer.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d7430e9138b76e93fb6f93462394d236b411111aef53cb421ba97d2691040cca
size 11423114

234
tokenizer_config.json Normal file

File diff suppressed because one or more lines are too long