ModelHub XC 664bf50338 Initialize project; model provided by the ModelHub XC community

Model: hadadxyz/Qwen3-4B-Diversity
Source: Original Platform

2026-05-09 19:29:05 +08:00

Files (all added in the initial commit, 2026-05-09 19:29:05 +08:00):

evaluations
.gitattributes
added_tokens.json
chat_template.jinja
config.json
generation_config.json
LICENSE
merges.txt
model-00001-of-00017.safetensors through model-00017-of-00017.safetensors
model.safetensors.index.json
README.md
special_tokens_map.json
tokenizer_config.json
tokenizer.json
vocab.json

README.md

Field	Value
base_model	Qwen/Qwen3-4B
tags	distillation, distilled, sft, peft, qwen3
pipeline_tag	text-generation
library_name	transformers
license	apache-2.0
license_link	https://huggingface.co/hadadxyz/Qwen3-4B-Diversity/blob/main/LICENSE

datasets
ianncity/KIMI-K2.5-550000x
Jackrong/Qwen3.5-reasoning-700x
nohurry/Opus-4.6-Reasoning-3000x-filtered
TeichAI/claude-4.5-opus-high-reasoning-250x
TeichAI/gemini-3-pro-preview-high-reasoning-250x
TeichAI/claude-haiku-4.5-high-reasoning-1700x
TeichAI/gpt-5.2-high-reasoning-250x
Roman1111111/gemini-3.1-pro-hard-high-reasoning
Jackrong/glm-4.7-multiturn-CoT
bmeyer2025/glm5-reasoning-traces
TeichAI/claude-sonnet-4.5-high-reasoning-250x
TeichAI/deepseek-v3.2-speciale-openr1-math-3k
TeichAI/deepseek-v3.2-speciale-OpenCodeReasoning-3k
TeichAI/deepseek-v3.2-speciale-1000x
TeichAI/gpt-5-codex-1000x

model-index
name: hadadxyz/Qwen3-4B-Diversity
Every result below uses task type text-generation (Text Generation), dataset type cais/mmlu, and metric type acc (accuracy).

Dataset name	Accuracy
Mmlu	67.8
Mmlu Humanities	57.9
Mmlu Formal Logic	58.7
Mmlu High School European History	78.2
Mmlu High School Us History	84.8
Mmlu High School World History	83.1
Mmlu International Law	77.7
Mmlu Jurisprudence	78.7
Mmlu Logical Fallacies	82.8
Mmlu Moral Disputes	71.1
Mmlu Moral Scenarios	28.4
Mmlu Philosophy	73.3
Mmlu Prehistory	76.2
Mmlu Professional Law	47.4
Mmlu World Religions	78.4
Mmlu Other	72.1
Mmlu Business Ethics	73.0
Mmlu Clinical Knowledge	75.5
Mmlu College Medicine	71.1
Mmlu Global Facts	41.0
Mmlu Human Aging	67.7
Mmlu Management	84.5
Mmlu Marketing	85.5
Mmlu Medical Genetics	75.0
Mmlu Miscellaneous	79.7
Mmlu Nutrition	74.8
Mmlu Professional Accounting	55.0
Mmlu Professional Medicine	71.7
Mmlu Virology	53.0
Mmlu Social Sciences	78.4
Mmlu Econometrics	64.0
Mmlu High School Geography	84.3
Mmlu High School Government And Politics	87.0
Mmlu High School Macroeconomics	74.6
Mmlu High School Microeconomics	80.7
Mmlu High School Psychology	87.2
Mmlu Human Sexuality	75.6
Mmlu Professional Psychology	71.2
Mmlu Public Relations	71.8
Mmlu Security Studies	74.3
Mmlu Sociology	84.1
Mmlu Us Foreign Policy	81.0
Mmlu Stem	68.1
Mmlu Abstract Algebra	45.0
Mmlu Anatomy	61.5
Mmlu Astronomy	78.9
Mmlu College Biology	83.3
Mmlu College Chemistry	54.0
Mmlu College Computer Science	69.0
Mmlu College Mathematics	58.0
Mmlu College Physics	53.9
Mmlu Computer Security	80.0
Mmlu Conceptual Physics	77.0
Mmlu Electrical Engineering	76.6
Mmlu Elementary Mathematics	65.6
Mmlu High School Biology	86.1
Mmlu High School Chemistry	70.4
Mmlu High School Computer Science	86.0
Mmlu High School Mathematics	42.6
Mmlu High School Physics	62.9
Mmlu High School Statistics	71.3
Mmlu Machine Learning	57.1

Introduction

Qwen3-4B-Diversity is a fine-tuned language model based on Qwen/Qwen3-4B that has been trained on a diverse collection of high-quality reasoning datasets. This model combines knowledge distilled from various state-of-the-art AI systems to provide enhanced reasoning capabilities across multiple domains including mathematics, coding, general problem-solving, and multi-turn conversations.

Training Configuration

The model was trained with supervised fine-tuning (SFT) using parameter-efficient fine-tuning (PEFT) methods, which keep compute requirements low. Key training parameters:

Parameter	Value
Number of Epochs	2
Context Length	40,960

Hardware and Resources

Resource	Specification
GPU	A100-80GB
Training Duration	Approximately 17 hours
Estimated Cost	$27 to $30

Training Data

Dataset	Rows Used	Model
ianncity/KIMI-K2.5-550000x (General-Distillation)	1,000	Kimi K2.5
Jackrong/Qwen3.5-reasoning-700x	633	Qwen3.5
nohurry/Opus-4.6-Reasoning-3000x-filtered	2,326	Claude Opus 4.6
TeichAI/claude-4.5-opus-high-reasoning-250x	250	Claude Opus 4.5
TeichAI/gemini-3-pro-preview-high-reasoning-250x	248	Gemini 3 Pro
TeichAI/claude-haiku-4.5-high-reasoning-1700x	1,688	Claude Haiku 4.5
TeichAI/gpt-5.2-high-reasoning-250x	249	GPT-5.2
Roman1111111/gemini-3.1-pro-hard-high-reasoning	3,150	Gemini 3.1 Pro
Jackrong/glm-4.7-multiturn-CoT	5,090	GLM-4.7
bmeyer2025/glm5-reasoning-traces	1,744	GLM-5
TeichAI/claude-sonnet-4.5-high-reasoning-250x	247	Claude Sonnet 4.5
TeichAI/deepseek-v3.2-speciale-openr1-math-3k	3,317	DeepSeek V3.2-Speciale
TeichAI/deepseek-v3.2-speciale-OpenCodeReasoning-3k	2,953	DeepSeek V3.2-Speciale
TeichAI/deepseek-v3.2-speciale-1000x	991	DeepSeek V3.2-Speciale
TeichAI/gpt-5-codex-1000x	991	GPT-5 Codex
Total	24,877	Combined diverse reasoning dataset
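
As a quick sanity check on the table above, the row counts can be summed and turned into mixture shares. This is only an illustrative sketch: the numbers are copied from the table, and the variable names (`rows_used`, `shares`) are not part of the model's training code.

```python
# Rows used per dataset, copied from the Training Data table.
rows_used = {
    "ianncity/KIMI-K2.5-550000x": 1_000,
    "Jackrong/Qwen3.5-reasoning-700x": 633,
    "nohurry/Opus-4.6-Reasoning-3000x-filtered": 2_326,
    "TeichAI/claude-4.5-opus-high-reasoning-250x": 250,
    "TeichAI/gemini-3-pro-preview-high-reasoning-250x": 248,
    "TeichAI/claude-haiku-4.5-high-reasoning-1700x": 1_688,
    "TeichAI/gpt-5.2-high-reasoning-250x": 249,
    "Roman1111111/gemini-3.1-pro-hard-high-reasoning": 3_150,
    "Jackrong/glm-4.7-multiturn-CoT": 5_090,
    "bmeyer2025/glm5-reasoning-traces": 1_744,
    "TeichAI/claude-sonnet-4.5-high-reasoning-250x": 247,
    "TeichAI/deepseek-v3.2-speciale-openr1-math-3k": 3_317,
    "TeichAI/deepseek-v3.2-speciale-OpenCodeReasoning-3k": 2_953,
    "TeichAI/deepseek-v3.2-speciale-1000x": 991,
    "TeichAI/gpt-5-codex-1000x": 991,
}

# The total should match the "Total" row of the table (24,877).
total = sum(rows_used.values())

# Fraction of the training mixture contributed by each dataset.
shares = {name: count / total for name, count in rows_used.items()}

# Print datasets from largest to smallest share.
for name, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {share:.1%}")
```

The largest single contributor is Jackrong/glm-4.7-multiturn-CoT, at roughly a fifth of all rows.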

Model Capabilities

This model excels in several key areas:

Advanced Reasoning: The model can break down complex problems into steps and provide detailed reasoning processes.
Mathematical Problem Solving: Enhanced capabilities for mathematical reasoning and problem-solving through dedicated math-focused datasets.
Code Generation and Understanding: Improved coding abilities from multiple code-reasoning datasets including DeepSeek and GPT-5 Codex data.
Multi-Turn Conversations: Better handling of extended dialogues and context-aware responses.
Domain Versatility: Exposure to reasoning patterns from various AI systems provides flexibility across different domains and task types.

Usage

Quick Demo

For a quick demo at no cost, you can use Google Colab.

Ollama (Local)

# https://ollama.com/hadad/qwen3-4bd

# hadad/qwen3-4bd:Q8_0  |  4.3GB
# hadad/qwen3-4bd:BF16  |  8.1GB

# ollama pull hadad/qwen3-4bd:Q8_0

ollama run hadad/qwen3-4bd:Q8_0

If you use Ollama and need tools or function calling, the OpenAI-compatible API provided by Ollama is the recommended way to access them.

Refer to the Ollama documentation.
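
A minimal sketch of a tool-calling request through Ollama's OpenAI-compatible endpoint, using only the Python standard library. It assumes an Ollama server on its default local port (11434) with the `hadad/qwen3-4bd:Q8_0` tag already pulled. The `get_weather` tool is hypothetical and exists only for illustration, and whether the model actually emits a tool call depends on the prompt and the model's chat template.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"
MODEL = "hadad/qwen3-4bd:Q8_0"


def build_payload(user_message: str) -> dict:
    """Build an OpenAI-style chat request that advertises one tool."""
    # Hypothetical tool definition, described with a JSON Schema.
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": user_message}],
        "tools": [weather_tool],
    }


def chat(payload: dict) -> dict:
    """POST the payload to the server and return the decoded JSON reply."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def main() -> None:
    """Send one request and print any tool calls the model proposed."""
    reply = chat(build_payload("What is the weather in Jakarta?"))
    # In the OpenAI response format, tool calls sit on the assistant message.
    print(reply["choices"][0]["message"].get("tool_calls"))
```

Call `main()` with the server running. Your application is responsible for executing the requested tool and sending its result back in a follow-up message with role `tool`.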

Python (Local)

#pip install transformers==4.56.2

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "hadadxyz/Qwen3-4B-Diversity"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
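output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# split the output at the last </think> token (id 151668)
try:
    # reverse search: index just past the last occurrence of 151668
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    # no </think> token found: treat the whole output as final content
    index = 0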

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)
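
The thinking/content split in the script above can also be isolated into a small helper, which makes the index arithmetic easy to check on toy token lists without loading the model. The function name `split_thinking` is an illustrative choice. The default token id 151668 is the `</think>` id used in the script.

```python
def split_thinking(output_ids: list[int], end_think_id: int = 151668) -> tuple[list[int], list[int]]:
    """Split generated token ids at the last end-of-thinking token.

    Returns (thinking_ids, content_ids). The end-of-thinking token itself
    stays at the end of thinking_ids. If the token never appears, all ids
    are treated as content.
    """
    try:
        # Position in the reversed list gives the distance from the end,
        # so this is the index just past the last occurrence.
        index = len(output_ids) - output_ids[::-1].index(end_think_id)
    except ValueError:
        index = 0
    return output_ids[:index], output_ids[index:]


# Toy example: ids 1 and 2 are "thinking", ids 3 and 4 are the answer.
thinking_ids, content_ids = split_thinking([1, 2, 151668, 3, 4])
print(thinking_ids, content_ids)
```

Each half can then be passed to `tokenizer.decode(..., skip_special_tokens=True)` as in the script.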

Inference Parameters

For optimal results, we recommend the following generation parameters:

Thinking

Parameter	Recommended Value	Description
temperature	0.6	Controls randomness in generation
top_p	0.95	Nucleus sampling threshold
top_k	20	Top-k sampling parameter
min_p	0	Minimum probability threshold

Non-Thinking

Parameter	Recommended Value	Description
temperature	0.7	Controls randomness in generation
top_p	0.8	Nucleus sampling threshold
top_k	20	Top-k sampling parameter
min_p	0	Minimum probability threshold
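
One way to apply the two tables above is to keep the values in a small helper and unpack them into `model.generate`. This is a sketch, not part of the original script: the helper name `sampling_kwargs` is hypothetical, and `do_sample=True` is added because Transformers only applies temperature and the top-k/top-p/min-p filters when sampling is enabled.

```python
def sampling_kwargs(thinking: bool = True) -> dict:
    """Return the recommended sampling settings as generate() keyword arguments."""
    # Values shared by both modes. do_sample=True is an addition needed
    # for the sampling parameters to take effect in Transformers.
    common = {"do_sample": True, "top_k": 20, "min_p": 0.0}
    if thinking:
        return {**common, "temperature": 0.6, "top_p": 0.95}
    return {**common, "temperature": 0.7, "top_p": 0.8}


print(sampling_kwargs(thinking=True))
print(sampling_kwargs(thinking=False))
```

With the script from the Python (Local) section, this would be used as `model.generate(**model_inputs, max_new_tokens=32768, **sampling_kwargs(thinking=True))`, keeping the `thinking` flag in step with the `enable_thinking` argument passed to `apply_chat_template`.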

Citation

If you use this model in your research or applications, please cite both this model and the base model:

@misc{qwen3-4b-diversity,
  author = {hadadxyz},
  title  = {Qwen3-4B-Diversity},
  year   = {2026},
  url    = {https://huggingface.co/hadadxyz/Qwen3-4B-Diversity}
}

Acknowledgments

This model was made possible through the combination of multiple high-quality datasets from the community. We acknowledge and thank all dataset creators and the Qwen team for providing the excellent base model.