ModelHub XC 691a3cba06 Initialize project; model provided by the ModelHub XC community

Model: prithivMLmods/Qwen3-4B-ft-bf16
Source: Original Platform

2026-05-25 10:26:13 +08:00

Qwen3-4B-ft-bf16

Qwen3-4B-ft-bf16 is a fine-tuned, moderately abliterated version of the Qwen3-4B model. It is designed for enhanced context awareness and controlled expressiveness, balancing precision with creativity across a wide range of tasks, from complex reasoning to natural dialogue, code generation, and multilingual understanding.

Key Features:

Improved Context Awareness
Retains and utilizes long-range contextual information effectively, making it ideal for long-form conversations, document understanding, and summarization tasks.
Moderate Abliteration
Introduces measured behavioral flexibility that enhances creativity and adaptability while maintaining reliability, alignment, and safety in outputs.
Dual Thinking Modes
Supports dynamic switching between thinking mode (for math, logic, and coding) and non-thinking mode (for general-purpose conversations), ensuring optimal task matching.
Multilingual Mastery
Excels in over 100 languages and dialects for translation, multilingual chat, and cross-lingual reasoning.
Tool-Ready Agent Capabilities
Designed to integrate with tool APIs and complex workflows, with consistent performance in both thinking and non-thinking contexts.

Quickstart with Hugging Face Transformers🤗

pip install transformers==4.51.3
pip install "huggingface_hub[hf_xet]"

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Qwen3-4B-ft-bf16"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Define input
prompt = "Describe how renewable energy impacts economic development."
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate output
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# Parse thinking content: split just after the last occurrence of token id 151668 (</think>)
try:
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip()
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip()

print("thinking content:", thinking_content)
print("content:", content)
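
The parsing step above can be wrapped in a small helper that works on plain lists of token ids, which makes it easy to check without loading the model. This is a minimal sketch: `split_thinking` is a hypothetical name (not part of Transformers), and it assumes, as the quickstart does, that id 151668 marks the end of the thinking block.

```python
END_THINK_ID = 151668  # assumed end-of-thinking token id, taken from the quickstart


def split_thinking(output_ids, end_think_id=END_THINK_ID):
    """Split generated ids into (thinking_ids, answer_ids).

    The split falls just after the LAST occurrence of end_think_id.
    If the marker is absent, everything is treated as the answer.
    """
    try:
        # Search the reversed list so the last marker is found first.
        index = len(output_ids) - output_ids[::-1].index(end_think_id)
    except ValueError:
        index = 0
    return output_ids[:index], output_ids[index:]


thinking_ids, answer_ids = split_thinking([11, 12, END_THINK_ID, 21, 22])
print(thinking_ids)  # [11, 12, 151668]
print(answer_ids)    # [21, 22]
```

Each half can then be passed to `tokenizer.decode(..., skip_special_tokens=True)` exactly as in the quickstart.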

Best Practices

Sampling Settings:
- Thinking mode: temperature=0.6, top_p=0.95, top_k=20
- Non-thinking mode: temperature=0.7, top_p=0.8, top_k=20
Output Length (max_new_tokens):
- Standard: 32768 tokens
- Extended Reasoning Tasks: up to 38912 tokens
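
One way to keep the sampling and length recommendations above in one place is a small helper that returns keyword arguments for `tokenizer.apply_chat_template` and `model.generate`. This is only a sketch: `generation_settings` and the `SAMPLING` table are hypothetical names, the numbers are copied from the lists above, and `do_sample=True` is added on the assumption that sampling has to be switched on for the temperature, top_p, and top_k values to take effect.

```python
# Values copied from the recommendations above; the dictionary layout is illustrative.
SAMPLING = {
    True: {"temperature": 0.6, "top_p": 0.95, "top_k": 20},   # thinking mode
    False: {"temperature": 0.7, "top_p": 0.8, "top_k": 20},   # non-thinking mode
}


def generation_settings(thinking=True, extended=False):
    """Return (chat_template_kwargs, generate_kwargs) for one mode."""
    template_kwargs = {
        "tokenize": False,
        "add_generation_prompt": True,
        "enable_thinking": thinking,
    }
    generate_kwargs = {
        "do_sample": True,  # assumption: enable sampling so the values below are used
        "max_new_tokens": 38912 if extended else 32768,
        **SAMPLING[thinking],
    }
    return template_kwargs, generate_kwargs


template_kwargs, generate_kwargs = generation_settings(thinking=False)
print(template_kwargs)
print(generate_kwargs)
```

With the quickstart objects in scope, these would be used as `tokenizer.apply_chat_template(messages, **template_kwargs)` and `model.generate(**model_inputs, **generate_kwargs)`.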
Prompt Design:
- Math Problems: Add "Please reason step by step, and put your final answer within \boxed{}."
- MCQs: Format answers as {"answer": "B"} for easy parsing.
- Multi-turn: Omit thinking content from the conversation history and keep only the final responses, for a cleaner context.
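
The prompt-design tips above could be applied with a few small helpers. The sketch below is illustrative only: the function names are hypothetical, and it assumes the decoded reply wraps its reasoning in `<think>...</think>` tags and that a multiple-choice reply contains a JSON object of the form shown above.

```python
import json
import re

MATH_SUFFIX = "Please reason step by step, and put your final answer within \\boxed{}."


def math_prompt(question):
    """Append the recommended step-by-step instruction to a math question."""
    return f"{question}\n{MATH_SUFFIX}"


def parse_mcq_answer(reply):
    """Return the choice from the first {"answer": "X"} object in reply, or None."""
    match = re.search(r'\{\s*"answer"\s*:\s*"[^"]*"\s*\}', reply)
    if match is None:
        return None
    return json.loads(match.group(0))["answer"]


def strip_thinking(reply):
    """Drop a <think>...</think> block so only the final answer enters the history."""
    return re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL).strip()


# Build a history whose assistant turn carries only the final answer.
history = [
    {"role": "user", "content": math_prompt("What is 12 * 12?")},
    {"role": "assistant", "content": strip_thinking("<think>12*12=144</think>\\boxed{144}")},
]
print(history[1]["content"])
print(parse_mcq_answer('I choose {"answer": "B"}'))
```

Keeping the history free of reasoning text also keeps later turns shorter, which matters when outputs can run to tens of thousands of tokens.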
