ModelHub XC d756cf0e45 Initialize project; model provided by the ModelHub XC community
Model: syj4205/broken-model-fixed
Source: Original Platform
2026-05-29 09:34:20 +08:00

library_name: transformers
pipeline_tag: text-generation
base_model: Qwen/Qwen3-8B

Qwen3-8B Fixed Model

This repository is a fixed version of yunmorning/broken-model.
The original model could not back a working /chat/completions API server because of two critical issues (Fix 1 and Fix 2 below). Fix 3 is a metadata-only correction.


Changes Made

Fix 1: Added chat_template to tokenizer_config.json

Before: chat_template field did not exist
After: Added official Qwen3 Jinja2 chat template

Why: OpenAI-compatible API servers (vLLM, FriendliAI, etc.) rely on chat_template to convert the messages array into model input via tokenizer.apply_chat_template(). Without this field, the /chat/completions endpoint cannot format prompts and fails entirely.
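
As a rough illustration of the point above, the sketch below checks whether a tokenizer_config.json defines chat_template and shows the kind of ChatML-style prompt such a template produces from a messages array. The helper names has_chat_template and render_chatml are hypothetical, and render_chatml is a simplified stand-in, not the official Qwen3 template (which also handles system prompts, tools, and thinking content).

```python
import json
from pathlib import Path


def has_chat_template(tokenizer_config_path):
    """Return True if tokenizer_config.json defines a non-empty chat_template."""
    config = json.loads(Path(tokenizer_config_path).read_text(encoding="utf-8"))
    return bool(config.get("chat_template"))


def render_chatml(messages, add_generation_prompt=True):
    """Simplified ChatML rendering (assumption: NOT the official Qwen3 template)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(render_chatml([{"role": "user", "content": "Who are you?"}]))
```

With a real tokenizer, the equivalent step is tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True), which is the call that depends on the chat_template field being present.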


Fix 2: Corrected shard mapping in model.safetensors.index.json

Before: Layer 7's q_proj, k_proj, and v_proj weights pointed to the wrong shard (model-00001-of-00005.safetensors)
After: All three now point to model-00002-of-00005.safetensors

| Tensor | Before | After |
| --- | --- | --- |
| model.layers.7.self_attn.q_proj.weight | model-00001-of-00005.safetensors | model-00002-of-00005.safetensors |
| model.layers.7.self_attn.k_proj.weight | model-00001-of-00005.safetensors | model-00002-of-00005.safetensors |
| model.layers.7.self_attn.v_proj.weight | model-00001-of-00005.safetensors | model-00002-of-00005.safetensors |

Why: All other tensors in Layer 7 correctly point to model-00002. With the wrong mapping, loading the weights fails because the three tensors cannot be found in the shard the index references.
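
One way to catch this kind of mismatch is to compare the index against the tensor names each shard actually contains. The sketch below is a minimal version using only the standard library; it relies on the safetensors file layout (an 8-byte little-endian header length followed by a JSON header). The function names are hypothetical, and the model directory is assumed to be a local download of the repository.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(shard_path):
    """Return the JSON header of a .safetensors file (tensor name -> metadata)."""
    with open(shard_path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        header = json.loads(f.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata, not a tensor
    return header


def find_index_mismatches(model_dir):
    """List (tensor, shard) pairs where the index names a shard lacking the tensor."""
    model_dir = Path(model_dir)
    index_path = model_dir / "model.safetensors.index.json"
    index = json.loads(index_path.read_text(encoding="utf-8"))
    names_by_shard = {}  # cache: shard file name -> set of tensor names
    mismatches = []
    for tensor, shard in sorted(index["weight_map"].items()):
        if shard not in names_by_shard:
            names_by_shard[shard] = set(read_safetensors_header(model_dir / shard))
        if tensor not in names_by_shard[shard]:
            mismatches.append((tensor, shard))
    return mismatches
```

Run against the original repository, a check like this would be expected to report the three Layer 7 projection weights listed in the table; against the fixed index it should return an empty list.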


Fix 3: Corrected base_model metadata in README.md

Before: base_model: meta-llama/Meta-Llama-3.1-8B
After: base_model: Qwen/Qwen3-8B

Why: The actual weights (lm_head.weight shape [151936, 4096]) and config (model_type: qwen3, architectures: Qwen3ForCausalLM) confirm this is a Qwen3-8B model. LLaMA 3.1-8B has a vocab size of 128,256, which does not match. This is a metadata-only fix with no effect on inference.
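
The vocabulary-size argument can be written down as a small consistency check. This is only a sketch: the helper name is hypothetical, the table holds just the two vocab sizes quoted above, and a matching size rules a candidate in without proving identity (model_type and architectures in config.json still need to agree).

```python
# Vocabulary sizes quoted in this README; extend as needed.
KNOWN_VOCAB_SIZES = {
    "Qwen/Qwen3-8B": 151936,
    "meta-llama/Meta-Llama-3.1-8B": 128256,
}


def matching_base_models(lm_head_shape):
    """Return candidate base models whose vocab size equals lm_head's row count."""
    vocab_rows = lm_head_shape[0]  # lm_head.weight has shape [vocab_size, hidden_size]
    return [name for name, vocab in KNOWN_VOCAB_SIZES.items() if vocab == vocab_rows]


if __name__ == "__main__":
    print(matching_base_models([151936, 4096]))
```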


What Was NOT Changed

| Item | Reason |
| --- | --- |
| config.json | All architecture values match Qwen3-8B spec exactly |
| tokenizer_class: "Qwen2Tokenizer" | Qwen3 intentionally reuses Qwen2Tokenizer (same BPE) |
| eos_token_id: [151645, 151643] | Matches official Qwen3 generation config |

Verification

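from transformers import pipeline
import torch

# Loading exercises Fix 2: every tensor must be found in the shard the index names.
pipe = pipeline(
    "text-generation",
    model="syj4205/broken-model-fixed",
    dtype=torch.bfloat16,
    device_map="auto",
)

# Passing a messages list exercises Fix 1: the pipeline formats it with chat_template.
messages = [{"role": "user", "content": "Who are you?"}]
result = pipe(messages, max_new_tokens=40)
# generated_text holds the whole conversation; the last entry is the assistant reply.
print(result[0]["generated_text"][-1]["content"])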

Output:

<think> Okay, the user asked, "Who are you?" I need to respond in a friendly and
informative way. Let me start by introducing my name, Qwen...</think>
I'm Qwen, a large language model developed by Alibaba Cloud.
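
Because the reply wraps its reasoning in <think>...</think> tags, a client may want to separate the reasoning from the final answer. The following is a minimal sketch with a hypothetical helper name; it assumes at most one reasoning block and discards any text that precedes it.

```python
import re

# Non-greedy match so only the first <think>...</think> block is captured.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text):
    """Split a reply into (reasoning, answer); reasoning is "" if no block exists."""
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()  # everything after the closing tag
    return reasoning, answer
```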