Initialize project; model provided by the ModelHub XC community

Model: majimenez/broken-model-fixed
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-05 05:55:58 +08:00
commit 82393e1cd8
14 changed files with 152193 additions and 0 deletions

README.md Normal file

@@ -0,0 +1,62 @@
---
base_model: Qwen/Qwen3-8B
library_name: transformers
pipeline_tag: text-generation
---
# broken-model-fixed
This repository is a fixed copy of [yunmorning/broken-model](https://huggingface.co/yunmorning/broken-model), which was reported as unable to serve a functional `/chat/completions` API endpoint.
## Root Cause
The model weights, architecture config, and tokenizer vocabulary are all valid Qwen3-8B artifacts. However, the `tokenizer_config.json` was **missing the `chat_template` field**.
The `chat_template` is a Jinja2 template that tells inference servers (vLLM, TGI, FriendliAI, etc.) how to convert a list of chat messages (`[{"role": "user", "content": "Hello"}]`) into the model's expected prompt format (ChatML with `<|im_start|>`/`<|im_end|>` delimiters). Without it, any `/chat/completions` request fails with:
```
ValueError: Cannot use chat template functions because tokenizer.chat_template is not set
```
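To make the role of the missing template concrete, here is a minimal Python sketch of what a ChatML-style `chat_template` produces. This is an illustrative re-implementation, not the actual ~4k-character Qwen3 Jinja2 template (which additionally handles tools and `<think>` blocks):

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of chat messages into the ChatML prompt format.

    Illustrative sketch only; the real Qwen3 chat_template is a Jinja2
    template with extra logic for tools and reasoning blocks.
    """
    prompt = ""
    for msg in messages:
        prompt += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so the model generates the reply next.
        prompt += "<|im_start|>assistant\n"
    return prompt

print(render_chatml([{"role": "user", "content": "Hello"}]))
```

Without a `chat_template`, the inference server has no way to perform this conversion, hence the error above.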
## Changes Made
### 1. `tokenizer_config.json` — added `chat_template` (critical fix)
Added the standard Qwen3-8B chat template (4168 characters) from the reference model [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B). This was the **only** field missing; all other tokenizer config values were already correct and identical to the reference.
The template handles:
- System, user, and assistant message formatting using ChatML (`<|im_start|>`/`<|im_end|>`)
- Tool/function calling via `<tool_call>` and `<tool_response>` tags
- Reasoning/thinking blocks via `<think>` tags
- Generation prompt injection for inference
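The fix itself can be sketched in a few lines: copy the `chat_template` field from the reference `tokenizer_config.json` into the broken one, leaving every other key untouched. File paths in the usage comment are hypothetical placeholders:

```python
import json  # used in the (commented) usage sketch below

def apply_chat_template_fix(broken_cfg: dict, reference_cfg: dict) -> dict:
    """Return a copy of broken_cfg with the reference chat_template added."""
    fixed = dict(broken_cfg)  # leave the original dict untouched
    fixed["chat_template"] = reference_cfg["chat_template"]
    return fixed

# Usage sketch (paths are hypothetical placeholders):
# with open("broken-model/tokenizer_config.json") as f:
#     broken = json.load(f)
# with open("Qwen3-8B/tokenizer_config.json") as f:
#     reference = json.load(f)
# with open("broken-model-fixed/tokenizer_config.json", "w") as f:
#     json.dump(apply_chat_template_fix(broken, reference), f,
#               ensure_ascii=False, indent=2)
```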
### 2. `README.md` — corrected `base_model` metadata
The original README declared `base_model: meta-llama/Llama-3.1-8B`, but the model is actually Qwen3-8B, as evidenced by:
- `config.json`: `architectures: ["Qwen3ForCausalLM"]`, `model_type: "qwen3"`
- Weight tensors contain Qwen3-specific layers (`q_norm`, `k_norm`)
- `tokenizer_config.json`: `tokenizer_class: "Qwen2Tokenizer"`, vocab_size 151936
- 36 hidden layers and intermediate_size 12288 (matching Qwen3-8B, not Llama-3.1-8B which has 32 layers and intermediate_size 14336)
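The architecture fingerprints above can be checked mechanically against `config.json`. A small sketch, assuming the published config values for Qwen3-8B and Llama-3.1-8B (the Llama vocab size of 128256 is not stated in this README and is an assumption):

```python
# Published config fingerprints (assumed values for the comparison).
QWEN3_8B = {"model_type": "qwen3", "num_hidden_layers": 36,
            "intermediate_size": 12288, "vocab_size": 151936}
LLAMA31_8B = {"model_type": "llama", "num_hidden_layers": 32,
              "intermediate_size": 14336, "vocab_size": 128256}

def matches(config: dict, fingerprint: dict) -> bool:
    """True if every fingerprint key appears in config with the same value."""
    return all(config.get(k) == v for k, v in fingerprint.items())
```

Running `matches` on this repository's `config.json` values identifies Qwen3-8B and rules out Llama-3.1-8B.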
## Verification on FriendliAI
When importing the **broken** model (`yunmorning/broken-model`) into FriendliAI, both "Tool call" and "Reasoning parser" features were detected as **"Not supported"**, and endpoint creation failed with an internal system error.
After applying the fix, importing `majimenez/broken-model-fixed` into FriendliAI correctly detects both features as **"Supported"**. This is because FriendliAI parses the `chat_template` to detect these capabilities:
- **Tool call**: the template contains `<tool_call>`/`</tool_call>` and `<tool_response>`/`</tool_response>` formatting logic
- **Reasoning parser**: the template contains `<think>`/`</think>` block handling with `enable_thinking` support
Without a `chat_template` at all, neither feature can be detected or used.
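FriendliAI's actual detection logic is internal, but a plausible approximation of this capability check is a simple scan of the template string for the marker tags:

```python
def detect_capabilities(chat_template):
    """Approximate capability detection from a chat_template string.

    Hypothetical sketch; FriendliAI's real parser is not public.
    """
    if not chat_template:
        # No template at all: neither feature can be detected.
        return {"tool_call": False, "reasoning": False}
    return {
        "tool_call": "<tool_call>" in chat_template
                     and "<tool_response>" in chat_template,
        "reasoning": "<think>" in chat_template,
    }
```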
### End-to-end confirmation
- **Broken model** (`yunmorning/broken-model`): Failed to load entirely on FriendliAI — endpoint creation returned an internal system error.
- **Fixed model** (`majimenez/broken-model-fixed`): Deployed successfully on FriendliAI and responded correctly to `/chat/completions` requests, producing both reasoning (`<think>` blocks) and body response text.
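A client consuming such completions typically separates the reasoning from the body. A minimal sketch, assuming the reasoning arrives inline as a `<think>...</think>` block in the completion text:

```python
import re

def split_reasoning(completion):
    """Split a completion into (reasoning, body) around a <think> block."""
    m = re.search(r"<think>(.*?)</think>", completion, flags=re.DOTALL)
    if not m:
        # No reasoning block present; the whole completion is body text.
        return "", completion.strip()
    reasoning = m.group(1).strip()
    body = (completion[:m.start()] + completion[m.end():]).strip()
    return reasoning, body
```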
### Files NOT changed
All other files were verified correct and left unmodified:
- `config.json` — identical to reference Qwen/Qwen3-8B
- `generation_config.json` — identical to reference Qwen/Qwen3-8B
- `tokenizer.json`, `vocab.json`, `merges.txt` — tokenizer vocabulary files
- `model-*.safetensors` / `model.safetensors.index.json` — model weights