Initialize project; model provided by the ModelHub XC community

Model: majimenez/broken-model-fixed
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-05 05:55:58 +08:00
commit 82393e1cd8
14 changed files with 152193 additions and 0 deletions

README.md Normal file

@@ -0,0 +1,62 @@
---
base_model: Qwen/Qwen3-8B
library_name: transformers
pipeline_tag: text-generation
---
# broken-model-fixed
This repository is a fixed copy of [yunmorning/broken-model](https://huggingface.co/yunmorning/broken-model), which was reported as unable to serve a functional `/chat/completions` API endpoint.
## Root Cause
The model weights, architecture config, and tokenizer vocabulary are all valid Qwen3-8B artifacts. However, the `tokenizer_config.json` was **missing the `chat_template` field**.
The `chat_template` is a Jinja2 template that tells inference servers (vLLM, TGI, FriendliAI, etc.) how to convert a list of chat messages (`[{"role": "user", "content": "Hello"}]`) into the model's expected prompt format (ChatML with `<|im_start|>`/`<|im_end|>` delimiters). Without it, any `/chat/completions` request fails with:
```
ValueError: Cannot use chat template functions because tokenizer.chat_template is not set
```
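To make the role of the missing template concrete, here is a minimal Python sketch of what a ChatML-style `chat_template` produces. This is an illustrative re-implementation, not the actual ~4k-character Qwen3 Jinja2 template (which additionally handles tools and `<think>` blocks):

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of chat messages into the ChatML prompt format.

    Illustrative sketch only; the real Qwen3 chat_template is a Jinja2
    template with extra logic for tools and reasoning blocks.
    """
    prompt = ""
    for msg in messages:
        prompt += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so the model generates the reply next.
        prompt += "<|im_start|>assistant\n"
    return prompt

print(render_chatml([{"role": "user", "content": "Hello"}]))
```

Without a `chat_template`, the inference server has no way to perform this conversion, hence the error above.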
## Changes Made
### 1. `tokenizer_config.json` — added `chat_template` (critical fix)
Added the standard Qwen3-8B chat template (4168 characters) from the reference model [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B). This was the **only** field missing; all other tokenizer config values were already correct and identical to the reference.
The template handles:
- System, user, and assistant message formatting using ChatML (`<|im_start|>`/`<|im_end|>`)
- Tool/function calling via `<tool_call>` and `<tool_response>` tags
- Reasoning/thinking blocks via `<think>` tags
- Generation prompt injection for inference
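The fix itself can be sketched in a few lines: copy the `chat_template` field from the reference `tokenizer_config.json` into the broken one, leaving every other key untouched. File paths in the usage comment are hypothetical placeholders:

```python
import json  # used in the (commented) usage sketch below

def apply_chat_template_fix(broken_cfg: dict, reference_cfg: dict) -> dict:
    """Return a copy of broken_cfg with the reference chat_template added."""
    fixed = dict(broken_cfg)  # leave the original dict untouched
    fixed["chat_template"] = reference_cfg["chat_template"]
    return fixed

# Usage sketch (paths are hypothetical placeholders):
# with open("broken-model/tokenizer_config.json") as f:
#     broken = json.load(f)
# with open("Qwen3-8B/tokenizer_config.json") as f:
#     reference = json.load(f)
# with open("broken-model-fixed/tokenizer_config.json", "w") as f:
#     json.dump(apply_chat_template_fix(broken, reference), f,
#               ensure_ascii=False, indent=2)
```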
### 2. `README.md` — corrected `base_model` metadata
The original README declared `base_model: meta-llama/Llama-3.1-8B`, but the model is actually Qwen3-8B, as evidenced by:
- `config.json`: `architectures: ["Qwen3ForCausalLM"]`, `model_type: "qwen3"`
- Weight tensors contain Qwen3-specific layers (`q_norm`, `k_norm`)
- `tokenizer_config.json`: `tokenizer_class: "Qwen2Tokenizer"`, vocab_size 151936
- 36 hidden layers and intermediate_size 12288 (matching Qwen3-8B, not Llama-3.1-8B which has 32 layers and intermediate_size 14336)
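The architecture fingerprints above can be checked mechanically against `config.json`. A small sketch, assuming the published config values for Qwen3-8B and Llama-3.1-8B (the Llama vocab size of 128256 is not stated in this README and is an assumption):

```python
# Published config fingerprints (assumed values for the comparison).
QWEN3_8B = {"model_type": "qwen3", "num_hidden_layers": 36,
            "intermediate_size": 12288, "vocab_size": 151936}
LLAMA31_8B = {"model_type": "llama", "num_hidden_layers": 32,
              "intermediate_size": 14336, "vocab_size": 128256}

def matches(config: dict, fingerprint: dict) -> bool:
    """True if every fingerprint key appears in config with the same value."""
    return all(config.get(k) == v for k, v in fingerprint.items())
```

Running `matches` on this repository's `config.json` values identifies Qwen3-8B and rules out Llama-3.1-8B.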
## Verification on FriendliAI
When importing the **broken** model (`yunmorning/broken-model`) into FriendliAI, both "Tool call" and "Reasoning parser" features were detected as **"Not supported"**, and endpoint creation failed with an internal system error.
After applying the fix, importing `majimenez/broken-model-fixed` into FriendliAI correctly detects both features as **"Supported"**. This is because FriendliAI parses the `chat_template` to detect these capabilities:
- **Tool call**: the template contains `<tool_call>`/`</tool_call>` and `<tool_response>`/`</tool_response>` formatting logic
- **Reasoning parser**: the template contains `<think>`/`</think>` block handling with `enable_thinking` support
Without a `chat_template` at all, neither feature can be detected or used.
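FriendliAI's actual detection logic is internal, but a plausible approximation of this capability check is a simple scan of the template string for the marker tags:

```python
def detect_capabilities(chat_template):
    """Approximate capability detection from a chat_template string.

    Hypothetical sketch; FriendliAI's real parser is not public.
    """
    if not chat_template:
        # No template at all: neither feature can be detected.
        return {"tool_call": False, "reasoning": False}
    return {
        "tool_call": "<tool_call>" in chat_template
                     and "<tool_response>" in chat_template,
        "reasoning": "<think>" in chat_template,
    }
```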
### End-to-end confirmation
- **Broken model** (`yunmorning/broken-model`): Failed to load entirely on FriendliAI — endpoint creation returned an internal system error.
- **Fixed model** (`majimenez/broken-model-fixed`): Deployed successfully on FriendliAI and responded correctly to `/chat/completions` requests, producing both reasoning (`<think>` blocks) and body response text.
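A client consuming such completions typically separates the reasoning from the body. A minimal sketch, assuming the reasoning arrives inline as a `<think>...</think>` block in the completion text:

```python
import re

def split_reasoning(completion):
    """Split a completion into (reasoning, body) around a <think> block."""
    m = re.search(r"<think>(.*?)</think>", completion, flags=re.DOTALL)
    if not m:
        # No reasoning block present; the whole completion is body text.
        return "", completion.strip()
    reasoning = m.group(1).strip()
    body = (completion[:m.start()] + completion[m.end():]).strip()
    return reasoning, body
```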
### Files NOT changed
All other files were verified correct and left unmodified:
- `config.json` — identical to reference Qwen/Qwen3-8B
- `generation_config.json` — identical to reference Qwen/Qwen3-8B
- `tokenizer.json`, `vocab.json`, `merges.txt` — tokenizer vocabulary files
- `model-*.safetensors` / `model.safetensors.index.json` — model weights