Initialize project; model provided by the ModelHub XC community

Model: AXCXEPT/Qwen3-EZO-8B-beta Source: Original Platform
2026-06-05 11:18:21 +08:00
commit 031b1e4641
15 changed files with 152250 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,62 @@
+---
+library_name: transformers
+tags:
+- text-generation-inference
+license: apache-2.0
+language:
+- ja
+- en
+base_model:
+- Qwen/Qwen3-8B
+pipeline_tag: text-generation
+---
+
+# Qwen3-EZO-8B-beta
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/657e900beaad53ff67ba84db/BTcieHtGHxotpgfP09KN_.png)
+
+We are releasing **Qwen3-EZO-8B-beta**, an 8B-parameter LLM based on Qwen3-8B.
+
+While the model size corresponds to an SLM (Small Language Model), it achieves performance on multi-turn tasks comparable to Gemini 2.5 Flash and GPT-4o. It significantly improves upon the original Qwen3-8B, recording **MT-Bench 9.08** and **JMT-Bench 8.87** scores.
+
+It supports parallel processing of deep-thinking prompts through our **Deep-Think** technique and, when deployed with **vLLM**, can be accessed through an OpenAI-compatible API.
+
+Although it was initially planned as a closed model for API-based access, we have decided to release it as an open model in light of our new policy to monetize only after further accuracy improvements.
+
+## Benchmark
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/657e900beaad53ff67ba84db/D2RBr80WfoUn3UPID57wy.png)
+*Based on repeated evaluations of frequent outputs at temperatures 0.2 and 0.6, conducted on May 13, 2025, using GPT-4o and Gemini 2.5 Flash as judges.*
+*All tests were performed internally on a single A40 GPU. Results may vary under external or official benchmark conditions.*
+
+---
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/657e900beaad53ff67ba84db/vLzZGOSUi682420yQ259G.png)
+
+
+## How to use
+Runs on a single A40 GPU.
+
+```bash
+vllm serve AXCXEPT/Qwen3-EZO-8b-beta --enable-reasoning --reasoning-parser deepseek_r1
+```
+
+```python
+from openai import OpenAI
+client = OpenAI(
+    base_url="http://localhost:8000/v1",
+    api_key="token-abc123",
+)
+
+prompt = r"""Every morning Aya goes for a $9$-kilometer-long walk and stops at a coffee shop afterwards. When she walks at a constant speed of $s$ kilometers per hour, the walk takes her 4 hours, including $t$ minutes spent in the coffee shop. When she walks $s+2$ kilometers per hour, the walk takes her 2 hours and 24 minutes, including $t$ minutes spent in the coffee shop. Suppose Aya walks at $s+\frac{1}{2}$ kilometers per hour. Find the number of minutes the walk takes her, including the $t$ minutes spent in the coffee shop."""
+completion = client.chat.completions.create(
+  model="AXCXEPT/Qwen3-EZO-8b-beta",
+  messages=[
+    {"role": "user", "content": prompt}
+  ]
+)
+
+print(completion.choices[0].message)
+```
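
Because the server above is started with a reasoning parser, the returned message may carry the thinking trace separately from the final answer. The sketch below shows one way to pull the two apart. It assumes the trace is exposed as a `reasoning_content` attribute on the message, which depends on the vLLM version in use. `split_reasoning` is a hypothetical helper, and the `demo` object is a stand-in for `completion.choices[0].message` so the snippet runs without a server.

```python
from types import SimpleNamespace


def split_reasoning(message):
    """Return (reasoning, answer) from a chat completion message.

    Assumption: the server attaches the thinking trace as
    `reasoning_content`. If the attribute is missing, reasoning is None.
    """
    reasoning = getattr(message, "reasoning_content", None)
    answer = getattr(message, "content", None) or ""
    return reasoning, answer.strip()


# Stand-in for completion.choices[0].message (illustrative values only).
demo = SimpleNamespace(
    reasoning_content="Solve for s and t first...",
    content=" 204 \n",
)

reasoning, answer = split_reasoning(demo)
print("reasoning:", reasoning)
print("answer:", answer)
```

With a live server, pass `completion.choices[0].message` instead of `demo`. For the walking problem in the prompt, the reference answer is 204 minutes ($s = 2.5$ km/h, $t = 24$ minutes), which gives a quick check on `answer`.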
+
+## Special Thanks
+本モデルのベースモデルの開発を行った、Alibaba Cloud社ならびにQwen開発チームに、尊敬と敬意の念をここに表します。
+We would like to express our sincere respect and appreciation to Alibaba Cloud and the Qwen development team for their work in creating the base model for this project.