---
library_name: transformers
tags:
- text-generation-inference
license: apache-2.0
language:
- ja
- en
base_model:
- Qwen/Qwen3-8B
pipeline_tag: text-generation
---
# Qwen3-EZO-8b-beta

We are releasing **Qwen3-EZO-8b-beta**, an 8B-parameter LLM based on Qwen3-8B.

Although its size places it in the SLM (Small Language Model) class, it performs comparably to Gemini 2.5 Flash and GPT-4o on multi-turn tasks in our evaluations. It significantly improves on the original Qwen3-8B, scoring **9.08** on MT-Bench and **8.87** on JMT-Bench.

It supports parallel processing of deep-thinking prompts through our **Deep-Think** technique and, when served with **vLLM**, exposes an OpenAI-compatible API.

The model was originally planned as a closed, API-only offering. Under our new policy of monetizing only after further accuracy improvements, we have decided to release it as an open model.

## Benchmark

*Scores are based on the most frequent outputs across repeated evaluations at temperatures 0.2 and 0.6, run on May 13, 2025, with GPT-4o and Gemini 2.5 Flash as judges.*
*All tests were performed internally on a single A40 GPU. Results may differ under external or official benchmark conditions.*

---

## How to use:

The model runs on a single A40 GPU. Start an OpenAI-compatible server with vLLM:

```bash
vllm serve AXCXEPT/Qwen3-EZO-8b-beta --enable-reasoning --reasoning-parser deepseek_r1
```
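The `--reasoning-parser deepseek_r1` option asks vLLM to separate the model's thinking from its final answer. As a rough illustration of that idea only, the sketch below splits raw text around `<think>` and `</think>` tags. The helper name `split_think` and the sample string are hypothetical, and this is not vLLM's actual implementation.

```python
from typing import Tuple


def split_think(text: str) -> Tuple[str, str]:
    """Split raw model output into (reasoning, answer) around the </think> tag."""
    open_tag, close_tag = "<think>", "</think>"
    if close_tag not in text:
        # No reasoning block found: treat the whole text as the answer.
        return "", text.strip()
    reasoning, _, answer = text.partition(close_tag)
    # Drop the opening tag if the model emitted one.
    reasoning = reasoning.replace(open_tag, "", 1)
    return reasoning.strip(), answer.strip()


# Hypothetical raw output, for illustration only.
raw = "<think>9 / 3 = 3 hours, plus 24 minutes.</think>The walk takes 204 minutes."
reasoning, answer = split_think(raw)
print("reasoning:", reasoning)
print("answer:", answer)
```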

Query the server through the OpenAI Python client:

```python
from openai import OpenAI

# Point the OpenAI client at the local vLLM server.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="token-abc123",  # placeholder; only checked if vLLM was started with --api-key
)

# Raw string so that LaTeX commands such as \frac are not treated as escape sequences.
prompt = r"""Every morning Aya goes for a $9$-kilometer-long walk and stops at a coffee shop afterwards. When she walks at a constant speed of $s$ kilometers per hour, the walk takes her 4 hours, including $t$ minutes spent in the coffee shop. When she walks $s+2$ kilometers per hour, the walk takes her 2 hours and 24 minutes, including $t$ minutes spent in the coffee shop. Suppose Aya walks at $s+\frac{1}{2}$ kilometers per hour. Find the number of minutes the walk takes her, including the $t$ minutes spent in the coffee shop."""

completion = client.chat.completions.create(
    model="AXCXEPT/Qwen3-EZO-8b-beta",
    messages=[
        {"role": "user", "content": prompt}
    ]
)

# Print the full message object returned by the server.
print(completion.choices[0].message)
```
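To sanity-check the model's reply to the sample prompt, it helps to know the expected result. The sketch below solves the walking problem directly and compares it with the last integer found in an answer string. The helper names (`reference_minutes`, `last_integer`, `is_correct`) are hypothetical, and taking the last integer is a crude heuristic, not a robust answer parser.

```python
import math
import re


def reference_minutes() -> int:
    """Solve the sample problem directly (times in hours, t = coffee-shop time)."""
    # 9/s + t = 4 and 9/(s+2) + t = 2.4; subtracting gives 9/s - 9/(s+2) = 1.6,
    # which simplifies to s^2 + 2s - 11.25 = 0.
    s = (-2 + math.sqrt(4 + 4 * 11.25)) / 2   # positive root: 2.5 km/h
    t_hours = 4 - 9 / s                        # 0.4 h, i.e. 24 minutes
    total_hours = 9 / (s + 0.5) + t_hours      # walking at 3 km/h, plus the stop
    return round(total_hours * 60)


def last_integer(text: str):
    """Return the last integer that appears in the text, or None if there is none."""
    numbers = re.findall(r"\d+", text)
    return int(numbers[-1]) if numbers else None


def is_correct(answer_text: str) -> bool:
    """Heuristic check: does the last integer in the answer match the reference?"""
    return last_integer(answer_text) == reference_minutes()


print(reference_minutes())  # 204
```

With a running server, `is_correct(completion.choices[0].message.content)` applies the same check to the model's final answer. Depending on the vLLM version, the separated reasoning may be exposed on the message as an extra field such as `reasoning_content`; reading it with `getattr(message, "reasoning_content", None)` avoids an error if the field is absent.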

## Special Thanks

We would like to express our sincere respect and appreciation to Alibaba Cloud and the Qwen development team for their work in creating the base model for this project.