luopingyi a5b788e078 Update README.md
Add comparison results
2025-10-14 14:59:30 +08:00

# enginex-mlu370-any2any
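# Cambricon mlu370 Unified Multimodal
This model test framework adapts the Qwen/Qwen3-Omni-30B-A3B-Instruct model to Cambricon mlu370 X8/X4 accelerator cards, based on the Transformers framework.
* For details, see https://modelscope.cn/models/Qwen/Qwen3-Omni-30B-A3B-Instruct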
## Quick Start
1. Download the model from ModelScope
```bash
modelscope download --model Qwen/Qwen3-Omni-30B-A3B-Instruct --local_dir /models/Qwen3-Omni-30B-A3B-Instruct
```
2. Build the image
```bash
docker build -t qwen:omni .
```
3. Start the Docker container
```bash
docker run -it --rm \
-v /models/:/mnt/models \
--device=/dev/cambricon_dev0:/dev/cambricon_dev0 \
--device=/dev/cambricon_dev1:/dev/cambricon_dev1 \
--device=/dev/cambricon_dev2:/dev/cambricon_dev2 \
--device=/dev/cambricon_dev3:/dev/cambricon_dev3 \
--device=/dev/cambricon_ctl:/dev/cambricon_ctl \
-p 8080:80 \
qwen:omni
```
Note: a Cambricon mlu370 chip must be available on the local machine.
4. Test the service
4.1 Test visual understanding
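
The `--device` flags in step 3 assume four MLU device nodes plus the control node. As a minimal sketch, the flags could instead be generated from whichever `/dev/cambricon_*` nodes actually exist on the host, which also shows quickly when a card is missing. The helper name `cambricon_device_flags` and its `dev_dir` parameter are illustrative and not part of this repository; the node naming is taken from the `docker run` command in step 3.

```python
from pathlib import Path

PREFIX = "cambricon_dev"


def cambricon_device_flags(dev_dir="/dev"):
    """Return docker --device flags for the Cambricon nodes found under dev_dir.

    Hypothetical helper: it only mirrors the node names used in the
    docker run command (cambricon_dev<N> and cambricon_ctl).
    """
    root = Path(dev_dir)
    # Card nodes such as cambricon_dev0 ... cambricon_dev3, in numeric order.
    cards = sorted(
        (p for p in root.glob(PREFIX + "*") if p.name[len(PREFIX):].isdigit()),
        key=lambda p: int(p.name[len(PREFIX):]),
    )
    ctl = root / "cambricon_ctl"
    nodes = cards + ([ctl] if ctl.exists() else [])
    return [f"--device={node}:{node}" for node in nodes]


# Print one flag per line; an empty result suggests the driver is not loaded.
for flag in cambricon_device_flags():
    print(flag)
```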
```bash
python request.py
```
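
Before running `request.py`, it can help to confirm that the port published by `-p 8080:80` accepts connections. The sketch below uses only the Python standard library; the function name `wait_for_port` and its defaults are assumptions for illustration. An open port only shows that the server process is listening, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=8080, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # 8080 is the host port mapped to container port 80 by `-p 8080:80`.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```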
4.2 Test unified multimodal
When starting the container, set the entrypoint to /bin/bash
```bash
docker run -it --rm \
-v /models/:/mnt/models \
--device=/dev/cambricon_dev0:/dev/cambricon_dev0 \
--device=/dev/cambricon_dev1:/dev/cambricon_dev1 \
--device=/dev/cambricon_dev2:/dev/cambricon_dev2 \
--device=/dev/cambricon_dev3:/dev/cambricon_dev3 \
--device=/dev/cambricon_ctl:/dev/cambricon_ctl \
--entrypoint /bin/bash \
-p 8080:80 \
qwen:omni
```
Copy test.py and the accompanying test files into the container
```bash
docker cp ./test.py <container_id>:/workspace/test.py
docker cp ./cars.jpg <container_id>:/workspace/cars.jpg
docker cp ./cough.wav <container_id>:/workspace/cough.wav
```
Inside the container, run the test script
```bash
python test.py
```
## Test results for Qwen/Qwen3-Omni-30B-A3B-Instruct on Cambricon mlu370-X8/X4
The test measures the running time for answering 10 image-related questions on Nvidia H100 and mlu370-X8/X4 accelerator cards.
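
The measurement described above might be sketched as follows, assuming the reported figure is the total wall-clock time over the 10 questions (the source does not say whether it is a total or an average). `answer` is a hypothetical callable standing in for whatever actually sends a question to the model; it is not a function from this repository.

```python
import time


def time_questions(answer, questions):
    """Return the total wall-clock seconds spent answering all questions.

    `answer` is a hypothetical callable (question -> reply); replace it with
    the code that actually queries the model.
    """
    start = time.perf_counter()
    for question in questions:
        answer(question)
    return time.perf_counter() - start
```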
| Model name | Talker enabled | Adaptation status | mlu370-X8 running time (s) | mlu370-X4 running time (s) | Nvidia H100 running time (s) |
|-----------------------------|----------------|----------|------------------------|------------------------|---------------------------|
| Qwen3-Omni-30B-A3B-Instruct | Yes | Success | 32.5787 | 30.1759 | 11.3393 |
| Qwen3-Omni-30B-A3B-Instruct | No | Success | 10.1789 | 9.1539 | 2.9123 |
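
To make the comparison easier to read, the running times can be expressed relative to the H100. This is a small sketch using only the numbers in the table; the names `RESULTS` and `slowdown_vs_h100` are illustrative.

```python
# Running times in seconds, copied from the results table.
RESULTS = {
    "Talker enabled": {"mlu370-X8": 32.5787, "mlu370-X4": 30.1759, "H100": 11.3393},
    "Talker disabled": {"mlu370-X8": 10.1789, "mlu370-X4": 9.1539, "H100": 2.9123},
}


def slowdown_vs_h100(times):
    """Return each MLU card's running time divided by the H100 running time."""
    baseline = times["H100"]
    return {card: t / baseline for card, t in times.items() if card != "H100"}


for setting, times in RESULTS.items():
    for card, ratio in slowdown_vs_h100(times).items():
        print(f"{setting:16s} {card}: {ratio:.2f}x the H100 time")
```

On these figures the mlu370 cards take roughly 2.7 to 3.5 times as long as the H100, and disabling Talker shortens the running time by a factor of about 3.2 to 3.9 on each card.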