# enginex-mlu370-any2any

# Cambricon MLU370 Unified Multimodal

This model test framework adapts the Qwen/Qwen3-Omni-30B-A3B-Instruct model to run on Cambricon MLU370 (X8/X4) accelerator cards, using the Transformers framework.

* For details, see https://modelscope.cn/models/Qwen/Qwen3-Omni-30B-A3B-Instruct

## Quick Start
1. First, download the model from ModelScope:
```bash
modelscope download --model Qwen/Qwen3-Omni-30B-A3B-Instruct --local_dir /models/Qwen3-Omni-30B-A3B-Instruct
```
2. Build the image:
```bash
docker build -t qwen:omni .
```
3. Start the container:
```bash
docker run -it --rm \
  -v /models/:/mnt/models \
  --device=/dev/cambricon_dev0:/dev/cambricon_dev0 \
  --device=/dev/cambricon_dev1:/dev/cambricon_dev1 \
  --device=/dev/cambricon_dev2:/dev/cambricon_dev2 \
  --device=/dev/cambricon_dev3:/dev/cambricon_dev3 \
  --device=/dev/cambricon_ctl:/dev/cambricon_ctl \
  -p 8080:80 \
  qwen:omni
```
Note: this requires Cambricon MLU370 hardware on the local host.
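
The container can only use the cards if the host exposes the device nodes named in the `--device` flags, so it can help to check for them before calling `docker run`. The following is a minimal sketch and not part of this repository: the helper name `missing_device_nodes` is hypothetical, and the paths are simply the ones from the `docker run` command (four cards plus the control node).

```python
import os

# Device nodes taken from the `docker run` command in step 3.
EXPECTED_NODES = [f"/dev/cambricon_dev{i}" for i in range(4)] + ["/dev/cambricon_ctl"]


def missing_device_nodes(paths=EXPECTED_NODES, exists=os.path.exists):
    """Return the entries of `paths` that are not present on this host."""
    return [p for p in paths if not exists(p)]


if __name__ == "__main__":
    missing = missing_device_nodes()
    if missing:
        print("Missing device nodes:", ", ".join(missing))
    else:
        print("All expected Cambricon device nodes are present.")
```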
4. Test the service

4.1 Test visual understanding
```bash
python request.py
```
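
If `request.py` cannot connect, a quick first check is whether anything is listening on the published host port. The sketch below assumes the service is reached on port 8080 of localhost, as suggested by `-p 8080:80` in the `docker run` command. The helper name `port_is_open` is hypothetical, and the check only tests TCP reachability; it assumes nothing about the service's HTTP API.

```python
import socket


def port_is_open(host="localhost", port=8080, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    state = "reachable" if port_is_open() else "not reachable"
    print(f"localhost:8080 is {state}")
```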
4.2 Test unified multimodal inference

Start the container with the entrypoint set to /bin/bash:

```bash
docker run -it --rm \
  -v /models/:/mnt/models \
  --device=/dev/cambricon_dev0:/dev/cambricon_dev0 \
  --device=/dev/cambricon_dev1:/dev/cambricon_dev1 \
  --device=/dev/cambricon_dev2:/dev/cambricon_dev2 \
  --device=/dev/cambricon_dev3:/dev/cambricon_dev3 \
  --device=/dev/cambricon_ctl:/dev/cambricon_ctl \
  --entrypoint /bin/bash \
  -p 8080:80 \
  qwen:omni
```
Copy test.py and the sample input files into the container:
```bash
docker cp ./test.py <container_id>:/workspace/test.py
docker cp ./cars.jpg <container_id>:/workspace/cars.jpg
docker cp ./cough.wav <container_id>:/workspace/cough.wav
```
Inside the container, run the test script:

```bash
python test.py
```
## Test results for Qwen/Qwen3-Omni-30B-A3B-Instruct on Cambricon MLU370-X8/X4

The test runs the model on an Nvidia H100 and on MLU370-X8/X4 accelerator cards, has it answer 10 image-related questions, and records the run time.

| Model | Talker enabled | Adaptation status | MLU370-X8 run time (s) | MLU370-X4 run time (s) | Nvidia H100 run time (s) |
|-----------------------------|----------------|----------|------------------------|------------------------|---------------------------|
| Qwen3-Omni-30B-A3B-Instruct | Yes | Success | 32.5787 | 30.1759 | 11.3393 |
| Qwen3-Omni-30B-A3B-Instruct | No | Success | 10.1789 | 9.1539 | 2.9123 |
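
To make the table easier to compare, the run times can be expressed relative to the H100 baseline. The snippet below is a small illustrative calculation that uses only the numbers from the table; the variable and function names are hypothetical.

```python
# Run times in seconds, copied from the results table.
RUN_TIMES = {
    "talker_on": {"MLU370-X8": 32.5787, "MLU370-X4": 30.1759, "H100": 11.3393},
    "talker_off": {"MLU370-X8": 10.1789, "MLU370-X4": 9.1539, "H100": 2.9123},
}


def slowdown_vs_h100(times):
    """Return each MLU370 run time divided by the H100 run time from the same row."""
    baseline = times["H100"]
    return {card: t / baseline for card, t in times.items() if card != "H100"}


if __name__ == "__main__":
    for config, times in RUN_TIMES.items():
        for card, ratio in slowdown_vs_h100(times).items():
            print(f"{config:10s} {card}: {ratio:.2f}x the H100 run time")
```

On these numbers, the MLU370 cards take roughly 2.7 to 3.5 times as long as the H100, and enabling the Talker makes each run roughly 3.2 to 3.9 times longer.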