# Multi-Node-DP (Qwen3-VL-235B-A22B)

:::{note}
Qwen3-VL requires a recent version of `transformers` (> 4.56.2). Please install it from source.
:::

## Verify Multi-Node Communication Environment

Refer to [multi_node.md](https://vllm-ascend.readthedocs.io/en/latest/tutorials/multi_node.html#verification-process).
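
Before deploying, it is worth confirming that the NPU network interfaces on every node report a healthy link. Below is a minimal sanity-check sketch; the `hccn_tool` path and the 0-15 device range are assumptions for an A3 node, and the linked verification process remains authoritative:

```shell
# Assumed driver tool path; adjust to your installation.
HCCN_TOOL=/usr/local/Ascend/driver/tools/hccn_tool

# Every NPU joining the cluster should report "link status: UP".
for i in $(seq 0 15); do
    $HCCN_TOOL -i "$i" -link -g
done
```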

## Run with Docker

Assume you have two Atlas 800 A3 (64 GB * 16) nodes (or two A2 nodes) and want to deploy the `Qwen3-VL-235B-A22B-Instruct` model across them.

```{code-block} bash
:substitutions:

# Update the vllm-ascend image
export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|
docker run --rm \
--name vllm-ascend \
--net=host \
--device /dev/davinci0 \
--device /dev/davinci1 \
--device /dev/davinci2 \
--device /dev/davinci3 \
--device /dev/davinci4 \
--device /dev/davinci5 \
--device /dev/davinci6 \
--device /dev/davinci7 \
--device /dev/davinci8 \
--device /dev/davinci9 \
--device /dev/davinci10 \
--device /dev/davinci11 \
--device /dev/davinci12 \
--device /dev/davinci13 \
--device /dev/davinci14 \
--device /dev/davinci15 \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
--shm-size 256g \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
-v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-v /root/.cache:/root/.cache \
-p 8000:8000 \
-it $IMAGE bash
```
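
Once inside the container, you can verify that all NPUs are visible before continuing; the mounted `npu-smi` binary should behave as it does on the host:

```shell
# Lists every NPU the container can see, with health status and utilization
npu-smi info
```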

Run the following scripts on the two nodes, respectively.

:::{note}
Before launching the inference server, ensure the following environment variables are set for multi-node communication.
:::
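
In the scripts below, `nic_name` must be the network interface that owns `local_ip`. If you are unsure which interface that is, a sketch like this can find it (assuming the iproute2 `ip` utility is available):

```shell
local_ip="xxxx"   # the IP this node uses for cluster traffic

# Print the interface that owns local_ip, e.g. "enp189s0f0"
ip -o -4 addr show | awk -v ip="$local_ip" 'index($4, ip"/") == 1 {print $2}'
```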

Node 0

```shell
#!/bin/sh

# nic_name is the network interface name corresponding to local_ip of the
# current node; both can be obtained through ifconfig
nic_name="xxxx"
local_ip="xxxx"

export HCCL_IF_IP=$local_ip
export GLOO_SOCKET_IFNAME=$nic_name
export TP_SOCKET_IFNAME=$nic_name
export HCCL_SOCKET_IFNAME=$nic_name
export OMP_PROC_BIND=false
export OMP_NUM_THREADS=10
export VLLM_USE_V1=1
export HCCL_BUFFSIZE=1024

vllm serve Qwen/Qwen3-VL-235B-A22B-Instruct \
--host 0.0.0.0 \
--port 8000 \
--data-parallel-size 2 \
--api-server-count 2 \
--data-parallel-size-local 1 \
--data-parallel-address $local_ip \
--data-parallel-rpc-port 13389 \
--seed 1024 \
--served-model-name qwen3vl \
--tensor-parallel-size 8 \
--enable-expert-parallel \
--max-num-seqs 16 \
--max-model-len 32768 \
--max-num-batched-tokens 4096 \
--trust-remote-code \
--no-enable-prefix-caching \
--gpu-memory-utilization 0.8
```
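
Node 0 hosts the data-parallel head, so start it first. Once it is up, you can optionally check from node 1 that the RPC port is reachable before launching the second script; a sketch using bash's `/dev/tcp` feature, where `13389` matches `--data-parallel-rpc-port` above:

```shell
node0_ip="xxxx"   # same value as local_ip on node 0

# Succeeds only if node 0's data-parallel RPC port accepts connections
timeout 3 bash -c "</dev/tcp/$node0_ip/13389" \
    && echo "rpc port reachable" \
    || echo "rpc port unreachable"
```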

Node 1

```shell
#!/bin/sh

# nic_name is the network interface name corresponding to local_ip of the
# current node; both can be obtained through ifconfig
nic_name="xxxx"
local_ip="xxxx"

# node0_ip must be consistent with the local_ip set on node 0 (the master node)
node0_ip="xxxx"

export HCCL_IF_IP=$local_ip
export GLOO_SOCKET_IFNAME=$nic_name
export TP_SOCKET_IFNAME=$nic_name
export HCCL_SOCKET_IFNAME=$nic_name
export OMP_PROC_BIND=false
export OMP_NUM_THREADS=10
export VLLM_USE_V1=1
export HCCL_BUFFSIZE=1024

vllm serve Qwen/Qwen3-VL-235B-A22B-Instruct \
--host 0.0.0.0 \
--port 8000 \
--headless \
--data-parallel-size 2 \
--data-parallel-size-local 1 \
--data-parallel-start-rank 1 \
--data-parallel-address $node0_ip \
--data-parallel-rpc-port 13389 \
--seed 1024 \
--tensor-parallel-size 8 \
--served-model-name qwen3vl \
--max-num-seqs 16 \
--max-model-len 32768 \
--max-num-batched-tokens 4096 \
--enable-expert-parallel \
--trust-remote-code \
--no-enable-prefix-caching \
--gpu-memory-utilization 0.8
```

If the service starts successfully, the following information will be displayed on node 0:

```shell
INFO: Started server process [44610]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Started server process [44611]
INFO: Waiting for application startup.
INFO: Application startup complete.
```
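
You can confirm that the API server is ready by listing the served models through vLLM's OpenAI-compatible endpoint; the returned list should contain `qwen3vl`:

```shell
curl http://localhost:8000/v1/models
```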

Once your server is started, you can query the model with input prompts:

```shell
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "qwen3vl",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": [
                {"type": "image_url", "image_url": {"url": "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/qwen.png"}},
                {"type": "text", "text": "What is the text in the illustration?"}
            ]}
        ]
    }'
```
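
The response follows the OpenAI chat-completions schema, so the generated text can be extracted directly; a usage sketch assuming `jq` is installed:

```shell
# Print only the model's answer from a simple text-only request
curl -s http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "qwen3vl",
        "messages": [{"role": "user", "content": "Hello!"}]
    }' | jq -r '.choices[0].message.content'
```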