[Feat]Xlite Qwen3-vl Support (#5228)

### What this PR does / why we need it?
This patch adds support for the Qwen3-VL model in Xlite. For more
details about Xlite, please refer to the following
link:https://atomgit.com/openeuler/GVirt/blob/master/xlite/README.md.
The latest performance comparison data between xlite and the default
aclgraph mode is as follows:

### Does this PR introduce _any_ user-facing change?
XLite graph mode supports the Qwen3-VL model.

### How was this patch tested?
vLLM version: v0.12.0 

- vLLM version: release/v0.13.0
- vLLM main:
ad32e3e19c

Signed-off-by: lvjunqi <lvjunqi1@huawei.com>
Co-authored-by: lvjunqi <lvjunqi1@huawei.com>
This commit is contained in:
lvjunqi
2025-12-22 16:30:52 +08:00
committed by GitHub
parent 78aa7f2693
commit 55beac9c91
4 changed files with 19 additions and 9 deletions

View File

@@ -281,7 +281,7 @@ class NPUPlatform(Platform):
parallel_config.all2all_backend = "flashinfer_all2allv"
if ascend_config.xlite_graph_config.enabled:
logger.info(
"Euler Xlite enabled. See: https://gitee.com/openeuler/GVirt/tree/master/xlite"
"openEuler Xlite enabled. See: https://atomgit.com/openeuler/GVirt/tree/master/xlite"
)
parallel_config.worker_cls = "vllm_ascend.xlite.xlite_worker.XliteWorker"
else: