xc-llm-ascend

Files

TaoYu Chen 20dedba5d1 Add qwen2.5 vl multimodal feature for vllm-ascend v1 (#736)

### What this PR does / why we need it?

vllm-ascend v1 does not yet support multimodal models. This PR changes
`model_runner_v1.py` to add that support, using the MRoPE feature among
other things. The support is still imperfect: the Ascend operator does
not support `window/full attn` to reduce Memcpy operations, so a run can
go out of memory if the input embedding is too large. For the same
reason, we can't use `self._profile_multimodal()` for profiling, since
it uses a large dummy input (i.e. images) as the multimodal input.
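
For readers unfamiliar with MRoPE, the sketch below illustrates the general idea only; it is not the code in `model_runner_v1.py`. Each token gets three position indices (temporal, height, width): text tokens share one value across all three axes, while image tokens are indexed by their place in the patch grid. The function name `mrope_positions` and the segment format are hypothetical, the grid is assumed to be already spatially merged, and video timing is ignored.

```python
from typing import List, Tuple, Union

# Hypothetical segment format: ("text", n_tokens) or ("image", (t, h, w)).
Segment = Tuple[str, Union[int, Tuple[int, int, int]]]


def mrope_positions(segments: List[Segment]) -> Tuple[List[int], List[int], List[int]]:
    """Return (temporal, height, width) position ids for a mixed sequence."""
    temporal: List[int] = []
    height: List[int] = []
    width: List[int] = []
    next_pos = 0  # first free position index for the next segment

    for kind, spec in segments:
        if kind == "text":
            # Text tokens: all three axes carry the same 1D position.
            for i in range(spec):
                temporal.append(next_pos + i)
                height.append(next_pos + i)
                width.append(next_pos + i)
            next_pos += spec
        elif kind == "image":
            # Image tokens: each axis is offset by the token's grid coordinate.
            t, h, w = spec
            for ti in range(t):
                for hi in range(h):
                    for wi in range(w):
                        temporal.append(next_pos + ti)
                        height.append(next_pos + hi)
                        width.append(next_pos + wi)
            # Following tokens resume after the largest index used so far.
            next_pos += max(t, h, w)
        else:
            raise ValueError(f"unknown segment kind: {kind}")

    return temporal, height, width
```

For example, `mrope_positions([("text", 2), ("image", (1, 2, 2)), ("text", 1)])` yields seven positions per axis, with the four image tokens differing only along the height and width axes.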

Fixes: https://github.com/vllm-project/vllm-ascend/issues/514

### Does this PR introduce _any_ user-facing change?

No. This feature does not require any user-facing changes.

### How was this patch tested?

I tested this offline on my own machine (910B3) using my own fork, and
it works well.

---------

Signed-off-by: cty <ctynb@qq.com>

2025-06-07 16:53:19 +08:00

e2e

[CI/UT][PD Disaggreate] Initialize PD Disaggreate UT (#889)

2025-05-29 10:17:12 +08:00

long_term

[Misc] Refactor additional_config (#1029)

2025-06-05 16:28:01 +08:00

multicard

[perf]: support dual-batch overlap(dbo) for deepseek (#941)

2025-06-07 16:46:58 +08:00

singlecard

Add qwen2.5 vl multimodal feature for vllm-ascend v1 (#736)