xc-llm-ascend/vllm_ascend/models
Zhijun Chen 463910e686 [Bugfix] use module-level import for patched function in Qwen3Next (#4354)
### What this PR does / why we need it?

**Problem**: The Qwen3Next model implementation currently imports
`chunk_gated_delta_rule` directly using `from ... import ...`, which binds
the name to the function object at import time.

In frameworks like `verl`, the model file is often imported before
`vllm-ascend` initializes and applies its patches. This causes the model
to permanently hold a reference to the original (unpatched) vLLM kernel,
resulting in execution errors on Ascend devices even if the patch runs
later.

**Solution**: Changed the import style to `from vllm...ops import chunk`
and call `chunk.chunk_gated_delta_rule()`.

This ensures that the function is looked up on the module at call time
(late binding), so the model picks up the patched function regardless of
import order.
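
The difference between the two import styles can be reproduced without vLLM. The following is a minimal sketch, not the actual model code: `fake_ops` is a hypothetical stand-in module, and `forward_early` / `forward_late` are illustrative names for a model that uses the old and new import style respectively.

```python
import sys
import types

# Hypothetical stand-in for the vLLM ops module (assumption, not the real package).
fake_ops = types.ModuleType("fake_ops")
fake_ops.chunk_gated_delta_rule = lambda: "original"
sys.modules["fake_ops"] = fake_ops

# --- "Model file", imported BEFORE the patch is applied ---
# Old style: copies a reference to the function object at import time.
from fake_ops import chunk_gated_delta_rule
# New style: keeps a reference to the module; the attribute is resolved per call.
import fake_ops as chunk


def forward_early():
    # Uses the name bound at import time.
    return chunk_gated_delta_rule()


def forward_late():
    # Looks the function up on the module every time it is called.
    return chunk.chunk_gated_delta_rule()


# --- "Patch", applied AFTER the model file was imported ---
def patched_chunk_gated_delta_rule():
    return "patched"


fake_ops.chunk_gated_delta_rule = patched_chunk_gated_delta_rule

print(forward_early())  # still calls the unpatched function
print(forward_late())   # calls the patched function
```

Rebinding the module attribute does not affect names that were already copied out of the module, which is why only the module-level lookup sees the patch.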

- vLLM version: v0.11.0
- vLLM main: 2918c1b49c

Signed-off-by: zjchenn <zjchenn@gmail.com>
2025-11-25 20:15:43 +08:00