xc-llm-ascend/vllm_ascend/worker
wangbj127 e6ba5a88f7 [v0.18.0][BugFix] Fix Qwen3.5 MoE FC1 error under high concurrency when dp>1 (#8395)
### What this PR does / why we need it?
GDN attention currently reuses FIA's padded query_start_loc, which can
cause conv1d update errors under high concurrency when dp > 1. This PR
makes GDN use its own unpadded query_start_loc instead.
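
To illustrate the difference between the two offset arrays, here is a minimal sketch in plain Python. It is not the vllm-ascend implementation: the helper names `build_query_start_loc` and `pad_query_start_loc` are hypothetical, and the padding scheme (one trailing dummy segment that extends the offsets to a padded token count) is an assumption made for illustration only.

```python
from itertools import accumulate


def build_query_start_loc(query_lens):
    """Cumulative offsets: request i owns tokens [loc[i], loc[i + 1])."""
    return [0, *accumulate(query_lens)]


def pad_query_start_loc(query_start_loc, padded_num_tokens):
    """Hypothetical padding: append one dummy segment up to padded_num_tokens."""
    if query_start_loc[-1] >= padded_num_tokens:
        return list(query_start_loc)
    return [*query_start_loc, padded_num_tokens]


# Three real requests with 3, 1, and 2 query tokens.
query_lens = [3, 1, 2]
unpadded = build_query_start_loc(query_lens)

# Assume the batch is padded to 8 tokens (an illustrative value).
padded = pad_query_start_loc(unpadded, 8)

# Number of segments a per-sequence update would iterate over.
num_real_segments = len(unpadded) - 1
num_padded_segments = len(padded) - 1
```

Under these assumptions, a per-sequence state update driven by `padded` would see one more segment than there are real requests, and that extra segment covers only padding tokens. Driving the update from `unpadded` keeps it aligned with the real requests.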

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- vLLM version: v0.18.0

Signed-off-by: Wangbingjie <wangbj1207@126.com>
2026-04-20 10:26:19 +08:00