xc-llm-ascend/vllm_ascend/ops/triton
bowenli 048c8d1afe [v0.18.0][Bugfix] Fix the bug of MTP1 crashing in multiple concurrent scenarios. (#7699)
### What this PR does / why we need it?
The Triton operator does not perform a boundary check on the global
position inside its loop, which leads to out-of-bounds memory access
when 1-step MTP is launched with multiple concurrent requests.

Solution: Add a `global_pos < vec_len` check and strictly bound every
memory access so that no out-of-bounds write can occur.
backport: #7459
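
For illustration only, the boundary check can be sketched in plain Python. This is not the actual Triton kernel: `masked_block_copy`, `block_size`, `src`, and `dst` are hypothetical names, and only `global_pos` and `vec_len` come from the description above. The sketch emulates a block-wise loop in which the last block overhangs the end of the vector.

```python
def masked_block_copy(src, dst, vec_len, block_size):
    """Copy the first vec_len elements of src into dst, block by block.

    Emulates a block-wise kernel loop: each iteration covers block_size
    lanes, and a lane touches memory only when global_pos < vec_len.
    """
    num_blocks = (vec_len + block_size - 1) // block_size  # ceiling division
    for block_id in range(num_blocks):
        for lane in range(block_size):
            global_pos = block_id * block_size + lane
            # Boundary check: lanes past vec_len in the last block are
            # skipped, so nothing is read or written out of bounds.
            if global_pos < vec_len:
                dst[global_pos] = src[global_pos]
    return dst


src = [1, 2, 3, 4, 5, 6, 7]
dst = [0] * 10  # slots at index >= vec_len must stay untouched
result = masked_block_copy(src, dst, vec_len=5, block_size=4)
```

With `vec_len=5` and `block_size=4`, the second block covers positions 4 to 7, but only position 4 passes the check. In a real Triton kernel the same condition would typically be passed as the `mask` argument of `tl.load` and `tl.store` rather than written as an `if`.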

Signed-off-by: Bowen-Leee <caoshankuangren@gmail.com>
2026-03-27 14:13:12 +08:00