xc-llm-ascend/vllm_ascend/ops
Levi f0d41199a6 [Performance] Remove index operation when VLLM_ASCEND_FLASHCOMM2_PARALLEL_SIZE=1 (#5936)
### What this PR does / why we need it?
When VLLM_ASCEND_FLASHCOMM2_PARALLEL_SIZE>1, an index operation is
needed to reorganize the batch so that each rank holds the correct
batch-id after the reduce-scatter op. When
VLLM_ASCEND_FLASHCOMM2_PARALLEL_SIZE=1, no reduce-scatter is performed,
so the index operation is unnecessary and this PR removes it for that
case.
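
The idea can be illustrated with a minimal pure-Python sketch. It is not the vllm-ascend implementation: `build_reorder_index` and `maybe_reorder` are hypothetical names, and the interleaved token layout is an assumption chosen only to show why the reindexing step can be skipped when the parallel size is 1.

```python
def build_reorder_index(num_tokens: int, parallel_size: int) -> list[int]:
    """Hypothetical index: gather positions so each rank's share is contiguous.

    Assumes (for illustration only) that position i belongs to rank
    i % parallel_size before the reduce-scatter.
    """
    return [
        i
        for rank in range(parallel_size)
        for i in range(rank, num_tokens, parallel_size)
    ]


def maybe_reorder(batch: list, parallel_size: int) -> list:
    """Apply the index operation only when a reduce-scatter will follow."""
    if parallel_size == 1:
        # No reduce-scatter, so the batch order is already correct;
        # skipping the gather avoids a redundant copy.
        return batch
    index = build_reorder_index(len(batch), parallel_size)
    return [batch[i] for i in index]
```

Under this assumed layout the index for a parallel size of 1 is the identity permutation, so skipping it changes nothing except the cost of the extra gather.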

Signed-off-by: Levi-JQ <yujinqi2@huawei.com>
Co-authored-by: Levi-JQ <yujinqi2@huawei.com>
2026-01-19 17:12:13 +08:00