xc-llm-ascend/vllm_ascend
drslark 44a4ff6960 [main][BugFix] Avoided a bug of torch_npu.npu_mm_reduce_scatter_base when sp size >= 16 (#6168)
### What this PR does / why we need it?
If `sp` is enabled and `tp_size` >= 16,
`torch_npu.npu_mm_reduce_scatter_base` raises an exception.
After consulting the operator developer, we learned that the
operator does not work when `tp` = 16.
We therefore disable the operator when `tp` = 16.
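
The guard described above can be sketched roughly as follows. This is a minimal illustration, not the code from this patch: the helper name `should_use_fused_mm_reduce_scatter` and the constant `UNSUPPORTED_TP_SIZE_FLOOR` are hypothetical, and treating every `tp_size` >= 16 as unsupported is an assumption (the operator developer only confirmed the failure at `tp` = 16).

```python
# Hypothetical helper; the name and the threshold are illustrative assumptions,
# not identifiers taken from vllm_ascend or torch_npu.
UNSUPPORTED_TP_SIZE_FLOOR = 16  # assumption: treat every tp_size >= 16 as unsupported


def should_use_fused_mm_reduce_scatter(sp_enabled: bool, tp_size: int) -> bool:
    """Return True when the fused matmul + reduce-scatter path may be taken."""
    if not sp_enabled:
        # Assumption: the fused operator is only considered when sp is enabled.
        return False
    # Fall back to the unfused path for tensor-parallel sizes at or above the floor.
    return tp_size < UNSUPPORTED_TP_SIZE_FLOOR
```

A caller would check this helper before dispatching to `torch_npu.npu_mm_reduce_scatter_base` and otherwise run the separate matmul and reduce-scatter steps.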

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?

We started a server with `sp` enabled and `tp` = 16.

It started successfully.

```text
(APIServer pid=1855938) INFO:     Started server process [1855938]
(APIServer pid=1855938) INFO:     Waiting for application startup.
(APIServer pid=1855938) INFO:     Application startup complete.
```

- vLLM version: v0.13.0
- vLLM main: d68209402d

Signed-off-by: drslark <slarksblood@qq.com>
2026-01-23 21:12:23 +08:00