xc-llm-ascend

XiaoxinWang 37d1bd8c50 fixed fia pad logic in graph mode. (#7144)

### What this PR does / why we need it?
This change is related to vLLM PR #34043, which deleted the function
`relax_for_mixed_batch_cudagraphs`. After that deletion, num_reqs no
longer equals the actual number of requests. Because the fia operator
requires that query_start_loc[-1] equal the total number of computed
tokens, the deletion causes the ifa error.
To fix the error, set num_reqs_paded = num_reqs in full graph mode.
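
The constraint described above can be illustrated with a minimal sketch. It is not code from vllm-ascend: the zero-initialised buffer layout, the helper names `fill_query_start_loc` and `last_offset_matches`, and the parameter `num_reqs_used` are all assumptions made for illustration. It only shows why slicing the offsets with a padded request count can break the `query_start_loc[-1]` requirement, and why using the real request count restores it.

```python
from itertools import accumulate


def fill_query_start_loc(buffer_len, tokens_per_req):
    """Write cumulative token offsets into a zero-initialised buffer.

    The preallocated, zero-filled buffer is a hypothetical layout.
    """
    if len(tokens_per_req) + 1 > buffer_len:
        raise ValueError("buffer too small for the given requests")
    buf = [0] * buffer_len
    for i, offset in enumerate(accumulate(tokens_per_req), start=1):
        buf[i] = offset
    return buf


def last_offset_matches(buf, num_reqs_used, total_tokens):
    """Check the constraint from the text: query_start_loc[-1] == total tokens."""
    query_start_loc = buf[: num_reqs_used + 1]
    return query_start_loc[-1] == total_tokens


tokens_per_req = [3, 2, 4]          # three real requests
num_reqs = len(tokens_per_req)
total_tokens = sum(tokens_per_req)  # 9
buf = fill_query_start_loc(8, tokens_per_req)

# Slicing with a padded request count (5 here, chosen arbitrarily)
# ends on an unused slot, so the last offset is not the token total.
padded_ok = last_offset_matches(buf, 5, total_tokens)

# Slicing with the real request count ends on the true total.
unpadded_ok = last_offset_matches(buf, num_reqs, total_tokens)

print(padded_ok, unpadded_ok)
```

In this toy setup the padded slice fails the check and the unpadded slice passes, which mirrors the intent of setting num_reqs_paded = num_reqs in full graph mode.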
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.16.0
- vLLM main: 4034c3d32e

---------

Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Co-authored-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>

2026-03-12 14:50:54 +08:00

310p

[Lint]Style: Convert test/ to ruff format(Batch #1) (#6738)

2026-03-10 09:52:50 +08:00

doctests

[Doc] Recover installation doc to use pip install (#4109)

2025-11-11 09:25:44 +08:00

models

[Lint]Style: Convert test/ to ruff format(Batch #1) (#6738)

2026-03-10 09:52:50 +08:00

multicard

fixed fia pad logic in graph mode. (#7144)