xc-llm-ascend


realliujiaxu d3c3538ddc [Bugfix]fix bug when graph_size is not divisible by tp_size (#2719)

### What this PR does / why we need it?
Fixes https://github.com/vllm-project/vllm-ascend/issues/2702
- A2: skip the graph_size update that aligns it to tp_size, because the
dispatch/combine ops support different batch sizes across EP ranks.
- A3: add `max_num_reqs = max(new_graph_batch_sizes)` to fix the mismatch
between graph_size and max_num_reqs.
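
The two cases above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `align_graph_batch_sizes`, the `needs_tp_alignment` flag, and the round-up-to-a-multiple rule are all assumptions made for the example. Only `max_num_reqs = max(new_graph_batch_sizes)` comes from the description above.

```python
from typing import List, Tuple


def align_graph_batch_sizes(
    graph_batch_sizes: List[int],
    tp_size: int,
    max_num_reqs: int,
    needs_tp_alignment: bool,
) -> Tuple[List[int], int]:
    """Hypothetical helper illustrating the A2 and A3 cases (not the PR's code)."""
    if not needs_tp_alignment or not graph_batch_sizes:
        # A2-style path (assumed): leave the graph sizes untouched, since the
        # dispatch/combine ops tolerate different batch sizes across EP ranks.
        return list(graph_batch_sizes), max_num_reqs

    # A3-style path (assumed): round each size up to a multiple of tp_size,
    # dropping duplicates while keeping the original order.
    new_graph_batch_sizes: List[int] = []
    for size in graph_batch_sizes:
        aligned = (size + tp_size - 1) // tp_size * tp_size
        if aligned not in new_graph_batch_sizes:
            new_graph_batch_sizes.append(aligned)

    # Keep max_num_reqs in sync with the largest graph size, as in the PR text.
    max_num_reqs = max(new_graph_batch_sizes)
    return new_graph_batch_sizes, max_num_reqs
```

With `tp_size=4`, the sizes `[1, 3, 6]` become `[4, 8]` on the aligned path and `max_num_reqs` follows the largest entry; on the skipped path both values are returned unchanged.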

### Does this PR introduce _any_ user-facing change?
Nope
### How was this patch tested?


- vLLM version: v0.10.1.1
- vLLM main: e599e2c65e

---------

Signed-off-by: realliujiaxu <realliujiaxu@163.com>

2025-09-08 14:52:33 +08:00

__init__.py

[Misc][V0 Deprecation] Remove Cache Engine Used for V0 Worker (#1878)

2025-07-19 09:42:32 +08:00

model_runner_v1.py

[Bugfix]fix bug when graph_size is not divisible by tp_size (#2719)

2025-09-08 14:52:33 +08:00

npu_input_batch.py

Support v0.10.1 (#2584)

2025-08-28 18:47:53 +08:00

worker_v1.py

[Main][Feat]Set the Profiler parameters through environment variables consistent with vLLM (#2608)

2025-09-03 10:58:08 +08:00