Files
xc-llm-ascend/vllm_ascend/spec_decode
Zetong Li 4fa7cf6f50 [Bugfix] Fix problematic dummy_run & improper input_batch_size in eagle (#6517)
### What this PR does / why we need it?
This PR aims to fix problematic dummy_run that will cause excessive npu
memory and to fix improper input_batch_size that will degrade running
performance.

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
by ci

- vLLM version: v0.15.0
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.15.0

---------

Signed-off-by: Zetong Li <slippersss@126.com>
Signed-off-by: lilinsiman <lilinsiman@gmail.com>
Co-authored-by: lilinsiman <lilinsiman@gmail.com>
2026-02-07 09:30:10 +08:00
..