[bugfix] Fix incorrect req block length in the Ascend scheduler's check_watermark_for_prefill function (#2508)
### What this PR does / why we need it?
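Fixes a bug in the Ascend scheduler's `check_watermark_for_prefill` function,
which used an incorrect block length for the request. The function called
`len(req_blocks)` on the result of
`self.kv_cache_manager.coordinator.get_blocks(request.request_id)`; it now uses
`len(req_blocks[0])` when computing `num_new_blocks`.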
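
A minimal sketch of why the index matters, assuming `get_blocks` returns a tuple holding one block list per KV cache group (which the `[0]` in the fix suggests). The `Block` type, the helper functions, and the sample values are hypothetical and only illustrate the arithmetic; they are not the scheduler's actual code.

```python
from typing import NamedTuple


class Block(NamedTuple):
    """Hypothetical stand-in for a KV cache block."""
    block_id: int
    ref_cnt: int


def num_new_blocks_before(num_required_blocks, req_blocks, computed_blocks):
    # Pre-fix expression: len(req_blocks) counts the outer tuple's entries
    # (assumed here to be KV cache groups), not the blocks held by the request.
    return num_required_blocks - len(req_blocks) - len(computed_blocks)


def num_new_blocks_after(num_required_blocks, req_blocks, computed_blocks):
    # Post-fix expression: len(req_blocks[0]) counts the blocks in the first group.
    return num_required_blocks - len(req_blocks[0]) - len(computed_blocks)


# Assumed shape: one group containing three blocks already held by the request.
req_blocks = ([Block(0, 1), Block(1, 1), Block(2, 1)],)
computed_blocks = [Block(3, 0)]
num_required_blocks = 6

before = num_new_blocks_before(num_required_blocks, req_blocks, computed_blocks)
after = num_new_blocks_after(num_required_blocks, req_blocks, computed_blocks)
print(before, after)  # 6 - 1 - 1 = 4 versus 6 - 3 - 1 = 2
```

Under this assumption, the pre-fix expression overestimates the number of new blocks needed whenever the request already holds more than one block.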
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.10.1.1
- vLLM main: 426cc8629f
Signed-off-by: liziyu <liziyu16@huawei.com>
@@ -465,7 +465,7 @@ class AscendScheduler(Scheduler):
                                               self.block_size)
         req_blocks = self.kv_cache_manager.coordinator.get_blocks(
             request.request_id)
-        num_new_blocks = (num_required_blocks - len(req_blocks) -
+        num_new_blocks = (num_required_blocks - len(req_blocks[0]) -
                           len(computed_blocks))
         num_evictable_computed_blocks = sum(1 for blk in computed_blocks
                                             if blk.ref_cnt == 0)