[Bugfix]Fix of Pooling Code and Update of Pooling Usage Guide (#6126)

### What this PR does / why we need it?
Fix of Pooling Code and Update of Pooling Usage Guide
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
pr:[[Bugfix]Fixed precision issues caused by pooled request
pooling](https://github.com/vllm-project/vllm-ascend/pull/6049)
readyhttps://github.com/vllm-project/vllm-ascend/pull/6049
read for review
- vLLM version: v0.13.0
- vLLM main:
d68209402d

---------

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Signed-off-by: fangjianwei <f30058701@china.huawei.com>
Signed-off-by: DreamerLeader <88812830+DreamerLeader@users.noreply.github.com>
Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
Co-authored-by: fangjianwei <f30058701@china.huawei.com>
This commit is contained in:
DreamerLeader
2026-02-04 16:35:41 +08:00
committed by GitHub
parent 804a9ec4e6
commit 2dac18afea
2 changed files with 8 additions and 1 deletions

View File

@@ -159,6 +159,7 @@ class KVPoolScheduler:
self._request_trackers.pop(finished_req_id, None)
self._unfinished_requests.pop(finished_req_id, None)
self._unfinished_request_ids.discard(finished_req_id)
self._preempted_req_ids.discard(finished_req_id)
for req_id in scheduler_output.preempted_req_ids:
self._preempted_req_ids.update(scheduler_output.preempted_req_ids)