[P/D] Fix node crash caused by a second release request from force-free (#5968)

### What this PR does / why we need it?
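The force-free path can issue a second release request for a request
that has already been released, which crashes the node. When a request
is pulled so quickly that it is no longer in `reqs_to_process` by the
time `add_delayed_request` runs, it should not be added to the
delayed-free queue.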
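
The guard can be illustrated with a minimal sketch. `TrackerSketch` is a hypothetical stand-in, not the real `KVCacheTaskTracker`; it assumes `reqs_to_process` is a set of request ids and `delayed_free_requests` maps a request id to its delay start time, which the diff does not confirm.

```python
import threading
import time


class TrackerSketch:
    """Hypothetical stand-in for KVCacheTaskTracker; not the real class."""

    def __init__(self) -> None:
        self.done_task_lock = threading.Lock()
        # Assumption: a set of request ids that have not been pulled yet.
        self.reqs_to_process: set[str] = set()
        # Assumption: maps request id -> time the delay started.
        self.delayed_free_requests: dict[str, float] = {}

    def add_delayed_request(self, request_id: str, delay_start_time: float) -> None:
        """Queue a delayed free only for requests that are still tracked."""
        with self.done_task_lock:
            if request_id in self.reqs_to_process:
                self.delayed_free_requests[request_id] = delay_start_time


tracker = TrackerSketch()
tracker.reqs_to_process.add("req-pending")

now = time.time()
tracker.add_delayed_request("req-pending", now)         # still tracked: queued
tracker.add_delayed_request("req-already-pulled", now)  # not tracked: ignored

queued = sorted(tracker.delayed_free_requests)
print(queued)  # ['req-pending']
```

In this sketch, the request that was already pulled never enters the delayed-free queue, so a later expiry pass has nothing to release for it a second time.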

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By CI.

- vLLM version: v0.13.0
- vLLM main: 2c24bc6996

Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
wangxiaoteng888
2026-01-17 18:49:27 +08:00
committed by GitHub
parent 22f253142a
commit fff5df3efe

@@ -152,7 +152,8 @@ class KVCacheTaskTracker:
     def add_delayed_request(self, request_id: str, delay_start_time: float):
         """Add a delayed free request."""
         with self.done_task_lock:
-            self.delayed_free_requests[request_id] = delay_start_time
+            if request_id in self.reqs_to_process:
+                self.delayed_free_requests[request_id] = delay_start_time
 
     def _retrieve_expired_requests(self):
         """Retrieve all expired delayed requests."""