[Bugfix] PCP adaptation for VLLM v0.11.2 modifications (#4604)

To adapt to the vLLM v0.11.2 image, the PCP size and DCP size are now
read from parallel_config with getattr, falling back to 1 when the
attribute is not present.
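
A minimal sketch of the lookup pattern used in this change, for illustration only. The helper name `get_cp_sizes` and the `SimpleNamespace` stand-ins for `vllm_config.parallel_config` are hypothetical and are not part of vLLM or of this commit; only the two attribute names and the default of 1 come from the diff.

```python
from types import SimpleNamespace


def get_cp_sizes(parallel_config):
    """Return (pcp_size, dcp_size), using 1 for any attribute that is absent."""
    pcp_size = getattr(parallel_config, "prefill_context_parallel_size", 1)
    dcp_size = getattr(parallel_config, "decode_context_parallel_size", 1)
    return pcp_size, dcp_size


# Hypothetical stand-in configs (not real vLLM objects): one that lacks the
# PCP field and one that defines both fields.
config_without_pcp = SimpleNamespace(decode_context_parallel_size=2)
config_with_both = SimpleNamespace(
    prefill_context_parallel_size=4,
    decode_context_parallel_size=2,
)

print(get_cp_sizes(config_without_pcp))
print(get_cp_sizes(config_with_both))
```

With a default of 1, a config that lacks either field behaves as if that form of context parallelism were disabled, so a later check such as `if self.pcp_size > 1:` is simply skipped rather than raising `AttributeError`.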
___
- vLLM version: v0.11.2

---------

Signed-off-by: SlightwindSec <slightwindsec@gmail.com>
Slightwind
2025-12-01 19:20:32 +08:00
committed by GitHub
parent 0d14f635b4
commit aa56a0f4b7


@@ -29,8 +29,10 @@ class KVPoolScheduler:
"load_async", False)
# request_id -> (vllm cached tokens, kvpool cached tokens)
self.load_specs: dict[str, LoadSpec] = {}
-self.pcp_size = vllm_config.parallel_config.prefill_context_parallel_size
-self.dcp_size = vllm_config.parallel_config.decode_context_parallel_size
+self.pcp_size = getattr(vllm_config.parallel_config,
+                        "prefill_context_parallel_size", 1)
+self.dcp_size = getattr(vllm_config.parallel_config,
+                        "decode_context_parallel_size", 1)
self._block_size = vllm_config.cache_config.block_size
if self.pcp_size > 1: