[P/D] layerwise connector support recompute scheduler (#5900)

### What this PR does / why we need it?
Add recompute scheduler support to the layerwise connector.

NOTE:
Triggering a recompute invokes the tokenizer again, which may lead to
precision fluctuations.
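
The note does not say why a second tokenizer pass can change results. One possible mechanism, offered here only as an assumption, is that decoding generated token IDs to text and encoding that text again is not guaranteed to reproduce the same IDs. The sketch below uses a hypothetical three-entry vocabulary and a greedy longest-match encoder; neither is taken from vLLM or from this PR.

```python
# Hypothetical toy vocabulary; real tokenizers are far larger and use
# different algorithms. This only illustrates the round-trip effect.
VOCAB = {"a": 0, "b": 1, "ab": 2}
ID_TO_TEXT = {token_id: text for text, token_id in VOCAB.items()}


def decode(token_ids):
    """Concatenate the text of each token ID."""
    return "".join(ID_TO_TEXT[token_id] for token_id in token_ids)


def encode(text):
    """Greedy longest-match encoding over VOCAB."""
    ids = []
    pos = 0
    max_len = max(len(piece) for piece in VOCAB)
    while pos < len(text):
        # Try the longest candidate piece first, then shorter ones.
        for length in range(min(max_len, len(text) - pos), 0, -1):
            piece = text[pos:pos + length]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                pos += length
                break
        else:
            raise ValueError(f"cannot encode {text[pos:]!r}")
    return ids


# Suppose the model emitted "a" and "b" as two separate tokens.
generated = [0, 1]
# Re-tokenizing the decoded text merges them into the single token "ab".
retokenized = encode(decode(generated))
print(generated, retokenized)
```

The text is identical in both cases, but the token sequence fed back to the model differs, which is one way a recomputed request could diverge slightly from the original run.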

[RFC]: CDCP Scheduling for Disaggregated Prefilling with KV Cache
Layerwise Push Support
https://github.com/vllm-project/vllm-ascend/issues/4842

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main: bde38c11df

---------

Signed-off-by: liziyu <liziyu16@huawei.com>
Signed-off-by: wangxiaoteng <wangxiaoteng@huawei.com>
Co-authored-by: wangxiaoteng <wangxiaoteng@huawei.com>
liziyu
2026-02-07 15:24:42 +08:00
committed by GitHub
parent d266fd7b47
commit e5f0e0eaf7
2 changed files with 89 additions and 12 deletions


@@ -642,7 +642,7 @@ class RecomputeScheduler(Scheduler):
     EngineCoreOutput(
         request_id=req_info.request_id,
         finish_reason=FinishReason.STOP,
-        new_token_ids=[req_info.output_token_ids[-1]],
+        new_token_ids=[],
         stop_reason="recomputed",
     )
 )
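
The hunk above swaps `new_token_ids=[req_info.output_token_ids[-1]]` for `new_token_ids=[]` in the output that reports a recomputed request. The following is a minimal sketch of the two variants, using hypothetical stand-in dataclasses rather than the real vLLM types. `ReqInfo`, `recompute_output_before`, and `recompute_output_after` are names invented for illustration, and the idea that the last token was already delivered to the consumer is an assumption, not something the PR states.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class FinishReason(Enum):
    # Stand-in for the real enum; only the member used in the hunk.
    STOP = "stop"


@dataclass
class EngineCoreOutput:
    # Stand-in with only the fields that appear in the hunk.
    request_id: str
    new_token_ids: List[int]
    finish_reason: Optional[FinishReason] = None
    stop_reason: Optional[str] = None


@dataclass
class ReqInfo:
    # Hypothetical container for the request state read by the hunk.
    request_id: str
    output_token_ids: List[int] = field(default_factory=list)


def recompute_output_before(req_info: ReqInfo) -> EngineCoreOutput:
    """Old variant: re-sends the last generated token."""
    return EngineCoreOutput(
        request_id=req_info.request_id,
        finish_reason=FinishReason.STOP,
        new_token_ids=[req_info.output_token_ids[-1]],
        stop_reason="recomputed",
    )


def recompute_output_after(req_info: ReqInfo) -> EngineCoreOutput:
    """New variant: reports the recompute without any token."""
    return EngineCoreOutput(
        request_id=req_info.request_id,
        finish_reason=FinishReason.STOP,
        new_token_ids=[],
        stop_reason="recomputed",
    )


req = ReqInfo("req-0", [11, 22, 33])

# Assumption: the consumer has already received every generated token.
streamed_before = list(req.output_token_ids)
streamed_before.extend(recompute_output_before(req).new_token_ids)

streamed_after = list(req.output_token_ids)
streamed_after.extend(recompute_output_after(req).new_token_ids)

print(streamed_before)  # last token appears twice
print(streamed_after)   # unchanged
```

Under that assumption the old variant would append the final token a second time, while the new one leaves the stream unchanged. Independently of that assumption, the old expression raises `IndexError` when `output_token_ids` is empty, which the empty list avoids.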