[Feat](sfa,dcp) support dcp for sfa (#6563)

### What this PR does / why we need it?
This PR adds DCP support to the SFA backend.

Please note that due to operator constraints, the current implementation
has to all-gather the entire KV cache and modify the block table to
satisfy the operator input requirements. This results in significantly
increased communication overhead and peak memory usage. Therefore, this
is only a temporary workaround and will be refactored once the operator
provides proper support.
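
To make the workaround concrete, here is a minimal sketch of the idea, not the code from this PR. It assumes that logical block `j` of a sequence lives on rank `j % dcp_size` and that the all-gather concatenates each rank's cache in rank order. `all_gather_blocks` and `remap_block_table` are hypothetical helpers, and plain lists stand in for tensors and the collective.

```python
from typing import List


def all_gather_blocks(per_rank_cache: List[List[str]]) -> List[str]:
    """Stand-in for a collective all-gather: concatenate every rank's blocks."""
    gathered: List[str] = []
    for rank_blocks in per_rank_cache:
        gathered.extend(rank_blocks)
    return gathered


def remap_block_table(local_tables: List[List[int]], num_local_blocks: int) -> List[int]:
    """Build one block table that indexes into the gathered cache.

    local_tables[r] lists the local block ids that rank r holds for a
    sequence, in order. ASSUMPTION: logical block j sits on rank
    j % dcp_size, at position j // dcp_size of that rank's table.
    """
    dcp_size = len(local_tables)
    total_blocks = sum(len(table) for table in local_tables)
    global_table: List[int] = []
    for logical in range(total_blocks):
        rank = logical % dcp_size
        slot = logical // dcp_size
        local_id = local_tables[rank][slot]
        # After concatenation, rank r's blocks start at r * num_local_blocks.
        global_table.append(rank * num_local_blocks + local_id)
    return global_table


# Two ranks with four cache slots each; the sequence spans five logical blocks.
per_rank_cache = [
    ["b4", "b0", "-", "b2"],  # rank 0 holds logical blocks 0, 2, 4
    ["b3", "-", "b1", "-"],   # rank 1 holds logical blocks 1, 3
]
local_tables = [[1, 3, 0], [2, 0]]

gathered = all_gather_blocks(per_rank_cache)
global_table = remap_block_table(local_tables, num_local_blocks=4)
ordered = [gathered[i] for i in global_table]
```

In this toy layout the gathered cache holds `dcp_size` times as many blocks as a single rank, which illustrates where the extra communication and peak memory come from.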

Additionally, because of the above limitations,
`cp_kv_cache_interleave_size` is currently required to be equal to
`block_size`. This restriction will also be removed after the refactor.
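
The restriction can be expressed as a simple configuration check. The sketch below is illustrative only: `check_interleave_size` is a hypothetical helper, not a function from vLLM or this PR, and it assumes the constraint only matters when DCP is enabled (`dcp_size > 1`).

```python
def check_interleave_size(cp_kv_cache_interleave_size: int,
                          block_size: int,
                          dcp_size: int) -> None:
    """Reject configurations the temporary SFA + DCP path cannot handle.

    ASSUMPTION: the constraint applies only when dcp_size > 1.
    """
    if dcp_size > 1 and cp_kv_cache_interleave_size != block_size:
        raise ValueError(
            "SFA with DCP currently requires cp_kv_cache_interleave_size "
            f"({cp_kv_cache_interleave_size}) == block_size ({block_size})"
        )
```

One plausible reading of the constraint is that, with both sizes equal, every KV cache block belongs to exactly one rank, so the block table can be rewritten at block granularity.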

#### Test
Accuracy test using DeepSeek-V3.2-Exp-W8A8 with dp2tp8dcp8 (DP=2, TP=8, DCP=8):

| dataset | version | metric | mode | vllm-api-general-stream |
|----- | ----- | ----- | ----- | -----|
| gsm8kdataset | - | accuracy | gen | 96.35 |

- vLLM version: v0.15.0
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.15.0

---------

Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
Qiu
2026-02-09 18:52:25 +08:00
committed by GitHub
parent 80e5812b39
commit cb7c419bc0
5 changed files with 190 additions and 13 deletions


@@ -77,6 +77,9 @@ jobs:
 - name: multi-node-qwenw8a8-2node-longseq
   config_file_path: Qwen3-235B-W8A8-longseq.yaml
   size: 2
+- name: multi-node-deepseek-V3_2-W8A8-cp
+  config_file_path: DeepSeek-V3_2-W8A8-cp.yaml
+  size: 2
 - name: multi-node-qwen-disagg-pd
   config_file_path: Qwen3-235B-disagg-pd.yaml
   size: 2