revert TND modify when dcp pcp (#3948)

### What this PR does / why we need it?
1. Revert the TND modification made for the DCP/PCP case, which was introduced by f57bdb09fc
2. Fix the aclgraph padding boundary issue
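
A minimal sketch of the padding step that item 2 refers to, assuming it amounts to extending the per-request KV lengths with zeros up to the captured graph size (`runtime_shape`), as the `pad_length` computation in the diff suggests. The helper name `pad_seq_lengths` and the sample values are hypothetical, plain lists stand in for the NumPy arrays used in the real code, and the exact boundary case fixed by this PR is not reproduced here.

```python
def pad_seq_lengths(actual_seq_lengths_kv, runtime_shape):
    """Pad per-request KV lengths with zeros up to runtime_shape.

    Hypothetical helper for illustration only; not part of vllm-ascend.
    """
    pad_length = runtime_shape - len(actual_seq_lengths_kv)
    if pad_length < 0:
        # Assumption: the graph shape must cover every live request.
        raise ValueError("runtime_shape is smaller than the number of requests")
    # pad_length == 0 is the boundary case: nothing is appended.
    return list(actual_seq_lengths_kv) + [0] * pad_length


# Hypothetical values: three live requests replayed in a graph of size 5.
padded = pad_seq_lengths([7, 12, 3], runtime_shape=5)
print(padded)  # [7, 12, 3, 0, 0]
```

When the number of requests already equals `runtime_shape`, the sketch returns the lengths unchanged rather than appending an empty pad.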

- vLLM version: v0.11.0
- vLLM main: 83f478bb19

Signed-off-by: weiguihua2 <weiguihua2@huawei.com>
weiguihua2
2025-11-03 22:22:17 +08:00
committed by GitHub
parent cc2cd42ad3
commit 5453033a41
3 changed files with 27 additions and 17 deletions

@@ -301,9 +301,9 @@ def update_attn_dcp_pcp_params(update_stream, forward_context, runtime_shape):
 ):
 (q_nope, k_nope, value, num_heads, num_kv_heads, scale,
 block_table, block_size, actual_seq_lengths_kv, attn_output,
-softmax_lse, cp_rank, dcp_rank, dcp_size) = param
+softmax_lse, pcp_rank, dcp_rank, dcp_size) = param
 actual_seq_lengths_kv = forward_context.attn_metadata[
-key].decode_meta.num_computed_tokens_of_pcp_dcp[:, cp_rank,
+key].decode_meta.num_computed_tokens_of_pcp_dcp[:, pcp_rank,
 dcp_rank]
 pad_length = runtime_shape - len(actual_seq_lengths_kv)
 pad_tensor = np.zeros(pad_length,