[v0.11.0-dev][bugfix] Fix a bug in wrongly set npu_stream (#4106)

### What this PR does / why we need it?
This PR fixes a bug introduced in #3985, which set the wrong npu_stream
(possibly due to a mistake during cherry-pick). It corrects the stream and
makes `update_attn_params` consistent with the main branch.

### Does this PR introduce _any_ user-facing change?
No.

Signed-off-by: Angazenn <supperccell@163.com>
Angazenn
2025-11-11 09:16:41 +08:00
committed by GitHub
parent c5fe179cef
commit 2069bef449

@@ -231,8 +231,6 @@ def update_attn_params(update_stream, forward_context, runtime_shape):
block_table=block_table,
context_lens=seq_lens,
out=output)
with torch.npu.stream(update_stream):
torch.npu.graph_task_update_begin(update_stream, handle)
torch_npu._npu_paged_attention(query=query,
key_cache=key_cache,