Files
xc-llm-ascend/vllm_ascend/distributed/kv_transfer
pz1116 a7f91fce71 [KV Pool]get_num_new_matched_tokens return 0 if token length < block_size (#7146)
### What this PR does / why we need it?
Currently, we call lookup_client for looking up token hit in KV Pool,
however, when token length < block size, the key will be empty and there
is no point to lookup in KV Pool backend since there will never be a
hit.
Hence, add early return in `get_num_new_matched_tokens` when `token_len`
< `block_size`

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.16.0
- vLLM main:
4034c3d32e

---------

Signed-off-by: Pz1116 <zpbzpb123123@gmail.com>
Co-authored-by: fems14 <1804143737@qq.com>
2026-03-11 15:05:34 +08:00
..