[Refactor] remove some metadata variables in attention_v1. (#5160)
RFC: https://github.com/vllm-project/vllm-ascend/issues/4629
Reason:
The attention metadata dataclass carries too many variables. We will
inherit from the community's metadata class and, at the same time,
remove the variables that are no longer needed.
Todo:
1. Partly remove attn_state.
- vLLM version: v0.12.0
- vLLM main: ad32e3e19c
---------
Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
Co-authored-by: weijinqian_v1 <weijinqian@huawei.com>
@@ -247,7 +247,9 @@ class XliteWrapper:
         if not with_prefill or self.full_mode:
             batch = attn_metadata.num_prefills + attn_metadata.num_decodes
             seq_lens = attn_metadata.seq_lens[:batch]
-            query_lens = attn_metadata.query_lens[:batch]
+            query_lens = attn_metadata.query_start_loc_cpu[
+                1:] - attn_metadata.query_start_loc_cpu[:-1]
+            query_lens = query_lens[:batch]
             cached_lens = seq_lens - query_lens

             xlite_attn_metadata = ModelAttnMeta()
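
For reviewers: the hunk above stops reading a stored query_lens field and instead derives the per-request query lengths from the cumulative start offsets in query_start_loc_cpu. The following is a minimal sketch of that arithmetic in plain Python lists rather than tensors. The helper name query_lens_from_start_loc and the sample values are hypothetical and are not part of the patch.

```python
def query_lens_from_start_loc(query_start_loc):
    """Return per-request query lengths from cumulative start offsets.

    query_start_loc has one more entry than there are requests:
    entry i is the offset where request i's query tokens begin,
    and the final entry is the total number of query tokens.
    """
    # Same idea as `x[1:] - x[:-1]` on a tensor: adjacent differences.
    return [
        end - start
        for start, end in zip(query_start_loc[:-1], query_start_loc[1:])
    ]


# Hypothetical batch: three requests with 4, 1 and 1 query tokens.
query_start_loc = [0, 4, 5, 6]
seq_lens = [10, 8, 3]
batch = 3

query_lens = query_lens_from_start_loc(query_start_loc)[:batch]

# Tokens already cached = total sequence length minus new query tokens.
cached_lens = [s - q for s, q in zip(seq_lens, query_lens)]
```

Because the offsets are cumulative, the adjacent differences recover exactly the lengths a separate query_lens field would have stored, which is presumably why the field can be dropped from the metadata class.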