[3/N][Refactor] Move torchair_attention to torchair dir (#2017)
### What this PR does / why we need it?
1. Move `torchair_attention` to the `torchair` directory.
2. Make `AscendAttentionTorchairBackend` extend `AscendAttentionBackend`
to avoid duplicating methods.
3. Make `AscendTorchairMetadata` extend `AscendMetadata` to avoid
duplicating properties.
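
A minimal sketch of the inheritance pattern described in items 2 and 3. It is not the code from this PR: the class names `BaseMetadata`, `TorchairMetadata`, `BaseBackend`, and `TorchairBackend` are hypothetical stand-ins for the real classes, and every field and method shown (`decode_info`, `get_name`, `get_kv_cache_shape`) is an assumption made for illustration only.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass
class BaseMetadata:
    # Hypothetical stand-in for AscendMetadata; field names are illustrative.
    num_actual_tokens: int = 0
    max_query_len: int = 0


@dataclass
class TorchairMetadata(BaseMetadata):
    # Hypothetical stand-in for AscendTorchairMetadata. Only the extra,
    # backend-specific field is declared; the rest is inherited.
    decode_info: Optional[str] = None


class BaseBackend:
    # Hypothetical stand-in for AscendAttentionBackend.
    @staticmethod
    def get_name() -> str:
        return "BASE"

    @staticmethod
    def get_kv_cache_shape(num_blocks: int, block_size: int,
                           num_kv_heads: int, head_size: int) -> Tuple[int, ...]:
        # Shared helper (illustrative shape) that the subclass does not repeat.
        return (2, num_blocks, block_size, num_kv_heads, head_size)


class TorchairBackend(BaseBackend):
    # Hypothetical stand-in for AscendAttentionTorchairBackend.
    # Overrides only what differs and inherits get_kv_cache_shape unchanged.
    @staticmethod
    def get_name() -> str:
        return "TORCHAIR"
```

With this layout, a change to a shared field or helper is made once in the base class and picked up by the subclass automatically.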
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.10.0
- vLLM main: 0933f9d518
---------
Signed-off-by: shen-shanshan <467638484@qq.com>
@@ -169,7 +169,9 @@ class AscendAttentionMetadataBuilder:
               num_actual_tokens,
               max_query_len,
               enable_dbo_across_dp: bool = False,
-              is_only_prefill: bool = False):
+              is_only_prefill: bool = False,
+              *args,
+              **kwargs):
 
         block_table = self.runner.input_batch.block_table[0].get_device_tensor(
         )
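
The hunk above widens the builder signature with `*args` and `**kwargs`. The commit message does not state why. One plausible reading is that a caller can then pass the same arguments to either the base builder or a subclass that needs extra ones. The sketch below illustrates that idea under stated assumptions: the method name `build`, the classes `BaseBuilder` and `TorchairBuilder`, and the extra parameter `graph_pad_size` are all hypothetical and do not come from the diff.

```python
class BaseBuilder:
    # Hypothetical stand-in for AscendAttentionMetadataBuilder.
    def build(self,
              num_actual_tokens,
              max_query_len,
              enable_dbo_across_dp: bool = False,
              is_only_prefill: bool = False,
              *args,
              **kwargs):
        # Extra positional or keyword arguments are accepted and ignored,
        # so callers may pass options that only a subclass understands.
        return {
            "num_actual_tokens": num_actual_tokens,
            "max_query_len": max_query_len,
            "is_only_prefill": is_only_prefill,
        }


class TorchairBuilder(BaseBuilder):
    # Hypothetical subclass that consumes one extra keyword argument.
    def build(self, num_actual_tokens, max_query_len,
              graph_pad_size: int = -1, **kwargs):
        metadata = super().build(num_actual_tokens, max_query_len, **kwargs)
        metadata["graph_pad_size"] = graph_pad_size
        return metadata


def run(builder):
    # The call site is identical for both builders.
    return builder.build(num_actual_tokens=8, max_query_len=4,
                         graph_pad_size=16)
```

Here `run(BaseBuilder())` silently drops `graph_pad_size`, while `run(TorchairBuilder())` records it. The trade-off is that misspelled keyword arguments are also swallowed without an error.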