### What this PR does / why we need it?
This PR removes the unused `make_attention_mask` function from
`vllm_ascend/worker/v2/attn_utils.py`.
**Why it's dead code:**
- After PR #4870 (attention mask unification refactor), attention mask
generation has been centralized in the `AttentionMaskBuilder` singleton
class
- The mask is now generated directly by metadata builders when needed
(e.g., `AscendAttentionMetadataBuilder`, `AscendMLAMetadataBuilder`)
- The `make_attention_mask` function is no longer called anywhere in the
codebase
- The function's parameters (including `attn_mask` and `spec_attn_mask`)
were also removed from `build_attn_metadata` in the same refactor
**Changes:**
- Remove `make_attention_mask` function (24 lines) from
`vllm_ascend/worker/v2/attn_utils.py`
### Does this PR introduce _any_ user-facing change?
No. This is a code cleanup that removes dead code. No user-facing
behavior changes.
### How was this patch tested?
- Verified that `make_attention_mask` is not called anywhere in the
codebase (via `grep`)
- CI tests pass to ensure no regressions
- The function has been unused since PR #4870 was merged
- vLLM version: v0.13.0
- vLLM main:
2f4e6548ef
Signed-off-by: lico67373 <918688502@qq.com>
Co-authored-by: weijinqian0 <1184188277@qq.com>
[Experimental] Model Runner V2
This directory contains the new model runner which is under active development.
please see Model Runner V2 to get specific plans.