xc-llm-ascend/vllm_ascend/worker
Yizhou 54bd531db8 [v0.11.0][Fix] Fix attention metadata handling for profiling and MLA (#3636) (#3643)
### What this PR does / why we need it?
This is a port of #3636.

Move the creation of dummy attention metadata to occur after the ACL
graph runtime mode is determined. This ensures the metadata is
initialized with the correct configuration during a profile run.
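
The ordering described above can be sketched as follows. This is a minimal illustration, not the actual vllm-ascend code: `RuntimeMode`, `DummyAttnMetadata`, `resolve_runtime_mode`, and `dummy_run` are hypothetical names, and the rule used to pick the mode is an assumption made only to keep the example runnable.

```python
from dataclasses import dataclass
from enum import Enum


class RuntimeMode(Enum):
    """Hypothetical stand-in for the ACL graph runtime mode."""
    NONE = 0
    FULL = 1


@dataclass
class DummyAttnMetadata:
    """Hypothetical placeholder, not the real attention metadata class."""
    num_tokens: int
    runtime_mode: RuntimeMode


def resolve_runtime_mode(num_tokens: int, captured_sizes: set[int]) -> RuntimeMode:
    # Assumed rule for illustration: use graph mode only for captured batch sizes.
    return RuntimeMode.FULL if num_tokens in captured_sizes else RuntimeMode.NONE


def dummy_run(num_tokens: int, captured_sizes: set[int]) -> DummyAttnMetadata:
    # Step 1: determine the runtime mode first.
    mode = resolve_runtime_mode(num_tokens, captured_sizes)
    # Step 2: only then build the dummy metadata, so it reflects that mode.
    return DummyAttnMetadata(num_tokens=num_tokens, runtime_mode=mode)
```

Building the metadata before step 1 would force it to assume a mode that may not match the one actually selected for the profile run.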

Additionally, remove the `attn_metadata` existence check before updating
MLA attention parameters. This change prevents the update from being
skipped when metadata is not yet available, ensuring parameters are set
correctly.
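
The effect of dropping the guard can be illustrated with a small sketch. The function names and the dictionary-based parameters below are hypothetical and do not mirror the real MLA update code; they only show how an existence check can leave parameters stale.

```python
from typing import Optional


def update_mla_params_guarded(
    params: dict, attn_metadata: Optional[dict], num_tokens: int
) -> dict:
    # Guarded shape: the update is silently skipped when metadata is missing.
    if attn_metadata is not None:
        params["num_tokens"] = num_tokens
    return params


def update_mla_params(params: dict, num_tokens: int) -> dict:
    # Unguarded shape: parameters are always refreshed.
    params["num_tokens"] = num_tokens
    return params


stale = update_mla_params_guarded({"num_tokens": 0}, None, 4)
fresh = update_mla_params({"num_tokens": 0}, 4)
```

In this sketch `stale` keeps its old value because the metadata was `None`, while `fresh` is updated regardless.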

### Does this PR introduce _any_ user-facing change?
None.

### How was this patch tested?
None.

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
2025-10-23 10:29:30 +08:00