Files
xc-llm-ascend/vllm_ascend
Shanshan Shen 55d0790597 [2/N][Refactor] Refactor V1 attention for better extensibility (#1995)
### What this PR does / why we need it?

Refactor V1 Attention for better extensibility (prepared for torchair
attention refactor).

**Main changes:**
- Move different kinds of foward into their method respectively, e.g.,
`_forward_prefill_no_cache()`, `_forward_prefill_cache_hit()`,
`_forward_decode_only()`, `_forward_v1_style()`.

### Does this PR introduce _any_ user-facing change?

No.

- vLLM version: v0.10.0
- vLLM main:
14a5d903ab

Signed-off-by: shen-shanshan <467638484@qq.com>
2025-08-14 09:32:41 +08:00
..
2025-08-07 17:19:23 +08:00