xc-llm-ascend

Files

XiaoxinWang 579b7e5f21 add pagedattention to support FULL_DECODE_ONLY. (#3102)

### What this PR does / why we need it?
Precompute the workspace memory size that the PagedAttention operator
needs, so that resource cleanup does not deadlock. This PR requires
torch_npu version 0920 or newer.
### How was this patch tested?


- vLLM version: v0.11.0

---------

Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Co-authored-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>

2025-10-10 08:50:33 +08:00

ISSUE_TEMPLATE

[Doc] Release note for v0.11.0rc0 (#3224)

2025-09-30 03:26:18 +08:00

workflows

add pagedattention to support FULL_DECODE_ONLY. (#3102)

2025-10-10 08:50:33 +08:00

actionlint.yaml

[CI] Update pre_commit runner (#2850)

2025-09-10 20:23:25 +08:00

dependabot.yml

[CI] Add dependabot support and labeler workflow (#162)