xc-llm-ascend/vllm_ascend at 9eb62935b8f8c8bb7f1a9296fd73f3babc64f755

EngineX/xc-llm-ascend

XiaoxinWang 9eb62935b8 fix pagedattention to support fullgraph. (#3436 )

### What this PR does / why we need it?
Calculate in advance the workspace memory size needed for the
PagedAttention operator to avoid deadlocks during resource cleanup. This
PR requires torch_npu version 0920 or newer.
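
A minimal sketch of how the version requirement above could be checked before enabling full-graph mode. It assumes that "0920" refers to a build stamped 2025-09-20 and that the version string carries a `.devYYYYMMDD` suffix; neither is confirmed by this commit message, and `build_date` and `meets_minimum` are hypothetical helper names, not part of vllm_ascend or torch_npu.

```python
import re
from datetime import date
from typing import Optional

# Assumption: "0920" in the PR text means a build stamped 2025-09-20.
MIN_BUILD_DATE = date(2025, 9, 20)

# Assumption: dev builds embed their date as ".devYYYYMMDD".
_DEV_STAMP = re.compile(r"\.dev(\d{4})(\d{2})(\d{2})")


def build_date(version: str) -> Optional[date]:
    """Return the date stamp embedded in a version string, or None if absent."""
    match = _DEV_STAMP.search(version)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    return date(year, month, day)


def meets_minimum(version: str, minimum: date = MIN_BUILD_DATE) -> Optional[bool]:
    """True or False when a date stamp is present, None when it cannot be determined."""
    stamp = build_date(version)
    if stamp is None:
        return None
    return stamp >= minimum
```

The version string could be obtained with `importlib.metadata.version("torch-npu")`, assuming that is the installed distribution name. Returning `None` for unstamped versions leaves the caller to decide how to treat release builds instead of guessing.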
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Co-authored-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>

2025-10-14 16:10:09 +08:00

fix pagedattention to support fullgraph. (#3436 )

2025-10-14 16:10:09 +08:00

fix pagedattention to support fullgraph. (#3436 )

2025-10-14 16:10:09 +08:00

[BugFix] Fix ascend scheduler assert error (#3191 )

2025-09-28 18:22:08 +08:00

device_allocator

[Misc]Clean up useless import from vllm (#2049 )

2025-07-28 16:01:59 +08:00

[Feature] mooncake connector support GQA transport (#2947 )

2025-10-13 15:48:37 +08:00

Bugfix: Expose the user policy type interface (#3336 )

2025-10-11 16:28:57 +08:00

[Bugfix][LoRA] Fix forward error and shape mismatch when using LoRA (#3153 )

2025-09-28 17:30:50 +08:00

[MoE] [Refactor] Combine common_fused_moe and fused_moe (#3176 )

2025-10-09 14:12:46 +08:00

[Quickfix] update CachedRequestState as NewRequestData changed (#2367 )

2025-08-15 07:35:27 +08:00

[Feature] optimize sp & qwen3 next support sp. (#3225 )

2025-10-13 23:02:12 +08:00

[feat] support customized and separated hccl_buffer_size for process group initialization (#3073 )

2025-10-11 15:55:22 +08:00

[Feature] Add W4A4 Flat Quantization support (#3427 )

2025-10-13 23:20:16 +08:00

Drop 0.10.2 (#3284 )

2025-10-09 10:28:38 +08:00

bugfix for mtp (#3300 )

2025-10-09 19:22:46 +08:00

[Feature] optimize sp & qwen3 next support sp. (#3225 )

2025-10-13 23:02:12 +08:00

[Fix] Fix mc2_tokens_capacity-related issues (#3411 )

2025-10-14 10:56:12 +08:00

__init__.py

【bugfix】fix connector register failed (#3335 )

2025-10-09 21:09:54 +08:00

ascend_config.py

Bugfix: Expose the user policy type interface (#3336 )

2025-10-11 16:28:57 +08:00

ascend_forward_context.py

Revert PTA upgrade PR (#3352 )

2025-10-10 14:09:53 +08:00

envs.py

Add DeepSeek V3.2 support (#3270 )

2025-09-30 03:25:58 +08:00

meta_registration.py

Fix the bugs about operator registration by PyTorch Dispatcher (#2786 )

2025-09-13 11:58:52 +08:00

platform.py

[Feat][Graph]Support FULL_DECEDE_ONLY mode for MLA models (#3125 )

2025-10-10 16:31:20 +08:00

utils.py

[Feat] enable hierarchical communication for mc2 ops on A2 (#3015 )

2025-10-13 16:13:17 +08:00