xc-llm-ascend

Files

wangxiyuan f12f76d7ba Drop 0.10.2 (#3284)

Drop v0.10.2 support; vLLM 0.11.0rc3 is supported now.
- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/releases/v0.11.0

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

2025-10-09 10:28:38 +08:00

__init__.py

[1/N][refactor] torchair fused_moe refactor (#2438)

2025-08-25 15:46:10 +08:00

sequence_parallel.py

[Refactor] [SP] The sequence parallelism characteristics in the MoE and Dense models are integrated into a single solution. (#3085)

2025-09-24 11:29:59 +08:00

shared_weight_layer.py

[1/N][Feat] Cut down memory usage for o_proj in DeepSeek (#2931)

2025-09-24 17:16:41 +08:00

torchair_activation.py

[main] mlp weight prefetch in Qwen Dense Models (#2816)

2025-09-11 21:20:09 +08:00

torchair_fused_moe.py

Drop 0.10.2 (#3284)

2025-10-09 10:28:38 +08:00

torchair_layernorm.py

[main] mlp weight prefetch in Qwen Dense Models (#2816)

2025-09-11 21:20:09 +08:00

torchair_rotary_embedding.py

Fix the bugs about operator registration by PyTorch Dispatcher (#2786)

2025-09-13 11:58:52 +08:00