xc-llm-ascend

Files

wangxiyuan f12f76d7ba Drop 0.10.2 (#3284)

Drop v0.10.2 support; vLLM 0.11.0rc3 is now supported.
- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/releases/v0.11.0

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

2025-10-09 10:28:38 +08:00

__init__.py

Add DeepSeek V3.2 support (#3270)

2025-09-30 03:25:58 +08:00

patch_attention_layer.py

Drop 0.10.2 (#3284)

2025-10-09 10:28:38 +08:00

patch_attention_selector.py

Drop 0.10.2 (#3284)

2025-10-09 10:28:38 +08:00

patch_attentionspec.py

Add DeepSeek V3.2 support (#3270)

2025-09-30 03:25:58 +08:00

patch_distributed.py

[BugFix] add all2all when dp_size > 1 && downgrade npu_dequant_swiglu_quant (#819)

2025-05-15 09:19:55 +08:00

patch_logits.py

[Bugfix][LoRA][Patch] Fix the LoRA inference bug after upstream vLLM codebase changed (#2560)

2025-08-28 10:40:51 +08:00

patch_minicpm.py

[Model][MiniCPM] support MiniCPM (#645)

2025-04-27 11:27:24 +08:00

patch_triton.py

[2/N][Refactor][Qwen3-Next] remove redundant methods and patch methods in Qwen3NextGatedDeltaNet (#3082)

2025-09-24 11:25:42 +08:00

patch_weight_loader.py

Drop 0.10.2 (#3284)

2025-10-09 10:28:38 +08:00