xc-llm-ascend

Files

whx dc960e798e [BugFix] Fix mlapo accuracy problem related with weight processing. (#3850 )

This PR fixes an accuracy problem in mlapo related to weight processing.
It also adds back the mlapo-related e2e test with a quantized DeepSeek
model.


- vLLM version: v0.11.0rc3
- vLLM main: 83f478bb19

Signed-off-by: whx-sjtu <2952154980@qq.com>

2025-10-30 00:34:55 +08:00

attention

[BugFix] Fix mlapo accuracy problem related with weight processing. (#3850 )

2025-10-30 00:34:55 +08:00

compilation

[feat]dcp pcp support aclgraph (#3731 )

2025-10-27 09:58:23 +08:00

core

Upgrade to new vllm commit (#3719 )

2025-10-25 15:36:32 +08:00

device_allocator

[Misc]Clean up useless import from vllm (#2049 )

2025-07-28 16:01:59 +08:00

distributed

[Bugfix]fix_mulit_connector_bug (#3332 )

2025-10-29 23:23:06 +08:00

eplb

[CI]Add EPLB CI. (#3568 )

2025-10-21 22:58:02 +08:00

kv_offload

Upgrade to 0.11.1 newest vllm commit (#3762 )

2025-10-28 14:55:03 +08:00

lora

[1/N][Refactor] Refactor code to adapt with vllm main (#3612 )

2025-10-24 16:55:08 +08:00

model_loader

Upgrade to new vllm commit (#3719 )

2025-10-25 15:36:32 +08:00

models

Upgrade to 0.11.1 newest vllm commit (#3762 )

2025-10-28 14:55:03 +08:00

ops

[Perf] Delete redundant operations in model_runner and forward_context (#3677 )

2025-10-29 15:59:55 +08:00

patch

Upgrade to new vllm commit (#3719 )

2025-10-25 15:36:32 +08:00

quantization

【Bugfix】bugfix for weight load of kimi-k2 (#3798 )

2025-10-27 21:18:35 +08:00

sample

Upgrade to 0.11.1 newest vllm commit (#3762 )

2025-10-28 14:55:03 +08:00

spec_decode

Upgrade to 0.11.1 newest vllm commit (#3762 )

2025-10-28 14:55:03 +08:00

torchair

[BugFix] deepseek torchair adapt for torch_npu version (#3862 )

2025-10-29 22:39:34 +08:00

worker

bugfix for mtp fullgraph (#3845 )

2025-10-29 23:50:13 +08:00

__init__.py

[1/N][Refactor] Refactor code to adapt with vllm main (#3612 )

2025-10-24 16:55:08 +08:00

ascend_config.py

[1/N][Refactor] Refactor code to adapt with vllm main (#3612 )

2025-10-24 16:55:08 +08:00

ascend_forward_context.py

[Perf] Delete redundant operations in model_runner and forward_context (#3677 )

2025-10-29 15:59:55 +08:00

cpu_binding.py

[main] support cpu binding (#3546 )

2025-10-21 09:17:03 +08:00

envs.py

[main] remove dbo code (#3712 )

2025-10-25 15:53:01 +08:00

meta_registration.py

Fix the bugs about operator registration by PyTorch Dispatcher (#2786 )

2025-09-13 11:58:52 +08:00

platform.py

bugfix for mtp fullgraph (#3845 )

2025-10-29 23:50:13 +08:00

utils.py

bugfix for mtp fullgraph (#3845 )

2025-10-29 23:50:13 +08:00