xc-llm-ascend

Files

whx bd11c0054f [BugFix] Fix torchair+mtp bug after deleting deepseek_mtp. (#3590)

This adds a missing fix for a bug introduced by PR #3561.

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: whx-sjtu <2952154980@qq.com>

2025-10-21 22:23:52 +08:00

models

[BugFix] Fix torchair+mtp bug after deleting deepseek_mtp. (#3590)

2025-10-21 22:23:52 +08:00

ops

[BugFix] Support redundant experts in EPLB (#3473)

2025-10-18 00:09:16 +08:00

quantization

[Feat][quantization] Support new version w4a8 dynamic quantization for Linear layers (#3311)

2025-10-21 20:18:39 +08:00

__init__.py

[2/4][Refactor] Refactor torchair utils (#1892)

2025-07-21 19:43:30 +08:00

test_torchair_attention.py

[Bugfix]:replace npu_incre_flash_attention with npu_fused_infer_atten… (#2901)

2025-09-18 14:06:08 +08:00

test_torchair_mla.py

[Model][1/N] Delete deepseek v2/v3 modeling codes. (#3189)

2025-10-20 15:31:34 +08:00

test_utils.py

[Feat] Unquantized Linear to nz and control all nz-cast (#3356)

2025-10-14 17:39:26 +08:00