xc-llm-ascend

Files

whx dc960e798e [BugFix] Fix mlapo accuracy problem related with weight processing. (#3850)

This PR fixes an accuracy problem in mlapo related to weight processing.
It also adds back the mlapo-related e2e test with a quantized DeepSeek
model.


- vLLM version: v0.11.0rc3
- vLLM main: 83f478bb19

Signed-off-by: whx-sjtu <2952154980@qq.com>

2025-10-30 00:34:55 +08:00

__init__.py

[Core] Make V1 work and enable V1 engine test (#389)

2025-03-28 19:34:23 +08:00

attention_mask.py

support prefill cache mode use fia op (#3696)

2025-10-27 19:41:07 +08:00

attention_v1.py

[long_seq_optim] BSND to TND and FA_UPDATE replacement (#3778)

2025-10-29 09:33:35 +08:00

mla_v1.py

[BugFix] Fix mlapo accuracy problem related with weight processing. (#3850)

2025-10-30 00:34:55 +08:00

sfa_v1.py

[main] remove dbo code (#3712)

2025-10-25 15:53:01 +08:00

utils.py

[main] remove dbo code (#3712)

2025-10-25 15:53:01 +08:00