[Bugfix] Fix DeepSeek precision issue and add accuracy CI for it (#905)

### What this PR does / why we need it?
Fix the DeepSeek precision issue on V0 and add an accuracy CI test for it.

Fixes https://github.com/vllm-project/vllm-ascend/issues/1062

### How was this patch tested?
CI passed with the newly added test.

Signed-off-by: MengqingCao <cmq0113@163.com>
2025-06-04 20:26:44 +08:00
parent da9acfca60
commit afc4c0cd03
9 changed files with 121 additions and 43 deletions
--- a/vllm_ascend/quantization/w8a8_dynamic.py
+++ b/vllm_ascend/quantization/w8a8_dynamic.py
@@ -624,6 +624,8 @@ class AscendW8A8DynamicFusedMoEMethod:
        if enable_force_load_balance:
            topk_ids = torch.randint_like(topk_ids, 0, global_num_experts)

+        topk_weights = topk_weights.to(x.dtype)
+
        if VLLM_ENABLE_MC2 and not is_prefill:
            return fused_experts_with_mc2(
                hidden_states=x,
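
The added line casts the routing weights to the dtype of the activations before the expert kernels run. A minimal sketch of what that cast does is below. It assumes, for illustration only, that the routing weights come from a float32 softmax while the hidden states are bfloat16; the tensor shapes and the way `topk_weights` is produced here are hypothetical and are not taken from this PR.

```python
import torch

# Hypothetical shapes: 4 tokens, hidden size 8, 16 experts, top-2 routing.
x = torch.randn(4, 8, dtype=torch.bfloat16)              # activations (assumed bf16)
router_logits = torch.randn(4, 16, dtype=torch.float32)  # router output (assumed fp32)

# Assumed routing step: softmax over experts, then keep the top-2 weights.
probs = torch.softmax(router_logits, dim=-1)
topk_weights, topk_ids = torch.topk(probs, k=2, dim=-1)

dtype_before = topk_weights.dtype  # float32, differs from x.dtype

# The line added in this hunk: align the weights with the activation dtype.
topk_weights = topk_weights.to(x.dtype)

dtype_after = topk_weights.dtype   # bfloat16, same as x.dtype
```

After the cast, everything downstream sees weights and hidden states in one dtype, so the result no longer depends on how a particular kernel handles mixed-dtype inputs.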