[refactor] Refactoring AscendFusedMoE (#1229)

### What this PR does / why we need it?
This PR resolves [issue 1147](https://github.com/vllm-project/vllm-ascend/issues/1147).
1. Move the fused_moe code into a single file, `fused_moe.py`.
2. Consolidate the branch conditions into the function `get_fused_moe_state`.

### Does this PR introduce _any_ user-facing change?
1. This PR removes the env `VLLM_ENABLE_MC2`. I think this env is unnecessary: the decision can be made from the current scenario without it, and keeping it only adds complexity.
2. This PR removes the env `USING_LCCL_COM`, which is already obsolete.
3. `additional_config.expert_tensor_parallel_size` is already obsolete; the parameter `enable_expert_parallel` is now used instead, consistent with vLLM.

### How was this patch tested?

Signed-off-by: zzzzwwjj <1183291235@qq.com>
2025-06-17 17:49:03 +08:00
parent 05dec7eda9
commit 23ca68d0c8
9 changed files with 150 additions and 204 deletions
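
Because `VLLM_ENABLE_MC2` and `USING_LCCL_COM` are dropped from `vllm_ascend/envs.py`, launch scripts that still export them will have those settings ignored by that module. The following is a minimal sketch of a pre-launch check, not part of this PR; the helper name `find_removed_env_vars` and the constant `REMOVED_ENV_VARS` are hypothetical and introduced only for illustration.

```python
import os
from typing import List, Mapping

# Environment variables removed from vllm_ascend/envs.py by this commit
# (names copied from the diff; the tuple itself is a hypothetical helper).
REMOVED_ENV_VARS = ("VLLM_ENABLE_MC2", "USING_LCCL_COM")


def find_removed_env_vars(environ: Mapping[str, str]) -> List[str]:
    """Return the removed variable names that are still set in `environ`."""
    return [name for name in REMOVED_ENV_VARS if name in environ]


if __name__ == "__main__":
    # Warn about stale settings in the current process environment.
    for name in find_removed_env_vars(os.environ):
        print(f"warning: {name} is set but is no longer defined in vllm_ascend/envs.py")
```

Passing the environment as an argument keeps the check easy to test with a plain dictionary instead of the real `os.environ`.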
--- a/vllm_ascend/envs.py
+++ b/vllm_ascend/envs.py
@@ -50,18 +50,10 @@ env_variables: Dict[str, Callable[[], Any]] = {
    # value is None, which means the system default C compiler will be used.
    "C_COMPILER":
    lambda: os.getenv("C_COMPILER", None),
-    # Whether to enable MC2 for DeepSeek. If not set, the default value is False.
-    # MC2 is a fusion operator provided by Ascend to speed up computing and communication.
-    # Find more detail here: https://www.hiascend.com/document/detail/zh/canncommercial/81RC1/developmentguide/opdevg/ascendcbestP/atlas_ascendc_best_practices_10_0043.html
-    "VLLM_ENABLE_MC2":
-    lambda: bool(int(os.getenv("VLLM_ENABLE_MC2", '0'))),
    # Whether to enable the topk optimization. It's disabled by default for experimental support
    # We'll make it enabled by default in the future.
    "VLLM_ASCEND_ENABLE_TOPK_OPTIMIZE":
    lambda: bool(int(os.getenv("VLLM_ASCEND_ENABLE_TOPK_OPTIMIZE", '0'))),
-    # Whether to use LCCL communication. If not set, the default value is False.
-    "USING_LCCL_COM":
-    lambda: bool(int(os.getenv("USING_LCCL_COM", '0'))),
    # The version of the Ascend chip. If not set, the default value is
    # ASCEND910B1. It's used for package building. Please make sure that the
    # version is correct.