[main][bugfix] Modify the default value of the enable_shared_pert_dp to false (#2457)

### What this PR does / why we need it? enable_shared_pert_dp is currently on by default. This optimization is currently only valid for deepseek series models. The default opening affects the accuracy of the qwen series models. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? use parameter --additional_config='{"enable_shared_expert_dp": true}' - vLLM version: v0.10.0 - vLLM main: d983769c41 Signed-off-by: Wang Kunpeng <1289706727@qq.com>
2025-08-20 20:25:53 +08:00
parent c40d4171bc
commit 1de16ead8e
2 changed files with 2 additions and 2 deletions
--- a/docs/source/user_guide/configuration/additional_config.md
+++ b/docs/source/user_guide/configuration/additional_config.md
@@ -32,7 +32,7 @@ The following table lists the additional configuration options available in vLLM
 | `expert_map_path`             | str  | `None` | When using expert load balancing for the MOE model, an expert map path needs to be passed in. |
 | `chunked_prefill_for_mla`     | bool | `False` | Whether to enable the fused operator-like chunked_prefill. |
 | `kv_cache_dtype`     | str | `None` | When using the kv cache quantization method, kv cache dtype needs to be set, currently only int8 is supported. |
-| `enable_shared_expert_dp`     | bool | `True` | When the shared expert in DP, it has better performance but consumes more memory. When the memory is sensitive, this switch can be turned off manually. |
+| `enable_shared_expert_dp`     | bool | `False` | When the shared expert in DP, it has better performance but consumes more memory. Currently only DeepSeek series models are supported to use. |

 The details of each config option are as follows:

--- a/vllm_ascend/ascend_config.py
+++ b/vllm_ascend/ascend_config.py
@@ -48,7 +48,7 @@ class AscendConfig:
        self.chunked_prefill_for_mla = additional_config.get(
            "chunked_prefill_for_mla", False)
        self.enable_shared_expert_dp = additional_config.get(
-            "enable_shared_expert_dp", True
+            "enable_shared_expert_dp", False
        ) and not self.torchair_graph_config.enabled and vllm_config.parallel_config.enable_expert_parallel