[Feature] Support moe multi-stream for aclgraph. (#2946)
This PR puts the calculation of shared experts into a separate stream,
overlaping with routing experts.
- vLLM version: v0.10.2
- vLLM main:
fbd6523ac0
---------
Signed-off-by: whx-sjtu <2952154980@qq.com>
This commit is contained in:
@@ -195,8 +195,8 @@ msgid ""
|
||||
msgstr "是否将MLA的向量操作放到另一个流中。此选项仅对使用MLA的模型(例如,DeepSeek)有效。"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`enable_multistream_moe`"
|
||||
msgstr "`enable_multistream_moe`"
|
||||
msgid "`multistream_overlap_shared_expert`"
|
||||
msgstr "`multistream_overlap_shared_expert`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid ""
|
||||
|
||||
Reference in New Issue
Block a user