xc-llm-ascend/vllm_ascend
whx a58ff9e92f [Cherry-pick] Port MoE multi-stream fix to v0.11.0-dev (#3753)
This PR moves the communication operation of the shared experts out of
the extra stream, because I found that it can cause rtMemcpy-related
errors when the shared experts run multi-stream with aclgraph.

Furthermore, a global variable now holds the extra stream object, which
avoids allocating a separate stream for each layer in full-graph mode.
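
The shared-stream idea can be sketched roughly as follows. This is a
minimal illustration, not the code from this PR: `FakeStream`,
`get_shared_expert_stream`, and `MoELayer` are hypothetical names, and
`FakeStream` only stands in for a real device stream object so that the
sketch runs without any accelerator.

```python
from typing import Optional


class FakeStream:
    """Hypothetical stand-in for a device stream; counts how many exist."""

    created = 0

    def __init__(self) -> None:
        FakeStream.created += 1


# Module-level global: a single extra stream shared by every layer.
_shared_expert_stream: Optional[FakeStream] = None


def get_shared_expert_stream() -> FakeStream:
    """Create the extra stream on first use, then always return the same one."""
    global _shared_expert_stream
    if _shared_expert_stream is None:
        _shared_expert_stream = FakeStream()
    return _shared_expert_stream


class MoELayer:
    """Hypothetical layer that reuses the global stream instead of owning one."""

    def __init__(self) -> None:
        self.stream = get_shared_expert_stream()


# Building several layers still allocates only one stream.
layers = [MoELayer() for _ in range(4)]
```

With this pattern the number of streams stays constant no matter how
many layers the model has, which is the property the paragraph above
relies on in full-graph mode.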

Signed-off-by: whx-sjtu <2952154980@qq.com>
2025-10-25 15:51:43 +08:00