This PR refines the multi-stream parallelism design for the shared expert in the w8a8-dynamic-quantized MoE stage to achieve better performance.

- vLLM version: v0.10.0
- vLLM main: 2cc571199b

Signed-off-by: whx-sjtu <2952154980@qq.com>