weichen
320edde2df
[main] [refactor] refactor fused_moe.py to enable token_dispatchers (#2570)
### What this PR does / why we need it?
Enable token_dispatcher to replace fused_experts_with_xxx in eager mode
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
e2e & ut
- vLLM version: v0.10.1.1
- vLLM main: 704432af3c
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Co-authored-by: sherie <963372609@qq.com>
Co-authored-by: weijinqian0 <12153182+weijinqian0@users.noreply.github.com>
Co-authored-by: shiyuan680 <72335504+shiyuan680@users.noreply.github.com>
2025-08-28 10:13:35 +08:00
weichen
950c4b219a
[main] refactor alltoallv in fused_moe (#2487)
### What this PR does / why we need it?
Refactor all2all-related fused_experts (both quantized/unquantized) into
TokenDispatcherWithAll2AllV, including dispatch & combine calculation.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
E2E & UT
- vLLM version: v0.10.0
- vLLM main: 65197a5fb3
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
2025-08-23 20:38:17 +08:00
sherie
3f867ee708
refactor allgather/mc2-related fused_experts (#2369)
### What this PR does / why we need it?
refactor allgather/mc2-related fused_experts
- vLLM version: v0.10.0
- vLLM main: de7b67a023
Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Co-authored-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
2025-08-20 14:20:46 +08:00
weijinqian0
6e00aed4d5
[main][Feature] MoE alltoallv communication optimization for unquantized RL training scene (#2088)
Ported from 0.9.1dev:
[0.9.1][Feature] MoE alltoallv communication optimization for unquantized
RL training scene & alltoallv support dpo (#1547)
- vLLM version: v0.10.0
- vLLM main: 97608dc276
---------
Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
Signed-off-by: whx-sjtu <2952154980@qq.com>
Signed-off-by: curryliu <120010041@link.cuhk.edu.cn>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: ChenTaoyu-SJTU <ctynb@qq.com>
Signed-off-by: taoxudonghaha <justsheldon@163.com>
Signed-off-by: shen-shanshan <467638484@qq.com>
Signed-off-by: Shanshan Shen <87969357+shen-shanshan@users.noreply.github.com>
Signed-off-by: leo-pony <nengjunma@outlook.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Co-authored-by: weijinqian_v1 <weijinqian@huawei.com>
Co-authored-by: whx <56632993+whx-sjtu@users.noreply.github.com>
Co-authored-by: curryliu <99582471+Irving11-BKN@users.noreply.github.com>
Co-authored-by: Li Wang <wangli858794774@gmail.com>
Co-authored-by: TaoYu Chen <ctynb@qq.com>
Co-authored-by: taoxudonghaha <justsheldon@163.com>
Co-authored-by: Shanshan Shen <467638484@qq.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
2025-08-02 09:49:10 +08:00