EngineX-Hygon / sglang
sglang/python/sglang/srt/layers/moe at commit 9045cc1eb8daa77e6d4d271e3bdebc6e26584303
History
Latest commit: 9045cc1eb8 by Xiaoyu Zhang, [torch.compile bug] avoid biased_grouped_topk_impl func repeatedly triggering torch.compile in forward pass (#8353), 2025-07-25 21:17:47 +08:00
ep_moe/                  [AMD] Remove vllm's scaled_fp8_quant and moe_sum when SGLANG_USE_AITER=1 (#7484), 2025-07-21 17:33:19 -07:00
fused_moe_triton/        [code style] Clean dead triton kernel code in fused_moe and useless vllm_ops import (#8310), 2025-07-24 14:38:30 +08:00
cutlass_moe_params.py    [CUTLASS-FP4-MOE] Introduce CutlassMoEParams class for easy initialization of Cutlass Grouped Gems Metadata (#6887), 2025-06-05 13:13:14 -07:00
cutlass_moe.py           Add a CUDA kernel for fusing mapping and weighted sum for MoE. (#6916), 2025-06-07 15:24:39 -07:00
cutlass_w4a8_moe.py      feat: support DeepSeek-R1-W4AFP8 model with ep-moe mode (#7762), 2025-07-07 14:47:21 -07:00
fused_moe_native.py      [1/N] MoE Refactor: refactor select_experts (#7966), 2025-07-19 00:51:15 -07:00
router.py                Improve streaming, log_level, memory report, weight loading, and benchmark script (#7632), 2025-06-29 23:16:19 -07:00
topk.py                  [torch.compile bug] avoid biased_grouped_topk_impl func repeatedly triggering torch.compile in forward pass (#8353), 2025-07-25 21:17:47 +08:00