sglang/python/sglang/srt/layers/moe (tree 9045cc1eb8daa77e6d4d271e3bdebc6e26584303)
Latest commit: 9045cc1eb8 by Xiaoyu Zhang, 2025-07-25 21:17:47 +08:00: [torch.compile bug] avoid biased_grouped_topk_impl func repeatedly triggering torch.compile in forward pass (#8353)
ep_moe/
    [AMD] Remove vllm's scaled_fp8_quant and moe_sum when SGLANG_USE_AITER=1 (#7484)
    2025-07-21 17:33:19 -07:00
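The SGLANG_USE_AITER=1 switch referenced in #7484 gates the MoE kernels on an environment variable: with AMD's AITER enabled, the vLLM fallbacks (scaled_fp8_quant, moe_sum) are dropped from the dispatch path. A minimal sketch of that gating pattern, assuming a moe_sum entry point in the aiter package; the names here are illustrative, not sglang's actual dispatch code.

```python
import os

import torch

# Backend gate, read once at import time (flag name taken from the PR title).
_USE_AITER = os.environ.get("SGLANG_USE_AITER", "0") == "1"

def moe_sum(expert_out: torch.Tensor) -> torch.Tensor:
    # Illustrative dispatch: prefer the AMD AITER kernel when enabled,
    # otherwise fall back to a native PyTorch reduction instead of the
    # removed vLLM op.
    if _USE_AITER:
        import aiter  # assumed AMD AITER package and entry point
        return aiter.moe_sum(expert_out)
    return expert_out.sum(dim=1)  # expert_out: [num_tokens, top_k, hidden]
```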
fused_moe_triton/
    [code style] Clean dead triton kernel code in fused_moe and useless vllm_ops import (#8310)
    2025-07-24 14:38:30 +08:00
cutlass_moe_params.py
    [CUTLASS-FP4-MOE] Introduce CutlassMoEParams class for easy initialization of Cutlass Grouped Gemms Metadata (#6887)
    2025-06-05 13:13:14 -07:00
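The point of #6887 is initialization ergonomics: bundle the metadata the Cutlass grouped GEMMs need into one params object instead of threading loose arguments through every call site. A rough sketch of that shape with invented field names, not the real CutlassMoEParams definition.

```python
from dataclasses import dataclass

import torch

@dataclass
class GroupedGemmParams:
    # Invented fields standing in for grouped-GEMM metadata: how many
    # expert GEMMs run per layer, their problem sizes, and the output dtype.
    num_experts: int
    hidden_size: int
    intermediate_size: int
    out_dtype: torch.dtype = torch.bfloat16

    @classmethod
    def from_config(cls, cfg) -> "GroupedGemmParams":
        # A single constructor is the "easy initialization" the PR title
        # points at: build the params once, pass the object around.
        return cls(cfg.num_experts, cfg.hidden_size, cfg.intermediate_size)
```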
cutlass_moe.py
    Add a CUDA kernel for fusing mapping and weighted sum for MoE. (#6916)
    2025-06-07 15:24:39 -07:00
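#6916 fuses two steps that otherwise run as separate ops in the MoE epilogue: mapping expert-grouped outputs back into token order, then reducing each token's top-k expert outputs with its routing weights. A plain-PyTorch reference of the unfused computation the CUDA kernel replaces; the tensor layout and inverse-permutation convention are assumptions.

```python
import torch

def unpermute_and_weighted_sum(
    expert_out: torch.Tensor,    # [num_tokens * top_k, hidden], grouped by expert
    inv_perm: torch.Tensor,      # [num_tokens * top_k], (token, k) slot -> row index
    topk_weights: torch.Tensor,  # [num_tokens, top_k] routing weights
) -> torch.Tensor:
    num_tokens, top_k = topk_weights.shape
    # Step 1 (mapping): gather rows back into token-major order.
    mapped = expert_out[inv_perm].view(num_tokens, top_k, -1)
    # Step 2 (weighted sum): reduce over each token's top_k experts.
    return (mapped * topk_weights.unsqueeze(-1)).sum(dim=1)
```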
cutlass_w4a8_moe.py
    feat: support DeepSeek-R1-W4AFP8 model with ep-moe mode (#7762)
    2025-07-07 14:47:21 -07:00
fused_moe_native.py
    [1/N] MoE Refactor: refactor select_experts (#7966)
    2025-07-19 00:51:15 -07:00
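select_experts, centralized by the #7966 refactor, is the routing step: it turns per-token router logits into the ids and weights of the experts each token is sent to. A reference implementation of the standard top-k softmax variant; the signature is a simplified assumption, and the real code also covers grouped and bias-corrected top-k.

```python
import torch

def select_experts(router_logits: torch.Tensor, top_k: int,
                   renormalize: bool = True):
    # router_logits: [num_tokens, num_experts]
    probs = torch.softmax(router_logits, dim=-1, dtype=torch.float32)
    topk_weights, topk_ids = torch.topk(probs, top_k, dim=-1)
    if renormalize:
        # Make each token's selected weights sum to 1.
        topk_weights = topk_weights / topk_weights.sum(dim=-1, keepdim=True)
    return topk_weights, topk_ids
```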
router.py
    Improve streaming, log_level, memory report, weight loading, and benchmark script (#7632)
    2025-06-29 23:16:19 -07:00
topk.py
    [torch.compile bug] avoid biased_grouped_topk_impl func repeatedly triggering torch.compile in forward pass (#8353)
    2025-07-25 21:17:47 +08:00
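The #8353 title (also the head commit of this tree) describes a common torch.compile pitfall: wrapping the implementation inside the forward path, so a fresh compiled wrapper is constructed on every call instead of being created once and reused. A minimal sketch of the pitfall and the usual hoist-and-reuse fix; biased_topk_impl is a simplified stand-in for biased_grouped_topk_impl, not the actual function.

```python
import torch

def biased_topk_impl(scores: torch.Tensor, bias: torch.Tensor, k: int):
    # Simplified stand-in: bias the routing scores, then take top-k.
    return torch.topk(scores + bias, k, dim=-1)

# Pitfall: a new torch.compile wrapper is built on every forward call,
# paying the compile-machinery overhead each time.
def forward_buggy(scores, bias, k):
    return torch.compile(biased_topk_impl)(scores, bias, k)

# Fix: compile once at module scope and reuse the wrapper across calls.
_biased_topk_compiled = torch.compile(biased_topk_impl)

def forward_fixed(scores, bias, k):
    return _biased_topk_compiled(scores, bias, k)
```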