sglang/moe at e0917e6bd0fbbbbc8ba3db48ae26f07366ab9a0c

EngineX-Hygon/sglang

Stefan He e0917e6bd0 Remove vllm ops scaled fp8 quant and accelerate per token quant by 20-28% (#4215)

Co-authored-by: Stefan He <bhe@linkedin.com>

2025-03-12 00:08:03 -07:00

Remove vllm ops scaled fp8 quant and accelerate per token quant by 20-28% (#4215)

2025-03-12 00:08:03 -07:00

fused_moe_triton

Remove vllm ops scaled fp8 quant and accelerate per token quant by 20-28% (#4215)

2025-03-12 00:08:03 -07:00

fused_moe_native.py

Support penalty in overlap mode; return logprob with chunked prefill; improve benchmark scripts (#3988)

2025-03-03 00:12:04 -08:00

topk.py

feat: update grouped_topk to support softmax and sigmoid (#3680)

2025-02-21 16:30:15 +08:00