sglang/moe at 63e84352b7613047ee09bcbd9722d25d0a4d8f77 - sglang

EngineX-Hygon/sglang

PGFLMG 8fdcd98efe [7/n] decouple quantization impl from vllm dependency - gguf kernel (#11019)

2025-10-11 14:04:57 -07:00

cutlass_moe/w4a8

pass a_scale from fp8 quant result instead of hard code to 1.0f (#10241)

2025-09-10 12:56:05 -07:00

marlin_moe_wna16

Support compile sgl-kernel on cuda 13.0 (#9721)

2025-08-28 10:18:03 -07:00

cutlass_moe_helper.cu

[Fix]Fix index oob in get_group_gemm_starts kernel. (#8564)

2025-07-30 19:49:35 -07:00

fp8_blockwise_moe_kernel.cu

Update CUTLASS. Refine KernelSchedule for fp8 (grouped) gemm. (#10491)

2025-09-16 02:47:37 -07:00

moe_align_kernel.cu

[AMD] Reorganize hip-related header files in sgl-kernel (#9320)

2025-08-18 16:53:44 -07:00

moe_fused_gate.cu

Fix correction bias undefined behavior for nvfp4 models (#10426)

2025-09-14 18:41:09 -07:00

moe_sum_reduce.cu

[sgl-kernel] Support float64 moe_sum_reduce cuda kernel (#11068)

2025-10-07 14:31:11 +00:00

moe_sum.cu

[7/n] decouple quantization impl from vllm dependency - gguf kernel (#11019)

2025-10-11 14:04:57 -07:00

moe_topk_softmax_kernels.cu

Support compile sgl-kernel on cuda 13.0 (#9721)

2025-08-28 10:18:03 -07:00

nvfp4_blockwise_moe.cu

[1/2] Add Kernel support for Cutlass based Fused FP4 MoE (#6093)

2025-06-02 13:48:03 -07:00

prepare_moe_input.cu

fix: fix apply_shuffle_mul_sum (#7444)

2025-07-04 23:23:30 -07:00