sglang/kernels at 6f993e8b9e6c4acdb92aa76bbd7a4963666bfadf

EngineX-Hygon/sglang

Yuan Luo 616a3e20df [sgl-kernel] Support moe_sum_reduce cuda kernel (#10321)

Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>

2025-09-19 14:12:09 +08:00

support 1 shot allreduce in 1-node and 2-node using mscclpp (#6277)

2025-06-04 22:11:24 -07:00

decoding_attention_triton

[CI] Remove unused imports with Ruff to pre-commit config, only to benchmarks/docs/examples folder (#3969)

2025-03-27 19:45:02 -07:00

Support tuning DeepEP configs (#6742)

2025-05-29 08:12:22 -07:00

refactor apply_w8a8_block_fp8_linear in fp (#6545)

2025-05-29 00:15:11 -07:00

[benchmark] add flashinfer_allreduce_fusion benchmark (#9937)

2025-09-03 16:31:01 +08:00

flashinfer_allreduce_fusion

[benchmark] add flashinfer_allreduce_fusion benchmark (#9937)

2025-09-03 16:31:01 +08:00

fused_moe_triton

[sgl-kernel] Support moe_sum_reduce cuda kernel (#10321)

2025-09-19 14:12:09 +08:00

minmax-text-01-lightning_attention

[CI] Remove unused imports with Ruff to pre-commit config, only to benchmarks/docs/examples folder (#3969)

2025-03-27 19:45:02 -07:00

[NVIDIA] [2/N] Optimize silu_and_mul_scaled_fp4_grouped_quant perf (#9556)

2025-08-29 17:17:03 -07:00

[CI] Remove unused imports with Ruff to pre-commit config, only to benchmarks/docs/examples folder (#3969)

2025-03-27 19:45:02 -07:00

scheduler_batch

[test] add ut and bm for get_last_loc (#6746)

2025-05-29 11:47:21 -07:00

sliding_window_attention_triton

Optimize triton swa kernel by skipping computation (#8860)

2025-08-06 21:37:50 +08:00