sglang/attention at 50f1b6d6b16446f4cee61af096648f8d4bf2bee2

EngineX-Hygon/sglang

JieXin Liang ab1a4fa5cb [fix] fix cutlass_mla_backend with cuda_graph and add sm_scale for sgl-kernel cutlass_mla (#7184)

2025-06-14 12:45:41 -07:00

cutlass_sm100_mla

[perf][sgl-kernel] extend cutlass_mla_decode to support num_head < 128 (#6929)

2025-06-08 19:37:34 -07:00

cascade.cu

feat: adapt merge_state (#5337)

2025-04-12 21:14:04 -07:00

cutlass_mla_kernel.cu

[fix] fix cutlass_mla_backend with cuda_graph and add sm_scale for sgl-kernel cutlass_mla (#7184)

2025-06-14 12:45:41 -07:00

lightning_attention_decode_kernel.cu

support cmake for sgl-kernel (#4706)

2025-03-27 01:42:28 -07:00

merge_attn_states.cu

bugfix: fix merge_state_v2 cuda graph (#5419)

2025-04-15 10:18:47 -07:00

vertical_slash_index.cu

[Feat] QWen-1M context support[1/2]: Update block sparse attention backend utils kernel (#5847)

2025-04-28 11:03:17 -07:00