sglang/gemm at d7954b7682406db30689cf4b2029381f9bb937a7

EngineX-Hygon/sglang

Xiaoyu Zhang 2c8fd99363 [sgl-kernel] per token group quant support COLUMN MAJOR (#4817)

2025-04-02 18:29:59 -07:00

File                           Last commit                                                             Date
awq_kernel.cu                  fix sgl-kernel cu118 build (#4872)                                      2025-03-28 17:23:51 -07:00
bmm_fp8.cu                     Move rope and bmm into sgl-kernel (#4241)                               2025-03-09 18:38:15 -07:00
cublas_grouped_gemm.cu         support cmake for sgl-kernel (#4706)                                    2025-03-27 01:42:28 -07:00
fp8_blockwise_gemm_kernel.cu   Support Blackwell Block Scale FP8 Gemm (#4278)                          2025-03-12 14:17:11 -07:00
fp8_gemm_kernel.cu             Support fp8 gemm for blackwell (#4558)                                  2025-03-20 12:40:28 -07:00
int8_gemm_kernel.cu            Fix shared memory OOM on sm86 GPUs. (#4797)                             2025-03-26 10:41:53 -07:00
nvfp4_quant_entry.cu           Support FP4 gemm (1/2) (#3899)                                          2025-03-24 19:50:23 -07:00
nvfp4_quant_kernels.cu         fix sgl-kernel cu118 build (#4872)                                      2025-03-28 17:23:51 -07:00
nvfp4_scaled_mm_entry.cu       Support FP4 gemm (1/2) (#3899)                                          2025-03-24 19:50:23 -07:00
nvfp4_scaled_mm_kernels.cu     [Build] Fix cuda12.8 build error in nvfp4_scaled_mm_kernels.cu (#4953)  2025-03-31 12:00:34 -07:00
per_tensor_quant_fp8.cu        Speed up per token and per tensor quant by 15% (#4639)                  2025-03-22 00:37:57 -07:00
per_token_group_quant_8bit.cu  [sgl-kernel] per token group quant support COLUMN MAJOR (#4817)         2025-04-02 18:29:59 -07:00
per_token_quant_fp8.cu         Speed up per token and per tensor quant by 15% (#4639)                  2025-03-22 00:37:57 -07:00