EngineX-Hygon/sglang
f4674df646ca8a5515dfdc93677f7bdc052416c6
sglang/sgl-kernel/csrc/gemm
Yuan Luo 0c8dab9e67 [sgl-kernel] Opt per_token_quant_fp8 with warp reduce (#8130)
Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>
2025-07-23 21:22:59 +08:00
..
awq_kernel.cu
…
bmm_fp8.cu
…
dsv3_fused_a_gemm.cu
…
dsv3_router_gemm_bf16_out.cu
Add bf16 output option for dsv3_router_gemm kernel (#7999)
2025-07-20 09:49:37 +08:00
dsv3_router_gemm_entry.cu
Add bf16 output option for dsv3_router_gemm kernel (#7999)
2025-07-20 09:49:37 +08:00
dsv3_router_gemm_float_out.cu
Add bf16 output option for dsv3_router_gemm kernel (#7999)
2025-07-20 09:49:37 +08:00
fp8_blockwise_gemm_kernel.cu
…
fp8_gemm_kernel.cu
…
int8_gemm_kernel.cu
…
nvfp4_expert_quant.cu
…
nvfp4_quant_entry.cu
…
nvfp4_quant_kernels.cu
…
nvfp4_scaled_mm_entry.cu
…
nvfp4_scaled_mm_kernels.cu
…
per_tensor_quant_fp8.cu
…
per_token_group_quant_8bit.cu
…
per_token_quant_fp8.cu
[sgl-kernel] Opt per_token_quant_fp8 with warp reduce (#8130)
2025-07-23 21:22:59 +08:00
qserve_w4a8_per_chn_gemm.cu
…
qserve_w4a8_per_group_gemm.cu
…