EngineX-Hygon/sglang
44f47d3ee1e66ecce73d2e98c8847cd94ab54ea7
sglang/sgl-kernel/csrc/gemm
Yi Pan 45fdf1f7f3 Fix shared memory OOM on sm86 GPUs. (#4797)
2025-03-26 10:41:53 -07:00
| File | Last commit | Date |
| --- | --- | --- |
| awq_kernel.cu | [1/3] fix dsv3 awq issue (#4556) | 2025-03-22 01:07:17 -07:00 |
| bmm_fp8.cu | Move rope and bmm into sgl-kernel (#4241) | 2025-03-09 18:38:15 -07:00 |
| cublas_grouped_gemm.cu | Simplify tests & Fix trtllm custom allreduce registration (#4252) | 2025-03-10 01:24:22 -07:00 |
| fp8_blockwise_gemm_kernel.cu | Support Blackwell Block Scale FP8 Gemm (#4278) | 2025-03-12 14:17:11 -07:00 |
| fp8_gemm_kernel.cu | Support fp8 gemm for blackwell (#4558) | 2025-03-20 12:40:28 -07:00 |
| int8_gemm_kernel.cu | Fix shared memory OOM on sm86 GPUs. (#4797) | 2025-03-26 10:41:53 -07:00 |
| nvfp4_quant_entry.cu | Support FP4 gemm (1/2) (#3899) | 2025-03-24 19:50:23 -07:00 |
| nvfp4_quant_kernels.cu | Support FP4 gemm (1/2) (#3899) | 2025-03-24 19:50:23 -07:00 |
| nvfp4_scaled_mm_entry.cu | Support FP4 gemm (1/2) (#3899) | 2025-03-24 19:50:23 -07:00 |
| nvfp4_scaled_mm_kernels.cu | Support FP4 gemm (1/2) (#3899) | 2025-03-24 19:50:23 -07:00 |
| per_tensor_quant_fp8.cu | Speed up per token and per tensor quant by 15% (#4639) | 2025-03-22 00:37:57 -07:00 |
| per_token_group_quant_8bit.cu | [Quant Kernel] refactored per token group quant fp8 to support int8 up-to 2x faster (#4396) | 2025-03-23 23:44:17 -07:00 |
| per_token_quant_fp8.cu | Speed up per token and per tensor quant by 15% (#4639) | 2025-03-22 00:37:57 -07:00 |