EngineX-Hygon/sglang
e0917e6bd0fbbbbc8ba3db48ae26f07366ab9a0c
sglang/sgl-kernel/csrc/gemm
Latest commit: e0917e6bd0 by Stefan He, 2025-03-12 00:08:03 -07:00
Remove vllm ops scaled fp8 quant and accelerate per token quant by 20-28% (#4215)
Co-authored-by: Stefan He <bhe@linkedin.com>
File                           Last updated                 Last commit
bmm_fp8.cu                     2025-03-09 18:38:15 -07:00   Move rope and bmm into sgl-kernel (#4241)
cublas_grouped_gemm.cu         2025-03-10 01:24:22 -07:00   Simplify tests & Fix trtllm custom allreduce registration (#4252)
fp8_blockwise_gemm_kernel.cu   2025-03-08 22:54:51 -08:00   Rename files in sgl kernel to avoid nested folder structure (#4213)
fp8_gemm_kernel.cu             2025-03-08 22:54:51 -08:00   Rename files in sgl kernel to avoid nested folder structure (#4213)
int8_gemm_kernel.cu            2025-03-08 22:54:51 -08:00   Rename files in sgl kernel to avoid nested folder structure (#4213)
per_tensor_quant_fp8.cu        2025-03-08 22:54:51 -08:00   Rename files in sgl kernel to avoid nested folder structure (#4213)
per_token_group_quant_fp8.cu   2025-03-10 01:42:58 -07:00   fix per_token_group_quant_fp8 illegal memory when num_groups % 16 != 0 (#4231)
per_token_quant_fp8.cu         2025-03-12 00:08:03 -07:00   Remove vllm ops scaled fp8 quant and accelerate per token quant by 20-28% (#4215)