EngineX-Hygon/sglang
sglang/sgl-kernel/benchmark at commit ffa1b3e318c9d1342a5e430eb04df609e22a3775
Latest commit: 95085d65e9 by Stefan He, 2025-03-06 22:58:52 -08:00: [Refactor] Reducing code duplication across FP8 CUDA quantization kernels (#4163)
File | Last commit | Date
bench_cublas_grouped_gemm.py | [Feature] Apply Cublas Grouped Gemm kernel (#3629) | 2025-02-18 15:18:31 +08:00
bench_fp8_blockwise_gemm.py | support blockwise fp8 matmul kernel (#3267) | 2025-02-13 01:49:33 +08:00
bench_fp8_gemm.py | support w8a8 fp8 kernel with CUTLASS (#3047) | 2025-01-26 15:46:51 +08:00
bench_int8_gemm.py | Add shapes for int8 gemm benchmark (#3093) | 2025-01-24 12:27:30 +08:00
bench_lightning_attention_decode.py | support lightning_attention_decode in sgl-kernel for MiniMax-Text-01 (#3030) | 2025-01-23 15:29:20 +08:00
bench_per_tensor_quant_fp8.py | [quant kernel] sgl-kernel support per_tensor_quant fp8 (#3786) | 2025-03-06 18:05:43 -08:00
bench_per_token_group_quant_fp8.py | [Refactor] Reducing code duplication across FP8 CUDA quantization kernels (#4163) | 2025-03-06 22:58:52 -08:00
bench_per_token_quant_fp8.py | [Refactor] Reducing code duplication across FP8 CUDA quantization kernels (#4163) | 2025-03-06 22:58:52 -08:00