EngineX-Hygon/sglang
243e745d0758a7214d29fe644d88f5c3b5c3d9ff
sglang/sgl-kernel/csrc/moe/marlin_moe_wna16
Peng Zhang 5aa1ebd242 [2/n]decouple quantization implementation from vLLM dependency (#8112)
Co-authored-by: walker-ai <yiyun.wyt@antgroup.com>
Co-authored-by: leoneo <1320612015@qq.com>
2025-08-14 03:19:03 -07:00
File                    Last commit                                                             Date
generate_kernels.py     [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel_bf16_ku4.cu      [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel_bf16_ku4b8.cu    [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel_bf16_ku8b128.cu  [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel_fp16_ku4.cu      [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel_fp16_ku4b8.cu    [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel_fp16_ku8b128.cu  [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel.h                [2/n]decouple quantization implementation from vLLM dependency (#8112)  2025-08-14 03:19:03 -07:00
marlin_template.h       [2/n]decouple quantization implementation from vLLM dependency (#8112)  2025-08-14 03:19:03 -07:00
ops.cu                  [2/n]decouple quantization implementation from vLLM dependency (#8112)  2025-08-14 03:19:03 -07:00