EngineX-Hygon/sglang
fb107cfd7567d4190b991ab1aedce8a49a171342
sglang/sgl-kernel/csrc/moe/marlin_moe_wna16
Peng Zhang 5aa1ebd242 [2/n]decouple quantization implementation from vLLM dependency (#8112)
Co-authored-by: walker-ai <yiyun.wyt@antgroup.com>
Co-authored-by: leoneo <1320612015@qq.com>
2025-08-14 03:19:03 -07:00
File                    Last commit                                                             Date
generate_kernels.py     [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel_bf16_ku4.cu      [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel_bf16_ku4b8.cu    [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel_bf16_ku8b128.cu  [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel_fp16_ku4.cu      [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel_fp16_ku4b8.cu    [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel_fp16_ku8b128.cu  [1/n] apply wna16marlin kernel in moe weight only quantization (#7683)  2025-07-01 23:21:25 -07:00
kernel.h                [2/n]decouple quantization implementation from vLLM dependency (#8112)  2025-08-14 03:19:03 -07:00
marlin_template.h       [2/n]decouple quantization implementation from vLLM dependency (#8112)  2025-08-14 03:19:03 -07:00
ops.cu                  [2/n]decouple quantization implementation from vLLM dependency (#8112)  2025-08-14 03:19:03 -07:00