EngineX-Hygon/sglang
85ef7f64e4802115a07a5f76a843017830973875
sglang/python/sglang/srt/layers/quantization
AniZpZ 85ef7f64e4 [FIX] fix incorrect output when enable both deepgemm and torch compile (#4359)
Co-authored-by: xuyongfei.xyf <xuyongfei.xyf@antgroup.com>
2025-03-12 21:34:09 -07:00
configs
Add H20 tuning configs support DeepSeek V3/R1 INT8(block-wise) (#4220)
2025-03-11 12:32:33 -07:00
__init__.py
Fix quantization and nightly tests (#4258)
2025-03-10 03:06:21 -07:00
base_config.py
fix black in pre-commit (#1940)
2024-11-08 07:42:47 +08:00
blockwise_int8.py
Apply sgl w8a8 fp8 kernel (#3148)
2025-03-09 00:03:32 -08:00
fp8_kernel.py
[FIX] fix incorrect output when enable both deepgemm and torch compile (#4359)
2025-03-12 21:34:09 -07:00
fp8_utils.py
unify is_cuda and is_hip (#4321)
2025-03-11 18:12:56 -07:00
fp8.py
unify is_cuda and is_hip (#4321)
2025-03-11 18:12:56 -07:00
gptq.py
[QUANT] Add GPTQModel Dynamic Quantization + lm_head Quantization (#3790)
2025-03-05 01:11:00 -08:00
int8_kernel.py
Feature DeepSeek V3/R1 INT8 Quantization (block-wise) (#3730)
2025-02-24 05:43:35 -08:00
int8_utils.py
Feature DeepSeek V3/R1 INT8 Quantization (block-wise) (#3730)
2025-02-24 05:43:35 -08:00
modelopt_quant.py
Apply sgl w8a8 fp8 kernel (#3148)
2025-03-09 00:03:32 -08:00
w8a8_fp8.py
unify is_cuda and is_hip (#4321)
2025-03-11 18:12:56 -07:00
w8a8_int8.py
[Feature] DeepSeek V3/R1 INT8 Quantization (channel-wise) (#3888)
2025-03-06 20:54:52 -08:00