EngineX-Hygon/sglang
551a3a9d3870e57c285025827be8870f197daa0a
sglang/python/sglang/srt/layers
Ying Sheng c98e84c21e [Minor, Performance] Use torch.argmax for greedy sampling (#1589)
2024-10-06 13:15:05 -07:00
Name | Last commit | Date
attention | Simplify flashinfer dispatch (#1552) | 2024-10-01 00:28:42 -07:00
fused_moe | [Performance, Hardware] MoE tuning on AMD MI300x GPUs (#1554) | 2024-10-02 09:52:46 -07:00
quantization | feat: update linear deps 1/N (#1305) | 2024-09-19 20:53:11 +08:00
activation.py | feat: update linear deps 1/N (#1305) | 2024-09-19 20:53:11 +08:00
layernorm.py | [Bugfix] Enable SGLang on AMD GPUs via PyTorch for ROCm (#1419) (#1453) | 2024-09-18 02:01:35 -07:00
linear.py | feat: update linear deps 1/N (#1305) | 2024-09-19 20:53:11 +08:00
logits_processor.py | Clean up batch data structures: Introducing ModelWorkerBatch (#1544) | 2024-09-30 06:41:49 -07:00
pooler.py | Rename InputMetadata -> ForwardBatch (#1543) | 2024-09-30 02:41:11 -07:00
radix_attention.py | Simplify flashinfer dispatch (#1552) | 2024-10-01 00:28:42 -07:00
sampler.py | [Minor, Performance] Use torch.argmax for greedy sampling (#1589) | 2024-10-06 13:15:05 -07:00
torchao_utils.py | Add float8 dynamic quant to torchao_utils (#1528) | 2024-09-28 12:27:54 -07:00