EngineX-Hygon/sglang
Commit: 4d1e52abea8277e69de281cb23634edb723fcd85
Path: sglang/python/sglang/srt/layers/attention
Latest commit: liwenju0 4d1e52abea "Add an assertion to enhance the robustness of the operator (#5736)" (2025-04-26 18:09:12 -07:00)
Name                        Last commit                                                                     Date
triton_ops/                 [misc] remove is_cuda_available (#5319)                                         2025-04-20 18:16:51 -07:00
base_attn_backend.py        Remove q concat in FA3 backend for DeepSeek decode (#5638)                      2025-04-22 11:43:12 -07:00
double_sparsity_backend.py  Misc clean up; Remove the support of jump forward (#4032)                       2025-03-03 07:02:14 -08:00
flashattention_backend.py   Remove q concat in FA3 backend for DeepSeek decode (#5638)                      2025-04-22 11:43:12 -07:00
flashinfer_backend.py       Revert "Avoid computing lse in Ragged Prefill when there's no prefix.… (#5544)  2025-04-18 16:50:21 -07:00
flashinfer_mla_backend.py   Revert "Avoid computing lse in Ragged Prefill when there's no prefix.… (#5544)  2025-04-18 16:50:21 -07:00
flashmla_backend.py         fix flashmla bug (#5272)                                                        2025-04-22 10:36:23 -07:00
torch_native_backend.py     Feat/support encoder model (like bert) (#4887)                                  2025-04-17 01:50:48 -07:00
triton_backend.py           Feat/support encoder model (like bert) (#4887)                                  2025-04-17 01:50:48 -07:00
utils.py                    Support FlashMLA backend cuda graph (#4514)                                     2025-03-19 08:25:34 -07:00
vision.py                   Add an assertion to enhance the robustness of the operator (#5736)              2025-04-26 18:09:12 -07:00