EngineX-Hygon/sglang
Commit: 4d1e52abea8277e69de281cb23634edb723fcd85
Path: sglang/python/sglang/srt/layers/attention
Latest commit: liwenju0 4d1e52abea "Add an assertion to enhance the robustness of the operator (#5736)" (2025-04-26 18:09:12 -07:00)
Name                        Last commit                                                                     Date
triton_ops/                 [misc] remove is_cuda_available (#5319)                                         2025-04-20 18:16:51 -07:00
base_attn_backend.py        Remove q concat in FA3 backend for DeepSeek decode (#5638)                      2025-04-22 11:43:12 -07:00
double_sparsity_backend.py  Misc clean up; Remove the support of jump forward (#4032)                       2025-03-03 07:02:14 -08:00
flashattention_backend.py   Remove q concat in FA3 backend for DeepSeek decode (#5638)                      2025-04-22 11:43:12 -07:00
flashinfer_backend.py       Revert "Avoid computing lse in Ragged Prefill when there's no prefix.… (#5544)  2025-04-18 16:50:21 -07:00
flashinfer_mla_backend.py   Revert "Avoid computing lse in Ragged Prefill when there's no prefix.… (#5544)  2025-04-18 16:50:21 -07:00
flashmla_backend.py         fix flashmla bug (#5272)                                                        2025-04-22 10:36:23 -07:00
torch_native_backend.py     Feat/support encoder model (like bert) (#4887)                                  2025-04-17 01:50:48 -07:00
triton_backend.py           Feat/support encoder model (like bert) (#4887)                                  2025-04-17 01:50:48 -07:00
utils.py                    Support FlashMLA backend cuda graph (#4514)                                     2025-03-19 08:25:34 -07:00
vision.py                   Add an assertion to enhance the robustness of the operator (#5736)              2025-04-26 18:09:12 -07:00