sglang/layers at 4fb05583ef3895e7ac4c592649b635cf12f24852

EngineX-Hygon/sglang

Baizhou Zhang 4fb05583ef Deprecate disable-mla (#5481)

2025-04-17 01:43:14 -07:00

Deprecate disable-mla (#5481)

2025-04-17 01:43:14 -07:00

Clean up imports (#5467)

2025-04-16 15:26:49 -07:00

Clean up imports (#5467)

2025-04-16 15:26:49 -07:00

activation.py

Clean up imports (#5467)

2025-04-16 15:26:49 -07:00

dp_attention.py

Fix DeepSeek DP Attention + torch compile (#5367)

2025-04-14 01:07:58 -07:00

elementwise.py

Fix runtime error in ROCm platform (#5147)

2025-04-07 22:49:40 -07:00

layernorm.py

Clean up imports (#5467)

2025-04-16 15:26:49 -07:00

linear.py

[Model Support] unsloth/Phi-4-mini bnb model (#4982)

2025-04-16 19:58:20 -07:00

logits_processor.py

Reduce computation and communication in DP attention (#4521)

2025-03-18 13:41:36 -07:00

parameter.py

Clean up imports (#5467)

2025-04-16 15:26:49 -07:00

pooler.py

Rename InputMetadata -> ForwardBatch (#1543)

2024-09-30 02:41:11 -07:00

radix_attention.py

Fix loading KV quantization scale; Enable modelopt kv cache (#4686)

2025-04-08 09:11:35 -07:00

rotary_embedding.py

Clean up imports (#5467)

2025-04-16 15:26:49 -07:00

sampler.py

Remove redundant type conversion (#4513)

2025-03-17 05:57:35 -07:00

torchao_utils.py

Support loading of larger models with on-the-fly quantization (#3061)

2025-01-22 21:33:17 -08:00

vocab_parallel_embedding.py

Clean up fp8 support (#4230)

2025-03-09 21:46:35 -07:00