| Lianmin Zheng | a17e70f5cc | Use more general heuristics to set the default value of --mem-fraction-static (#10975) Co-authored-by: sglang-bot <sglangbot@gmail.com> | 2025-09-29 10:11:03 -07:00 |
| Lifu Huang | 2101d93b4f | Fix CI TestChunkedSGMV (#10737) | 2025-09-22 16:09:58 +08:00 |
| Lifu Huang | 08ecd0aa2a | [3/4] Speed up CSGMV backend perf by 10% through dynamic chunking + kernel optimization (#10592) | 2025-09-20 22:47:48 -07:00 |
| Lifu Huang | 3f41b48c40 | [2/2] Introduce Chunked-SGMV kernels and corresponding LoRA backend for improved performance (#10286) | 2025-09-15 16:04:03 -07:00 |
| Lifu Huang | e903f695c8 | Fix potential flakiness in test_lora_qwen3 (#10250) | 2025-09-10 08:04:39 +00:00 |
| gongwei-130 | ab62b135c1 | support Llama4 with non uniformed intermediate size across layers for… (#10047) | 2025-09-05 17:28:15 -07:00 |
| Baizhou Zhang | 7de2ce45b2 | Disable radix cache in test_lora_update.py for better stability (#9852) | 2025-08-31 22:28:22 -07:00 |
| Lifu Huang | b0980af89f | Support pinning adapter via server args. (#9249) | 2025-08-20 16:25:01 -07:00 |
| Baizhou Zhang | 75e6a7cde1 | Support radix cache for Lora feature (#7216) | 2025-08-11 10:14:11 -07:00 |
| Lifu Huang | e322a94d1f | Reduce CI duration of test_lora_update. (#9024) | 2025-08-10 15:34:04 -07:00 |
| Lianmin Zheng | 2c7f01bc89 | Reorganize CI and test files (#9027) | 2025-08-10 12:30:06 -07:00 |