15 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Yineng Zhang | d06c1ab587 | update ci install dependency (#2949) | 2025-01-17 23:42:23 +08:00 |
| fzyzcjy | 923f518337 | CUDA-graph-compatible releasing and resuming KV cache and model weight memory (#2630) | 2025-01-13 11:38:51 -08:00 |
| Yineng Zhang | d49b13c6f8 | feat: use CUDA 12.4 by default (for FA3) (#2682) | 2024-12-31 15:52:09 +08:00 |
| fzyzcjy | f707470019 | CI: Update scripts to fail fast (#2672) | 2024-12-30 19:04:01 -08:00 |
| Yineng Zhang | d95a5f5bf5 | fix followup #2517 (#2524) | 2024-12-19 23:24:30 +08:00 |
| Yineng Zhang | 7154b4b1df | minor: update flashinfer nightly (#2490) | 2024-12-16 23:02:49 +08:00 |
| Yineng Zhang | fc78640e00 | minor: support flashinfer nightly (#2295) | 2024-12-01 18:55:26 +08:00 |
| Lianmin Zheng | 9449a95431 | [CI] Balance CI tests (#2293) | 2024-12-01 01:47:30 -08:00 |
| Lianmin Zheng | 0d6a49bd7d | [CI] Kill zombie processes (#2280) | 2024-11-30 00:24:30 -08:00 |
| Yineng Zhang | fae4e5e99a | chore: bump v0.3.6.post3 (#2259) | 2024-11-30 01:41:16 +08:00 |
| Lianmin Zheng | 254fd130e2 | [CI] Split test cases in CI for better load balancing (#2180) | 2024-11-25 04:58:16 -08:00 |
| Lianmin Zheng | 9c939a3d8b | Clean up metrics code (#1972) | 2024-11-09 15:43:20 -08:00 |
| Lianmin Zheng | 7ef0084b0d | Add sentence_transformers to CI dependency (#1958) | 2024-11-08 01:21:29 -08:00 |
| Lianmin Zheng | b548801ddb | Update docs (#1839) | 2024-10-30 02:49:08 -07:00 |
| Lianmin Zheng | 6aa94b967c | Update ci workflows (#1804) | 2024-10-26 04:32:36 -07:00 |