Yury Sulsky
|
f19a9204cd
|
Support precomputed multimodal features for Qwen-VL and Gemma3 models. (#6136)
Co-authored-by: Yury Sulsky <ysulsky@tesla.com>
|
2025-05-16 12:26:15 -07:00 |
|
XinyuanTong
|
e88dd482ed
|
[CI]Add performance CI for VLM (#6038)
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
|
2025-05-07 19:20:03 -07:00 |
|
fzyzcjy
|
15ddd84322
|
Add retry for flaky tests in CI (#4755)
|
2025-03-25 16:53:12 -07:00 |
|
Mick
|
583d6af71b
|
example: add vlm to token in & out example (#3941)
Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
|
2025-03-04 22:18:26 -08:00 |
|
Lianmin Zheng
|
ac2387279e
|
Support penalty in overlap mode; return logprob with chunked prefill; improve benchmark scripts (#3988)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: dhou-xai <dhou@x.ai>
Co-authored-by: Hanming Lu <hanming_lu@berkeley.edu>
|
2025-03-03 00:12:04 -08:00 |
|
Shi Shuai
|
c4f9707e16
|
Improve: Token-In Token-Out Usage for RLHF (#2843)
|
2025-01-11 15:14:26 -08:00 |
|
Lianmin Zheng
|
b22f3f6475
|
Fix nightly accuracy tests (#2780)
|
2025-01-07 21:02:35 -08:00 |
|
Lianmin Zheng
|
d4fc1a70e3
|
Crash the server correctly during error (#2231)
|
2024-11-28 00:22:39 -08:00 |
|
Lianmin Zheng
|
9c939a3d8b
|
Clean up metrics code (#1972)
|
2024-11-09 15:43:20 -08:00 |
|
Chayenne
|
c77c1e05ba
|
fix black in pre-commit (#1940)
|
2024-11-08 07:42:47 +08:00 |
|
Lianmin Zheng
|
c17c578108
|
Simplify tokenizer manager (#1904)
|
2024-11-03 08:38:26 -08:00 |
|
Lianmin Zheng
|
86fc0d79d0
|
Add a watch dog thread (#1816)
|
2024-10-27 02:00:50 -07:00 |
|
Lianmin Zheng
|
fb99aaa527
|
[Fix] Fix --skip-tokenizer-init (#1798)
|
2024-10-25 18:51:59 -07:00 |
|
Mingyi
|
158e8f1e2d
|
improve the threshold and ports in tests (#1215)
|
2024-08-25 19:02:08 -07:00 |
|
Yineng Zhang
|
f7fb68d292
|
ci: add moe test (#1053)
|
2024-08-13 18:43:23 +10:00 |
|
Lianmin Zheng
|
8207637029
|
Improve end-to-end throughput test and its coverage (#1039)
|
2024-08-11 18:27:33 -07:00 |
|
Lianmin Zheng
|
54fb1c80c0
|
Clean up unit tests (#1020)
|
2024-08-10 15:09:03 -07:00 |
|