14 Commits

Author SHA1 Message Date
Mick
1df84ff414 ci: simplify multi-modality tests by using mixins (#9006) 2025-08-16 22:25:02 -07:00
Binyao Jiang
66d6be0874 Bug fix: use correct mm_items in embed_mm_inputs (#8893) 2025-08-16 19:55:56 -07:00
Kevin Xiang Li
3b3b3baf9f Double vision prefill throughput by defaulting to optimal vision attention backend (#8484)
Co-authored-by: Xiang (Kevin) Li <lik@nvidia.com>
2025-08-13 02:08:30 -07:00
Binyao Jiang
f29aba8c6e Support glm4.1v and glm4.5v (#8798)
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: Xinyuan Tong <justinning0323@outlook.com>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Co-authored-by: Minglei Zhu <mingleizhu1122@gmail.com>
Co-authored-by: Chang Su <csu272@usc.edu>
2025-08-09 00:59:13 -07:00
Binyao Jiang
7b81f956eb Fix qwen2 audio not working bug (#8600) 2025-08-09 00:42:29 -07:00
Xinyuan Tong
7e831efee8 Fix chat template handling for OpenAI serving (#8635)
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
2025-07-31 21:49:45 -07:00
Xinyuan Tong
8430bfe3e9 [Refactor] simplify multimodal data processing (#8107)
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
2025-07-20 21:43:09 -07:00
Binyao Jiang
b7e951a6db Feat: Support audio in Phi4-mm model (#8048) 2025-07-18 21:03:53 -07:00
Mick
b5e3d6031c vlm: support video as an input modality (#5888) 2025-07-09 23:48:35 -07:00
Brayden Zhong
a37e1247c1 [Multimodal][Perf] Use pybase64 instead of base64 (#7724) 2025-07-08 14:00:58 -07:00
Lifu Huang
4474eaf552 Support LoRA in TestOpenAIVisionServer and fix fused kv_proj loading bug. (#6861) 2025-06-04 22:08:30 -07:00
Lianmin Zheng
2d72fc47cf Improve profiler and integrate profiler in bench_one_batch_server (#6787) 2025-05-31 15:53:55 -07:00
Chang Su
4685fbb888 [VLM] Support chunk prefill for VLM (#6355)
Co-authored-by: yizhang2077 <1109276519@qq.com>
2025-05-22 20:32:41 -07:00
fzyzcjy
f11481b921 Add 4-GPU runner tests and split existing tests (#6383) 2025-05-18 11:56:51 -07:00