sglang

Author	SHA1	Message	Date
Yudi Xue	5bc2508b80	Monitoring documentation (#1933 )	2024-11-07 22:14:16 -08:00
Chayenne	c77c1e05ba	fix black in pre-commit (#1940 )	2024-11-08 07:42:47 +08:00
Xuehai Pan	a5e0defb5a	minor: Add basic editorconfig and pre-commit hooks to enforce style for whitespaces (#1926 )	2024-11-06 13:46:04 +00:00
Chayenne	02755768d3	Change judge to classify & Modify make file (#1920 )	2024-11-04 23:53:44 -08:00
Lianmin Zheng	0abbf289a8	Unify the model type checking (#1905 )	2024-11-03 12:25:39 -08:00
Byron Hsu	6fcd6d7d6d	Support token ids in `engine.generate` (#1820 )	2024-10-27 14:02:34 -07:00
Yineng Zhang	8bee20f80b	Update vllm to 0.6.3 (#1711 ) (#1720 ) Co-authored-by: Ke Bao <ISPObaoke@163.com>	2024-10-19 20:45:41 -07:00
Byron Hsu	862cd265e5	[engine] support async and streaming (#1614 )	2024-10-11 15:26:25 -07:00
Byron Hsu	551a3a9d38	Provide an offline engine API (#1567 )	2024-10-06 20:27:03 -07:00
Byron Hsu	2422de5193	Support min_tokens in sgl.gen (#1573 )	2024-10-05 21:51:12 -07:00
FredericOdermatt	2432ad40c6	[Minifix] Remove extra space in cot example (#1569 )	2024-10-04 01:16:53 -07:00
Lianmin Zheng	114bbc8651	Use ipc instead of tcp in zmq (#1566 )	2024-10-04 00:45:52 -07:00
Theresa Barton	2c7d0a5b8b	[Fix] Fix all the Huggingface paths (#1553 )	2024-10-02 10:12:07 -07:00
Ying Sheng	0f4fb19bc8	[Fix, LoRA] fix LoRA with updates in main (#1545 )	2024-09-30 10:06:08 -07:00
Lianmin Zheng	4e4459b91f	Multiple minor fixes (#1530 )	2024-09-28 14:43:35 -07:00
Ying Sheng	9aa6553d2a	[Feature] Support reward model LxzGordon/URM-LLaMa-3.1-8B (#1525 )	2024-09-27 23:32:11 -07:00
Ying Sheng	e4780cf839	[API, Feature] Support response prefill for openai API (#1490 )	2024-09-22 06:46:17 -07:00
Li Bo	446ea33277	fix: creat new dict everytime for putting new frame (#1464 )	2024-09-19 01:31:48 -07:00
Ying Sheng	37963394aa	[Feature] Support LoRA path renaming and add LoRA serving benchmarks (#1433 )	2024-09-15 12:46:04 -07:00
Lianmin Zheng	e4d68afcf0	[Minor] Many cleanup (#1357 )	2024-09-09 04:14:11 -07:00
Kaichen Zhang - NTU	662ecd9368	[Feat] Add modalities for vision server when handling pixel values for llava (#1346 )	2024-09-09 02:07:34 -07:00
Lianmin Zheng	f64eae3a29	[Fix] Reduce memory usage for loading llava model & Remove EntryClassRemapping (#1308 )	2024-09-02 21:44:45 -07:00
Kai-Hsun Chen	0836055324	[Chore] Rename model_overide_args to model_override_args (#1284 ) Signed-off-by: Kai-Hsun Chen <kaihsun@anyscale.com> Co-authored-by: Yineng Zhang <me@zhyncs.com>	2024-09-01 03:14:56 -07:00
Lianmin Zheng	0a97d7962d	[Fix] Fix OOM in llava base class (#1249 )	2024-08-28 08:45:49 -07:00
Lianmin Zheng	bf53bf5142	[Fix] Fix llava on multi images (#1247 )	2024-08-28 06:33:05 -07:00
yichuan~	5ff25cdf5b	[Minor] add delete test and delete tmp file on ci server (#1227 )	2024-08-26 22:04:52 -07:00
Kaichen Zhang - NTU	66e7dcaf70	[Fix] Fixing the multi-images error for llava-onevision (#1205 )	2024-08-25 10:28:23 -07:00
Lianmin Zheng	f6af3a6561	Cleanup readme, llava examples, usage examples and nccl init (#1194 )	2024-08-24 08:02:23 -07:00
Kaichen Zhang - NTU	a5b14ad043	[Feat/WIP] add llava-onevision, with support for (1) siglip encoder, (2) qwen2 decoder (3) openai api compatible server. (#1123 ) Co-authored-by: Bo Li <drluodian@gmail.com>	2024-08-23 14:11:16 -07:00
Liangsheng Yin	83e23c69b3	Improve code style of sampler (#1168 )	2024-08-21 16:48:24 -07:00
Liangsheng Yin	bb66cc4c52	Fix CI && python3.8 compatible (#920 )	2024-08-04 16:02:05 -07:00
yichuan~	ca600e8cd6	Add support for logprobs in OpenAI chat API (#852 )	2024-08-01 00:08:21 -07:00
yichuan~	bb0501c0d9	Fix List input bug (#838 )	2024-07-30 13:40:51 -07:00
yichuan~	084fa54d37	Add support for OpenAI API : offline batch(file) processing (#699 ) Co-authored-by: hnyls2002 <hnyls2002@gmail.com>	2024-07-29 13:07:18 -07:00
Lianmin Zheng	30db99b3d9	Rename prefill_token_logprobs -> input_token_logprobs; decode_token_logprobs -> output_token_logprobs (#776 )	2024-07-27 19:50:34 -07:00
Ying Sheng	2b4c646277	Update version to 0.1.22 (#677 )	2024-07-20 03:39:50 -07:00
yichuan~	49c5e0eca9	Add support for OpenAI API parallel sampling (#640 )	2024-07-19 23:10:01 -07:00
zhyncs	2e341cd493	misc: add pre-commit config (#637 )	2024-07-17 11:55:39 -07:00
胡译文	02b7258658	[Feat] Expose logprob options to `sgl.gen` API (#503 ) Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>	2024-07-09 00:35:39 -07:00
Liangsheng Yin	9c902b1954	Decode Incrementally (#517 )	2024-06-11 23:39:12 -07:00
Fabian Preiß	b6667a53b9	Fix RAG nb, parea setup (parea -> parea-ai) (#525 )	2024-06-11 16:36:43 -07:00
Yuanhan Zhang	7d1ebc2d71	update the script: examples/usage/llava_video/srt_example_llava_v.sh (#491 )	2024-05-31 23:31:56 -07:00
Li Bo	2b605ab1d7	[Feat/Fix] Refactoring Llava models into single file (#475 )	2024-05-26 12:29:51 -07:00
Liangsheng Yin	f06e90c2cf	Optimize retract (#440 )	2024-05-26 00:07:26 +08:00
bing	3167d8dabc	fix test bug in srt_llava_next_test.py (#470 )	2024-05-24 03:38:01 -07:00
Lianmin Zheng	19d2135cb8	Use model loader from vllm (#459 )	2024-05-21 09:13:37 -07:00
Lianmin Zheng	ced77c6626	Rename api_num_spec_tokens -> num_api_spec_tokens (#458 )	2024-05-20 18:44:23 -07:00
Ying Sheng	3e684be7a3	Fix openai speculative execution (#456 )	2024-05-20 17:01:13 -07:00
LiviaSun	ec380dfd30	openai chat speculative execution (#250 ) Co-authored-by: Ying Sheng <sqy1415@gmail.com>	2024-05-18 22:23:53 -07:00
Kaichen Zhang - NTU	664287b2a7	[Feat] Add llava qwen, llava mistral (#419 ) Co-authored-by: Bo Li <drluodian@gmail.com>	2024-05-13 22:17:50 -07:00

80 Commits