sglang

Author	SHA1	Message	Date
Lianmin Zheng	14b6493087	Delete the useless test/srt/test_throughput.py (#1045 )	2024-08-11 21:31:52 -07:00
Lianmin Zheng	8207637029	Improve end-to-end throughput test and its coverage (#1039 )	2024-08-11 18:27:33 -07:00
Lianmin Zheng	d84c5e70f7	Test the case when max_new_tokens is very large (#1038 )	2024-08-11 16:41:03 -07:00
Lianmin Zheng	54fb1c80c0	Clean up unit tests (#1020 )	2024-08-10 15:09:03 -07:00
Ying Sheng	b68c4c073b	fix: force max new tokens to be 1 for embedding request (#1019 )	2024-08-10 13:46:42 -07:00
Ying Sheng	7599badeaf	Support embedding input as a list (#1014 )	2024-08-10 08:39:05 -07:00
gryffindor-rr	9cf0a5bada	Add skip_tokenizer_init args. (#959 ) Co-authored-by: lzhang <zhanglei@modelbest.cn>	2024-08-09 12:14:13 -07:00
Ying Sheng	b16e856f11	Add openai embedding API (#997 )	2024-08-09 11:19:18 -07:00
Juwan Yoo	10bca45bc6	bugfix: penalizers to be merged before reqs (#1001 )	2024-08-09 21:46:24 +10:00
liuyhwangyh	b91a4cb1b1	support models from www.modelscope.cn (#994 ) Co-authored-by: mulin.lyh <mulin.lyh@taobao.com>	2024-08-09 02:52:14 -07:00
Juwan Yoo	95a28019ba	test: negative value testing for frequency, presence penalizers (#995 )	2024-08-08 23:30:50 -07:00
Ying Sheng	e040a2450b	Add e5-mistral embedding model - step 3/3 (#988 )	2024-08-08 16:31:19 -07:00
Juwan Yoo	ab7875941b	feat: frequency, min_new_tokens, presence, and repetition penalties (#973 )	2024-08-08 04:21:08 -07:00
yichuan~	3a79613c28	support more option about usage in stream mode (#985 ) Co-authored-by: Ying Sheng <sqy1415@gmail.com>	2024-08-08 09:41:57 +00:00
Yineng Zhang	c31f084c71	chore: update vllm to 0.5.4 (#966 )	2024-08-07 21:15:41 +10:00
yichuan~	5f6fa04a3f	misc: simplify test (#964 )	2024-08-07 01:23:27 -07:00
yichuan~	795eab6dda	Add support for Batch API test (#936 )	2024-08-06 23:52:10 -07:00
Aidan Cooper	94e0115186	Feat: add alternative choices selection methods (#835 )	2024-08-05 03:27:49 -07:00
yichuan~	fd7926e46e	Fix prompt len in parallel sampling (#928 )	2024-08-05 00:56:08 -07:00
Ying Sheng	0a4f5f9bea	Test regex in vision api (#926 )	2024-08-04 22:52:41 -07:00
Ying Sheng	3bc99e6fe4	Test openai vision api (#925 )	2024-08-05 13:51:55 +10:00
yichuan~	d53dcf9c98	Support more OpenAI API test (#916 )	2024-08-04 16:43:09 -07:00
Liangsheng Yin	bb66cc4c52	Fix CI && python3.8 compatible (#920 )	2024-08-04 16:02:05 -07:00
Ying Sheng	0d4f3a9fcd	Make API Key OpenAI-compatible (#917 )	2024-08-04 13:35:44 -07:00
Ying Sheng	995af5a54b	Improve the structure of CI (#911 )	2024-08-03 23:09:21 -07:00
Ying Sheng	70cc0749ce	Add model accuracy test - step 1 (#866 )	2024-08-03 18:20:50 -07:00
Yineng Zhang	2e218b9e04	fix: set env in runner (#891 )	2024-08-02 20:48:56 +10:00
Ying Sheng	ae7ee01a8e	Add accuracy test to CI: MMLU (#882 )	2024-08-01 21:20:17 -07:00
Ying Sheng	60340a3643	Improve the coverage of the openai api server test (#878 )	2024-08-01 16:01:30 -07:00
Ying Sheng	72b6ea88b4	Make scripts under `/test/srt` as unit tests (#875 )	2024-08-01 14:34:55 -07:00
Ying Sheng	6f221d4ca0	Fix unit tests for the frontend language part (#872 )	2024-08-01 12:39:12 -07:00
Ying Sheng	4075677621	Add OpenAI backend to the CI test (#869 )	2024-08-01 09:25:24 -07:00
Lianmin Zheng	30db99b3d9	Rename prefill_token_logprobs -> input_token_logprobs; decode_token_logprobs -> output_token_logprobs (#776 )	2024-07-27 19:50:34 -07:00
Ying Sheng	51fda1439f	Update Readme (#660 ) Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>	2024-07-19 09:54:01 -07:00
Ying Sheng	dc1b8bcfaa	Format (#593 )	2024-07-05 10:06:17 -07:00
Lianmin Zheng	eb1ae6ae0c	Add sglang.bench_latency for offline benchmark (#564 )	2024-06-25 03:38:04 -07:00
Liangsheng Yin	05471f2103	Update test_flashinfer (#560 )	2024-06-24 15:23:57 +08:00
Lianmin Zheng	1fa15099d8	Add LlamaForClassification (#559 )	2024-06-22 00:49:31 -07:00
Ying Sheng	fb9296f0ed	Higher priority for user input of max_prefill_tokens & format (#540 )	2024-06-12 21:48:40 -07:00
胡译文	87260b7bfd	Litellm Backend (#502 )	2024-06-07 12:24:28 -07:00
Ying Sheng	0463f7fb52	Support data parallelism (static) (#480 ) Co-authored-by: Ying Sheng <ying.sheng@databricks.com> Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com> Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>	2024-05-27 21:24:10 -07:00
Ying Sheng	3e684be7a3	Fix openai speculative execution (#456 )	2024-05-20 17:01:13 -07:00
Liangsheng Yin	690d162d97	Format code (#441 )	2024-05-14 22:40:46 +08:00
Lianmin Zheng	5dc55a5f02	Handle truncation errors (#436 )	2024-05-13 15:56:00 -07:00
Lianmin Zheng	6e09cf6a15	Misc fixes (#432 )	2024-05-12 15:05:40 -07:00
Lianmin Zheng	aee4f523cf	Fix logit processor bugs (#427 )	2024-05-12 04:54:07 -07:00
Lianmin Zheng	7023f413c6	Clean up (#422 )	2024-05-11 20:55:00 -07:00
Liangsheng Yin	19818b9c2f	Minor: style improvement of radix_cache and memory_pool (#395 )	2024-04-26 01:01:36 +08:00
Liangsheng Yin	150d7020ed	Revert removing the unused imports (#385 )	2024-04-23 22:36:33 +08:00
Liangsheng Yin	9acc6e3504	add `.isort.cfg` (#378 )	2024-04-22 22:38:09 +08:00

78 Commits