sglang

Author	SHA1	Message	Date
Yineng Zhang	dd7e8b9421	chore: add copyright for srt (#790)	2024-07-28 23:07:12 +10:00
Yineng Zhang	1f013d64eb	docs: make badges center (#789)	2024-07-28 22:27:52 +10:00
Yineng Zhang	628e1fa760	docs: update README (#788)	2024-07-28 22:24:27 +10:00
Ying Sheng	c71880f896	Vectorize logprobs computation (#787)	2024-07-28 05:22:14 -07:00
Ying Sheng	bcb6611a46	Update README.md	2024-07-28 01:00:06 -07:00
Yineng Zhang	fa2aa0db0a	docs: update index (#786)	2024-07-28 17:22:00 +10:00
Yineng Zhang	6a387a69cc	fix: exclude logo png in gitignore (#785)	2024-07-28 17:08:16 +10:00
Yineng Zhang	27f5ce0a6c	fix: init readthedocs support (#784)	2024-07-28 16:55:54 +10:00
Yineng Zhang	948625799e	docs: init readthedocs support (#783)	2024-07-28 16:50:31 +10:00
Yineng Zhang	68e5262699	fix: replace pillow with PIL in PACKAGE_LIST (#781)	2024-07-28 14:06:24 +10:00
Lianmin Zheng	bc1154c399	Bump version to 0.2.6 (#779)	2024-07-27 20:29:33 -07:00
Lianmin Zheng	752e643007	Allow disabling flashinfer sampling kernel (#778)	2024-07-27 20:18:56 -07:00
Lianmin Zheng	30db99b3d9	Rename prefill_token_logprobs -> input_token_logprobs; decode_token_logprobs -> output_token_logprobs (#776)	2024-07-27 19:50:34 -07:00
Lianmin Zheng	0a409bd438	Fix return_log_probs with cuda graph (#775)	2024-07-27 19:15:09 -07:00
Mingyi	e4db4e5ba5	minor refactor: move check server args to server_args.py (#774)	2024-07-27 19:03:40 -07:00
Lianmin Zheng	bbc07c4197	Move sampling logits to float32 (#773)	2024-07-27 17:30:12 -07:00
Lianmin Zheng	a036d41980	Fix max new tokens (#772)	2024-07-27 17:22:18 -07:00
Lianmin Zheng	f95e661757	Fix max_tokens for OpenAI chat completion API (#766)	2024-07-27 15:44:27 -07:00
Yineng Zhang	de854fb5c5	feat: add fake tag (#770)	2024-07-28 02:22:22 +10:00
Lianmin Zheng	f64b2a9bc0	Add slack invitation link.	2024-07-27 06:29:15 -07:00
Ying Sheng	9f95dcc64f	Update readme (#769) Co-authored-by: Mingyi <wisclmy0611@gmail.com>	2024-07-27 06:12:16 -07:00
Lianmin Zheng	0736b27020	[Minor] Improve the code style in TokenizerManager (#767)	2024-07-27 05:05:15 -07:00
Ke Bao	3fdab91912	Fix TransformerTokenizer init for chatglm2 & 3 (#761)	2024-07-27 02:44:46 -07:00
Liangsheng Yin	ba29504b21	Update supported models (#763)	2024-07-27 15:53:53 +10:00
Yineng Zhang	a72342f180	fix: not run workflows on fork repo (#762)	2024-07-27 14:51:33 +10:00
Yineng Zhang	c3c74bf874	docs: update model support (#760)	2024-07-27 14:07:37 +10:00
Liangsheng Yin	d9fccfefe2	Fix context length (#757)	2024-07-26 18:13:13 -07:00
Liangsheng Yin	679ebcbbdc	Deepseek v2 support (#693)	2024-07-26 17:10:07 -07:00
Yineng Zhang	5bd06b4599	fix: use REPO_TOKEN (#755)	2024-07-27 05:56:30 +10:00
Yineng Zhang	9a61182732	fix: add release tag workflow (#754)	2024-07-27 05:48:38 +10:00
Yineng Zhang	eeb2482186	feat: add release tag workflow (#753)	2024-07-27 05:37:02 +10:00
Yineng Zhang	3e455b016e	misc: replace deprecated variable HUGGING_FACE_HUB_TOKEN with HF_TOKEN (#752)	2024-07-27 04:19:30 +10:00
Yineng Zhang	8628ab9c8b	feat: add docker workflow (#751)	2024-07-27 03:54:51 +10:00
Yineng Zhang	1b77670f39	chore: bump v0.2.1 (#740)	2024-07-26 21:27:41 +10:00
Yineng Zhang	768e05d08f	fix benchmark (#743) Co-authored-by: hnyls2002 <hnyls2002@gmail.com> Co-authored-by: Ying Sheng <sqy1415@gmail.com>	2024-07-26 21:26:13 +10:00
Yineng Zhang	01fbb11bb7	docs: fix typo (#742)	2024-07-26 21:05:53 +10:00
Yineng Zhang	05d216da32	docs: add llama 3.1 405b instruction (#739) Co-authored-by: Ying1123 <sqy1415@gmail.com>	2024-07-26 21:03:20 +10:00
Yineng Zhang	6b32bb1c0b	misc: format (#741)	2024-07-26 21:00:51 +10:00
Toshiki Kataoka	40facad5f1	feat: support token ids in /v1/completions (#736)	2024-07-26 02:53:17 -07:00
Toshiki Kataoka	da504445dc	fix /generate without sampling_params (#734)	2024-07-26 01:27:56 -07:00
Ying Sheng	252e0f7bbd	fix: small bug for llama-405b fp16 (#733)	2024-07-25 21:14:54 -07:00
Ying Sheng	7f6f2f0f09	Update readme (#731)	2024-07-25 09:16:10 -07:00
Ying Sheng	7802df1e2b	Update readme	2024-07-25 08:45:06 -07:00
Ying Sheng	1a491d00cb	Bump version to 0.2.0 (#730)	2024-07-25 08:03:36 -07:00
Ying Sheng	8fbba3de3d	Fix bugs (fp8 checkpoints, triton cache manager) (#729)	2024-07-25 07:42:00 -07:00
Ying Sheng	ae0f6130cb	Revert "fix: fp8 config" (#728)	2024-07-25 07:25:33 -07:00
Yineng Zhang	6010589783	misc: update bug issue template (#727)	2024-07-25 20:52:37 +10:00
Yineng Zhang	926ac01b64	fix: resolve the logo display issue on the PyPI page (#726)	2024-07-25 20:47:46 +10:00
Yineng Zhang	25c881a005	chore: bump v0.1.25 (#725)	2024-07-25 20:04:35 +10:00
Liangsheng Yin	04ec6ba2ac	Fix dockerfile and triton cache manager (#720)	2024-07-25 03:04:21 -07:00

4977 Commits