129 Commits

Author SHA1 Message Date
Lianmin Zheng
e2b2f0a213 Support oai in benchmark/mmlu (#323) 2024-03-22 13:37:57 -07:00
Jani Monoses
b57abe1663 Add StableLM model. (#301) 2024-03-22 13:24:08 -07:00
Jani Monoses
e57f079275 Use Anthropic messages API (#304) 2024-03-22 13:23:31 -07:00
Li Bo
08df63a6f8 [Fix/Potential Bugs] Can not correctly import models in python/sglang/srt/models (#311) 2024-03-22 12:19:58 -07:00
ZhouGongZaiShi
77835756a7 Fix outlines-0.0.35 incompatibility (#291) 2024-03-22 12:19:11 -07:00
Co-authored-by: ZX <zx@lbx.dev>
Liurl
ed31579971 Fix marlin model loading compat with autogptq (#290) 2024-03-13 13:15:43 +08:00
Co-authored-by: LRL <lrl@lbx.dev>
Qubitium
92e2d74fd0 Fix env (docker) compat due to __file__ usage (#288) 2024-03-13 13:02:48 +08:00
Enrique Shockwave
d9b3b01883 enable marlin kernels (#286) 2024-03-12 22:10:12 -04:00
Qubitium
ad1dd74673 Fix flashinfer >= 0.0.3 compat (#282) 2024-03-12 21:45:58 +08:00
Qubitium
b2eb080501 Fix Runtime missing some ServerArgs options (#281) 2024-03-11 22:32:15 +08:00
Lianmin Zheng
4aa5dd2c5f Update version to v0.1.13 (#280) 2024-03-11 05:49:27 -07:00
Lianmin Zheng
13662fd533 Fix RuntimeEndpoint (#279) 2024-03-11 05:24:24 -07:00
Alessio Dalla Piazza
d5ae2ebaa2 Add Support for API Key Authentication (#230) 2024-03-11 05:16:10 -07:00
Liangsheng Yin
1b35547927 Organize server_args (#277) 2024-03-11 20:06:52 +08:00
Lianmin Zheng
faba293a0d Improve gemma and documentations (#278) 2024-03-11 04:43:39 -07:00
Liangsheng Yin
89885b31ef Gemma Support (#256) 2024-03-11 12:14:27 +08:00
Geary.Z
64fe311593 replace skip_embed with input_embeds (#222) 2024-03-10 19:04:52 -07:00
Liangsheng Yin
a7ace9c88d Fix qwen config (#261) 2024-03-10 18:54:18 -07:00
Lin Tianchuan
30d67b2bca Add set_var to interpreter.py (#263) 2024-03-07 23:20:11 +08:00
Xinwei Xiong
b0b722ee8e Refactor ChatTemplate for Enhanced Clarity and Efficiency (#201) 2024-03-03 17:52:36 +08:00
Srinivas Billa
01b07ea3ac Add SSL Cert Functionality (#224) 2024-03-03 17:41:41 +08:00
Liangsheng Yin
dfb13ac455 Fix addr reuse in check_port (#253) 2024-03-03 17:09:16 +08:00
Enrique Shockwave
9759d927cf fix chatml template (#195) 2024-02-24 16:34:22 +08:00
Zhang Wenbin
8d0a7fae3b Fix interpreter.py get_var(var_name) in text iter when stream is not enabled (#198) 2024-02-24 16:27:34 +08:00
Liangsheng Yin
c4e9ebe3a4 Fix stop str merging (#225) 2024-02-24 16:05:21 +08:00
Co-authored-by: Enrique Shockwave <33002121+qeternity@users.noreply.github.com>
Cody Yu
3c2c5869ad Support outlines > 0.0.31 (#219) 2024-02-24 15:06:17 +08:00
Cody Yu
4cb9aaedf3 Fix logprobs with logprob_start_len (#193) 2024-02-22 10:33:03 -08:00
psych0v0yager
9de9a46815 Added the ability to Modify the Context Length (#210) 2024-02-20 16:22:56 -08:00
Liangsheng Yin
91e036334f Adjust outlines version. (#200) 2024-02-17 13:40:39 +08:00
Co-authored-by: comaniac <hao.yu.cody@gmail.com>
Cody Yu
2a74748b2f Pin outlines version (#196) 2024-02-16 13:01:40 -08:00
Cody Yu
63ba630bbb Refactor decoding logprob and add completion_tokens_wo_jump_forward (#189) 2024-02-15 10:54:20 -08:00
Lianmin Zheng
6493256b7d improve print 2024-02-12 12:43:48 +00:00
Lianmin Zheng
06008bc295 Fix server launch for jupyter notebook (#186) 2024-02-12 04:43:14 -08:00
Lianmin Zheng
bb824da41a Add Together and AzureOpenAI examples (#184) 2024-02-12 01:06:38 -08:00
Yaya Sy
931213245c correct reference dtype openai.py (#181) 2024-02-11 13:26:20 -08:00
Lianmin Zheng
624b21e742 Update version to 0.1.12 (#178) 2024-02-11 06:43:45 -08:00
Lianmin Zheng
c51020cf0c Fix the chat template for llava-v1.6-34b & format code (#177) 2024-02-11 05:50:13 -08:00
Cody Yu
50afed4eaa Support extra field regex in OpenAI API (#172) 2024-02-10 17:21:33 -08:00
Cody Yu
4d303c4fa3 Fix token usage with jump forward (#174) 2024-02-09 20:06:15 -08:00
Liangsheng Yin
37b42297f8 import outlines (#168) 2024-02-09 10:13:02 +08:00
Cody Yu
cba5027332 Fix BaseCache metric (#170) 2024-02-08 17:23:09 -08:00
Ying Sheng
a6aa46dd3f minor 2024-02-08 04:35:25 +00:00
Srinivas Billa
405f26b00b Add Auth Token to RuntimeEndpoint (#162) 2024-02-07 20:07:31 -08:00
Liangsheng Yin
b1a3a454ee add --disable-disk-cache (#160) 2024-02-08 00:50:12 +08:00
Co-authored-by: Ja1Zhou <50169346+Ja1Zhou@users.noreply.github.com>
Cody Yu
26c3494152 [Submodule] Change FlashInfer to import (#156) 2024-02-06 19:28:29 -08:00
Lianmin Zheng
23f05005fd Format code & move functions (#155) 2024-02-06 13:27:46 -08:00
Cody Yu
a7334aeea1 Support decode token logprobs (#130) 2024-02-06 12:24:55 -08:00
Arcmoon
3ae78a09b3 Add gptq quantization model support (#141) 2024-02-06 11:35:04 -08:00
Cody Yu
ccbe1e67d8 Temporary fix OpenAI API for Pydantic v1/v2 (#153) 2024-02-06 11:34:15 -08:00
LiviaSun
e2bf732bc3 add openai error handler with retry and logger (#148) 2024-02-05 20:38:41 -08:00