34 Commits

Author SHA1 Message Date
Liangsheng Yin
bb66cc4c52 Fix CI && python3.8 compatible (#920) 2024-08-04 16:02:05 -07:00
yichuan~
ca600e8cd6 Add support for logprobs in OpenAI chat API (#852) 2024-08-01 00:08:21 -07:00
yichuan~
bb0501c0d9 Fix List input bug (#838) 2024-07-30 13:40:51 -07:00
yichuan~
084fa54d37 Add support for OpenAI API : offline batch(file) processing (#699)
Co-authored-by: hnyls2002 <hnyls2002@gmail.com>
2024-07-29 13:07:18 -07:00
Lianmin Zheng
30db99b3d9 Rename prefill_token_logprobs -> input_token_logprobs; decode_token_logprobs -> output_token_logprobs (#776) 2024-07-27 19:50:34 -07:00
Ying Sheng
2b4c646277 Update version to 0.1.22 (#677) 2024-07-20 03:39:50 -07:00
yichuan~
49c5e0eca9 Add support for OpenAI API parallel sampling (#640) 2024-07-19 23:10:01 -07:00
zhyncs
2e341cd493 misc: add pre-commit config (#637) 2024-07-17 11:55:39 -07:00
胡译文
02b7258658 [Feat] Expose logprob options to sgl.gen API (#503)
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
2024-07-09 00:35:39 -07:00
Liangsheng Yin
9c902b1954 Decode Incrementally (#517) 2024-06-11 23:39:12 -07:00
Fabian Preiß
b6667a53b9 Fix RAG nb, parea setup (parea -> parea-ai) (#525) 2024-06-11 16:36:43 -07:00
Yuanhan Zhang
7d1ebc2d71 update the script: examples/usage/llava_video/srt_example_llava_v.sh (#491) 2024-05-31 23:31:56 -07:00
Li Bo
2b605ab1d7 [Feat/Fix] Refactoring Llava models into single file (#475) 2024-05-26 12:29:51 -07:00
Liangsheng Yin
f06e90c2cf Optimize retract (#440) 2024-05-26 00:07:26 +08:00
bing
3167d8dabc fix test bug in srt_llava_next_test.py (#470) 2024-05-24 03:38:01 -07:00
Lianmin Zheng
ced77c6626 Rename api_num_spec_tokens -> num_api_spec_tokens (#458) 2024-05-20 18:44:23 -07:00
Ying Sheng
3e684be7a3 Fix openai speculative execution (#456) 2024-05-20 17:01:13 -07:00
LiviaSun
ec380dfd30 openai chat speculative execution (#250)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
2024-05-18 22:23:53 -07:00
Kaichen Zhang - NTU
664287b2a7 [Feat] Add llava qwen, llava mistral (#419)
Co-authored-by: Bo Li <drluodian@gmail.com>
2024-05-13 22:17:50 -07:00
Yuanhan Zhang
0992d85f92 support llava video (#426) 2024-05-13 16:57:00 -07:00
Lianmin Zheng
5dc55a5f02 Handle truncation errors (#436) 2024-05-13 15:56:00 -07:00
Joschka Braun
5c5aba5900 Adding RAG tracing & eval cookbook using Parea (#390) 2024-04-30 16:13:28 -07:00
Liangsheng Yin
3842eba5fa Logprobs Refractor (#331) 2024-03-28 14:34:49 +08:00
Arsalan
745ea007ac Fix Incorrect CURL Request Example in README (#287) 2024-03-12 22:09:38 -04:00
Arsalan
eb4308c4c9 adding the triton docker build minimal example (#242) 2024-03-12 00:16:06 -07:00
Liangsheng Yin
37b42297f8 import outlines (#168) 2024-02-09 10:13:02 +08:00
Lianmin Zheng
97aa9b3284 Improve docs & Add JSON decode example (#121) 2024-01-30 05:45:27 -08:00
Lianmin Zheng
0617528632 Update quick start examples (#120) 2024-01-30 04:29:32 -08:00
parasol-aser
23950056f0 support speculative execution for openai API (#48)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
2024-01-25 01:57:06 -08:00
Lianmin Zheng
9a16fea012 Return logprob for choices (#87) 2024-01-23 05:07:30 -08:00
Ying Sheng
3f5c2f4c4a Add an async example (#37) 2024-01-21 15:17:30 -08:00
Lianmin Zheng
b240f75100 Add a parallel sampling case (#34) 2024-01-18 06:29:43 +00:00
Lianmin Zheng
c4707f1bb5 Improve docs (#17) 2024-01-16 19:53:55 -08:00
Lianmin Zheng
46b7ea7c85 Improve Readme (#10) 2024-01-16 05:53:06 +00:00