Lianmin Zheng | f68dd998b9 | Rename customer label -> custom label (#10899); Co-authored-by: Yingchun Lai <laiyingchun@apache.org>; Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> | 2025-09-25 16:19:53 -07:00
Xinyuan Tong | 71f24ef8f6 | feat: add cache_salt support to request (#10718); Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com> | 2025-09-23 23:30:25 -07:00
harrisonlimh | 14fdd52740 | feat: add priority based scheduling with priority based request acceptance and preemption (#8746) | 2025-09-16 17:10:10 -07:00
Yingchun Lai | fc2c3a3d8e | metrics: support customer labels specified in request header (#10143) | 2025-09-14 20:00:08 -07:00
Lianmin Zheng | 033b75f559 | [Auto Sync] Update serving_base.py, serving_chat.py, servin... (20250910) (#10282); Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>; Co-authored-by: cctry <shiyang@x.ai> | 2025-09-10 16:58:59 -07:00
Lianmin Zheng | 60e37f8028 | Move parsers under a single folder (#9912) | 2025-09-02 18:25:04 -07:00
cicirori | b6c14ec0b4 | add response_format support for completion API (#9665) | 2025-08-26 15:01:29 -07:00
gongwei-130 | 0cf3fbeb18 | should return invalide request for empty prompt (#9315) | 2025-08-18 11:44:11 -07:00
Chengxing Xie | c1c7dc4534 | feat: Add model version tracking with API endpoints and response metadata (#8795) | 2025-08-14 12:13:46 -07:00
ybyang | 03c039c48e | [OAI] patch origin request_id logic (#7508) | 2025-06-24 20:09:38 -07:00
Chang Su | 72676cd6c0 | feat(oai refactor): Replace openai_api with entrypoints/openai (#7351); Co-authored-by: Jin Pan <jpan236@wisc.edu> | 2025-06-21 13:21:06 -07:00
Keyang Ru | 5e7fdc79fa | [OAI Server Refactor] [ChatCompletions & Completions] Support Return Hidden State (#7329); Signed-off-by: keru <rukeyang@gmail.com> | 2025-06-20 19:18:53 -07:00
yhyang201 | dea2b84bc3 | [OAI Server Refactor] [ChatCompletions & Completions] Implement UsageInfo Processor (#7360); Co-authored-by: Chang Su <chang.s.su@oracle.com> | 2025-06-20 14:51:21 -07:00
Xinyuan Tong | 0998808009 | Refine OpenAI serving entrypoint to remove batch requests (#7372); Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>; Co-authored-by: Chang Su <csu272@usc.edu> | 2025-06-20 14:33:43 -07:00
Xinyuan Tong | 70c471a868 | [Refactor] OAI Server components (#7167); Signed-off-by: Xinyuan Tong <justinning0323@outlook.com> | 2025-06-16 20:45:20 -07:00