Commit Graph

44 Commits

Author SHA1 Message Date
Keyang Ru
77258ce039 [router] Support multiple worker URLs for OpenAI router (#11723) 2025-10-22 09:27:58 -07:00
Keyang Ru
63cfe1b032 [router] Add gRPC E2E test suite (#11790) 2025-10-21 17:51:21 -07:00
Chang Su
70f6309cd4 [router][grpc] Support v1/responses API (#11926) 2025-10-21 17:41:48 -07:00
Simo Lin
ddcba74b4d [router] Worker Management Workflow Engine (#11868) 2025-10-20 17:00:22 -07:00
ybyang
d513ee93ef [2/2] [feature] support openai like classification api in router (#11670) 2025-10-18 19:31:08 -07:00
Chang Su
d1984e218c [router][grpc] Remove timeout for connections and remove max_tokens deprecation warning log (#11775) 2025-10-17 12:36:36 -07:00
Chang Su
dc01313da1 [router] Add rustfmt and set group imports by default (#11732) 2025-10-16 17:33:29 -07:00
Chang Su
c7962868c1 [router] Fix tool_choice normalization in ChatCompletionRequest and fix ut (#11731) 2025-10-16 14:20:13 -07:00
Keyang Ru
4c9bcb9d56 [Router] Refactor protocol definitions: split spec.rs into modular files (#11677)
Co-authored-by: Chang Su <chang.s.su@oracle.com>
2025-10-16 13:44:44 -07:00
Keyang Ru
d2478cd4ff [router] Fix response api related spec (#11621) 2025-10-15 09:59:38 -07:00
Simo Lin
28ad2297a0 [router] delete useless table content comment in spec (#11597) 2025-10-14 01:08:18 -07:00
Simo Lin
4b62af92ef [router] change worker api to async instead of sync (#11566) 2025-10-14 00:32:21 -07:00
Simo Lin
0b9915c132 [router] update generate spec to align with sgl io struct (#11591) 2025-10-14 02:51:33 -04:00
Chang Su
27ef1459e6 [router][protocols] Add Axum validate extractor and use it for /v1/chat/completions endpoint (#11588) 2025-10-13 22:51:15 -07:00
Keyang Ru
63e84352b7 [router] openai router: support grok model (#11511) 2025-10-12 22:44:43 -04:00
Keyang Ru
7ac6b900f4 [router] Support history management using conversation (#11339) 2025-10-08 15:24:02 -07:00
Simo Lin
01c9ee1ab4 [router] refactor generate to use new pipeline arch (#11323) 2025-10-08 09:38:50 -07:00
Chang Su
edd86b8853 [router][grpc] Refactor chat handler in grpc/ to use centralized orchestrator (#11314)
Co-authored-by: Simo Lin <linsimo.mark@gmail.com>
2025-10-07 20:50:20 -07:00
Simo Lin
d736e0b65e [router] add grpc router pd mode for chat and generate (#11140) 2025-10-04 06:58:28 -07:00
Chang Su
963175d5c0 [router][grpc] Support streaming for v1/chat/completions (#11179) 2025-10-02 14:35:16 -07:00
Keyang Ru
a28b394fba [router] Add multi-turn tool calling loop support for MCP integration (#11143) 2025-10-01 12:50:21 -07:00
Keyang Ru
7fb551a75d [router] add mcp list and mcp call in output array (#11112) 2025-09-30 21:41:54 -04:00
Chang Su
d1676cd483 [router][tool call] Full support for ToolChoice (#11085)
Co-authored-by: Simo Lin <linsimo.mark@gmail.com>
2025-09-29 22:36:03 -07:00
Simo Lin
816b3a433a [router] add n to generate sampling params (#11069) 2025-09-29 07:37:43 -07:00
Keyang Ru
72392f2908 [router] basic mcp support for openai router response api (#10978) 2025-09-27 21:49:33 -04:00
Chang Su
37f3325b06 [router][grpc] Support E2E non-stream chat completions (#10980) 2025-09-26 22:02:06 -07:00
Chang Su
0c3db88978 [router][grpc] Add helpfer functions for decoder in router.rs and fix specs (#10971) 2025-09-26 20:10:45 -04:00
Simo Lin
aae7ead2d0 [router] remove old/oudated/useless comments across code base (#10968) 2025-09-26 10:48:50 -07:00
Simo Lin
e738703547 [router] consolidate worker get loads (#10880) 2025-09-24 22:13:31 -04:00
Simo Lin
7a06ef984d [router] consolidate health endpoints and flush cache (#10876) 2025-09-24 15:23:21 -07:00
Simo Lin
98c3b04ff2 [router] responses api POST and GET with local storage (#10581)
Co-authored-by: key4ng <rukeyang@gmail.com>
2025-09-23 09:12:02 -07:00
Qiaolin Yu
e2ac7888b8 [2/2] Support deterministic inference for temperature > 0 (#10678)
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
Co-authored-by: hebiao064 <hebiaobuaa@gmail.com>
2025-09-21 19:36:08 -07:00
Jimmy
56321e9fc2 [Router]fix: fix get_load missing api_key (#10385) 2025-09-21 15:28:38 -04:00
Chang Su
03ce92e594 router-spec: Reorder ChatCompletionRequest and fix validation logic (#10675) 2025-09-19 16:41:21 -07:00
Jintao Zhang
f9ee6ae17a [router]: Add Embedding routing logic (#10129)
Signed-off-by: Jintao Zhang <zhangjintao9020@gmail.com>
Co-authored-by: Waël Boukhobza <wawa_wael@live.fr>
2025-09-14 18:44:35 -07:00
Simo Lin
2f173ea074 [router] allow one router to support different model families and serving mode (#10244) 2025-09-12 16:18:27 -07:00
Frank Fang
4634fd5953 [router] Add Rerank Routing Logic in Regular Router (#10219) 2025-09-12 09:10:18 -07:00
Tony Lu
5e19b159b0 [router] add chat_template_kwargs in ChatCompletionRequest (#9958)
Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
2025-09-03 10:43:52 -07:00
Frank Fang
788b19a532 [router] Add Rerank API Specification (#9906) 2025-09-03 08:30:29 -07:00
Bruce-x-1997
21e1bc475c [router] fix FunctionCallResponse proto, support arguments is null (#9875)
Co-authored-by: forestlee95 <forestlee95@foxmail.com>
2025-09-01 20:37:15 -07:00
Keyang Ru
5ef545e678 [router] Move all protocols to spec.rs file (#9519) 2025-08-22 14:18:47 -07:00
Keyang Ru
5ae5ecaa15 [router] Implement OpenAI Responses API specification (#9367) 2025-08-19 20:14:47 -07:00
Keyang Ru
c5057262fa [Router] Add validation module for API parameters (#9335) 2025-08-19 13:25:53 -07:00
Keyang Ru
ce67b2d586 [router]restructure protocol modules for better organization (#9321) 2025-08-19 01:07:58 +00:00