201 Commits

Author SHA1 Message Date
Chang Su
16adf3dcab [router] fix logger type mismatch (#10774)
Co-authored-by: Simo Lin <linsimo.mark@gmail.com>
2025-09-22 21:02:28 -07:00
Simo Lin
c3a1d7759f [router] remove pd router draining channel (#10767) 2025-09-22 20:49:33 -07:00
Simo Lin
89971c4c3c [router] refactor router and worker management 4/n (#10756)
Co-authored-by: Chang Su <chang.s.su@oracle.com>
2025-09-22 18:35:10 -07:00
Simo Lin
97c3823931 [router] refactor router and worker management 3/n (#10727) 2025-09-22 12:17:50 -07:00
Chang Su
60dbbd086a bugfix: Fix get_worker_urls_for_model in http/router.rs (#10754) 2025-09-22 14:10:31 -04:00
Qiaolin Yu
e2ac7888b8 [2/2] Support deterministic inference for temperature > 0 (#10678)
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
Co-authored-by: hebiao064 <hebiaobuaa@gmail.com>
2025-09-21 19:36:08 -07:00
Jimmy
56321e9fc2 [Router]fix: fix get_load missing api_key (#10385) 2025-09-21 15:28:38 -04:00
Simo Lin
1d1ce62495 [router] refactor router and worker management 2.5/n (#10677) 2025-09-19 20:54:40 -07:00
Chang Su
03ce92e594 router-spec: Reorder ChatCompletionRequest and fix validation logic (#10675) 2025-09-19 16:41:21 -07:00
Simo Lin
00eb5eb721 [router] refactor router and worker management 2/n (#10666) 2025-09-19 12:37:57 -07:00
Simo Lin
36efd5be8a [router] refactor router and worker management 1/n (#10664) 2025-09-19 06:19:57 -07:00
Fabian Gebhart
68cdc1893d [router] preserve order of json params using preserve_order feature (#10661) 2025-09-19 06:15:22 -07:00
Simo Lin
873d858b28 [router] refactor worker to builder pattern 5/n (#10653) 2025-09-19 05:43:23 -04:00
Simo Lin
4f2055ad56 [router] refactor worker to builder pattern 4/n (#10650) 2025-09-18 23:49:10 -07:00
Simo Lin
ac2a723bb3 [router] refactor worker to builder pattern 3/n (#10647) 2025-09-18 22:52:57 -07:00
Simo Lin
780d6a22cd [router] refactor worker to builder pattern 2/n (#10633) 2025-09-18 21:47:56 -07:00
Simo Lin
5291f32d75 [router] refactor worker to builder pattern 1/n (#10628) 2025-09-18 13:25:40 -07:00
ybyang
0abb41c70d adjust import setuptools_rust (#10524) 2025-09-16 11:01:58 -04:00
Chang Su
5fe39e85a2 [router] fix router manager and router init in server (#10499) 2025-09-15 22:23:26 -07:00
Simo Lin
16e9335998 [router] add router db connector for responses api (#10487) 2025-09-15 22:04:56 -07:00
Chang Su
35ef3f2902 [router] fix worker registration in multi model mode (#10486) 2025-09-15 21:05:00 -04:00
Chang Su
2689f0bf02 [router] multi model registration fix (#10481) 2025-09-15 15:22:21 -07:00
Jiayi Yan
57234d0c9c [bugfix] fix typo (#10471) 2025-09-15 07:29:20 -07:00
Chang Su
b93acd7020 [router] minor code clean up in server startup (#10470) 2025-09-15 07:28:25 -07:00
Chang Su
69b35793a0 [router] fix logger ordering git ctx (#10457) 2025-09-14 21:37:21 -07:00
ooapex
957482c8f2 [router] add dependency for router (#10401) 2025-09-14 21:14:14 -07:00
Simo Lin
7eccbe992d [router] fix service discovery and mcp ut (#10449) 2025-09-14 21:07:23 -07:00
Jintao Zhang
f9ee6ae17a [router]: Add Embedding routing logic (#10129)
Signed-off-by: Jintao Zhang <zhangjintao9020@gmail.com>
Co-authored-by: Waël Boukhobza <wawa_wael@live.fr>
2025-09-14 18:44:35 -07:00
Simo Lin
7c5a0a1b77 [router] add not implemented functions for multi model trait (#10394) 2025-09-12 16:44:18 -07:00
Keyang Ru
366043db8e [router] Add get and cancel method for response api (#10387) 2025-09-12 16:19:38 -07:00
Simo Lin
2f173ea074 [router] allow one router to support different model families and serving mode (#10244) 2025-09-12 16:18:27 -07:00
Simo Lin
8c86595c93 [router] enable sccache in ci and local build (#10099) 2025-09-12 09:43:48 -07:00
Frank Fang
4634fd5953 [router] Add Rerank Routing Logic in Regular Router (#10219) 2025-09-12 09:10:18 -07:00
Chang Su
53ca15529a Implement Standalone gRPC Server for SGLang Python Scheduler (#10283) 2025-09-11 20:57:17 -07:00
Keyang Ru
a23bdeaf04 [router] Basic OAI Response api (#10346) 2025-09-11 20:56:17 -07:00
Keyang Ru
7b141f816c [router][ci] Add gpu utilization analyze with nvml (#10345) 2025-09-11 19:26:02 -07:00
Keyang Ru
1ee11df8ac [router][ci] add gpu process check and free port before start server (#10338) 2025-09-11 14:24:16 -07:00
Keyang Ru
dee197e11b [router] Add OpenAI backend support - core function (#10254) 2025-09-11 14:13:51 -07:00
Keyang Ru
480d1b8b20 [router] add benchmark for regular router and pd router (#10280) 2025-09-11 12:04:11 -07:00
Keyang Ru
cda7e47ce7 [router] Add PD router mmlu test (#10256) 2025-09-10 08:47:24 -07:00
Keyang Ru
9eb50ecc9c [router] Improve the router e2e tests (#10102) 2025-09-06 16:19:28 -07:00
Keyang Ru
21b9a4b435 [router] Introduce router integration tests (#10086) 2025-09-05 18:52:53 -07:00
Simo Lin
db37422c92 [router] move to mcp sdk instead (#10057) 2025-09-05 18:03:46 -07:00
Simo Lin
bde73ee43f [router] add rust cache in benchmark ci (#10080) 2025-09-05 09:59:36 -07:00
Keyang Ru
4f0e28d7fc [router] add rust cache for rust unit test (#10079) 2025-09-05 09:58:59 -07:00
Keyang Ru
045ab92dc0 [router] add py binding unit tests to coverage 80% (#10043) 2025-09-05 08:40:21 -07:00
Liangsheng Yin
6e95f5e5bd Simplify Router arguments passing and build it in docker image (#9964) 2025-09-05 12:13:55 +08:00
Simo Lin
bbf261ae4a [router] fix grpc connection mode detection (#9999) 2025-09-03 21:36:16 -07:00
Simo Lin
4f8a982d52 [router] clean up dependency injector to use ctx (#10000) 2025-09-03 21:35:51 -07:00
Simo Lin
d966b902af [router] move tokenizer, reasoning, tool initialization to server (#9996) 2025-09-03 19:35:13 -07:00