Chang Su
|
37158f2018
|
router: Support parallel sampling num > 1 in grpc_server and non-stream handling (#10929)
|
2025-09-25 20:03:35 -07:00 |
|
Chang Su
|
7dcd689b47
|
[router][refactor] Clean up protobuf fields (#10923)
|
2025-09-25 17:48:47 -07:00 |
|
Simo Lin
|
f7bab41a29
|
[router] change log level to warning (#10926)
|
2025-09-25 17:32:59 -07:00 |
|
Chang Su
|
5e21d6aec0
|
refactor: Move grpc/client.rs to grpc_client/sglang_scheduler.rs (#10924)
|
2025-09-25 17:21:22 -04:00 |
|
Chang Su
|
916784746b
|
router: Fix constraint proto and build_constraint in grpc router (#10881)
|
2025-09-25 11:12:06 -04:00 |
|
Simo Lin
|
d511b2d905
|
[router] consolidate worker load monitoring (#10894)
|
2025-09-25 09:59:30 -04:00 |
|
Simo Lin
|
458c0219a6
|
[router] simplify tokenizer dev doc (#10895)
|
2025-09-24 22:15:56 -07:00 |
|
Keyang Ru
|
a73eb8cd20
|
[router] Support Oracle DB(ATP) Data Connector (#10845)
|
2025-09-24 23:59:32 -04:00 |
|
Simo Lin
|
e738703547
|
[router] consolidate worker get loads (#10880)
|
2025-09-24 22:13:31 -04:00 |
|
Simo Lin
|
7a06ef984d
|
[router] consolidate health endpoints and flush cache (#10876)
|
2025-09-24 15:23:21 -07:00 |
|
Chang Su
|
4a87ba217f
|
router-grpc: Add tools processing and other parameters for apply_chat_template (#10877)
|
2025-09-24 15:23:06 -07:00 |
|
luna
|
c3faf2d6e6
|
[router] select first healthy worker on proxied get requests (#10827)
|
2025-09-24 11:45:41 -07:00 |
|
Chang Su
|
9209b209be
|
router-grpc: Support jinja chat template content format detection (#10832)
|
2025-09-24 11:45:01 -07:00 |
|
Chang Su
|
ee704e6265
|
[router] add auth middleware for api key auth (#10826)
|
2025-09-23 16:07:34 -07:00 |
|
Keyang Ru
|
f4e3ebeb05
|
[router] Support streaming for Openai Router Response api (#10822)
|
2025-09-23 14:56:28 -07:00 |
|
Chang Su
|
08b8c0c3cd
|
[router] fix axum default body limit (#10818)
|
2025-09-23 12:44:17 -07:00 |
|
Chang Su
|
7ff93e613f
|
router(grpc): Implement route for chat_cmpl endpoint (#10761)
|
2025-09-23 11:26:33 -07:00 |
|
Simo Lin
|
b24b2e7ed7
|
[router] use dashmap for radix tree instead of hash for multi model (#10814)
|
2025-09-23 11:25:53 -07:00 |
|
Simo Lin
|
98c3b04ff2
|
[router] responses api POST and GET with local storage (#10581)
Co-authored-by: key4ng <rukeyang@gmail.com>
|
2025-09-23 09:12:02 -07:00 |
|
Simo Lin
|
ddab4fc7c7
|
[router] fix cache aware routing strategy and lock contention (#10773)
|
2025-09-23 08:53:49 -07:00 |
|
Chang Su
|
16adf3dcab
|
[router] fix logger type mismatch (#10774)
Co-authored-by: Simo Lin <linsimo.mark@gmail.com>
|
2025-09-22 21:02:28 -07:00 |
|
Simo Lin
|
c3a1d7759f
|
[router] remove pd router draining channel (#10767)
|
2025-09-22 20:49:33 -07:00 |
|
Simo Lin
|
89971c4c3c
|
[router] refactor router and worker management 4/n (#10756)
Co-authored-by: Chang Su <chang.s.su@oracle.com>
|
2025-09-22 18:35:10 -07:00 |
|
Simo Lin
|
97c3823931
|
[router] refactor router and worker management 3/n (#10727)
|
2025-09-22 12:17:50 -07:00 |
|
Chang Su
|
60dbbd086a
|
bugfix: Fix get_worker_urls_for_model in http/router.rs (#10754)
|
2025-09-22 14:10:31 -04:00 |
|
Qiaolin Yu
|
e2ac7888b8
|
[2/2] Support deterministic inference for temperature > 0 (#10678)
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
Co-authored-by: hebiao064 <hebiaobuaa@gmail.com>
|
2025-09-21 19:36:08 -07:00 |
|
Jimmy
|
56321e9fc2
|
[Router]fix: fix get_load missing api_key (#10385)
|
2025-09-21 15:28:38 -04:00 |
|
Simo Lin
|
1d1ce62495
|
[router] refactor router and worker management 2.5/n (#10677)
|
2025-09-19 20:54:40 -07:00 |
|
Chang Su
|
03ce92e594
|
router-spec: Reorder ChatCompletionRequest and fix validation logic (#10675)
|
2025-09-19 16:41:21 -07:00 |
|
Simo Lin
|
00eb5eb721
|
[router] refactor router and worker management 2/n (#10666)
|
2025-09-19 12:37:57 -07:00 |
|
Simo Lin
|
36efd5be8a
|
[router] refactor router and worker management 1/n (#10664)
|
2025-09-19 06:19:57 -07:00 |
|
Fabian Gebhart
|
68cdc1893d
|
[router] preserve order of json params using preserve_order feature (#10661)
|
2025-09-19 06:15:22 -07:00 |
|
Simo Lin
|
873d858b28
|
[router] refactor worker to builder pattern 5/n (#10653)
|
2025-09-19 05:43:23 -04:00 |
|
Simo Lin
|
4f2055ad56
|
[router] refactor worker to builder pattern 4/n (#10650)
|
2025-09-18 23:49:10 -07:00 |
|
Simo Lin
|
ac2a723bb3
|
[router] refactor worker to builder pattern 3/n (#10647)
|
2025-09-18 22:52:57 -07:00 |
|
Simo Lin
|
780d6a22cd
|
[router] refactor worker to builder pattern 2/n (#10633)
|
2025-09-18 21:47:56 -07:00 |
|
Simo Lin
|
5291f32d75
|
[router] refactor worker to builder pattern 1/n (#10628)
|
2025-09-18 13:25:40 -07:00 |
|
ybyang
|
0abb41c70d
|
adjust import setuptools_rust (#10524)
|
2025-09-16 11:01:58 -04:00 |
|
Chang Su
|
5fe39e85a2
|
[router] fix router manager and router init in server (#10499)
|
2025-09-15 22:23:26 -07:00 |
|
Simo Lin
|
16e9335998
|
[router] add router db connector for responses api (#10487)
|
2025-09-15 22:04:56 -07:00 |
|
Chang Su
|
35ef3f2902
|
[router] fix worker registration in multi model mode (#10486)
|
2025-09-15 21:05:00 -04:00 |
|
Chang Su
|
2689f0bf02
|
[router] multi model registration fix (#10481)
|
2025-09-15 15:22:21 -07:00 |
|
Jiayi Yan
|
57234d0c9c
|
[bugfix] fix typo (#10471)
|
2025-09-15 07:29:20 -07:00 |
|
Chang Su
|
b93acd7020
|
[router] minor code clean up in server startup (#10470)
|
2025-09-15 07:28:25 -07:00 |
|
Chang Su
|
69b35793a0
|
[router] fix logger ordering git ctx (#10457)
|
2025-09-14 21:37:21 -07:00 |
|
ooapex
|
957482c8f2
|
[router] add dependency for router (#10401)
|
2025-09-14 21:14:14 -07:00 |
|
Simo Lin
|
7eccbe992d
|
[router] fix service discovery and mcp ut (#10449)
|
2025-09-14 21:07:23 -07:00 |
|
Jintao Zhang
|
f9ee6ae17a
|
[router]: Add Embedding routing logic (#10129)
Signed-off-by: Jintao Zhang <zhangjintao9020@gmail.com>
Co-authored-by: Waël Boukhobza <wawa_wael@live.fr>
|
2025-09-14 18:44:35 -07:00 |
|
Simo Lin
|
7c5a0a1b77
|
[router] add not implemented functions for multi model trait (#10394)
|
2025-09-12 16:44:18 -07:00 |
|
Keyang Ru
|
366043db8e
|
[router] Add get and cancel method for response api (#10387)
|
2025-09-12 16:19:38 -07:00 |
|