Chang Su
|
963175d5c0
|
[router][grpc] Support streaming for v1/chat/completions (#11179)
|
2025-10-02 14:35:16 -07:00 |
|
Liangsheng Yin
|
7ff740a6ce
|
Remove dp balance metadata and minimum token balance. (#11170)
|
2025-10-03 01:48:15 +08:00 |
|
Chang Su
|
b658be6f6a
|
[router][grpc] Support tool call parser in streaming (#11160)
|
2025-10-02 03:18:50 -07:00 |
|
Keyang Ru
|
a28b394fba
|
[router] Add multi-turn tool calling loop support for MCP integration (#11143)
|
2025-10-01 12:50:21 -07:00 |
|
Keyang Ru
|
7fb551a75d
|
[router] add mcp list and mcp call in output array (#11112)
|
2025-09-30 21:41:54 -04:00 |
|
Chang Su
|
8ce830a8b0
|
[router][bugfix] Fix input_logprobs handling with None value and logprob_start_len = -1 (#11113)
|
2025-09-30 16:09:40 -07:00 |
|
Chang Su
|
d1676cd483
|
[router][tool call] Full support for ToolChoice (#11085)
Co-authored-by: Simo Lin <linsimo.mark@gmail.com>
|
2025-09-29 22:36:03 -07:00 |
|
Simo Lin
|
33b3c0f85f
|
[router] grpc router generate endpoint support (#11070)
Co-authored-by: Chang Su <chang.s.su@oracle.com>
|
2025-09-29 22:07:53 -07:00 |
|
Chang Su
|
5937a56d47
|
[router][grpc] Add logprobs support to router (#11082)
|
2025-09-29 15:55:06 -07:00 |
|
Chang Su
|
f065e5bea5
|
[router] Use get_pooled in process_single_choice (#11079)
|
2025-09-29 15:48:00 -07:00 |
|
Chang Su
|
4eeaff74a0
|
[router][tool call] Separate JsonParser and LlamaParser (#11073)
|
2025-09-29 10:26:37 -07:00 |
|
Simo Lin
|
816b3a433a
|
[router] add n to generate sampling params (#11069)
|
2025-09-29 07:37:43 -07:00 |
|
Chang Su
|
af4ab65606
|
[router][tool call] Improve normal content extraction and error handling (non-stream) (#11050)
|
2025-09-29 00:19:30 -07:00 |
|
Simo Lin
|
2572886367
|
[router] add harmony tool parser base structure and interface (#11036)
|
2025-09-28 19:46:38 -07:00 |
|
Chang Su
|
dba751a896
|
[router][tool call] Support normal content extraction before tool call (streaming) (#11038)
|
2025-09-28 19:46:06 -07:00 |
|
Simo Lin
|
336e9a6058
|
[router] migrate to rust python module for pythonic parser (#11033)
|
2025-09-28 14:48:59 -04:00 |
|
Yuxuan Zhang
|
abb6781573
|
Update GLM-4.5 Model Doc (#11017)
|
2025-09-28 11:21:27 -07:00 |
|
Simo Lin
|
5519766a4d
|
[router] fix chat template loading and tokenizer path (#10999)
|
2025-09-27 23:54:12 -04:00 |
|
Keyang Ru
|
72392f2908
|
[router] basic mcp support for openai router response api (#10978)
|
2025-09-27 21:49:33 -04:00 |
|
Chang Su
|
c1c8dd1dd0
|
[router][tool parser] Modify tool parser to return both normal text and tool calls (non-stream) (#10995)
|
2025-09-27 18:10:17 -04:00 |
|
Chang Su
|
37f3325b06
|
[router][grpc] Support E2E non-stream chat completions (#10980)
|
2025-09-26 22:02:06 -07:00 |
|
Chang Su
|
0c3db88978
|
[router][grpc] Add helper functions for decoder in router.rs and fix specs (#10971)
|
2025-09-26 20:10:45 -04:00 |
|
Simo Lin
|
aae7ead2d0
|
[router] remove old/outdated/useless comments across code base (#10968)
|
2025-09-26 10:48:50 -07:00 |
|
Simo Lin
|
a7fe6e10a1
|
[router] remove old/outdated/useless comments (#10967)
|
2025-09-26 09:45:15 -07:00 |
|
Simo Lin
|
be059b83d6
|
[router] grpc router regular mode import cleanup (#10963)
|
2025-09-26 04:06:59 -07:00 |
|
Simo Lin
|
5d4fe1ceee
|
[router] add move grpc worker management from router to worker manager (#10960)
|
2025-09-26 03:57:57 -07:00 |
|
Simo Lin
|
1b011e68dc
|
[router] move grpc client from router to worker and builder (#10958)
|
2025-09-26 03:13:47 -07:00 |
|
Simo Lin
|
1e57b9472d
|
[router] add grpc client get and set (#10955)
|
2025-09-26 03:07:05 -07:00 |
|
Chang Su
|
37158f2018
|
router: Support parallel sampling num > 1 in grpc_server and non-stream handling (#10929)
|
2025-09-25 20:03:35 -07:00 |
|
Chang Su
|
7dcd689b47
|
[router][refactor] Clean up protobuf fields (#10923)
|
2025-09-25 17:48:47 -07:00 |
|
Chang Su
|
5e21d6aec0
|
refactor: Move grpc/client.rs to grpc_client/sglang_scheduler.rs (#10924)
|
2025-09-25 17:21:22 -04:00 |
|
Chang Su
|
916784746b
|
router: Fix constraint proto and build_constraint in grpc router (#10881)
|
2025-09-25 11:12:06 -04:00 |
|
Simo Lin
|
d511b2d905
|
[router] consolidate worker load monitoring (#10894)
|
2025-09-25 09:59:30 -04:00 |
|
Simo Lin
|
458c0219a6
|
[router] simplify tokenizer dev doc (#10895)
|
2025-09-24 22:15:56 -07:00 |
|
Keyang Ru
|
a73eb8cd20
|
[router] Support Oracle DB(ATP) Data Connector (#10845)
|
2025-09-24 23:59:32 -04:00 |
|
Simo Lin
|
e738703547
|
[router] consolidate worker get loads (#10880)
|
2025-09-24 22:13:31 -04:00 |
|
Simo Lin
|
7a06ef984d
|
[router] consolidate health endpoints and flush cache (#10876)
|
2025-09-24 15:23:21 -07:00 |
|
Chang Su
|
4a87ba217f
|
router-grpc: Add tools processing and other parameters for apply_chat_template (#10877)
|
2025-09-24 15:23:06 -07:00 |
|
luna
|
c3faf2d6e6
|
[router] select first healthy worker on proxied get requests (#10827)
|
2025-09-24 11:45:41 -07:00 |
|
Chang Su
|
9209b209be
|
router-grpc: Support jinja chat template content format detection (#10832)
|
2025-09-24 11:45:01 -07:00 |
|
Chang Su
|
ee704e6265
|
[router] add auth middleware for api key auth (#10826)
|
2025-09-23 16:07:34 -07:00 |
|
Keyang Ru
|
f4e3ebeb05
|
[router] Support streaming for Openai Router Response api (#10822)
|
2025-09-23 14:56:28 -07:00 |
|
Chang Su
|
08b8c0c3cd
|
[router] fix axum default body limit (#10818)
|
2025-09-23 12:44:17 -07:00 |
|
Chang Su
|
7ff93e613f
|
router(grpc): Implement route for chat_cmpl endpoint (#10761)
|
2025-09-23 11:26:33 -07:00 |
|
Simo Lin
|
b24b2e7ed7
|
[router] use dashmap for radix tree instead of hash for multi model (#10814)
|
2025-09-23 11:25:53 -07:00 |
|
Simo Lin
|
98c3b04ff2
|
[router] responses api POST and GET with local storage (#10581)
Co-authored-by: key4ng <rukeyang@gmail.com>
|
2025-09-23 09:12:02 -07:00 |
|
Simo Lin
|
ddab4fc7c7
|
[router] fix cache aware routing strategy and lock contention (#10773)
|
2025-09-23 08:53:49 -07:00 |
|
Simo Lin
|
c3a1d7759f
|
[router] remove pd router draining channel (#10767)
|
2025-09-22 20:49:33 -07:00 |
|
Simo Lin
|
89971c4c3c
|
[router] refactor router and worker management 4/n (#10756)
Co-authored-by: Chang Su <chang.s.su@oracle.com>
|
2025-09-22 18:35:10 -07:00 |
|
Simo Lin
|
97c3823931
|
[router] refactor router and worker management 3/n (#10727)
|
2025-09-22 12:17:50 -07:00 |
|