Commit Graph

240 Commits

Author SHA1 Message Date
Simo Lin
4b62af92ef [router] change worker api to async instead of sync (#11566) 2025-10-14 00:32:21 -07:00
Simo Lin
0b9915c132 [router] update generate spec to align with sgl io struct (#11591) 2025-10-14 02:51:33 -04:00
Chang Su
27ef1459e6 [router][protocols] Add Axum validate extractor and use it for /v1/chat/completions endpoint (#11588) 2025-10-13 22:51:15 -07:00
Chang Su
887c2b4575 [router][grpc] Add serve_grpc to launch_server and log id for HealthCheck (#11564) 2025-10-13 16:07:19 -07:00
Chang Su
4b694e7d5a [router][grpc] Add error handling to generate_tool_constraints (#11562) 2025-10-13 12:26:09 -07:00
Jonah Bernard
f4aa78801e [router] Add Rust CLI flags for queue size, timeout, and rate limit for token bucket rate limiter (#11483)
Co-authored-by: Simo Lin <linsimo.mark@gmail.com>
2025-10-13 11:08:48 -07:00
Simo Lin
728af88781 [router] allow user to specify chat template path (#11549) 2025-10-13 10:47:57 -07:00
Chang Su
7b59b0b8b0 [router][grpc] Further delegate non-stream processing to processing.rs (#11553) 2025-10-13 10:36:27 -07:00
Simo Lin
7c94eaeeb0 [router] allow tokenizer path to be dir (#11530) 2025-10-13 09:30:09 -04:00
Keyang Ru
63e84352b7 [router] openai router: support grok model (#11511) 2025-10-12 22:44:43 -04:00
Antoine Roux
ec1cd90ac9 Fix the GPT function calling regex to allow dash in the name (#10577) 2025-10-12 20:34:58 +08:00
Wenyi Xu
9b5efe3464 [Router]: Small Typo in a comment within tree.rs (#11489) 2025-10-11 21:59:48 -07:00
fzyzcjy
d957177a22 Super tiny delete unused openai router in sgl-router (#11448) 2025-10-11 15:59:30 +08:00
Chang Su
92777135a0 [router][grpc] Consolidate parser checks for chat completions (#11439) 2025-10-10 20:44:29 -04:00
Simo Lin
c495833186 [router] leverage RAII to actively cancel request during client disconnect (#11399) 2025-10-10 20:43:38 -04:00
Simo Lin
2eeb27515a [router] disable rate limiter by default (#11435) 2025-10-10 20:43:07 -04:00
Keyang Ru
eb7d9261c0 [router] conversation item API: create, retrieve and delete (#11369) 2025-10-09 17:43:16 -04:00
Simo Lin
88bb627d0d [router] change grpc client from mutable to clone (#11394) 2025-10-09 11:00:24 -07:00
Chang Su
ab926dd697 [router][grpc] Fix streaming bugs: empty tool names, state pollution, and panics (#11373) 2025-10-09 06:53:23 -04:00
Chang Su
a0557642ea [router][lint] Add unused_qualifications to cargo lint warnings (#11366) 2025-10-08 22:17:11 -07:00
Keyang Ru
84768d1017 [router] Refactor OpenAI router: split monolithic file and move location (#11359) 2025-10-09 00:46:39 -04:00
Simo Lin
368fd20622 [router][grpc] disable health check generation and increase timeout (#11353) 2025-10-08 19:23:08 -07:00
Chang Su
fccac7d126 [router][grpc] Add dependencies in Cargo.toml to support chat template rendering (#11342) 2025-10-08 15:38:37 -07:00
Keyang Ru
7ac6b900f4 [router] Support history management using conversation (#11339) 2025-10-08 15:24:02 -07:00
Chang Su
a1080b72a0 [router] Fix all unused_qualifications (#11341) 2025-10-08 13:55:27 -07:00
Chang Su
a65ca73911 [router][grpc] Cleanup debug logs in grpc_server and grpc_router (#11340) 2025-10-08 13:26:19 -07:00
Simo Lin
677aa0e25f [router] improve reasoning parser lock and reduce req cloning (#11336) 2025-10-08 11:18:15 -07:00
Simo Lin
01c9ee1ab4 [router] refactor generate to use new pipeline arch (#11323) 2025-10-08 09:38:50 -07:00
Chang Su
edd86b8853 [router][grpc] Refactor chat handler in grpc/ to use centralized orchestrator (#11314)
Co-authored-by: Simo Lin <linsimo.mark@gmail.com>
2025-10-07 20:50:20 -07:00
Simo Lin
fde9b96392 [router] cleanup worker health check to return early (#11310) 2025-10-07 16:53:10 -07:00
Keyang Ru
4ed67c27e3 [router] support Openai router conversation API CRUD (#11297) 2025-10-07 15:31:35 -07:00
Chang Su
420c99acfe [router][grpc] Fix error message format in grpc chat handler (#11307) 2025-10-07 13:54:02 -07:00
Simo Lin
f4affd4df5 [router] fix grpc connection conversion and add optimization (#11305) 2025-10-07 10:39:33 -07:00
Chang Su
64582caa84 [router][grpc] Refactor chat template content format detection (#11288) 2025-10-07 08:38:51 -07:00
Simo Lin
2fcd56eaf6 [router] add get server info and get model info in grpc server (#11303) 2025-10-07 08:36:52 -07:00
Simo Lin
79d3495177 [router] add reasoning and tool parser argument in router (#11290) 2025-10-07 09:08:32 -04:00
Chang Su
a578d300ba [router][grpc] Fix proto3 default value mismatches and cleanup unused fields (#11283) 2025-10-06 18:54:51 -07:00
Chang Su
b07c9c76c5 [router][grpc] Refine streaming processes (#11277) 2025-10-06 15:15:01 -07:00
Chang Su
466992b2d0 [router][tool call] Clean up redundant detect_format and has_tool_markers (#11270) 2025-10-06 14:04:02 -07:00
Simo Lin
5ee777c98f [router] add ipv6 support across all components (#11219) 2025-10-06 08:16:59 -07:00
Simo Lin
d736e0b65e [router] add grpc router pd mode for chat and generate (#11140) 2025-10-04 06:58:28 -07:00
Simo Lin
ffd03a9bd3 [router] fix get load response parsing (#11213) 2025-10-04 06:58:02 -07:00
Keyang Ru
34151f173b [router] Streaming support for MCP Tool Calls in OpenAI Router (#11173) 2025-10-03 00:19:43 -07:00
Chang Su
963175d5c0 [router][grpc] Support streaming for v1/chat/completions (#11179) 2025-10-02 14:35:16 -07:00
Liangsheng Yin
7ff740a6ce Remove dp balance metadata and minimum token balance. (#11170) 2025-10-03 01:48:15 +08:00
Chang Su
b658be6f6a [router][grpc] Support tool call parser in streaming (#11160) 2025-10-02 03:18:50 -07:00
Keyang Ru
a28b394fba [router] Add multi-turn tool calling loop support for MCP integration (#11143) 2025-10-01 12:50:21 -07:00
Keyang Ru
7fb551a75d [router] add mcp list and mcp call in output array (#11112) 2025-09-30 21:41:54 -04:00
Chang Su
8ce830a8b0 [router][bugfix] Fix input_logprobs handling with None value and logprob_start_len = -1 (#11113) 2025-09-30 16:09:40 -07:00
Chang Su
d1676cd483 [router][tool call] Full support for ToolChoice (#11085)
Co-authored-by: Simo Lin <linsimo.mark@gmail.com>
2025-09-29 22:36:03 -07:00