272 Commits

Author SHA1 Message Date
Simo Lin
6d6e24bcc4 [router] Add builder pattern for RouterConfig with zero duplication (#12030) 2025-10-23 16:46:10 -07:00
Arthur Cheng
53c2934dce [Router] Consolidate ConnectionMode enum to core module (#11937) 2025-10-23 05:15:49 -07:00
Simo Lin
5dccf69713 [router] create worker removal step and clean up worker manager (#11921) 2025-10-22 13:26:06 -07:00
Keyang Ru
77258ce039 [router] Support multiple worker URLs for OpenAI router (#11723) 2025-10-22 09:27:58 -07:00
Chang Su
590bc4b7a7 [router][grpc] Fix background tasks stored with wrong id (#11945) 2025-10-21 18:38:51 -07:00
Keyang Ru
63cfe1b032 [router] Add gRPC E2E test suite (#11790) 2025-10-21 17:51:21 -07:00
Chang Su
70f6309cd4 [router][grpc] Support v1/responses API (#11926) 2025-10-21 17:41:48 -07:00
Keyang Ru
87a92e459a Fix openai input_text type compatibility (#11935) 2025-10-21 16:10:35 -07:00
Simo Lin
1111030395 [router] clean up workflow logs to debug for implementation details logs (#11886) 2025-10-20 18:24:55 -07:00
Chang Su
e69094df64 [router][grpc] Remove continue_final_message in ChatTemplateParams and add minijinja-contrib (#11882) 2025-10-20 18:03:09 -07:00
Simo Lin
b4948512b8 [router] remove encoding header for oai router (#11881) 2025-10-20 17:39:00 -07:00
Simo Lin
ddcba74b4d [router] Worker Management Workflow Engine (#11868) 2025-10-20 17:00:22 -07:00
ybyang
d513ee93ef [2/2] [feature] support openai like classification api in router (#11670) 2025-10-18 19:31:08 -07:00
Simo Lin
a7ae61ed77 [router] Add Configurable L0 and L1 Tokenizer Caching (#11688) 2025-10-18 18:33:53 -07:00
Chang Su
ca240eefb4 [router][grpc] Support parallel queue puts in grpc_request_manager and remove mutex for grpc_client (#11798) 2025-10-17 20:49:43 -07:00
Chang Su
d1984e218c [router][grpc] Remove timeout for connections and remove max_tokens deprecation warning log (#11775) 2025-10-17 12:36:36 -07:00
Simo Lin
a5978a20f0 [router] fix grpc client time out to 1h (#11768) 2025-10-17 10:26:12 -07:00
Simo Lin
e483c1eae5 [router] Fix UTF-8 Boundary Panic in Stop Sequence Decoder (#11766) 2025-10-17 10:21:00 -07:00
Keyang Ru
7780230a15 Revert "[router] fix get_models endpoint for openai router (#11687)" (#11740) 2025-10-16 18:36:53 -07:00
Chang Su
dc01313da1 [router] Add rustfmt and set group imports by default (#11732) 2025-10-16 17:33:29 -07:00
Chang Su
c7962868c1 [router] Fix tool_choice normalization in ChatCompletionRequest and fix ut (#11731) 2025-10-16 14:20:13 -07:00
Simo Lin
64affab495 [router] fix p and d worker filtering and bootstrap port handling (#11729) 2025-10-16 14:19:39 -07:00
Keyang Ru
4c9bcb9d56 [Router] Refactor protocol definitions: split spec.rs into modular files (#11677) 2025-10-16 13:44:44 -07:00
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Keyang Ru
0975ba99bc [router] fix get_models endpoint for openai router (#11687) 2025-10-16 09:00:08 -07:00
Simo Lin
f5d30dae89 [router] Refactor StopSequenceDecoder to Use Sequence for Incremental Decoding (#11676) 2025-10-15 16:31:03 -07:00
Chang Su
2479b89405 [router][grpc] Simplify model_id determination (#11684) 2025-10-15 15:56:58 -07:00
Keyang Ru
d2478cd4ff [router] Fix response api related spec (#11621) 2025-10-15 09:59:38 -07:00
Simo Lin
40e0082d8d [router] add worker self discovery for metadata (#11638) 2025-10-14 22:07:25 -04:00
Simo Lin
3962e39d7c [router] cleanup app context and move to startup (#11617) 2025-10-14 10:19:28 -07:00
Keyang Ru
eb8cac6fe2 [router] add py binding and readme for openai router and history backend (#11453) 2025-10-14 09:42:34 -07:00
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Simo Lin
a04efc4933 [router] when given both local tokenizer and chat template, log all (#11601) 2025-10-14 02:22:58 -07:00
Simo Lin
28ad2297a0 [router] delete useless table content comment in spec (#11597) 2025-10-14 01:08:18 -07:00
Simo Lin
4b62af92ef [router] change worker api to async instead of sync (#11566) 2025-10-14 00:32:21 -07:00
Simo Lin
0b9915c132 [router] update generate spec to align with sgl io struct (#11591) 2025-10-14 02:51:33 -04:00
Chang Su
27ef1459e6 [router][protocols] Add Axum validate extractor and use it for /v1/chat/completions endpoint (#11588) 2025-10-13 22:51:15 -07:00
Chang Su
887c2b4575 [router][grpc] Add serve_grpc to launch_server and log id for HealthCheck (#11564) 2025-10-13 16:07:19 -07:00
Chang Su
4b694e7d5a [router][grpc] Add error handling to generate_tool_constraints (#11562) 2025-10-13 12:26:09 -07:00
Jonah Bernard
f4aa78801e [router] Add Rust CLI flags for queue size, timeout, and rate limit for token bucket rate limiter (#11483) 2025-10-13 11:08:48 -07:00
Co-authored-by: Simo Lin <linsimo.mark@gmail.com>
Simo Lin
728af88781 [router] allow user to specify chat template path (#11549) 2025-10-13 10:47:57 -07:00
Chang Su
7b59b0b8b0 [router][grpc] Further delegate non-stream processing to processing.rs (#11553) 2025-10-13 10:36:27 -07:00
Simo Lin
7c94eaeeb0 [router] allow tokenizer path to be dir (#11530) 2025-10-13 09:30:09 -04:00
Keyang Ru
63e84352b7 [router] openai router: support grok model (#11511) 2025-10-12 22:44:43 -04:00
Antoine Roux
ec1cd90ac9 Fix the GPT function calling regex to allow dash in the name (#10577) 2025-10-12 20:34:58 +08:00
Wenyi Xu
9b5efe3464 [Router]: Small Typo in a comment within tree.rs (#11489) 2025-10-11 21:59:48 -07:00
fzyzcjy
d957177a22 Super tiny delete unused openai router in sgl-router (#11448) 2025-10-11 15:59:30 +08:00
Chang Su
92777135a0 [router][grpc] Consolidate parser checks for chat completions (#11439) 2025-10-10 20:44:29 -04:00
Simo Lin
c495833186 [router] leverage RAII to actively cancel request during client disconnect (#11399) 2025-10-10 20:43:38 -04:00
Simo Lin
2eeb27515a [router] disable rate limiter by default (#11435) 2025-10-10 20:43:07 -04:00
Keyang Ru
eb7d9261c0 [router] conversation item API: create, retrieve and delete (#11369) 2025-10-09 17:43:16 -04:00
Simo Lin
88bb627d0d [router] change grpc client from mutable to clone (#11394) 2025-10-09 11:00:24 -07:00