Commit Graph

165 Commits

Author SHA1 Message Date
Keyang Ru
1ee11df8ac [router][ci] add gpu process check and free port before start server (#10338) 2025-09-11 14:24:16 -07:00
Keyang Ru
dee197e11b [router] Add OpenAI backend support - core function (#10254) 2025-09-11 14:13:51 -07:00
Keyang Ru
480d1b8b20 [router] add benchmark for regular router and pd router (#10280) 2025-09-11 12:04:11 -07:00
Keyang Ru
cda7e47ce7 [router] Add PD router mmlu test (#10256) 2025-09-10 08:47:24 -07:00
Keyang Ru
9eb50ecc9c [router] Improve the router e2e tests (#10102) 2025-09-06 16:19:28 -07:00
Keyang Ru
21b9a4b435 [router] Introduce router integration tests (#10086) 2025-09-05 18:52:53 -07:00
Simo Lin
db37422c92 [router] move to mcp sdk instead (#10057) 2025-09-05 18:03:46 -07:00
Simo Lin
bde73ee43f [router] add rust cache in benchmark ci (#10080) 2025-09-05 09:59:36 -07:00
Keyang Ru
4f0e28d7fc [router] add rust cache for rust unit test (#10079) 2025-09-05 09:58:59 -07:00
Keyang Ru
045ab92dc0 [router] add py binding unit tests to coverage 80% (#10043) 2025-09-05 08:40:21 -07:00
Liangsheng Yin
6e95f5e5bd Simplify Router arguments passing and build it in docker image (#9964) 2025-09-05 12:13:55 +08:00
Simo Lin
bbf261ae4a [router] fix grpc connection mode detection (#9999) 2025-09-03 21:36:16 -07:00
Simo Lin
4f8a982d52 [router] clean up dependency injector to use ctx (#10000) 2025-09-03 21:35:51 -07:00
Simo Lin
d966b902af [router] move tokenizer, reasoning, tool initialization to server (#9996) 2025-09-03 19:35:13 -07:00
Tony Lu
5e19b159b0 [router] add chat_template_kwargs in ChatCompletionRequest (#9958) 2025-09-03 10:43:52 -07:00
Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Frank Fang
788b19a532 [router] Add Rerank API Specification (#9906) 2025-09-03 08:30:29 -07:00
Chang Su
11dcabc545 Grpc client (#9939) 2025-09-02 11:47:35 -07:00
Simo Lin
9491d6e554 [router] include rust benchamrks (#9932) 2025-09-02 09:32:09 -07:00
Bruce-x-1997
21e1bc475c [router] fix FunctionCallResponse proto, support arguments is null (#9875) 2025-09-01 20:37:15 -07:00
Co-authored-by: forestlee95 <forestlee95@foxmail.com>
Chang Su
9a0cac1be0 [router] add grpc pd and regular router init (#9893) 2025-09-01 20:06:15 -07:00
LukasBluebaum
9d9fa9a537 [router] Fix short timeout for the prefill client (#9803) 2025-09-01 19:57:04 -07:00
Chang Su
598c0bc19d [router] add tokenizer download support from hf hub (#9882) 2025-09-01 10:40:37 -07:00
Chang Su
c112bcc461 [router] global tool parser registry (#9840) 2025-08-30 23:35:39 -07:00
Chang Su
fd5ce576a4 Tool parser.benchmark (#9835) 2025-08-30 21:08:11 -07:00
Simo Lin
92d79646e5 [router] add reasoning parser readme (#9837) 2025-08-30 21:06:23 -07:00
Simo Lin
5343058875 [router] grpc router bootstraps (#9759) 2025-08-28 12:07:06 -07:00
Simo Lin
07c9d8fba2 [router] add llama3.2 multi json streaming parser (#9735) 2025-08-28 05:57:13 -07:00
Simo Lin
e1f7cf57dc [router] additional llama32 parser unit test and multi json support (#9732) 2025-08-27 20:34:11 -07:00
Simo Lin
2bb9d454b5 [router] additional pythonic parser unit test (#9730) 2025-08-27 19:55:59 -07:00
Keyang Ru
3f2d0cefcd [router] Add MCP Tool Handler (#9615) 2025-08-27 19:12:39 -07:00
Bruce-x-1997
8b30bec265 [router] fix error response in pd_router (#9505) 2025-08-27 19:10:55 -07:00
Co-authored-by: bruce.xu <bruce.xu@gmicloud.ai>
Simo Lin
07ee0ab750 [router] add gpt-oss and glm4 tool parser (#9703) 2025-08-27 11:26:00 -07:00
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Simo Lin
5c06dcb75a [router] add kimi-k2 tool parser (#9702) 2025-08-27 11:04:55 -07:00
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Simo Lin
6f6beca49d [router] add step3 tool parser (#9695) 2025-08-27 10:44:52 -07:00
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Simo Lin
6e4e1c8cdc [router] add deepseek tool parser (#9694) 2025-08-27 06:18:24 -07:00
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Simo Lin
9768c50d90 [router] restructure tool parser module folder (#9693) 2025-08-27 06:05:53 -07:00
Chang Su
90313fb09a [router] add token bucket rate limiter (#9656) 2025-08-26 10:36:26 -07:00
Simo Lin
3578eb1e9b [router] address worker load tracking consistency (#9523) 2025-08-26 06:40:51 -07:00
Co-authored-by: fzyzcjy <5236035+fzyzcjy@users.noreply.github.com>
Stefan He
cbc0e4d779 Fix lint for router (#9636) 2025-08-26 00:38:53 -07:00
Simo Lin
e2e378caba [router] add ut for mistral, llama, pythonic, and streaming tool parser (#9632) 2025-08-25 22:02:15 -07:00
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Simo Lin
dc1decc6af [router] add llama tool parser (#9629) 2025-08-25 20:43:36 -07:00
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Simo Lin
03680f33be [router] add pythonic parser (#9628) 2025-08-25 20:40:06 -07:00
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Simo Lin
d4c5e53401 [router] add qwen tool parser (#9623) 2025-08-25 20:32:05 -07:00
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Simo Lin
817c62a077 [router] add mistral tool parser (#9622) 2025-08-25 20:09:51 -07:00
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Bruce-x-1997
3aec3d4f8b [Doc] add LWS(LeaderWorkerSet) use case in sgl-router README (#9568) 2025-08-25 08:32:31 -07:00
Co-authored-by: bruce.xu <bruce.xu@gmicloud.ai>
Bruce-x-1997
9e169ea8b5 [router] add right rustls dependency in sgl-router cargo.toml (#9498) 2025-08-24 09:03:15 -07:00
Co-authored-by: bruce.xu <bruce.xu@gmicloud.ai>
Bruce-x-1997
446c8e4cdb [router] ignore client error when record failure in pd_router (#9503) 2025-08-22 14:19:45 -07:00
Co-authored-by: bruce.xu <bruce.xu@gmicloud.ai>
Keyang Ru
5ef545e678 [router] Move all protocols to spec.rs file (#9519) 2025-08-22 14:18:47 -07:00
Simo Lin
f556ac8bd8 [router] add json tool parser (#9516) 2025-08-22 12:13:04 -07:00
Simo Lin
49f9d02538 [router] tokenizer arch doc (#9513) 2025-08-22 09:52:33 -07:00