31 Commits

Author SHA1 Message Date
Yikai Zhang
0870232195 Update native_api doc to match the change in the get_model_info endpoint (#7660)
Co-authored-by: Lifu Huang <lifu.hlf@gmail.com>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
2025-07-08 21:05:58 -07:00
Xinyuan Tong
43f93f632c fix CI: update native api ipynb (#7754)
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
2025-07-03 15:25:00 -07:00
woodx
97011abc8a [Doc] add embedding rerank doc (#7364) 2025-06-19 21:53:54 -07:00
Yijie Zhu
a39d928782 support qwen2 running on ascend npu device (#7022)
Co-authored-by: 刁莹煜 <diaoyingyu1@hisilicon.com>
2025-06-17 11:24:10 -07:00
fzyzcjy
f0653886a5 Expert distribution recording without overhead for EPLB (#4957) 2025-05-19 20:07:43 -07:00
Chayenne
73dcf2b326 Remove token in token out in Native API (#5967) 2025-05-01 21:59:43 -07:00
simveit
8de53da989 smaller and non gated models for docs (#5378) 2025-04-20 17:38:25 -07:00
simveit
f8194b267c Small improvement of native api docs (#5139)
Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
2025-04-08 12:09:26 -07:00
yuhsaun-t
199bb01d00 Add endpoints to dump selected expert ids (#4435)
Co-authored-by: Cheng Wan <54331508+ch-wan@users.noreply.github.com>
2025-03-24 21:34:19 -07:00
Chayenne
146ac8df07 Add examples in sampling parameters (#4039) 2025-03-03 13:04:32 -08:00
Lianmin Zheng
ac2387279e Support penalty in overlap mode; return logprob with chunked prefill; improve benchmark scripts (#3988)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: dhou-xai <dhou@x.ai>
Co-authored-by: Hanming Lu <hanming_lu@berkeley.edu>
2025-03-03 00:12:04 -08:00
Qiaolin Yu
40782f05d7 Refactor: Move return_hidden_states to the generate input (#3985)
Co-authored-by: Beichen-Ma <mabeichen12@gmail.com>
2025-03-01 17:51:29 -08:00
Chayenne
3c7bfd7eab Docs: Fix layout with sub-section (#3710) 2025-02-19 15:44:30 -08:00
Shi Shuai
55de40f782 [Docs]: Fix Multi-User Port Allocation Conflicts (#3601)
Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
Co-authored-by: simveit <simp.veitner@gmail.com>
2025-02-19 11:15:44 -08:00
Shi Shuai
7443197a63 [CI] Improve Docs CI Efficiency (#3587)
Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
2025-02-14 19:57:00 -08:00
Shi Shuai
c4f9707e16 Improve: Token-In Token-Out Usage for RLHF (#2843) 2025-01-11 15:14:26 -08:00
Chayenne
786be44da5 Fix Docs CI When Compile Error (#2323) 2024-12-04 11:19:46 -08:00
Chayenne
7d5d1d3d29 udate weights from disk (#2265) 2024-11-30 01:17:00 +00:00
Henry Hyeonmok Ko
dbe1729395 Merged three native APIs into one: get_server_info (#2152) 2024-11-24 01:37:58 -08:00
Henry Hyeonmok Ko
c35cd1f8c7 Expose max total num tokens from Runtime & Engine API (#2092) 2024-11-22 15:10:10 -08:00
Xuehai Pan
72f87b723b feat(pre-commit): trim unnecessary notebook metadata from git history (#2127) 2024-11-22 13:04:51 -08:00
Chayenne
c77c1e05ba fix black in pre-commit (#1940) 2024-11-08 07:42:47 +08:00
Lianmin Zheng
f5113e50ae [Doc] improve relative links and structure (#1924) 2024-11-05 01:12:10 -08:00
Chayenne
02755768d3 Change judge to classify & Modify make file (#1920) 2024-11-04 23:53:44 -08:00
Chayenne
704f8e8ed1 Add Reward API Docs etc (#1910)
Co-authored-by: Chayenne <zhaochenyang@g.ucla.edu>
2024-11-03 22:33:03 -08:00
Chayenne
908dd7f9aa Add engine api (#1894) 2024-11-02 22:03:38 -07:00
Chayenne
f4cd804073 Fix ci and link error (#1892)
Co-authored-by: Chayenne <zhaochenyang@g.ucla.edu>
2024-11-02 19:08:49 -07:00
Lianmin Zheng
be7986e005 Fix docs (#1890) 2024-11-02 13:26:32 -07:00
Chayenne
5a5f18432f Fix docs ci (#1888) 2024-11-02 11:57:22 -07:00
Chayenne
3b60558dd7 Native api (#1886)
Co-authored-by: Chayenne <zhaochenyang@g.ucla.edu>
2024-11-02 01:02:17 -07:00
Chayenne
72e979bfb5 add native api docs (#1883)
Co-authored-by: Chayenne <zhaochenyang@g.ucla.edu>
2024-11-02 00:17:30 -07:00