From f2a75a66c4e5bd007ce0eabd4fb16edade751e6c Mon Sep 17 00:00:00 2001
From: Ximingwang-09 <72070413+Ximingwang-09@users.noreply.github.com>
Date: Wed, 11 Jun 2025 10:02:01 +0800
Subject: [PATCH] update doc (#7046)

Co-authored-by: ximing.wxm
---
 docs/backend/server_arguments.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/backend/server_arguments.md b/docs/backend/server_arguments.md
index 171ff13cd..0574671cc 100644
--- a/docs/backend/server_arguments.md
+++ b/docs/backend/server_arguments.md
@@ -185,7 +185,7 @@ Please consult the documentation below and [server_args.py](https://github.com/s
 | Arguments | Description | Defaults |
 |----------|-------------|---------|
 | `speculative_draft_model_path` | The draft model path for speculative decoding. | None |
-| `speculative_algorithm` | The algorithm for speculative decoding. Currently [EAGLE](https://arxiv.org/html/2406.16858v1) and [EAGLE3](https://arxiv.org/pdf/2503.01840) are supported. Note that the radix cache, chunked prefill, and overlap scheduler are disabled when using eagle speculative decoding. | None |
+| `speculative_algorithm` | The algorithm for speculative decoding. Currently [EAGLE](https://arxiv.org/html/2406.16858v1) and [EAGLE3](https://arxiv.org/pdf/2503.01840) are supported. Note that the overlap scheduler is disabled when using eagle speculative decoding. | None |
 | `speculative_num_steps` | How many draft passes we run before verifying. | None |
 | `speculative_num_draft_tokens` | The number of tokens proposed in a draft. | None |
 | `speculative_eagle_topk` | The number of top candidates we keep for verification at each step for [Eagle](https://arxiv.org/html/2406.16858v1). | None |