From 656f7fc1bc6bd128b227404dd2900d2b63073dcb Mon Sep 17 00:00:00 2001
From: Jhin <47354855+jhinpan@users.noreply.github.com>
Date: Fri, 31 Jan 2025 10:30:40 -0600
Subject: [PATCH] Docs: Quick fix for Speculative_decoding doc (#3228)

Co-authored-by: Chayenne
Co-authored-by: Chayenne
---
 docs/backend/speculative_decoding.ipynb | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/backend/speculative_decoding.ipynb b/docs/backend/speculative_decoding.ipynb
index d69436eed..273d943d1 100644
--- a/docs/backend/speculative_decoding.ipynb
+++ b/docs/backend/speculative_decoding.ipynb
@@ -8,10 +8,11 @@
  "\n",
  "SGLang now provides an EAGLE-based speculative decoding option. The implementation aims to maximize speed and efficiency and is considered to be among the fastest in open-source LLM engines.\n",
  "\n",
+ "**Note:** Currently, Speculative Decoding in SGLang does not support radix cache.\n",
+ "\n",
  "To run the following tests or benchmarks, you also need to install [**cutex**](https://pypi.org/project/cutex/): \n",
- "> ```bash\n",
- "> pip install cutex\n",
- "> ```\n",
+ "\n",
+ "`pip install cutex`\n",
  "\n",
  "### Performance Highlights\n",
  "\n",