diff --git a/docs/backend/speculative_decoding.ipynb b/docs/backend/speculative_decoding.ipynb
index d69436eed..273d943d1 100644
--- a/docs/backend/speculative_decoding.ipynb
+++ b/docs/backend/speculative_decoding.ipynb
@@ -8,10 +8,11 @@
  "\n",
  "SGLang now provides an EAGLE-based speculative decoding option. The implementation aims to maximize speed and efficiency and is considered to be among the fastest in open-source LLM engines.\n",
  "\n",
+ "**Note:** Currently, Speculative Decoding in SGLang does not support radix cache.\n",
+ "\n",
  "To run the following tests or benchmarks, you also need to install [**cutex**](https://pypi.org/project/cutex/): \n",
- "> ```bash\n",
- "> pip install cutex\n",
- "> ```\n",
+ "\n",
+ "`pip install cutex`\n",
  "\n",
  "### Performance Highlights\n",
  "\n",