Docs: Quick fix for Speculative_decoding doc (#3228)

Co-authored-by: Chayenne <zhaochenyang@ucla.edu>
Co-authored-by: Chayenne <zhaochen20@outlook.com>
Jhin
2025-01-31 10:30:40 -06:00
committed by GitHub
parent cf0f7eafe6
commit 656f7fc1bc

@@ -8,10 +8,11 @@
"\n",
"SGLang now provides an EAGLE-based speculative decoding option. The implementation aims to maximize speed and efficiency and is considered among the fastest speculative decoding implementations in open-source LLM engines.\n",
"\n",
"**Note:** Currently, Speculative Decoding in SGLang does not support radix cache.\n",
"\n",
"To run the following tests or benchmarks, you also need to install [**cutex**](https://pypi.org/project/cutex/): \n",
"> ```bash\n",
"> pip install cutex\n",
"> ```\n",
"\n",
"`pip install cutex`\n",
"\n",
"### Performance Highlights\n",
"\n",