Docs: Quick fix for Speculative_decoding doc (#3228)
Co-authored-by: Chayenne <zhaochenyang@ucla.edu>
Co-authored-by: Chayenne <zhaochen20@outlook.com>
@@ -8,10 +8,11 @@
 "\n",
 "SGLang now provides an EAGLE-based speculative decoding option. The implementation aims to maximize speed and efficiency and is considered to be among the fastest in open-source LLM engines.\n",
 "\n",
+"**Note:** Currently, Speculative Decoding in SGLang does not support radix cache.\n",
+"\n",
 "To run the following tests or benchmarks, you also need to install [**cutex**](https://pypi.org/project/cutex/): \n",
-"> ```bash\n",
-"> pip install cutex\n",
-"> ```\n",
+"\n",
+"`pip install cutex`\n",
 "\n",
 "### Performance Highlights\n",
 "\n",