Docs: Quick fix for Speculative_decoding doc (#3228)
Co-authored-by: Chayenne <zhaochenyang@ucla.edu>
Co-authored-by: Chayenne <zhaochen20@outlook.com>
@@ -8,10 +8,11 @@
"\n",
"SGLang now provides an EAGLE-based speculative decoding option. The implementation aims to maximize speed and efficiency and is considered to be among the fastest in open-source LLM engines.\n",
"\n",
"**Note:** Currently, Speculative Decoding in SGLang does not support radix cache.\n",
"\n",
"To run the following tests or benchmarks, you also need to install [**cutex**](https://pypi.org/project/cutex/): \n",
"> ```bash\n",
"> pip install cutex\n",
"> ```\n",
"\n",
"`pip install cutex`\n",
"\n",
"### Performance Highlights\n",
"\n",