docs: update README (#1098)

2024-08-14 19:40:05 +08:00
parent f14569f64a
commit fe5024325b
4 changed files with 13 additions and 5 deletions
--- a/README.md
+++ b/README.md
@@ -88,7 +88,7 @@ docker run --gpus all \
 2. Execute the command `docker compose up -d` in your terminal.

 ### Common Notes
- If you cannot install FlashInfer, check out its [installation](https://docs.flashinfer.ai/installation.html#) page. If you still cannot install it, you can use the slower Triton kernels by adding `--disable-flashinfer` when launching the server.
+- [FlashInfer](https://github.com/flashinfer-ai/flashinfer) is currently a required dependency of SGLang. SGLang cannot be used on NVIDIA GPUs below sm80, such as the T4, for the time being. We expect to resolve this issue soon, so please stay tuned. If you encounter any FlashInfer-related issues on sm80+ devices (e.g., A100, L40S, H100), consider falling back to the Triton kernels by adding `--disable-flashinfer --disable-flashinfer-sampling` when launching the server, and open an issue.
 - If you only need to use the OpenAI backend, you can avoid installing other dependencies by using `pip install "sglang[openai]"`.

 ## Backend: SGLang Runtime (SRT)