Fix install instructions and pyproject.tomls (#11781)

This commit is contained in:
Lianmin Zheng
2025-10-18 01:08:01 -07:00
committed by GitHub
parent 1d726528f7
commit 67e34c56d7
10 changed files with 298 additions and 296 deletions


@@ -12,7 +12,7 @@ It is recommended to use uv for faster installation:
```bash
pip install --upgrade pip
pip install uv
-uv pip install sglang --upgrade
+uv pip install sglang --prerelease=allow
```
**Quick fixes to common problems**
@@ -129,5 +129,3 @@ sky status --endpoint 30000 sglang
- [FlashInfer](https://github.com/flashinfer-ai/flashinfer) is the default attention kernel backend. It only supports sm75 and above. If you encounter any FlashInfer-related issues on sm75+ devices (e.g., T4, A10, A100, L4, L40S, H100), switch to the Triton and PyTorch backends by adding `--attention-backend triton --sampling-backend pytorch`, and open an issue on GitHub.
- To reinstall flashinfer locally, run `pip3 install --upgrade flashinfer-python --force-reinstall --no-deps`, then clear its cache with `rm -rf ~/.cache/flashinfer`.
- If you only need to use OpenAI API models with the frontend language, you can avoid installing other dependencies by using `pip install "sglang[openai]"`.
- The language frontend operates independently of the backend runtime: you can install the frontend on a machine without a GPU, while the backend runs on a GPU-enabled machine. To install the frontend, run `pip install sglang`; for the backend, use `pip install "sglang[srt]"` (`srt` stands for SGLang Runtime).
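
The frontend/backend split described in the last bullet can be sketched as two install sequences, one per machine (a sketch based on the bullets above; the machine roles are illustrative):

```shell
# On a CPU-only machine: the language frontend, no GPU required.
# Add the "openai" extra if you only call OpenAI API models.
pip install --upgrade pip
pip install "sglang[openai]"

# On a GPU-enabled machine: the backend runtime (srt = SGLang Runtime).
pip install --upgrade pip
pip install "sglang[srt]"
```

Quoting the `sglang[...]` extras keeps the brackets from being interpreted as a glob pattern in shells such as zsh.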