Update version to v0.1.13 (#280)

This commit is contained in:
Lianmin Zheng
2024-03-11 05:49:27 -07:00
committed by GitHub
parent 13662fd533
commit 4aa5dd2c5f
11 changed files with 35 additions and 21 deletions

@@ -5,14 +5,7 @@ It can be used in SGLang runtime to accelerate attention computation.
### Install flashinfer
You can install flashinfer via pip as follows for CUDA 12.1.
```bash
pip install flashinfer -i https://flashinfer.ai/whl/cu121/
```
You can find wheels for other CUDA versions at https://github.com/flashinfer-ai/flashinfer?tab=readme-ov-file#installation. If there is no prebuilt version for your environment,
please build it from source (the compilation takes a long time).
See https://docs.flashinfer.ai/installation.html.
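The wheel index URL above encodes the CUDA version (`cu121` for CUDA 12.1). A minimal sketch of how such an index URL is derived from a CUDA version string; the `cuda_wheel_index` helper name is ours for illustration, and you should confirm which CUDA versions actually have published wheels in the flashinfer README:

```python
def cuda_wheel_index(cuda_version: str) -> str:
    """Map a CUDA version like '12.1' to a flashinfer wheel index URL.

    Hypothetical helper for illustration only; not part of flashinfer.
    Check the flashinfer installation docs for the versions that are
    actually published before relying on a generated URL.
    """
    major, minor = cuda_version.split(".")[:2]
    return f"https://flashinfer.ai/whl/cu{major}{minor}/"

# For CUDA 12.1 this reproduces the index used in the pip command above.
print(cuda_wheel_index("12.1"))  # https://flashinfer.ai/whl/cu121/
```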
### Run a Server With Flashinfer Mode