Update README.md (#358)
Committed by GitHub
Parent: 62b3812b69
Commit: c93293c57e
@@ -362,7 +362,7 @@ python -m sglang.launch_server --model-path meta-llama/Llama-2-7b-chat-hf --port
 ```
 python -m sglang.launch_server --model-path meta-llama/Llama-2-7b-chat-hf --port 30000 --mem-fraction-static 0.7
 ```
-- You can turn on [flashinfer](docs/flashinfer.md) to acclerate the inference by using highly optimized CUDA kernels.
+- You can turn on [flashinfer](docs/flashinfer.md) to accelerate the inference by using highly optimized CUDA kernels.
 
 ### Supported Models
 - Llama