Update readme
@@ -4,7 +4,7 @@
 
 --------------------------------------------------------------------------------
 
-| [**Blog**](https://lmsys.org/blog/2024-01-17-sglang/) | [**Paper**](https://arxiv.org/abs/2312.07104) |
+| [**Blog**](https://lmsys.org/blog/2024-07-25-sglang-llama3/) | [**Paper**](https://arxiv.org/abs/2312.07104) |
 
 SGLang is a fast serving framework for large language models and vision language models.
 It makes your interaction with models faster and more controllable by co-designing the backend runtime and frontend language.
@@ -57,7 +57,7 @@ pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.3/
 ```
 
 ### Method 3: Using docker
-The docker images are available on Docker Hub as [lmsysorg/sglang](https://hub.docker.com/r/lmsysorg/sglang/tags).
+The docker images are available on Docker Hub as [lmsysorg/sglang](https://hub.docker.com/r/lmsysorg/sglang/tags), built from [Dockerfile](docker).
 
 ```bash
 docker run --gpus all \
@@ -66,7 +66,7 @@ docker run --gpus all \
     --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
     --ipc=host \
     lmsysorg/sglang:latest \
-    python3 -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B --host 0.0.0.0 --port 30000
+    python3 -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct --host 0.0.0.0 --port 30000
 ```
 
 ### Common Notes
@@ -1,8 +1,7 @@
-ARG CUDA_VERSION=12.4.1
+ARG CUDA_VERSION=12.1.1
 
 FROM nvidia/cuda:${CUDA_VERSION}-devel-ubuntu22.04
 
-ARG CUDA_VERSION=12.4.1
 ARG PYTHON_VERSION=3
 
 ENV DEBIAN_FRONTEND=noninteractive