Fix warnings in doc build (#1852)

Lianmin Zheng
2024-10-30 22:28:00 -07:00
committed by GitHub
parent 0ab7bcaf66
commit d913d52c9a
3 changed files with 29 additions and 29 deletions


@@ -1,8 +1,8 @@
-# Install
+# Install SGLang
You can install SGLang using any of the methods below.
-### Method 1: With pip
+## Method 1: With pip
```
pip install --upgrade pip
pip install "sglang[all]"
@@ -13,7 +13,7 @@ pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/
Note: Please check the [FlashInfer installation doc](https://docs.flashinfer.ai/installation.html) to install the proper version according to your PyTorch and CUDA versions.
-### Method 2: From source
+## Method 2: From source
```
# Use the last release branch
git clone -b v0.3.4.post2 https://github.com/sgl-project/sglang.git
@@ -28,7 +28,7 @@ pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/
Note: Please check the [FlashInfer installation doc](https://docs.flashinfer.ai/installation.html) to install the proper version according to your PyTorch and CUDA versions.
-### Method 3: Using docker
+## Method 3: Using docker
The docker images are available on Docker Hub as [lmsysorg/sglang](https://hub.docker.com/r/lmsysorg/sglang/tags), built from [Dockerfile](https://github.com/sgl-project/sglang/tree/main/docker).
Replace `<secret>` below with your huggingface hub [token](https://huggingface.co/docs/hub/en/security-tokens).
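The FlashInfer note above asks you to pick the wheel index that matches your PyTorch and CUDA versions. As a minimal sketch, the index URL follows the pattern visible in the `pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/` command; the helper name below is hypothetical, not part of SGLang or FlashInfer:

```python
def flashinfer_index_url(cuda_tag: str, torch_version: str) -> str:
    # Hypothetical helper: compose the FlashInfer wheel index URL from a CUDA
    # tag (e.g. "cu121") and a PyTorch version (e.g. "2.4"), following the
    # pattern https://flashinfer.ai/whl/<cuda>/torch<version>/ used above.
    return f"https://flashinfer.ai/whl/{cuda_tag}/torch{torch_version}/"

if __name__ == "__main__":
    # Reproduces the index used in the install command above.
    print(flashinfer_index_url("cu121", "2.4"))
```

Pass the resulting URL to `pip install flashinfer -i <url>` for your environment.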
@@ -42,7 +42,7 @@ docker run --gpus all \
python3 -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct --host 0.0.0.0 --port 30000
```
-### Method 4: Using docker compose
+## Method 4: Using docker compose
<details>
<summary>More</summary>
@@ -54,7 +54,7 @@ docker run --gpus all \
2. Execute the command `docker compose up -d` in your terminal.
</details>
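However you start the container, it can help to probe the server before sending traffic. A minimal stdlib sketch is below; the `/health` route is an assumption about the server, so substitute whichever endpoint your SGLang version actually exposes:

```python
import urllib.request

def health_url(host: str, port: int, path: str = "/health") -> str:
    # Build the probe URL; "/health" is an assumed route, adjust as needed.
    return f"http://{host}:{port}{path}"

def server_is_ready(host: str, port: int, timeout: float = 2.0) -> bool:
    # Return True once the server answers with HTTP 200, False on any error
    # (connection refused, timeout, non-2xx response, ...).
    try:
        with urllib.request.urlopen(health_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

if __name__ == "__main__":
    print(server_is_ready("127.0.0.1", 30000))
```

Polling this in a loop with a short sleep is a simple way to gate readiness in scripts.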
-### Method 5: Run on Kubernetes or Clouds with SkyPilot
+## Method 5: Run on Kubernetes or Clouds with SkyPilot
<details>
<summary>More</summary>
@@ -95,7 +95,7 @@ sky status --endpoint 30000 sglang
3. To further scale up your deployment with autoscaling and failure recovery, check out the [SkyServe + SGLang guide](https://github.com/skypilot-org/skypilot/tree/master/llm/sglang#serving-llama-2-with-sglang-for-more-traffic-using-skyserve).
</details>
-### Common Notes
+## Common Notes
- [FlashInfer](https://github.com/flashinfer-ai/flashinfer) is the default attention kernel backend. It only supports sm75 and above. If you encounter any FlashInfer-related issues on sm75+ devices (e.g., T4, A10, A100, L4, L40S, H100), please switch to other kernels by adding `--attention-backend triton --sampling-backend pytorch` and open an issue on GitHub.
- If you only need to use the OpenAI backend, you can avoid installing other dependencies by using `pip install "sglang[openai]"`.
- The language frontend operates independently of the backend runtime. You can install the frontend locally without needing a GPU, while the backend can be set up on a GPU-enabled machine. To install the frontend, run `pip install sglang`, and for the backend, use `pip install sglang[srt]`. This allows you to build SGLang programs locally and execute them by connecting to the remote backend.
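The frontend/backend split above means a CPU-only machine can drive a remote GPU server purely over HTTP. As a sketch of what such a request looks like, assuming the server exposes an OpenAI-style `/v1/chat/completions` route (the `chat_request` helper is hypothetical, stdlib-only, and does not send anything by itself):

```python
import json

def chat_request(model: str, prompt: str,
                 base_url: str = "http://localhost:30000") -> tuple[str, bytes]:
    # Compose the URL and JSON body for an OpenAI-style chat completion
    # request against a remote backend; nothing is sent over the network here.
    url = f"{base_url}/v1/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, body

if __name__ == "__main__":
    url, body = chat_request("meta-llama/Llama-3.1-8B-Instruct", "Hello")
    print(url)
```

The returned URL and body can then be POSTed with any HTTP client (e.g. `urllib.request` with a `Content-Type: application/json` header) from the machine running the frontend.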