From c8a9e79186503c3bd1955cdbd4c364b04db333fc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dr=2E=20Artificial=E6=9B=BE=E5=B0=8F=E5=81=A5?= <875100501@qq.com>
Date: Wed, 28 Aug 2024 14:51:41 +0800
Subject: [PATCH] Fix readme (#1236)

---
 README.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/README.md b/README.md
index 09e3d5686..3f03fd7f1 100644
--- a/README.md
+++ b/README.md
@@ -83,6 +83,7 @@ docker run --gpus all \
 ### Method 4: Using docker compose
+More
 > This method is recommended if you plan to serve it as a service.
 > A better approach is to use the [k8s-sglang-service.yaml](./docker/k8s-sglang-service.yaml).
@@ -94,6 +95,7 @@ docker run --gpus all \
 ### Method 5: Run on Kubernetes or Clouds with SkyPilot
+More
 To deploy on Kubernetes or 12+ clouds, you can use [SkyPilot](https://github.com/skypilot-org/skypilot).
@@ -262,6 +264,7 @@ Instructions for supporting a new model are [here](https://github.com/sgl-projec
 #### Use Models From ModelScope
+More
 To use a model from [ModelScope](https://www.modelscope.cn), set the environment variable SGLANG_USE_MODELSCOPE.
 ```
@@ -276,6 +279,7 @@ SGLANG_USE_MODELSCOPE=true python -m sglang.launch_server --model-path qwen/Qwen
 #### Run Llama 3.1 405B
+More
 ```bash
 # Run 405B (fp8) on a single node
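The ModelScope hunk above touches a README section driven by the SGLANG_USE_MODELSCOPE environment variable. As a minimal sketch of that switch (assuming a local sglang install; the model path is a placeholder, since the hunk header truncates the real one at `qwen/Qwen`):

```shell
# Enable ModelScope as the model download source for sglang,
# per the README section this patch edits.
export SGLANG_USE_MODELSCOPE=true
echo "SGLANG_USE_MODELSCOPE=$SGLANG_USE_MODELSCOPE"

# Illustrative launch command; MODEL_PATH is a placeholder, not from the patch.
# python -m sglang.launch_server --model-path "$MODEL_PATH"
```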