diff --git a/python/sglang/srt/mem_cache/storage/hf3fs/docs/README.md b/python/sglang/srt/mem_cache/storage/hf3fs/docs/README.md
index 63be34293..480f431a8 100644
--- a/python/sglang/srt/mem_cache/storage/hf3fs/docs/README.md
+++ b/python/sglang/srt/mem_cache/storage/hf3fs/docs/README.md
@@ -1,19 +1,26 @@
-# HF3FS as L3 KV Cache
+# Using HF3FS as L3 Global KV Cache
 
-This document describes how to use deepseek-hf3fs as the L3 KV cache for SGLang.
+This document provides step-by-step instructions for setting up a k8s + 3FS + SGLang runtime environment from scratch, describing how to utilize deepseek-hf3fs as the L3 KV cache for SGLang.
+The process consists of five main steps:
 
-## Step1: Install deepseek-3fs by 3fs-Operator (Coming Soon)
+## Step 1: Install deepseek-3fs via 3fs-Operator
+Refer to the [3fs-operator documentation](https://github.com/aliyun/kvc-3fs-operator/blob/main/README_en.md) to deploy 3FS components in your Kubernetes environment using the Operator with one-click deployment.
 
-## Step2: Setup usrbio client
+## Step 2: Launch SGLang Pod
+Start your SGLang Pod while specifying 3FS-related labels in the YAML configuration. Follow the [fuse-client-creation guide](https://github.com/aliyun/kvc-3fs-operator/blob/main/README_en.md#fuse-client-creation).
 
-Please follow the document [setup_usrbio_client.md](setup_usrbio_client.md) to setup usrbio client.
+## Step 3: Configure Usrbio Client in SGLang Pod
+The Usrbio client is required for accessing 3FS. Install it in your SGLang Pod using either method below:
 
-## Step3: Deployment
+**Alternative 1 (Recommend):** Build from source (refer to [setup_usrbio_client.md](setup_usrbio_client.md))
 
-### Single node deployment
+**Alternative 2:** Run `pip3 install hf3fs-py-usrbio` (Follow https://pypi.org/project/hf3fs-py-usrbio/#files)
+## Step 4: Deploy Model Serving
+
+### Single Node Deployment
 
 ```bash
-export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.10/dist-packages
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.12/dist-packages
 python3 -m sglang.launch_server \
     --model-path /code/models/Qwen3-32B/ \
     --host 0.0.0.0 --port 10000 \
@@ -24,6 +31,5 @@ python3 -m sglang.launch_server \
     --hicache-storage-backend hf3fs
 ```
 
-### Multi nodes deployment to share KV cache
-
-Please follow the document [deploy_sglang_3fs_multinode.md](deploy_sglang_3fs_multinode.md) to deploy SGLang with 3FS on multiple nodes to share KV cache.
+### Multi-Node Deployment (Shared KV Cache)
+Follow the [deploy_sglang_3fs_multinode.md](deploy_sglang_3fs_multinode.md) guide to deploy SGLang with 3FS across multiple nodes for shared KV caching.
diff --git a/python/sglang/srt/mem_cache/storage/hf3fs/docs/deploy_sglang_3fs_multinode.md b/python/sglang/srt/mem_cache/storage/hf3fs/docs/deploy_sglang_3fs_multinode.md
index c2955cd3e..889f9ad85 100644
--- a/python/sglang/srt/mem_cache/storage/hf3fs/docs/deploy_sglang_3fs_multinode.md
+++ b/python/sglang/srt/mem_cache/storage/hf3fs/docs/deploy_sglang_3fs_multinode.md
@@ -20,7 +20,7 @@ vim /sgl-workspace/sglang/benchmark/hf3fs/hf3fs_config.json
 ## node1
 ```bash
 export SGLANG_HICACHE_HF3FS_CONFIG_PATH=/sgl-workspace/sglang/benchmark/hf3fs/hf3fs_config.json
-export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.10/dist-packages
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.12/dist-packages
 rm -rf instance1.out && \
 nohup python3 -m sglang.launch_server \
     --model-path /code/models/Qwen3-32B/ \
@@ -35,7 +35,7 @@ nohup python3 -m sglang.launch_server \
 ## node2
 ```bash
 export SGLANG_HICACHE_HF3FS_CONFIG_PATH=/sgl-workspace/sglang/benchmark/hf3fs/hf3fs_config.json
-export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.10/dist-packages
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.12/dist-packages
 rm -rf instance2.out && \
 nohup python3 -m sglang.launch_server \
     --model-path /code/models/Qwen3-32B/ \
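
Note on the `LD_LIBRARY_PATH` lines this patch changes: both files hard-code the `python3.12` dist-packages directory. A minimal sketch, assuming the usrbio library is installed into the interpreter's default `purelib` directory (an assumption; the patch does not state this), could derive the directory from the running `python3` instead. The helper name `append_ld_library_path` and the variable `PY_SITE` are hypothetical. The helper also avoids the empty leading entry that `$LD_LIBRARY_PATH:/path` produces when the variable is unset, which the dynamic linker interprets as the current directory.

```bash
# Hypothetical helper (not part of SGLang or 3FS): append a directory to
# LD_LIBRARY_PATH once, without creating an empty entry when it is unset.
append_ld_library_path() {
    dir="$1"
    [ -n "$dir" ] || return 0            # nothing to add
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;                   # already present, leave unchanged
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir" ;;
    esac
    export LD_LIBRARY_PATH
}

# Assumption: the shared library lives in the interpreter's default
# site directory; check this on your image before relying on it.
if command -v python3 >/dev/null 2>&1; then
    PY_SITE="$(python3 -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])')"
    append_ld_library_path "$PY_SITE"
fi
```

Running this before `python3 -m sglang.launch_server` would stand in for the hard-coded `export`; it is worth confirming that the resolved directory actually contains the usrbio shared library.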