【main】ADXL/HIXL supports FabricMem Mode (#6806)
### What this PR does / why we need it?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.15.0
- vLLM main: 83b47f67b1
---------
Signed-off-by: fems14 <1804143737@qq.com>
@@ -121,7 +121,7 @@ Moonshot AI. Installation and compilation guide:
 First, obtain the Mooncake project using the following command:
 
 ```bash
-git clone -b v0.3.8.post1 --depth 1 https://github.com/kvcache-ai/Mooncake.git
+git clone -b v0.3.9 --depth 1 https://github.com/kvcache-ai/Mooncake.git
 cd Mooncake
 git submodule update --init --recursive
 ```
@@ -177,7 +177,7 @@ Mooncake is the serving platform for Kimi, a leading LLM service provided by Moo
 First, we need to obtain the Mooncake project. Refer to the following command:
 
 ```shell
-git clone -b v0.3.8.post1 --depth 1 https://github.com/kvcache-ai/Mooncake.git
+git clone -b v0.3.9 --depth 1 https://github.com/kvcache-ai/Mooncake.git
 ```
 
 (Optional) Replace go install url if the network is poor
@@ -98,7 +98,7 @@ Mooncake is the serving platform for Kimi, a leading LLM service provided by Moo
 First, we need to obtain the Mooncake project. Refer to the following command:
 
 ```shell
-git clone -b v0.3.8.post1 --depth 1 https://github.com/kvcache-ai/Mooncake.git
+git clone -b v0.3.9 --depth 1 https://github.com/kvcache-ai/Mooncake.git
 ```
 
 (Optional) Replace go install url if the network is poor.
@@ -42,7 +42,7 @@ export PYTHONHASHSEED=0
 First, we need to obtain the Mooncake project. Refer to the following command:
 
 ```shell
-git clone -b v0.3.7.post2 --depth 1 https://github.com/kvcache-ai/Mooncake.git
+git clone -b v0.3.9 --depth 1 https://github.com/kvcache-ai/Mooncake.git
 ```
 
 (Optional) Replace go install url if the network is poor
@@ -85,6 +85,15 @@ export PYTHONHASHSEED=0
 export LD_LIBRARY_PATH=/usr/local/lib64/python3.11/site-packages/mooncake:$LD_LIBRARY_PATH
 ```
 
+### Environment Variables Description
+
+`export ASCEND_ENABLE_USE_FABRIC_MEM=1`: Enables the unified memory address direct transmission scheme. It can only be used on the 800 I/T A3 series. The required supporting versions are as follows:
+
+HDK >= 26.0
+CANN >= 9.0
+
+`export ASCEND_BUFFER_POOL=4:8`: Configures the number and size of the buffers on the NPU device used for aggregation and KV transfer. The value `4:8` allocates 4 buffers of 8 MB each. It can only be used on the 800 I/T A2 series.
+
 ### Run Mooncake Master
 
 #### 1. Configure mooncake.json
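As a reading aid for the `ASCEND_BUFFER_POOL` format described in the hunk above, the following is a minimal sketch that splits a `count:size` value and works out the total buffer memory it implies. It assumes only what the added text states (the first field is the number of buffers, the second is the size of each in MB). The shell variable names `buffer_count`, `buffer_size_mb`, and `total_mb` are hypothetical, and the sketch says nothing about how the runtime itself parses the value.

```shell
# Illustrative only: split a "count:size_mb" value such as the 4:8 example.
# How the runtime parses ASCEND_BUFFER_POOL is not shown in this commit.
ASCEND_BUFFER_POOL="4:8"

buffer_count="${ASCEND_BUFFER_POOL%%:*}"    # text before the colon -> 4
buffer_size_mb="${ASCEND_BUFFER_POOL##*:}"  # text after the colon  -> 8

# Total memory implied by the setting: count x size per buffer.
total_mb=$((buffer_count * buffer_size_mb))

echo "buffers=${buffer_count} size=${buffer_size_mb}MB total=${total_mb}MB"
```

For the documented value `4:8` this prints `buffers=4 size=8MB total=32MB`.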