【main】ADXL/HIXL supports FabricMem Mode (#6806)

### What this PR does / why we need it?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.15.0
- vLLM main: 83b47f67b1

---------

Signed-off-by: fems14 <1804143737@qq.com>
fems14
2026-03-05 21:04:11 +08:00
committed by GitHub
parent 50441e4650
commit ae394767d4
6 changed files with 46 additions and 40 deletions


@@ -42,7 +42,7 @@ export PYTHONHASHSEED=0
First, we need to obtain the Mooncake project. Refer to the following command:
```shell
git clone -b v0.3.7.post2 --depth 1 https://github.com/kvcache-ai/Mooncake.git
git clone -b v0.3.9 --depth 1 https://github.com/kvcache-ai/Mooncake.git
```
(Optional) Replace the Go install URL if the network connection is poor
@@ -85,6 +85,15 @@ export PYTHONHASHSEED=0
export LD_LIBRARY_PATH=/usr/local/lib64/python3.11/site-packages/mooncake:$LD_LIBRARY_PATH
```
### Environment Variables Description
`export ASCEND_ENABLE_USE_FABRIC_MEM=1`: Enables the unified memory address direct transmission scheme. It can only be used on the 800 I/T A3 series. The required supporting versions are as follows:
- HDK >= 26.0
- CANN >= 9.0
`export ASCEND_BUFFER_POOL=4:8`: Configures the number and size of the buffers on the NPU device used for aggregation and KV transfer. The value `4:8` allocates 4 buffers of 8 MB each. It can only be used on the 800 I/T A2 series.
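As a rough illustration of the `ASCEND_BUFFER_POOL` format, the sketch below splits a `count:size` value and works out the total buffer memory it implies. The variable names `count`, `size_mb`, and `total_mb` are hypothetical and exist only for this example. It does not show how the runtime itself parses the value.

```shell
# Illustrative only: interpret a COUNT:SIZE_MB value such as "4:8".
export ASCEND_BUFFER_POOL=4:8

# Strip everything from the first colon onward to get the buffer count.
count=${ASCEND_BUFFER_POOL%%:*}
# Strip everything up to the last colon to get the per-buffer size in MB.
size_mb=${ASCEND_BUFFER_POOL##*:}

# Total device memory reserved by the pool under this interpretation.
total_mb=$((count * size_mb))

echo "buffers=${count} size_mb=${size_mb} total_mb=${total_mb}"
# prints: buffers=4 size_mb=8 total_mb=32
```

With the value `4:8` from the description above, the pool would amount to 32 MB in total. Changing either number scales the total accordingly.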
### Run Mooncake Master
#### 1. Configure mooncake.json