【main】ADXL/HIXL supports FabricMem Mode (#6806)

### What this PR does / why we need it?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.15.0
- vLLM main: 83b47f67b1

---------

Signed-off-by: fems14 <1804143737@qq.com>
2026-03-05 21:04:11 +08:00
parent 50441e4650
commit ae394767d4
6 changed files with 46 additions and 40 deletions
--- a/docs/source/user_guide/feature_guide/kv_pool.md
+++ b/docs/source/user_guide/feature_guide/kv_pool.md
@@ -42,7 +42,7 @@ export PYTHONHASHSEED=0
        First, we need to obtain the Mooncake project. Refer to the following command:

        ```shell
-        git clone -b v0.3.7.post2 --depth 1 https://github.com/kvcache-ai/Mooncake.git
+        git clone -b v0.3.9 --depth 1 https://github.com/kvcache-ai/Mooncake.git
        ```

        (Optional) Replace go install url if the network is poor
@@ -85,6 +85,15 @@ export PYTHONHASHSEED=0
        export LD_LIBRARY_PATH=/usr/local/lib64/python3.11/site-packages/mooncake:$LD_LIBRARY_PATH
        ```

+### Environment Variables Description
+
+`export ASCEND_ENABLE_USE_FABRIC_MEM=1`: Enables the unified memory address direct transmission scheme. It can only be used on the 800 I/T A3 series. The required supporting versions are as follows:
+
+    HDK >= 26.0
+    CANN >= 9.0
+
+`export ASCEND_BUFFER_POOL=4:8`: Configures the number and size of the buffers on the NPU device used for aggregation and KV transfer. The value `4:8` allocates 4 buffers of 8 MB each. It can only be used on the 800 I/T A2 series.
+
 ### Run Mooncake Master

 #### 1.Configure mooncake.json