[v0.11.0][bugfix]kvpool sync load (#3698) (#3722)

### What this PR does / why we need it?
In certain scenarios, loading data from the pool synchronously performs
better than loading it asynchronously. This PR therefore adds a switch
that controls whether data is loaded from the pool asynchronously.
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: fems14 <1804143737@qq.com>
fems14
2025-10-24 18:21:46 +08:00
committed by GitHub
parent 33514a4cc2
commit f0eb3e1d97
4 changed files with 89 additions and 21 deletions

@@ -10,7 +10,13 @@
* vLLM-Ascend: main branch
* Mooncake: [AscendTransport/Mooncake at pooling-async-memcpy](https://github.com/AscendTransport/Mooncake/tree/pooling-async-memcpy) (currently available branch code, continuously updated)
Installation and Compilation Guide: https://github.com/AscendTransport/Mooncake/tree/pooling-async-memcpy?tab=readme-ov-file#build-and-use-binaries
### KV Pooling Parameter Description
* **kv_connector_extra_config**: Additional configurable parameters for pooling.
* **mooncake_rpc_port**: Port for RPC communication between the pooling scheduler process and the worker process. Each instance requires a unique port.
* **load_async**: Whether to enable asynchronous loading. The default value is false.
* **register_buffer**: Whether to register video memory (device memory) with the backend. Registration is not required when used with MooncakeConnectorV1; it is required in all other cases. The default value is false.
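
As an illustration of how these parameters fit together, the following is a minimal Python sketch that builds the `kv_connector_extra_config` dictionary for two instances and checks that their RPC ports differ. The helper names (`build_extra_config`, `check_unique_ports`) and the port numbers are hypothetical and only serve this example; the exact value type expected for `mooncake_rpc_port` is also an assumption, so adapt it to your deployment.

```python
import json


def build_extra_config(rpc_port, load_async=False, register_buffer=False):
    """Build a kv_connector_extra_config dict (hypothetical helper).

    The defaults mirror the documented defaults: load_async and
    register_buffer are both false unless set explicitly.
    """
    return {
        "mooncake_rpc_port": rpc_port,
        "load_async": load_async,
        "register_buffer": register_buffer,
    }


def check_unique_ports(configs):
    """Raise ValueError if two instances share the same mooncake_rpc_port."""
    ports = [cfg["mooncake_rpc_port"] for cfg in configs]
    if len(ports) != len(set(ports)):
        raise ValueError(f"mooncake_rpc_port must be unique per instance: {ports}")


# Port numbers below are placeholders, not recommended values.
instance_a = build_extra_config(rpc_port=50051)
instance_b = build_extra_config(rpc_port=50052, load_async=True, register_buffer=True)
check_unique_ports([instance_a, instance_b])

# Serialize one instance's settings as JSON for use in a KV transfer config.
print(json.dumps({"kv_connector_extra_config": instance_b}, indent=2))
```

Leaving `load_async` at its default keeps loading synchronous; set it to true only where asynchronous loading has been measured to perform better.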
## Run Mooncake Master
### 1. Configure mooncake.json