KV‑Cache (MHA, MLA): add missing start_layer / end_layer fields to MHATokenToKVPoolHost and MLATokenToKVPoolHost (#6016)
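
The diff adds the two fields to the shared HostKVCache constructor, so both host pool classes named in the title pick them up from there. A minimal sketch of the idea follows. The class names, the `local_layer_index` helper, and the inclusive treatment of `end_layer` are assumptions made for illustration and are not taken from the patched file:

```python
from dataclasses import dataclass


@dataclass
class DevicePoolSketch:
    """Hypothetical stand-in for a device KV pool (not the real class)."""
    size: int
    page_size: int
    start_layer: int
    end_layer: int


class HostPoolSketch:
    """Hypothetical stand-in for a host KV cache built from a device pool."""

    def __init__(self, device_pool: DevicePoolSketch, host_to_device_ratio: float):
        self.page_size = device_pool.page_size
        self.size = int(device_pool.size * host_to_device_ratio)
        # Round the host pool size down to a whole number of pages.
        self.size -= self.size % self.page_size
        # Mirror the device pool's layer range so both pools agree on it.
        self.start_layer = device_pool.start_layer
        self.end_layer = device_pool.end_layer

    def local_layer_index(self, layer_id: int) -> int:
        """Map a global layer id to a pool-local index (assumes inclusive end_layer)."""
        if not (self.start_layer <= layer_id <= self.end_layer):
            raise IndexError(f"layer {layer_id} is outside this pool's range")
        return layer_id - self.start_layer
```

Without the mirrored fields, any code that reads `start_layer` or `end_layer` from the host pool would raise an AttributeError, which is presumably the gap this change closes.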

Co-authored-by: 继优 <jiyou.ljy@alibaba-inc.com>
Co-authored-by: chus-chus <chus-chus@users.noreply.github.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
Simon (Jiyou) Li
2025-05-10 06:50:06 +08:00
committed by GitHub
parent 678d8cc987
commit b29a026e14


@@ -762,6 +762,8 @@ class HostKVCache(abc.ABC):
         self.size = int(device_pool.size * host_to_device_ratio)
         # Align the host memory pool size to the page size
         self.size = self.size - (self.size % self.page_size)
+        self.start_layer = device_pool.start_layer
+        self.end_layer = device_pool.end_layer
         assert (
             self.size > device_pool.size