[BugFix][0.18.0] Remove unused layers assignment in mooncake connector (#8602)
### What this PR does / why we need it?

Remove unused layers assignment in mooncake connector

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

by nightly

Signed-off-by: liziyu <liziyu16@huawei.com>
@@ -586,7 +586,6 @@ class KVCacheRecvingThread(threading.Thread):
 head_dim = self.model_config.hf_text_config.head_dim
 block_size = self.vllm_config.cache_config.block_size
 num_kv_head = max(self.model_config.hf_text_config.num_key_value_heads // self.tp_size, 1)
-layers = self.model_config.hf_text_config.num_hidden_layers
 layers = len(self.kv_caches)
 flat_block_ids = [item for sublist in block_ids for item in sublist]
 block_ids_tensor = torch.tensor(flat_block_ids, dtype=torch.int64, device=device)
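For reviewers, a minimal sketch of why dropping the first `layers` assignment should not change behavior: its value is overwritten on the very next line before anything reads it. The names `FakeHFTextConfig`, `layer_count_before`, and `layer_count_after` are hypothetical stand-ins for illustration only and are not part of the connector.

```python
# Hypothetical stand-in for hf_text_config; not the real model config class.
class FakeHFTextConfig:
    num_hidden_layers = 32


def layer_count_before(hf_text_config, kv_caches):
    # Dead store: this value is overwritten on the next line and never read.
    layers = hf_text_config.num_hidden_layers
    layers = len(kv_caches)
    return layers


def layer_count_after(hf_text_config, kv_caches):
    # Only the assignment whose value is actually used remains.
    layers = len(kv_caches)
    return layers


# Assumed shape: one KV cache entry per layer, keyed by a made-up layer name.
kv_caches = {f"layer.{i}": object() for i in range(4)}
before = layer_count_before(FakeHFTextConfig, kv_caches)
after = layer_count_after(FakeHFTextConfig, kv_caches)
```

Both helpers return the length of `kv_caches` regardless of `num_hidden_layers`, which is the sense in which the removed assignment was unused.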