Add note for deepseek related docs and remove unnecessary comments (#590)
### What this PR does / why we need it?
Add notes for deepseek's patch and remove some of the unnecessary comments

---------

Signed-off-by: ganyi <pleaplusone.gy@gmail.com>
@@ -835,7 +835,6 @@ class NPUModelRunner:
assert num_blocks >= kv_cache_config.num_blocks
# TODO: remove this after the OOM issue is located and fixed, otherwise, some model may
# encounter OOM issue
num_blocks = num_blocks // 4
if isinstance(kv_cache_spec, FullAttentionSpec):
    kv_cache_shape = self.attn_backend.get_kv_cache_shape(
        num_blocks, kv_cache_spec.block_size,