xc-llm-ascend/vllm_ascend
whx 0d3463400a [Performance] Change the shape of kv_cache to avoid view of k_cache and v_cache. (#204)
This PR changes the shape of the kv cache so that k_cache and v_cache
no longer have to be obtained through view operations.
It also caches the metadata of k_cache and v_cache to avoid redundant
slice operations and improve performance.

Signed-off-by: hw_whx <wanghexiang7@huawei.com>
2025-03-05 10:51:07 +08:00
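
The idea in the commit message can be illustrated with a minimal, standard-library-only sketch. It is not the code from this PR: the class name LayerKVCache, its parameters, and the buffer layout (one contiguous buffer whose first half holds keys and second half holds values, i.e. a leading axis of size 2) are assumptions made for illustration. The point is only that the key and value slices are taken once and then reused, rather than being re-sliced on every call.

```python
from array import array


class LayerKVCache:
    """Hypothetical per-layer cache; names and layout are illustrative assumptions."""

    def __init__(self, num_blocks, block_size, num_heads, head_size):
        # Number of elements in one of the two caches (keys or values).
        self.per_cache = num_blocks * block_size * num_heads * head_size
        # One contiguous float buffer: keys in the first half, values in the second.
        self.buffer = memoryview(array("f", [0.0]) * (2 * self.per_cache))
        self._key = None
        self._value = None
        self.slice_count = 0  # counts how many slice operations were performed

    def key_value(self):
        # Slice only on first use; later calls return the cached slices.
        if self._key is None:
            self._key = self.buffer[: self.per_cache]
            self._value = self.buffer[self.per_cache :]
            self.slice_count += 2
        return self._key, self._value


cache = LayerKVCache(num_blocks=2, block_size=4, num_heads=2, head_size=8)
for _ in range(3):
    k_cache, v_cache = cache.key_value()
# The slices alias the underlying buffer, so writes go straight through.
k_cache[0] = 1.5
```

In this sketch, repeated calls to key_value() return the same two slice objects, so the slicing cost is paid once per layer instead of once per forward step.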