hzh0425
|
ee3bd8a1c8
|
feat(hicache): Support passing prefix keys for l3 store. (#9045)
Co-authored-by: pansicheng <sicheng.pan.chn@gmail.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-10-10 00:22:05 -07:00 |
|
Ke Bao
|
31b49c0b51
|
EAGLE cache fix for HiCache (#11215)
|
2025-10-04 16:53:53 -07:00 |
|
ykwd
|
bfa274380b
|
[HiCache] Configurable and Dynamic Prefetch Timeout (#10512)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-10-01 08:44:10 -05:00 |
|
huangtingwei
|
e05555fad8
|
[HiCacheStorage] mooncake store support page_first_direct layout (#10591)
|
2025-09-28 20:45:48 -07:00 |
|
Zhiqiang Xie
|
3d40794fcf
|
[HiCache] Cleaning the deprecated host memory state (#10778)
|
2025-09-25 14:43:53 +08:00 |
|
Xinyuan Tong
|
12d6cf18f0
|
Refactors radix cache for extra key support (#10317)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
|
2025-09-22 02:16:16 +08:00 |
|
Xuchun Shang
|
1ccd59c715
|
[HICache] introduce evict policy (#10190)
Signed-off-by: Xuchun Shang <xuchun.shang@linux.alibaba.com>
Co-authored-by: Teng Ma <sima.mt@alibaba-inc.com>
|
2025-09-18 11:10:20 +08:00 |
|
DarkSharpness
|
948b01a04c
|
[Refactor] Remove Hicache Load & Write threads (#10127)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-08 22:18:50 -07:00 |
|
Shisong Ma
|
33467c05a4
|
[BUG FIX] add fail check when get fail in case wait complete block (#9971)
Co-authored-by: mashisong <mashisong@bytedance.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-07 18:34:04 -07:00 |
|
Teng Ma
|
41628dc1b1
|
[HiCache] fix: check clear() method for storage backend (#10096)
Co-authored-by: hzh0425 <hzh0425@apache.org>
|
2025-09-06 22:59:58 -07:00 |
|
pansicheng
|
f84db115b1
|
Add storage read/write bandwidth logs to monitor kvcache performance (#9965)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-05 16:52:55 -07:00 |
|
JinYan Su
|
37565b7f21
|
fix(cache): move ongoing_prefetch pop after validation to prevent leak (#9927)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-03 02:39:34 +00:00 |
|
huangtingwei
|
cb9e0e4180
|
[HiCacheStorage] fix abort request host memory leaks (#9874)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-01 18:59:29 -07:00 |
|
Zhiqiang Xie
|
8b6966d020
|
[HiCache] Storage Refactoring (#9797)
Co-authored-by: pansicheng <27603155+pansicheng@users.noreply.github.com>
|
2025-08-31 22:58:21 +08:00 |
|
Teng Ma
|
f05c68733e
|
[HiCache] Clear kvcache in storage backend with fastAPI (#9750)
Co-authored-by: hzh0425 <hzh0425@apache.org>
|
2025-08-31 17:41:44 +08:00 |
|
Zhiqiang Xie
|
54e872d343
|
[HiCache] resolve conflict between chunked-prefill and hicache hit count (#9776)
|
2025-08-30 01:30:54 +08:00 |
|
Pablo Iyu Guerrero
|
a3aee7c377
|
fix: HiRadixCache: fix prefetch completion race (#9397)
|
2025-08-27 15:43:01 +08:00 |
|
hzh0425
|
c04c17edfa
|
refactor(hicache): Introduce generic HiCacheStorageConfig for improved configuration management (#9555)
Co-authored-by: Teng Ma <805522925@qq.com>
|
2025-08-26 17:55:20 -07:00 |
|
Zhiqiang Xie
|
43de1d7304
|
HiCache Storage fix host memory leak (#9648)
|
2025-08-26 10:49:40 -07:00 |
|
Zhiqiang Xie
|
0eec4cb6cc
|
HiCache, add bench long context plus minor fixs (#9086)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-08-11 16:54:52 -07:00 |
|
Zhiqiang Xie
|
9f78f391ae
|
HiCache Storage: generate hash when inserting new nodes (#9053)
|
2025-08-11 14:18:59 -07:00 |
|
Zhiqiang Xie
|
6e0b646832
|
HiCache Storage tp fix (#8878)
|
2025-08-09 01:16:51 -07:00 |
|
pansicheng
|
e2fd2b9c7e
|
Simple prefetch policy (#8692)
|
2025-08-08 02:09:28 -07:00 |
|
Zhiqiang Xie
|
dd7ca00601
|
Interface change for kvcache io to support page first layout (#8318)
|
2025-08-01 11:37:49 +08:00 |
|
Zhiqiang Xie
|
9305ea6c2d
|
HiCache, fixing hash value indexing (#8636)
|
2025-08-01 11:29:51 +08:00 |
|
huangtingwei
|
d904959233
|
Support l3 cache (mooncake store) for hiradix cache (#7211)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
Co-authored-by: AniZpZ <zhuangsen.zp@antgroup.com>
Co-authored-by: zuoyuan <zhangzuo21@mails.tsinghua.edu.cn>
Co-authored-by: @wangyueneng.wyn <wangyueneng.wyn@antgroup.com>
Co-authored-by: JinYan Su <jinyansu792@gmail.com>
|
2025-07-30 23:15:51 -07:00 |
|
huangtingwei
|
26c8a310bd
|
fix incorrect increase of hit count (#8533)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-07-31 06:02:42 +00:00 |
|
yi wang
|
5963e50503
|
[bugfix] Fix 2 minor bugs in the hicache storage layer (#8404)
|
2025-07-31 05:47:14 +00:00 |
|
pansicheng
|
299803343d
|
Add hf3fs support for hicache storage (based on #7704) (#7280)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-07-30 17:42:41 -07:00 |
|
Zhiqiang Xie
|
528bd1ed85
|
HiCache, check before terminate prefetching (#8372)
|
2025-07-26 23:13:16 -07:00 |
|
Zhiqiang Xie
|
145482f422
|
HiCache Storage TP Refinement (#8307)
Co-authored-by: pansicheng <sicheng.pan.chn@gmail.com>
|
2025-07-25 08:31:47 +08:00 |
|
Zhiqiang Xie
|
9d33fcfb8e
|
Hicache Storage Layer Prototype (#7704)
|
2025-07-18 15:20:19 +08:00 |
|
Zhiqiang Xie
|
2fc824b84c
|
Kernels for efficient KV cache IO (#7313)
|
2025-07-06 22:53:36 -07:00 |
|
Liangsheng Yin
|
05c9bc8956
|
[minor] simplify the TokenToKVPoolAllocator (#7414)
|
2025-06-22 12:37:18 +08:00 |
|
DarkSharpness
|
47367b768d
|
[Refactor] Clean up radix cache related API (#7303)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-06-20 00:58:48 +08:00 |
|
Zhiqiang Xie
|
e56685ac1b
|
Upstreaming hicache bug fixes (#7267)
|
2025-06-17 17:44:57 -07:00 |
|
Lianmin Zheng
|
a023856b12
|
Move host memory pools into a separate file (#7200)
|
2025-06-14 21:31:42 -07:00 |
|
Lifu Huang
|
3cf1473a09
|
Use monotonic clock for interval measurement (#6211)
Signed-off-by: Lifu Huang <lifu.hlf@gmail.com>
|
2025-05-17 16:49:18 -07:00 |
|
Zhiqiang Xie
|
70645f4d7d
|
upstream hicache fixes (#5570)
|
2025-04-20 23:08:30 -07:00 |
|
Zhiqiang Xie
|
e2574ee986
|
fix hicache write back (#5543)
|
2025-04-19 21:56:22 -07:00 |
|
Zhiqiang Xie
|
3fadc64793
|
bug fix for hicache host eviction (#4989)
|
2025-04-02 00:33:50 -07:00 |
|
Zhiqiang Xie
|
e119f04215
|
Large page size aligned hierarchical caching (#4581)
|
2025-04-01 22:38:15 -07:00 |
|
Zhiqiang Xie
|
a98290aea3
|
Unit test for Hierarchical Caching (#4486)
|
2025-03-17 17:45:00 -07:00 |
|
Zhiqiang Xie
|
f5bbf6037d
|
Fix: Complete int32 to int64 conversion (#4465)
|
2025-03-16 18:14:27 -07:00 |
|
JieXin Liang
|
1a3fa75f2f
|
[Fix] use torch.cat instead of torch.concat to prevent entering the Autograd backends. (#4466)
|
2025-03-16 00:02:47 -07:00 |
|
Lu Changqi
|
0e0ec70200
|
Hierarchical Caching supports MLA (#4009)
Signed-off-by: Changqi Lu <luchangqi.123@bytedance.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-03-13 20:42:14 -07:00 |
|
Zhiqiang Xie
|
fbdb50501f
|
Hot fix for hicache with new page aligned radixtree (#4397)
|
2025-03-13 15:50:49 -07:00 |
|
Lianmin Zheng
|
c76040e31b
|
Support page size > 1 (#4356)
|
2025-03-12 22:22:39 -07:00 |
|
Zhiqiang Xie
|
10b544ae9b
|
Hierarchical Caching Refactoring and Fixing TP issue (#4082)
|
2025-03-12 11:22:35 -07:00 |
|
Zhiqiang Xie
|
9376ac361d
|
Memory pool fix for upstream change about eagle (#4170)
|
2025-03-07 00:58:20 -08:00 |
|