Commit Graph

43 Commits

Author SHA1 Message Date
ykwd
93088b6975 [Hicache] Mooncake API Fix & Test, and Improved Readme (#9951)
Co-authored-by: Teng Ma <sima.mt@alibaba-inc.com>
2025-09-04 13:55:39 -07:00
pansicheng
d07304870b fix 3fs zerocopy (#9938)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
2025-09-04 13:24:12 -07:00
Zhiqiang Xie
369b143366 [HiCache] Minor fix on file storage backend (#9869) 2025-09-02 15:52:37 -07:00
hzh0425
4d89389c4f Fix the key passing issue in page first layout. (#9929) 2025-09-02 11:30:11 -07:00
ykwd
53976fce97 [Hicache] Generic page get bugfix (#9909) 2025-09-02 20:22:06 +08:00
Zhiqiang Xie
8b6966d020 [HiCache] Storage Refactoring (#9797)
Co-authored-by: pansicheng <27603155+pansicheng@users.noreply.github.com>
2025-08-31 22:58:21 +08:00
huangtingwei
55349e361d support mooncake store dp attention (#9684) 2025-08-28 12:31:31 +08:00
hzh0425
c04c17edfa refactor(hicache): Introduce generic HiCacheStorageConfig for improved configuration management (#9555)
Co-authored-by: Teng Ma <805522925@qq.com>
2025-08-26 17:55:20 -07:00
hzh0425
79ce3688bb BugFix(hicache): Fix host indices out of bound error (#9637)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
2025-08-26 10:42:23 -07:00
ykwd
80dc76e11a [Fix] HiCache Bugfix & Mooncake Error Handling Enhance (#8901)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
2025-08-25 19:05:10 -07:00
hzh0425
83871aa12d feat(hicache): Supports 3fs-hicache compatibility with dp-attention (#9372) 2025-08-23 02:08:32 -07:00
huangtingwei
6078d5fcc0 [HiCacheStorage] backup optimization for MLA model (#8865)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
2025-08-22 18:03:51 +08:00
pansicheng
70cf4abccc 3fs zerocopy (#9109)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
2025-08-22 17:56:38 +08:00
pansicheng
733446dd36 fix io group (#9154)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
2025-08-14 12:46:42 +08:00
huangtingwei
0edda32001 Support page first layout zero copy for mooncake store (#8651)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
2025-08-12 15:59:26 -07:00
Zhiqiang Xie
9f78f391ae HiCache Storage: generate hash when inserting new nodes (#9053) 2025-08-11 14:18:59 -07:00
cctry
5c31b35db2 [hicache] Optimization for DMA copy (#8245) 2025-08-09 17:16:07 -07:00
Zhiqiang Xie
6e0b646832 HiCache Storage tp fix (#8878) 2025-08-09 01:16:51 -07:00
pansicheng
e2fd2b9c7e Simple prefetch policy (#8692) 2025-08-08 02:09:28 -07:00
Baron Liu
36fc9260a2 [bugfix] fix import path in HiCacheController (#8749) 2025-08-03 22:19:15 -07:00
hzh0425
d1c4d51c08 bugfix(hicache): Fix 'MooncakeStore' not defined error. (#8668) 2025-08-01 15:58:17 -07:00
Zhiqiang Xie
dd7ca00601 Interface change for kvcache io to support page first layout (#8318) 2025-08-01 11:37:49 +08:00
pansicheng
3dde86194a Conditionally import HiCacheHF3FS (#8598)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
2025-07-31 14:59:29 -07:00
Vishwanath Venkatesan
2cd2e27f80 SGLang HiCache NIXL Connector (#8488)
Signed-off-by: Vishwanath Venkatesan <vvenkatesan@nvidia.com>
Co-authored-by: Moein Khazraee <moein@nvidia.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
2025-07-31 13:09:42 -07:00
huangtingwei
d904959233 Support l3 cache (mooncake store) for hiradix cache (#7211)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
Co-authored-by: AniZpZ <zhuangsen.zp@antgroup.com>
Co-authored-by: zuoyuan <zhangzuo21@mails.tsinghua.edu.cn>
Co-authored-by: @wangyueneng.wyn <wangyueneng.wyn@antgroup.com>
Co-authored-by: JinYan Su <jinyansu792@gmail.com>
2025-07-30 23:15:51 -07:00
pansicheng
299803343d Add hf3fs support for hicache storage (based on #7704) (#7280)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
2025-07-30 17:42:41 -07:00
Zhiqiang Xie
528bd1ed85 HiCache, check before terminate prefetching (#8372) 2025-07-26 23:13:16 -07:00
Zhiqiang Xie
145482f422 HiCache Storage TP Refinement (#8307)
Co-authored-by: pansicheng <sicheng.pan.chn@gmail.com>
2025-07-25 08:31:47 +08:00
Zhiqiang Xie
f39037fffb HiCache Fix (#8288)
Co-authored-by: pansicheng <sicheng.pan.chn@gmail.com>
2025-07-23 16:51:32 +08:00
Zhiqiang Xie
9d33fcfb8e Hicache Storage Layer Prototype (#7704) 2025-07-18 15:20:19 +08:00
Zhiqiang Xie
2fc824b84c Kernels for efficient KV cache IO (#7313) 2025-07-06 22:53:36 -07:00
Liangsheng Yin
05c9bc8956 [minor] simplify the TokenToKVPoolAllocator (#7414) 2025-06-22 12:37:18 +08:00
Zhiqiang Xie
e56685ac1b Upstreaming hicache bug fixes (#7267) 2025-06-17 17:44:57 -07:00
Lianmin Zheng
a023856b12 Move host memory pools into a separate file (#7200) 2025-06-14 21:31:42 -07:00
huangtingwei
d2cb3024f2 fix bug that gpu0 occupies more memory when hicache is turned on (#5778)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
2025-05-09 15:36:08 -07:00
Zhiqiang Xie
e119f04215 Large page size aligned hierarchical caching (#4581) 2025-04-01 22:38:15 -07:00
Lu Changqi
0e0ec70200 Hierarchical Caching supports MLA (#4009)
Signed-off-by: Changqi Lu <luchangqi.123@bytedance.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
2025-03-13 20:42:14 -07:00
Zhiqiang Xie
fbdb50501f Hot fix for hicache with new page aligned radixtree (#4397) 2025-03-13 15:50:49 -07:00
Zhiqiang Xie
10b544ae9b Hierarchical Caching Refactoring and Fixing TP issue (#4082) 2025-03-12 11:22:35 -07:00
Zhiqiang Xie
9376ac361d Memory pool fix for upstream change about eagle (#4170) 2025-03-07 00:58:20 -08:00
Ying Sheng
d3d4d76758 [Eagle] Refactor eagle speculative decoding (#3986)
Co-authored-by: Ke Bao <ISPObaoke@163.com>
2025-03-05 08:06:07 -08:00
Zhiqiang Xie
6c7a152c5a Hierarchical Caching for SGLang (#2693)
Co-authored-by: Wenxuan Tan <wenxuan.tan@wisc.edu>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2025-02-23 21:56:30 -08:00
Zhiqiang Xie
5d6e9467d4 Cache controller for hierarchical caching (#2804) 2025-01-10 20:22:01 -08:00