huangtingwei
|
e05555fad8
|
[HiCacheStorage] mooncake store support page_first_direct layout (#10591)
|
2025-09-28 20:45:48 -07:00 |
|
Teng Ma
|
9816989bff
|
[HiCache] bug: fix mooncake store batch set v1 (#11013)
|
2025-09-28 23:18:48 +08:00 |
|
hzh0425
|
c8a5d12abe
|
[HiCache]: Support dynamic loading backends for hicache (#10551)
Co-authored-by: Teng Ma <sima.mt@alibaba-inc.com>
|
2025-09-26 18:34:11 -07:00 |
|
yi wang
|
fce170480a
|
integrate AIBrix KVcache (#10376)
|
2025-09-25 14:47:09 +08:00 |
|
Zhiqiang Xie
|
3d40794fcf
|
[HiCache] Cleaning the deprecated host memory state (#10778)
|
2025-09-25 14:43:53 +08:00 |
|
pansicheng
|
d4041a5eeb
|
refactor zero copy (#10300)
Co-authored-by: 晟海 <huangtingwei.htw@antgroup.com>
Co-authored-by: huangtingwei <141888744+huangtingwei9988@users.noreply.github.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
|
2025-09-22 15:17:31 -07:00 |
|
Xinyuan Tong
|
12d6cf18f0
|
Refactors radix cache for extra key support (#10317)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
|
2025-09-22 02:16:16 +08:00 |
|
huangtingwei
|
7f399e4bce
|
[HiCacheStorage] support page_first_direct layout for generic set&get (#10522)
|
2025-09-19 05:47:16 -07:00 |
|
FlyPanda
|
8b713c7248
|
Hicache L3 backend mooncake optimization configuration reading method (#10319)
Co-authored-by: Teng Ma <sima.mt@alibaba-inc.com>
Co-authored-by: huangtingwei <141888744+huangtingwei9988@users.noreply.github.com>
Co-authored-by: shicang <shicang@shicang>
Co-authored-by: Shangming Cai <csmthu@gmail.com>
|
2025-09-19 12:25:01 +08:00 |
|
Xuchun Shang
|
1ccd59c715
|
[HICache] introduce evict policy (#10190)
Signed-off-by: Xuchun Shang <xuchun.shang@linux.alibaba.com>
Co-authored-by: Teng Ma <sima.mt@alibaba-inc.com>
|
2025-09-18 11:10:20 +08:00 |
|
Lianmin Zheng
|
f949ad5794
|
[Auto Sync] Update activation.py, chunk_cache.py, utils.py (20250917) (#10538)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
|
2025-09-16 17:06:43 -07:00 |
|
ykwd
|
4bb08f6e07
|
[Hicache] Evaluate Per-Round Metrics in Multiturn Bench (#10203)
Co-authored-by: Teng Ma <sima.mt@alibaba-inc.com>
|
2025-09-15 19:34:40 -07:00 |
|
Binyao Jiang
|
9752861002
|
[Fix] Support qwen3-next MTP+DP (#10392)
|
2025-09-13 17:45:04 +08:00 |
|
Yi Zhang
|
297d374510
|
support qwen3_next blackwell (#10403)
|
2025-09-13 17:18:26 +08:00 |
|
Binyao Jiang
|
31e9d3a5aa
|
[Fix] Init mamba related memory pools with torch.zeros (#10400)
|
2025-09-13 14:16:48 +08:00 |
|
Teng Ma
|
49f169d53e
|
[HiCache] doc: update deployment in readme (#10332)
Signed-off-by: Teng Ma <sima.mt@alibaba-inc.com>
|
2025-09-12 16:35:37 -07:00 |
|
Teng Ma
|
7fce2fd91a
|
[HiCache] fix mooncake config in different tp size (#10377)
|
2025-09-12 16:34:23 -07:00 |
|
Even Zhou
|
16cd550c85
|
Support Qwen3-Next on Ascend NPU (#10379)
|
2025-09-12 16:31:37 -07:00 |
|
huangtingwei
|
b4c2c421e9
|
support memory_pool_host page first direct layout (#10031)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-11 23:19:44 -07:00 |
|
Stefan He
|
6c18ab46a2
|
[Qwen3-Next] switch to triton and cache conv states to accelerate MTP from 300 tok/s to 341 tok/s (#10335)
Co-authored-by: Binyao Jiang <byjiang1996@gmail.com>
|
2025-09-11 11:59:48 -07:00 |
|
Yi Zhang
|
30c6e1f569
|
Qwen3-Next support (#10233)
Co-authored-by: cao1zhg <114661107+cao1zhg@users.noreply.github.com>
Co-authored-by: ispobock <ispobaoke@gmail.com>
Co-authored-by: Binyao Jiang <byjiang1996@gmail.com>
Co-authored-by: hebiao064 <hebiaobuaa@gmail.com>
Co-authored-by: Lifu Huang <lifu.hlf@gmail.com>
Co-authored-by: qingquansong <ustcsqq@gmail.com>
Co-authored-by: Yaoyao Ding <dingyaoyao.cs@gmail.com>
Co-authored-by: Ke Bao <ISPObaoke@163.com>
Co-authored-by: Minglei Zhu <mingleizhu1122@gmail.com>
|
2025-09-11 04:11:49 -07:00 |
|
Teng Ma
|
8471e5e616
|
[HiCache] feat: add mooncake backend extra config (#10213)
|
2025-09-09 12:50:00 -07:00 |
|
DarkSharpness
|
948b01a04c
|
[Refactor] Remove Hicache Load & Write threads (#10127)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-08 22:18:50 -07:00 |
|
hzh0425
|
ec99668ab7
|
[Hicache]: Add E2E CI For 3FS-KVStore (#10131)
|
2025-09-08 01:54:50 -07:00 |
|
Huaiyu, Zheng
|
ee21817c6b
|
enable llama3.1-8B on xpu (#9434)
|
2025-09-07 22:34:20 -07:00 |
|
Shisong Ma
|
33467c05a4
|
[BUG FIX] add fail check when get fail in case wait complete block (#9971)
Co-authored-by: mashisong <mashisong@bytedance.com>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-07 18:34:04 -07:00 |
|
Teng Ma
|
41628dc1b1
|
[HiCache] fix: check clear() method for storage backend (#10096)
Co-authored-by: hzh0425 <hzh0425@apache.org>
|
2025-09-06 22:59:58 -07:00 |
|
Yuwei An
|
9a7ced4e4d
|
[Feature] LMCache Connector Integration (#9741)
Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
Signed-off-by: YuhanLiu11 <yliu738@wisc.edu>
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-06 20:14:55 -07:00 |
|
Zhiqiang Xie
|
0b8c5721f1
|
[HiStorage] Remove delete and clear as necessary methods (#10039)
|
2025-09-06 10:27:26 +08:00 |
|
Xinyuan Tong
|
273b28344b
|
[Minor] Refactors KV memory pool (#9842)
|
2025-09-05 17:06:08 -07:00 |
|
pansicheng
|
f84db115b1
|
Add storage read/write bandwidth logs to monitor kvcache performance (#9965)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-05 16:52:55 -07:00 |
|
ykwd
|
93088b6975
|
[Hicache] Mooncake API Fix & Test, and Improved Readme (#9951)
Co-authored-by: Teng Ma <sima.mt@alibaba-inc.com>
|
2025-09-04 13:55:39 -07:00 |
|
pansicheng
|
d07304870b
|
fix 3fs zerocopy (#9938)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-04 13:24:12 -07:00 |
|
hzh0425
|
106c2b31fb
|
feat(hicache): Add generic hicache ci e2e test and benchmark test (#9846)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-04 20:43:46 +08:00 |
|
Xinyuan Tong
|
56eb5d0a3d
|
fix swa clear(): rename is_in_free_group to is_not_in_free_group (#9914)
|
2025-09-03 11:42:12 -07:00 |
|
JinYan Su
|
37565b7f21
|
fix(cache): move ongoing_prefetch pop after validation to prevent leak (#9927)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-03 02:39:34 +00:00 |
|
Zhiqiang Xie
|
369b143366
|
[HiCache] Minor fix on file storage backend (#9869)
|
2025-09-02 15:52:37 -07:00 |
|
hzh0425
|
4d89389c4f
|
Fix the key passing issue in page first layout. (#9929)
|
2025-09-02 11:30:11 -07:00 |
|
hzh0425
|
58d06fdc95
|
[HiCacheStorage]: Improve 3fs kvstore's performance and resolve mla issues (#9876)
|
2025-09-01 19:01:48 -07:00 |
|
huangtingwei
|
cb9e0e4180
|
[HiCacheStorage] fix abort request host memory leaks (#9874)
Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
|
2025-09-01 18:59:29 -07:00 |
|
huangtingwei
|
b361750a4a
|
Mooncake store get zero copy meta optimization (#9857)
|
2025-09-01 03:27:56 -07:00 |
|
Zhiqiang Xie
|
8b6966d020
|
[HiCache] Storage Refactoring (#9797)
Co-authored-by: pansicheng <27603155+pansicheng@users.noreply.github.com>
|
2025-08-31 22:58:21 +08:00 |
|
Teng Ma
|
f05c68733e
|
[HiCache] Clear kvcache in storage backend with fastAPI (#9750)
Co-authored-by: hzh0425 <hzh0425@apache.org>
|
2025-08-31 17:41:44 +08:00 |
|
Zhiqiang Xie
|
f9076a5a2c
|
hot fix for mooncake batch set api (#9836)
|
2025-08-30 21:01:51 -07:00 |
|
hzh0425
|
161e9dc51e
|
feat(hicache-3fs): 3FS-Store Backup Optimizations For MLA Model. (#9692)
|
2025-08-29 10:48:51 -07:00 |
|
Zhiqiang Xie
|
54e872d343
|
[HiCache] resolve conflict between chunked-prefill and hicache hit count (#9776)
|
2025-08-30 01:30:54 +08:00 |
|
hzh0425
|
38cd5fb1e0
|
bugfix(hicache): Move exists check before key suffixing (#9749)
|
2025-08-28 18:29:47 -07:00 |
|
chenxu140
|
74dd4249ac
|
[Feature] Support NPUGraph for DeepSeek on Ascend NPU (#9355)
Co-authored-by: Even Zhou <even.y.zhou@outlook.com>
|
2025-08-28 16:06:24 -07:00 |
|
huangtingwei
|
55349e361d
|
support mooncake store dp attention (#9684)
|
2025-08-28 12:31:31 +08:00 |
|
huangtingwei
|
ae7428a8a7
|
fix mooncake store mla zero copy meta (#9678)
|
2025-08-27 15:43:16 +08:00 |
|