| Lifu Huang | 08ecd0aa2a | [3/4] Speed up CSGMV backend perf by 10% through dynamic chunking + kernel optimization (#10592) | 2025-09-20 22:47:48 -07:00 |
| Lifu Huang | 941002945b | [1/2] Refactor LoRA to support backend-specific batch preprocessing. (#10251) | 2025-09-10 09:58:37 -07:00 |
| gongwei-130 | ab62b135c1 | support Llama4 with non uniformed intermediate size across layers for… (#10047) | 2025-09-05 17:28:15 -07:00 |
| Lifu Huang | 4b74c3fcca | [chore] Clean up redundant lora_weight_names concept to simplify code (#9131) | 2025-08-17 12:36:58 -07:00 |
| Lifu Huang | f8a173bb50 | Improve LoRA Perf by Deprecating FlashInfer and Eliminating Redundant Tensor Ops (#8940) | 2025-08-10 01:04:45 -07:00 |
| Lifu Huang | 6e2151183b | Fix incorrect default get_hidden_dim logic (#8987) | 2025-08-09 00:25:38 -07:00 |
| Lifu Huang | e2ed9d049a | Refactor dynamic LoRA update to fix incorrect handling of variant weight shapes (#7844) | 2025-07-13 18:36:01 -07:00 |
| Lifu Huang | 1998ce4046 | Refactor LoRAManager and LoRAMemoryPool state management logic for dynamic LoRA loading support (#7412) | 2025-06-21 16:09:19 -07:00 |
| Lifu Huang | 477a101cbd | Refactor LoRA handling to support adapter tensors in fused format (#6585) | 2025-05-26 21:51:54 -07:00 |
| Lianmin Zheng | e8e18dcdcc | Revert "fix some typos" (#6244) | 2025-05-12 12:53:26 -07:00 |
| applesaucethebun | d738ab52f8 | fix some typos (#6209) Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca> | 2025-05-13 01:42:38 +08:00 |
| applesaucethebun | 2ce8793519 | Add typo checker in pre-commit (#6179) Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca> | 2025-05-11 12:55:00 +08:00 |
| chaobo jia | ef9a378a20 | [Feature] add multi-rank support for Lora (#4492) Co-authored-by: rudy152 <czh1137892874@gmail.com> | 2025-03-28 09:38:44 -07:00 |
| aoshen524 | 588865f0e0 | [Feature] Support Tensor Parallelism and Weight Slicing for Lora (#4274) Co-authored-by: ShenAo1111 <1377693092@qq.com> Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com> | 2025-03-18 20:33:07 -07:00 |
| Baizhou Zhang | c45cab1c00 | [Fix] Fix accuracy bug and refactor codes for lora (#3413) | 2025-02-10 13:29:00 +08:00 |
|