| Author | Commit | Message | Date |
|---|---|---|---|
| zyksir | 8e3797be1c | support 1 shot allreduce in 1-node and 2-node using mscclpp (#6277) | 2025-06-04 22:11:24 -07:00 |
| Wenxuan Tan | 844a8f42c7 | Fix LoRA bench (#6719) | 2025-05-28 16:38:55 -07:00 |
| Qiaolin Yu | cd8d4b9dfc | Fix lora bench (#6302) | 2025-05-15 10:09:55 -07:00 |
| Qiaolin Yu | 8c0cfca87d | Feat: support cuda graph for LoRA (#4115); Co-authored-by: Beichen Ma <mabeichen12@gmail.com> | 2025-04-28 23:30:44 -07:00 |
| Brayden Zhong | b149b39353 | [CI] Remove unused imports with Ruff to pre-commit config, only to benchmarks/docs/examples folder (#3969) | 2025-03-27 19:45:02 -07:00 |
| aoshen524 | 588865f0e0 | [Feature] Support Tensor Parallelism and Weight Slicing for Lora (#4274); Co-authored-by: ShenAo1111 <1377693092@qq.com>; Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com> | 2025-03-18 20:33:07 -07:00 |
| Baizhou Zhang | 70817a7eae | [Feature] Define backends and add Triton backend for Lora (#3161); Co-authored-by: Ying Sheng <sqy1415@gmail.com> | 2025-02-03 22:09:13 -08:00 |
| Xuehai Pan | 62a4a339eb | docs: fix module docstrings and copyright headers (#2077) | 2024-11-22 22:16:53 +08:00 |
| Ying Sheng | 9c064bf78a | [LoRA, Performance] Speedup multi-LoRA serving - Step 1 (#1587) | 2024-10-06 10:33:44 -07:00 |
| Ying Sheng | 37963394aa | [Feature] Support LoRA path renaming and add LoRA serving benchmarks (#1433) | 2024-09-15 12:46:04 -07:00 |