Lifu Huang | 6210e2c4f0 | Support GPU pinning for LoRA (#8697) | 2025-08-06 19:39:45 -07:00
Lifu Huang | 8675bdf246 | Support limiting max loaded loras in CPU. (#8650) | 2025-08-03 00:02:23 -07:00
Lifu Huang | 4e3defe5a7 | Support start up LoRA server without initial adapters (#8019) | 2025-07-19 15:38:09 -07:00
Lifu Huang | e2ed9d049a | Refactor dynamic LoRA update to fix incorrect handling of variant weight shapes (#7844) | 2025-07-13 18:36:01 -07:00
Qiaolin Yu | 8c0cfca87d | Feat: support cuda graph for LoRA (#4115) (Co-authored-by: Beichen Ma <mabeichen12@gmail.com>) | 2025-04-28 23:30:44 -07:00
Baizhou Zhang | 072b4d0398 | Add document for LoRA serving (#5521) | 2025-04-20 14:37:57 -07:00