Support pinning adapter via server args. (#9249)

Lifu Huang
2025-08-20 16:25:01 -07:00
committed by GitHub
parent 24eaebeb4b
commit b0980af89f
8 changed files with 162 additions and 55 deletions

@@ -298,7 +298,7 @@ class TokenizerManager:
# The registry dynamically updates as adapters are loaded / unloaded during runtime. It
# serves as the source of truth for available adapters and maps user-friendly LoRA names
# to internally used unique LoRA IDs.
-self.lora_registry = LoRARegistry(self.server_args.lora_paths or {})
+self.lora_registry = LoRARegistry(self.server_args.lora_paths)
# Lock to serialize LoRA update operations.
# Please note that, unlike `model_update_lock`, this does not block inference, allowing
# LoRA updates and inference to overlap.
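
Note on the hunk above: it appears to drop the `or {}` fallback when constructing the registry, which is only safe if the registry itself tolerates a missing `lora_paths`. Whether SGLang's `LoRARegistry` does so is not shown in this hunk. The sketch below is a hypothetical stand-in (`SketchLoRARegistry`, its methods, and the `demo` helper are all illustrative names, not SGLang's API) showing one way a registry could normalize `None` internally, map user-friendly names to unique IDs, and have its updates serialized by a dedicated lock.

```python
import asyncio
import uuid
from typing import Dict, List, Optional


class SketchLoRARegistry:
    """Hypothetical stand-in for illustration; not SGLang's LoRARegistry."""

    def __init__(self, lora_paths: Optional[Dict[str, str]] = None):
        # Normalizing None here is what would let a caller drop `or {}`.
        self._name_to_id: Dict[str, str] = {}
        self._id_to_path: Dict[str, str] = {}
        for name, path in (lora_paths or {}).items():
            self.register(name, path)

    def register(self, name: str, path: str) -> str:
        """Map a user-friendly name to a freshly generated unique ID."""
        if name in self._name_to_id:
            raise ValueError(f"LoRA adapter {name!r} is already registered")
        lora_id = uuid.uuid4().hex
        self._name_to_id[name] = lora_id
        self._id_to_path[lora_id] = path
        return lora_id

    def unregister(self, name: str) -> str:
        """Remove an adapter by name and return the ID it was using."""
        lora_id = self._name_to_id.pop(name)
        del self._id_to_path[lora_id]
        return lora_id

    def resolve(self, name: str) -> str:
        """Look up the internal ID for a user-friendly name."""
        return self._name_to_id[name]

    def names(self) -> List[str]:
        return sorted(self._name_to_id)


async def demo() -> SketchLoRARegistry:
    registry = SketchLoRARegistry(None)  # no adapters configured at startup
    # A dedicated lock serializes registry updates only; nothing else waits on it.
    update_lock = asyncio.Lock()

    async def load(name: str, path: str) -> str:
        async with update_lock:
            return registry.register(name, path)

    await asyncio.gather(load("adapter_a", "/tmp/a"), load("adapter_b", "/tmp/b"))
    return registry
```

Running `asyncio.run(demo())` yields a registry holding both adapters. Because the lock guards only registry updates, other coroutines on the same event loop keep running while an update is in flight, which mirrors the overlap described in the comment above.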