Bugfix: Expose the user policy type interface (#3336)
This PR primarily focuses on two key changes:

1. Adjusts internal interface calls to optimize the interaction logic between related modules.
2. Exposes an interface that allows users to select the EPLB algorithm, enabling more flexible configuration based on specific usage scenarios.

These changes aim to enhance the usability of the system while ensuring the stability of internal operations. Relevant unit tests have been updated to cover the modified logic.

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: Che Ruan <cr623@ic.ac.uk>
Co-authored-by: Che Ruan <cr623@ic.ac.uk>
@@ -470,6 +470,7 @@ class NPUModelRunner(LoRAModelRunnerMixin):
         self.dynamic_eplb = self.ascend_config.dynamic_eplb
         if self.dynamic_eplb:
             self.is_eplb_warmuped = False
+            self.policy_type = self.ascend_config.eplb_policy_type
             self.eplb_loader = D2DExpertWeightLoader()
             self.manager = Manager()
             self.shared_dict = self.manager.dict({
@@ -478,7 +479,7 @@ class NPUModelRunner(LoRAModelRunnerMixin):
                 "expert_maps": None
             })
             self.eplb_process = EplbProcess(shared_dict=self.shared_dict,
-                                            policy_type=1,
+                                            policy_type=self.policy_type,
                                             enable_d2d=True)
             self.process = self.eplb_process._launch_process()
             ascend_config = get_ascend_config()
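
The effect of the diff can be illustrated with a small, self-contained sketch. It only mirrors the pattern shown above (a policy type read from configuration instead of the literal 1) and does not reproduce the real vllm-ascend classes. The names `SketchAscendConfig` and `eplb_process_kwargs` are hypothetical, and the default of 1 for `eplb_policy_type` is an assumption taken from the value that was previously hard-coded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchAscendConfig:
    """Hypothetical stand-in for the config object read by the model runner."""
    dynamic_eplb: bool = False
    # Assumption: the default mirrors the previously hard-coded policy_type=1.
    eplb_policy_type: int = 1


def eplb_process_kwargs(config: SketchAscendConfig) -> Optional[dict]:
    """Return the keyword arguments the runner would hand to the EPLB process.

    Returns None when dynamic EPLB is disabled, since the runner only sets
    up the process inside the `if self.dynamic_eplb:` branch.
    """
    if not config.dynamic_eplb:
        return None
    return {
        # Before this change the value was always the literal 1.
        "policy_type": config.eplb_policy_type,
        "enable_d2d": True,
    }


if __name__ == "__main__":
    # A user-selected policy type now reaches the process arguments.
    print(eplb_process_kwargs(SketchAscendConfig(dynamic_eplb=True, eplb_policy_type=2)))
```

With this shape, leaving `eplb_policy_type` unset keeps the old behaviour under the stated default assumption, while any other value is passed through unchanged.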