This PR primarily focuses on two key changes:
1. Adjusts internal interface calls to optimize the interaction logic
between related modules.
2. Exposes an interface that allows users to select the EPLB algorithm,
enabling more flexible configuration based on specific usage scenarios.
These changes aim to enhance the usability of the system while ensuring
the stability of internal operations. Relevant unit tests have been
updated to cover the modified logic.
- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0
---------
Signed-off-by: Che Ruan <cr623@ic.ac.uk>
Co-authored-by: Che Ruan <cr623@ic.ac.uk>