[CherryPick] Add unpadded Qwen2.5-VL for verl scenario (#1095)

Add unpadded Qwen2.5-VL for verl scenario.

When using vllm-ascend in a verl scenario, set `USE_OPTIMIZED_MODEL`
(default `1`) to `0` to fall back to the unpadded Qwen2.5-VL and avoid errors.
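For illustration, the variable can also be set from Python before vllm-ascend reads it; the snippet below is a minimal sketch, and the `bool(int(...))` parsing simply mirrors the definition added in this patch:

```python
import os

# Disable the optimized model path, e.g. for RLHF training with verl.
# The value must be an integer string ("0" or "1"); vllm-ascend parses
# it as bool(int(value)), so "False"/"True" would raise ValueError.
os.environ["USE_OPTIMIZED_MODEL"] = "0"

# Mirror of the parsing lambda added in this patch:
use_optimized = bool(int(os.getenv("USE_OPTIMIZED_MODEL", "1")))
print(use_optimized)  # False when the variable is set to "0"
```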

This is cherry-picked from 0.7.3-dev.

Signed-off-by: shen-shanshan <467638484@qq.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: Shanshan Shen <467638484@qq.com>
This commit is contained in:
wangxiyuan
2025-06-07 19:45:46 +08:00
committed by GitHub
parent b80a484864
commit c8742146d3
3 changed files with 288 additions and 4 deletions


@@ -128,6 +128,11 @@ env_variables: Dict[str, Callable[[], Any]] = {
# enable `pin_memory` while creating a tensor using `torch.tensor`.
"VLLM_ASCEND_ACL_OP_INIT_MODE":
lambda: os.getenv("VLLM_ASCEND_ACL_OP_INIT_MODE", '1'),
# Some models are optimized by vllm-ascend. In some cases, e.g. RLHF
# training, the optimized model may not be suitable. Set this value
# to 0 to disable the optimized model.
"USE_OPTIMIZED_MODEL":
lambda: bool(int(os.getenv('USE_OPTIMIZED_MODEL', '1'))),
}
# end-env-vars-definition
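A hedged sketch of how such a flag typically gates model selection follows; the function names and class strings (`select_qwen2_5_vl_class`, `OptimizedQwen25VL`, `UnpaddedQwen25VL`) are hypothetical placeholders for illustration, not the actual vllm-ascend registration API:

```python
import os

def use_optimized_model() -> bool:
    # Same parsing as the env_variables entry above: "1" -> True, "0" -> False.
    return bool(int(os.getenv("USE_OPTIMIZED_MODEL", "1")))

def select_qwen2_5_vl_class() -> str:
    # Hypothetical selection logic; the real registration in vllm-ascend differs.
    if use_optimized_model():
        return "OptimizedQwen25VL"   # padded/optimized variant (the default)
    return "UnpaddedQwen25VL"        # unpadded variant for RLHF/verl training

os.environ["USE_OPTIMIZED_MODEL"] = "0"
print(select_qwen2_5_vl_class())  # UnpaddedQwen25VL
```

Reading the variable lazily through a function (or, as in the patch, a lambda in a dict) means the value is picked up at call time, so a trainer can toggle it before the engine initializes.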