[Feature] Add docs for batch invariance and patch some extra operators (#6910)

### What this PR does / why we need it?
This PR adds documentation for batch invariance and patches some additional operators based on validation results. See https://github.com/vllm-project/vllm-ascend/issues/5487 to track progress.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- vLLM version: v0.16.0
- vLLM main: 15d76f74e2

---------

Signed-off-by: Ronald1995 <ronaldautomobile@163.com>
2026-03-05 09:12:40 +08:00
parent f8315f5717
commit 77e009d9fc
7 changed files with 276 additions and 19 deletions
--- a/vllm_ascend/utils.py
+++ b/vllm_ascend/utils.py
@@ -258,10 +258,19 @@ def enable_custom_op():
    Enable lazy init for vllm_ascend_C to avoid early initialization of CANN's RTS component.
    Ensure that ASCEND_RT_VISIBLE_DEVICES can be dynamically modified before torch.npu.set_device().
    """
+    from vllm.model_executor.layers.batch_invariant import vllm_is_batch_invariant
+
    global _CUSTOM_OP_ENABLED

    if _CUSTOM_OP_ENABLED is not None:
        return _CUSTOM_OP_ENABLED
+
+    # Some custom operators in vllm-ascend are not implemented to be
+    # batch invariant, so disable them when batch-invariant mode is on.
+    if vllm_is_batch_invariant():
+        _CUSTOM_OP_ENABLED = False
+        return _CUSTOM_OP_ENABLED
+
    try:
        # isort: off
        # register custom ops into torch_library here