[BugFix][Fusion] Patch compile backend to make fusion available (#5308)

vLLM PR https://github.com/vllm-project/vllm/pull/24252 currently causes
operator fusion to fail. Patching the compile backend mitigates this.
Once the problem is fully resolved, I will submit a follow-up pull
request to remove the patch.
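
For readers unfamiliar with this kind of mitigation, here is a minimal sketch of the general monkey-patching pattern. It is not the actual patch in this commit. The `upstream` module, both class bodies, `compile_sizes`, and `apply_patch` are hypothetical stand-ins that only mimic the idea of restoring a general-shape graph in a piecewise backend.

```python
import types

# Hypothetical stand-in for an upstream module; NOT the real vLLM code.
upstream = types.ModuleType("upstream_backends")


class PiecewiseBackend:
    """Stand-in backend that compiles only for specific sizes."""

    def __init__(self, compile_sizes):
        self.compile_sizes = set(compile_sizes)
        self.compiled = {}

    def __call__(self, size):
        # Compile lazily for known sizes; anything else falls back to eager.
        if size in self.compile_sizes and size not in self.compiled:
            self.compiled[size] = f"graph[{size}]"
        return self.compiled.get(size, "eager")


upstream.PiecewiseBackend = PiecewiseBackend


class PatchedPiecewiseBackend(PiecewiseBackend):
    """Assumed fix: keep one graph for the general (dynamic) shape."""

    def __init__(self, compile_sizes):
        super().__init__(compile_sizes)
        self.general_graph = "graph[dynamic]"

    def __call__(self, size):
        result = super().__call__(size)
        # Use the general-shape graph instead of falling back to eager.
        return self.general_graph if result == "eager" else result


def apply_patch(module):
    """Rebind the class on the module so later lookups get the patched one."""
    module.PiecewiseBackend = PatchedPiecewiseBackend


apply_patch(upstream)
backend = upstream.PiecewiseBackend(compile_sizes=[1, 8])
print(backend(8))
print(backend(3))
```

Because the patch only rebinds a module attribute, removing it later is a matter of dropping the `apply_patch` call, which matches the plan of deleting the patch once the underlying problem is fixed.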

- vLLM version: release/v0.13.0
- vLLM main: 5fbfa8d9ef
---------
Signed-off-by: wxsIcey <1790571317@qq.com>
This commit is contained in:
Icey
2025-12-26 09:18:16 +08:00
committed by GitHub
parent 7372225bcb
commit 9b2a7d8866
5 changed files with 273 additions and 25 deletions

@@ -106,6 +106,20 @@
# Future Plan:
# Remove this patch when vLLM merges the PR.
#
# ** 7. File: platform/patch_compile_backend.py**
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# 1. `vllm.compilation.backends.PiecewiseCompileInterpreter`
# `vllm.compilation.piecewise_backend.PiecewiseBackend`
# Why:
# vLLM removed the compiled graph for the general shape, which causes operator fusion to fail.
# This degrades model inference performance on Ascend.
# How:
# Restore the compiled graph for dynamic_shape in PiecewiseBackend.
# Related PR (if no, explain why):
# https://github.com/vllm-project/vllm/pull/24252
# Future Plan:
# Remove this patch once the problem is fixed.
#
# * Worker Patch:
# ===============
#