[BugFix][Fusion] Patch compile backend to make fusion available (#5308)

The vLLM PR https://github.com/vllm-project/vllm/pull/24252 currently
causes operator fusion to fail. This commit mitigates the failure by
patching the compile backend. Once the underlying problem is fully
resolved, I will submit a follow-up pull request to remove the patch.
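
For readers unfamiliar with this kind of temporary workaround, here is a minimal sketch of a reversible monkey patch. It is not the code in this commit: `HypotheticalBackend`, `apply_fusion_patch`, `remove_fusion_patch`, and the `fused` flag are names made up for illustration and do not correspond to any vLLM or vllm-ascend API.

```python
import functools


class HypotheticalBackend:
    """Stand-in for a compile backend (assumption, not a real vLLM class)."""

    def compile(self, graph):
        # Simulates an upstream code path that no longer runs fusion passes.
        return {"graph": graph, "fused": False}


def apply_fusion_patch(backend_cls):
    """Wrap backend_cls.compile so the fusion step runs again.

    The patch is idempotent: applying it twice wraps the method only once.
    """
    original = backend_cls.compile
    if getattr(original, "_fusion_patched", False):
        return  # already patched, nothing to do

    @functools.wraps(original)
    def patched(self, graph):
        result = original(self, graph)
        # Placeholder for re-running the fusion passes on the compiled graph.
        result["fused"] = True
        return result

    patched._fusion_patched = True
    backend_cls.compile = patched


def remove_fusion_patch(backend_cls):
    """Restore the original method once the upstream fix makes the patch obsolete."""
    current = backend_cls.compile
    if getattr(current, "_fusion_patched", False):
        # functools.wraps stores the wrapped function in __wrapped__.
        backend_cls.compile = current.__wrapped__
```

Keeping the original callable reachable through `__wrapped__` means the patch can be removed in one place later, which matches the intent of dropping it in a follow-up pull request.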

- vLLM version: release/v0.13.0
- vLLM main: 5fbfa8d9ef
---------
Signed-off-by: wxsIcey <1790571317@qq.com>
Icey
2025-12-26 09:18:16 +08:00
committed by GitHub
parent 7372225bcb
commit 9b2a7d8866
5 changed files with 273 additions and 25 deletions


@@ -63,7 +63,7 @@ def test_models_with_xlite_decode_only(
     vllm_xlite_answers = [
         "Hello, my name is Lina. I'm a 22-year-old student from China.",
         'The president of the United States is the same as the president of the United Nations. This is because the president',
-        'The capital of France is Paris. The capital of Italy is Rome. The capital of Spain is Madrid',
+        'The capital of France is Paris. The capital of France is also the capital of the French Republic.',
         'The future of AI is not just a technological challenge but a profound transformation of how we live, work'
     ]