[BugFix][Fusion] Patch compile backend to make fusion available (#5308)
The vLLM PR https://github.com/vllm-project/vllm/pull/24252 currently
causes operator fusion to fail. Patching the compile backend mitigates
the failure. Once the problem is fully resolved, I will submit a new
pull request to remove the patch.
- vLLM version: release/v0.13.0
- vLLM main: 5fbfa8d9ef
---------
Signed-off-by: wxsIcey <1790571317@qq.com>
@@ -63,7 +63,7 @@ def test_models_with_xlite_decode_only(
vllm_xlite_answers = [
    "Hello, my name is Lina. I'm a 22-year-old student from China.",
    'The president of the United States is the same as the president of the United Nations. This is because the president',
    'The capital of France is Paris. The capital of Italy is Rome. The capital of Spain is Madrid',
    'The capital of France is Paris. The capital of France is also the capital of the French Republic.',
    'The future of AI is not just a technological challenge but a profound transformation of how we live, work'
]