The Ascend scheduler was originally added for the non-chunked-prefill case,
because the NPU ops did not work well with chunked prefill.
Now that the ops work better with chunked prefill, it is time to remove the
Ascend scheduler and use the vLLM default scheduler.
- vLLM version: v0.11.2
---------
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
This PR comments out the newly added VLM e2e test for the Ascend scheduler
scenario, because the test hangs when run in multi-batch mode. It needs to
be added back once this issue is resolved.
- vLLM version: v0.11.0rc3
- vLLM main: 17c540a993
Signed-off-by: whx-sjtu <2952154980@qq.com>
This PR fixes a bug that occurs when running multi-modal models with
AscendScheduler. The bug was introduced by PR #2372, which used the same
parameter names as vLLM but with different default values.
For now, the bug is fixed by changing the default values of these two
parameters to align with vLLM.
- vLLM version: v0.11.0rc3
- vLLM main: 17c540a993
Signed-off-by: hw_whx <wanghexiang7@huawei.com>
Co-authored-by: hw_whx <wanghexiang7@huawei.com>
### What this PR does / why we need it?
This PR revises the test cases for various features in the repository,
enabling aclgraph in those test cases.
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
ut
- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0
Signed-off-by: lilinsiman <lilinsiman@gmail.com>
### What this PR does / why we need it?
Fix CI by addressing max_split_size_mb config
### Does this PR introduce _any_ user-facing change?
No, test only
### How was this patch tested?
Full CI passed, especially the eagle one
- vLLM version: v0.10.2
- vLLM main: https://github.com/vllm-project/vllm/commit/releases/v0.11.0
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Refactor E2E CI to make it clearer and faster:
1. Remove some useless e2e tests
2. Remove some useless functions
3. Make sure all tests run with VLLMRunner to avoid OOM errors
4. Make sure all ops tests end with torch.empty_cache to avoid OOM errors
5. Run the tests one by one to avoid resource limit errors
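
The cleanup pattern behind item 4 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `release_device_memory` is a hypothetical helper name, and the cache-clearing function is passed in as a callable so the sketch does not assume any particular torch API.

```python
import gc
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def release_device_memory(empty_cache: Callable[[], None]) -> Iterator[None]:
    """Run a test body, then always release cached memory.

    Hypothetical helper: `empty_cache` stands in for whatever
    cache-clearing function the backend provides.
    """
    try:
        yield
    finally:
        gc.collect()   # drop unreachable Python objects first
        empty_cache()  # then ask the backend to release its cached blocks


# Usage with a stand-in callable that only records the call.
calls = []
with release_device_memory(lambda: calls.append("emptied")):
    pass  # the body of an ops test would go here
```

Because the cleanup sits in a `finally` clause, it also runs when the test body raises, so a failing test does not leave memory cached for the next one.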
- vLLM version: v0.10.1.1
- vLLM main: a344a5aa0a
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>