[1/N][Refactor] torchair model runner refactor (#2205)

There is a lot of torchair code in the model runner, which makes the code hard to maintain. We'll create a new torchair_model_runner to split out the torchair-related logic. Following the workflow in #2203, this is the first PR.

What this PR does: creates the new torchair model runner; more functions will be added later.

- vLLM version: v0.10.0
- vLLM main: 586f286789

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-08-05 18:43:04 +08:00
parent 458ab2db12
commit 292fb8f696
4 changed files with 50 additions and 9 deletions
--- a/.github/workflows/vllm_ascend_test.yaml
+++ b/.github/workflows/vllm_ascend_test.yaml
@@ -196,6 +196,13 @@ jobs:
          pytest -sv tests/e2e/singlecard/test_guided_decoding.py
          pytest -sv tests/e2e/singlecard/test_camem.py
          pytest -sv tests/e2e/singlecard/test_embedding.py
+
+          # ------------------------------------ v1 spec decode test ------------------------------------ #
+          pytest -sv tests/e2e/singlecard/spec_decode_v1/test_v1_mtp_correctness.py
+          # TODO: revert me when test_v1_spec_decode.py::test_ngram_correctness is fixed
+          pytest -sv tests/e2e/singlecard/spec_decode_v1/test_v1_spec_decode.py
+
+          # All other tests, ignore: 310p test, accuracy test.
          pytest -sv tests/e2e/singlecard/ \
          --ignore=tests/e2e/singlecard/test_offline_inference.py \
          --ignore=tests/e2e/singlecard/test_ilama_lora.py \
@@ -204,13 +211,9 @@ jobs:
          --ignore=tests/e2e/singlecard/test_embedding.py \
          --ignore=tests/e2e/singlecard/spec_decode_v1/test_v1_mtp_correctness.py \
          --ignore=tests/e2e/singlecard/spec_decode_v1/test_v1_spec_decode.py \
-          --ignore=tests/e2e/singlecard/test_offline_inference_310p.py
-          # ------------------------------------ v1 spec decode test ------------------------------------ #
-          VLLM_USE_MODELSCOPE=True pytest -sv tests/e2e/singlecard/spec_decode_v1/test_v1_mtp_correctness.py
-          # TODO: revert me when test_v1_spec_decode.py::test_ngram_correctness is fixed
-          VLLM_USE_MODELSCOPE=True pytest -sv tests/e2e/singlecard/spec_decode_v1/test_v1_spec_decode.py
-
-  e2e-4-cards:
+          --ignore=tests/e2e/singlecard/test_offline_inference_310p.py \
+          --ignore=tests/e2e/singlecard/models/test_lm_eval_correctness.py
+  e2e-2-cards:
    needs: [e2e]
    if: ${{ needs.e2e.result == 'success' }}
    strategy: