xc-llm-ascend/tests/e2e/multicard/test_ilama_lora_tp2.py

import pytest
from modelscope import snapshot_download  # type: ignore

from tests.e2e.conftest import VllmRunner
from tests.e2e.singlecard.test_ilama_lora import (EXPECTED_LORA_OUTPUT,
                                                  MODEL_PATH, do_sample)


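@pytest.mark.parametrize("distributed_executor_backend", ["mp"])
def test_ilama_lora_tp2(distributed_executor_backend, ilama_lora_files):
    """Check LoRA output with tensor parallelism across two cards.

    Reuses the model path, sampling helper, and expected output from the
    single-card LoRA test.
    """
    # Download the base model via ModelScope and run it with LoRA enabled,
    # using tensor_parallel_size=2.
    with VllmRunner(snapshot_download(MODEL_PATH),
                    enable_lora=True,
                    max_loras=4,
                    max_model_len=1024,
                    max_num_seqs=16,
                    tensor_parallel_size=2,
                    distributed_executor_backend=distributed_executor_backend
                    ) as vllm_model:
        output = do_sample(vllm_model.model, ilama_lora_files, lora_id=2)

    # Each expected completion must match the generated one exactly.
    for i, expected in enumerate(EXPECTED_LORA_OUTPUT):
        assert output[i] == expected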