xc-llm-ascend/InternVL3-8B.yaml at b75bfc58f683fb6f28abc19dd41831c8d4c98a1c

EngineX/xc-llm-ascend

zhangyiming 66b0781840 [E2E] Refactor the e2e testcases. (#4789)

### What this PR does / why we need it?
Refactor the e2e test cases.
- tests/e2e/multicard/test_weight_loader.py: Remove the unused code.
- tests/e2e/singlecard/multi-modal/test_internvl.py: Move to the accuracy test.
- tests/e2e/singlecard/test_aclgraph.py: Rename the file.
- tests/e2e/singlecard/test_embedding_aclgraph.py: Combine with tests/e2e/singlecard/test_bge_model.py.
- tests/e2e/singlecard/test_completion_with_prompt_embeds.py: Delete eager mode and change the model to Qwen3-0.6B.
- tests/e2e/singlecard/test_quantization.py: Change the model to Qwen3-0.6B-W8A8.
- tests/e2e/singlecard/test_vlm.py: Change the model to Qwen3-VL-8B.

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: menogrey <1299267905@qq.com>

2025-12-11 10:15:00 +08:00

model_name: "OpenGVLab/InternVL3-8B"
runner: "linux-aarch64-a2-1"
hardware: "Atlas A2 Series"
model: "vllm-vlm"
tasks:
  - name: "mmmu_val"
    metrics:
    - name: "acc,none"
      value: 0.58
max_model_len: 32768
trust_remote_code: True
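
A minimal sketch of how a config like the one above might be consumed by an accuracy check. This is not the repository's actual test code: the `check_accuracy` helper, the shape of the `measured` results dict, and the 3% relative tolerance are all assumptions made for illustration. The config is written as a Python dict mirroring the YAML so the example has no parser dependency.

```python
# Values copied from the YAML config above (only the fields used here).
config = {
    "model_name": "OpenGVLab/InternVL3-8B",
    "tasks": [
        {"name": "mmmu_val", "metrics": [{"name": "acc,none", "value": 0.58}]},
    ],
}

# Hypothetical relative tolerance; the real test suite may use a different rule.
RTOL = 0.03


def check_accuracy(config, measured, rtol=RTOL):
    """Compare measured metrics with the expected values in the config.

    `measured` is assumed to map task name -> {metric name -> value}.
    Returns a list of (task, metric, expected, got) tuples for every metric
    that is missing or deviates from the expected value by more than rtol.
    """
    failures = []
    for task in config["tasks"]:
        for metric in task["metrics"]:
            expected = metric["value"]
            got = measured.get(task["name"], {}).get(metric["name"])
            if got is None or abs(got - expected) > rtol * expected:
                failures.append((task["name"], metric["name"], expected, got))
    return failures


if __name__ == "__main__":
    # Made-up measurement, used only to exercise the helper.
    print(check_accuracy(config, {"mmmu_val": {"acc,none": 0.575}}))
```

A relative tolerance keeps the check meaningful across metrics of different magnitudes; an absolute threshold would be an equally reasonable choice for a single accuracy value.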