### What this PR does / why we need it?
There is currently no accuracy test for the Qwen3-235B-A22B model, so this PR adds one.
Test result:

| dataset | version | metric | mode | vllm-api-general-chat |
| --- | --- | --- | --- | --- |
| gsm8k | 7cd45e | accuracy | gen | 96.29 |

Test case run time: 30 minutes
- vLLM version: v0.12.0
- vLLM main: ad32e3e19c
Signed-off-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>