### What this PR does / why we need it?
For the `Qwen2.5-0.5B-Instruct` model:
- the model's total number of attention heads (14) must be divisible by the tensor parallel size, so the TP size is reduced from 4 to 2;
- the model does not support `enable-expert-parallel`, so that option is removed.
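The divisibility constraint above can be sketched as a quick check (a minimal illustration only; the helper name is hypothetical, not vLLM's actual validation API):

```python
# Sketch of the constraint this PR works around: the model's total number of
# attention heads must split evenly across tensor-parallel ranks.
# Qwen2.5-0.5B-Instruct has 14 attention heads, so TP=4 fails while TP=2 works.

def tp_size_is_valid(num_attention_heads: int, tensor_parallel_size: int) -> bool:
    """Return True if the attention heads divide evenly across TP ranks."""
    return num_attention_heads % tensor_parallel_size == 0

QWEN25_05B_HEADS = 14  # from the Qwen2.5-0.5B-Instruct model config

print(tp_size_is_valid(QWEN25_05B_HEADS, 4))  # False: 14 % 4 == 2
print(tp_size_is_valid(QWEN25_05B_HEADS, 2))  # True: 14 % 2 == 0
```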
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Local Test.
- vLLM version: v0.10.0
- vLLM main: ad57f23f6a
Signed-off-by: xleoken <xleoken@163.com>