[Bugfix] Fix num_hidden_layers when Qwen2-Audio 7B (#1803)

### What this PR does / why we need it? Fix num_hidden_layers when Qwen2-Audio 7B and #1760 ： ``` INFO 07-15 04:38:53 [platform.py:174] PIECEWISE compilation enabled on NPU. use_inductor not supported - using only ACL Graph mode Traceback (most recent call last): File "/workspace/test1.py", line 58, in <module> main(audio_count) File "/workspace/test1.py", line 38, in main llm = LLM(model="Qwen/Qwen2-Audio-7B-Instruct", File "/vllm-workspace/vllm/vllm/entrypoints/llm.py", line 271, in __init__ self.llm_engine = LLMEngine.from_engine_args( File "/vllm-workspace/vllm/vllm/engine/llm_engine.py", line 494, in from_engine_args vllm_config = engine_args.create_engine_config(usage_context) File "/vllm-workspace/vllm/vllm/engine/arg_utils.py", line 1286, in create_engine_config config = VllmConfig( File "/usr/local/python3.10.17/lib/python3.10/site-packages/pydantic/_internal/_dataclasses.py", line 123, in __init__ s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s) File "/vllm-workspace/vllm/vllm/config.py", line 4624, in __post_init__ current_platform.check_and_update_config(self) File "/vllm-workspace/vllm-ascend/vllm_ascend/platform.py", line 180, in check_and_update_config update_aclgraph_sizes(vllm_config) File "/vllm-workspace/vllm-ascend/vllm_ascend/utils.py", line 307, in update_aclgraph_sizes num_hidden_layers = vllm_config.model_config.hf_config.num_hidden_layers File "/usr/local/python3.10.17/lib/python3.10/site-packages/transformers/configuration_utils.py", line 211, in __getattribute__ return super().__getattribute__(key) AttributeError: 'Qwen2AudioConfig' object has no attribute 'num_hidden_layers' ``` ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Closes: https://github.com/vllm-project/vllm-ascend/issues/1780 https://github.com/vllm-project/vllm-ascend/issues/1760 https://github.com/vllm-project/vllm-ascend/issues/1276 https://github.com/vllm-project/vllm-ascend/issues/359 - vLLM version: v0.10.0 - vLLM main: 7728dd77bb Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-07-26 20:13:00 +08:00
parent df0ec55162
commit d1c640841b
6 changed files with 131 additions and 9 deletions
--- a/docs/source/tutorials/single_npu_audio.md
+++ b/docs/source/tutorials/single_npu_audio.md
@@ -90,8 +90,7 @@ def main(audio_count: int):
    llm = LLM(model="Qwen/Qwen2-Audio-7B-Instruct",
              max_model_len=4096,
              max_num_seqs=5,
-              limit_mm_per_prompt={"audio": audio_count},
-              enforce_eager=True)
+              limit_mm_per_prompt={"audio": audio_count})

    inputs = prepare_inputs(audio_count)

--- a/docs/source/tutorials/single_npu_multimodal.md
+++ b/docs/source/tutorials/single_npu_multimodal.md
@@ -57,7 +57,6 @@ llm = LLM(
    model=MODEL_PATH,
    max_model_len=16384,
    limit_mm_per_prompt={"image": 10},
-    enforce_eager=True,
 )

 sampling_params = SamplingParams(
@@ -146,8 +145,7 @@ docker run --rm \
 vllm serve Qwen/Qwen2.5-VL-7B-Instruct \
 --dtype bfloat16 \
 --max_model_len 16384 \
--max-num-batched-tokens 16384 \
--enforce-eager
+--max-num-batched-tokens 16384 
 ```

 :::{note}