Run vllm-ascend on Single NPU

### What this PR does / why we need it?

Add a vllm-ascend tutorial doc for the Qwen/Qwen2.5-VL-7B-Instruct model (inference/serving).

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Not applicable (documentation-only change).

Signed-off-by: xiemingda <xiemingda1002@gmail.com>
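The tutorial added by this PR covers serving the multimodal model on a single NPU. A minimal launch sketch, assuming the vllm-ascend plugin is installed and an Ascend NPU is available (flags and defaults may differ by version), might look like:

```shell
# Start an OpenAI-compatible API server for the multimodal model on one NPU.
# Assumes vllm-ascend is installed so vLLM picks up the Ascend backend;
# --max-model-len is an illustrative value, not a required setting.
vllm serve Qwen/Qwen2.5-VL-7B-Instruct --max-model-len 8192
```

Once the server is up, requests can be sent to the standard OpenAI-compatible endpoints (e.g. `/v1/chat/completions`) on the default port 8000.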
Tutorials
:::{toctree}
:caption: Deployment
:maxdepth: 1
single_npu
single_npu_multimodal
multi_npu
multi_node
:::