vLLM Ascend Plugin documentation

Live doc: https://docs.vllm.ai/projects/ascend

Build the docs

# Install dependencies.
pip install -r requirements-docs.txt

# Build the docs.
make clean
make html

# Build the translated (i18n) docs
make intl

# Serve the built docs locally
python -m http.server -d _build/html/

Launch your browser and open http://localhost:8000/ (the default address for `python -m http.server`).
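Serving `_build/html/` before running `make html` yields an empty listing, so a small guard can be handy. This is a minimal sketch, assuming the Sphinx output directory `_build/html` used by the commands above; the `docs_ready` helper name is hypothetical:

```shell
#!/bin/sh
# Hedged sketch: succeed only if a Sphinx build output with an index page
# exists in the given directory (default: _build/html).
docs_ready() {
    [ -f "${1:-_build/html}/index.html" ]
}
```

With the function sourced, `docs_ready && python -m http.server -d _build/html/` starts the server only when a build is actually present.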