### What this PR does / why we need it?

This PR adds a description of preemption to the FAQs in vLLM-Ascend. The FAQ states:

- how preemption affects the performance of a vLLM server.
- how to reduce the negative impact of preemption.

We add this FAQ because the original description of preemption in vLLM is not very straightforward. If preemption causes a performance drop, users might not realize that preemption is the cause.

### Does this PR introduce _any_ user-facing change?

No.

Signed-off-by: Angazenn <supperccell@163.com>

# vLLM Ascend Plugin documents

Live doc: https://docs.vllm.ai/projects/ascend

## Build the docs

```bash
# Install dependencies.
pip install -r requirements-docs.txt

# Build the docs.
make clean
make html

# Build the docs with translation.
make intl

# Serve the built docs locally so you can open them in your browser.
python -m http.server -d _build/html/
```

Launch your browser and open:
- English version: http://localhost:8000
- Chinese version: http://localhost:8000/zh_CN
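
If one of the URLs above returns a 404, it can help to confirm that the build actually produced the pages before starting the server. The following is a minimal sketch, not part of the repository: `check_docs_built` is a hypothetical helper, and it assumes each built site has an `index.html` at its root (`_build/html` for English, `_build/html/zh_CN` for Chinese, mirroring the URLs above).

```shell
# Hypothetical helper (not part of vllm-ascend): report whether a built
# docs directory contains an index.html.
# Assumption: the build writes index.html at the root of each site.
check_docs_built() {
  build_dir="${1:-_build/html}"
  if [ -f "${build_dir}/index.html" ]; then
    echo "ok: ${build_dir}/index.html"
  else
    echo "missing: ${build_dir}/index.html" >&2
    return 1
  fi
}
```

For example, `check_docs_built _build/html && check_docs_built _build/html/zh_CN` returns a non-zero status if either site is missing, which suggests rerunning `make html` or `make intl`.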