[Doc] Add sphinx build for vllm-ascend (#55)

### What this PR does / why we need it? This patch enables the doc build for vllm-ascend - Add sphinx build for vllm-ascend - Enable readthedocs for vllm-ascend - Fix CI: - exclude vllm-empty/tests/mistral_tool_use to skip `You need to agree to share your contact information to access this model` which introduce in 314cfade02 - Install test req to fix https://github.com/vllm-project/vllm-ascend/actions/runs/13304112758/job/37151690770: ``` vllm-empty/tests/mistral_tool_use/conftest.py:4: in <module> import pytest_asyncio E ModuleNotFoundError: No module named 'pytest_asyncio' ``` - exclude docs PR ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? 1. test locally: ```bash # Install dependencies. pip install -r requirements-docs.txt # Build the docs and preview make clean; make html; python -m http.server -d build/html/ ``` Launch browser and open http://localhost:8000/. 2. CI passed with preview: https://vllm-ascend--55.org.readthedocs.build/en/55/ Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-02-13 18:44:17 +08:00
parent 63b11ec7e9
commit 46977f9f06
22 changed files with 255 additions and 36 deletions
--- a/docs/source/features/supported_models.md
+++ b/docs/source/features/supported_models.md
@@ -0,0 +1,24 @@
+# Supported Models
+
+| Model | Supported | Note |
+|---------|-----------|------|
+| Qwen 2.5 | ✅ ||
+| Mistral |  | Need test |
+| DeepSeek v2.5 | |Need test |
+| LLama3.1/3.2 | ✅ ||
+| Gemma-2 |  |Need test|
+| baichuan |  |Need test|
+| minicpm |  |Need test|
+| internlm | ✅ ||
+| ChatGLM | ✅ ||
+| InternVL 2.5 | ✅ ||
+| Qwen2-VL | ✅ ||
+| GLM-4v |  |Need test|
+| Molomo | ✅ ||
+| LLaVA 1.5 | ✅ ||
+| Mllama |  |Need test|
+| LLaVA-Next |  |Need test|
+| LLaVA-Next-Video |  |Need test|
+| Phi-3-Vison/Phi-3.5-Vison |  |Need test|
+| Ultravox |  |Need test|
+| Qwen2-Audio | ✅ ||
--- a/docs/source/features/suppoted_features.md
+++ b/docs/source/features/suppoted_features.md
@@ -0,0 +1,19 @@
+# Feature Support
+
+| Feature | Supported | Note |
+|---------|-----------|------|
+| Chunked Prefill | ✗ | Plan in 2025 Q1 |
+| Automatic Prefix Caching | ✅ | Improve performance in 2025 Q1 |
+| LoRA | ✗ | Plan in 2025 Q1 |
+| Prompt adapter | ✅ ||
+| Speculative decoding | ✅ | Improve accuracy in 2025 Q1|
+| Pooling | ✗ | Plan in 2025 Q1 |
+| Enc-dec | ✗ | Plan in 2025 Q1 |
+| Multi Modality | ✅ (LLaVA/Qwen2-vl/Qwen2-audio/internVL)| Add more model support in 2025 Q1 |
+| LogProbs | ✅ ||
+| Prompt logProbs | ✅ ||
+| Async output | ✅ ||
+| Multi step scheduler | ✅ ||
+| Best of | ✅ ||
+| Beam search | ✅ ||
+| Guided Decoding | ✗ | Plan in 2025 Q1 |