diff --git a/.github/ISSUE_TEMPLATE/110-user-story.yml b/.github/ISSUE_TEMPLATE/110-user-story.yml index 26b6a43..e648f23 100644 --- a/.github/ISSUE_TEMPLATE/110-user-story.yml +++ b/.github/ISSUE_TEMPLATE/110-user-story.yml @@ -1,5 +1,5 @@ name: 📚 User Story -description: Apply for an user story to be displayed on https://vllm-ascend.readthedocs.org/user_stories/index.html +description: Apply for a user story to be displayed on https://vllm-ascend.readthedocs.io/en/latest/community/user_stories/index.html title: "[User Story]: " labels: ["user-story"] diff --git a/README.md b/README.md index bdc6a1b..7e5918c 100644 --- a/README.md +++ b/README.md @@ -46,7 +46,7 @@ By using vLLM Ascend plugin, popular open-source models, including Transformer-l Please refer to [QuickStart](https://vllm-ascend.readthedocs.io/en/latest/quick_start.html) and [Installation](https://vllm-ascend.readthedocs.io/en/latest/installation.html) for more details. ## Contributing -See [CONTRIBUTING](https://vllm-ascend.readthedocs.io/en/main/developer_guide/contributing.html) for more details, which is a step-by-step guide to help you set up development environment, build and test. +See [CONTRIBUTING](https://vllm-ascend.readthedocs.io/en/latest/developer_guide/contribution/index.html) for more details, which is a step-by-step guide to help you set up the development environment, build, and test. We welcome and value any contributions and collaborations: - Please let us know if you encounter a bug by [filing an issue](https://github.com/vllm-project/vllm-ascend/issues) @@ -67,7 +67,7 @@ Below is maintained branches: | v0.7.1-dev | Unmaintained | Only doc fixed is allowed | | v0.7.3-dev | Maintained | CI commitment for vLLM 0.7.3 version | -Please refer to [Versioning policy](https://vllm-ascend.readthedocs.io/en/main/developer_guide/versioning_policy.html) for more details. 
+Please refer to [Versioning policy](https://vllm-ascend.readthedocs.io/en/latest/community/versioning_policy.html) for more details. ## Weekly Meeting diff --git a/README.zh.md b/README.zh.md index afe2c76..55a40f5 100644 --- a/README.zh.md +++ b/README.zh.md @@ -47,7 +47,7 @@ vLLM 昇腾插件 (`vllm-ascend`) 是一个由社区维护的让vLLM在Ascend NP 请查看[快速开始](https://vllm-ascend.readthedocs.io/en/latest/quick_start.html)和[安装指南](https://vllm-ascend.readthedocs.io/en/latest/installation.html)了解更多. ## 贡献 -请参考 [CONTRIBUTING]((https://vllm-ascend.readthedocs.io/en/main/developer_guide/contributing.html)) 文档了解更多关于开发环境搭建、功能测试以及 PR 提交规范的信息。 +请参考 [CONTRIBUTING](https://vllm-ascend.readthedocs.io/en/latest/developer_guide/contribution/index.html) 文档了解更多关于开发环境搭建、功能测试以及 PR 提交规范的信息。 我们欢迎并重视任何形式的贡献与合作: - 请通过[Issue](https://github.com/vllm-project/vllm-ascend/issues)来告知我们您遇到的任何Bug。 @@ -67,7 +67,7 @@ vllm-ascend有主干分支和开发分支。 | v0.7.1-dev | Unmaintained | 只允许文档修复 | | v0.7.3-dev | Maintained | 基于vLLM v0.7.3版本CI看护 | -请参阅[版本策略](https://vllm-ascend.readthedocs.io/en/main/developer_guide/versioning_policy.html)了解更多详细信息。 +请参阅[版本策略](https://vllm-ascend.readthedocs.io/en/latest/community/versioning_policy.html)了解更多详细信息。 ## 社区例会 diff --git a/docs/source/developer_guide/feature_guide/patch.md b/docs/source/developer_guide/feature_guide/patch.md index 07fe768..cd6bc18 100644 --- a/docs/source/developer_guide/feature_guide/patch.md +++ b/docs/source/developer_guide/feature_guide/patch.md @@ -48,7 +48,7 @@ Before writing a patch, following the principle above, we should patch the least 1. Decide which version of vLLM we should patch. For example, after analysis, here we want to patch both 0.9.1 and main of vLLM. 2. Decide which process we should patch. For example, here `distributed` belongs to the vLLM main process, so we should patch `platform`. -3. Create the patch file in the write folder. The file should be named as `patch_{module_name}.py`. 
The example here is `vllm_ascend/patch/platform/patch_common/patch_distributed.py`. +3. Create the patch file in the right folder. The file should be named `patch_{module_name}.py`. The example here is `vllm_ascend/patch/platform/patch_common/patch_distributed.py`. 4. Write your patch code in the new file. Here is an example: ```python import vllm diff --git a/docs/source/user_guide/additional_config.md b/docs/source/user_guide/additional_config.md index d4756ef..79b4d90 100644 --- a/docs/source/user_guide/additional_config.md +++ b/docs/source/user_guide/additional_config.md @@ -28,10 +28,10 @@ The following table lists the additional configuration options available in vLLM |-------------------------------| ---- |------|-----------------------------------------------------------------------------------------------| | `torchair_graph_config` | dict | `{}` | The config options for torchair graph mode | | `ascend_scheduler_config` | dict | `{}` | The config options for ascend scheduler | -| `expert_tensor_parallel_size` | str | `0` | Expert tensor parallel size the model to use. | -| `refresh` | bool | `false` | Whether to refresh global ascend config content. This value is usually used by rlhf case. | -| `expert_map_path` | str | None | When using expert load balancing for the MOE model, an expert map path needs to be passed in. | -| `chunked_prefill_for_mla` | bool | `False` | Whether to enable the fused operator-like chunked_prefill. | +| `expert_tensor_parallel_size` | str | `0` | Expert tensor parallel size for the model to use. | +| `refresh` | bool | `false` | Whether to refresh the global ascend config content. This value is usually used by rlhf or ut/e2e test cases. | +| `expert_map_path` | str | `None` | When using expert load balancing for the MOE model, an expert map path needs to be passed in. | +| `chunked_prefill_for_mla` | bool | `False` | Whether to enable the fused operator-like chunked_prefill. 
| The details of each config option are as follows: @@ -58,7 +58,7 @@ ascend_scheduler_config also support the options from [vllm scheduler config](ht ### Example -A full example of additional configuration is as follows: +An example of additional configuration is as follows: ``` { diff --git a/docs/source/user_guide/graph_mode.md b/docs/source/user_guide/graph_mode.md index 7a33534..77e91dd 100644 --- a/docs/source/user_guide/graph_mode.md +++ b/docs/source/user_guide/graph_mode.md @@ -1,9 +1,10 @@ # Graph Mode Guide - +```{note} This feature is currently experimental. In future versions, there may be behavioral changes around configuration, coverage, and performance. +``` -This guide provides instructions for using Ascend Graph Mode with vLLM Ascend. Please note that graph mode is only available on V1 Engine. And only Qwen, DeepSeek series models are well tested in 0.9.0rc1. We'll make it stable and generalize in the next release. +This guide provides instructions for using Ascend Graph Mode with vLLM Ascend. Please note that graph mode is only available on the V1 Engine, and only the Qwen and DeepSeek series models have been well tested since 0.9.0rc1. We'll make it stable and more general in the next release. ## Getting Started
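The patch workflow touched in the patch.md hunk above (create `patch_{module_name}.py`, then rebind the attribute you want to replace on the imported module) is ordinary Python monkey-patching. A minimal sketch of that pattern, using a stand-in namespace instead of a real vllm module, with a hypothetical `get_world_size` attribute for illustration:

```python
import types

# Stand-in for the module being patched; in vllm-ascend the target would be a
# real vllm module imported at the top of the patch file (name assumed here).
target = types.SimpleNamespace(get_world_size=lambda: 1)

def patched_get_world_size():
    # Replacement behavior that the patch file would provide.
    return 8

# A patch file like patch_distributed.py simply rebinds the attribute at
# import time, so later callers pick up the patched implementation.
target.get_world_size = patched_get_world_size

print(target.get_world_size())  # 8 after patching
```

Because the rebinding happens when the patch file is imported, placing the file under the right process folder (e.g. `platform`) controls when the patch takes effect.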
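The options listed in the additional_config.md table above form a nested dict. A small sketch that builds such a config and round-trips it through JSON, as CLI-style flags typically accept JSON strings; the nested `enabled` keys and all values here are illustrative assumptions, not documented settings:

```python
import json

# Illustrative values only; see the options table for what each key controls.
additional_config = {
    "torchair_graph_config": {"enabled": True},   # "enabled" key is assumed
    "ascend_scheduler_config": {"enabled": True},  # "enabled" key is assumed
    "refresh": False,
    "chunked_prefill_for_mla": False,
}

# Round-trip through JSON to confirm the structure is valid to pass as a flag.
encoded = json.dumps(additional_config)
decoded = json.loads(encoded)
print(decoded["torchair_graph_config"])  # {'enabled': True}
```

How vLLM actually consumes this dict (e.g. which flag or engine argument takes it) is covered by the full example in additional_config.md itself.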