[Doc] Fix doc typo (#1424)
1. Fix the typo
2. Fix 404 url
3. update graph mode and additional config user guide

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@@ -28,10 +28,10 @@ The following table lists the additional configuration options available in vLLM
 |-------------------------------| ---- |------|-----------------------------------------------------------------------------------------------|
 | `torchair_graph_config` | dict | `{}` | The config options for torchair graph mode |
 | `ascend_scheduler_config` | dict | `{}` | The config options for ascend scheduler |
-| `expert_tensor_parallel_size` | str | `0` | Expert tensor parallel size the model to use. |
-| `refresh` | bool | `false` | Whether to refresh global ascend config content. This value is usually used by rlhf case. |
-| `expert_map_path` | str | None | When using expert load balancing for the MOE model, an expert map path needs to be passed in. |
-| `chunked_prefill_for_mla` | bool | `False` | Whether to enable the fused operator-like chunked_prefill. |
+| `expert_tensor_parallel_size` | str | `0` | Expert tensor parallel size the model to use. |
+| `refresh` | bool | `false` | Whether to refresh global ascend config content. This value is usually used by rlhf or ut/e2e test case. |
+| `expert_map_path` | str | `None` | When using expert load balancing for the MOE model, an expert map path needs to be passed in. |
+| `chunked_prefill_for_mla` | bool | `False` | Whether to enable the fused operator-like chunked_prefill. |
 
 The details of each config option are as follows:
 
@@ -58,7 +58,7 @@ ascend_scheduler_config also support the options from [vllm scheduler config](ht
 
 ### Example
 
-A full example of additional configuration is as follows:
+An example of additional configuration is as follows:
 
 ```
 {
@@ -1,9 +1,10 @@
 # Graph Mode Guide
 
+
 ```{note}
 This feature is currently experimental. In future versions, there may be behavioral changes around configuration, coverage, performance improvement.
 ```
 
-This guide provides instructions for using Ascend Graph Mode with vLLM Ascend. Please note that graph mode is only available on V1 Engine. And only Qwen, DeepSeek series models are well tested in 0.9.0rc1. We'll make it stable and generalize in the next release.
+This guide provides instructions for using Ascend Graph Mode with vLLM Ascend. Please note that graph mode is only available on V1 Engine. And only Qwen, DeepSeek series models are well tested from 0.9.0rc1. We'll make it stable and generalize in the next release.
 
 ## Getting Started