[Doc] fix the nit in docs (#6826)
Refresh the doc, fix the nit in the docs
- vLLM version: v0.15.0
- vLLM main: 83b47f67b1
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@@ -2,15 +2,15 @@
 
 ## Introduction
 
-[GLM-5](https://huggingface.co/zai-org/GLM-5)use a Mixture-of-Experts (MoE) architecture and targeting at complex systems engineering and long-horizon agentic tasks.
+[GLM-5](https://huggingface.co/zai-org/GLM-5) use a Mixture-of-Experts (MoE) architecture and targeting at complex systems engineering and long-horizon agentic tasks.
 
 This document will show the main verification steps of the model, including supported features, feature configuration, environment preparation, single-node and multi-node deployment, accuracy and performance evaluation.
 
 ## Supported Features
 
-Refer to [supported features](https://docs.vllm.ai/projects/ascend/en/latest/user_guide/support_matrix/supported_models.html)to get the model's supported feature matrix.
+Refer to [supported features](../../user_guide/support_matrix/supported_models.md) to get the model's supported feature matrix.
 
-Refer to [feature guide](https://docs.vllm.ai/projects/ascend/en/latest/user_guide/support_matrix/supported_features.html) to get the feature's configuration.
+Refer to [feature guide](../../user_guide/feature_guide/index.md) to get the feature's configuration.
 
 ## Environment Preparation
 
@@ -241,7 +241,7 @@ The parameters are explained as follows:
 
 ### Multi-node Deployment
 
-If you want to deploy multi-node environment, you need to verify multi-node communication according to [verify multi-node communication environment](https://docs.vllm.ai/projects/ascend/en/latest/installation.html#verify-multi-node-communication).
+If you want to deploy multi-node environment, you need to verify multi-node communication according to [verify multi-node communication environment](../../installation.md#verify-multi-node-communication).
 
 :::::{tab-set}
 :sync-group: install
@@ -450,7 +450,7 @@ vllm serve /root/.cache/modelscope/hub/models/vllm-ascend/GLM-5-w4a8 \
 ::::
 :::::
 
-- For bf16 weight, use this script on each node to enable [Multi Token Prediction (MTP)](https://docs.vllm.ai/projects/ascend/en/latest/user_guide/feature_guide/Multi_Token_Prediction.html).
+- For bf16 weight, use this script on each node to enable [Multi Token Prediction (MTP)](../../user_guide/feature_guide/Multi_Token_Prediction.md).
 
 ```shell
 python adjust_weight.py "path_of_bf16_weight"
@@ -518,7 +518,7 @@ Here are two accuracy evaluation methods.
 
 ### Using AISBench
 
-1. Refer to [Using AISBench](https://docs.vllm.ai/projects/ascend/en/latest/developer_guide/evaluation/using_ais_bench.html) for details.
+1. Refer to [Using AISBench](../../developer_guide/evaluation/using_ais_bench.md) for details.
 
 2. After execution, you can get the result.
 
@@ -530,7 +530,7 @@ Not test yet.
 
 ### Using AISBench
 
-Refer to [Using AISBench for performance evaluation](https://docs.vllm.ai/projects/ascend/en/latest/developer_guide/evaluation/using_ais_bench.html#execute-performance-evaluation) for details.
+Refer to [Using AISBench for performance evaluation](../../developer_guide/evaluation/using_ais_bench.md#execute-performance-evaluation) for details.
 
 ### Using vLLM Benchmark
 
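Most link changes in this commit follow one mechanical pattern: an absolute `https://docs.vllm.ai/projects/ascend/en/latest/<path>.html` URL becomes a relative `../../<path>.md` link, with any `#anchor` kept. The sketch below illustrates that pattern only; it is not the tool used for this commit. The function name `to_relative_links`, the `depth` parameter, and the assumption that every `.html` page maps to a `.md` source at the same path are all hypothetical.

```python
import re

# Assumption: the published site root corresponds to the docs source root,
# and each ".html" page has a ".md" source at the same relative path.
SITE_ROOT = "https://docs.vllm.ai/projects/ascend/en/latest/"

# Matches the target of a Markdown link: "](<site root><path>.html<#anchor>)"
_LINK = re.compile(
    r"\]\(" + re.escape(SITE_ROOT) + r"([^)#\s]+)\.html(#[^)\s]*)?\)"
)


def to_relative_links(text: str, depth: int = 2) -> str:
    """Rewrite absolute doc links in `text` to relative .md links.

    `depth` is how many directories the edited file sits below the docs
    root (hypothetical parameter; 2 reproduces the "../../" prefix seen
    in this diff).
    """
    prefix = "../" * depth

    def _swap(match: re.Match) -> str:
        path = match.group(1)
        anchor = match.group(2) or ""  # keep "#section" fragments intact
        return f"]({prefix}{path}.md{anchor})"

    return _LINK.sub(_swap, text)


if __name__ == "__main__":
    old = (
        "[verify multi-node communication environment]"
        "(https://docs.vllm.ai/projects/ascend/en/latest/"
        "installation.html#verify-multi-node-communication)"
    )
    print(to_relative_links(old))
```

Not every change in the diff is mechanical: the feature guide link moves from `user_guide/support_matrix/supported_features.html` to `user_guide/feature_guide/index.md`, so a rewrite like this would still need manual review.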