[v0.11.0][Doc] Update doc (#3852)

### What this PR does / why we need it?
Update doc

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-10-29 11:32:12 +08:00
parent 6188450269
commit 75de3fa172
49 changed files with 724 additions and 701 deletions
--- a/docs/source/developer_guide/evaluation/using_evalscope.md
+++ b/docs/source/developer_guide/evaluation/using_evalscope.md
@@ -2,7 +2,7 @@

 This document will guide you have model inference stress testing and accuracy testing using [EvalScope](https://github.com/modelscope/evalscope).

-## 1. Online serving
+## 1. Online server

 You can run docker container to start the vLLM server on a single NPU:

@@ -31,7 +31,7 @@ docker run --rm \
 vllm serve Qwen/Qwen2.5-7B-Instruct --max_model_len 26240
 ```

-If your service start successfully, you can see the info shown below:
+If the vLLM server is started successfully, you can see information shown below:

 ```
 INFO:     Started server process [6873]
@@ -39,7 +39,7 @@ INFO:     Waiting for application startup.
 INFO:     Application startup complete.
 ```

-Once your server is started, you can query the model with input prompts in new terminal:
+Once your server is started, you can query the model with input prompts in a new terminal:

 ```
 curl http://localhost:8000/v1/completions \
@@ -54,7 +54,7 @@ curl http://localhost:8000/v1/completions \

 ## 2. Install EvalScope using pip

-You can install EvalScope by using:
+You can install EvalScope as follows:

 ```bash
 python3 -m venv .venv-evalscope
@@ -62,9 +62,9 @@ source .venv-evalscope/bin/activate
 pip install gradio plotly evalscope
 ```

-## 3. Run gsm8k accuracy test using EvalScope
+## 3. Run GSM8K using EvalScope for accuracy testing

-You can `evalscope eval` run gsm8k accuracy test:
+You can use `evalscope eval` to run GSM8K for accuracy testing:

 ```
 evalscope eval \
@@ -76,7 +76,7 @@ evalscope eval \
 --limit 10
 ```

-After 1-2 mins, the output is as shown below:
+After 1 to 2 minutes, the output is shown below:

 ```shell
 +---------------------+-----------+-----------------+----------+-------+---------+---------+
@@ -86,7 +86,7 @@ After 1-2 mins, the output is as shown below:
 +---------------------+-----------+-----------------+----------+-------+---------+---------+
 ```

-See more detail in: [EvalScope doc - Model API Service Evaluation](https://evalscope.readthedocs.io/en/latest/get_started/basic_usage.html#model-api-service-evaluation).
+See more detail in [EvalScope doc - Model API Service Evaluation](https://evalscope.readthedocs.io/en/latest/get_started/basic_usage.html#model-api-service-evaluation).

 ## 4. Run model inference stress testing using EvalScope

@@ -98,7 +98,7 @@ pip install evalscope[perf] -U

 ### Basic usage

-You can use `evalscope perf` run perf test:
+You can use `evalscope perf` to run perf testing:

 ```
 evalscope perf \
@@ -113,7 +113,7 @@ evalscope perf \

 ### Output results

-After 1-2 mins, the output is as shown below:
+After 1 to 2 minutes, the output is shown below:

 ```shell
 Benchmarking summary:
@@ -172,4 +172,4 @@ Percentile results:
 +------------+----------+---------+-------------+--------------+---------------+----------------------+
 ```

-See more detail in: [EvalScope doc - Model Inference Stress Testing](https://evalscope.readthedocs.io/en/latest/user_guides/stress_test/quick_start.html#basic-usage).
+See more detail in [EvalScope doc - Model Inference Stress Testing](https://evalscope.readthedocs.io/en/latest/user_guides/stress_test/quick_start.html#basic-usage).
--- a/docs/source/developer_guide/evaluation/using_lm_eval.md
+++ b/docs/source/developer_guide/evaluation/using_lm_eval.md
@@ -1,8 +1,8 @@
 # Using lm-eval
-This document will guide you have a accuracy testing using [lm-eval][1].
+This document guides you to conduct accuracy testing using [lm-eval][1].

 ## Online Server
-### 1. start the vLLM server
+### 1. Start the vLLM server
 You can run docker container to start the vLLM server on a single NPU:

 ```{code-block} bash
@@ -31,7 +31,7 @@ docker run --rm \
 vllm serve Qwen/Qwen2.5-0.5B-Instruct --max_model_len 4096 &
 ```

-Started the vLLM server successfully,if you see log as below:
+The vLLM server is started successfully, if you see logs as below:

 ```
 INFO:     Started server process [9446]
@@ -39,9 +39,9 @@ INFO:     Waiting for application startup.
 INFO:     Application startup complete.
 ```

-### 2. Run gsm8k accuracy test using lm-eval
+### 2. Run GSM8K using lm-eval for accuracy testing

-You can query result with input prompts:
+You can query the result with input prompts:

 ```
 curl http://localhost:8000/v1/completions \
@@ -98,7 +98,7 @@ The output format matches the following:
 }
 ```

-Install lm-eval in the container.
+Install lm-eval in the container:

 ```bash
 export HF_ENDPOINT="https://hf-mirror.com"
@@ -116,7 +116,7 @@ lm_eval \
  --output_path ./
 ```

-After 30 mins, the output is as shown below:
+After 30 minutes, the output is as shown below:

 ```
 The markdown format results is as below:
@@ -158,8 +158,8 @@ docker run --rm \
 /bin/bash
 ```

-### 2. Run gsm8k accuracy test using lm-eval
-Install lm-eval in the container.
+### 2. Run GSM8K using lm-eval for accuracy testing
+Install lm-eval in the container:

 ```bash
 export HF_ENDPOINT="https://hf-mirror.com"
@@ -177,7 +177,7 @@ lm_eval \
  --batch_size auto
 ```

-After 1-2 mins, the output is as shown below:
+After 1 to 2 minutes, the output is shown below:

 ```
 The markdown format results is as below:
@@ -189,9 +189,9 @@ Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|

 ```

-## Use offline Datasets
+## Use Offline Datasets

-Take gsm8k(single dataset) and mmlu(multi-subject dataset) as examples, and you can see more from [here][2].
+Take GSM8K (single dataset) and MMLU (multi-subject dataset) as examples, and you can see more from [here][2].

 ```bash
 # set HF_DATASETS_OFFLINE when using offline datasets
@@ -205,7 +205,7 @@ cd lm_eval/tasks/gsm8k
 cd lm_eval/tasks/mmlu/default
 ```

-set [gsm8k.yaml][3] as follows:
+Set [gsm8k.yaml][3] as follows:

 ```yaml
 tag:
@@ -230,7 +230,7 @@ training_split: train
 fewshot_split: train
 test_split: test
 doc_to_text: 'Q: {{question}}
-  A(Please follow the summarize the result at the end with the format of "The answer is xxx", where xx is the result.):'
+  A(Please follow the summarized result at the end with the format of "The answer is xxx", where xx is the result.):'
 doc_to_target: "{{answer}}" #" {{answer.split('### ')[-1].rstrip()}}"
 metric_list:
  - metric: exact_match
@@ -268,7 +268,7 @@ metadata:
  version: 3.0
 ```

-set [_default_template_yaml][4] as follows:
+Set [_default_template_yaml][4] as follows:

 ```yaml
 # set dataset_path according to the downloaded dataset
--- a/docs/source/developer_guide/evaluation/using_opencompass.md
+++ b/docs/source/developer_guide/evaluation/using_opencompass.md
@@ -1,7 +1,7 @@
 # Using OpenCompass
-This document will guide you have a accuracy testing using [OpenCompass](https://github.com/open-compass/opencompass).
+This document guides you to conduct accuracy testing using [OpenCompass](https://github.com/open-compass/opencompass).

-## 1. Online Serving
+## 1. Online Server

 You can run docker container to start the vLLM server on a single NPU:

@@ -30,7 +30,7 @@ docker run --rm \
 vllm serve Qwen/Qwen2.5-7B-Instruct --max_model_len 26240
 ```

-If your service start successfully, you can see the info shown below:
+The vLLM server is started successfully, if you see information as below:

 ```
 INFO:     Started server process [6873]
@@ -38,7 +38,7 @@ INFO:     Waiting for application startup.
 INFO:     Application startup complete.
 ```

-Once your server is started, you can query the model with input prompts in new terminal:
+Once your server is started, you can query the model with input prompts in a new terminal.

 ```
 curl http://localhost:8000/v1/completions \
@@ -51,8 +51,8 @@ curl http://localhost:8000/v1/completions \
    }'
 ```

-## 2. Run ceval accuracy test using OpenCompass
-Install OpenCompass and configure the environment variables in the container.
+## 2. Run C-Eval using OpenCompass for accuracy testing
+Install OpenCompass and configure the environment variables in the container:

 ```bash
 # Pin Python 3.10 due to:
@@ -64,7 +64,7 @@ export DATASET_SOURCE=ModelScope
 git clone https://github.com/open-compass/opencompass.git
 ```

-Add `opencompass/configs/eval_vllm_ascend_demo.py` with the following content:
+Add the following content to `opencompass/configs/eval_vllm_ascend_demo.py`:

 ```python
 from mmengine.config import read_base
@@ -110,7 +110,7 @@ Run the following command:
 python3 run.py opencompass/configs/eval_vllm_ascend_demo.py --debug
 ```

-After 1-2 mins, the output is as shown below:
+After 1 to 2 minutes, the output is shown below:

 ```
 The markdown format results is as below: