From 7beb4339dc8047af9ef64db1d0a8c59ddbb3709f Mon Sep 17 00:00:00 2001
From: hfadzxy <59153331+hfadzxy@users.noreply.github.com>
Date: Mon, 31 Mar 2025 00:24:25 +0800
Subject: [PATCH] [Doc]Add developer guide for using OpenCompass (#368)

### What this PR does / why we need it?
Add a developer guide for using OpenCompass.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Tested manually.

---------
Signed-off-by: hfadzxy
Signed-off-by: Yikun Jiang
Co-authored-by: Yikun Jiang
---
 .../developer_guide/evaluation/index.md         |   7 +
 .../evaluation/using_opencompass.md             | 120 ++++++++++++++++++
 docs/source/index.md                            |   3 +-
 3 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100644 docs/source/developer_guide/evaluation/index.md
 create mode 100644 docs/source/developer_guide/evaluation/using_opencompass.md

diff --git a/docs/source/developer_guide/evaluation/index.md b/docs/source/developer_guide/evaluation/index.md
new file mode 100644
index 0000000..03f1551
--- /dev/null
+++ b/docs/source/developer_guide/evaluation/index.md
@@ -0,0 +1,7 @@
+# Evaluation
+
+:::{toctree}
+:caption: Accuracy
+:maxdepth: 1
+using_opencompass
+:::
\ No newline at end of file

diff --git a/docs/source/developer_guide/evaluation/using_opencompass.md b/docs/source/developer_guide/evaluation/using_opencompass.md
new file mode 100644
index 0000000..20193ae
--- /dev/null
+++ b/docs/source/developer_guide/evaluation/using_opencompass.md
@@ -0,0 +1,120 @@
# Using OpenCompass

This document guides you through accuracy testing with [OpenCompass](https://github.com/open-compass/opencompass).

## 1. Online Serving

You can run a docker container to start the vLLM server on a single NPU:

```{code-block} bash
:substitutions:

# Update DEVICE according to your device (/dev/davinci[0-7])
export DEVICE=/dev/davinci7
# Update the vllm-ascend image
export IMAGE=quay.io/ascend/vllm-ascend:|vllm_ascend_version|
docker run --rm \
--name vllm-ascend \
--device $DEVICE \
--device /dev/davinci_manager \
--device /dev/devmm_svm \
--device /dev/hisi_hdc \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
-v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-v /root/.cache:/root/.cache \
-p 8000:8000 \
-e VLLM_USE_MODELSCOPE=True \
-e PYTORCH_NPU_ALLOC_CONF=max_split_size_mb:256 \
-it $IMAGE \
vllm serve Qwen/Qwen2.5-7B-Instruct --max_model_len 26240
```

If the service starts successfully, you will see output like the following:

```
INFO: Started server process [6873]
INFO: Waiting for application startup.
INFO: Application startup complete.
```

Once the server is running, you can query the model with input prompts in a new terminal:

```bash
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "prompt": "The future of AI is",
    "max_tokens": 7,
    "temperature": 0
  }'
```

## 2. Run the ceval accuracy test using OpenCompass

Install OpenCompass and configure the environment variables in the container.

```bash
# Pin Python 3.10 due to:
# https://github.com/open-compass/opencompass/issues/1976
conda create -n opencompass python=3.10
conda activate opencompass
pip install opencompass 'modelscope[framework]'
export DATASET_SOURCE=ModelScope
git clone https://github.com/open-compass/opencompass.git
```

Create `opencompass/configs/eval_vllm_ascend_demo.py` with the following content:

```python
from mmengine.config import read_base
from opencompass.models import OpenAISDK

with read_base():
    from opencompass.configs.datasets.ceval.ceval_gen import ceval_datasets

# Only test the ceval-computer_network dataset in this demo
datasets = ceval_datasets[:1]

api_meta_template = dict(
    round=[
        dict(role='HUMAN', api_role='HUMAN'),
        dict(role='BOT', api_role='BOT', generate=True),
    ],
    reserved_roles=[dict(role='SYSTEM', api_role='SYSTEM')],
)

models = [
    dict(
        abbr='Qwen2.5-7B-Instruct-vLLM-API',
        type=OpenAISDK,
        key='EMPTY',  # API key
        openai_api_base='http://127.0.0.1:8000/v1',
        path='Qwen/Qwen2.5-7B-Instruct',
        tokenizer_path='Qwen/Qwen2.5-7B-Instruct',
        rpm_verbose=True,
        meta_template=api_meta_template,
        query_per_second=1,
        max_out_len=1024,
        max_seq_len=4096,
        temperature=0.01,
        batch_size=8,
        retry=3,
    )
]
```

Run the following command:

```bash
python3 run.py opencompass/configs/eval_vllm_ascend_demo.py --debug
```

After 1-2 minutes, the output looks like the following:

```
The markdown format results is as below:

| dataset | version | metric | mode | Qwen2.5-7B-Instruct-vLLM-API |
|----- | ----- | ----- | ----- | -----|
| ceval-computer_network | db9ce2 | accuracy | gen | 68.42 |
```

See the [OpenCompass Docs](https://opencompass.readthedocs.io/en/latest/index.html) for more usage.
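If you want to collect accuracy numbers programmatically (for example in CI), the markdown results table that OpenCompass prints can be parsed with a few lines of Python. This is a minimal sketch, not part of OpenCompass itself; the helper name `parse_results_table` and the inline sample table are illustrative:

```python
def parse_results_table(markdown: str) -> list[dict]:
    """Parse a markdown table like the one OpenCompass prints into a list of row dicts."""
    rows = [line for line in markdown.strip().splitlines() if line.strip().startswith("|")]
    header = [cell.strip() for cell in rows[0].strip("|").split("|")]
    parsed = []
    for line in rows[2:]:  # rows[1] is the |---|---| separator row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        parsed.append(dict(zip(header, cells)))
    return parsed

# Sample output copied from the demo run above
table = """
| dataset | version | metric | mode | Qwen2.5-7B-Instruct-vLLM-API |
|----- | ----- | ----- | ----- | -----|
| ceval-computer_network | db9ce2 | accuracy | gen | 68.42 |
"""
results = parse_results_table(table)
for row in results:
    # prints: ceval-computer_network accuracy = 68.42
    print(f"{row['dataset']} {row['metric']} = {row['Qwen2.5-7B-Instruct-vLLM-API']}")
```

From here you could, for example, fail a CI job when the accuracy drops below a threshold such as `float(row[...]) < 65.0`.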
diff --git a/docs/source/index.md b/docs/source/index.md
index 6e4381f..70a48ad 100644
--- a/docs/source/index.md
+++ b/docs/source/index.md
@@ -51,9 +51,10 @@ user_guide/release_notes
 % How to contribute to the vLLM Ascend project
 :::{toctree}
 :caption: Developer Guide
-:maxdepth: 1
+:maxdepth: 2
 developer_guide/contributing
 developer_guide/versioning_policy
+developer_guide/evaluation/index
 :::
 
 % User stories about vLLM Ascend project