[CI, AMD] Add AMD tests to CI (#1491)

2024-09-22 04:45:10 -07:00
parent 13f1357ef0
commit 6f3cf1297e
2 changed files with 26 additions and 1 deletion
--- a/.github/workflows/pr-test.yml
+++ b/.github/workflows/pr-test.yml
@@ -246,6 +246,28 @@ jobs:
          cd test/srt
          python3 test_data_parallelism.py

+  accuracy-test-1-gpu-amd:
+    if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
+    runs-on: 1-gpu-runner-amd
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v3
+
+      - name: Install dependencies
+        run: |
+          pip install --upgrade pip
+          pip install -e "python[all]" --no-deps
+
+          git clone https://github.com/merrymercy/human-eval.git
+          cd human-eval
+          pip install -e .
+
+      - name: Evaluate Accuracy
+        timeout-minutes: 20
+        run: |
+          cd test/srt
+          python3 test_eval_accuracy_large.py
+
  finish:
    needs: [
      unit-test-frontend, unit-test-backend-part-1, unit-test-backend-part-2, unit-test-backend-part-3,
--- a/docs/en/setup_github_runner.md
+++ b/docs/en/setup_github_runner.md
@@ -8,7 +8,10 @@ You can mount a folder for the shared huggingface model weights cache. The comma

 ```
 docker pull nvidia/cuda:12.1.1-devel-ubuntu22.04
+# Nvidia
 docker run --shm-size 64g -it -v /tmp/huggingface:/hf_home --gpus all nvidia/cuda:12.1.1-devel-ubuntu22.04 /bin/bash
+# AMD
+docker run --rm --device=/dev/kfd --device=/dev/dri --group-add video --shm-size 64g -it -v /tmp/huggingface:/hf_home henryx/haisgl:sgl0.3.1.post3_vllm0.6.0_triton3.0.0_rocm6.2.1 /bin/bash
 ```

 ### Step 2: Configure the runner by `config.sh`
@@ -41,4 +44,4 @@ export CUDA_VISIBLE_DEVICES=0
 - Run it forever
 ```
 while true; do ./run.sh; echo "Restarting..."; sleep 2; done
-```
+```
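
The AMD `docker run` line added above passes `/dev/kfd` and `/dev/dri` into the container. A minimal sketch of a host-side pre-flight check follows, assuming these two device nodes are what a ROCm host exposes; `missing_devices` is a hypothetical helper and is not part of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this commit): print every given path
# that does not exist on the host, one per line.
missing_devices() {
  local dev
  for dev in "$@"; do
    [ -e "$dev" ] || printf '%s\n' "$dev"
  done
}

# These are the device nodes the AMD docker command maps into the container.
missing="$(missing_devices /dev/kfd /dev/dri)"
if [ -n "$missing" ]; then
  echo "Missing device nodes: $missing" >&2
else
  echo "Device nodes present; the AMD container can be started."
fi
```

Running this before starting the runner container may make a missing driver easier to spot than a failure inside the container.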