[4/n] decouple quantization implementation from vLLM dependency (#9191)

Co-authored-by: AniZpZ <aniz1905@gmail.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
Author:  Hongbo Xu
Date:    2025-08-15 03:05:46 +08:00
Commit:  2cc9eeab01 (committed by GitHub)
Parent:  63d82a776a

8 changed files with 37 additions and 74 deletions


@@ -30,14 +30,9 @@ jobs:
       - name: Install dependencies
         run: |
           bash scripts/ci/ci_install_dependency.sh
           pip install "vllm==0.10.0"
           pip install "openai==1.99.1"
           pip install "bitsandbytes>=0.44.0"
-          # NOTE: The latest sgl-kernel depends on torch 2.8.0 but the latest vllm depends on torch 2.7.0,
-          # so they are not compatible. Here we install the old sgl-kernel to make the test pass.
-          # TODO: remove this once vllm supports torch 2.8.0.
-          pip install "sgl-kernel==0.2.9"
+          pip install "sgl-kernel==0.3.5"
       - name: Run vLLM dependency tests
         timeout-minutes: 60
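
The removed NOTE explains why the old pin existed: sgl-kernel releases after 0.2.9 were built against torch 2.8.0, while vllm 0.10.0 required torch 2.7.0, so the CI held sgl-kernel back. When reproducing this install step locally, a quick sanity check along these lines can confirm that the new pins resolve to a mutually compatible environment (a sketch, not part of the commit; the `sgl_kernel` import name is an assumption):

    pip install "vllm==0.10.0" "openai==1.99.1" "bitsandbytes>=0.44.0" "sgl-kernel==0.3.5"
    # pip check reports any installed packages whose declared dependencies conflict
    pip check
    # import the packages together and print the torch build they share
    python -c "import torch, vllm, sgl_kernel; print(torch.__version__)"

If pip check reports a torch version conflict here, the sgl-kernel pin and the vllm pin have drifted apart again, which is exactly the situation the removed TODO was tracking.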