[4/n] decouple quantization implementation from vLLM dependency (#9191)

Co-authored-by: AniZpZ <aniz1905@gmail.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
Author:  Hongbo Xu
Date:    2025-08-15 03:05:46 +08:00
Commit:  2cc9eeab01 (committed by GitHub)
Parent:  63d82a776a

8 changed files with 37 additions and 74 deletions


@@ -30,14 +30,9 @@ jobs:
       - name: Install dependencies
         run: |
           bash scripts/ci/ci_install_dependency.sh
           pip install "vllm==0.10.0"
           pip install "openai==1.99.1"
           pip install "bitsandbytes>=0.44.0"
-          # NOTE: The latest sgl-kernel depends on torch 2.8.0 but the latest vllm depends on torch 2.7.0,
-          # so they are not compatible. Here we install the old sgl-kernel to make the test pass.
-          # TODO: remove this once vllm supports torch 2.8.0.
-          pip install "sgl-kernel==0.2.9"
+          pip install "sgl-kernel==0.3.5"
       - name: Run vLLM dependency tests
         timeout-minutes: 60
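
The removed NOTE explains why the old pin existed: sgl-kernel releases after 0.2.9 were built against torch 2.8.0, while vllm 0.10.0 required torch 2.7.0, so the CI held sgl-kernel back. When reproducing this install step locally, a quick sanity check along these lines can confirm that the new pins resolve to a mutually compatible environment (a sketch, not part of the commit; the `sgl_kernel` import name is an assumption):

    pip install "vllm==0.10.0" "openai==1.99.1" "bitsandbytes>=0.44.0" "sgl-kernel==0.3.5"
    # pip check reports any installed packages whose declared dependencies conflict
    pip check
    # import the packages together and print the torch build they share
    python -c "import torch, vllm, sgl_kernel; print(torch.__version__)"

If pip check reports a torch version conflict here, the sgl-kernel pin and the vllm pin have drifted apart again, which is exactly the situation the removed TODO was tracking.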