Upgrade to vllm 0.17.0 corex v4.1 overlay
README.md
@@ -1,62 +1,37 @@
 # bi_150-vllm
 
-A build repository for `vLLM 0.16.1rc0`, based on
-`registry.iluvatar.com.cn:10443/customer/sz/vllm0.11.2-4.4.0-x86:v8`, used to
-produce a runnable image in the BI-V150 virtual machine environment.
+This repository contains the extracted `vLLM 0.17.0+corex.20260420090923`
+Python package used to overlay the vendor image
+`registry.iluvatar.com.cn:10443/customer/sz/vllm0.17.0-4.4.0-x86:v4.1`.
 
-## Changes
-
-This repository keeps only the minimal content needed to build the image:
+## Included files
 
 - `vllm/`
-  The current runtime code.
-- `vllm-0.16.1rc0+corex.4.4.0.dist-info/`
-  The corresponding package metadata.
+  The Python package code copied from the image payload.
+- `vllm-0.17.0+corex.20260420090923.dist-info/`
+  The package metadata extracted from the image.
 - `Dockerfile`
-  Builds the final image.
+  Builds a new image by replacing the installed `vllm` package in the vendor base image.
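
For reference, the destination the `Dockerfile` overlay has to overwrite can be read from the interpreter inside the vendor image. A minimal sketch, assuming it is run with the image's `python3` (the snippet is illustrative and not part of this repository):

```python
# Illustrative helper: print where the vendor image installs Python
# packages, i.e. the directory the overlay must overwrite, and which
# vllm package the interpreter actually resolves.
import sysconfig

import vllm

print(sysconfig.get_paths()["purelib"])  # site-packages root
print(vllm.__file__)                     # resolved vllm location
```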
 
-Compared with the base image, the key code changes kept in this repository
-are as follows:
+## Build
 
-- Fixed the CUDA platform detection logic in `vllm/platforms/__init__.py`:
-  - When NVML is unavailable and an error such as
-    `NVML Shared Library Not Found` is raised, the platform is no longer
-    immediately classified as non-CUDA.
-  - Detection instead falls back to `torch.cuda.is_available()` and
-    `torch.cuda.device_count()` to decide whether CUDA is usable.
-- Adjusted the CLI initialization logic so that missing optional benchmark
-  dependencies no longer block `vllm serve ...` startup (both fixes are
-  sketched after the error message below).
 
-This platform fix resolves the following startup failure:
-
-```text
-RuntimeError: Failed to infer device type
-```
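
A minimal sketch of the two fixes described above. Every name below is an illustrative assumption; the real code in `vllm/platforms/__init__.py` and the vLLM CLI is structured differently:

```python
# Sketch only, not the actual vllm code.
import torch


def cuda_platform_detected() -> bool:
    """Treat an NVML failure as inconclusive instead of 'not CUDA'."""
    try:
        import pynvml  # NVML bindings; unusable in some BI-V150 setups
        pynvml.nvmlInit()
        count = pynvml.nvmlDeviceGetCount()
        pynvml.nvmlShutdown()
        return count > 0
    except Exception:
        # e.g. "NVML Shared Library Not Found": fall back to the torch
        # runtime rather than concluding the platform is not CUDA.
        return torch.cuda.is_available() and torch.cuda.device_count() > 0


def build_cli_commands() -> dict:
    """Register subcommands; a missing benchmark extra must not block serve."""
    commands = {"serve": lambda args: None}  # placeholder serve handler
    try:
        # Hypothetical import standing in for the optional benchmark
        # dependencies; only the bench subcommand disappears if it fails.
        from vllm.benchmarks import main as bench_main
    except ImportError:
        pass
    else:
        commands["bench"] = bench_main
    return commands
```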
 
-## Build the image
-
-Run from the repository root:
+Run the following command from the repository root (`--pull=false` builds
+against the locally loaded vendor base image rather than pulling it):
 
 ```bash
-docker build -t bi_150_vllm:0.16.1 .
+docker build --pull=false \
+  --build-arg BASE_IMAGE=registry.iluvatar.com.cn:10443/customer/sz/vllm0.17.0-4.4.0-x86:v4.1 \
+  -t bi_150_vllm:0.17.0 \
+  .
 ```
 
-## Run the image
+## Verify
 
 ```bash
-docker run -dit \
-  --name iluvatar_test \
-  -p 38047:8000 \
-  --privileged \
-  -v /lib/modules:/lib/modules \
-  -v /dev:/dev \
-  -v /usr/src:/usr/src \
-  -v /mnt/gpfs/leaderboard/modelHubXC/Amu/t1-1.5B:/model \
-  -e CUDA_VISIBLE_DEVICES=0 \
-  --entrypoint vllm \
-  bi_150_vllm:0.16.1 \
-  serve /model \
-  --port 8000 \
-  --served-model-name llm \
-  --max-model-len 2048 \
-  --enforce-eager \
-  --trust-remote-code \
-  -tp 1
+docker run --rm -it bi_150_vllm:0.17.0 \
+  python3 -c "import vllm; print(vllm.__file__); print(vllm.__version__)"
 ```
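
As an additional, optional check (an assumption, not something the vendor documentation prescribes), the torch runtime can confirm the device is visible inside the container:

```python
# Assumption: executed with python3 inside a bi_150_vllm:0.17.0 container
# that has the device nodes mapped in. Expect "True" and a nonzero count.
import torch

print(torch.cuda.is_available(), torch.cuda.device_count())
```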
 
+## Notes
+
+- This is an overlay-style repository, not the original upstream git source tree.
+- The Docker image keeps the vendor runtime stack and only replaces the Python package files.