[CI] Refactor to speed up image building and CI installation (#6708)

### What this PR does / why we need it?
1. Refactor the image workflow to use `cache-from`, speeding up builds

![build](https://github.com/user-attachments/assets/02135c12-0069-44f8-a3ec-5c2b4282448a)

All Dockerfiles were also refactored so that rarely changing layers come
before frequently changing ones, which improves the build cache hit
rate.
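
As a rough sketch of how the cache reference is derived, the Bash function below mirrors the `Set cache ref` step in this change: it replaces `/` in the branch name with `-` and appends the image suffix. The function name `cache_ref` and the suffix value `example` are placeholders for illustration and are not part of the workflow.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): build the registry cache reference
# from a branch name and an image suffix, as the "Set cache ref" step does.
cache_ref() {
  local branch="${1:-main}"   # fall back to main when no branch is given
  local suffix="$2"           # placeholder for the workflow's inputs.suffix
  branch="${branch//\//-}"    # "/" is not allowed in image tags, so use "-"
  echo "quay.io/ascend/vllm-ascend:${branch}-${suffix}"
}

cache_ref "main" "example"          # prints quay.io/ascend/vllm-ascend:main-example
cache_ref "feature/foo" "example"   # prints quay.io/ascend/vllm-ascend:feature-foo-example
```

The workflow passes the resulting reference to the build step as `cache-from: type=registry,ref=<ref>`, so layers from the previously pushed image can be reused.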

2. Refactor the E2E tests to run on vllm-ascend container images, skipping
C compilation when no C code has changed

![e2e](https://github.com/user-attachments/assets/49f5b166-0df3-41e1-8f71-b3bbbed17cfd)

In this case, the job only replaces the vllm-ascend source code and
installs `requirements-dev.txt`, saving about 10 minutes before the tests start.
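
The excerpt here does not show how the PR decides whether C code changed. A minimal sketch of one possible check, assuming the decision is based on file extensions in the list of changed paths (the function name `needs_native_build` and the extension list are hypothetical):

```bash
#!/usr/bin/env bash
# Hypothetical check (not taken from the PR): reads changed file paths on
# stdin and succeeds (exit 0) if any of them looks like native code.
# The extension list below is an assumption.
needs_native_build() {
  grep -qE '\.(c|cc|cpp|h|hpp)$|(^|/)CMakeLists\.txt$'
}

# Example input; in CI the list could come from `git diff --name-only`.
if printf '%s\n' "vllm_ascend/worker.py" "README.md" | needs_native_build; then
  echo "native sources changed: full build required"
else
  echo "python-only change: reuse prebuilt image"
fi
```

With the Python-only file list above, the check takes the second branch, which corresponds to the fast path described in this section.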

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.15.0
- vLLM main: 9562912cea

Signed-off-by: wjunLu <wjunlu217@gmail.com>
wjunLu
2026-02-28 09:06:00 +08:00
committed by GitHub
parent 5666ce03f5
commit 84b00695f8
13 changed files with 456 additions and 213 deletions


@@ -76,6 +76,26 @@ jobs:
          driver: docker-container
          use: true
      - name: Set cache ref
        id: cache
        run: |
          if [ "${{ github.ref_type }}" = "tag" ]; then
            # For tag events, use the images built from source branch as cache (the tag image doesn't exist yet).
            if [ -z "$branch" ]; then
              branch=$(git branch -r --contains HEAD \
                | grep -v 'HEAD' \
                | sed 's|[[:space:]]*origin/||' \
                | head -1)
            fi
            branch="${branch:-main}"
          else
            # For branch push / schedule / workflow_dispatch, use the triggering branch name
            branch="${{ github.ref_name }}"
          fi
          # Replace / with - for use in image tags
          branch="${branch//\//-}"
          echo "ref=quay.io/ascend/vllm-ascend:${branch}-${{ inputs.suffix }}" >> $GITHUB_OUTPUT
      - name: Build and push
        uses: docker/build-push-action@v6
        id: build
@@ -89,6 +109,8 @@ jobs:
          outputs: type=image,name=quay.io/ascend/vllm-ascend,push-by-digest=true,name-canonical=true,push=${{ inputs.should_push }}
          build-args: |
            PIP_INDEX_URL=https://pypi.org/simple
          # use previously pushed multi-arch image as cache to speed up builds
          cache-from: type=registry,ref=${{ steps.cache.outputs.ref }}
          provenance: false
      - name: Export digest
@@ -154,6 +176,7 @@ jobs:
          # which follow the rule from vLLM with prefix v
          # TODO(yikun): the post release might be considered as latest release
          tags: |
            type=branch,suffix=${{ env.SUFFIX }}
            type=pep440,pattern={{raw}},suffix=${{ env.SUFFIX }}
            type=schedule,pattern=main,suffix=${{ env.SUFFIX }}
            type=raw,value=${{ inputs.workflow_dispatch_tag }},enable=${{ github.event_name == 'workflow_dispatch' }},suffix=${{ env.SUFFIX }}