[main2main] upgrade vllm to 0308 (#7213)

### What this PR does / why we need it?
Update main2main to vLLM 0308.
Upstream PRs that break compatibility:

* https://github.com/vllm-project/vllm/pull/30681
* https://github.com/vllm-project/vllm/pull/35552 remove self.cudagraph_batch_sizes
* https://github.com/vllm-project/vllm/pull/35158 clear_metadata -> defer_finalize
* https://github.com/vllm-project/vllm/pull/36006 remove CacheConfig.cpu_offload_gb
* https://github.com/vllm-project/vllm/pull/35472
* https://github.com/vllm-project/vllm/pull/34552 attn_metadata_builder
* https://github.com/vllm-project/vllm/pull/30515 profile_seq_lens
* https://github.com/vllm-project/vllm/pull/28053
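
For readers adapting downstream code to a removed field such as CacheConfig.cpu_offload_gb, one common pattern is to read the attribute defensively so the same code runs against both the old and the new upstream. The sketch below is illustrative only: `get_cpu_offload_gb` is a hypothetical helper, the `SimpleNamespace` objects are stand-ins rather than vLLM's real `CacheConfig`, and nothing here is claimed to be how this PR handles the change.

```python
from types import SimpleNamespace


def get_cpu_offload_gb(cache_config, default=0.0):
    """Return cpu_offload_gb when the field exists, otherwise a default.

    Hypothetical helper: getattr with a default avoids an AttributeError
    when the upstream config class no longer defines the field.
    """
    return getattr(cache_config, "cpu_offload_gb", default)


# Stand-in objects (assumptions, not real vLLM classes).
old_style_config = SimpleNamespace(cpu_offload_gb=4.0)  # field still present
new_style_config = SimpleNamespace()                    # field removed

print(get_cpu_offload_gb(old_style_config))
print(get_cpu_offload_gb(new_style_config))
```

A fallback like this suits a transition period where both upstream versions must be supported; once only the newer upstream is targeted, dropping the access entirely is simpler.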

- vLLM version: v0.16.0
- vLLM main: 4034c3d32e

---------

Signed-off-by: MrZ20 <2609716663@qq.com>
Signed-off-by: menogrey <1299267905@qq.com>
Co-authored-by: MrZ20 <2609716663@qq.com>
zhangyiming
2026-03-18 09:24:43 +08:00
committed by GitHub
parent 79ef41a53d
commit 1c954ff264
16 changed files with 223 additions and 168 deletions

@@ -110,7 +110,7 @@ jobs:
- name: Upload timing data
uses: actions/upload-artifact@v4
-if: ${{ inputs.continue_on_error == true }}
+if: ${{ inputs.continue_on_error == true && github.event_name != 'pull_request' }}
with:
name: timing-data-singlecard-light-part${{ matrix.part }}
path: test_timing_data.json
@@ -200,7 +200,7 @@ jobs:
- name: Upload timing data
uses: actions/upload-artifact@v4
-if: ${{ inputs.continue_on_error == true }}
+if: ${{ inputs.continue_on_error == true && github.event_name != 'pull_request' }}
with:
name: timing-data-singlecard-full-part${{ matrix.part }}
path: test_timing_data.json
@@ -289,7 +289,7 @@ jobs:
- name: Upload timing data
uses: actions/upload-artifact@v4
-if: ${{ inputs.continue_on_error == true }}
+if: ${{ inputs.continue_on_error == true && github.event_name != 'pull_request' }}
with:
name: timing-data-2card-light-part${{ matrix.part }}
path: test_timing_data.json
@@ -378,7 +378,7 @@ jobs:
- name: Upload timing data
uses: actions/upload-artifact@v4
-if: ${{ inputs.continue_on_error == true }}
+if: ${{ inputs.continue_on_error == true && github.event_name != 'pull_request' }}
with:
name: timing-data-2card-full-part${{ matrix.part }}
path: test_timing_data.json
@@ -475,7 +475,7 @@ jobs:
- name: Upload timing data
uses: actions/upload-artifact@v4
-if: ${{ inputs.continue_on_error == true }}
+if: ${{ inputs.continue_on_error == true && github.event_name != 'pull_request' }}
with:
name: timing-data-4card-full-part${{ matrix.part }}
path: test_timing_data.json