[E2E] add E2E for Prefix Caching cp & Chunked Prefill cp (#5149)

### What this PR does / why we need it?
Add E2E tests for Prefix Caching cp and Chunked Prefill cp.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?

- vLLM version: v0.15.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: F.Liu <liufeng248@huawei.com>
Signed-off-by: Feng Liu <46866849+ader47@users.noreply.github.com>
Co-authored-by: F.Liu <liufeng248@huawei.com>
Feng Liu
2026-02-03 15:04:14 +08:00
committed by GitHub
parent be5b66de6d
commit 03a18ad6fd
6 changed files with 404 additions and 123 deletions

@@ -78,6 +78,7 @@ PromptVideoInput = _PromptMultiModalInput[np.ndarray]
logger = logging.getLogger(__name__)
_TEST_DIR = os.path.dirname(__file__)
_LONG_PROMPTS = [os.path.join(_TEST_DIR, "prompts", "long_prompt.txt")]
def _check_npu_memory_worker(target_free_percentage: float, max_wait_seconds: float):