### What this PR does / why we need it?
When a tensor is non-contiguous, contiguous() allocates a new copy, which
temporarily raises peak GPU memory and may require lowering
gpu_memory_utilization. However, making tensors contiguous in
modelrunnerv1 significantly improves operator performance, so the
end-to-end gain outweighs the memory overhead.
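
The memory cost described above can be illustrated with a minimal sketch, assuming plain PyTorch on CPU. The helper `contiguous_copy_bytes` and the tensor shapes are hypothetical and are not part of this PR or of vLLM; they only show that `contiguous()` on a non-contiguous view allocates a second buffer of the same size.

```python
import torch


def contiguous_copy_bytes(t: torch.Tensor) -> int:
    """Hypothetical helper: extra bytes t.contiguous() would allocate.

    contiguous() returns the tensor itself when it is already contiguous,
    so the extra cost is zero in that case.
    """
    if t.is_contiguous():
        return 0
    return t.numel() * t.element_size()


# Illustrative shapes only; real model-runner tensors will differ.
x = torch.zeros(1024, 512, dtype=torch.float32)
view = x.t()               # transposed view: shares storage, non-contiguous
dense = view.contiguous()  # materializes a new, densely packed copy

print(view.is_contiguous())              # False
print(dense.is_contiguous())             # True
print(dense.data_ptr() != x.data_ptr())  # True: a separate buffer exists
print(contiguous_copy_bytes(view))       # 2097152 (1024 * 512 * 4 bytes)
```

Until the original view is released, both buffers are alive at once, which is the source of the higher peak memory.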
- vLLM version: release/v0.13.0
- vLLM main: ad32e3e19c
Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Co-authored-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>