### What this PR does / why we need it?
Remove padding for VLM inputs. Padding the inputs is no longer needed, and it breaks input preparation for VLMs.

### Does this PR introduce _any_ user-facing change?
N/A

Signed-off-by: MengqingCao <cmq0113@163.com>