### What this PR does / why we need it?
This PR improves the validation of `max_cudagraph_capture_size` by
comparing it against the potential maximum tokens required for decoding,
derived from the scheduler configuration. It introduces a warning to
alert users when the capture size might be insufficient for the
workload, which could lead to suboptimal performance.
ref: #8227
### Does this PR introduce _any_ user-facing change?
Yes, a warning log is added when the `max_cudagraph_capture_size` is
smaller than the potential decode workload.
---------
Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>