### What this PR does / why we need it?
The current version fails at initialization when the user sets max_num_seqs
to a value that is not a multiple of the tp size. We first compute the valid
sizes for sequence parallelism and then remove the sizes that are not
multiples of the tp size. This breaks when max_num_seqs lies just above a
multiple of 8 that is not itself a multiple of the tp size, for example when
the tp size is 16 and max_num_seqs is 90. The calculated max graph capture
size, 88, is dropped from the valid size list, but
max_cudagraph_capture_size is not reset to the next valid size, so it points
at a size that is no longer in the list. This PR adds the line that keeps the
two in sync.
Cherry-picked from main PR #7801.
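
The mismatch can be reproduced with a minimal sketch of the arithmetic. All variable names here are hypothetical and this is not the project's actual code. It assumes that candidate capture sizes are the multiples of 8 up to max_num_seqs, and that the "next valid number" is the largest size left after filtering.

```python
# Illustrative sketch only: names and the candidate-size rule are assumptions,
# not the real implementation.
tp_size = 16
max_num_seqs = 90

# Step 1 (assumed rule): candidate sizes are multiples of 8 up to max_num_seqs.
candidates = list(range(8, max_num_seqs + 1, 8))
max_capture_size = max(candidates)  # 88 for max_num_seqs = 90

# Step 2: drop every size that is not a multiple of the tp size.
valid_sizes = [size for size in candidates if size % tp_size == 0]

# Without the fix, the recorded maximum no longer appears in the list.
is_stale = max_capture_size not in valid_sizes

# The fix (assumed here to pick the largest remaining size): re-sync the
# maximum with the filtered list.
if is_stale and valid_sizes:
    max_capture_size = max(valid_sizes)

print(valid_sizes, max_capture_size)
```

With tp size 16 and max_num_seqs 90, the filtered list ends at 80 while the stale maximum is still 88, which is the inconsistency the added line removes.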
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Full CI passed with this PR.
Signed-off-by: linfeng-yuan <1102311262@qq.com>
Co-authored-by: limuyuan <limuyuan3@huawei.com>