### What this PR does / why we need it?
This PR is partially cherry-picked from #8172.
It fixes mismatched capture sizes that appear after rounding operations
when sequence parallelism (sp) or speculative decoding is enabled. The
cause is that the original `self.cudagraph_capture_sizes` is no longer
updated and keeps its initial sizes. We now call
`self.cudagraph_dispatcher.get_capture_descs` to get the up-to-date sizes.
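
The mismatch can be illustrated with a minimal toy sketch. Everything here is hypothetical and only models the idea: `ToyDispatcher`, `ToyCaptureDesc`, `round_up`, the rounding multiple of 4, and the return shape of the toy `get_capture_descs` are assumptions for illustration, not the real classes or signature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyCaptureDesc:
    """Hypothetical stand-in for a capture descriptor."""
    num_tokens: int


def round_up(size: int, multiple: int) -> int:
    """Round size up to the nearest multiple (assumed rounding rule)."""
    return ((size + multiple - 1) // multiple) * multiple


class ToyDispatcher:
    """Hypothetical dispatcher that owns the rounded capture sizes."""

    def __init__(self, initial_sizes: list[int], multiple: int) -> None:
        # Rounding can merge several initial sizes into one, so deduplicate.
        self._sizes = sorted({round_up(s, multiple) for s in initial_sizes})

    def get_capture_descs(self) -> list[ToyCaptureDesc]:
        # Assumed return shape: one descriptor per up-to-date size.
        return [ToyCaptureDesc(num_tokens=s) for s in self._sizes]


initial_sizes = [1, 2, 4, 8, 16]
dispatcher = ToyDispatcher(initial_sizes, multiple=4)

# Old behaviour: the initial list is never updated after rounding.
stale_sizes = initial_sizes

# New behaviour: read the sizes back from the dispatcher.
fresh_sizes = [d.num_tokens for d in dispatcher.get_capture_descs()]

print("stale:", stale_sizes)
print("fresh:", fresh_sizes)
```

In this sketch the stale list still contains sizes such as 1 and 2 that no longer correspond to any captured graph, which is the kind of mismatch this PR avoids by reading the sizes from the dispatcher.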
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
Tested by CI.
Signed-off-by: Zetong Li <slippersss@126.com>