### What this PR does / why we need it?
Cherry-pick of "Fix performance degradation when mtp>1" (#3597).
This PR fixes a performance degradation when mtp>1. With mtp>1, a batch may
contain more tokens (i.e. a larger batch size) than the ACL graph maximum
batch size, which causes the draft model to run in eager mode.
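
To illustrate the condition described above, here is a minimal sketch, not the actual implementation. The function name, its parameters, and the assumption that each request contributes one token plus one per speculative (MTP) token are all hypothetical and chosen only for illustration.

```python
def fits_in_graph(num_requests: int, num_speculative_tokens: int,
                  max_graph_batch_size: int) -> bool:
    """Return True if the batch fits within the captured graph size.

    Hypothetical helper: assumes each request contributes one token plus
    one token per speculative (MTP) step. If the total exceeds the graph
    maximum batch size, the model would have to run in eager mode.
    """
    num_tokens = num_requests * (1 + num_speculative_tokens)
    return num_tokens <= max_graph_batch_size


# With 64 requests and a graph limit of 128 tokens:
print(fits_in_graph(64, 1, 128))  # mtp=1: 128 tokens
print(fits_in_graph(64, 2, 128))  # mtp=2: 192 tokens
```

Under these assumed numbers, the same number of requests stays within the limit at mtp=1 but exceeds it at mtp=2, which is the situation this PR addresses.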
### How was this patch tested?
Tested by CI.
---------
Signed-off-by: Zetong Li <slippersss@126.com>