refactor EAGLE 2 (#3269)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
Co-authored-by: merrymercy <lianminzheng@gmail.com>
Co-authored-by: Ying1123 <sqy1415@gmail.com>
@@ -21,6 +21,7 @@ def main():
        speculative_num_steps=3,
        speculative_eagle_topk=4,
        speculative_num_draft_tokens=16,
        cuda_graph_max_bs=8,
    )

    outputs = llm.generate(prompts, sampling_params)