[CI/UT] Fix disaggregated prefill CI (#1313)
### What this PR does / why we need it?
Run the disaggregated prefill CI in eager mode by passing `--enforce-eager` to both the prefill and the decode instance.

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with the existing test.

---------
Signed-off-by: MengqingCao <cmq0113@163.com>
@@ -66,6 +66,7 @@ function run_prefill_instance() {
     --served-model-name Deepseek \
     --max-model-len 2000 \
     --trust-remote-code \
+    --enforce-eager \
     --kv-transfer-config "$KV_CONFIG"
 }

@@ -119,6 +120,7 @@ function run_decode_instance() {
     --max-num-batched-tokens 2000 \
     --trust-remote-code \
     --gpu-memory-utilization 0.9 \
+    --enforce-eager \
     --kv-transfer-config "$KV_CONFIG"
 }
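Both hunks add the same flag to two otherwise separate launch commands. As a minimal sketch (not part of this commit), the shared flags could be emitted by one helper so the prefill and decode instances cannot drift apart. The function name `serve_flags` and the placeholder value of `KV_CONFIG` are assumptions for illustration. Only the flag names come from the diff above.

```shell
# Hypothetical helper, not in the original script: prints the flags that
# both instances share, one per line, so they can be inspected or reused.
serve_flags() {
    kv_config="$1"
    printf '%s\n' \
        --trust-remote-code \
        --enforce-eager \
        --kv-transfer-config "$kv_config"
}

# Placeholder config for illustration; the real script builds KV_CONFIG itself.
KV_CONFIG='{}'
serve_flags "$KV_CONFIG"
```

Printing the flags instead of launching a server keeps the sketch runnable without the model or hardware. The real functions would still append their instance-specific options, such as `--max-model-len 2000` for prefill or `--gpu-memory-utilization 0.9` for decode.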