[Feature] Support ragged prefill in FlashInfer MLA backend (#3967)

Co-authored-by: Yineng Zhang <me@zhyncs.com>
Co-authored-by: pankajroark <pankajroark@users.noreply.github.com>
Baizhou Zhang
2025-02-28 18:13:56 -08:00
committed by GitHub
parent f3b99f73b3
commit 90a4b7d98a
9 changed files with 308 additions and 407 deletions

@@ -182,6 +182,7 @@ class ModelRunner:
                 "device": server_args.device,
                 "enable_flashinfer_mla": server_args.enable_flashinfer_mla,
                 "disable_radix_cache": server_args.disable_radix_cache,
+                "flashinfer_mla_disable_ragged": server_args.flashinfer_mla_disable_ragged,
             }
         )
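
Note on the hunk above: it copies a new server argument into a shared dictionary so that other modules can read it. A minimal sketch of that pattern follows. The names `ServerArgs`, `global_server_args_dict`, `publish_server_args`, and `should_use_ragged_prefill` are assumptions made for illustration and are not taken from this diff; only the four dictionary keys appear in the hunk.

```python
from dataclasses import dataclass


@dataclass
class ServerArgs:
    """Hypothetical stand-in for the real server arguments object."""

    device: str = "cuda"
    enable_flashinfer_mla: bool = False
    disable_radix_cache: bool = False
    flashinfer_mla_disable_ragged: bool = False


# Assumed process-wide dictionary; the hunk shows only the update call's payload.
global_server_args_dict = {}


def publish_server_args(server_args: ServerArgs) -> None:
    """Copy selected flags into the shared dictionary, as the hunk does."""
    global_server_args_dict.update(
        {
            "device": server_args.device,
            "enable_flashinfer_mla": server_args.enable_flashinfer_mla,
            "disable_radix_cache": server_args.disable_radix_cache,
            "flashinfer_mla_disable_ragged": server_args.flashinfer_mla_disable_ragged,
        }
    )


def should_use_ragged_prefill() -> bool:
    """Hypothetical consumer: ragged prefill only when MLA is on and not opted out."""
    return bool(
        global_server_args_dict.get("enable_flashinfer_mla", False)
        and not global_server_args_dict.get("flashinfer_mla_disable_ragged", False)
    )
```

Under these assumptions, a backend would call `should_use_ragged_prefill()` after `publish_server_args(...)` has run. How the real backend consumes the flag is not visible in this hunk.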