[Doc] Add experimental tag for flashinfer mla (#3925)
@@ -133,7 +133,7 @@ Please consult the documentation below to learn more about the parameters you ma
* `attention_backend`: The backend for attention computation and KV cache management.
* `sampling_backend`: The backend for sampling.
-* `enable_flashinfer_mla`: The backend for flashinfer MLA wrapper. It can optimize the throughput of deepseek models.
+* `enable_flashinfer_mla`: The backend for the FlashInfer MLA wrapper, which accelerates DeepSeek models. (Experimental)
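
For context, here is a minimal launch sketch combining the flags above. It assumes the documented arguments map to `--attention-backend`, `--sampling-backend`, and `--enable-flashinfer-mla` flags of `sglang.launch_server`, and the model path is only a hypothetical example:

```python
# Minimal sketch: launch the server with a FlashInfer attention/sampling backend
# and the experimental MLA wrapper (flag names and model path are assumptions).
import subprocess
import sys

cmd = [
    sys.executable, "-m", "sglang.launch_server",
    "--model-path", "deepseek-ai/DeepSeek-V2-Lite",  # hypothetical DeepSeek model
    "--attention-backend", "flashinfer",   # attention computation and KV cache management
    "--sampling-backend", "flashinfer",    # sampling backend
    "--enable-flashinfer-mla",             # experimental MLA wrapper for DeepSeek models
]
subprocess.run(cmd, check=True)
```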
## Constrained Decoding