Update doc for MLA attention backends (#6034)
@@ -168,7 +168,7 @@ Please consult the documentation below and [server_args.py](https://github.com/s
 | Arguments | Description | Defaults |
 |----------|-------------|---------|
-| `attention_backend` | This argument specifies the backend for attention computation and KV cache management, which can be `fa3`, `flashinfer`, `triton`, `cutlass_mla`, or `torch_native`. When deploying DeepSeek models, use this argument to specify the MLA backend. | None |
+| `attention_backend` | This argument specifies the backend for attention computation and KV cache management, which can be `fa3`, `flashinfer`, `triton`, `flashmla`, `cutlass_mla`, or `torch_native`. When deploying DeepSeek models, use this argument to specify the MLA backend. | None |
 | `sampling_backend` | Specifies the backend used for sampling. | None |
 
 ## Constrained Decoding
 
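For context, a launch command along the following lines selects the newly documented `flashmla` backend when serving a DeepSeek model. This is a sketch: the model path, tensor-parallel degree, and port are illustrative placeholders, and the authoritative flag list lives in `server_args.py`.

```bash
# Sketch: serve a DeepSeek MLA model with the FlashMLA attention backend.
# Model path, --tp degree, and --port are illustrative; consult
# server_args.py for the full set of supported arguments.
python3 -m sglang.launch_server \
  --model-path deepseek-ai/DeepSeek-V3 \
  --attention-backend flashmla \
  --trust-remote-code \
  --tp 8 \
  --port 30000
```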