Add an option of enable frozen parameter (#2869)

### What this PR does / why we need it? Add an option of enable frozen parameter ### How was this patch tested? - vLLM version: v0.10.2 - vLLM main: 68dbde5dbb Signed-off-by: 1Fire4 <wangdingyi2@huawei.com>
2025-09-17 12:00:44 +08:00
parent 76844eec78
commit 1f6465c399
4 changed files with 8 additions and 1 deletions
--- a/docs/source/user_guide/configuration/additional_config.md
+++ b/docs/source/user_guide/configuration/additional_config.md
@@ -48,6 +48,7 @@ The details of each config option are as follows:
 | `enable_multistream_mla`| bool | `False` | Whether to put vector ops of MLA to another stream. This option only takes effects on models using MLA (e.g., DeepSeek). |
 | `enable_multistream_moe`| bool | `False` | Whether to enable multistream shared expert. This option only takes effects on DeepSeek moe models. |
 | `enable_view_optimize` | bool | `True` | Whether to enable torchair view optimization |
+| `enable_frozen_parameter` | bool | `True` | Whether to fix the memory address of weights during inference to reduce the input address refresh time during graph execution. |
 | `use_cached_graph` | bool | `False` | Whether to use cached graph |
 | `graph_batch_sizes` | list[int] | `[]` | The batch size for torchair graph cache |
 | `graph_batch_sizes_init` | bool | `False` | Init graph batch size dynamically if `graph_batch_sizes` is empty |