[1/2/N] Enable pymarkdown and python __init__ for lint system (#2011)

### What this PR does / why we need it? 1. Enable pymarkdown check 2. Enable python `__init__.py` check for vllm and vllm-ascend 3. Make clean code ### How was this patch tested? - vLLM version: v0.9.2 - vLLM main: 29c6fbe58c --------- Signed-off-by: wangli <wangli858794774@gmail.com>
2025-07-25 22:16:10 +08:00
parent d629f0b2b5
commit bdfb065b5d
31 changed files with 215 additions and 64 deletions
--- a/docs/source/user_guide/feature_guide/quantization.md
+++ b/docs/source/user_guide/feature_guide/quantization.md
@@ -11,6 +11,7 @@ To quantize a model, users should install [ModelSlim](https://gitee.com/ascend/m
 Currently, only the specific tag [modelslim-VLLM-8.1.RC1.b020_001](https://gitee.com/ascend/msit/blob/modelslim-VLLM-8.1.RC1.b020_001/msmodelslim/README.md) of modelslim works with vLLM Ascend. Please do not install other version until modelslim master version is available for vLLM Ascend in the future.

 Install modelslim:
+
 ```bash
 git clone https://gitee.com/ascend/msit -b modelslim-VLLM-8.1.RC1.b020_001
 cd msit/msmodelslim
@@ -22,7 +23,6 @@ pip install accelerate

 Take [DeepSeek-V2-Lite](https://modelscope.cn/models/deepseek-ai/DeepSeek-V2-Lite) as an example, you just need to download the model, and then execute the convert command. The command is shown below. More info can be found in modelslim doc [deepseek w8a8 dynamic quantization docs](https://gitee.com/ascend/msit/blob/modelslim-VLLM-8.1.RC1.b020_001/msmodelslim/example/DeepSeek/README.md#deepseek-v2-w8a8-dynamic%E9%87%8F%E5%8C%96).

-
 ```bash
 cd example/DeepSeek
 python3 quant_deepseek.py --model_path {original_model_path} --save_directory {quantized_model_save_path} --device_type cpu --act_method 2 --w_bit 8 --a_bit 8  --is_dynamic True
@@ -39,6 +39,7 @@ Once convert action is done, there are two important files generated.
 2. [quant_model_description.json](https://www.modelscope.cn/models/vllm-ascend/DeepSeek-V2-Lite-W8A8/file/view/master/quant_model_description.json?status=1). All the converted weights info are recorded in this file.

 Here is the full converted model files:
+
 ```bash
 .
 ├── config.json
@@ -103,4 +104,4 @@ submit a issue, maybe some new models need to be adapted.

 ### 2. How to solve the error "Could not locate the configuration_deepseek.py"?

-Please convert DeepSeek series models using `modelslim-VLLM-8.1.RC1.b020_001` modelslim, this version has fixed the missing configuration_deepseek.py error.
+Please convert DeepSeek series models using `modelslim-VLLM-8.1.RC1.b020_001` modelslim, this version has fixed the missing configuration_deepseek.py error.
--- a/docs/source/user_guide/feature_guide/sleep_mode.md
+++ b/docs/source/user_guide/feature_guide/sleep_mode.md
@@ -6,7 +6,6 @@ Sleep Mode is an API designed to offload model weights and discard KV cache from

 Since the generation and training phases may employ different model parallelism strategies, it becomes crucial to free KV cache and even offload model parameters stored within vLLM during training. This ensures efficient memory utilization and avoids resource contention on the NPU.

-
 ## Getting started

 With `enable_sleep_mode=True`, the way we manage memory(malloc, free) in vllm will under a specific memory pool, during loading model and initialize kv_caches, we tag the memory as a map: `{"weight": data, "kv_cache": data}`.