[1/2/N] Enable pymarkdown and python __init__ for lint system (#2011)

### What this PR does / why we need it?
1. Enable pymarkdown check
2. Enable python `__init__.py` check for vllm and vllm-ascend
3. Make clean code

### How was this patch tested?


- vLLM version: v0.9.2
- vLLM main:
29c6fbe58c

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
This commit is contained in:
Li Wang
2025-07-25 22:16:10 +08:00
committed by GitHub
parent d629f0b2b5
commit bdfb065b5d
31 changed files with 215 additions and 64 deletions

View File

@@ -6,7 +6,6 @@ Sleep Mode is an API designed to offload model weights and discard KV cache from
Since the generation and training phases may employ different model parallelism strategies, it becomes crucial to free KV cache and even offload model parameters stored within vLLM during training. This ensures efficient memory utilization and avoids resource contention on the NPU.
## Getting started
With `enable_sleep_mode=True`, the way we manage memory(malloc, free) in vllm will under a specific memory pool, during loading model and initialize kv_caches, we tag the memory as a map: `{"weight": data, "kv_cache": data}`.