15 Commits

Author SHA1 Message Date
leo-pony
ff91904ee2 [Doc] Clearer corresponding relationship between configurations for multi-node guides (#3441)
Optimize the multi-node guide: clarify the correspondence between
configuration items and nodes

### What this PR does / why we need it?
Unclear guidance content has led to misunderstandings and reported issues,
for example: #3367

### Does this PR introduce _any_ user-facing change?
NA
### How was this patch tested?
NA

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

Signed-off-by: leo-pony <nengjunma@outlook.com>
2025-10-16 08:54:03 +08:00
Li Wang
516e14ae6a [Doc] Upgrade to multi-node tutorial model to deepseek-v3.1-w8a8 (#2553)
### What this PR does / why we need it?
Upgrade the multi-node tutorial model to deepseek-v3.1-w8a8
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.10.1.1
- vLLM main:
de02b07db4

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-08-27 14:16:44 +08:00
Mengqing Cao
4604882a3e [ReleaseNote] Release note of v0.10.0rc1 (#2225)
### What this PR does / why we need it?
Release note of v0.10.0rc1

- vLLM version: v0.10.0
- vLLM main:
8e8e0b6af1

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
2025-08-07 14:46:49 +08:00
Li Wang
bf84f2dbfa [Doc] Support kimi-k2-w8a8 (#2162)
### What this PR does / why we need it?
The kimi-k2 model is similar to the deepseek model, so only a few changes
are needed to support it. This PR:
1. Add kimi-k2-w8a8 deployment doc
2. Update quantization doc
3. Upgrade torchair support list
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.10.0
- vLLM main:
9edd1db02b

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-08-06 19:28:47 +08:00
Li Wang
bdfb065b5d [1/2/N] Enable pymarkdown and python __init__ for lint system (#2011)
### What this PR does / why we need it?
1. Enable pymarkdown check
2. Enable python `__init__.py` check for vllm and vllm-ascend
3. Clean up the code

### How was this patch tested?


- vLLM version: v0.9.2
- vLLM main:
29c6fbe58c

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-07-25 22:16:10 +08:00
wangxiyuan
eb921d2b6f [Doc] Fix 404 error (#1797)
Fix a URL 404 error in the docs
- vLLM version: v0.9.2
- vLLM main:
9ad0a4588b

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-07-15 11:52:38 +08:00
Li Wang
afcfe91dfa [Doc] Fix multi node doc (#1783)
### What this PR does / why we need it?

### Does this PR introduce _any_ user-facing change?
Pin docker image to latest release
### How was this patch tested?


- vLLM version: v0.9.2
- vLLM main:
1e9438e0b0

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-07-14 17:56:57 +08:00
wangxiyuan
b5b7e0ecc7 [Doc] Add qwen3 embedding 8b guide (#1734)
1. Add the tutorial for qwen3-embedding-8b
2. Remove `VLLM_USE_V1=1` from the docs; it is no longer needed as of 0.9.2


- vLLM version: v0.9.2
- vLLM main:
5923ab9524

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-07-11 17:40:17 +08:00
wangxiyuan
3d1e6a5929 [Doc] Update user doc index (#1581)
Add a user doc index to make the user guide clearer
- vLLM version: v0.9.1
- vLLM main:
49e8c7ea25

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-07-10 14:26:59 +08:00
Li Wang
0c4aa2b4f1 [Doc] Add multi node data parallel doc (#1685)
### What this PR does / why we need it?
Add the multi-node data parallel doc
### Does this PR introduce _any_ user-facing change?
Add the multi-node data parallel doc
### How was this patch tested?

- vLLM version: v0.9.1
- vLLM main:
805d62ca88

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-07-10 09:36:37 +08:00
wangxiyuan
9c7428b3d5 [CI] enable custom ops build (#466)
### What this PR does / why we need it?
This PR enables the custom ops build by default.

### Does this PR introduce _any_ user-facing change?

Yes. Installing vllm-ascend from source now triggers the custom ops
build step.

### How was this patch tested?
By image build and e2e CI

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-04-12 10:24:53 +08:00
jinyuxin
5d6239306b [DOC] Update multi_node.md (#468)
### What this PR does / why we need it?
- Added instructions for verifying multi-node communication environment.
- Included explanations of Ray-related environment variables for
configuration.
- Provided detailed steps for launching services in a multi-node
environment.
### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
manually tested.

Signed-off-by: jinyuxin <jinyuxin2@huawei.com>
2025-04-08 14:19:57 +08:00
Shanshan Shen
c06af8b2e0 [V1][Core] Add support for V1 Engine (#295)
### What this PR does / why we need it?
Add support for V1 Engine.

Please note that this is only the initial version; some parts may need to
be fixed or optimized in the future. Feel free to leave comments.

### Does this PR introduce _any_ user-facing change?

To use the V1 Engine on an NPU device, set the environment variables shown
below:

```bash
export VLLM_USE_V1=1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
```

If you are using vLLM for offline inference, you must add a `__main__`
guard, for example:

```python
import vllm

if __name__ == '__main__':
    # Construct the engine only when this file is run as the main script.
    llm = vllm.LLM(...)
```

Find more details
[here](https://docs.vllm.ai/en/latest/getting_started/troubleshooting.html#python-multiprocessing).
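
The reason for the guard can be illustrated with a minimal sketch that uses only Python's standard `multiprocessing` module. It is not vLLM code, and `square` and `run_workers` are hypothetical names chosen for illustration. With the `spawn` start method, each worker starts a fresh interpreter that re-imports the main module, so unguarded top-level code would run again in every worker.

```python
import multiprocessing as mp


def square(x: int) -> int:
    """Hypothetical stand-in for work done inside a worker process."""
    return x * x


def run_workers(values):
    # "spawn" launches each worker in a fresh interpreter that re-imports
    # the main module instead of inheriting the parent's memory.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # Without this guard, the re-import in every worker would reach this
    # line again and try to start a new pool while still bootstrapping.
    print(run_workers([1, 2, 3]))
```

Run it as a script file. Moving the `run_workers` call out of the guard is an easy way to reproduce the failure that the guard prevents.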

### How was this patch tested?
I tested online serving with `Qwen2.5-7B-Instruct` using this
command:

```bash
vllm serve Qwen/Qwen2.5-7B-Instruct --max_model_len 26240
```

Query the model with input prompts:

```bash
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "Qwen/Qwen2.5-7B-Instruct",
        "prompt": "The future of AI is",
        "max_tokens": 7,
        "temperature": 0
    }'
```
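
For scripted checks, the same request can be built with Python's standard library. This is a sketch only: it assumes the server from the command above is listening on `localhost:8000`, and `build_completion_request` and `send` are hypothetical helper names, not part of vLLM.

```python
import json
import urllib.request


def build_completion_request(base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST request that the curl command above sends."""
    payload = {
        "model": "Qwen/Qwen2.5-7B-Instruct",
        "prompt": "The future of AI is",
        "max_tokens": 7,
        "temperature": 0,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    request = build_completion_request()
    # Only print the request here; call send(request) once the server is up.
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Building the request separately from sending it allows the payload to be inspected without a live server.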

---------

Signed-off-by: shen-shanshan <467638484@qq.com>
Co-authored-by: didongli182 <didongli@huawei.com>
2025-03-20 19:34:44 +08:00
wangxiyuan
c25631ec7b [Doc] Add the release note for 0.7.3rc1 (#285)
Add the release note for 0.7.3rc1

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-03-13 17:57:06 +08:00
Yikun Jiang
38334f5daa [Docs] Re-arch on doc and make QwQ doc work (#271)
### What this PR does / why we need it?
Re-architect the tutorials and move single NPU / multi NPU / multi node to the index.
- Unify the docker run command
- Use a dropdown to hide the build-from-source installation doc
- Re-architect the tutorials to include Qwen/QwQ/DeepSeek
- Make the QwQ doc work

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI test



Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-03-10 09:27:48 +08:00