wangxiyuan 6335fe39ea Nominate ApsarasX as vllm-ascend maintainer (#2419)
I would like to nominate Wengang Chen ([@ApsarasX](https://github.com/ApsarasX)) as a maintainer, starting with my +1.

## Reason
Review Quality: He focuses on reviewing the vLLM Ascend core modules and has delivered 100+ high-quality reviews, such as
[#2326 (comment)](https://github.com/vllm-project/vllm-ascend/pull/2326#discussion_r2268509365),
[#768 (comment)](https://github.com/vllm-project/vllm-ascend/pull/768#discussion_r2075278516),
[#2312 (comment)](https://github.com/vllm-project/vllm-ascend/pull/2312#issuecomment-3174677159),
[#2268 (comment)](https://github.com/vllm-project/vllm-ascend/pull/2268#discussion_r2260920578),
[#2192 (comment)](https://github.com/vllm-project/vllm-ascend/pull/2192#issuecomment-3149414586), and
[#2156 (comment)](https://github.com/vllm-project/vllm-ascend/pull/2156#discussion_r2249096673).
These reviews helped vLLM Ascend v0.9.x and v0.10.x ship with high quality.

Sustained and Quality Contributions: He has a very good habit of sharing his design ideas, development process, and performance test results, as in [#966](https://github.com/vllm-project/vllm-ascend/pull/966). He has contributed [many PRs](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3AApsarasX+is%3Amerged+), including valuable bug fixes and performance improvements.

Community Involvement: He is actively involved in community discussions, collaborative, and helpful to users solving problems, having participated in [120+ PRs and issues](https://github.com/vllm-project/vllm-ascend/issues?q=commenter%3AApsarasX). He was also a speaker at the [vLLM Beijing Meetup](https://mp.weixin.qq.com/s/7n8OYNrCC_I9SJaybHA_-Q).

So I think he's a great addition to the vLLM Ascend Maintainer team.

Supporting evidence:

- Review Quality: [108+ PRs reviewed](https://github.com/vllm-project/vllm-ascend/pulls?q=commenter%3AApsarasX) with many valuable comments, including the examples linked above.
- Sustained and Major Contributions: https://github.com/vllm-project/vllm-ascend/pulls/ApsarasX
- Quality Contributions: [well-documented PRs](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3AApsarasX+is%3Aclosed), e.g. [\[Perf\] Refactor tensor disposal logic to reduce memory usage](https://github.com/vllm-project/vllm-ascend/pull/966).
- Community Involvement: [7 issues filed](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20state%3Aclosed%20author%3AApsarasX) and [120+ PRs and issues commented on](https://github.com/vllm-project/vllm-ascend/issues?q=commenter%3AApsarasX).

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

vllm-ascend

vLLM Ascend Plugin

| About Ascend | Documentation | #sig-ascend | Users Forum | Weekly Meeting |

English | 中文


Latest News 🔥

  • [2025/06] User stories page is now live! It kicks off with LLaMA-Factory, verl, TRL, and GPUStack to demonstrate how vLLM Ascend helps Ascend users enhance their experience across fine-tuning, evaluation, reinforcement learning (RL), and deployment scenarios.
  • [2025/06] Contributors page is now live! All contributions deserve to be recorded; thanks to all contributors.
  • [2025/05] We've released the first official version, v0.7.3! We collaborated with the vLLM community to publish a blog post sharing our practice: Introducing vLLM Hardware Plugin, Best Practice from Ascend NPU.
  • [2025/03] We hosted the vLLM Beijing Meetup with the vLLM team! Please find the meetup slides here.
  • [2025/02] The vLLM community officially created the vllm-project/vllm-ascend repo for running vLLM seamlessly on the Ascend NPU.
  • [2024/12] We are working with the vLLM community to support [RFC]: Hardware pluggable.

Overview

vLLM Ascend (vllm-ascend) is a community-maintained hardware plugin for running vLLM seamlessly on the Ascend NPU.

It is the recommended approach for supporting the Ascend backend within the vLLM community. It adheres to the principles outlined in the [RFC]: Hardware pluggable, providing a hardware-pluggable interface that decouples the integration of the Ascend NPU with vLLM.

With the vLLM Ascend plugin, popular open-source models, including Transformer-like, Mixture-of-Experts, embedding, and multi-modal LLMs, can run seamlessly on the Ascend NPU.
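To illustrate how this hardware-pluggable interface fits together, below is a minimal sketch of an out-of-tree platform plugin registering with vLLM through Python entry points, in the spirit of the RFC. The entry-point group name follows vLLM's platform-plugin convention; the module layout and the `NPUPlatform` class path are illustrative assumptions, not necessarily the exact identifiers vllm-ascend uses.

```python
# setup.py (sketch): an out-of-tree backend advertises itself through the
# "vllm.platform_plugins" entry-point group, so vLLM can discover it at
# startup without any changes to vLLM itself.
from setuptools import setup

setup(
    name="vllm-ascend-sketch",
    entry_points={
        "vllm.platform_plugins": [
            "ascend = vllm_ascend:register",  # hypothetical module layout
        ],
    },
)

# vllm_ascend/__init__.py (sketch): vLLM calls the entry point and expects
# the fully qualified class name of the platform implementation, which lets
# it import the backend lazily.
def register():
    return "vllm_ascend.platform.NPUPlatform"  # hypothetical class path
```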

Prerequisites

  • Hardware: Atlas 800I A2 Inference series, Atlas A2 Training series, Atlas 800I A3 Inference series, Atlas A3 Training series, Atlas 300I Duo (Experimental)
  • OS: Linux
  • Software:
    • Python >= 3.9, < 3.12
    • CANN >= 8.2.rc1
    • PyTorch >= 2.7.1, torch-npu >= 2.7.1.dev20250724
    • vLLM (the same version as vllm-ascend)
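After installing CANN, PyTorch, and torch-npu, a quick sanity check along these lines can confirm the environment matches the prerequisites above (a hedged sketch; it assumes torch-npu exposes the usual `torch.npu` namespace):

```python
# Verify the software prerequisites listed above.
import sys

assert (3, 9) <= sys.version_info[:2] < (3, 12), "Python >= 3.9, < 3.12 is required"

import torch
import torch_npu  # extends torch with the Ascend NPU backend

print("torch:", torch.__version__)
print("torch-npu:", torch_npu.__version__)
print("NPU available:", torch.npu.is_available())
```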

Getting Started

Please use the following recommended versions to get started quickly:

| Version | Release type | Doc |
|---------|--------------|-----|
| v0.10.0rc1 | Latest release candidate | QuickStart and Installation for more details |
| v0.9.1rc2 | Next stable release | QuickStart and Installation for more details |
| v0.7.3.post1 | Latest stable version | QuickStart and Installation for more details |
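Once one of these versions is installed, the standard vLLM API works unchanged; the Ascend backend is picked up automatically through the plugin mechanism. A minimal offline-inference sketch (the model name is only an example):

```python
from vllm import LLM, SamplingParams

# No Ascend-specific code is needed: with vllm-ascend installed, the NPU
# backend is selected automatically.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # example model

outputs = llm.generate(
    ["Hello, my name is"],
    SamplingParams(temperature=0.8, max_tokens=32),
)
for output in outputs:
    print(output.outputs[0].text)
```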

Contributing

See CONTRIBUTING for more details; it is a step-by-step guide that helps you set up the development environment, build, and test.

We welcome and value any contributions and collaborations.

Branch

vllm-ascend has a main branch and dev branches.

  • main: the main branch corresponds to the vLLM main branch and is continuously monitored for quality through Ascend CI.
  • vX.Y.Z-dev: development branches, created alongside new vLLM releases. For example, v0.7.3-dev is the dev branch for vLLM v0.7.3.

Below are the maintained branches:

| Branch | Status | Note |
|--------|--------|------|
| main | Maintained | CI commitment for vLLM main branch and vLLM 0.10.x branch |
| v0.7.1-dev | Unmaintained | Only doc fixes are allowed |
| v0.7.3-dev | Maintained | CI commitment for vLLM 0.7.3 version; only bug fixes are allowed and no new release tags will be made |
| v0.9.1-dev | Maintained | CI commitment for vLLM 0.9.1 version |

Please refer to Versioning policy for more details.

License

Apache License 2.0, as found in the LICENSE file.
