Files

Angazenn a5f33590d3 [CORE]initial support for torchair with non-mla backend (#1506)

### What this PR does / why we need it?
This PR adds support for torchair graph mode with the non-mla backend on
both 800IA2 and 300I Duo platforms. The main change is the addition of
`attention_v1_torchair.py`, which implements the attention-related
operations that torchair requires.

### Does this PR introduce _any_ user-facing change?
Before this PR, vLLM-Ascend allowed only deepseek to use torchair. Now
pangu can use it as well. In addition, we add a supported-model list that
controls which model types can use torchair.

### How was this patch tested?
We have tested it with PanguProMoE on both 800IA2 and 300I Duo platforms,
and the model generates answers normally.

---------

Signed-off-by: angazenn <zengyanjia@huawei.com>
Signed-off-by: tianyitang <tangtianyi4@huawei.com>
Co-authored-by: angazenn <zengyanjia@huawei.com>
Co-authored-by: tianyitang <tangtianyi4@huawei.com>

2025-07-03 22:21:42 +08:00

| Name | Last commit | Date |
| --- | --- | --- |
| source | [CORE]initial support for torchair with non-mla backend (#1506) | 2025-07-03 22:21:42 +08:00 |
| Makefile | [Doc] Add sphinx build for vllm-ascend (#55) | 2025-02-13 18:44:17 +08:00 |
| README.md | Add an example for user stories (#399) | 2025-03-26 16:25:57 +08:00 |
| requirements-docs.txt | [Docs] Add dynamic version in docs (#90) | 2025-02-19 08:57:27 +08:00 |
| requirements-test.txt | static EPLB fix bug, add unit test (#1186) | 2025-06-18 19:46:56 +08:00 |

README.md

vLLM Ascend Plugin documents

Live doc: https://vllm-ascend.readthedocs.io

Build the docs

    # Install dependencies.
    pip install -r requirements-docs.txt

    # Build the docs.
    make clean
    make html

Open the docs with your browser

    python -m http.server -d _build/html/

Launch your browser and open http://localhost:8000/.
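
If you rebuild often, the build steps above can be wrapped in a small script. The following is only a sketch: the `build_docs` helper, its `docs_dir` and `run` parameters, and the `BUILD_STEPS` and `SERVE_STEP` names are hypothetical and not part of vllm-ascend. It assumes the commands are run from the directory that contains the `Makefile` and `requirements-docs.txt`.

```python
import subprocess
from pathlib import Path

# The build commands from this README, in the order they are listed.
BUILD_STEPS = [
    ["pip", "install", "-r", "requirements-docs.txt"],
    ["make", "clean"],
    ["make", "html"],
]

# The preview command from this README (serves on port 8000 by default).
SERVE_STEP = ["python", "-m", "http.server", "-d", "_build/html/"]


def build_docs(docs_dir=".", run=subprocess.run):
    """Run each build step inside docs_dir, stopping at the first failure.

    `run` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without actually invoking pip or make.
    Returns the path where the README expects the HTML output.
    """
    for cmd in BUILD_STEPS:
        # check=True makes subprocess.run raise CalledProcessError
        # on a non-zero exit code, so later steps are skipped.
        run(cmd, cwd=str(docs_dir), check=True)
    return Path(docs_dir) / "_build" / "html"
```

Calling `build_docs()` from the docs directory runs the three build commands in order. You can then preview the result with the `python -m http.server -d _build/html/` command shown above.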