[CORE]initial support for torchair with non-mla backend (#1506)

### What this PR does / why we need it?
This PR supports torchair graph mode with non-mla backend on both 800IA2
and 300I Duo platforms. The main change is to add
`attention_v1_torchair.py` to support specific attention related
operations that are required by torchair.

### Does this PR introduce _any_ user-facing change?
Before this PR, vLLM-Ascend only allowed deepseek models to use torchair.
Now pangu models can use it as well. In addition, we add a supported-model
list that controls which types of models can use torchair.
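
For illustration, a supported-model list of this kind could gate torchair roughly as follows. This is a minimal sketch, not the code from this PR: the names `SUPPORTED_TORCHAIR_MODELS`, `is_torchair_supported`, and `resolve_torchair` are hypothetical, and the choice to raise an error (rather than fall back silently) is an assumption.

```python
# Hypothetical allow-list; the entries mirror the model families named above.
SUPPORTED_TORCHAIR_MODELS = ("deepseek", "pangu")


def is_torchair_supported(model_type: str) -> bool:
    """Return True if model_type contains an allow-listed family name."""
    normalized = model_type.lower()
    return any(name in normalized for name in SUPPORTED_TORCHAIR_MODELS)


def resolve_torchair(model_type: str, torchair_requested: bool) -> bool:
    """Decide whether torchair graph mode may be enabled for a model.

    Assumption: requesting torchair for an unsupported model is an error.
    """
    if torchair_requested and not is_torchair_supported(model_type):
        raise ValueError(
            f"torchair graph mode is not supported for model type "
            f"{model_type!r}; supported families: {SUPPORTED_TORCHAIR_MODELS}"
        )
    return torchair_requested
```

With this sketch, `resolve_torchair("PanguProMoE", True)` returns `True`, while the same call with an unlisted model type such as `"llama"` raises `ValueError`.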

### How was this patch tested?
We have tested it with PanguProMoE on both 800IA2 and 300I Duo platforms,
and the model generates answers normally.

---------

Signed-off-by: angazenn <zengyanjia@huawei.com>
Signed-off-by: tianyitang <tangtianyi4@huawei.com>
Co-authored-by: angazenn <zengyanjia@huawei.com>
Co-authored-by: tianyitang <tangtianyi4@huawei.com>

Author: Angazenn, 2025-07-03 22:21:42 +08:00, committed by GitHub

parent 9fbd8017c0

commit a5f33590d3

19 changed files with 1130 additions and 84 deletions

format.sh (2 lines changed)

@@ -145,7 +145,7 @@ CODESPELL_EXCLUDES=(
 )
 CODESPELL_IGNORE_WORDS=(
-    '-L' 'CANN,cann,NNAL,nnal,ASCEND,ascend,EnQue,CopyIn,assertIn'
+    '-L' 'CANN,cann,NNAL,nnal,ASCEND,ascend,EnQue,CopyIn,assertIn,rever'
 )
 # check spelling of specified files
