[aclgraph] implentment NPUPiecewiseBackend to enable aclgraph (#836)

### What this PR does / why we need it?
1. Implentment `NPUPiecewiseBackend` to enable aclgraph
2. Eable aclgraph by default in V1, but raise error when running
deepseek and raise warning when running models except for qwen

### How was this patch tested?
CI pass with the new ut

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
This commit is contained in:
Mengqing Cao
2025-05-29 11:58:26 +08:00
committed by GitHub
parent cc74b97f74
commit a93bed4535
8 changed files with 380 additions and 33 deletions

View File

@@ -33,7 +33,6 @@ class dummyFusionOp:
def register_dummy_fusion_op() -> None:
torch.cuda.CUDAGraph = torch.npu.NPUGraph
torch.ops._C.rms_norm = dummyFusionOp(name="rms_norm")
torch.ops._C.fused_add_rms_norm = dummyFusionOp(name="fused_add_rms_norm")
torch.ops._C.static_scaled_fp8_quant = dummyFusionOp(