xc-llm-ascend

Files

无脸男 03679cf1d3 [Bugfix] fix the precision issues that may arise from the inter-layer reuse of the workspace in certain scenarios (#5522)

### What this PR does / why we need it?

In the current attention update implementation, the FIA operator shares
a single workspace across layers within the same computation graph, and
memory reuse is enabled through the weak_ref_tensor mechanism. However,
this sharing may cause precision anomalies in certain scenarios. To
address this, each layer in the same computation graph is now assigned
its own independent workspace.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.13.0
- vLLM main: 45c1ca1ca1

Signed-off-by: WithHades <244036962@qq.com>

2025-12-31 16:54:04 +08:00

npugraph_ex_passes

[Feature] Support npugraph_ex backend (#4700)

2025-12-10 20:48:05 +08:00

passes

[Misc] Cleanup useless print and logger (#5220)

2025-12-22 11:28:26 +08:00

__init__.py

[Bugfix] add compilation/__init__.py to fix import error (#1152)

2025-06-10 17:14:25 +08:00

acl_graph.py

[Bugfix] fix the precision issues that may arise from the inter-layer reuse of the workspace in certain scenarios (#5522)