[misc] move mxfp_compat into device to decouple from quantization init chain (#6918)

### What this PR does / why we need it?
`mxfp_compat` only provides dtype/symbol compatibility helpers for
different `torch_npu` versions, but it was placed under
`vllm_ascend.quantization`. Importing it from device/ops paths could
trigger `quantization/__init__.py` and pull in heavy quantization method
dependencies, increasing startup coupling and causing import-cycle risk
(especially on 310P paths).
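As a rough illustration of the pattern such a compat module implements (a sketch only; the helper and symbol names below are assumptions, and `fake_torch` stands in for `torch`/`torch_npu` so the example is self-contained):

```python
from types import SimpleNamespace

# Stand-in for the torch/torch_npu module; a real compat shim would
# probe the imported torch_npu build instead. (Hypothetical names.)
fake_torch = SimpleNamespace(float8_e8m0fnu="float8_e8m0fnu")

def resolve_dtype(mod, name, fallback):
    # getattr with a default lets older builds that lack the newer
    # dtype symbol fall back gracefully instead of raising.
    return getattr(mod, name, fallback)

FLOAT8_E8M0FNU_DTYPE = resolve_dtype(fake_torch, "float8_e8m0fnu", "uint8")
HIFLOAT8_DTYPE = resolve_dtype(fake_torch, "hifloat8", "uint8")  # absent -> fallback
```

Because helpers like these only touch attribute lookups, they have no reason to live under `quantization` and drag in that package's `__init__` on import.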

### Does this PR introduce _any_ user-facing change?
No functional behavior change intended.

### How was this patch tested?
CI passed.

- vLLM version: v0.16.0
- vLLM main: 15d76f74e2

---------

Signed-off-by: linfeng-yuan <1102311262@qq.com>
Commit 68d8d20ca2 by linfeng-yuan, 2026-03-02 18:17:01 +08:00 (committed by GitHub)
Parent: 632801b0ad
6 changed files with 7 additions and 7 deletions


```diff
@@ -18,7 +18,7 @@
 import torch
 import torch_npu
-from vllm_ascend.quantization.mxfp_compat import (
+from vllm_ascend.device.mxfp_compat import (
     FLOAT4_E2M1FN_X2_DTYPE,
     FLOAT8_E8M0FNU_DTYPE,
     HIFLOAT8_DTYPE,
```