[Quantization] register AscendQuantRMSNorm for quantization (#2856)
### What this PR does / why we need it?
modelslim will generate self.bias for rms norm in quantization, since
RMSNorm in vllm has no this parameter, so its nesscesary
to create a AscendQuantRmsNorm.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
tested by deepseek-v3.1-w8a8
<img width="2496" height="592" alt="image"
src="https://github.com/user-attachments/assets/004c6e76-3d7a-4a1f-b59f-a14304012663"
/>
- vLLM version: main
- vLLM main:
d6249d0699
Signed-off-by: 22dimensions <waitingwind@foxmail.com>
This commit is contained in:
@@ -83,7 +83,7 @@ class NPUWorker(WorkerBase):
|
||||
from vllm_ascend import ops
|
||||
ops.register_dummy_fusion_op()
|
||||
_register_atb_extensions()
|
||||
register_ascend_customop()
|
||||
register_ascend_customop(vllm_config)
|
||||
# init ascend config and soc version
|
||||
init_ascend_config(vllm_config)
|
||||
init_ascend_soc_version()
|
||||
|
||||
Reference in New Issue
Block a user