Support OCP MXFP4 quantization on AMD GPUs (#8255)

Co-authored-by: wunhuang <wunhuang@amd.com>
Co-authored-by: Hubert Lu <Hubert.Lu@amd.com>
kk
2025-08-05 09:14:52 +08:00
committed by GitHub
parent 7cb20754fa
commit d4bf5a8524
12 changed files with 1159 additions and 1 deletions

@@ -813,6 +813,7 @@ class ServerArgs:
                 "moe_wna16",
                 "qoq",
                 "w4afp8",
+                "mxfp4",
             ],
             help="The quantization method.",
         )
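
Note: the hunk above only extends an argparse `choices` list, so the practical effect is that the quantization flag now accepts `mxfp4` instead of rejecting it at parse time. A minimal sketch of that behavior follows. It is not the actual `ServerArgs` code: the flag name `--quantization`, the `None` default, and the helpers `build_parser` and `parse_quantization` are assumptions made for illustration, and only the choices visible in the hunk are listed.

```python
import argparse

# Only the choices visible in the hunk; the real list is longer (elided here).
QUANTIZATION_CHOICES = ["moe_wna16", "qoq", "w4afp8", "mxfp4"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the parser that ServerArgs populates."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--quantization",  # assumed flag name, not shown in the hunk
        type=str,
        default=None,  # assumed default
        choices=QUANTIZATION_CHOICES,
        help="The quantization method.",
    )
    return parser


def parse_quantization(argv):
    """Parse argv and return the selected quantization method, or None."""
    return build_parser().parse_args(argv).quantization


if __name__ == "__main__":
    # Accepted now that "mxfp4" is in the choices list.
    print(parse_quantization(["--quantization", "mxfp4"]))
```

With `choices` set, argparse rejects any value outside the list by printing a usage error and raising `SystemExit`, which is why a new method has to be registered here before it can be selected.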