Support NVFP4 quantized dense models on AMD CDNA2/CDNA3 GPUs (#7302)
Co-authored-by: HAI <hixiao@gmail.com>
Co-authored-by: Sai Enduri <saimanas.enduri@amd.com>
@@ -766,6 +766,7 @@ class ServerArgs:
     "gguf",
     "modelopt",
     "modelopt_fp4",
+    "petit_nvfp4",
     "w8a8_int8",
     "w8a8_fp8",
     "moe_wna16",
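For illustration only, here is a minimal sketch of how a choices list like the one in this hunk typically gates a command-line option. It is not the project's actual code: the flag name `--quantization`, the `QUANTIZATION_CHOICES` constant, and the `build_parser` helper are assumptions, and the list contains only the seven entries visible in the hunk, not the full set.

```python
import argparse

# Hypothetical subset: only the entries visible in the hunk above.
QUANTIZATION_CHOICES = [
    "gguf",
    "modelopt",
    "modelopt_fp4",
    "petit_nvfp4",  # the entry this commit appears to add
    "w8a8_int8",
    "w8a8_fp8",
    "moe_wna16",
]


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser; the flag name is an assumption, not taken from the diff."""
    parser = argparse.ArgumentParser(prog="demo-server")
    parser.add_argument(
        "--quantization",
        type=str,
        default=None,
        choices=QUANTIZATION_CHOICES,
        help="Quantization method to use (illustrative subset).",
    )
    return parser


if __name__ == "__main__":
    # Accepted because the value is in the choices list.
    args = build_parser().parse_args(["--quantization", "petit_nvfp4"])
    print(args.quantization)
```

With a setup like this, a value missing from the list is rejected by argparse before any model loading starts, which is presumably why the new method name has to be registered in the list.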