feat: support DeepSeek-R1-W4AFP8 model with ep-moe mode (#7762)
Signed-off-by: yangsijia.614 <yangsijia.614@bytedance.com>
@@ -708,6 +708,7 @@ class ServerArgs:
         "w8a8_fp8",
         "moe_wna16",
         "qoq",
+        "w4afp8",
     ],
     help="The quantization method.",
 )
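
The hunk above adds `"w4afp8"` to a list of accepted quantization names. A minimal sketch of how such a choices list behaves with `argparse` follows. The flag name `--quantization`, the `build_parser` helper, and the shortened choices list are assumptions for illustration and are not taken from the diff; only the four method names and the help string appear in the hunk.

```python
import argparse

# Illustrative subset: only the four names visible in the hunk.
# The real list in ServerArgs is longer and is not reproduced here.
QUANTIZATION_CHOICES = ["w8a8_fp8", "moe_wna16", "qoq", "w4afp8"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single quantization option (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--quantization",  # assumed flag name; the hunk does not show it
        type=str,
        default=None,
        choices=QUANTIZATION_CHOICES,
        help="The quantization method.",
    )
    return parser


# With "w4afp8" in the choices list, the value is accepted by the parser.
args = build_parser().parse_args(["--quantization", "w4afp8"])
print(args.quantization)
```

A value outside the list makes `argparse` exit with a usage error, which is why a new method name has to be added to the choices before it can be selected.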