[feat] support minimum token load balance in dp attention (#7379)
@@ -1171,6 +1171,7 @@ class ServerArgs:
             choices=[
                 "round_robin",
                 "shortest_queue",
+                "minimum_tokens",
             ],
         )
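The hunk above only registers "minimum_tokens" as a permitted choice; the dispatch logic is not part of this excerpt. As a minimal sketch of how the three strategies might differ, assuming each data-parallel worker reports a queue length and a count of outstanding tokens: the `Worker` class, its fields, and `pick_worker` are hypothetical names for illustration, not SGLang's implementation.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical load snapshot for one DP worker (not an SGLang type)."""
    rank: int
    queued_requests: int  # assumed: number of requests waiting on this worker
    pending_tokens: int   # assumed: total tokens still to be processed


def pick_worker(method: str, workers: list[Worker], step: int = 0) -> int:
    """Return the rank of the worker chosen by the given strategy."""
    if method == "round_robin":
        # Cycle through workers regardless of load; `step` is the request index.
        return workers[step % len(workers)].rank
    if method == "shortest_queue":
        # Fewest waiting requests, ignoring how long each request is.
        return min(workers, key=lambda w: w.queued_requests).rank
    if method == "minimum_tokens":
        # Fewest outstanding tokens, so one long request counts for more.
        return min(workers, key=lambda w: w.pending_tokens).rank
    raise ValueError(f"unknown load balance method: {method}")


workers = [Worker(0, 1, 900), Worker(1, 3, 200)]
print(pick_worker("shortest_queue", workers))
print(pick_worker("minimum_tokens", workers))
```

In this sketch the two load-aware strategies disagree: worker 0 has the shorter queue but holds one long request, so a token-based policy prefers worker 1.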