Sync from v0.13
@@ -8,3 +8,5 @@ the JSON file contains a mapping from M (batch size) to the chosen configuration
The example configurations provided are for the Mixtral model for TP2 on H100
and TP4 on A100. Mixtral has intermediate size N = 14336, i.e. for TP2 we have
N = 7168 and for TP4 we have N = 3584.

See `benchmark/kernels/benchmark_moe.py` on how to generate these config files.
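As a rough illustration of the text above, the sketch below reproduces the per-shard arithmetic (N divided by the tensor-parallel degree) and shows one plausible way to look up a configuration by batch size M. The helper names, the nearest-key lookup rule, and the field names and values in the sample JSON are assumptions made for this example; they are not taken from the repository.

```python
import json

# Intermediate size stated in the README text for Mixtral.
MIXTRAL_INTERMEDIATE_SIZE = 14336


def shard_intermediate_size(intermediate_size: int, tp_size: int) -> int:
    """Per-GPU intermediate size N when the layer is split across tp_size GPUs."""
    if intermediate_size % tp_size != 0:
        raise ValueError("intermediate size must be divisible by the TP degree")
    return intermediate_size // tp_size


# Hypothetical config file content: keys are batch sizes M, values are
# kernel configurations. Field names and values are placeholders.
EXAMPLE_JSON = """
{
  "1":   {"BLOCK_SIZE_M": 16,  "num_warps": 4},
  "64":  {"BLOCK_SIZE_M": 64,  "num_warps": 4},
  "256": {"BLOCK_SIZE_M": 128, "num_warps": 8}
}
"""


def load_configs(text: str) -> dict:
    """Parse the JSON mapping and convert the string keys to integer batch sizes."""
    return {int(m): cfg for m, cfg in json.loads(text).items()}


def pick_config(configs: dict, m: int) -> dict:
    """Return the entry whose batch-size key is closest to m (assumed lookup rule)."""
    closest = min(configs, key=lambda key: abs(key - m))
    return configs[closest]


if __name__ == "__main__":
    for tp in (2, 4):
        n = shard_intermediate_size(MIXTRAL_INTERMEDIATE_SIZE, tp)
        print(f"TP{tp}: N = {n}")
    print(pick_config(load_configs(EXAMPLE_JSON), 48))
```

The nearest-key rule is only one possible choice; a real loader might instead require an exact match or round up to the next tuned batch size.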