AITER backend extension and workload optimizations (#6838)
Co-authored-by: wunhuang <wunhuang@amd.com>
Co-authored-by: Hubert Lu <Hubert.Lu@amd.com>
@@ -53,7 +53,7 @@ SGLang supports various environment variables that can be used to configure its
 | Environment Variable | Description | Default Value |
 | --- | --- | --- |
-| `SGLANG_AITER_MOE` | Use AITER MOE implementation | `false` |
+| `SGLANG_USE_AITER` | Use AITER optimize implementation | `false` |
 | `SGLANG_INT4_WEIGHT` | Enable INT4 weight quantization | `false` |
 | `SGLANG_MOE_PADDING` | Enable MoE padding (sets padding size to 128 if value is `1`, often set to `1` in Docker builds) | `0` |
 | `SGLANG_FORCE_FP8_MARLIN` | Force using FP8 MARLIN kernels even if other FP8 kernels are available | `false` |
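The hunk above appears to replace `SGLANG_AITER_MOE` with `SGLANG_USE_AITER` in the documented table. As a minimal sketch of how a launch script might read such a boolean flag, the Python below parses the variable from the environment. The helpers `env_flag` and `use_aiter` are hypothetical names introduced for illustration, and the fallback to the old variable name is an assumption, not confirmed SGLang behavior.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Hypothetical helper: interpret an environment variable as a boolean."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    # Accept a few common truthy spellings; anything else counts as false.
    return raw.strip().lower() in ("1", "true", "yes", "on")


def use_aiter() -> bool:
    """Hypothetical check: prefer the new name, fall back to the old one.

    The fallback is an illustrative assumption for scripts that still
    export SGLANG_AITER_MOE; SGLang itself may not honor the old name.
    """
    return env_flag("SGLANG_USE_AITER") or env_flag("SGLANG_AITER_MOE")
```

A script could call `use_aiter()` once at startup and warn when only the old name is set, so that deployments migrate to `SGLANG_USE_AITER`.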