[NVIDIA] [3/N] Nvfp4 Masked Gemm: Add flashinfer grouped_gemm_nt_masked (#9199)

Shu Wang
2025-09-11 22:18:43 -05:00
committed by GitHub
parent 7b141f816c
commit 3df05f4d6a
11 changed files with 694 additions and 5 deletions

@@ -40,6 +40,11 @@ SGLang supports various environment variables that can be used to configure its
| `SGL_DG_USE_NVRTC` | Use NVRTC (instead of Triton) for JIT compilation (Experimental) | `"0"` |
| `SGL_USE_DEEPGEMM_BMM` | Use DeepGEMM for Batched Matrix Multiplication (BMM) operations | `"false"` |

## DeepEP Configuration

| Environment Variable | Description | Default Value |
| --- | --- | --- |
| `SGLANG_DEEPEP_BF16_DISPATCH` | Use Bfloat16 for dispatch | `"false"` |
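
As an illustration of how a string-valued flag such as `SGLANG_DEEPEP_BF16_DISPATCH` might be set and read, here is a minimal sketch. The `env_flag` helper and its list of accepted truthy spellings are assumptions made for this example and are not taken from the SGLang source; only the variable name and its `"false"` default come from the table above.

```python
import os


def env_flag(name: str, default: str = "false") -> bool:
    """Interpret an environment variable as a boolean.

    Hypothetical helper for illustration; SGLang's own parsing may differ.
    """
    value = os.environ.get(name, default)
    # Assumed truthy spellings; anything else is treated as disabled.
    return value.strip().lower() in ("1", "true", "yes")


# Opt in to BF16 dispatch before the process reads its configuration.
os.environ["SGLANG_DEEPEP_BF16_DISPATCH"] = "true"
use_bf16_dispatch = env_flag("SGLANG_DEEPEP_BF16_DISPATCH")
```

In practice the variable would usually be exported in the shell that launches the server, so that it is already present in the environment when the process starts.
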
## Memory Management
| Environment Variable | Description | Default Value |