[3rdparty, document] Add 3rdparty/amd, with profiling and tuning instructions to be added (#1822)
10
3rdparty/amd/profiling/PROFILING.md
vendored
Normal file
@@ -0,0 +1,10 @@
## Profiling the SGLang Inference System with AMD GPUs
This AppNote describes SGLang profiling techniques, code augmentation, and running steps for systems with AMD Instinct GPUs. The same procedure may also work with Nvidia GPUs.

Examples and steps are provided in detail so that they are easy to reproduce and can be used to localize performance problems for optimization.

Two primary methods are covered:
- [RPD](https://github.com/ROCm/rocmProfileData.git)
- [Torch Profiler](https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html)
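
As a rough illustration of the Torch Profiler method listed above, the sketch below profiles a small stand-in workload and exports a Chrome trace. The helper `profile_callable`, the `toy_workload` function, and the trace file name are hypothetical and are not part of SGLang or of this commit. The sketch assumes a PyTorch build where `torch.profiler` is available; on ROCm builds, AMD GPUs are exposed through the `torch.cuda` API, so the same activity flag is used.

```python
import os
import tempfile

import torch
from torch.profiler import ProfilerActivity, profile, record_function


def profile_callable(fn, trace_path, label="workload"):
    """Run fn() once under the Torch Profiler and export a Chrome trace.

    Hypothetical helper for illustration; not an SGLang API.
    """
    activities = [ProfilerActivity.CPU]
    # ROCm builds of PyTorch report AMD GPUs via torch.cuda.
    if torch.cuda.is_available():
        activities.append(ProfilerActivity.CUDA)

    with profile(activities=activities, record_shapes=True) as prof:
        with record_function(label):
            fn()
        if torch.cuda.is_available():
            # Wait for queued GPU work so it is captured in the trace.
            torch.cuda.synchronize()

    prof.export_chrome_trace(trace_path)
    return prof.key_averages().table(sort_by="cpu_time_total", row_limit=10)


def toy_workload():
    # Stand-in for an inference step (assumption); replace with real work.
    a = torch.randn(256, 256)
    b = torch.randn(256, 256)
    return a @ b


trace_file = os.path.join(tempfile.gettempdir(), "sglang_profile_sketch.json")
summary = profile_callable(toy_workload, trace_file)
print(summary)
```

The exported JSON file can be opened in a Chrome-trace-compatible viewer, and the printed table gives a quick ranking of operators by CPU time.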
13
3rdparty/amd/tuning/TUNING.md
vendored
Normal file
@@ -0,0 +1,13 @@
## Tuning the SGLang Inference System with AMD GPUs
This AppNote describes SGLang performance tuning techniques, the code harness, and running steps for systems with AMD Instinct GPUs.

Harness code, examples, and steps are provided in detail so that they are easy to reproduce and can be used to tune performance for specific workloads.

Three primary runtime areas are covered:
- Triton Kernels
- Torch Tunable Ops
- Torch Compile
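
For the Torch Tunable Ops item above, one possible workflow is to run once with tuning enabled and then reuse the recorded results in later runs. The sketch below only builds the environment for such runs using PyTorch's TunableOp environment variables (`PYTORCH_TUNABLEOP_ENABLED`, `PYTORCH_TUNABLEOP_TUNING`, `PYTORCH_TUNABLEOP_FILENAME`). The helper name `tunableop_env` and the results file name are hypothetical, and the command that launches SGLang is left out because this commit does not specify it.

```python
import os


def tunableop_env(results_file, tuning=True, base_env=None):
    """Return an environment dict that enables PyTorch TunableOp.

    Hypothetical helper for illustration; not an SGLang API.
    tuning=True  -> search for the best kernels and record them.
    tuning=False -> reuse previously recorded results without searching.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PYTORCH_TUNABLEOP_ENABLED"] = "1"
    env["PYTORCH_TUNABLEOP_TUNING"] = "1" if tuning else "0"
    env["PYTORCH_TUNABLEOP_FILENAME"] = results_file
    return env


# First pass: tune and write results. Second pass: replay them.
tune_env = tunableop_env("tunableop_results.csv", tuning=True, base_env={})
replay_env = tunableop_env("tunableop_results.csv", tuning=False, base_env={})
print(tune_env)
print(replay_env)
```

Either dict can be passed as the `env=` argument of `subprocess.run` when starting the process to be tuned.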