Improve tensor parallel performance (#625)
Co-authored-by: Mingyi <wisclmy0611@gmail.com>
@@ -2,11 +2,10 @@
- `backend`: Various backends for the language interpreter.
- `lang`: The frontend language.
-- `srt`: The runtime for running local models.
+- `srt`: The serving engine for running local models. (SRT = SGLang Runtime).
- `test`: Test utilities.
- `api.py`: Public API.
- `bench_latency.py`: Benchmark utilities.
- `global_config.py`: The global configs and constants.
- `launch_server.py`: The entry point of launching local server.
- `utils.py`: Common utilities.