[Fix] Reduce memory usage for loading llava model & Remove EntryClassRemapping (#1308)

Lianmin Zheng
2024-09-02 21:44:45 -07:00
committed by GitHub
parent a5a134f39f
commit f64eae3a29
17 changed files with 105 additions and 158 deletions

@@ -1,6 +1,6 @@
"""
Usage:
-python3 -m sglang.launch_server --model-path /model/llama-classification
+python3 -m sglang.launch_server --disable-cuda-graph --model-path /model/llama-classification
python3 test_httpserver_classify.py
"""