sglang/srt at 8fbba3de3ddf4384b29bb1c582d837ecd8c08916

EngineX-Hygon/sglang

Ying Sheng 8fbba3de3d Fix bugs (fp8 checkpoints, triton cache manager) (#729)

2024-07-25 07:42:00 -07:00

Format (#593)

2024-07-05 10:06:17 -07:00

Reduce hardcoded logic of kernel usage (#707)

2024-07-23 16:42:21 -07:00

Fix bugs (fp8 checkpoints, triton cache manager) (#729)

2024-07-25 07:42:00 -07:00

Refactor model loader: initial refactor (#664)

2024-07-20 02:18:22 -07:00

fix: llama 3.1 405b fp8 (#714)

2024-07-24 09:37:41 -07:00

Update OpenAI API (#667)

2024-07-19 23:20:54 -07:00

conversation.py

Update OpenAI API (#667)

2024-07-19 23:20:54 -07:00

flush_cache.py

Improve doc strings (#518)

2024-06-08 02:39:32 -07:00

hf_transformers_utils.py

Update vllm version to support llama3.1 (#705)

2024-07-23 13:49:34 -07:00

memory_pool.py

Fix prefill size (#711)

2024-07-24 03:41:15 -07:00

mm_utils.py

Handle grayscale images in expand2square (#97)

2024-01-24 16:23:11 -08:00

model_config.py

Fix Llava model (#594)

2024-07-06 00:58:46 -07:00

sampling_params.py

Add support for OpenAI API parallel sampling (#640)

2024-07-19 23:10:01 -07:00

server_args.py

Use min new token ratio at start (#701)

2024-07-23 11:52:50 -07:00

server.py

Fix bugs (fp8 checkpoints, triton cache manager) (#729)

2024-07-25 07:42:00 -07:00

utils.py

Fix bugs (fp8 checkpoints, triton cache manager) (#729)

2024-07-25 07:42:00 -07:00