Fix mem fraction static for nightly tests (#11076)

Lianmin Zheng
2025-09-29 12:57:41 -07:00
committed by GitHub
parent 4eeaff74a0
commit dda34c2f93
8 changed files with 24 additions and 22 deletions

@@ -23,7 +23,7 @@ The case of a server being too conservative can happen when users send many requ
On the other hand, if you see `token usage` very high and you frequently see warnings like
`KV cache pool is full. Retract requests. #retracted_reqs: 1, #new_token_ratio: 0.9998 -> 1.0000`, you can increase `--schedule-conservativeness` to a value like 1.3.
-If you see `KV cache pool is full. Retract requests.` occasionally but not frequently, it is okay.
+If you see `KV cache pool is full. Retract requests.` occasionally but not frequently (~1 time per minute), it is okay.
### Tune `--mem-fraction-static` to increase KV cache pool capacity
SGLang allocates memory as follows: