2 Commits

Author SHA1 Message Date
wangxiaoteng666
ee6d141dd4 [MAIN][BUGFIX] BugFix: Resolve the issue of waiting queue accumulation when requests are canceled. (#2426)
### What this PR does / why we need it?
Resolve the issue of waiting queue accumulation when requests are
canceled.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By CI.


- vLLM version: v0.10.1.1
- vLLM main: 006477e60b

---------

Signed-off-by: wangxiaoteng666 <wangxiaoteng@huawei.com>
2025-08-29 17:19:23 +08:00
Pleaplusone
4b3a210c33 Implementation of simple load balance routing proxy server (#1953) (#2124)
### What this PR does / why we need it?
This PR is a cherry-pick from v0.9.1:
https://github.com/vllm-project/vllm-ascend/pull/1953

This PR introduces a new load-balancing proxy server example for
disaggregated prefill/decode (PD). It supports a simple token- and
kv_cache-aware load-balancing routing strategy for the disaggregated PD
system, in contrast to the original round-robin toy_proxy.
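
To illustrate the difference between round-robin and load-aware routing, here is a minimal sketch. It is not the code from this PR: the `Server` class, its `active_tokens` field, and the `route`/`release` helpers are hypothetical names chosen for illustration, and only token load is modeled (KV-cache usage could be tracked the same way).

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Server:
    """Hypothetical backend record; the field names are illustrative only."""
    name: str
    active_tokens: int = 0  # tokens of requests currently in flight


def pick_round_robin(rotation):
    """Baseline: take the next server in a fixed rotation, ignoring load."""
    return next(rotation)


def pick_least_loaded(servers):
    """Load-aware: choose the server with the fewest in-flight tokens."""
    return min(servers, key=lambda s: s.active_tokens)


def route(servers, prompt_tokens):
    """Assign a request to the least-loaded server and record its cost."""
    server = pick_least_loaded(servers)
    server.active_tokens += prompt_tokens
    return server


def release(server, prompt_tokens):
    """Give the tokens back once the request finishes or is cancelled."""
    server.active_tokens -= prompt_tokens


servers = [Server("prefill-0"), Server("prefill-1")]
first = route(servers, prompt_tokens=4000)  # long prompt
second = route(servers, prompt_tokens=100)  # short prompt
third = route(servers, prompt_tokens=100)   # short prompt

# Round-robin over the same three requests, for comparison.
rotation = cycle(["prefill-0", "prefill-1"])
round_robin_order = [pick_round_robin(rotation) for _ in range(3)]
```

In this sketch the third request goes to `prefill-1`, because `prefill-0` is still busy with the long prompt, whereas the round-robin rotation would send it back to `prefill-0`.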

### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Tested on a real workload and with unit tests.

- vLLM version: v0.10.0
- vLLM main: ad57f23f6a

---------

Signed-off-by: ganyi <pleaplusone.gy@gmail.com>
2025-08-04 10:35:53 +08:00