[v0.18.0][BugFix] Fix DSV3.1 W4A8 TTFT degradation (#8674)

### What this PR does / why we need it?
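Fix the TTFT (time to first token) degradation on DeepSeek-V3.1-W4A8 by
reverting the `balance_flag` change made in
https://github.com/vllm-project/vllm-ascend/pull/7611. The condition is
restored to `max(...) == self.max_num_running_reqs`.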

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- vLLM version: v0.18.0

Signed-off-by: Wangbingjie <wangbj1207@126.com>
This commit is contained in:
wangbj127
2026-04-27 23:27:34 +08:00
committed by GitHub
parent 0cc76860d5
commit 9fd01a52c0

@@ -266,7 +266,7 @@ class BalanceScheduler(Scheduler):
 if len(self.running) == self.max_num_running_reqs:
     break
-balance_flag = max(t.item() for t in self.balance_queue) >= self.max_num_running_reqs - 1
+balance_flag = max(t.item() for t in self.balance_queue) == self.max_num_running_reqs
 if balance_flag:
     break
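
To make the effect of the two conditions in this hunk concrete, here is a minimal sketch that compares them on plain integers. It assumes `self.balance_queue` holds one running-request count per entry (read via `.item()` in the real code). The function names `balance_flag_strict` and `balance_flag_early` and the sample values are hypothetical and not part of vllm-ascend.

```python
def balance_flag_strict(running_counts, max_num_running_reqs):
    """Condition using `==`: true only when the largest count equals the limit."""
    return max(running_counts) == max_num_running_reqs


def balance_flag_early(running_counts, max_num_running_reqs):
    """Condition using `>= limit - 1`: true one request before the limit."""
    return max(running_counts) >= max_num_running_reqs - 1


# Hypothetical counts: the busiest entry is one request short of the limit.
counts = [3, 7, 5]
limit = 8

print(balance_flag_strict(counts, limit))  # False: scheduling continues
print(balance_flag_early(counts, limit))   # True: the loop would break here
```

With these sample values only the `>= limit - 1` form triggers the `break`, so it stops admitting requests one slot earlier than the `==` form. Whether this fully explains the TTFT regression is not stated in the PR.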