[Bugfix] fix bug when tp=1 (#3193)
### What this PR does / why we need it?
Fixes a bug in DenseOptimRowParallelOp that occurs when tensor
parallelism is not used (tp=1).
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- vLLM version: v0.10.2
- vLLM main: 52d0cb8458
@@ -390,7 +390,9 @@ class SequenceRowParallelOp(CustomRowParallelOp):
         bias_ = None if (self.tp_rank > 0 or self.skip_bias_add) else self.bias
 
         if self.tp_size == 1 or not self.reduce_results:
-            output = self.quant_method.apply(self, input_parallel, bias=bias_)
+            output = self.quant_method.apply(self.layer,
+                                             input_parallel,
+                                             bias=bias_)
         else:
             output_parallel = self.quant_method.apply(self.layer,
                                                       input_parallel,
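The change above makes the `tp_size == 1` branch pass `self.layer` to `quant_method.apply`, as the `else` branch already did. A minimal sketch of why this could matter follows. It assumes the quant method reads parameters such as `weight` from the object it is given, and that the op wrapper does not carry those parameters itself. Neither assumption is confirmed by this commit. `FakeLayer`, `FakeQuantMethod`, and `FakeRowParallelOp` are hypothetical stand-ins, not the real classes.

```python
# Hypothetical stand-ins for illustration only; not the real vLLM Ascend types.
class FakeLayer:
    def __init__(self, weight):
        # Assumption: the parameters live on the layer, not on the op wrapper.
        self.weight = weight


class FakeQuantMethod:
    def apply(self, layer, x, bias=None):
        # Reads the weight from whatever object it is handed as `layer`.
        out = [sum(w * xi for w, xi in zip(row, x)) for row in layer.weight]
        if bias is not None:
            out = [o + b for o, b in zip(out, bias)]
        return out


class FakeRowParallelOp:
    def __init__(self, layer):
        self.layer = layer
        self.quant_method = FakeQuantMethod()

    def apply_buggy(self, x):
        # Mirrors the removed line: the op itself is passed as the layer.
        return self.quant_method.apply(self, x)

    def apply_fixed(self, x):
        # Mirrors the added lines: the wrapped layer is passed.
        return self.quant_method.apply(self.layer, x)


op = FakeRowParallelOp(FakeLayer([[1.0, 2.0], [3.0, 4.0]]))
fixed = op.apply_fixed([1.0, 1.0])

try:
    op.apply_buggy([1.0, 1.0])
    buggy_error = None
except AttributeError as exc:
    # The op wrapper has no `weight` attribute in this sketch.
    buggy_error = exc
```

In this sketch the buggy call fails only on the path that passes `self`, which would match a failure that appears only when `tp_size == 1` or `reduce_results` is false.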