[Bugfix] Fix out-of-bounds access to token_id due to uninitialized logprobs (#4248)
### What this PR does / why we need it?
The logprobs_tensor was not initialized before its token_id member was
accessed, so tokenizer.decode() could be called with a negative token_id,
which caused a crash.
### How was this patch tested?
Constructed an inference request with two prompts and set
SamplingParams(prompt_logprobs=<non-None value>) (e.g.,
prompt_logprobs=1).
After applying the fix (proper initialization of logprobs_tensor), the
same request completed successfully without errors, and the returned
logprobs matched expected values.
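
The test request described above could be built roughly as follows. This is a minimal sketch, not the script used for this PR: the prompt strings, the `max_tokens` value, and the helper names `build_request` and `run` are assumptions for illustration, and the model name is left to the caller.

```python
def build_request():
    """Return two prompts and the sampling settings for the check."""
    # Placeholder prompts (assumption); any two prompts should do.
    prompts = ["Hello, my name is", "The capital of France is"]
    # prompt_logprobs must be non-None to exercise the prompt-logprobs path.
    sampling_kwargs = {"prompt_logprobs": 1, "max_tokens": 8}
    return prompts, sampling_kwargs


def run(model_name):
    """Send the request through vLLM and print the prompt logprobs."""
    # Imported lazily so build_request() can be used without vLLM installed.
    from vllm import LLM, SamplingParams

    prompts, sampling_kwargs = build_request()
    llm = LLM(model=model_name)
    outputs = llm.generate(prompts, SamplingParams(**sampling_kwargs))
    for out in outputs:
        # Each RequestOutput carries the per-token prompt logprobs.
        print(out.prompt, out.prompt_logprobs)
    return outputs
```

Calling `run(...)` with a locally available model should complete without errors once the fix is applied.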
- vLLM version: v0.12.0
- vLLM main: ad32e3e19c
Signed-off-by: jiangweixiang <jwx02384838@antgroup.com>
Co-authored-by: jiangweixiang <jwx02384838@antgroup.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
@@ -4276,6 +4276,7 @@ class NPUModelRunner(LoRAModelRunnerMixin, ECConnectorModelRunnerMixin):
 else:
 # This is the last chunk of prompt tokens to return.
 num_logits = num_remaining_tokens
+if num_logits > 0:
 completed_prefill_reqs.append(req_id)
 prompt_logprobs_dict[req_id] = logprobs_tensors
 