xc-llm-ascend


Mengqing Cao 5f4391652f [PromptLogprobs][V1] Support prompt logprobs to fix ceval accuracy in V1 (#1483)

### What this PR does / why we need it?
Support prompt logprobs in V1. This also enables lm_eval to test accuracy
on V1.

### Does this PR introduce _any_ user-facing change?
Prompt logprobs output is now supported.

### How was this patch tested?
CI passed, including the accuracy test.

Tested with lm_eval, which uses prompt logprobs as output to measure
accuracy:
```bash
VLLM_USE_V1=1 lm_eval \
  --model vllm \
  --model_args pretrained=Qwen/Qwen2.5-7B-Instruct,max_model_len=4096,block_size=4 \
  --tasks ceval-valid_computer_network \
  --batch_size 8
```
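
For context, a multiple-choice harness of this kind typically scores each candidate answer by summing the log-probabilities of the answer tokens given the question, which is why prompt logprobs are required. The snippet below is a minimal sketch of that idea with made-up numbers. The function names (`continuation_logprob`, `pick_choice`) and the data are hypothetical and are not lm_eval's or vLLM's API.

```python
from typing import List, Sequence, Tuple


def continuation_logprob(token_logprobs: Sequence[float], context_len: int) -> float:
    """Sum the log-probs of the tokens that follow the first `context_len` tokens."""
    return sum(token_logprobs[context_len:])


def pick_choice(
    per_choice_logprobs: Sequence[Sequence[float]], context_len: int
) -> Tuple[int, List[float]]:
    """Return the index of the highest-scoring choice and all scores."""
    scores = [continuation_logprob(lp, context_len) for lp in per_choice_logprobs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical per-token log-probs for "question + answer" prompts.
# The first 3 entries belong to the shared question (context), the rest
# to each candidate answer. Values are illustrative only.
example = [
    [-1.0, -0.5, -0.2, -2.3],        # choice A
    [-1.0, -0.5, -0.2, -0.4],        # choice B
    [-1.0, -0.5, -0.2, -1.5, -0.9],  # choice C (two answer tokens)
]

best_index, scores = pick_choice(example, context_len=3)
print(best_index, scores)
```

Only the answer tokens contribute to the score, so the shared question tokens do not affect which choice wins.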
After this PR, the accuracy test results of `Qwen/Qwen2.5-7B-Instruct`
on V1 are:
```text
|           Tasks            |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|----------------------------|------:|------|-----:|--------|---|-----:|---|-----:|
|ceval-valid_computer_network|      2|none  |     0|acc     |↑  |0.7368|±  |0.1038|
|                            |       |none  |     0|acc_norm|↑  |0.7368|±  |0.1038|
```
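
As a sanity check on the table, the reported value and standard error are consistent with 14 correct answers out of 19 questions. Both counts are assumptions inferred from the numbers, not stated in this PR. The sketch below also assumes the standard error is the sample standard deviation of the per-question 0/1 scores divided by sqrt(n).

```python
import math


def acc_stderr(correct: int, total: int) -> float:
    """Standard error of the mean for 0/1 scores, using the sample (n - 1) variance."""
    p = correct / total
    return math.sqrt(p * (1 - p) / (total - 1))


# Assumed counts: 14 correct out of 19 (inferred, not taken from the PR).
acc = 14 / 19
stderr = acc_stderr(14, 19)
print(round(acc, 4), round(stderr, 4))  # 0.7368 0.1038
```

With so few questions the interval is wide, so small differences in this score between runs should be interpreted cautiously.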

Closes: https://github.com/vllm-project/vllm-ascend/issues/1043

Signed-off-by: MengqingCao <cmq0113@163.com>

2025-06-28 09:38:52 +08:00

attention

Handle with_prefill_across_dp for multistream mla (#1322)

2025-06-26 09:32:07 +08:00

compilation

[CI] Upgrade vllm to 0.9.1 (#1165)

2025-06-11 16:33:11 +08:00

core

[Scheduler][MTP] Add support for speculative decoding in AscendScheduler. (#943)

2025-06-11 20:55:44 +08:00

device_allocator

[Build] Add build info (#1386)