xc-llm-ascend

Files

lio cd58a643c5 [UT] Fix test_sample_recovered_tokens_pytorch_autoregressive (#3434)

### What this PR does / why we need it?

The `test_rejection_sampler` unit test contains inconsistent input data:

> def test_sample_recovered_tokens_pytorch_autoregressive(self):
>       output_token_ids = torch.empty(2, dtype=torch.int32)
>       cu_num_draft_tokens = torch.tensor([1, 1])
>       draft_token_ids = torch.tensor([0, 1])

`len(draft_token_ids)` is 2, so the last entry of `cu_num_draft_tokens` must also be 2.
The tensor should be `torch.tensor([1, 2])` or `torch.tensor([2, 2])`, not `torch.tensor([1, 1])`.

This PR sets `cu_num_draft_tokens = torch.tensor([1, 2])`. The test passes with both
the pre-optimization and post-optimization implementations.
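
The consistency rule behind this fix can be sketched as follows. This is a minimal illustration, not code from the repository: it assumes `cu_num_draft_tokens` holds inclusive cumulative per-request draft-token counts (as the `cu_` prefix and the fix suggest), uses plain Python lists in place of tensors, and `is_consistent` is a hypothetical helper name.

```python
from itertools import accumulate


def is_consistent(cu_num_draft_tokens, draft_token_ids):
    """Check cumulative draft-token counts against the flat draft token list.

    Hypothetical helper for illustration; assumes inclusive cumulative counts.
    """
    # Cumulative counts can never decrease from one request to the next.
    non_decreasing = all(
        a <= b for a, b in zip(cu_num_draft_tokens, cu_num_draft_tokens[1:])
    )
    # The final cumulative count must equal the total number of draft tokens.
    total_matches = (
        bool(cu_num_draft_tokens)
        and cu_num_draft_tokens[-1] == len(draft_token_ids)
    )
    return non_decreasing and total_matches


draft_token_ids = [0, 1]

# Original test input: the counts sum to 1, but there are 2 draft tokens.
print(is_consistent([1, 1], draft_token_ids))
# Fixed input: one draft token per request.
print(is_consistent([1, 2], draft_token_ids))
# Also valid: two draft tokens for the first request, none for the second.
print(is_consistent([2, 2], draft_token_ids))

# Cumulative counts derived from per-request counts of one token each.
cu_from_counts = list(accumulate([1, 1]))
```

Under these assumptions, only the original `[1, 1]` input fails the check, which matches the two replacement values named above.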

### Does this PR introduce _any_ user-facing change?
No 
### How was this patch tested?
NA

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: lio <1983142975@qq.com>

2025-10-24 11:20:57 +08:00

logits_processor

[main] add pd transfer for ascend scheduler (#2753)

2025-09-10 08:46:39 +08:00

test_rejection_sampler.py

[UT] Fix test_sample_recovered_tokens_pytorch_autoregressive (#3434)

2025-10-24 11:20:57 +08:00

test_sampler.py

[Refactor] Refactor sampler (#2050)

2025-07-30 08:47:22 +08:00