[Doc] add qwen3 reranker (#5086)

### What this PR does / why we need it?
Add Qwen3 reranker tutorials.
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.12.0

---------

Signed-off-by: TingW09 <944713709@qq.com>
TingW09
2025-12-18 10:54:07 +08:00
committed by GitHub
parent 8069442b41
commit 879ec2d1c4
4 changed files with 248 additions and 40 deletions

@@ -47,6 +47,7 @@ Get the latest info here: https://github.com/vllm-project/vllm-ascend/issues/160
| Model | Support | Note | BF16 | Supported Hardware | W8A8 | Chunked Prefill | Automatic Prefix Cache | LoRA | Speculative Decoding | Async Scheduling | Tensor Parallel | Pipeline Parallel | Expert Parallel | Data Parallel | Prefill-decode Disaggregation | Piecewise AclGraph | Fullgraph AclGraph | max-model-len | MLP Weight Prefetch | Doc |
|-------------------------------|-----------|----------------------------------------------------------------------|------|--------------------|------|-----------------|------------------------|------|----------------------|------------------|-----------------|-------------------|-----------------|---------------|-------------------------------|--------------------|--------------------|---------------|---------------------|-----|
| Qwen3-Embedding | ✅ | |||||||||||||||||||
| Qwen3-Reranker | ✅ | |||||||||||||||||||
| Molmo | ✅ | [1942](https://github.com/vllm-project/vllm-ascend/issues/1942) |||||||||||||||||||
| XLM-RoBERTa-based | ✅ | |||||||||||||||||||
| Bert | ✅ | |||||||||||||||||||