[Model] Support pooling models (#3122)
### What this PR does / why we need it?

Support pooling models (such as `bge-reranker-v2-m3`) in vllm-ascend. This PR covers the three embedding pooling types: cls_token, mean_token, and lasttoken.

Since this [commit](17373dcd93), vLLM supports adapting pooling models on the v1 engine. This PR adds the corresponding adaptations on the vllm-ascend side.

Fixes #1960

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: lianyibo <lianyibo1@kunlunit.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Co-authored-by: MengqingCao <cmq0113@163.com>
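For readers unfamiliar with the three pooling types named in the description, the following is a minimal, framework-free sketch of what CLS, mean, and last-token pooling compute over one sequence's per-token hidden states. It is illustrative only: the function names (`cls_pool`, `mean_pool`, `last_pool`) and the list-of-lists representation are assumptions made for this example, not the vLLM or vllm-ascend implementation.

```python
from typing import List

Vector = List[float]


def cls_pool(hidden: List[Vector]) -> Vector:
    """CLS pooling: take the hidden state of the first token."""
    return list(hidden[0])


def last_pool(hidden: List[Vector]) -> Vector:
    """Last-token pooling: take the hidden state of the final token."""
    return list(hidden[-1])


def mean_pool(hidden: List[Vector]) -> Vector:
    """Mean pooling: average each dimension over all tokens."""
    num_tokens = len(hidden)
    dim = len(hidden[0])
    return [sum(tok[d] for tok in hidden) / num_tokens for d in range(dim)]


# Hypothetical hidden states: 3 tokens, hidden size 2.
hidden_states = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]

print(cls_pool(hidden_states))   # [1.0, 2.0]
print(last_pool(hidden_states))  # [5.0, 9.0]
print(mean_pool(hidden_states))  # [3.0, 5.0]
```

Each function reduces a variable-length sequence of token vectors to a single fixed-size vector; which reduction is appropriate depends on how the model was trained.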
4
.github/workflows/_e2e_test.yaml
vendored
@@ -77,6 +77,7 @@ jobs:
# pytest -sv tests/e2e/singlecard/test_aclgraph.py
# pytest -sv tests/e2e/singlecard/test_quantization.py
pytest -sv tests/e2e/singlecard/test_vlm.py::test_multimodal_vl
pytest -sv tests/e2e/singlecard/pooling/test_classification.py::test_classify_correctness

- name: Run e2e test
env:
@@ -91,9 +92,7 @@ jobs:
pytest -sv tests/e2e/singlecard/test_completion_with_prompt_embeds.py
pytest -sv tests/e2e/singlecard/test_aclgraph.py
pytest -sv tests/e2e/singlecard/test_aclgraph_mem.py
pytest -sv tests/e2e/singlecard/test_bge_model.py
pytest -sv tests/e2e/singlecard/test_camem.py
pytest -sv tests/e2e/singlecard/test_embedding.py
# pytest -sv tests/e2e/singlecard/test_embedding_aclgraph.py
pytest -sv tests/e2e/singlecard/test_guided_decoding.py
# torch 2.8 doesn't work with lora, fix me
@@ -104,6 +103,7 @@ jobs:
pytest -sv tests/e2e/singlecard/test_vlm.py
pytest -sv tests/e2e/singlecard/multi-modal/test_internvl.py
pytest -sv tests/e2e/singlecard/test_xlite.py
pytest -sv tests/e2e/singlecard/pooling/

# ------------------------------------ v1 spec decode test ------------------------------------ #
pytest -sv tests/e2e/singlecard/spec_decode_v1/test_v1_mtp_correctness.py