5 Commits

Author SHA1 Message Date
wangxiyuan
ae49bfd13a [Core] Support pooling (#229)
This PR adds pooling support for vllm-ascend.

Tested with `bge-base-en-v1.5` using `LLM.encode`:
```
from vllm import LLM

# Sample prompts.
prompts = [
  "Hello, my name is",
  "The president of the United States is",
  "The capital of France is",
  "The future of AI is",
]
# Create an LLM.
model = LLM(model="./bge-base-en-v1.5", enforce_eager=True)
# Generate embedding. The output is a list of EmbeddingRequestOutputs.
outputs = model.encode(prompts)
# Print the outputs.
for output in outputs:
    print(output.outputs.embedding)  # list of 768 floats for bge-base-en-v1.5
```

Tested using `LLM.embed`:
```
from vllm import LLM

llm = LLM(model="./bge-base-en-v1.5", task="embed")
(output,) = llm.embed("Hello, my name is")

embeds = output.outputs.embedding
print(f"Embeddings: {embeds!r} (size={len(embeds)})")
```

Related: https://github.com/vllm-project/vllm-ascend/issues/200

## Known issue
Accuracy is not yet correct because this feature relies on `enc-dec`
support, which will be added in a follow-up PR by @MengqingCao.
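
One way to quantify the accuracy gap noted in this known issue is to compare an embedding produced on Ascend against a reference embedding of the same prompt from another backend. The snippet below is only a minimal sketch under that assumption and is not part of this PR: `cosine_similarity` is a hypothetical helper, and the two short vectors are placeholders standing in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length float vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder vectors (assumption): in practice these would be the
# embedding from the Ascend run and a reference embedding of the same prompt.
reference = [0.1, 0.2, 0.3, 0.4]
candidate = [0.1, 0.2, 0.3, 0.4]

similarity = cosine_similarity(reference, candidate)
print(f"cosine similarity: {similarity:.6f}")
```

A similarity close to 1.0 suggests the two embeddings agree in direction; what threshold counts as acceptable is a judgment call that this sketch does not settle.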

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-03-04 15:59:34 +08:00
Shanshan Shen
8fda31cafe [Doc] Update Feature Support doc (#234)
### What this PR does / why we need it?
Update Feature Support doc.

### Does this PR introduce _any_ user-facing change?
no.

### How was this patch tested?
no.

---------

Signed-off-by: Shanshan Shen <467638484@qq.com>
2025-03-04 14:18:32 +08:00
wangxiyuan
cff03a4913 [CI] change to quay.io (#102)
Change the Docker registry to quay.io.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-02-19 17:04:46 +08:00
wangxiyuan
fafd70e91c [Doc] Update doc to work with release (#85)
1. Update CANN image name
2. Add pta install step
3. Update vllm-ascend docker image name to ghcr
4. Update quick_start to use the vllm-ascend image directly
5. Fix `note` style

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-02-19 09:51:43 +08:00
wangxiyuan
7606977739 [Doc] Add release note (#59)
Add a release note template and initialize the first release note content.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
2025-02-18 11:20:06 +08:00