### What this PR does / why we need it?
1. The dependency was introduced by
https://github.com/vllm-project/vllm-ascend/pull/874
- Move numba/quart from requirements-dev to requirements
- Align pyproject.toml with requirements
2. This patch also fixes the DeepSeek accuracy baseline, which
https://github.com/vllm-project/vllm-ascend/pull/1118 did not address.
According to https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite, the
gsm8k score is about `41.1`.
3. This also syncs the vLLM upstream changes:
eaa2e51088
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CI passed
vllm ascend test (basic workflow)
vllm longterm test (spec decode)
Closes: https://github.com/vllm-project/vllm-ascend/issues/1120
---------
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
24 lines
310 B
Plaintext
# Should be mirrored in pyproject.toml
cmake>=3.26
decorator
einops
numpy<2.0.0
packaging
pip
pybind11
pyyaml
scipy
setuptools>=64
setuptools-scm>=8
torch-npu==2.5.1
torch>=2.5.1
torchvision<0.21.0
wheel

# requirements for disaggregated prefill
msgpack
quart

# Required for N-gram speculative decoding
numba