### What this PR does / why we need it? - This PR proposes a P2P version of Disaggregated Prefill based on llm_datadist which manages data transfer. - This solution reconstructs previous offline single-node Disaggregated Prefill solution, and supports multi-node and online serveing now. - Currently this solution supports 1P1D situation of Deepseek hybrid parallelism (P: TP+EP, D: DP+EP). Note that xPyD situation is considered in the solution design, and will be supported soon within v1 engine. --------- Signed-off-by: hw_whx <wanghexiang7@huawei.com> Signed-off-by: ganyi <pleaplusone.gy@gmail.com> Co-authored-by: hw_whx <wanghexiang7@huawei.com> Co-authored-by: ganyi <pleaplusone.gy@gmail.com>
20 lines
253 B
Plaintext
20 lines
253 B
Plaintext
# Should be mirrored in pyporject.toml
|
|
cmake>=3.26
|
|
decorator
|
|
numpy<2.0.0
|
|
packaging
|
|
pip
|
|
pybind11
|
|
pyyaml
|
|
scipy
|
|
setuptools>=64
|
|
setuptools-scm>=8
|
|
torch-npu==2.5.1
|
|
torch>=2.5.1
|
|
torchvision<0.21.0
|
|
wheel
|
|
|
|
# requirements for disaggregated prefill
|
|
msgpack
|
|
quart
|