[Feat] Dynamic Batch Feature (#3490)

See the [RFC](https://github.com/vllm-project/vllm-ascend/issues/3328) for more details.

Add a dynamic batch feature to the chunked prefill strategy: the token budget can be refined to achieve better effective throughput and TPOT.

!!! NOTE: only the 910B3 is supported so far; we are working on further improvements. An additional lookup table file is required.

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: Cheng Wang <wangchengkyrie@outlook.com>
2025-10-22 14:13:32 +08:00
parent c18ca62a17
commit 60e2be1b36
10 changed files with 1368 additions and 1 deletions
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -10,6 +10,8 @@ requires = [
    "pybind11",
    "pyyaml",
    "scipy",
+    "pandas",
+    "pandas-stubs",
    "setuptools>=64",
    "setuptools-scm>=8",
    "torch-npu==2.7.1.dev20250724",