V1 is enabled by default, no need to set it by hand now. This PR remove
the useless setting in example and tests
- vLLM version: v0.9.2
- vLLM main:
9ad0a4588b
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
### What this PR does / why we need it?
Add `max_num_tokens_across_dp` to AscendMetadata to fix dp
This pr fixes the bug introduced by
https://github.com/vllm-project/vllm-ascend/pull/1229, which add an arg
`max_num_tokens_across_dp` when dp_size > 1.
Signed-off-by: MengqingCao <cmq0113@163.com>