baxingpiaochong
dda027e680
[KVPOOl]Support pp ( #4761 )
...
### What this PR does / why we need it?
Support pipeline parallelism (pp) for the KV pool.
- vLLM version: v0.12.0
- vLLM main: ad32e3e19c
---------
Signed-off-by: baxingpiaochong <771405853@qq.com>
2025-12-09 16:15:26 +08:00
LookAround0301
b32ef53b3b
[long_seq] remove long_seq env ( #4660 )
...
### What this PR does / why we need it?
Remove the environment variable `VLLM_ASCEND_ENABLE_CONTEXT_PARALLEL`.
- vLLM version: v0.12.0
---------
Signed-off-by: LookAround <lixushi@huawei.com>
Signed-off-by: ZhangMingWei716 <2894054457@qq.com>
Co-authored-by: ZhangMingWei716 <2894054457@qq.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-05 10:31:49 +08:00
wangxiyuan
7f2673ea2d
upgrade vLLM to main ( #4608 )
...
1. fix https://github.com/vllm-project/vllm/pull/28542
The model structure modifications that affect us are:
- Qwen2.5-VL (some patches still remain)
- Qwen2-VL
- Qwen2
- DeepSeek series
- Qwen-moe series
2. fix https://github.com/vllm-project/vllm/pull/29121
the output token type has changed from np (NumPy) to `list[list[int]]`
3. fix https://github.com/vllm-project/vllm/pull/29262
the `xformers` backend for multimodal has been deprecated
4. fix https://github.com/vllm-project/vllm/pull/29342
5. fix https://github.com/vllm-project/vllm/pull/28579
6. fix https://github.com/vllm-project/vllm/pull/28718
7. fix https://github.com/vllm-project/vllm/issues/28665
8. fix https://github.com/vllm-project/vllm/pull/26847
vLLM introduced `optimization-level`; some default configs have
changed, and the `--enforce-eager` param has been deprecated
9. fix http://github.com/vllm-project/vllm/pull/29223
the sampler now returns a tuple
10. fix https://github.com/vllm-project/vllm/pull/29471
we'll remove the related patch to avoid this kind of error
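Item 2 above changes the container type of the sampled token ids. A minimal sketch of how downstream code could tolerate both formats during the transition is shown here; the helper name `to_token_lists` is hypothetical and is not part of vLLM or vllm-ascend, and it assumes the old format was a 2-D array-like object exposing `tolist()`.

```python
from typing import Any


def to_token_lists(sampled: Any) -> list[list[int]]:
    """Normalize sampled token ids to list[list[int]].

    Hypothetical helper: accepts either a 2-D array-like object that exposes
    ``tolist()`` (for example a NumPy array, the old format) or an already
    nested list (the new format, where rows may have different lengths).
    """
    if hasattr(sampled, "tolist"):
        # Old format: convert the array-like object to nested Python lists.
        sampled = sampled.tolist()
    # Copy each row and coerce every element to a plain Python int.
    return [[int(tok) for tok in row] for row in sampled]
```

Using duck typing on `tolist()` keeps the helper free of a hard NumPy import, so it behaves the same whether or not the caller still produces arrays.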
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Co-authored-by: wangli <wangli858794774@gmail.com>
- vLLM version: v0.11.2
---------
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Co-authored-by: wangli <wangli858794774@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
2025-12-02 22:10:52 +08:00
DreamerLeader
4dbe4fd123
[feature]Pooling Features and PCP Adaptation ( #4143 )
...
This PR lets the pooling KV connector support the PCP feature.
- vLLM version: v0.11.2
---------
Signed-off-by: fjw <2270923832@qq.com>
Signed-off-by: SlightwindSec <slightwindsec@gmail.com>
Co-authored-by: SlightwindSec <slightwindsec@gmail.com>
2025-11-29 22:07:45 +08:00
fems14
5447a039b9
[Feature][main]reconstruction kvpool connector to ascend connector ( #4438 )
...
### What this PR does / why we need it?
1. In short, we renamed the existing MooncakeStoreConnector to
AscendStoreConnector and extracted the storage engine interaction logic
into a new Backend class.
Associated RFC: https://github.com/vllm-project/vllm-ascend/issues/4329
2. Fixed the incorrect number of input parameters for the connector,
an issue introduced in vLLM 0.11.2.
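The split described in item 1 (a connector that delegates storage engine interaction to a separate backend) can be illustrated with a small sketch. Every name and method signature below (`Backend.put`, `Backend.get`, `InMemoryBackend`, `StoreConnector`) is a hypothetical stand-in chosen for illustration; it is not the actual vllm-ascend interface, which may differ.

```python
from abc import ABC, abstractmethod
from typing import Optional


class Backend(ABC):
    """Hypothetical storage-engine interface (method names are assumptions)."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None:
        """Store a serialized KV block under the given key."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        """Return the stored block, or None if the key is absent."""


class InMemoryBackend(Backend):
    """Toy backend backed by a dict, standing in for a real storage engine."""

    def __init__(self) -> None:
        self._store: dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._store[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._store.get(key)


class StoreConnector:
    """Stand-in connector: all storage I/O goes through the backend."""

    def __init__(self, backend: Backend) -> None:
        self._backend = backend

    def save_block(self, block_hash: str, kv_bytes: bytes) -> None:
        self._backend.put(block_hash, kv_bytes)

    def load_block(self, block_hash: str) -> Optional[bytes]:
        return self._backend.get(block_hash)


connector = StoreConnector(InMemoryBackend())
connector.save_block("block-0", b"\x00\x01")
```

With this shape, swapping the storage engine only requires a new `Backend` subclass; the connector code stays unchanged.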
### Does this PR introduce _any_ user-facing change?
Yes. MooncakeStoreConnector is renamed to AscendStoreConnector.
### How was this patch tested?
- vLLM version: v0.11.2
---------
Signed-off-by: fems14 <1804143737@qq.com>
2025-11-28 18:08:37 +08:00