xc-llm-ascend

Li Wang ad366bf908 [Bugfix] Follow vLLM Qwen-Moe/VL and KV Connector change to fix broken CI (#2181)

### What this PR does / why we need it?
This PR fixes the broken CI by following three upstream vLLM changes:
1. Adapt to
ee2eb6ecd8.
That commit fuses the gate and up projections in the vision MLP, which
improves performance by removing one matrix multiplication. This PR
therefore does the following:
    - Specifies that the two linear layers are fused as `mlp.gate_up_proj`
      when loading the weights.
    - Uses a SiluAndMul activation function.
2. Adapt to
aefeea0fde
by updating the ModelRunnerOutput parameters to match its changes.
3. Adapt to
[vllm-commit](https://github.com/vllm-project/vllm/pull/20815/files#diff-3ffb829a39ab2b3e4706aa28f5e476815f36c3a87b98d6a66514ebedc8f3ffb4R354-R356)
to fix Qwen MoE.
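
For reference, the gate/up fusion described in item 1 can be sketched in plain Python. This is an illustration only, not the vLLM or vllm-ascend code: the helper names (`matvec`, `fuse_weights`, `mlp_separate`, `mlp_fused`) are hypothetical, the weights are plain nested lists, and the MLP's down projection is left out. It assumes the fused weight stacks the gate rows before the up rows, so the activation applies SiLU to the first half of the fused output and multiplies it by the second half.

```python
import math


def silu(x):
    # SiLU (swish) activation: x * sigmoid(x)
    return x / (1.0 + math.exp(-x))


def matvec(weight, x):
    # Multiply a row-major weight matrix (list of rows) by a vector.
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def mlp_separate(x, gate_w, up_w):
    # Unfused form: two separate matrix multiplications.
    gate = matvec(gate_w, x)
    up = matvec(up_w, x)
    return [silu(g) * u for g, u in zip(gate, up)]


def fuse_weights(gate_w, up_w):
    # Assumed layout: gate rows first, then up rows (hypothetical helper).
    return gate_w + up_w


def silu_and_mul(fused):
    # Split the fused output in half: silu(first half) * second half.
    half = len(fused) // 2
    return [silu(g) * u for g, u in zip(fused[:half], fused[half:])]


def mlp_fused(x, gate_up_w):
    # Fused form: one matrix multiplication, then the split activation.
    return silu_and_mul(matvec(gate_up_w, x))


x = [1.0, -2.0, 0.5]
gate_w = [[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]]
up_w = [[0.7, -0.8, 0.9], [1.0, 1.1, -1.2]]

separate_out = mlp_separate(x, gate_w, up_w)
fused_out = mlp_fused(x, fuse_weights(gate_w, up_w))
```

Both paths compute the same values; the fused path issues a single multiplication against the stacked weight, which is why the weight loader has to know that the two checkpoint tensors map onto one `mlp.gate_up_proj` parameter.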
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.10.0
- vLLM main:
fed5849d3f

---------

Signed-off-by: wangli <wangli858794774@gmail.com>

2025-08-04 21:37:50 +08:00

__init__.py

[main][Feature]Moe alltoallv communication optimization for unquantized RL training scene (#2088)

2025-08-02 09:49:10 +08:00

deepseek_dbo.py

[MISC] Cherry pick #1291 from v0.9.1-dev (#1825)

2025-08-01 09:08:45 +08:00

deepseek_mtp.py

[Feature] Enable inference support for Deepseekr1-w8a8-MTP (#1994)

2025-07-29 18:51:57 +08:00

deepseek_v2.py

[Misc] Add extra checking to torchair_graph_config. (#1939)