xc-llm-ascend

Files

lidenghui1110 d65fb194d9 [Feat] Add custom Embedding tensor model parallel (#2616)

Similar to #2309, this PR introduces Embedding tensor model parallel to
reduce memory consumption. It supports both eager mode and
graph mode.
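
As a rough illustration of why splitting an embedding table across ranks lowers per-device memory, here is a minimal pure-Python sketch of a vocabulary-sharded lookup. It is not the code from this PR: the names `shard_rows` and `sharded_lookup` are hypothetical, contiguous row sharding is an assumption, and the summation loop only stands in for the collective communication a real multi-device implementation would use.

```python
# Illustrative sketch only; names and the sharding scheme are assumptions.

def shard_rows(table, tp_size):
    """Split embedding rows into tp_size contiguous shards (one per rank)."""
    per_rank = (len(table) + tp_size - 1) // tp_size
    return [table[r * per_rank:(r + 1) * per_rank] for r in range(tp_size)]


def sharded_lookup(shards, token_ids):
    """Look up token_ids when each rank holds only its own slice of rows."""
    per_rank = len(shards[0])
    dim = len(shards[0][0])
    out = [[0.0] * dim for _ in token_ids]
    for rank, shard in enumerate(shards):
        start = rank * per_rank
        for i, tok in enumerate(token_ids):
            local = tok - start
            # A rank contributes only for tokens inside its vocabulary range;
            # every other rank contributes zeros for that token.
            if 0 <= local < len(shard):
                # Summing partial results stands in for an all-reduce.
                out[i] = [a + b for a, b in zip(out[i], shard[local])]
    return out


# Toy table: 8 tokens, embedding dimension 2.
table = [[float(v), float(v) + 0.5] for v in range(8)]
shards = shard_rows(table, tp_size=4)
result = sharded_lookup(shards, [5, 0, 7])
```

With `tp_size=4`, each simulated rank stores 2 of the 8 rows, and the combined result matches a lookup against the full table.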

This PR also refactors the per-module tensor parallel configurations
introduced in #2309, #2167, and #2120, merging them into a single
`finegrained_tp_config` entry in `additional_config`, which includes:
`lmhead_tensor_parallel_size`
`oproj_tensor_parallel_size`
`embedding_tensor_parallel_size`
`mlp_tensor_parallel_size`
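
For orientation, the merged configuration might look like the sketch below. The four key names and the `finegrained_tp_config` / `additional_config` nesting come from the description above; the numeric values are placeholders chosen purely for illustration, and valid sizes depend on the deployment and on constraints not stated here.

```python
import json

# Key names are taken from the commit description; the values are
# placeholder assumptions, not recommended settings.
finegrained_tp_config = {
    "lmhead_tensor_parallel_size": 2,
    "oproj_tensor_parallel_size": 2,
    "embedding_tensor_parallel_size": 2,
    "mlp_tensor_parallel_size": 2,
}

# The per-module settings are nested under a single entry.
additional_config = {"finegrained_tp_config": finegrained_tp_config}

# Serialized form, in case the configuration is supplied as a JSON string.
additional_config_json = json.dumps(additional_config, indent=2)
print(additional_config_json)
```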

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: zzhx1 <zzh_201018@outlook.com>
Signed-off-by: zzhxx <zhangzihang23@mails.ucas.ac.cn>
Co-authored-by: zzhx1 <zzh_201018@outlook.com>
Co-authored-by: chenxiao <Jaychou1620@Gmail.com>
Co-authored-by: zzhxx <zhangzihang23@mails.ucas.ac.cn>
Co-authored-by: Jade Zheng <zheng.shoujian@outlook.com>

2025-12-12 14:41:20 +08:00

device_communicators

[Test] Add ut for files in /distributed (#1951)

2025-07-24 10:36:11 +08:00

mooncake

[Feature][main]reconstruction kvpool connector to ascend connector (#4438)

2025-11-28 18:08:37 +08:00

test_communicator.py

[2/N][Feat] Add MC2 communication method for MoE layers (#2469)

2025-08-26 19:05:23 +08:00

test_determin_expert_map_all.py

Dynamic Expert Load Balance with Zero-like-overhead (#2956)

2025-09-17 10:36:43 +08:00

test_parallel_state.py

[Feat] Add custom Embedding tensor model parallel (#2616)

2025-12-12 14:41:20 +08:00