xc-llm-ascend/multicard at 7e70da9fb7a33b3e60b6d9cab9b52ebe282a810b

EngineX/xc-llm-ascend

lhp-deep b230e7e987 [MOE] move weight transpose to wakeup for RL scenarios (#4626)

### What this PR does / why we need it?
In reinforcement learning scenarios, the inference path currently applies a
transpose operation to the weights. For a cleaner architecture, this PR moves
the weight transpose to wakeup.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

Signed-off-by: lhp-deep <liuhaopeng1@huawei.com>
Co-authored-by: weijinqian0 <1184188277@qq.com>

2025-12-08 20:34:52 +08:00

| File | Last commit | Date |
| --- | --- | --- |
| test_aclgraph_capture_replay.py | Drop 0.11.0 support (#4377) | 2025-11-24 17:08:20 +08:00 |
| test_chunk_gated_delta_rule.py | 【OPS】qwen3-next support triton chunk_gated_delta_rule ops (#4070) | 2025-11-28 20:55:43 +08:00 |
| test_data_parallel_tp2.py | upgrade torch npu version (#4433) | 2025-12-01 19:01:55 +08:00 |
| test_data_parallel.py | [main][bugfix] bugfix for qwen3 moe quantization (#4599) | 2025-12-01 23:48:57 +08:00 |
| test_expert_parallel.py | [CI] drop ascend scheduler test (#4582) | 2025-12-01 20:33:50 +08:00 |
| test_external_launcher.py | upgrade torch npu version (#4433) | 2025-12-01 19:01:55 +08:00 |
| test_full_graph_mode.py | support FULL graph mode for GQA (#3970) | 2025-11-17 10:50:35 +08:00 |
| test_fused_moe_allgather_ep.py | [CI] drop ascend scheduler test (#4582) | 2025-12-01 20:33:50 +08:00 |
| test_ilama_lora_tp2.py | ACLgraph enable: Test cases revisions for all features (#3388) | 2025-10-17 17:15:19 +08:00 |
| test_offline_inference_distributed.py | [CI] drop ascend scheduler test (#4582) | 2025-12-01 20:33:50 +08:00 |
| test_offline_weight_load.py | [MOE] move weight transpose to wakeup for RL scenarios (#4626) | 2025-12-08 20:34:52 +08:00 |
| test_pipeline_parallel.py | [bugfix] fix pipeline parallel for mla & sfa attention backend (#3459) | 2025-10-15 17:13:27 +08:00 |
| test_prefix_caching.py | [CI] enable chunked prefill by default (#4569) | 2025-12-02 08:54:34 +08:00 |
| test_quantization.py | [Quantization] Support compressed tensors w8a8 static and w8a8 dynamic weight (#4036) | 2025-11-28 14:09:39 +08:00 |
| test_qwen3_moe.py | Reapply "[MoE] [Refactor] Remove manual memory cleanup (#3365)" (#3483) (#3512) | 2025-10-22 11:41:30 +08:00 |
| test_qwen3_next.py | [Model] Add qwen3Next support in Main (#4596) | 2025-12-03 14:17:37 +08:00 |
| test_shared_expert_dp.py | [Feat] shared expert dp for deepseek_mtp (#3811) | 2025-12-01 20:44:11 +08:00 |
| test_single_request_aclgraph.py | Drop 0.11.0 support (#4377) | 2025-11-24 17:08:20 +08:00 |
| test_torchair_graph_mode.py | [CI] drop ascend scheduler test (#4582) | 2025-12-01 20:33:50 +08:00 |
| test_weight_loader.py | upgrade torch npu version (#4433) | 2025-12-01 19:01:55 +08:00 |