wangjing

wangjing pushed to br/v0.18.0 at EngineX/xc-llm-ascend

2026-05-13 16:59:53 +08:00

b6549b6e38 Add feature: priority

d627a45881 fix multiproc executor determine kv cache memory & update Dockerfile

6c097beaa5 adapt to vllm-ascend v0.18.0

e18643f8a4 [doc][0.18.0] v0.18.0 release note (#8383)

600bf80c6d [CI]Fix the error caused by layer_sharding in dsv32 (#8719)

Compare 10 commits »

wangjing created branch br/v0.18.0 in EngineX/xc-llm-ascend

2026-05-13 16:59:53 +08:00

wangjing created branch br/v0.18.0rc1 in EngineX/xc-llm-ascend

2026-05-13 16:59:20 +08:00

wangjing deleted branch v0.18.0 from EngineX/xc-llm-ascend

2026-05-13 16:59:19 +08:00

wangjing pushed to v0.18.0 at EngineX/xc-llm-ascend

2026-04-24 20:58:05 +08:00

e17006077a fix multiproc executor determine kv cache memory & update Dockerfile

wangjing pushed to v0.18.0 at EngineX/xc-llm-ascend

2026-04-24 19:15:32 +08:00

b1519a5f84 fix multiproc executor determine kv cache memory & update Dockerfile

wangjing pushed to v0.18.0 at EngineX/xc-llm-ascend

2026-04-24 16:53:39 +08:00

e38cf50513 fix multiproc executor determine kv cache memory & update Dockerfile

wangjing pushed to v0.18.0 at EngineX/xc-llm-ascend

2026-04-24 16:40:01 +08:00

31868639fd fix multiproc executor determine kv cache memory

wangjing created branch v0.11.0 in EngineX/xc-llm-ascend

2026-04-21 11:19:57 +08:00

wangjing deleted branch main from EngineX/xc-llm-ascend

2026-04-21 11:19:57 +08:00

wangjing pushed to v0.18.0 at EngineX/xc-llm-ascend

2026-04-21 11:09:14 +08:00

e4d898b245 adapt to vllm-ascend v0.18.0rc1

99e1ea0fe6 [v0.18.0][Misc] Upgrade torch_npu to pre-release built version (#7918)

d3de7333dc [BugFix][v0.18.0][cherry-pick] Fix embedding prefix caching for APC (#7894)

762850fb4e [v0.18.0][Misc] Install numactl in Docker images (#7898)

2cb9195ff0 [Releases/v0.18.0][CI] Updated the parameters for the single-node test to fix the OOM issue for DeepSeek-V3.2 (#7862)

Compare 10 commits »

wangjing created branch v0.18.0 in EngineX/xc-llm-ascend

2026-04-21 11:09:13 +08:00

wangjing pushed to main at EngineX/xc-llm-kunlun

2026-03-02 18:51:16 +08:00

34e04c5569 update base image

a15754c3ba add readme

4d8575115a add vxpu

dc63e81a7f fix: use cuda visible (#244)

e4c9b9f988 [Bugfix] cocopod ops can't be finded (#242)

Compare 10 commits »

wangjing created branch main in EngineX/xc-llm-kunlun

2026-03-02 18:51:16 +08:00

wangjing created branch v0.11.0-v0.0.1 in EngineX/xc-llm-kunlun

2026-03-02 18:48:38 +08:00

wangjing deleted branch main from EngineX/xc-llm-kunlun

2026-03-02 18:48:38 +08:00

wangjing pushed to main at EngineX/xc-llm-kunlun

2026-02-12 11:18:28 +08:00

cea31d16fb add readme

01bafad6d0 add vxpu

Compare 2 commits »

wangjing pushed to main at EngineX/xc-llm-kunlun

2026-02-12 11:03:30 +08:00

9c3e9df634 add readme

e273ef01b8 add vxpu

070bfa4a73 [Bugfix] Fixed Kunlun Graph Failed (#193)

fc48b79ae9 support glm4.7 mtp (#187)

bd8c999335 Further optimize multi-lora inference, LoRA-enabled performance achieves 80%+ of non-LoRA performance (#190)

Compare 7 commits »

wangjing pushed to main at EngineX/xc-llm-ascend

2026-02-11 14:28:37 +08:00

389030a8f8 add env vars & misc

wangjing pushed to main at EngineX/xc-llm-kunlun

2026-02-09 11:05:10 +08:00

301ad12241 add readme

4a1dab898c add vxpu

6f30bc439d clean pr for ds.2 mtp support (#164)

42a2d38f47 [CI/Build] Fixed bug related to conflicts in the code inspection tool (#169)

6f12830839 [Kernel] add topk_per_row to optimize the calculation of topk_indexes (#168)

Compare 132 commits »