wangjing
  • Joined on 2025-08-06
wangjing pushed to v0.18.0 at EngineX/xc-llm-ascend 2026-04-24 20:58:05 +08:00
e17006077a fix multiproc executor determine kv cache memory & update Dockerfile
wangjing pushed to v0.18.0 at EngineX/xc-llm-ascend 2026-04-24 19:15:32 +08:00
b1519a5f84 fix multiproc executor determine kv cache memory & update Dockerfile
wangjing pushed to v0.18.0 at EngineX/xc-llm-ascend 2026-04-24 16:53:39 +08:00
e38cf50513 fix multiproc executor determine kv cache memory & update Dockerfile
wangjing pushed to v0.18.0 at EngineX/xc-llm-ascend 2026-04-24 16:40:01 +08:00
31868639fd fix multiproc executor determine kv cache memory
wangjing created branch v0.11.0 in EngineX/xc-llm-ascend 2026-04-21 11:19:57 +08:00
wangjing deleted branch main from EngineX/xc-llm-ascend 2026-04-21 11:19:57 +08:00
wangjing pushed to v0.18.0 at EngineX/xc-llm-ascend 2026-04-21 11:09:14 +08:00
e4d898b245 adapt to vllm-ascend v0.18.0rc1
99e1ea0fe6 [v0.18.0][Misc] Upgrade torch_npu to pre-release built version (#7918)
d3de7333dc [BugFix][v0.18.0][cherry-pick] Fix embedding prefix caching for APC (#7894)
762850fb4e [v0.18.0][Misc] Install numactl in Docker images (#7898)
2cb9195ff0 [Releases/v0.18.0][CI] Updated the parameters for the single-node test to fix the OOM issue for DeepSeek-V3.2 (#7862)
Compare 10 commits »
wangjing created branch v0.18.0 in EngineX/xc-llm-ascend 2026-04-21 11:09:13 +08:00
wangjing pushed to main at EngineX/xc-llm-kunlun 2026-03-02 18:51:16 +08:00
34e04c5569 update base image
a15754c3ba add readme
4d8575115a add vxpu
dc63e81a7f fix: use cuda visible (#244)
e4c9b9f988 [Bugfix] cocopod ops can't be found (#242)
Compare 10 commits »
wangjing created branch main in EngineX/xc-llm-kunlun 2026-03-02 18:51:16 +08:00
wangjing created branch v0.11.0-v0.0.1 in EngineX/xc-llm-kunlun 2026-03-02 18:48:38 +08:00
wangjing deleted branch main from EngineX/xc-llm-kunlun 2026-03-02 18:48:38 +08:00
wangjing pushed to main at EngineX/xc-llm-kunlun 2026-02-12 11:18:28 +08:00
cea31d16fb add readme
01bafad6d0 add vxpu
Compare 2 commits »
wangjing pushed to main at EngineX/xc-llm-kunlun 2026-02-12 11:03:30 +08:00
9c3e9df634 add readme
e273ef01b8 add vxpu
070bfa4a73 [Bugfix] Fixed Kunlun Graph Failed (#193)
fc48b79ae9 support glm4.7 mtp (#187)
bd8c999335 Further optimize multi-lora inference, LoRA-enabled performance achieves 80%+ of non-LoRA performance (#190)
Compare 7 commits »
wangjing pushed to main at EngineX/xc-llm-ascend 2026-02-11 14:28:37 +08:00
389030a8f8 add env vars & misc
wangjing pushed to main at EngineX/xc-llm-kunlun 2026-02-09 11:05:10 +08:00
301ad12241 add readme
4a1dab898c add vxpu
6f30bc439d clean pr for ds.2 mtp support (#164)
42a2d38f47 [CI/Build] Fixed bug related to conflicts in the code inspection tool (#169)
6f12830839 [Kernel] add topk_per_row to optimize the calculation of topk_indexes (#168)
Compare 132 commits »
wangjing deleted branch modelhub from EngineX/xc-llm-kunlun 2026-02-09 11:03:42 +08:00
wangjing created branch modelhub in EngineX/xc-llm-kunlun 2026-02-09 11:02:55 +08:00
wangjing pushed to modelhub at EngineX/xc-llm-kunlun 2026-02-09 11:02:55 +08:00
301ad12241 add readme
4a1dab898c add vxpu
6f30bc439d clean pr for ds.2 mtp support (#164)
42a2d38f47 [CI/Build] Fixed bug related to conflicts in the code inspection tool (#169)
6f12830839 [Kernel] add topk_per_row to optimize the calculation of topk_indexes (#168)
Compare 10 commits »
wangjing pushed to main at EngineX/xc-llm-ascend 2026-01-23 11:24:42 +08:00
739d074b0c update other platforms' Dockerfile