EngineX/xc-llm-ascend
23169021d9f5d20649061ab8f9ab734d7d00dd1b
xc-llm-ascend/vllm_ascend/worker
weijinqian0 dbe4c338f2 [Refactor] cache cos/sin in mla & remove parameter model in builder. (#5277)
RFC: https://github.com/vllm-project/vllm-ascend/issues/4629

1. Cache cos/sin in mla
2. AttentionBuilder inherits from the original class of vllm.



version: release/v0.13.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
Co-authored-by: weijinqian_v1 <weijinqian@huawei.com>
2025-12-28 10:35:07 +08:00
Name | Last commit | Date
v2 | [Refactor] move the metadata from attention_v1 to util(ready for extract common_cp) & realize Ascendmetadata inherit from the parent class. (#5203) | 2025-12-23 00:10:52 +08:00
__init__.py | [Misc][V0 Deprecation] Remove Cache Engine Used for V0 Worker (#1878) | 2025-07-19 09:42:32 +08:00
block_table.py | [feature] support pcp + mtp in full graph (#4572) | 2025-12-22 16:13:39 +08:00
model_runner_v1.py | [Refactor] cache cos/sin in mla & remove parameter model in builder. (#5277) | 2025-12-28 10:35:07 +08:00
npu_input_batch.py | Drop 0.12.0 support (#5146) | 2025-12-20 09:38:53 +08:00
worker.py | [refactor] refactor weight trans nz and transpose (#4878) | 2025-12-19 14:27:24 +08:00