EngineX/xc-llm-ascend
dbe4c338f2fac797bba8d03352f13f4af7da2aa6
xc-llm-ascend/vllm_ascend/worker
weijinqian0 dbe4c338f2 [Refactor] cache cos/sin in mla & remove parameter model in builder. (#5277)
RFC: https://github.com/vllm-project/vllm-ascend/issues/4629

1. Cache cos/sin in MLA.
2. AttentionBuilder now inherits from the original vLLM class.



version: release/v0.13.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
Co-authored-by: weijinqian_v1 <weijinqian@huawei.com>
2025-12-28 10:35:07 +08:00
..
Name | Last commit | Date
v2 | [Refactor] move the metadata from attention_v1 to util(ready for extract common_cp) & realize Ascendmetadata inherit from the parent class. (#5203) | 2025-12-23 00:10:52 +08:00
__init__.py | [Misc][V0 Deprecation] Remove Cache Engine Used for V0 Worker (#1878) | 2025-07-19 09:42:32 +08:00
block_table.py | [feature] support pcp + mtp in full graph (#4572) | 2025-12-22 16:13:39 +08:00
model_runner_v1.py | [Refactor] cache cos/sin in mla & remove parameter model in builder. (#5277) | 2025-12-28 10:35:07 +08:00
npu_input_batch.py | Drop 0.12.0 support (#5146) | 2025-12-20 09:38:53 +08:00
worker.py | [refactor] refactor weight trans nz and transpose (#4878) | 2025-12-19 14:27:24 +08:00