EngineX/xc-llm-ascend
f81cf694b2c0e5ec2e5cb3c23e267103884bde1a
xc-llm-ascend/vllm_ascend/worker
weijinqian0 dbe4c338f2 [Refactor] cache cos/sin in mla & remove parameter model in builder. (#5277)
RFC: https://github.com/vllm-project/vllm-ascend/issues/4629

1. Cache cos/sin in mla
2. AttentionBuilder inherits from the original class of vllm.



version: release/v0.13.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
Co-authored-by: weijinqian_v1 <weijinqian@huawei.com>
2025-12-28 10:35:07 +08:00
v2
[Refactor] move the metadata from attention_v1 to util(ready for extract common_cp) & realize Ascendmetadata inherit from the parent class. (#5203)
2025-12-23 00:10:52 +08:00
__init__.py
[Misc][V0 Deprecation] Remove Cache Engine Used for V0 Worker (#1878)
2025-07-19 09:42:32 +08:00
block_table.py
[feature] support pcp + mtp in full graph (#4572)
2025-12-22 16:13:39 +08:00
model_runner_v1.py
[Refactor] cache cos/sin in mla & remove parameter model in builder. (#5277)
2025-12-28 10:35:07 +08:00
npu_input_batch.py
Drop 0.12.0 support (#5146)
2025-12-20 09:38:53 +08:00
worker.py
[refactor] refactor weight trans nz and transpose (#4878)
2025-12-19 14:27:24 +08:00