EngineX/xc-llm-ascend
xc-llm-ascend/tests/ut/attention
Commit: dbe4c338f2fac797bba8d03352f13f4af7da2aa6
Latest commit: weijinqian0 dbe4c338f2 [Refactor] cache cos/sin in mla & remove parameter model in builder. (#5277)
RFC: https://github.com/vllm-project/vllm-ascend/issues/4629

1. Cache cos/sin in MLA (a sketch of the idea follows this commit message).
2. AttentionBuilder now inherits from the original vLLM class.

version: release/v0.13.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
Co-authored-by: weijinqian_v1 <weijinqian@huawei.com>
2025-12-28 10:35:07 +08:00
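
The cos/sin caching in item 1 is the usual rotary-embedding optimization: compute the cos/sin tables once at init time and gather rows by position afterwards, instead of recomputing the trig functions on every forward pass. Below is a minimal PyTorch sketch of that idea; the class and method names are hypothetical, not the actual vllm-ascend API.

```python
import torch


class CachedRotaryEmbedding:
    """Illustrative sketch only: precompute rotary cos/sin tables once."""

    def __init__(self, head_dim: int, max_position: int, base: float = 10000.0):
        # Inverse frequencies, one per pair of rotary dimensions.
        inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
        # One row of angles per position: [max_position, head_dim // 2].
        freqs = torch.outer(torch.arange(max_position).float(), inv_freq)
        # Cache both tables once; forward() is then a cheap gather.
        self.cos_cached = freqs.cos()
        self.sin_cached = freqs.sin()

    def forward(self, positions: torch.Tensor):
        # Index the cached tables instead of recomputing cos/sin per step.
        return self.cos_cached[positions], self.sin_cached[positions]
```

On each decode step the caller would then do `cos, sin = rope.forward(position_ids)` and apply the rotation, with no per-step trigonometry.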
Files:
test_attention_cp.py
    [Perf] vectorize PCP/DCP loops in attention_cp.py (#4944)
    2025-12-22 11:06:19 +08:00
    (see the vectorization sketch after this listing)
test_attention_mask.py
    [Refactor] 2/N Unify all mask generation methods and cache mask (#4779)
    2025-12-09 18:51:00 +08:00
    (see the mask-cache sketch after this listing)
test_attention_v1.py
    [Refactor] move the metadata from attention_v1 to util (ready for extracting common_cp) & make AscendMetadata inherit from the parent class (#5203)
    2025-12-23 00:10:52 +08:00
    (see the metadata-inheritance sketch after this listing)
test_mla_cp.py
    Revert "MLA prefill performance optimization (#5275)" (#5410)
    2025-12-27 09:48:56 +08:00
test_mla_v1.py
    [Refactor] cache cos/sin in mla & remove parameter model in builder. (#5277)
    2025-12-28 10:35:07 +08:00
test_sfa_v1.py
    [Refactor] cache cos/sin in mla & remove parameter model in builder. (#5277)
    2025-12-28 10:35:07 +08:00
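
For the "[Perf] vectorize PCP/DCP loops" commit tested by test_attention_cp.py, the general pattern is replacing a per-request Python loop with batched tensor operations. A hedged, illustrative example of that pattern (not the actual attention_cp.py code): computing per-token positions from sequence lengths.

```python
import torch

seq_lens = torch.tensor([3, 1, 4])

# Loop form: one Python iteration per request.
pos_loop = torch.cat([torch.arange(n) for n in seq_lens.tolist()])

# Vectorized form: a single batched expression, no Python-level loop.
starts = torch.repeat_interleave(
    torch.cat([torch.zeros(1, dtype=torch.long), seq_lens.cumsum(0)[:-1]]),
    seq_lens,
)
pos_vec = torch.arange(int(seq_lens.sum())) - starts

assert torch.equal(pos_loop, pos_vec)  # both yield [0,1,2, 0, 0,1,2,3]
```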
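
The "Unify all mask generation methods and cache mask" commit behind test_attention_mask.py points at another common caching pattern: build one large causal mask up front and hand out slices, rather than regenerating a mask per batch. A minimal sketch under that assumption; the names are illustrative, not the vllm-ascend implementation.

```python
import torch


class AttentionMaskCache:
    """Illustrative sketch: one cached causal mask, sliced per request."""

    def __init__(self, max_seq_len: int, dtype=torch.float16):
        # Upper-triangular -inf mask, built exactly once.
        mask = torch.full((max_seq_len, max_seq_len), float("-inf"), dtype=dtype)
        self._mask = mask.triu_(diagonal=1)

    def get(self, seq_len: int) -> torch.Tensor:
        # Every caller just views a corner of the cached tensor.
        return self._mask[:seq_len, :seq_len]
```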
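
The test_attention_v1.py commit describes moving shared metadata into a util module and making AscendMetadata inherit from a common parent class. A hypothetical dataclass illustration of that direction; the field names are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional

import torch


@dataclass
class CommonAttentionMetadata:
    # Fields shared across backends live on the parent (in a util module).
    seq_lens: torch.Tensor
    slot_mapping: torch.Tensor


@dataclass
class AscendMetadata(CommonAttentionMetadata):
    # Ascend-specific extras are declared only on the subclass.
    attn_mask: Optional[torch.Tensor] = None
```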