EngineX/xc-llm-ascend
48854aef5cd7aa329a18097a0cb3cb714d9a4584
xc-llm-ascend/vllm_ascend/attention
Feng Liu 1858f3d36e [Bugfix] Fix Qwen P/D Disaggregation accuracy issue (#5340)
### What this PR does / why we need it?
Fix Qwen P/D Disaggregation accuracy issue

- vLLM version: release/v0.13.0
- vLLM main: bc0a5a0c08

Signed-off-by: F.Liu <liufeng248@huawei.com>
Co-authored-by: F.Liu <liufeng248@huawei.com>
2025-12-25 22:46:08 +08:00
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | [Core] Make V1 work and enable V1 engine test (#389) | 2025-03-28 19:34:23 +08:00 |
| `attention_cp.py` | [Bugfix] Fix Qwen P/D Disaggregation accuracy issue (#5340) | 2025-12-25 22:46:08 +08:00 |
| `attention_mask.py` | [Model] Support pooling models (#3122) | 2025-12-10 11:37:57 +08:00 |
| `attention_v1.py` | Revert [KV-Sharing] Support KV-Sharing feature in CLA models (#4138) (#5317) | 2025-12-24 22:24:17 +08:00 |
| `common_cp.py` | [Refactor]5/N Extract common code of mla_v1.py & extract mla_cp (#5097) | 2025-12-24 10:25:19 +08:00 |
| `mla_cp.py` | [Refactor]5/N Extract common code of mla_v1.py & extract mla_cp (#5097) | 2025-12-24 10:25:19 +08:00 |
| `mla_v1.py` | [Refactor]5/N Extract common code of mla_v1.py & extract mla_cp (#5097) | 2025-12-24 10:25:19 +08:00 |
| `sfa_v1.py` | [main][Refactor] Remove with_prefill parameter from set_ascend_forward_context (#5094) | 2025-12-23 14:30:50 +08:00 |
| `utils.py` | [Refactor]5/N Extract common code of mla_v1.py & extract mla_cp (#5097) | 2025-12-24 10:25:19 +08:00 |