EngineX/xc-llm-ascend
268da2896100396766c65ee7b8a0d3df1f1f5ce3
xc-llm-ascend/vllm_ascend
zouyida2002 faf8cd89cb register qwen2_vl to rewrite qwen2_vl forward (#241)
Add qwen2-vl Ascend implementation.

---------
Signed-off-by: zouyida <zouyida@huawei.com>
2025-03-07 15:41:47 +08:00
models
register qwen2_vl to rewrite qwen2_vl forward (#241)
2025-03-07 15:41:47 +08:00
ops
[Fix] Remove npu_group_topk before CANN version update (#242)
2025-03-06 09:02:46 +08:00
quantization
[Feature] Modify description and api for ascend quantization (#243)
2025-03-06 15:17:25 +08:00
worker
[Performance] Change the shape of kv_cache to avoid view of k_cache and v_cache. (#204)
2025-03-05 10:51:07 +08:00
__init__.py
register qwen2_vl to rewrite qwen2_vl forward (#241)
2025-03-07 15:41:47 +08:00
attention.py
[Performance] Change the shape of kv_cache to avoid view of k_cache and v_cache. (#204)
2025-03-05 10:51:07 +08:00
communicator.py
[Dist] Set device as rank (#202)
2025-03-03 09:23:13 +08:00
platform.py
[Core] Support pooling (#229)
2025-03-04 15:59:34 +08:00
utils.py
[Worker] Register mindie_turbo while initializing NPUWorker (#13)
2025-02-07 16:47:17 +08:00