EngineX/xc-llm-ascend
007aeaa48b57c6cde09025c67cf4ce9aab4e0712
xc-llm-ascend/vllm_ascend
zouyida2002 faf8cd89cb register qwen2_vl to rewrite qwen2_vl forward (#241)
Add qwen2-vl Ascend implementation.

---------
Signed-off-by: zouyida <zouyida@huawei.com>
2025-03-07 15:41:47 +08:00
models
register qwen2_vl to rewrite qwen2_vl forward (#241)
2025-03-07 15:41:47 +08:00
ops
[Fix] Remove npu_group_topk before CANN version update (#242)
2025-03-06 09:02:46 +08:00
quantization
[Feature] Modify description and api for ascend quantization (#243)
2025-03-06 15:17:25 +08:00
worker
[Performance] Change the shape of kv_cache to avoid view of k_cache and v_cache. (#204)
2025-03-05 10:51:07 +08:00
__init__.py
register qwen2_vl to rewrite qwen2_vl forward (#241)
2025-03-07 15:41:47 +08:00
attention.py
[Performance] Change the shape of kv_cache to avoid view of k_cache and v_cache. (#204)
2025-03-05 10:51:07 +08:00
communicator.py
[Dist] Set device as rank (#202)
2025-03-03 09:23:13 +08:00
platform.py
[Core] Support pooling (#229)
2025-03-04 15:59:34 +08:00
utils.py
[Worker] Register mindie_turbo while initializing NPUWorker (#13)
2025-02-07 16:47:17 +08:00