EngineX/xc-llm-kunlun
Files: xc-llm-kunlun/vllm_kunlun/ops
Commit: b015bb76fd41238753c4c354d9b4dbe8dd58e401
Latest commit: hanhaowen, b015bb76fd, "remove qwen2.py llama.py fix llama output" (2025-12-31 11:39:37 +08:00)
Name                          Last commit message                                     Last commit date
attention                     Commit vllm 0.11.0 development branch                   2025-12-10 17:51:24 +08:00
fla                           Merge pull request #40 from ldh2020/v0.11.0dev          2025-12-22 21:50:27 +08:00
fused_moe                     [Feature] Support XiaoMi MIMO Flash V2 (#62)            2025-12-31 10:16:33 +08:00
mamba                         [Kernel] Optimize the performance of causal_conv1d.     2025-12-12 17:22:35 +08:00
quantization                  [dev] support AWQ/GPTQ quantization for dense models    2025-12-24 13:46:06 +08:00
sample                        Commit vllm 0.11.0 development branch                   2025-12-10 17:51:24 +08:00
__init__.py                   remove qwen2.py llama.py fix llama output               2025-12-31 11:39:37 +08:00
_kunlun_ops.py                remove qwen2.py llama.py fix llama output               2025-12-31 11:39:37 +08:00
activation.py                 remove qwen2.py llama.py fix llama output               2025-12-31 11:39:37 +08:00
layernorm.py                  [Feature] Support XiaoMi MIMO Flash V2 (#62)            2025-12-31 10:16:33 +08:00
linear.py                     [Feature] Support XiaoMi MIMO Flash V2 (#62)            2025-12-31 10:16:33 +08:00
paged_attn.py                 [Feature] Support XiaoMi MIMO Flash V2 (#62)            2025-12-31 10:16:33 +08:00
rotary_embedding.py           [Feature] Support XiaoMi MIMO Flash V2 (#62)            2025-12-31 10:16:33 +08:00
vocab_parallel_embedding.py   [Feature] Support XiaoMi MIMO Flash V2 (#62)            2025-12-31 10:16:33 +08:00