EngineX/xc-llm-ascend
98798d80a0e382e23b11bd67baae35d21c8f88c5
xc-llm-ascend/vllm_ascend/ops
zzzzwwjj 71f729a661 Revert "moe_gating_top_k" (#5512)
Reverts vllm-project/vllm-ascend#5271

It breaks the e2e test.

- vLLM version: v0.13.0
- vLLM main: 45c1ca1ca1
2025-12-30 15:05:47 +08:00
Name | Last commit | Date
fused_moe | Revert "moe_gating_top_k" (#5512) | 2025-12-30 15:05:47 +08:00
triton | [Refactor][Triton] Move reject sample triton kernels into ops/triton (#5324) | 2025-12-29 16:15:41 +08:00
__init__.py | … |
activation.py | … |
layernorm.py | … |
linear_op.py | Remove VLLM_ASCEND_ENABLE_DENSE_OPTIMIZE (#5272) | 2025-12-25 11:09:56 +08:00
linear.py | [bugfix] fix Error 'ValueError: Duplicate layer name' (#5280) | 2025-12-25 10:43:24 +08:00
mla.py | … |
mm_encoder_attention.py | [Bugfix] Correctly handle the output shape in multimodal attention (#5443) | 2025-12-27 18:42:46 +08:00
register_custom_ops.py | [Feature] support eager mode in model runner v2 (#5210) | 2025-12-29 15:28:34 +08:00
rotary_embedding.py | [Refactor] cache cos/sin in mla & remove parameter model in builder. (#5277) | 2025-12-28 10:35:07 +08:00
shared_weight_layer.py | … |
vocab_parallel_embedding.py | … |
weight_prefetch.py | … |