commit 1466621e73
Author: Xuan-Son Nguyen
Date:   2025-04-07 23:06:44 +02:00

    llama : Support llama 4 text-only (#12791)

    * llama4 conversion
    * initial support, no chat template
    * clean up a bit
    * fix tokenizer conversion
    * correct hparams
    * try this
    * fix shexp
    * ffn_inp_normed
    * chat template
    * clean up model conversion
    * add_bos
    * add scale_before_ffn
    * fix order
    * weight_before_ffn
    * llm_graph_input_attn_temp
    * add chunk attn mask
    * build_inp_attn_scale()
    * add comment about ggml_repeat
    * clarify comments
    * fix build