model : add grok-2 support (#15539)

EngineX-Ascend/enginex-ascend-910-llama.cpp

* add grok-2 support

* type fix

* type fix

* type fix

* "fix" vocab for invalid sequences

* fix expert tensor mapping and spaces in vocab

* add chat template

* fix norm tensor mapping

* rename layer_out_norm to ffn_post_norm

* ensure ffn_post_norm is mapped

* fix experts merging

* remove erroneous FFN_GATE entry

* concatenate split tensors and add more metadata

* process all expert layers and try cat instead of hstack

* add support for community BPE vocab

* fix expert feed forward length and ffn_down concat

* commit this too

* add ffn_up/gate/down, unsure if sequence is right

* add ffn_gate/down/up to tensor names

* correct residual moe (still not working)

* mess--

* fix embedding scale being applied twice

* add built in chat template

* change beta fast for grok if default value

* remove spm vocab in favor of community bpe vocab

* change attention temp length metadata type to integer

* update attention temp length metadata

* remove comment

* replace M_SQRT2 with std::sqrt(2)

* add yarn metadata, move defaults to hparams
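
A note on the `M_SQRT2` item above: `M_SQRT2` is a POSIX extension to `<math.h>` and not part of standard C++, and MSVC only exposes it when `_USE_MATH_DEFINES` is defined before the include. Computing the value with `std::sqrt` avoids that dependency. The sketch below only shows the portable form; the helper name `sqrt2_portable` is hypothetical and is not taken from the commit.

```cpp
#include <cmath>

// Hypothetical helper (not from the commit): returns sqrt(2) without
// relying on the non-standard M_SQRT2 macro.
inline float sqrt2_portable() {
    return std::sqrt(2.0f);
}
```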

Sigbjørn Skjæret

2025-09-14 23:00:59 +02:00

committed by GitHub

parent 6c019cb04e

commit b8e09f08b9

16 changed files with 281 additions and 96 deletions

									
src/llama-chat.h

@@ -50,6 +50,7 @@ enum llm_chat_template {
     LLM_CHAT_TEMPLATE_HUNYUAN_DENSE,
     LLM_CHAT_TEMPLATE_KIMI_K2,
     LLM_CHAT_TEMPLATE_SEED_OSS,
+    LLM_CHAT_TEMPLATE_GROK_2,
     LLM_CHAT_TEMPLATE_UNKNOWN,
 };
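
The new enumerator is only useful once a template name can be resolved to it. The following is a minimal, self-contained sketch of that kind of lookup, not the code from this commit: the enum is trimmed to the entries visible in the hunk, and the function `chat_template_from_name` and the name strings (including `"grok-2"`) are assumptions made for illustration.

```cpp
#include <map>
#include <string>

// Trimmed copy of the enum: only the entries visible in the hunk above.
enum llm_chat_template {
    LLM_CHAT_TEMPLATE_HUNYUAN_DENSE,
    LLM_CHAT_TEMPLATE_KIMI_K2,
    LLM_CHAT_TEMPLATE_SEED_OSS,
    LLM_CHAT_TEMPLATE_GROK_2,
    LLM_CHAT_TEMPLATE_UNKNOWN,
};

// Hypothetical lookup helper; the name strings are illustrative assumptions.
// Unrecognized names fall back to LLM_CHAT_TEMPLATE_UNKNOWN.
inline llm_chat_template chat_template_from_name(const std::string & name) {
    static const std::map<std::string, llm_chat_template> table = {
        { "hunyuan-dense", LLM_CHAT_TEMPLATE_HUNYUAN_DENSE },
        { "kimi-k2",       LLM_CHAT_TEMPLATE_KIMI_K2       },
        { "seed_oss",      LLM_CHAT_TEMPLATE_SEED_OSS      },
        { "grok-2",        LLM_CHAT_TEMPLATE_GROK_2        },
    };
    const auto it = table.find(name);
    return it == table.end() ? LLM_CHAT_TEMPLATE_UNKNOWN : it->second;
}
```

Placing `LLM_CHAT_TEMPLATE_GROK_2` before `LLM_CHAT_TEMPLATE_UNKNOWN` keeps the sentinel as the last enumerator, which is how the hunk orders it.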
