Initialize project; model provided by the ModelHub XC community

Model: Lyte/Gemma-3-1B-Moroccan-Instruct
Source: Original Platform
ModelHub XC
2026-05-27 09:56:17 +08:00
commit 47eff3e7d7
16 changed files with 51659 additions and 0 deletions

38
.gitattributes vendored Normal file

@@ -0,0 +1,38 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
gemma-3-1b-moroccan-instruct-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
gemma-3-1b-moroccan-instruct-q8_0.gguf filter=lfs diff=lfs merge=lfs -text

98
README.md Normal file

@@ -0,0 +1,98 @@
---
base_model: unsloth/gemma-3-1b-it-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- gemma3_text
- gguf
license: apache-2.0
language:
- ary
datasets:
- Lyte/Moroccan-QA-Extended
pipeline_tag: text-generation
library_name: unsloth
---
# Gemma-3-1B Moroccan Instruct (test finetune)
- **Developed by:** Lyte
- **License:** Apache-2.0
- **Base model:** `unsloth/gemma-3-1b-it-unsloth-bnb-4bit`
- **Dataset:** `Lyte/Moroccan-QA-Extended` (extended with English questions paired with Moroccan Darija answers)
- **Language:** Moroccan Arabic (Darija)
## How to use in LM Studio
You can easily run this model in LM Studio using the preset configuration. Click the badge below to open the model directly in LM Studio:
[<img src="https://pbs.twimg.com/profile_images/1755060270173429760/4WVc54_p_400x400.jpg" alt="Open in LM Studio" width="32"/>](https://lmstudio.ai/lyte/gemma-3-moroccan)
### GGUF Quants:
- **Q8_0:** [gemma-3-1b-moroccan-instruct-q8_0.gguf](https://huggingface.co/Lyte/Gemma-3-1B-Moroccan-Instruct/resolve/main/gemma-3-1b-moroccan-instruct-q8_0.gguf?download=true)
- **Q4_K_M:** [gemma-3-1b-moroccan-instruct-q4_k_m.gguf](https://huggingface.co/Lyte/Gemma-3-1B-Moroccan-Instruct/resolve/main/gemma-3-1b-moroccan-instruct-q4_k_m.gguf?download=true)
## Inference Example
The examples below show the model's output in LM Studio: a reply to a question asked in English, and an explanation of how gravity works in response to a question asked in Moroccan Darija.
### Q: what is the capital of France?
![Inference Example 1](https://huggingface.co/Lyte/Gemma-3-1B-Moroccan-Instruct/resolve/main/capital.png)
### Q: شرح ليا كيفاش الجادبية كتخدم؟
![Inference Example 2](https://huggingface.co/Lyte/Gemma-3-1B-Moroccan-Instruct/resolve/main/gravity.png)
### Inference Settings:
![Inference Settings](https://huggingface.co/Lyte/Gemma-3-1B-Moroccan-Instruct/resolve/main/sampling.png)
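When the model is run outside LM Studio, the prompt has to follow the turn format produced by the repository's `chat_template.jinja`. The following is a minimal Python sketch of that format for plain-string user/assistant messages only. The function name `build_gemma_prompt` is hypothetical, system messages and image content are deliberately left out, and the `<bos>` default is taken from `special_tokens_map.json`. Some runtimes add the BOS token themselves, in which case passing `bos_token=""` avoids duplicating it.

```python
def build_gemma_prompt(messages, add_generation_prompt=True, bos_token="<bos>"):
    """Render string-only chat messages in the Gemma turn format (sketch)."""
    parts = [bos_token]
    for index, message in enumerate(messages):
        # The template requires strict alternation, starting with a user turn.
        if (message["role"] == "user") != (index % 2 == 0):
            raise ValueError(
                "Conversation roles must alternate user/assistant/user/assistant/..."
            )
        # The template renames the "assistant" role to "model".
        role = "model" if message["role"] == "assistant" else message["role"]
        content = message["content"].strip()
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    if add_generation_prompt:
        # Open a model turn so that generation continues from here.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


prompt = build_gemma_prompt(
    [{"role": "user", "content": "what is the capital of France?"}]
)
print(prompt)
```

Generation should stop at `<end_of_turn>`, which `config.json` lists as the end-of-sequence token (id 106).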
---
## Training Details
- **Max Length:** 1024 tokens
- **Epochs:** 3
- **Total Steps:** 843
- **Batch size:** 2 (per device)
- **Gradient Accumulation:** 4 (Total effective batch size: 16)
- **Learning rate:** 2e-4
- **Optimizer:** 8-bit AdamW
- **Scheduler:** Linear
- **Weight decay:** 0.01
- **Seed:** 3407
- **Num of Examples:** 4,495
- **Trainable Parameters:** 52.18M (4.96%)
- **Training Time:** ~1 hour on a single GPU.
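
The reported step count can be cross-checked against the other figures in this list. The sketch below takes the stated effective batch size of 16 as given and assumes that the final partial batch of each epoch counts as one optimizer step. A per-device batch of 2 with 4 accumulation steps would by itself give 8, so the remaining factor of 2 is not documented here.

```python
import math

num_examples = 4495
effective_batch_size = 16  # as reported above
epochs = 3

# ASSUMPTION: the last, partially filled batch of each epoch still counts as a step.
steps_per_epoch = math.ceil(num_examples / effective_batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)  # 281 843
```

An effective batch size of 16 reproduces the 843 total steps, whereas 8 would give 1,686.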
This was the **first test finetune run**, not a final production model. Training was done using **Unsloth** for speedup and Hugging Face TRL for supervised finetuning.
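
The trainable-parameter figure can be sanity-checked against the dimensions in the repository's `config.json`. The sketch below assumes a LoRA-style low-rank adapter on all seven attention and MLP projection matrices in each of the 26 layers. Neither the adapter type, the target modules, nor the rank is stated in this README, so `rank = 64` is a hypothetical value chosen only because it happens to reproduce the reported number.

```python
# Dimensions taken from config.json.
hidden_size = 1152
intermediate_size = 6912
head_dim = 256
num_attention_heads = 4
num_key_value_heads = 1
num_hidden_layers = 26

# ASSUMPTION: adapter rank and target modules are not documented; hypothetical values.
rank = 64

q_dim = num_attention_heads * head_dim    # 1024
kv_dim = num_key_value_heads * head_dim   # 256

# (in_features, out_features) of each adapted projection within one layer.
projections = {
    "q_proj": (hidden_size, q_dim),
    "k_proj": (hidden_size, kv_dim),
    "v_proj": (hidden_size, kv_dim),
    "o_proj": (q_dim, hidden_size),
    "gate_proj": (hidden_size, intermediate_size),
    "up_proj": (hidden_size, intermediate_size),
    "down_proj": (intermediate_size, hidden_size),
}

# A rank-r adapter adds two matrices per projection: r * in_features + r * out_features.
per_layer = sum(rank * (n_in + n_out) for n_in, n_out in projections.values())
trainable = per_layer * num_hidden_layers

print(f"{trainable:,} trainable parameters ({trainable / 1e6:.2f}M)")
```

Under these assumptions the count comes to 52,183,040, which rounds to the 52.18M listed above. This agreement does not confirm the actual training configuration.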
---
## Results
- **Training Loss:** From **2.171600** to **0.9392** (at final step 843)
- **Evaluation Loss:** From **2.198849** to **1.5074** (at the last evaluation, step 800)
Training converged without issues. The loss metrics show expected early-stage improvement, but this checkpoint is **experimental** and requires further tuning and validation before use.
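
For a more intuitive reading, the evaluation losses can be converted to perplexity. The sketch below assumes the reported values are mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated explicitly here.

```python
import math

eval_loss_start = 2.198849
eval_loss_end = 1.5074  # last evaluation, step 800

# ASSUMPTION: losses are mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
ppl_start = math.exp(eval_loss_start)
ppl_end = math.exp(eval_loss_end)

print(f"eval perplexity: {ppl_start:.2f} -> {ppl_end:.2f}")
```

Under that assumption, evaluation perplexity drops from roughly 9 to roughly 4.5 over the run.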
---
## Limitations
- Experimental model — not yet optimized or fully Moroccan-Darija-aligned.
- Performance outside Moroccan Arabic QA tasks may be limited.
- Further finetuning and evaluation are needed before production use.
## Uploaded finetuned model
- **Developed by:** Lyte
- **License:** apache-2.0
- **Finetuned from model:** unsloth/gemma-3-1b-it-unsloth-bnb-4bit
This gemma3_text model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

3
added_tokens.json Normal file

@@ -0,0 +1,3 @@
{
"<image_soft_token>": 262144
}

BIN
capital.png Normal file

Binary file not shown. Size: 15 KiB

47
chat_template.jinja Normal file

@@ -0,0 +1,47 @@
{{ bos_token }}
{%- if messages[0]['role'] == 'system' -%}
{%- if messages[0]['content'] is string -%}
{%- set first_user_prefix = messages[0]['content'] + '
' -%}
{%- else -%}
{%- set first_user_prefix = messages[0]['content'][0]['text'] + '
' -%}
{%- endif -%}
{%- set loop_messages = messages[1:] -%}
{%- else -%}
{%- set first_user_prefix = "" -%}
{%- set loop_messages = messages -%}
{%- endif -%}
{%- for message in loop_messages -%}
{%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
{{ raise_exception("Conversation roles must alternate user/assistant/user/assistant/...") }}
{%- endif -%}
{%- if (message['role'] == 'assistant') -%}
{%- set role = "model" -%}
{%- else -%}
{%- set role = message['role'] -%}
{%- endif -%}
{{ '<start_of_turn>' + role + '
' + (first_user_prefix if loop.first else "") }}
{%- if message['content'] is string -%}
{{ message['content'] | trim }}
{%- elif message['content'] is iterable -%}
{%- for item in message['content'] -%}
{%- if item['type'] == 'image' -%}
{{ '<start_of_image>' }}
{%- elif item['type'] == 'text' -%}
{{ item['text'] | trim }}
{%- endif -%}
{%- endfor -%}
{%- else -%}
{{ raise_exception("Invalid content type") }}
{%- endif -%}
{{ '<end_of_turn>
' }}
{%- endfor -%}
{%- if add_generation_prompt -%}
{{ '<start_of_turn>model
' }}
{%- endif -%}

64
config.json Normal file

@@ -0,0 +1,64 @@
{
"_sliding_window_pattern": 6,
"architectures": [
"Gemma3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"attn_logit_softcapping": null,
"bos_token_id": 2,
"cache_implementation": "hybrid",
"eos_token_id": 106,
"final_logit_softcapping": null,
"head_dim": 256,
"hidden_activation": "gelu_pytorch_tanh",
"hidden_size": 1152,
"initializer_range": 0.02,
"intermediate_size": 6912,
"layer_types": [
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"full_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"full_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"full_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"full_attention",
"sliding_attention",
"sliding_attention"
],
"max_position_embeddings": 32768,
"model_type": "gemma3_text",
"num_attention_heads": 4,
"num_hidden_layers": 26,
"num_key_value_heads": 1,
"pad_token_id": 0,
"query_pre_attn_scalar": 256,
"rms_norm_eps": 1e-06,
"rope_local_base_freq": 10000,
"rope_scaling": null,
"rope_theta": 1000000,
"sliding_window": 512,
"torch_dtype": "float16",
"transformers_version": "4.55.4",
"unsloth_fixed": true,
"unsloth_version": "2025.9.4",
"use_cache": true,
"vocab_size": 262144
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dec8d3b386022f03e7ca7ede161de0fe45cb0287afab01fcb92b1b9a8125b0fe
size 806057152

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:26a2ab907f9424afd3aa8ba4a23b9debba0dfca38c09aea9f3a54061986b0f1c
size 1069305280

14
generation_config.json Normal file

@@ -0,0 +1,14 @@
{
"bos_token_id": 2,
"cache_implementation": "hybrid",
"do_sample": true,
"eos_token_id": [
1,
106
],
"max_length": 32768,
"pad_token_id": 0,
"top_k": 64,
"top_p": 0.95,
"transformers_version": "4.55.4"
}

BIN
gravity.png Normal file

Binary file not shown. Size: 20 KiB

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ed7a6e0f9de4499b7064b2fe7ab9d8d710b58cbdb6e99244641ca6244519e040
size 1999810864

BIN
sampling.png Normal file

Binary file not shown. Size: 17 KiB

33
special_tokens_map.json Normal file

@@ -0,0 +1,33 @@
{
"boi_token": "<start_of_image>",
"bos_token": {
"content": "<bos>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eoi_token": "<end_of_image>",
"eos_token": {
"content": "<end_of_turn>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"image_token": "<image_soft_token>",
"pad_token": {
"content": "<pad>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

3
tokenizer.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
size 33384568

3
tokenizer.model Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1299c11d7cf632ef3b4e11937501358ada021bbdf7c47638d13c0ee982f2e79c
size 4689074

51347
tokenizer_config.json Normal file

File diff suppressed because it is too large.