98 lines
3.4 KiB
Markdown
98 lines
3.4 KiB
Markdown
|
|
---
|
||
|
|
base_model: unsloth/gemma-3-1b-it-unsloth-bnb-4bit
|
||
|
|
tags:
|
||
|
|
- text-generation-inference
|
||
|
|
- transformers
|
||
|
|
- unsloth
|
||
|
|
- gemma3_text
|
||
|
|
- gguf
|
||
|
|
license: apache-2.0
|
||
|
|
language:
|
||
|
|
- ary
|
||
|
|
datasets:
|
||
|
|
- Lyte/Moroccan-QA-Extended
|
||
|
|
pipeline_tag: text-generation
|
||
|
|
library_name: unsloth
|
||
|
|
---
|
||
|
|
|
||
|
|
# Gemma-3-1B Moroccan Instruct (test finetune)
|
||
|
|
|
||
|
|
- **Developed by:** Lyte
|
||
|
|
- **License:** Apache-2.0
|
||
|
|
- **Base model:** `unsloth/gemma-3-1b-it-unsloth-bnb-4bit`
|
||
|
|
- **Dataset:** `Lyte/Moroccan-QA-Extended` (with additional English Questions -> Moroccan Darija Answers)
|
||
|
|
- **Language:** Moroccan Arabic (Darija)
|
||
|
|
|
||
|
|
## How to use in LM Studio
|
||
|
|
|
||
|
|
You can easily run this model in LM Studio using the preset configuration. Click the badge below to open the model directly in LM Studio:
|
||
|
|
|
||
|
|
[<img src="https://pbs.twimg.com/profile_images/1755060270173429760/4WVc54_p_400x400.jpg" alt="Open in LM Studio" width="32"/>](https://lmstudio.ai/lyte/gemma-3-moroccan)
|
||
|
|
|
||
|
|
### GGUF Quants:
|
||
|
|
|
||
|
|
- **Q8_0:** [gemma-3-1b-moroccan-instruct-q8_0.gguf](https://huggingface.co/Lyte/Gemma-3-1B-Moroccan-Instruct/resolve/main/gemma-3-1b-moroccan-instruct-q8_0.gguf?download=true)
|
||
|
|
- **Q4_K_M:** [gemma-3-1b-moroccan-instruct-q4_k_m.gguf](https://huggingface.co/Lyte/Gemma-3-1B-Moroccan-Instruct/resolve/main/gemma-3-1b-moroccan-instruct-q4_k_m.gguf?download=true)
|
||
|
|
|
||
|
|
|
||
|
|
## Inference Example
|
||
|
|
|
||
|
|
Here is an example of the model's output in LM Studio, answering a question about Newton's law of universal gravitation in Moroccan Darija.
|
||
|
|
|
||
|
|
### Q: what is the capital of France?
|
||
|
|

|
||
|
|
|
||
|
|
### Q: شرح ليا كيفاش الجادبية كتخدم؟
|
||
|
|

|
||
|
|
|
||
|
|
### Inference Settings:
|
||
|
|
|
||
|
|

|
||
|
|
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Training Details
|
||
|
|
|
||
|
|
- **Max Length:** 1024 tokens
|
||
|
|
- **Epochs:** 3
|
||
|
|
- **Total Steps:** 843
|
||
|
|
- **Batch size:** 2 (per device)
|
||
|
|
- **Gradient Accumulation:** 4 (Total effective batch size: 16)
|
||
|
|
- **Learning rate:** 2e-4
|
||
|
|
- **Optimizer:** 8-bit AdamW
|
||
|
|
- **Scheduler:** Linear
|
||
|
|
- **Weight decay:** 0.01
|
||
|
|
- **Seed:** 3407
|
||
|
|
- **Num of Examples:** 4,495
|
||
|
|
- **Trainable Parameters:** 52.18M (4.96%)
|
||
|
|
- **Training Time:** ~1 hour on a single GPU.
|
||
|
|
|
||
|
|
This was the **first test finetune run**, not a final production model. Training was done using **Unsloth** for speedup and Hugging Face TRL for supervised finetuning.
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Results
|
||
|
|
|
||
|
|
- **Training Loss:** From **2.171600** to **0.9392** (at final step 843)
|
||
|
|
- **Evaluation Loss:** From **2.198849** to **1.5074** (at final step 800)
|
||
|
|
|
||
|
|
Training converged without issues. The loss metrics show expected early-stage improvement, but this checkpoint is **experimental** and requires further tuning and validation before use.
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Limitations
|
||
|
|
|
||
|
|
- Experimental model — not yet optimized or fully Moroccan-Darija-aligned.
|
||
|
|
- Performance outside Moroccan Arabic QA tasks may be limited.
|
||
|
|
- Further finetuning and evaluation are needed before production use.
|
||
|
|
|
||
|
|
## Uploaded finetuned model
|
||
|
|
|
||
|
|
- **Developed by:** Lyte
|
||
|
|
- **License:** apache-2.0
|
||
|
|
- **Finetuned from model :** unsloth/gemma-3-1b-it-unsloth-bnb-4bit
|
||
|
|
|
||
|
|
This gemma3_text model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
|
||
|
|
|
||
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
|