初始化项目,由ModelHub XC社区提供模型
Model: Lyte/Gemma-3-1B-Moroccan-Instruct Source: Original Platform
This commit is contained in:
98
README.md
Normal file
98
README.md
Normal file
@@ -0,0 +1,98 @@
|
||||
---
|
||||
base_model: unsloth/gemma-3-1b-it-unsloth-bnb-4bit
|
||||
tags:
|
||||
- text-generation-inference
|
||||
- transformers
|
||||
- unsloth
|
||||
- gemma3_text
|
||||
- gguf
|
||||
license: apache-2.0
|
||||
language:
|
||||
- ary
|
||||
datasets:
|
||||
- Lyte/Moroccan-QA-Extended
|
||||
pipeline_tag: text-generation
|
||||
library_name: unsloth
|
||||
---
|
||||
|
||||
# Gemma-3-1B Moroccan Instruct (test finetune)
|
||||
|
||||
- **Developed by:** Lyte
|
||||
- **License:** Apache-2.0
|
||||
- **Base model:** `unsloth/gemma-3-1b-it-unsloth-bnb-4bit`
|
||||
- **Dataset:** `Lyte/Moroccan-QA-Extended` (with additional English Questions -> Moroccan Darija Answers)
|
||||
- **Language:** Moroccan Arabic (Darija)
|
||||
|
||||
## How to use in LM Studio
|
||||
|
||||
You can easily run this model in LM Studio using the preset configuration. Click the badge below to open the model directly in LM Studio:
|
||||
|
||||
[<img src="https://pbs.twimg.com/profile_images/1755060270173429760/4WVc54_p_400x400.jpg" alt="Open in LM Studio" width="32"/>](https://lmstudio.ai/lyte/gemma-3-moroccan)
|
||||
|
||||
### GGUF Quants:
|
||||
|
||||
- **Q8_0:** [gemma-3-1b-moroccan-instruct-q8_0.gguf](https://huggingface.co/Lyte/Gemma-3-1B-Moroccan-Instruct/resolve/main/gemma-3-1b-moroccan-instruct-q8_0.gguf?download=true)
|
||||
- **Q4_K_M:** [gemma-3-1b-moroccan-instruct-q4_k_m.gguf](https://huggingface.co/Lyte/Gemma-3-1B-Moroccan-Instruct/resolve/main/gemma-3-1b-moroccan-instruct-q4_k_m.gguf?download=true)
|
||||
|
||||
|
||||
## Inference Example
|
||||
|
||||
Here is an example of the model's output in LM Studio, answering a question about Newton's law of universal gravitation in Moroccan Darija.
|
||||
|
||||
### Q: what is the capital of France?
|
||||

|
||||
|
||||
### Q: شرح ليا كيفاش الجادبية كتخدم؟
|
||||

|
||||
|
||||
### Inference Settings:
|
||||
|
||||

|
||||
|
||||
|
||||
---
|
||||
|
||||
## Training Details
|
||||
|
||||
- **Max Length:** 1024 tokens
|
||||
- **Epochs:** 3
|
||||
- **Total Steps:** 843
|
||||
- **Batch size:** 2 (per device)
|
||||
- **Gradient Accumulation:** 4 (Total effective batch size: 16)
|
||||
- **Learning rate:** 2e-4
|
||||
- **Optimizer:** 8-bit AdamW
|
||||
- **Scheduler:** Linear
|
||||
- **Weight decay:** 0.01
|
||||
- **Seed:** 3407
|
||||
- **Num of Examples:** 4,495
|
||||
- **Trainable Parameters:** 52.18M (4.96%)
|
||||
- **Training Time:** ~1 hour on a single GPU.
|
||||
|
||||
This was the **first test finetune run**, not a final production model. Training was done using **Unsloth** for speedup and Hugging Face TRL for supervised finetuning.
|
||||
|
||||
---
|
||||
|
||||
## Results
|
||||
|
||||
- **Training Loss:** From **2.171600** to **0.9392** (at final step 843)
|
||||
- **Evaluation Loss:** From **2.198849** to **1.5074** (at final step 800)
|
||||
|
||||
Training converged without issues. The loss metrics show expected early-stage improvement, but this checkpoint is **experimental** and requires further tuning and validation before use.
|
||||
|
||||
---
|
||||
|
||||
## Limitations
|
||||
|
||||
- Experimental model — not yet optimized or fully Moroccan-Darija-aligned.
|
||||
- Performance outside Moroccan Arabic QA tasks may be limited.
|
||||
- Further finetuning and evaluation are needed before production use.
|
||||
|
||||
## Uploaded finetuned model
|
||||
|
||||
- **Developed by:** Lyte
|
||||
- **License:** apache-2.0
|
||||
- **Finetuned from model :** unsloth/gemma-3-1b-it-unsloth-bnb-4bit
|
||||
|
||||
This gemma3_text model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
|
||||
|
||||
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
|
||||
Reference in New Issue
Block a user