---
license: agpl-3.0
datasets:
- soynade-research/FineWeb2-HQ-50k-Wolof
language:
- wo
- en
- fr
base_model:
- google/gemma-3-270m-it
pipeline_tag: text-generation
---

# Oolel-lit-gemma

Oolel-lit-gemma is a fine-tuned version of [Gemma-3-270m-it](https://huggingface.co/google/gemma-3-270m-it) for the Wolof language. It is part of our Oolel family of compact, on-device Wolof language models.

The model was trained using supervised fine-tuning (SFT) on synthetic data distilled from our larger **Oolel-7B** models via [Oolel-translator](https://github.com/soynade-research/oolel-translator).
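
As a rough illustration of that distillation step, the sketch below prompts a larger teacher model for Wolof translations and collects the outputs as SFT pairs. This is a minimal sketch of ours, not the actual pipeline: the teacher model ID, prompt format, and pair schema are assumptions, and the real code lives in the Oolel-translator repository.

```python
# Hypothetical sketch of SFT data distillation; the teacher model ID,
# prompt format, and output schema are assumptions, not the actual
# Oolel-translator pipeline.
from transformers import pipeline

teacher = pipeline(
    "text-generation",
    model="soynade-research/Oolel-7B",  # assumed teacher checkpoint name
    device="cuda",
)

english_sentences = [
    "The president is 45 years old.",
    "The market opens early in the morning.",
]

sft_pairs = []
for sentence in english_sentences:
    prompt = [{"role": "user", "content": f"Translate to Wolof: {sentence}"}]
    wolof = teacher(prompt, max_new_tokens=128, return_full_text=False)[0]["generated_text"]
    # Each (instruction, translation) pair becomes one supervised
    # training example for the small model.
    sft_pairs.append({"prompt": prompt[0]["content"], "completion": wolof.strip()})
```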

## Usage

### Quick start with pipeline

```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="soynade-research/oolel-lit-gemma",
    device="cuda",
)

messages = [{"role": "user", "content": "Translate to Wolof: The president is 45 years old."}]

# The pipeline returns a list with one dict per generated sequence.
output = generator(messages, max_new_tokens=256, return_full_text=False)
print(output[0]["generated_text"])
```
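
If no GPU is available, you can pick the device at runtime instead of hard-coding `"cuda"`. The fallback logic below is our addition, not part of the original snippet:

```python
import torch
from transformers import pipeline

# Fall back to CPU when CUDA is unavailable; the 270M model is small
# enough that CPU inference remains practical.
device = "cuda" if torch.cuda.is_available() else "cpu"
generator = pipeline(
    "text-generation",
    model="soynade-research/oolel-lit-gemma",
    device=device,
)
```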

### With AutoModel for more control

```python
import torch
from transformers import AutoTokenizer, Gemma3ForCausalLM

model_id = "soynade-research/oolel-lit-gemma"

# Load the weights in bfloat16; token ids stay as integers, so only the
# model needs a floating-point dtype.
model = Gemma3ForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
).eval()

tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    [
        {
            "role": "system",
            "content": [{"type": "text", "text": "You're a Wolof AI assistant. Please always provide detailed and useful answers to the user queries."}],
        },
        {
            "role": "user",
            "content": [{"type": "text", "text": "Translate to Wolof: The president is 45 years old."}],
        },
    ],
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.inference_mode():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
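
For interactive use, you can stream tokens to stdout as they are generated. The sketch below reuses `model`, `tokenizer`, and `inputs` from the previous block and relies on the `TextStreamer` helper from transformers; treat it as an optional variation rather than part of the reference snippet:

```python
from transformers import TextStreamer

# Print tokens as they are generated, skipping the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

with torch.inference_mode():
    model.generate(**inputs, max_new_tokens=256, streamer=streamer)
```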

## Training

The training code and configuration are available at [soynade-research/oolel-trainer](https://github.com/soynade-research/oolel-trainer).
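
For orientation, supervised fine-tuning of a Gemma checkpoint with TRL typically looks like the sketch below. The dataset file and hyperparameters are placeholders of ours, not the values used for this model; the actual recipe is in the oolel-trainer repository:

```python
# Illustrative SFT setup with TRL; the dataset path and hyperparameters
# are placeholders, not the configuration used to train Oolel-lit-gemma.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumed local JSONL file of {"prompt": ..., "completion": ...} pairs.
dataset = load_dataset("json", data_files="wolof_sft_pairs.jsonl", split="train")

trainer = SFTTrainer(
    model="google/gemma-3-270m-it",
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="oolel-lit-gemma",
        per_device_train_batch_size=8,
        num_train_epochs=3,
        learning_rate=2e-5,
    ),
)
trainer.train()
```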

## Limitations

- Primarily optimized for Wolof; performance on other languages may vary
- As a 270M parameter model, it may struggle with complex tasks
- Outputs should be verified by a native Wolof speaker for critical applications