---
widget:
- text: አዲስ አበባ
  example_title: Example 1
- text: በኢንግሊዝ ፕሪምየር ሊግ
  example_title: Example 2
- text: ዶናልድ ትራምፕ
  example_title: Example 3
language:
- am
metrics:
- perplexity
library_name: transformers
pipeline_tag: text-generation
base_model:
- meta-llama/Llama-3.2-1B-Instruct
---
# Llama-3.2-Amharic-1B

This model is a version of Meta's [Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) decoder transformer model that was continually pretrained on an Amharic text corpus.

- 16k new Amharic tokens were added to the Llama 3.2 tokenizer, and the embedding layer of the model was resized accordingly.
- The model was then trained on **300 million tokens** of **Amharic** text.
- This is a base model. The Amharic instruction-following version is [Llama-3.2-1B-Amharic-Instruct](https://huggingface.co/rasyosef/Llama-3.2-1B-Amharic-Instruct).
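
The embedding resize mentioned in the first bullet can be illustrated with a minimal sketch. This is not the author's training script: it assumes a tiny, randomly initialised Llama-style model (all sizes below are made up so that nothing has to be downloaded) and adds 16 rows instead of 16k. With a real tokenizer, the usual pattern is `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`.

```python
from transformers import LlamaConfig, LlamaForCausalLM

# Hypothetical tiny configuration, chosen only so the sketch runs quickly
# and offline. It is NOT the configuration of Llama-3.2-1B.
config = LlamaConfig(
    vocab_size=1000,
    hidden_size=64,
    intermediate_size=128,
    num_hidden_layers=2,
    num_attention_heads=4,
    num_key_value_heads=2,
)
model = LlamaForCausalLM(config)

# Assumption for illustration: 16 new tokens (the model card reports 16k).
num_new_tokens = 16
old_vocab_size = model.get_input_embeddings().weight.shape[0]

# Grow the embedding matrix so every new token id has its own row.
model.resize_token_embeddings(old_vocab_size + num_new_tokens)
new_vocab_size = model.get_input_embeddings().weight.shape[0]

print(old_vocab_size, "->", new_vocab_size)
```

The newly added rows start out untrained, which is one reason the vocabulary extension is followed by further pretraining on Amharic text.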
### How to use
First, install the latest version of `transformers`:
```
pip install -Uq transformers
```

You can then use this model directly with a pipeline for text generation:
```python
from transformers import pipeline

# Load the Amharic base model as a text-generation pipeline.
# device_map="auto" requires the accelerate package and places the
# weights on a GPU when one is available.
llama_am = pipeline(
    "text-generation",
    model="rasyosef/Llama-3.2-1B-Amharic",
    device_map="auto"
)

prompt = "በኢንግሊዝ ፕሪምየር ሊግ"

# Sample a continuation of up to 128 new tokens.
llama_am(
    prompt,
    max_new_tokens=128,
    temperature=0.3,
    do_sample=True,
    top_k=8,
    top_p=0.8,
    repetition_penalty=1.05
)
```

Output:
```python
[{'generated_text': 'በኢንግሊዝ ፕሪምየር ሊግ የ2017/18 የውድድር ዘመን ላይ ተሳታፊ የሆነው ሊቨርፑል ትናንት ምሽት 3 :45 ላይ ከዌስትሀም ዩናይትድ ጋር ባደረገው ጨዋታ በ2 ለ 1 ውጤት ተሸንፏል ።'}]
```
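
The returned `generated_text` repeats the prompt before the continuation. The sketch below shows one way to keep only the newly generated part. It hard-codes the output shown above so that it runs without loading the model, and the variable names `outputs` and `continuation` are illustrative, not part of the original example.

```python
# The prompt and the pipeline result, copied from the example above.
prompt = "በኢንግሊዝ ፕሪምየር ሊግ"
outputs = [{'generated_text': 'በኢንግሊዝ ፕሪምየር ሊግ የ2017/18 የውድድር ዘመን ላይ ተሳታፊ የሆነው ሊቨርፑል ትናንት ምሽት 3 :45 ላይ ከዌስትሀም ዩናይትድ ጋር ባደረገው ጨዋታ በ2 ለ 1 ውጤት ተሸንፏል ።'}]

# The pipeline returns a list with one dict per generated sequence.
full_text = outputs[0]["generated_text"]

# Drop the leading prompt if present, then trim surrounding whitespace.
if full_text.startswith(prompt):
    continuation = full_text[len(prompt):].strip()
else:
    continuation = full_text

print(continuation)
```

Alternatively, passing `return_full_text=False` to the pipeline call should return only the continuation, which avoids the string slicing.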