---
widget:
- text: አዲስ አበባ
  example_title: Example 1
- text: በኢንግሊዝ ፕሪምየር ሊግ
  example_title: Example 2
- text: ዶናልድ ትራምፕ
  example_title: Example 3
language:
- am
metrics:
- perplexity
library_name: transformers
pipeline_tag: text-generation
base_model:
- meta-llama/Llama-3.2-1B-Instruct
---

# Llama-3.2-1B-Amharic

This model is a version of Meta's [Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) decoder transformer model that was continually pretrained on an Amharic text corpus.

- 16k new Amharic tokens were added to the Llama 3.2 tokenizer, and the embedding layer of the model was resized accordingly.
- The model was then trained on **300 million tokens** of **Amharic** text.
- This is a base model. The Amharic instruction-following version is [Llama-3.2-1B-Amharic-Instruct](https://huggingface.co/rasyosef/Llama-3.2-1B-Amharic-Instruct).
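
The tokenizer extension and embedding resize described above can be sketched roughly as follows. This is a minimal illustration, not the procedure actually used for this model: the card does not say how the 16k tokens were selected or added, and the helper names `pick_new_tokens` and `extend_for_amharic` are hypothetical. Only `add_tokens` and `resize_token_embeddings` are standard `transformers` calls.

```python
def pick_new_tokens(candidates, existing_vocab):
    """Return candidates that are not already in the vocabulary (order kept, no duplicates)."""
    seen = set(existing_vocab)
    new_tokens = []
    for tok in candidates:
        if tok not in seen:
            seen.add(tok)
            new_tokens.append(tok)
    return new_tokens


def extend_for_amharic(base_id, candidate_tokens):
    """Hypothetical helper: add tokens to a tokenizer and resize the model to match."""
    # Imported lazily so pick_new_tokens works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    model = AutoModelForCausalLM.from_pretrained(base_id)

    new_tokens = pick_new_tokens(candidate_tokens, tokenizer.get_vocab())
    num_added = tokenizer.add_tokens(new_tokens)

    # The embedding matrix needs one row per token ID, so it grows with the
    # tokenizer. The new rows start untrained and only become useful after
    # further pretraining on Amharic text.
    model.resize_token_embeddings(len(tokenizer))
    return tokenizer, model, num_added
```

Calling `extend_for_amharic("meta-llama/Llama-3.2-1B", candidate_tokens)` would download the base checkpoint, so it assumes access to that gated repository.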

### How to use
First, install the latest version of `transformers`:
```
pip install -Uq transformers
```

You can use this model directly with a pipeline for text generation:

```python
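from transformers import pipeline

# Load the model into a text-generation pipeline.
# device_map="auto" places the model on a GPU when one is available
# (this option requires the `accelerate` package).
llama_am = pipeline(
    "text-generation",
    model="rasyosef/Llama-3.2-1B-Amharic",
    device_map="auto",
)

prompt = "በኢንግሊዝ ፕሪምየር ሊግ"

# Sample up to 128 new tokens with a low temperature and a mild repetition penalty.
llama_am(
    prompt,
    max_new_tokens=128,
    temperature=0.3,
    do_sample=True,
    top_k=8,
    top_p=0.8,
    repetition_penalty=1.05,
)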
```

Output:
```python
[{'generated_text': 'በኢንግሊዝ ፕሪምየር ሊግ የ2017/18 የውድድር ዘመን ላይ ተሳታፊ የሆነው ሊቨርፑል ትናንት ምሽት 3 :45 ላይ ከዌስትሀም ዩናይትድ ጋር ባደረገው ጨዋታ በ2 ለ 1 ውጤት ተሸንፏል ።'}]
```
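
To give a feel for what the sampling arguments in the pipeline call do, here is a simplified, pure-Python sketch of temperature scaling followed by top-k and top-p filtering over a list of logits. It is an illustration only, not the `transformers` implementation: the function names are hypothetical, and the repetition penalty is left out.

```python
import math
import random


def filter_distribution(logits, temperature=0.3, top_k=8, top_p=0.8):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering."""
    # A temperature below 1 sharpens the distribution toward the top-scoring tokens.
    scaled = [x / temperature for x in logits]

    # Top-k: keep only the k highest-scoring token indices.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]

    # Softmax over the survivors (subtract the max for numerical stability).
    peak = scaled[ranked[0]]
    weights = [math.exp(scaled[i] - peak) for i in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for idx, p in zip(ranked, probs):
        kept.append((idx, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1.
    norm = sum(p for _, p in kept)
    return {idx: p / norm for idx, p in kept}


def sample_token(logits, rng=random, **kwargs):
    """Draw one token index from the filtered distribution."""
    dist = filter_distribution(logits, **kwargs)
    indices = list(dist)
    return rng.choices(indices, weights=[dist[i] for i in indices], k=1)[0]
```

With `temperature=0.3` and `top_p=0.8`, a token that clearly leads the logits usually ends up as the only candidate, so the output tends to stay close to greedy decoding.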