Model: rasyosef/embedding-amharic-base Source: Original Platform
| license | base_model | pipeline_tag | library_name |
|---|---|---|---|
| mit | rasyosef/roberta-base-amharic | text-retrieval | sentence-transformers |
Embedding-Amharic-Base
This is a sentence-transformers model fine-tuned from rasyosef/roberta-base-amharic. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
It was introduced in the paper The Multilingual Curse at the Retrieval Layer: Evidence from Amharic.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: rasyosef/roberta-base-amharic
- Maximum Sequence Length: 510 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Language: am
- License: mit
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 510, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
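The Pooling and Normalize() modules listed above can be illustrated with a small standalone sketch. It is not the library's implementation: it uses plain Python lists, a toy 4-dimensional input instead of 768 dimensions, and hypothetical helper names (`mean_pool`, `l2_normalize`). It shows mean pooling over non-padding tokens followed by scaling to unit length.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit length, mirroring what a Normalize step does."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy token outputs: 3 positions, 4 dimensions (the real model uses 768).
token_vectors = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 0.0, 0.0, 4.0],
    [9.0, 9.0, 9.0, 9.0],  # padding position, masked out below
]
attention_mask = [1, 1, 0]

pooled = mean_pool(token_vectors, attention_mask)  # [2.0, 1.0, 0.0, 2.0]
embedding = l2_normalize(pooled)                   # pooled divided by its norm, 3.0
print(pooled)
print(embedding)
```

Because the final vectors have unit length, the cosine similarity of two embeddings equals their dot product.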
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("rasyosef/embedding-amharic-base")
# What is the capital of Ethiopia? / France
queries = ['የኢትዮጵያ ዋና ከተማ ማናት?', 'የፈረንሳይ ዋና ከተማ ማናት?']
# Addis Ababa, Gondar, Paris, London, Washington D.C.
documents = ['አዲስ አበባ', 'ጎንደር', 'ፓሪስ', 'ለንደን', 'ዋሽንግተን ዲሲ']
# Compute embeddings
# Note: encode_query and encode_document require Sentence Transformers v5.0 or later
query_embeddings = model.encode_query(queries)  # [2, 768]
document_embeddings = model.encode_document(documents)  # [5, 768]
# Calculate semantic similarity
similarities = model.similarity(
    query_embeddings,
    document_embeddings
)
print(similarities)
# tensor([[0.5075, 0.3114, 0.0798, 0.1967, 0.1340],
# [0.1777, 0.0770, 0.5714, 0.2596, 0.1076]])
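To turn the similarity matrix into retrieval results, each query's row can be sorted by score. The sketch below is illustrative only: it copies the scores from the printed output above into a plain list so that it runs without loading the model, and `top_k` and `labels` are hypothetical names introduced for the example.

```python
# Similarity scores copied from the printed output above
# (rows: queries, columns: documents).
similarities = [
    [0.5075, 0.3114, 0.0798, 0.1967, 0.1340],
    [0.1777, 0.0770, 0.5714, 0.2596, 0.1076],
]
# English glosses of the five documents, in the same order as `documents`.
labels = ["Addis Ababa", "Gondar", "Paris", "London", "Washington D.C."]

def top_k(scores, k):
    """Return (index, score) pairs for the k highest scores, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

results = []
for query_index, scores in enumerate(similarities):
    best = [(labels[i], score) for i, score in top_k(scores, k=2)]
    results.append(best)
    print(query_index, best)
# 0 [('Addis Ababa', 0.5075), ('Gondar', 0.3114)]
# 1 [('Paris', 0.5714), ('London', 0.2596)]
```

With a real tensor returned by model.similarity, the same ranking could be obtained with the tensor's own sorting methods; the list version is used here only to keep the example self-contained.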
Evaluation
Metrics
Information Retrieval
- Dataset: dim_768
- Evaluated with InformationRetrievalEvaluator with these parameters: { "truncate_dim": 768 }
| Metric | Value |
|---|---|
| cosine_recall@5 | 0.8698 |
| cosine_recall@10 | 0.9051 |
| cosine_ndcg@10 | 0.8037 |
| cosine_mrr@10 | 0.7708 |
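For readers unfamiliar with the metrics in the table, the following sketch computes recall@k, MRR@k, and nDCG@k for a single query. It assumes binary relevance and uses made-up document ids; the evaluator averages such per-query values over the whole query set, and its exact implementation may differ in details.

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

def mrr_at_k(ranked_ids, relevant_ids, k):
    """Reciprocal rank of the first relevant document in the top k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked_ids, relevant_ids, k):
    """Discounted cumulative gain in the top k, divided by the ideal value."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg

# Hypothetical ranking for one query whose only relevant document is "d7".
ranked = ["d3", "d7", "d1", "d9"]
relevant = {"d7"}

print(recall_at_k(ranked, relevant, 5))  # 1.0
print(mrr_at_k(ranked, relevant, 10))    # 0.5
print(ndcg_at_k(ranked, relevant, 10))   # 1 / log2(3), about 0.6309
```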
Information Retrieval
- Dataset: dim_256
- Evaluated with InformationRetrievalEvaluator with these parameters: { "truncate_dim": 256 }
| Metric | Value |
|---|---|
| cosine_recall@5 | 0.8647 |
| cosine_recall@10 | 0.9020 |
| cosine_ndcg@10 | 0.7978 |
| cosine_mrr@10 | 0.7640 |
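The dim_256 scores are obtained by keeping only the first 256 components of each embedding before computing cosine similarity. The sketch below shows that operation on toy vectors; `truncate` and `cosine_similarity` are hypothetical helpers, and the vectors are deterministic stand-ins rather than real model outputs. In Sentence Transformers the same effect is normally requested with the `truncate_dim` argument when loading the model.

```python
import math

def truncate(embedding, dim):
    """Keep only the first `dim` components of an embedding."""
    return embedding[:dim]

def cosine_similarity(a, b):
    """Cosine similarity; it rescales internally, so inputs need not be unit length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Deterministic 768-dimensional stand-ins for a query and a document embedding.
query = [math.sin(0.01 * i) for i in range(768)]
document = [math.sin(0.01 * i + 0.3) for i in range(768)]

full_score = cosine_similarity(query, document)
short_score = cosine_similarity(truncate(query, 256), truncate(document, 256))

print(len(truncate(query, 256)))  # 256
print(full_score, short_score)    # the two scores generally differ slightly
```

A truncated vector is no longer unit length, so it should be renormalized if a plain dot product is used in place of cosine similarity.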
Training Details
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- gradient_accumulation_steps: 2
- learning_rate: 6e-05
- num_train_epochs: 6
- lr_scheduler_type: cosine
- warmup_ratio: 0.025
- fp16: True
- load_best_model_at_end: True
- optim: adamw_torch_fused
- batch_sampler: no_duplicates
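The learning-rate settings above (learning_rate 6e-05, a cosine scheduler, warmup_ratio 0.025) can be pictured with a rough sketch of the resulting schedule. Two things are assumptions rather than facts from this card: the total step count is taken as 1921 steps per epoch (the figure reported in the training log) times the 6 configured epochs, and the warmup length is rounded up from the ratio. The trainer's own scheduler may differ in small details.

```python
import math

BASE_LR = 6e-05        # learning_rate from the card
WARMUP_RATIO = 0.025   # warmup_ratio from the card
# Assumption: 1921 optimizer steps per epoch (from the training log) x 6 epochs.
TOTAL_STEPS = 1921 * 6
# Assumption: the warmup length is the ratio of total steps, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)

# Effective batch size per device: batch size 64 x 2 accumulation steps.
EFFECTIVE_BATCH_SIZE = 64 * 2

def learning_rate(step):
    """Linear warmup to BASE_LR, then cosine decay to zero at TOTAL_STEPS."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(EFFECTIVE_BATCH_SIZE)  # 128
print(WARMUP_STEPS)          # 289
for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, learning_rate(step))
```

The rate peaks at 6e-05 once warmup ends and falls smoothly towards zero by the last configured step.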
Training Logs
| Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_256_cosine_ndcg@10 |
|---|---|---|---|---|
| -1 | -1 | - | 0.0735 | 0.0582 |
| 1.0 | 1921 | 0.6769 | 0.7826 | 0.7751 |
| 2.0 | 3842 | 0.0700 | 0.7894 | 0.7829 |
| 3.0 | 5763 | 0.0254 | 0.8030 | 0.7953 |
| 4.0 | 7684 | 0.0139 | 0.8037 | 0.7978 |
Framework Versions
- Python: 3.11.13
- Sentence Transformers: 4.1.0
- Transformers: 4.52.4
- PyTorch: 2.7.1+cu126
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
Citation
@inproceedings{alemneh2026amharicir,
  title = {The Multilingual Curse at the Retrieval Layer: Evidence from Amharic},
  author = {Alemneh, Yosef Worku and Mekonnen, Kidist Amde and de Rijke, Maarten},
  booktitle = {Proceedings of the 1st Workshop on Multilinguality in the Era of Large Language Models (MeLLM), ACL 2026},
  year = {2026},
}