---
language:
- am
license: mit
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:245876
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: rasyosef/roberta-base-amharic
widget:
- source_sentence: "በኢትዮጵያ ለመጀመሪያ ጊዜ ወታደራዊ ስልጠና የወሰዱ ዕጩ ዲፕሎማቶች ተመረቁ"
  sentences:
  - "የውጭ ጉዳይ ሚኒስቴር ከሜጀር ጄነራል ሀየሎም አርአያ ወታደራዊ አካዳሚ ጋር በመተባበር በኢትዮጵያ ለመጀመሪያ ጊዜ ወታደራዊ ስልጠና የወሰዱ ዲፕሎማቶችን አስመረቀ፡፡በወታደራዊ አካዳሚው ትላንት በተካሄደ የምርቃት ሥነ- ስርዓት ስልጠናውን ላገኙ 89 ዕጩ ድፕሎማቶች የምስክር ወረቀት ተበረክቷል።"
  - "አዲስ አበባ፣ የካቲት 19፣ 2012 (ኤፍ.ቢ.ሲ) የኢፌዴሪ አየር ኃይል ለከፍተኛ መኮንኖች የማዕረግ እድገት ሰጥቷል።አየር ኃይሉ በዛሬው እለት በቢሾፍቱ በሚገኘው የኢፌዴሪ አየር ኃይል ጠቅላይ መምሪያ ባካሄደው ስነ ስርዓት ላይ የኢፌዴሪ ጦር ኃይሎች ምክተል ኤታማዦር ሹም ጄኔራል ብርሃኑ ጁላ እና የኢፌዴሪ አየር ኃይል ዋና አዛዥ ሜጀር ጄኔራል ይልማ መርዳሳን ጨምሮ ከፍተኛ አመራሮች ተገኝተዋል።በስነ ስርዓቱ ላይ 106 ለሚሆኑ መኮንኖች በአየር ኃይል ዋና አዛዥ ሜጀር ጄኔራል ይልማ መርዳሳ የተለያዩ የማዕረግ እድገቶችን ሰጥተዋል።"
- source_sentence: "ኢትዮጵያ ኢንተርኔትን በመዝጋቷ ከ130 ሚሊዮን ዶላር በላይ አጣች"
  sentences:
  - "የአሜሪካ ድምፅ ባለፉት ሰባ አምስት ዓመታት ውስጥ በዓለም ዙሪያ ያሉ የተለያዩ አድማጮችና ተመልካቾች ከሌሎች ምንጮች ሊያገኟቸው የማይችሏቸውን መረጃዎች ለዓለም ሲያደርስ መቆየቱን ዋና ዳይሬክተሯ አማንዳ ቤኔት ገልፀዋል።"
  - "የተቋሙ ጥናት የኢንተርኔን መዘጋት በሃገራት ምጣኔ ሐብት ላይ ያደረሰውን ጉዳት በተለያዩ መለኪያዎች የገመተ ሲሆን፤ በዚህም መሰረት ኢትዮጵያ ለ36 ቀናት ያህል ኢንተርኔትን በዘጋችበት እንዲሁም ለሰባት ቀናት ያህል በነበረው የማኅበራዊ ሚዲያ መናወጥ ወቅት በጥቅሉ ከ130 ሚሊዮን ዶላር በላይ አጥታለች ይላል።"
pipeline_tag: text-retrieval
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: RoBERTa Amharic Embed Base
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 768
      type: dim_768
    metrics:
    - type: cosine_recall@5
      value: 0.869800820152314
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9050966608084359
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8036666074756674
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7707977655033881
      name: Cosine Mrr@10
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_recall@5
      value: 0.8646748681898067
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9020210896309314
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.7977610383416281
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.764035577128722
      name: Cosine Mrr@10
datasets:
- rasyosef/Amharic-Passage-Retrieval-Dataset-V2
---
Embedding-Amharic-Base
This is a sentence-transformers model fine-tuned from rasyosef/roberta-base-amharic. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and similar tasks.
It was introduced in the paper "The Multilingual Curse at the Retrieval Layer: Evidence from Amharic".
Model Details
Model Description
Model Type: Sentence Transformer
Base model: rasyosef/roberta-base-amharic
Maximum Sequence Length: 510 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity
Language: Amharic (am)
License: MIT
Full Model Architecture
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library (for example, pip install -U sentence-transformers).
Then you can load this model and run inference.
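A minimal sketch of loading and inference is shown below. This section of the card does not state the model's Hub repository ID, so model_id is left as a parameter, and the helper names rank_passages and demo are hypothetical. The sketch assumes Sentence Transformers 3.0 or later, where SentenceTransformer.similarity is available (the card lists 4.1.0).

```python
from typing import List, Sequence, Tuple


def rank_passages(model, query: str, passages: Sequence[str]) -> List[Tuple[int, float]]:
    """Return (passage_index, similarity) pairs, best match first.

    `model` is expected to be a SentenceTransformer instance; only its
    `encode` and `similarity` methods are used.
    """
    query_embedding = model.encode([query])             # one row per input text
    passage_embeddings = model.encode(list(passages))
    # similarity() applies the model's configured function (cosine per this card).
    scores = model.similarity(query_embedding, passage_embeddings)[0]
    pairs = [(index, float(score)) for index, score in enumerate(scores)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def demo(model_id: str, query: str, passages: Sequence[str]) -> None:
    """Load the model from the Hub (or a local path) and print ranked passages."""
    # Imported here so rank_passages can be reused without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    for index, score in rank_passages(model, query, passages):
        print(f"{score:.4f}  {passages[index]}")
```

Call demo with the model's repository ID, an Amharic query such as a news headline, and a list of candidate passages; the passage with the highest cosine similarity is printed first.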
Evaluation
Metrics
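Information Retrieval (dim_768)

| Metric           | Value  |
|:-----------------|:-------|
| cosine_recall@5  | 0.8698 |
| cosine_recall@10 | 0.9051 |
| cosine_ndcg@10   | 0.8037 |
| cosine_mrr@10    | 0.7708 |

Information Retrieval (dim_256)

| Metric           | Value  |
|:-----------------|:-------|
| cosine_recall@5  | 0.8647 |
| cosine_recall@10 | 0.9020 |
| cosine_ndcg@10   | 0.7978 |
| cosine_mrr@10    | 0.7640 |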
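The two result sets are consistent with the usual Matryoshka setup, in which the 256-dimensional variant keeps only the first 256 components of each 768-dimensional embedding. The card lists loss:MatryoshkaLoss among its tags but does not spell out the evaluation procedure, so treat that reading as an assumption. Under it, the following pure-Python sketch shows truncation, re-normalization, and cosine similarity on toy vectors; the function names are illustrative.

```python
import math
from typing import List, Sequence


def truncate_and_normalize(embedding: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale the result to unit length."""
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and the embedding length")
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero vector: nothing to rescale
    return [x / norm for x in head]


def cosine_at_dim(a: Sequence[float], b: Sequence[float], dim: int) -> float:
    """Cosine similarity computed on the first `dim` components only."""
    unit_a = truncate_and_normalize(a, dim)
    unit_b = truncate_and_normalize(b, dim)
    return sum(x * y for x, y in zip(unit_a, unit_b))


# Toy 4-dimensional vectors stand in for 768-dimensional embeddings.
a = [1.0, 0.0, 2.0, 0.0]
b = [1.0, 0.0, 0.0, 2.0]
full = cosine_at_dim(a, b, 4)   # uses all four components
short = cosine_at_dim(a, b, 2)  # uses only the first two components
```

Sentence Transformers also accepts a truncate_dim argument when constructing a SentenceTransformer, which applies the truncation at encode time. The small gap between the dim_768 and dim_256 scores above suggests that the shorter vectors trade little retrieval quality for a threefold reduction in storage.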
Training Details
Training Hyperparameters
Non-Default Hyperparameters
eval_strategy: epoch
per_device_train_batch_size: 64
per_device_eval_batch_size: 64
gradient_accumulation_steps: 2
learning_rate: 6e-05
num_train_epochs: 6
lr_scheduler_type: cosine
warmup_ratio: 0.025
fp16: True
load_best_model_at_end: True
optim: adamw_torch_fused
batch_sampler: no_duplicates
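As a quick consistency check on these settings, the effective batch size and the number of optimizer steps per epoch can be derived from the values above and the dataset_size:245876 tag. The sketch assumes a single training device, which the card does not state; under that assumption the result matches the 1921 steps per epoch reported in the training logs. The total and warmup step counts are rough estimates that further assume all 6 epochs run.

```python
import math

# Values copied from the card.
dataset_size = 245_876
per_device_train_batch_size = 64
gradient_accumulation_steps = 2
num_train_epochs = 6
warmup_ratio = 0.025

# Assumption (not stated in the card): training ran on one device.
num_devices = 1

# Samples consumed per optimizer step.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
steps_per_epoch = math.ceil(dataset_size / effective_batch_size)

# Estimates that assume every configured epoch is completed.
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch_size)  # 128
print(steps_per_epoch)       # 1921
print(total_steps)           # 11526
print(warmup_steps)          # 289
```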
Training Logs
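| Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_256_cosine_ndcg@10 |
|:------|:-----|:--------------|:-----------------------|:-----------------------|
| -1    | -1   | -             | 0.0735                 | 0.0582                 |
| 1.0   | 1921 | 0.6769        | 0.7826                 | 0.7751                 |
| 2.0   | 3842 | 0.07          | 0.7894                 | 0.7829                 |
| 3.0   | 5763 | 0.0254        | 0.8030                 | 0.7953                 |
| 4.0   | 7684 | 0.0139        | 0.8037                 | 0.7978                 |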
Framework Versions
Python: 3.11.13
Sentence Transformers: 4.1.0
Transformers: 4.52.4
PyTorch: 2.7.1+cu126
Accelerate: 1.7.0
Datasets: 3.6.0
Tokenizers: 0.21.1
Citation