Model: DGurgurov/aya-expanse-8b-mkd_cyrl Source: Original Platform
| language | license | base_model | tags | pipeline_tag |
|---|---|---|---|---|
| | cc-by-nc-4.0 | CohereLabs/aya-expanse-8b | | text-generation |
Aya Expanse 8B Mkd_Cyrl
Language-enhanced Aya-Expanse-8b model for Macedonian using sparse subnetwork fine-tuning.
Method
- Training approach: Language-specific neuron identification + subnetwork fine-tuning
- Parameters trained: <1% of total model parameters
- Framework: Language Subnetwork Enhancement
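As a rough illustration of the subnetwork fine-tuning idea listed above, the following PyTorch sketch masks gradients so that only a chosen set of neurons in a layer is updated. The toy `nn.Linear` layer, the `selected_neurons` indices, and the loss are assumptions made purely for illustration. This is not the authors' training code, and the neuron identification step itself is not shown.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a single feed-forward layer (assumption, not the real model).
layer = nn.Linear(in_features=16, out_features=64)

# Hypothetical indices of output neurons treated as "language-specific".
selected_neurons = torch.tensor([3, 17, 42])

# Row mask over the weight matrix: 1 for selected neurons, 0 elsewhere.
row_mask = torch.zeros(layer.out_features)
row_mask[selected_neurons] = 1.0

# Zero the gradient of every unselected row so only the subnetwork is updated.
layer.weight.register_hook(lambda grad: grad * row_mask.unsqueeze(1))
layer.bias.register_hook(lambda grad: grad * row_mask)

# Plain SGD without weight decay, so a zero gradient means no update.
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
before = layer.weight.detach().clone()

# One training step on random data with a placeholder loss.
x = torch.randn(8, 16)
loss = layer(x).pow(2).mean()
loss.backward()
optimizer.step()

# Rows that actually changed, and the share of neurons that were trainable.
changed_rows = (layer.weight.detach() != before).any(dim=1)
trainable_fraction = row_mask.sum().item() / layer.out_features
print(changed_rows.nonzero().flatten().tolist(), trainable_fraction)
```

In this sketch only the three selected rows change after the step. Note that an optimizer with weight decay could still move the masked weights, since their gradients are zero tensors rather than missing, so a real setup would need to account for that.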
Performance
The model shows enhanced monolingual capabilities in Macedonian while preserving multilingual performance.
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DGurgurov/aya-expanse-8b-mkd_cyrl"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "Your Macedonian prompt here"
inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens bounds only the generated continuation, not prompt + output
outputs = model.generate(**inputs, max_new_tokens=100)
# Drop special tokens (e.g. BOS/EOS) from the printed text
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Citation
@misc{gurgurov2025sparsesubnetworkenhancement,
title={Sparse Subnetwork Enhancement for Underrepresented Languages in Large Language Models},
author={Daniil Gurgurov and Josef van Genabith and Simon Ostermann},
year={2025},
eprint={2510.13580},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2510.13580}
}
@misc{gurgurov2025languagearithmeticssystematiclanguage,
title={Language Arithmetics: Towards Systematic Language Neuron Identification and Manipulation},
author={Daniil Gurgurov and Katharina Trinley and Yusser Al Ghussin and Tanja Baeumel and Josef van Genabith and Simon Ostermann},
year={2025},
eprint={2507.22608},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2507.22608}
}