---
license: llama3.2
datasets:
- LVSTCK/macedonian-corpus-cleaned-dedup
language:
- mk
base_model:
- meta-llama/Llama-3.2-1B
tags:
- mk
- continual-pretraining
- lora
library_name: transformers
---
# MK Llama-3.2-1B (Continued Pretraining)
## Summary
Continued pretraining of **meta-llama/Llama-3.2-1B** for Macedonian on **LVSTCK/macedonian-corpus-cleaned-dedup**.

Trained using LoRA adapters on a **single A100**.
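
As a rough sketch of what that setup looks like, the snippet below configures LoRA adapters on the base model with `peft`. The rank, alpha, dropout, and target modules shown are illustrative assumptions, not the exact values used for this checkpoint.

```python
# Illustrative LoRA setup for continued pretraining.
# All hyperparameters below are assumptions, not the published training config.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

lora_config = LoraConfig(
    r=16,                # assumed adapter rank
    lora_alpha=32,       # assumed scaling factor
    lora_dropout=0.05,   # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```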
## Training
- Stage 1: **0.4 epochs** on a Google Colab A100
- Stage 2: **0.9 epochs** on an A100 in the FINKI GPU cluster
- Total: **1.3 epochs**
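
Since training was split across two machines, Stage 2 presumably resumed from the Stage 1 adapter. A minimal sketch of such a resume with `peft`, where `stage1-adapter` is a hypothetical local path rather than an artifact of this repository:

```python
# Hypothetical Stage 2 resume from a Stage 1 adapter saved with save_pretrained().
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
model = PeftModel.from_pretrained(base, "stage1-adapter", is_trainable=True)
# ... continue pretraining on LVSTCK/macedonian-corpus-cleaned-dedup for ~0.9 epochs
```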
## Training Metrics
TensorBoard logs are available in the **Training metrics** tab of this model repository. **Logs are only available for Stage 2.**

The logs include training loss, learning rate, gradient norm, and evaluation metrics.

View the TensorBoard dashboard here: [Open TensorBoard training metrics](https://huggingface.co/EdonFetaji/MK-Llama-3.2-1B/tensorboard)
## Notes
This is a continued pretraining checkpoint focused on Macedonian.
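
A minimal usage sketch, assuming the checkpoint loads as a standard `transformers` causal LM (the Macedonian prompt is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EdonFetaji/MK-Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the accelerate package
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompt = "Скопје е главен град на"  # "Skopje is the capital of"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```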