pretrain-wura/README.md
ModelHub XC d7dc2b1708 Initialize project; model provided by the ModelHub XC community
Model: llama-lang-adapt/pretrain-wura
Source: Original Platform
2026-05-17 14:22:53 +08:00

datasets: llama-lang-adapt/wura

We perform continual pre-training of meta-llama/Llama-2-7b-hf on the monolingual WURA corpus for 20 languages. All languages are sampled uniformly.
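
The README does not say how uniform sampling is implemented. One common reading is that each training example is drawn by first picking a language with equal probability and then picking a document from that language, so low-resource languages are seen as often as high-resource ones. The sketch below illustrates that reading only; the function name, the `corpora` mapping, and the language codes are hypothetical and not taken from the training code.

```python
import random


def sample_uniform_over_languages(corpora, num_samples, seed=0):
    """Draw (language, document) pairs with every language equally likely.

    `corpora` is a hypothetical mapping from a language code to a list of
    documents; it stands in for the per-language WURA subsets.
    """
    rng = random.Random(seed)
    languages = sorted(corpora)  # fixed order keeps the draw reproducible
    samples = []
    for _ in range(num_samples):
        lang = rng.choice(languages)      # uniform over languages, not documents
        doc = rng.choice(corpora[lang])   # uniform within the chosen language
        samples.append((lang, doc))
    return samples


# Toy data: language sizes differ, but each language is still picked
# with probability 1/3 per draw.
toy_corpora = {
    "hau": ["doc a", "doc b", "doc c", "doc d"],
    "yor": ["doc e"],
    "swa": ["doc f", "doc g"],
}
samples = sample_uniform_over_languages(toy_corpora, num_samples=6)
```

Sampling this way repeats documents from small languages more often than documents from large ones, which is the usual trade-off of uniform language sampling.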

Important Parameters

  • num_gpus: 8
  • max_steps: 8000 # see here
  • gradient_accumulation_steps: 16
  • per_device_batch_size: 2
  • learning_rate: 2e-5
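
Taken together, these values imply an effective batch size. The arithmetic below is a sketch that assumes standard data-parallel training, where each GPU processes `per_device_batch_size` sequences per micro-step and gradients are accumulated before each optimizer update. The sequence length is not stated here, so no token count is derived.

```python
# Values copied from the parameter list above.
num_gpus = 8
per_device_batch_size = 2
gradient_accumulation_steps = 16
max_steps = 8000

# Sequences contributing to one optimizer update (assumes plain data parallelism).
effective_batch_size = num_gpus * per_device_batch_size * gradient_accumulation_steps

# Sequences consumed over the whole run, counting repeats.
total_sequences = effective_batch_size * max_steps

print(effective_batch_size)  # 256
print(total_sequences)       # 2048000
```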