Initialize project; model provided by the ModelHub XC community
Model: llama-lang-adapt/pretrain-wura Source: Original Platform
13
README.md
Normal file
@@ -0,0 +1,13 @@
---
datasets:
- llama-lang-adapt/wura
---
We perform continual pre-training of **meta-llama/Llama-2-7b-hf** on the monolingual WURA corpus for **20 languages**. All languages are sampled uniformly.
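As a rough illustration of uniform language sampling, the sketch below picks a language with equal probability and then picks a document from that language. This is only one possible reading of "uniformly sampled", not the actual data pipeline. The function name `sample_uniform_over_languages`, the placeholder language keys, and the toy documents are all hypothetical.

```python
import random
from collections import Counter


def sample_uniform_over_languages(corpora, num_samples, seed=0):
    """Draw (language, document) pairs, choosing the language uniformly at random."""
    rng = random.Random(seed)
    languages = sorted(corpora)  # fixed order so a given seed is reproducible
    samples = []
    for _ in range(num_samples):
        lang = rng.choice(languages)     # each language has probability 1 / len(languages)
        doc = rng.choice(corpora[lang])  # then a document from within that language
        samples.append((lang, doc))
    return samples


# Toy, made-up corpora of very different sizes (not real WURA data).
toy_corpora = {
    "lang_a": [f"a-doc-{i}" for i in range(1000)],
    "lang_b": [f"b-doc-{i}" for i in range(100)],
    "lang_c": [f"c-doc-{i}" for i in range(10)],
}

samples = sample_uniform_over_languages(toy_corpora, num_samples=3000)
counts = Counter(lang for lang, _ in samples)
print(counts)
```

Under this scheme each language contributes roughly the same number of samples regardless of corpus size, so documents from smaller corpora are repeated more often.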
## Important Parameters
- num_gpus: 8
- max_steps: 8000 # see [here](https://github.com/AfricanLlama/ALMA?tab=readme-ov-file#when-should-i-stop-fine-tuning-at-stage-1)
- gradient_accumulation_steps: 16
- per_device_batch_size: 2
- learning_rate: 2e-5
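
The parameters above imply an effective batch size, which can be worked out as in the short sketch below. It assumes plain data-parallel training, where each GPU processes its own micro-batch, and that `max_steps` counts optimizer updates. Neither assumption is stated in this README. The sequence length is not given here either, so the sketch counts sequences rather than tokens.

```python
# Values copied from the parameter list above.
num_gpus = 8
per_device_batch_size = 2
gradient_accumulation_steps = 16
max_steps = 8000

# Assumption: data parallelism, so one optimizer update sees one micro-batch
# per GPU for each accumulation step.
sequences_per_step = num_gpus * per_device_batch_size * gradient_accumulation_steps

# Assumption: max_steps counts optimizer updates, not micro-batches.
total_sequences = sequences_per_step * max_steps

print(sequences_per_step)  # 256
print(total_sequences)     # 2048000
```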