---
license: apache-2.0
datasets:
- CaterinaLac/sharegpt-deduplicated
- exams
- Open-Orca/OpenOrca
language:
- en
- zh
- ko
- ja
- fr
---

This model is a Llama2-7B model finetuned on the union of ShareGPT, the exams dataset and a subset of the Orca dataset.
The finetuning was performed with the [DeepSpeed Chat](https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-chat) toolkit (step 1, supervised finetuning).
The model was trained for three epochs before reaching a plateau on the validation set. We used a cosine learning-rate scheduler with an initial learning rate of 2e-5.
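
As a rough illustration of the learning-rate schedule described above, the sketch below computes a plain cosine decay starting from 2e-5. It is not the training code used for this model: the absence of warmup, the decay down to zero, and the `total_steps` value are all assumptions made for the example, and `cosine_lr` is a hypothetical helper name.

```python
import math


def cosine_lr(step: int, total_steps: int,
              base_lr: float = 2e-5, min_lr: float = 0.0) -> float:
    """Cosine decay from base_lr (step 0) to min_lr (step == total_steps).

    Assumptions for this sketch: no warmup phase, and min_lr defaults to 0.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp the step so values outside [0, total_steps] stay well defined.
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hypothetical number of optimizer steps covering the three epochs.
total_steps = 3000
for step in (0, total_steps // 2, total_steps):
    print(step, cosine_lr(step, total_steps))
```

With this shape the learning rate stays close to its initial value early in training, reaches half of it at the midpoint, and falls toward the minimum by the final step.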