From d82f3d8e4b40cadb2e2d733771a132f4cdb3327d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Sayhan=20Yalva=C3=A7er?=
Date: Sun, 18 Feb 2024 15:53:23 +0000
Subject: [PATCH] Update README.md

---
 README.md | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 3277135..189f0e2 100644
--- a/README.md
+++ b/README.md
@@ -18,4 +18,26 @@ library_name: transformers
 - **Finetuned by:** [sayhan](https://huggingface.co/sayhan)
 - **License:** [apache-2.0](https://choosealicense.com/licenses/apache-2.0/)
 - **Finetuned from model :** [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)
-- **Dataset:** [sayhan/strix-philosophy-qa](https://huggingface.co/datasets/sayhan/strix-philosophy-qa)
\ No newline at end of file
+- **Dataset:** [sayhan/strix-philosophy-qa](https://huggingface.co/datasets/sayhan/strix-philosophy-qa)
+---
+**LoRA rank:** 8
+**LoRA alpha:** 16
+**LoRA dropout:** 0
+**Rank-stabilized LoRA:** Yes
+**Number of epochs:** 3
+**Learning rate:** 1e-5
+**Batch size:** 2
+**Gradient accumulation steps:** 4
+**Weight decay:** 0.01
+**Target modules:**
+```
+ - Query projection (`q_proj`)
+ - Key projection (`k_proj`)
+ - Value projection (`v_proj`)
+ - Output projection (`o_proj`)
+ - Gate projection (`gate_proj`)
+ - Up projection (`up_proj`)
+ - Down projection (`down_proj`)
+```
+
+
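
Reviewer note on the hyperparameters this patch adds to the README: two derived quantities are implied but not stated, the adapter scaling factor and the effective batch size. The sketch below collects the listed values in a plain dataclass and computes both. The class name `FinetuneSettings`, its field names, and the `num_devices` parameter are hypothetical and not from the README. It assumes the listed batch size is per device and that "Rank-stabilized LoRA" means the usual rsLoRA scaling of alpha / sqrt(rank) instead of alpha / rank. The README does not say which training framework was used, so none is referenced here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FinetuneSettings:
    """Hypothetical container; values copied from the README lines in this patch."""

    lora_rank: int = 8
    lora_alpha: int = 16
    lora_dropout: float = 0.0
    rank_stabilized: bool = True
    epochs: int = 3
    learning_rate: float = 1e-5
    batch_size: int = 2
    grad_accum_steps: int = 4
    weight_decay: float = 0.01
    target_modules: tuple = (
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    )

    def lora_scaling(self) -> float:
        # Rank-stabilized LoRA scales the adapter update by alpha / sqrt(rank);
        # classic LoRA uses alpha / rank.
        if self.rank_stabilized:
            return self.lora_alpha / math.sqrt(self.lora_rank)
        return self.lora_alpha / self.lora_rank

    def effective_batch_size(self, num_devices: int = 1) -> int:
        # Assumption: batch_size is per device. The README does not state the
        # device count, so num_devices is a hypothetical parameter.
        return self.batch_size * self.grad_accum_steps * num_devices


settings = FinetuneSettings()
print(f"adapter scaling: {settings.lora_scaling():.3f}")
print(f"effective batch size: {settings.effective_batch_size()}")
```

Under these assumptions the adapter scaling is 16 / sqrt(8), about 5.66, where classic LoRA would give 16 / 8 = 2, and the effective batch size on a single device is 2 × 4 = 8.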