---
language:
- en
license: apache-2.0
tags:
- trl
- text-generation-inference
- unsloth
- mistral
- gguf
base_model: teknium/OpenHermes-2.5-Mistral-7B
datasets:
- sayhan/strix-philosophy-qa
library_name: transformers
---
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65aa2d4b356bf23b4a4da247/nN4JZlIMeF-K2sFYfhLLT.png)
# OpenHermes 2.5 Strix Philosophy Mistral 7B
- **Finetuned by:** [sayhan](https://huggingface.co/sayhan)
- **License:** [apache-2.0](https://choosealicense.com/licenses/apache-2.0/)
- **Finetuned from model:** [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)
- **Dataset:** [sayhan/strix-philosophy-qa](https://huggingface.co/datasets/sayhan/strix-philosophy-qa)
---
- **LoRA rank:** 8
- **LoRA alpha:** 16
- **LoRA dropout:** 0
- **Rank-stabilized LoRA:** Yes
- **Number of epochs:** 3
- **Learning rate:** 1e-5
- **Batch size:** 2
- **Gradient accumulation steps:** 4
- **Weight decay:** 0.01
- **Target modules:**
  - Query projection (`q_proj`)
  - Key projection (`k_proj`)
  - Value projection (`v_proj`)
  - Output projection (`o_proj`)
  - Gate projection (`gate_proj`)
  - Up projection (`up_proj`)
  - Down projection (`down_proj`)
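The adapter settings above can be sketched as a `peft` `LoraConfig`. This is a minimal sketch, not the original training script: the fine-tune was done with unsloth (whose `FastLanguageModel.get_peft_model` takes equivalent arguments), so the argument names here follow the `peft` API instead.

```python
from peft import LoraConfig

# LoRA adapter configuration mirroring the hyperparameters listed above.
lora_config = LoraConfig(
    r=8,             # LoRA rank
    lora_alpha=16,   # LoRA alpha
    lora_dropout=0.0,
    use_rslora=True, # rank-stabilized LoRA
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    task_type="CAUSAL_LM",
)
```

The remaining hyperparameters (3 epochs, learning rate 1e-5, batch size 2 with 4 gradient-accumulation steps for an effective batch size of 8, weight decay 0.01) belong to the trainer configuration rather than the LoRA config.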