---
language:
- en
license: apache-2.0
tags:
- trl
- text-generation-inference
- unsloth
- mistral
- gguf
base_model: teknium/OpenHermes-2.5-Mistral-7B
datasets:
- sayhan/strix-philosophy-qa
library_name: transformers
---

# OpenHermes 2.5 Strix Philosophy Mistral 7B

- **Finetuned by:** [sayhan](https://huggingface.co/sayhan)
- **License:** [apache-2.0](https://choosealicense.com/licenses/apache-2.0/)
- **Finetuned from model:** [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)
- **Dataset:** [sayhan/strix-philosophy-qa](https://huggingface.co/datasets/sayhan/strix-philosophy-qa)

---
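The base model, OpenHermes 2.5, is trained on the ChatML prompt format, so prompts to this fine-tune would typically be built the same way. A minimal sketch (the helper name and the example system/user strings are illustrative, not from the card):

```python
def chatml_prompt(system: str, user: str) -> str:
    """Build a single-turn ChatML prompt as used by OpenHermes 2.5."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = chatml_prompt(
    "You are a helpful philosophy tutor.",
    "What is Kant's categorical imperative?",
)
print(prompt)
```

In practice, `transformers` can produce the same string via the tokenizer's `apply_chat_template`, provided the repository ships a chat template.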
|
|
**LoRA rank:** 8

**LoRA alpha:** 16

**LoRA dropout:** 0

**Rank-stabilized LoRA:** Yes

**Number of epochs:** 3

**Learning rate:** 1e-5

**Batch size:** 2

**Gradient accumulation steps:** 4

**Weight decay:** 0.01

**Target modules:**

- Query projection (`q_proj`)
- Key projection (`k_proj`)
- Value projection (`v_proj`)
- Output projection (`o_proj`)
- Gate projection (`gate_proj`)
- Up projection (`up_proj`)
- Down projection (`down_proj`)
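With rank-stabilized LoRA enabled, the adapter update is scaled by `alpha / sqrt(rank)` rather than the standard `alpha / rank`, which keeps the update magnitude stable as rank grows. A plain-Python sketch of what the hyperparameters above imply (the variable names are illustrative):

```python
import math

# Hyperparameters from the card above
rank, alpha = 8, 16
batch_size, grad_accum = 2, 4

# Standard LoRA scales the adapter update by alpha / rank ...
standard_scale = alpha / rank              # 16 / 8 = 2.0
# ... while rank-stabilized LoRA (rsLoRA) uses alpha / sqrt(rank)
rs_scale = alpha / math.sqrt(rank)         # 16 / sqrt(8) ≈ 5.66

# Effective batch size per optimizer step
effective_batch = batch_size * grad_accum  # 2 * 4 = 8

print(standard_scale, round(rs_scale, 2), effective_batch)
```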