Compare commits
10 Commits
b92e903428...d82f3d8e4b
| Author | SHA1 | Date | |
|---|---|---|---|
| | d82f3d8e4b | | |
| | 4b36bc80c7 | | |
| | d9e88ab9e8 | | |
| | 41d100211a | | |
| | dcaf562b13 | | |
| | 254f3a8292 | | |
| | 0fd7e019aa | | |
| | 11a2fd05ef | | |
| | f539708f9f | | |
| | fe42bf6216 | | |
3
.gitattributes
vendored
@@ -49,3 +49,6 @@ openhermes-2.5-strix-philosophy-mistral-7b.Q4_K_S.gguf filter=lfs diff=lfs merge
 openhermes-2.5-strix-philosophy-mistral-7b.Q5_0.gguf filter=lfs diff=lfs merge=lfs -text
 openhermes-2.5-strix-philosophy-mistral-7b.Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
 openhermes-2.5-strix-philosophy-mistral-7b.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+openhermes-2.5-strix-philosophy-mistral-7b.Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
+openhermes-2.5-strix-philosophy-mistral-7b[[:space:]]Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+openhermes-2.5-strix-philosophy-mistral-7b.Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
30
README.md
@@ -3,7 +3,7 @@ language:
 - en
 license: apache-2.0
 tags:
-- transformers
+- trl
 - text-generation-inference
 - unsloth
 - mistral
@@ -11,11 +11,33 @@ tags:
 base_model: teknium/OpenHermes-2.5-Mistral-7B
 datasets:
 - sayhan/strix-philosophy-qa
-library_name: trl
+library_name: transformers
 ---
 
 # OpenHermes 2.5 Stix Philosophy Mistral 7B
 - **Finetuned by:** [sayhan](https://huggingface.co/sayhan)
-- **License:** apache-2.0
+- **License:** [apache-2.0](https://choosealicense.com/licenses/apache-2.0/)
 - **Finetuned from model :** [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)
 - **Dataset:** [sayhan/strix-philosophy-qa](https://huggingface.co/datasets/sayhan/strix-philosophy-qa)
+---
+**LoRA rank:** 8
+**LoRA alpha:** 16
+**LoRA dropout:** 0
+**Rank-stabilized LoRA:** Yes
+**Number of epochs:** 3
+**Learning rate:** 1e-5
+**Batch size:** 2
+**Gradient accumulation steps:** 4
+**Weight decay:** 0.01
+**Target modules:**
+```
+- Query projection (`q_proj`)
+- Key projection (`k_proj`)
+- Value projection (`v_proj`)
+- Output projection (`o_proj`)
+- Gate projection (`gate_proj`)
+- Up projection (`up_proj`)
+- Down projection (`down_proj`)
+```
+
+
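The hyperparameters added to the README imply a couple of derived quantities that may be useful when reviewing this change. The sketch below is not taken from the repository. It assumes training on a single device and assumes that "Rank-stabilized LoRA" refers to the usual rsLoRA convention of scaling the adapter update by `alpha / sqrt(rank)` instead of `alpha / rank`. The function name `lora_scaling` is hypothetical.

```python
import math

# Values copied from the README hunk in this diff.
lora_rank = 8
lora_alpha = 16
batch_size = 2
grad_accum_steps = 4


def lora_scaling(alpha: float, rank: int, rank_stabilized: bool) -> float:
    """Return the multiplier applied to the low-rank update.

    Assumption: classic LoRA uses alpha / rank, while rank-stabilized
    LoRA uses alpha / sqrt(rank).
    """
    if rank_stabilized:
        return alpha / math.sqrt(rank)
    return alpha / rank


# Assumption: one device, so the effective batch is the per-step batch
# multiplied by the number of accumulation steps.
effective_batch = batch_size * grad_accum_steps

classic_scale = lora_scaling(lora_alpha, lora_rank, rank_stabilized=False)
rs_scale = lora_scaling(lora_alpha, lora_rank, rank_stabilized=True)

print(f"effective batch size: {effective_batch}")
print(f"classic LoRA scaling: {classic_scale:.3f}")
print(f"rsLoRA scaling:       {rs_scale:.3f}")
```

Under these assumptions the rank-stabilized setting applies a noticeably larger multiplier than classic LoRA would at the same rank and alpha, which is worth keeping in mind when comparing against runs that did not enable it.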
3
openhermes-2.5-strix-philosophy-mistral-7b.fp16.bin
Normal file
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e9128f0e34d5668450c3a29a0885685aba5bf0d22ff4e876d4e39861b314b380
+size 14484764640
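The three lines added for `openhermes-2.5-strix-philosophy-mistral-7b.fp16.bin` are a Git LFS pointer, not the model weights themselves. As a rough illustration, the sketch below parses such a pointer into its fields and converts the recorded size to GiB. The helper name `parse_lfs_pointer` is hypothetical, and the parser only handles the simple `key value` layout seen in this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer made of 'key value' lines (a minimal sketch)."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first space only; the value may contain ':' or '/'.
        key, _, value = line.partition(" ")
        fields[key] = value

    # The oid field has the form '<hash algorithm>:<hex digest>'.
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:e9128f0e34d5668450c3a29a0885685aba5bf0d22ff4e876d4e39861b314b380
size 14484764640
"""

info = parse_lfs_pointer(POINTER)
size_gib = info["size"] / 2**30

print(info["hash_algo"], info["digest"][:12], f"{size_gib:.2f} GiB")
```

The recorded size is consistent with a model of roughly 7.2 billion parameters stored at two bytes per parameter, which fits the `fp16` suffix in the file name.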