初始化项目,由ModelHub XC社区提供模型
Model: Ba2han/HermesStar-OrcaWind-Synth-11B Source: Original Platform
This commit is contained in:
26
README.md
Normal file
26
README.md
Normal file
@@ -0,0 +1,26 @@
|
||||
---
|
||||
license: apache-2.0
|
||||
language:
|
||||
- en
|
||||
library_name: transformers
|
||||
pipeline_tag: text-generation
|
||||
---
|
||||
Open Hermes + Starling passthrough merged
|
||||
|
||||
SlimOrca(?)+Zephyr Beta linear merged, then passthrough merged with Synthia
|
||||
|
||||
Then both models were merged again in 1 to 0.3 ratio.
|
||||
|
||||
# My findings:
|
||||
|
||||
Increasing repetition penalty usually makes the model smarter up to a degree but it also causes stability issues.
|
||||
|
||||
Since most of the merged models were trained with ChatML, use ChatML template. Rarely the model throws another EOS token though.
|
||||
|
||||
- My favorite preset has been uploaded.
|
||||
- You can use some sort of CoT prompt instead of "system" in ChatML. It does improve the quality of most output.
|
||||
(You are an assistant. Break down the question and come to a conclusion.)
|
||||
|
||||
I don't know what I am doing, you are very welcome to put the model through benchmarks.
|
||||
|
||||
I'll also upload q6 GGUF but my internet is shit, so don't hesitate to share other quantizations.
|
||||
Reference in New Issue
Block a user