---
license: apache-2.0
language:
- en
- zh
base_model:
- janhq/Jan-v1-2509
- Gen-Verse/Qwen3-4B-RA-SFT
- TeichAI/Qwen3-4B-Instruct-2507-Polaris-Alpha-Distill
- TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill
- DavidAU/Qwen3-4B-Apollo-V0.1-4B-Thinking-Heretic-Abliterated
- nightmedia/Qwen3-4B-Agent
- FutureMa/Eva-4B
pipeline_tag: text-generation
library_name: transformers
tags:
- coding
- research
- deep thinking
- 1M context
- 256k context
- Qwen3
- All use cases
- creative
- creative writing
- fiction writing
- plot generation
- sub-plot generation
- story generation
- scene continue
- storytelling
- fiction story
- science fiction
- all genres
- story
- writing
- vivid prosing
- vivid writing
- fiction
- roleplaying
- bfloat16
- finetune
- mergekit
- merge
---
# Qwen3-4B-Element4-Eva

This is a model merge between Qwen3-4B-Element4 and FutureMa/Eva-4B.

Brainwaves of qx86-hi quants of the parent models
```brainwave
Element4      0.582,0.779,0.849,0.708,0.442,0.771,0.655
Eva-4B        0.539,0.747,0.864,0.606,0.412,0.751,0.605
```
Eva merged models
```brainwave
Agent-Eva     0.568,0.775,0.872,0.699,0.418,0.777,0.654
Element8-Eva  0.559,0.768,0.872,0.694,0.422,0.765,0.647

Element4-Eva
bf16          0.570,0.781,0.869,0.689,0.422,0.769,0.645
qx86-hi       0.567,0.781,0.868,0.689,0.426,0.773,0.642
qx64-hi       0.567,0.772,0.865,0.679,0.424,0.772,0.641
mxfp4         0.549,0.757,0.864,0.666,0.414,0.764,0.635
```
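To put the quant comparison in numbers, here is a minimal Python sketch that compares each Element4-Eva row in the table above against bf16. The helper name `compare_to_baseline` and the choice of summary statistics (mean score and largest per-column gap) are assumptions made for illustration; they are not part of the original evaluation, and the seven columns are treated as anonymous positions because the card does not name them.

```python
# Scores copied from the Element4-Eva rows of the brainwave table.
# Column order follows the table; the benchmark names are not stated
# in this card, so columns are only compared position by position.
SCORES = {
    "bf16":    [0.570, 0.781, 0.869, 0.689, 0.422, 0.769, 0.645],
    "qx86-hi": [0.567, 0.781, 0.868, 0.689, 0.426, 0.773, 0.642],
    "qx64-hi": [0.567, 0.772, 0.865, 0.679, 0.424, 0.772, 0.641],
    "mxfp4":   [0.549, 0.757, 0.864, 0.666, 0.414, 0.764, 0.635],
}


def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def compare_to_baseline(scores, baseline="bf16"):
    """Return {quant: (mean score, largest absolute per-column gap vs baseline)}.

    Hypothetical helper: the summary statistics are an illustrative choice.
    """
    base = scores[baseline]
    report = {}
    for name, row in scores.items():
        max_gap = max(abs(a - b) for a, b in zip(row, base))
        report[name] = (mean(row), max_gap)
    return report


if __name__ == "__main__":
    for name, (avg, gap) in compare_to_baseline(SCORES).items():
        print(f"{name:8s} mean={avg:.3f} max_gap_vs_bf16={gap:.3f}")
```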
Element4 is a merge of Qwen3-4B-Engineer3x and Qwen3-4B-Agent, and serves as a base for the higher-numbered Element models. The Agent is Heretic-abliterated, which creates some interesting friction in the model's chains of thought and enhances inference with some original AI humour.

The qx86-hi quant performs on par with full precision (bf16) in this model.

The Element models are profiled to act as agents on the Star Trek DS9 station in a roleplay scenario.

The models can be used for regular tasks as well.

Each comes with different skills. I recently found FutureMa/Eva-4B, which has an interesting model card:

> Eva-4B is a 4B-parameter model for detecting evasive answers in earnings call Q&A.

In Element8-Eva, that would be Quark. Element8 is a very rich merge, with lower metrics than Agent.

As I mentioned on the Element8-Eva model card, FutureMa/Eva-4B was included simply for its conversational skills.

-G