base_model, library_name, tags, datasets
base_model library_name tags datasets
ValiantLabs/Qwen3-4B-ShiningValiant3
ValiantLabs/Qwen3-4B-Esper3
Qwen/Qwen3-4B
transformers
mergekit
merge
qwen
qwen-3
qwen-3-4b
4b
reasoning
code
code-reasoning
code-instruct
python
javascript
dev-ops
jenkins
terraform
scripting
powershell
azure
aws
gcp
cloud
science
science-reasoning
physics
biology
chemistry
earth-science
astronomy
machine-learning
artificial-intelligence
compsci
computer-science
information-theory
ML-Ops
math
cuda
deep-learning
transformers
agentic
LLM
neuromorphic
self-improvement
complex-systems
cognition
linguistics
philosophy
logic
epistemology
simulation
game-theory
knowledge-management
creativity
problem-solving
architect
engineer
developer
creative
analytical
expert
rationality
conversational
chat
instruct
sequelbox/Celestia3-DeepSeek-R1-0528
sequelbox/Mitakihara-DeepSeek-R1-0528
sequelbox/Titanium2.1-DeepSeek-R1
sequelbox/Tachibana2-DeepSeek-R1
sequelbox/Raiden-DeepSeek-R1

PlumEsper

This is a merge of pre-trained language models created using mergekit, combining the specialty and general reasoning skills of Esper 3 4b and Shining Valiant 3 4b.

Merge Details

Merge Method

This model was merged using the DELLA merge method using Qwen/Qwen3-4B as a base.

Models Merged

The following models were included in the merge:

Configuration

The following YAML configuration was used to produce this model:

merge_method: della
dtype: bfloat16
parameters:
  normalize: true
models:
  - model: ValiantLabs/Qwen3-4B-Esper3
    parameters:
      density: 0.5
      weight: 0.3
  - model: ValiantLabs/Qwen3-4B-ShiningValiant3
    parameters:
      density: 0.5
      weight: 0.3
base_model: Qwen/Qwen3-4B

Description
Model synced from source: sequelbox/Qwen3-4B-PlumEsper
Readme 26 KiB
Languages
Text 100%