diff --git a/README.md b/README.md
new file mode 100644
index 0000000..66f861d
--- /dev/null
+++ b/README.md
@@ -0,0 +1,423 @@
---
license: apache-2.0
language:
- en
- fr
- zh
- de
tags:
- creative
- creative writing
- fiction writing
- plot generation
- sub-plot generation
- story generation
- scene continue
- storytelling
- fiction story
- science fiction
- romance
- all genres
- story
- writing
- vivid prose
- vivid writing
- moe
- mixture of experts
- 64 experts
- 8 active experts
- fiction
- roleplaying
- bfloat16
- rp
- qwen3
- horror
- finetune
- thinking
- reasoning
- qwen3_moe
base_model:
- kalomaze/Qwen3-16B-A3B
- huihui-ai/Moonlight-16B-A3B-Instruct-abliterated
pipeline_tag: text-generation
---

(quants uploading, examples to be added...)
A stranger, yet radically different version of Kalomaze's "Qwen/Qwen3-30B-A3B" (abliterated by "huihui-ai"), with the experts pruned to 64 (from 128) and four layers added, expanding the model to 18B total parameters.

The goal: slightly alter the model to address some odd creative thinking and output choices, AND de-censor (abliterate) the model.

Please note that the modifications affect the operation of the entire model; roughly, I adjusted the model to think a little "deeper" and "ponder" a bit - but this is a very rough description.

I also ran (non-creative) reasoning tests to ensure the model was not damaged and roughly matched the original model's performance.

That being said, reasoning and output generation will be altered regardless of your use case(s).
FOUR example generations below, with example 4 showing a complex prompt and impressive prose that showcases some of the changes in the model.

This is a MOE (Mixture of Experts) model with 8 of 64 experts activated by default, which is about 3B parameters.

This allows use of this model (with 8 experts) on both CPU (20-35 T/S) and GPU (90+ T/S) at very good to extremely fast speeds.

Changing the number of experts used (see below for how) will affect both generation speed and reasoning/output quality.

You can use this model with as few as four experts activated.

Even the lowest quant - Q2_K - will operate very strongly too.
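As a rough back-of-the-envelope sketch of how the active parameter count scales with the number of experts enabled, assuming a simple linear model fitted to the figures quoted above (18B total across 64 experts, roughly 3B active with the default 8) - these are estimates, not exact counts:

```python
# Rough sketch: estimate active parameters vs. experts enabled.
# Fitted to the figures above: 18B total with all 64 experts,
# ~3B active with the default 8. Estimates only.

TOTAL_B = 18.0      # total parameters (billions), all 64 experts
ACTIVE_8_B = 3.0    # approx. active parameters with 8 experts
NUM_EXPERTS = 64

# Solve shared + 64*e = 18 and shared + 8*e = 3 for the per-expert size e:
per_expert = (TOTAL_B - ACTIVE_8_B) / (NUM_EXPERTS - 8)   # ~0.27B
shared = ACTIVE_8_B - 8 * per_expert                      # ~0.86B

def active_params(k: int) -> float:
    """Estimated active parameters (billions) with k experts enabled."""
    return shared + k * per_expert

for k in (4, 8, 16):
    print(f"{k:2d} experts -> ~{active_params(k):.1f}B active")
```

This is why running with four experts (as mentioned above) is noticeably faster: under these assumptions it roughly cuts the active parameter count from ~3B to ~2B.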
Model is set at:
- 8 active experts (the default for the org model)
- 40k context (the default for the org model)
- CHATML or Jinja template (embedded, OR see Jinja notes below)
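The CHATML template listed above follows the standard `<|im_start|>` / `<|im_end|>` delimiter format. A minimal sketch of building such a prompt by hand (the role names and markers are standard ChatML; check the model's embedded Jinja template for its exact expectations):

```python
# Minimal ChatML prompt builder (sketch). The <|im_start|>/<|im_end|>
# markers are the standard ChatML delimiters; verify against the
# model's embedded Jinja template before relying on this exact layout.

def chatml_prompt(messages: list[dict]) -> str:
    """Render a list of {role, content} messages as a ChatML prompt."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
             for m in messages]
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "\n".join(parts)

prompt = chatml_prompt([
    {"role": "system", "content": "You are a helpful creative writer."},
    {"role": "user", "content": "Continue the scene in the cellar."},
])
print(prompt)
```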
QUANTS:

There are two sets of quants: regular and "MAX" (in the filename), with the output tensor kept at float16 (16-bit, unquantized) to enhance performance, including reasoning.
SYSTEM PROMPT:

You may or may not need to set this:

```
You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags.

Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional a journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.
```

You do not need to use this; it is only presented as an additional enhancement, which seems to help scene generation and scene-continue functions.

This enhancement WAS NOT used to generate the examples below.

---