Initialize project; model provided by the ModelHub XC community
Model: reaperdoesntknow/DualMind Source: Original Platform
210
README.md
Normal file
@@ -0,0 +1,210 @@
---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- qwen3
- sft
- trl
- dual-mind
- reasoning
- convergent-intelligence
- explore-examine-response
- convergentintel
- edge
- distillation
- knowledge-distillation
datasets:
- zai-org/LongWriter-6k
base_model:
- reaperdoesntknow/DiStil-Qwen3-1.7B-uncensored
---

# DualMind

**Single Architecture, Dual Cognition: The Multi-Model Collision Array on Shared Weights**

*Convergent Intelligence LLC: Research Division*

---

## What This Is

DualMind is a 1.7B parameter model that implements **dual-mental-modality reasoning**: a single model with two internal voices (explore and examine) that share the same weights and are differentiated only by role tokens, followed by a synthesized response:

- **`<explore>`**: Unconstrained reasoning. Derivation, speculation, working through the problem freely.
- **`<examine>`**: Adversarial self-response. The model reads its own explore output and critiques it. Error detection, verification, refinement.
- **`<response>`**: Clean synthesis. The final answer distilled from the internal dialogue.

This is the multi-model collision array collapsed into a single architecture. The dialectical structure that produces novel insights from architectural diversity (demonstrated in our [five-architecture collision experiments](https://huggingface.co/reaperdoesntknow)) is recreated through role-conditioned generation on shared weights.

## Architecture

| Parameter | Value |
|-----------|-------|
| Architecture | Qwen3ForCausalLM |
| Parameters | ~2.03B (1.7B effective) |
| Hidden Size | 2048 |
| Layers | 28 |
| Attention Heads | 16 (Q) / 8 (KV), GQA |
| Context Length | 40,960 tokens |
| Precision | BF16 (trained on H100) |
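To make the effect of grouped-query attention (GQA) concrete, the sketch below estimates the key/value cache size implied by the table. It is back-of-the-envelope arithmetic, not a measurement: the per-head dimension is not listed in the table, so the code assumes it equals hidden size divided by the number of query heads, and the function name `kv_cache_bytes` is purely illustrative.

```python
# Values copied from the architecture table.
NUM_LAYERS = 28
HIDDEN_SIZE = 2048
NUM_Q_HEADS = 16
NUM_KV_HEADS = 8
CONTEXT_LENGTH = 40_960
BYTES_PER_VALUE = 2  # BF16 stores each value in 2 bytes

# Assumption (not stated in the table): head_dim = hidden_size / query heads.
HEAD_DIM = HIDDEN_SIZE // NUM_Q_HEADS


def kv_cache_bytes(num_tokens: int, kv_heads: int = NUM_KV_HEADS) -> int:
    """Bytes needed to cache keys and values for `num_tokens` tokens."""
    # The leading factor 2 accounts for storing both K and V in every layer.
    per_token = 2 * NUM_LAYERS * kv_heads * HEAD_DIM * BYTES_PER_VALUE
    return per_token * num_tokens


gqa = kv_cache_bytes(CONTEXT_LENGTH)
# Hypothetical comparison: the same model with one KV head per query head.
mha = kv_cache_bytes(CONTEXT_LENGTH, kv_heads=NUM_Q_HEADS)

print(f"GQA cache at full context: {gqa / 2**30:.3f} GiB")
print(f"MHA cache at full context: {mha / 2**30:.3f} GiB")
```

Under these assumptions, halving the number of KV heads halves the cache, which is the main practical reason a 40,960-token context stays manageable on a small GPU. Actual memory use also depends on the runtime and any cache quantization.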

## Training

**Base model:** [Disctil-Qwen3-1.7B](https://huggingface.co/reaperdoesntknow/Disctil-Qwen3-1.7B) (DISC-refined uncensored Qwen3)

**Dataset:** [KK04/LogicInference_OA](https://huggingface.co/datasets/KK04/LogicInference_OA): logical inference problems transformed into the DualMind cognitive loop format.

**Training format:** Each chain-of-thought (CoT) solution is restructured into the DualMind format:
- Derivation sentences → `<explore>` block (reasoning phase)
- Verification/checking sentences → `<examine>` block (self-critique phase)
- Final answer → `<response>` block (synthesis)

Sentence-level splitting uses trigger detection (`check`, `verify`, `however`, `but wait`, etc.) to find the natural transition from reasoning to verification. When no trigger is found, it falls back to a 70/30 positional split.
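The splitting heuristic can be sketched roughly as follows. This is an illustrative reconstruction, not the training script: the trigger list contains only the four examples named above, the sentence splitter is deliberately naive, and the fallback is assumed to put the first 70% of sentences in `<explore>` and the rest in `<examine>`. The names `split_cot` and `to_dualmind` are hypothetical.

```python
import re
from typing import List, Tuple

# Assumption: only the triggers named in the card; the real list is longer ("etc.").
TRIGGERS = ("check", "verify", "however", "but wait")


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def split_cot(cot: str, fallback_percent: int = 70) -> Tuple[str, str]:
    """Split a CoT solution into (explore, examine) text."""
    sentences = split_sentences(cot)
    cut = None
    for i, sentence in enumerate(sentences):
        # Skip index 0 so the explore part is never empty.
        if i > 0 and any(t in sentence.lower() for t in TRIGGERS):
            cut = i
            break
    if cut is None:
        # Positional fallback (assumed direction: first 70% is explore).
        cut = max(1, len(sentences) * fallback_percent // 100)
    return " ".join(sentences[:cut]), " ".join(sentences[cut:])


def to_dualmind(cot: str, answer: str) -> str:
    """Wrap the split CoT and the final answer in the three role blocks."""
    explore, examine = split_cot(cot)
    return (
        f"<explore>\n{explore}\n</explore>\n\n"
        f"<examine>\n{examine}\n</examine>\n\n"
        f"<response>\n{answer}\n</response>"
    )


# Usage: the third sentence contains the trigger "check", so the split lands there.
example = to_dualmind(
    "Let a = 2m and b = 2n. Then a + b = 2(m + n). Check: 2 + 4 = 6 is even.",
    "The sum of two even numbers is even.",
)
print(example)
```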

**Hardware and hyperparameters:** Colab H100, BF16 precision; 512 steps, learning rate 5e-6, SFT via TRL.

**Next iteration:** Currently training on [Crownelius/Opus-4.6-Reasoning-3300x](https://huggingface.co/datasets/Crownelius/Opus-4.6-Reasoning-3300x), a set of 2,160 Claude Opus 4.6 reasoning samples with pre-separated `thinking`/`solution` columns, which eliminates the need for heuristic splitting.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "reaperdoesntknow/DualMind",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("reaperdoesntknow/DualMind")

# Start the explore block; the model completes the full loop
prompt = (
    "##USER:\n"
    "Prove that the sum of two even numbers is always even.\n\n"
    "<explore>\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    top_p=0.9,
    temperature=0.6,
    repetition_penalty=1.15,
)
result = tokenizer.decode(output[0], skip_special_tokens=True)
print(result)
```

### Expected Output Structure

```
<explore>
[The model works through the proof freely: definitions, algebraic manipulation, etc.]
</explore>

<examine>
[The model critiques its own derivation: checks for gaps, verifies steps, catches errors]
</examine>

<response>
[Clean final answer synthesized from the internal dialogue]
</response>
```
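Downstream code usually needs only the `<response>` block, or wants to log the three phases separately. The sketch below shows one way to pull them out of the decoded string. It assumes the role tags survive decoding as plain text (as they appear in the prompt in the Usage example) and that each block appears at most once; `parse_dualmind` is a hypothetical helper, not part of the model repository.

```python
import re
from typing import Dict, Optional

# Match an opening role tag, then lazily capture text up to the matching
# closing tag, or up to the end of the string if generation was cut off.
BLOCK_RE = re.compile(
    r"<(explore|examine|response)>\s*(.*?)\s*(?:</\1>|\Z)",
    re.DOTALL,
)


def parse_dualmind(text: str) -> Dict[str, Optional[str]]:
    """Return the contents of each role block, or None if a block is absent."""
    blocks: Dict[str, Optional[str]] = {
        "explore": None,
        "examine": None,
        "response": None,
    }
    for match in BLOCK_RE.finditer(text):
        role, body = match.group(1), match.group(2)
        if blocks[role] is None:  # keep the first occurrence only
            blocks[role] = body
    return blocks


# Usage with a hand-written string shaped like the structure shown above.
sample = (
    "##USER:\nIs 4 + 6 even?\n\n"
    "<explore>\n4 + 6 = 10 = 2 * 5.\n</explore>\n\n"
    "<examine>\nThe factorization is correct.\n</examine>\n\n"
    "<response>\nYes, 10 is even.\n</response>"
)
parsed = parse_dualmind(sample)
print(parsed["response"])
```

A block that is opened but never closed (for example when `max_new_tokens` runs out) is captured up to the end of the text, so a missing closing tag mid-output would swallow the blocks after it. Checking for `None` values is a cheap way to detect incomplete generations.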

## Why Dual Modality

Standard CoT prompting produces a single stream of reasoning. The model has one shot to get it right. DualMind gives the model a structural mechanism for self-correction:

1. **Explore** is free to make mistakes, speculate, and try approaches that might not work.
2. **Examine** reads the explore output adversarially: it is looking for errors, not confirming correctness.
3. **Response** has the benefit of both perspectives.

This mirrors what happens in multi-model collision arrays where different architectures produce genuinely different failure modes, and the collision between them surfaces structure that neither achieves alone. DualMind recreates this dynamic within a single set of weights through role conditioning.

## Distillation Chain

```
Qwen3-1.7B (base)
  → DiStil-Qwen3-1.7B-uncensored (uncensored SFT)
    → Disctil-Qwen3-1.7B (DISC refinement)
      → DualMind (DualMind SFT on LogicInference_OA) ← you are here
```

## Mathematical Foundations: Discrepancy Calculus (DISC)

DualMind's dual-cognition architecture connects to Discrepancy Calculus through **Continuous Thought Dynamics** (Ch. 19 of the DISC monograph), which models inference as a discrepancy-guided PDE where the explore→examine→response cycle corresponds to a controlled trajectory through cognitive phase space.

The discrepancy operator:

$$Df(x) = \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon} \int_x^{x+\varepsilon} \frac{|f(t) - f(x)|}{|t - x|}\, dt$$

quantifies the mismatch between what the model generates (integration) and what it should generate (differentiation). The `<explore>` phase increases discrepancy energy freely; `<examine>` applies the Adaptive Discrepancy Derivative (ADD, Ch. 14) to detect drift; `<response>` minimizes residual discrepancy into a clean output. The three phases implement the bounded-variation (BV) decomposition operationally: smooth reasoning, jump corrections at error boundaries, and Cantor-type refinement of subtle drift.

Full theory: *"On the Formal Analysis of Discrepancy Calculus"* (Colca, 2026; Convergent Intelligence LLC: Research Division).
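As a sanity check on the discrepancy operator defined above, two cases can be worked by hand. These are our own illustrative calculations from the stated formula, not results quoted from the monograph. For a linear function the difference quotient is constant, so the averaging integral returns the absolute slope; for a unit jump at $x$ the integrand behaves like $1/(t-x)$ and the integral diverges.

$$
\begin{aligned}
f(t) = ct: \quad & \frac{|f(t) - f(x)|}{|t - x|} = |c|
  \;\Rightarrow\; Df(x) = \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon} \int_x^{x+\varepsilon} |c|\, dt = |c|, \\
f(t) = \mathbf{1}_{\{t > x\}}: \quad & \frac{|f(t) - f(x)|}{|t - x|} = \frac{1}{t - x}
  \;\Rightarrow\; \int_x^{x+\varepsilon} \frac{dt}{t - x} = +\infty
  \;\Rightarrow\; Df(x) = +\infty.
\end{aligned}
$$

More generally, if $f$ has a right derivative $f'_+(x)$, the integrand converges to $|f'_+(x)|$ as $t \downarrow x$ and so does its average, giving $Df(x) = |f'_+(x)|$. In this reading the operator is finite on smooth stretches and blows up at jumps, which matches the smooth-versus-jump distinction drawn in the BV decomposition.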

## Related Models

| Model | Description | Downloads |
|-------|-------------|-----------|
| [TopologicalQwen](https://huggingface.co/reaperdoesntknow/TopologicalQwen) | TKD + DualMind on physics CoT | 622 |
| [Disctil-Qwen3-1.7B](https://huggingface.co/reaperdoesntknow/Disctil-Qwen3-1.7B) | Parent model (DISC-refined) | 286 |
| [Qwen3-1.7B-Thinking-Distil](https://huggingface.co/reaperdoesntknow/Qwen3-1.7B-Thinking-Distil) | TKD with Thinking teacher | 687 |

**[DualMind Collection](https://huggingface.co/collections/reaperdoesntknow/dualmind)**: Dual-cognition model series

**[DistilQwen Collection](https://huggingface.co/collections/reaperdoesntknow/distilqwen-69bf40ec669117e3f069ef1c)**: Full proof-weighted distillation series

Full methodology: [Structure Over Scale (DOI: 10.57967/hf/8165)](https://doi.org/10.57967/hf/8165)

## Citation

```bibtex
@misc{colca2026dualmind,
  title={DualMind: Dual-Mental-Modality Reasoning via Role-Conditioned Self-Critique},
  author={Colca, Roy S.},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/reaperdoesntknow/DualMind},
  note={Convergent Intelligence LLC: Research Division}
}
```

---

*Convergent Intelligence LLC: Research Division*
*"Where classical analysis fails to see, we begin."*
<!-- cix-keeper-ts:2026-06-12T13:15:31Z -->
<!-- card-refresh: 2026-03-30 -->

---

## Convergent Intelligence Portfolio

*Part of the [DualMind Series](https://huggingface.co/collections/reaperdoesntknow/dualmind-69c93f888c6e79ecc69cf41e) by [Convergent Intelligence LLC: Research Division](https://huggingface.co/reaperdoesntknow)*

### DualMind Family

| Model | Format | Description |
|-------|--------|-------------|
| [DualMind](https://huggingface.co/reaperdoesntknow/DualMind) | BF16 | LogicInference-trained. Explore→Examine→Response loop. |
| [DualMinded-Qwen3-1.7B](https://huggingface.co/reaperdoesntknow/DualMinded-Qwen3-1.7B) | BF16 | Opus 4.6 reasoning traces. Higher quality splits. |
| [Dualmind-Qwen-1.7B-Thinking](https://huggingface.co/reaperdoesntknow/Dualmind-Qwen-1.7B-Thinking) | BF16 | Thinking-teacher variant with extended deliberation. |
| [DualMind-GGUF](https://huggingface.co/reaperdoesntknow/DualMind-GGUF) | GGUF | Quantized LogicInference variant. CPU/6GB GPU. |
| [DualMinded-Qwen3-1.7B-GGUF](https://huggingface.co/reaperdoesntknow/DualMinded-Qwen3-1.7B-GGUF) | GGUF | Quantized Opus variant. Ollama ready. |

### Papers

| Paper | DOI |
|-------|-----|
| [Structure Over Scale](https://huggingface.co/reaperdoesntknow/Structure-Over-Scale) | 10.57967/hf/8165 |
| [Three Teachers to Dual Cognition](https://huggingface.co/reaperdoesntknow/DualMind_Methodolgy) | 10.57967/hf/8184 |
| [Discrepancy Calculus](https://huggingface.co/reaperdoesntknow/Discrepancy_Calculus) | 10.57967/hf/8194 |

---

*Last updated: 2026-03-31 by Convergent Intelligence LLC: Research Division*