---
base_model: 
- mistralai/Mistral-Nemo-Instruct-2407   
- Vortex5/Prototype-X-12b
- Vortex5/Stellar-Witch-12B
- Vortex5/Celestial-Queen-12B
- Vortex5/Moonlit-Mirage-12B
- Vortex5/Crimson-Constellation-12B
- Vortex5/Wicked-Nebula-12B
library_name: transformers
tags:
- mergekit
- merge
- mistral
- nemo
- karcher_stock
widget:
  - text: "Geodesic-Phantom-12B"
    output:
      url: https://cdn-uploads.huggingface.co/production/uploads/69e46bb84df2a2575b60a527/7tnIXKdUUtGLGkbcGPRGK.jpeg
---
# 👻 Geodesic Phantom 12B

![geodesic-phantom](https://cdn-uploads.huggingface.co/production/uploads/69e46bb84df2a2575b60a527/7tnIXKdUUtGLGkbcGPRGK.jpeg)

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

The merge took 7 hours on a RunPod A40 using an [adaptive VRAM chunking script](https://huggingface.co/spaces/Naphula/model_tools/blob/main/graph_v18_runpod_A40.py) (based on `measure.py` by [GrimJim](https://huggingface.co/grimjim)). The script halves the chunk size whenever it hits an out-of-memory error, as the log excerpt shows:

```bat
WARNING:mergekit.graph:OOM at chunk 65536, reducing to 32768 (attempt 1, progress: 0/131075)
WARNING:mergekit.graph:OOM at chunk 32768, reducing to 16384 (attempt 2, progress: 0/131075)

[Karcher_Stock Audit] Layer: lm_head.weight
Stats: Cos(θ): 0.564 | t-factor: 0.8843 | Karcher Iters: 2960
  (Base)  mistralai--Mistral-Nemo-Instruct-2407               : █████                                              ( 11.57%)
  (Donor) Vortex5--Prototype-X-12b                            : ███████                                            ( 14.74%)
  (Donor) Vortex5--Stellar-Witch-12B                          : ███████                                            ( 14.74%)
  (Donor) Vortex5--Celestial-Queen-12B                        : ███████                                            ( 14.74%)
  (Donor) Vortex5--Moonlit-Mirage-12B                         : ███████                                            ( 14.74%)
  (Donor) Vortex5--Crimson-Constellation-12B                  : ███████                                            ( 14.74%)
  (Donor) Vortex5--Wicked-Nebula-12B                          : ███████                                            ( 14.74%)
```
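The percentages in the audit appear to follow directly from the t-factor: the base model keeps `1 - t` of the weight and the remaining `t` is split evenly across the six donors. This reading is inferred from the numbers in the log, not from the `karcher_stock` source, and the helper name `audit_shares` is hypothetical. A minimal check:

```python
def audit_shares(t_factor, n_donors):
    """Split weight between the base (1 - t) and n equally weighted donors (t / n).

    Assumption: this mirrors how the audit percentages are derived; it is
    inferred from the logged numbers only.
    """
    base_share = 1.0 - t_factor
    donor_share = t_factor / n_donors
    return base_share, donor_share


base_share, donor_share = audit_shares(0.8843, 6)
print(f"base  share: {base_share:.2%}")   # 11.57%
print(f"donor share: {donor_share:.2%}")  # 14.74%
```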

The following patch to `karcher_stock` was also required for this merge:

## `karcher_stock` Adaptive Tanh Soft-Clamp v11

```py
        # ── 11. Model Stock t factor with Adaptive Soft-Clamp ─────────────
        N = len(ws_2d)
        ct = cos_theta.unsqueeze(-1) if cos_theta.dim() > 0 else cos_theta

        # Raw Model Stock formula: t = N*cos(θ) / (1 + (N - 1)*cos(θ))
        denom = 1.0 + (N - 1) * ct
        # Clamp the denominator to at least 1e-6 to prevent division by zero
        t_raw = (N * ct) / denom.clamp(min=1e-6)

        # --- BULLETPROOF TANH CLAMP ---
        # 1. Prevent negative infinity spikes (fallback to base model)
        t_clamped_bottom = torch.clamp(t_raw, min=0.0)
        
        # 2. Smoothly asymptote positive spikes to L (Maximum allowed t-factor)
        L = 1.5 
        excess = torch.clamp(t_clamped_bottom - 1.0, min=0.0)
        t_soft_top = 1.0 + (L - 1.0) * torch.tanh(excess / (L - 1.0))
        
        # 3. Apply: If t <= 1.0, use exact math. If t > 1.0, use soft curve.
        t = torch.where(t_clamped_bottom <= 1.0, t_clamped_bottom, t_soft_top)
        # ------------------------------
```
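To get a feel for the clamp outside of mergekit, here is a scalar sketch of the same three steps using only the standard library. The function names `raw_t` and `soft_clamp`, and the use of plain floats instead of tensors, are assumptions made for illustration; the patch itself operates on `torch` tensors.

```python
import math


def raw_t(cos_theta, n):
    """Model Stock ratio with the denominator floored at 1e-6, as in the patch."""
    denom = max(1.0 + (n - 1) * cos_theta, 1e-6)
    return (n * cos_theta) / denom


def soft_clamp(t_raw, limit=1.5):
    """Scalar version of the clamp: floor at 0, exact up to 1, tanh curve above 1."""
    t_bottom = max(t_raw, 0.0)              # step 1: negative spikes fall back to 0
    if t_bottom <= 1.0:                     # step 3: exact value when t <= 1
        return t_bottom
    excess = t_bottom - 1.0                 # step 2: approach `limit` smoothly
    return 1.0 + (limit - 1.0) * math.tanh(excess / (limit - 1.0))


for t in (-5.0, 0.5, 1.0, 1.2, 3.0, 100.0):
    print(f"t_raw={t:8.2f} -> t={soft_clamp(t):.4f}")
```

Values at or below 1.0 pass through unchanged, negative values collapse to 0.0, and anything larger is squeezed into the interval between 1.0 and `limit`.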

## Example of the clamp preventing merge corruption
![tanh_clamp](https://cdn-uploads.huggingface.co/production/uploads/68e840caa318194c44ec2a04/eRdxOMhKsRysDgP-6Pkw0.png)

## Merge Details
### Merge Method

This model was merged using the `karcher_stock` merge method, with `/workspace/models/mistralai--Mistral-Nemo-Instruct-2407` (a local copy of mistralai/Mistral-Nemo-Instruct-2407) as the base.

### Models Merged

The following models were included in the merge:
* /workspace/models/Vortex5--Wicked-Nebula-12B
* /workspace/models/Vortex5--Celestial-Queen-12B
* /workspace/models/Vortex5--Moonlit-Mirage-12B
* /workspace/models/Vortex5--Stellar-Witch-12B
* /workspace/models/Vortex5--Prototype-X-12b
* /workspace/models/Vortex5--Crimson-Constellation-12B

### Configuration

The following YAML configuration was used to produce this model:

```yaml
architecture: MistralForCausalLM
base_model: /workspace/models/mistralai--Mistral-Nemo-Instruct-2407
models:
  - model: /workspace/models/Vortex5--Prototype-X-12b
  - model: /workspace/models/Vortex5--Celestial-Queen-12B
  - model: /workspace/models/Vortex5--Wicked-Nebula-12B
  - model: /workspace/models/Vortex5--Stellar-Witch-12B
  - model: /workspace/models/Vortex5--Moonlit-Mirage-12B
  - model: /workspace/models/Vortex5--Crimson-Constellation-12B
merge_method: karcher_stock # v8
parameters:  
  filter_wise: true
  max_iter: 10000
  min_iter: 1000
  tol: 1.0e-11
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: union
chat_template: auto
name: 👻 Geodesic Phantom 12B
```
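The `max_iter`, `min_iter`, and `tol` parameters suggest an iterative solver with a convergence threshold. The `karcher_stock` implementation is not shown in this README, so the following is only a sketch of what a Karcher (Riemannian) mean on the unit sphere with those three stopping parameters could look like. The function name, the pure-Python vectors, and the exact meaning of each parameter are assumptions, and the inputs are assumed not to be antipodal.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


def karcher_mean_sphere(points, tol=1e-11, min_iter=0, max_iter=10000):
    """Iterative Karcher mean of unit vectors (hypothetical helper).

    Stops once the mean tangent step is below `tol` and at least `min_iter`
    iterations have run, or after `max_iter` iterations.
    """
    pts = [_normalize(p) for p in points]
    dim = len(pts[0])
    # Start from the normalized arithmetic mean.
    mu = _normalize([sum(p[i] for p in pts) / len(pts) for i in range(dim)])
    iters = 0
    for iters in range(1, max_iter + 1):
        grad = [0.0] * dim
        for p in pts:
            c = max(-1.0, min(1.0, _dot(mu, p)))
            theta = math.acos(c)
            if theta < 1e-12:
                continue  # point coincides with mu, log map is zero
            scale = theta / math.sin(theta)
            for i in range(dim):
                grad[i] += scale * (p[i] - c * mu[i])  # log map of p at mu
        grad = [g / len(pts) for g in grad]
        step = math.sqrt(_dot(grad, grad))
        if step < tol and iters >= min_iter:
            break
        if step > 0.0:
            # Exponential map: move along the geodesic in direction grad.
            mu = _normalize([math.cos(step) * m + math.sin(step) * g / step
                             for m, g in zip(mu, grad)])
    return mu, iters


mean, n_iters = karcher_mean_sphere([[1, 0, 0], [0, 1, 0], [0, 0, 1]], min_iter=5)
print(mean, n_iters)
```

With `min_iter` set, the loop keeps running even after the step falls below `tol`. That would be consistent with the logged `Karcher Iters: 2960` exceeding the configured `min_iter: 1000`, though the log alone does not confirm it.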

Initial project commit; model provided by the ModelHub XC community. Model: OrobasVault/Geodesic-Phantom-12B. Source: Original Platform. 2026-05-29 15:34:16 +08:00