---
model-index:
- name: wind-edge-1.6@f16
  results:
  - task:
      type: text-generation
      name: Code Generation
    dataset:
      name: CodeBench-30
      type: North-ML1/CodeBench-30
      split: train
    metrics:
    - name: Overall Accuracy
      type: accuracy
      value: 6.25
      verified: false
    - name: Easy Tier Accuracy
      type: accuracy
      value: 17.14
      verified: false
    - name: Medium Tier Accuracy
      type: accuracy
      value: 0.00
      verified: false
    - name: Hard Tier Accuracy
      type: accuracy
      value: 0.00
      verified: false
library_name: transformers
pipeline_tag: text-generation
tags:
- wind-edge
- causal-lm
- edge
- small-language-model
- 0.4b
license: mit
datasets:
- Jackrong/GLM-5.1-Reasoning-1M-Cleaned
language:
- en
base_model: North-ML1/Wind-Edge-1.6-Instruct
---

Wind Edge 1.6 — Geode (0.4B)

A 0.4B-parameter causal language model built for edge deployment. Fast, small, and honest about what it can do.

North ML · Wind Arc 1.5 Preview


Overview

Wind Edge 1.6 (Geode) is a compact LLM trained for real-time, on-device inference. At 0.4B parameters it sits in the ultra-small tier — expect strong common-sense and classification performance, limited hard reasoning.

Best use cases:

  • Instruction-following dialogue (short to medium turns)
  • Text classification and sentiment
  • Light code completion
  • Summarization of short passages

Not recommended for: multi-step math, complex logical chains, long-context tasks.


Changes vs 1.5

  • Improved instruction adherence on structured output formats
  • More stable multi-sentence generation (fewer mid-sequence repetitions)
  • Reduced hallucination rate on short factual queries (internal held-out eval)

Honest Benchmark Estimates

Realistic ranges for a well-trained 0.4B model — not cherry-picked numbers.

| Task | Expected Range | Notes |
|---|---|---|
| Common Sense (0-shot) | 0.60-0.68 | Reliable strength |
| Sentiment Analysis | 0.70-0.80 | Reliable strength |
| Text Classification | 0.68-0.78 | Reliable strength |
| Reading Comprehension | 0.52-0.63 | Context-dependent |
| Summarization | 0.58-0.68 | Short docs only |
| Code Generation | 0.45-0.58 | Simple tasks only |
| Math Reasoning | 0.15-0.28 | Known weak point at this scale |
| Logical Reasoning | 0.18-0.28 | Known weak point at this scale |

A 0.4B model cannot compete with 7B+ models on reasoning, and Geode doesn't pretend to.


Usage

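from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hub.
model = AutoModelForCausalLM.from_pretrained("north-ml1/wind-edge-1.6")
tokenizer = AutoTokenizer.from_pretrained("north-ml1/wind-edge-1.6")

# Append the user's message to the plain-text prompt (example question shown).
user_message = "Who are you?"
inputs = tokenizer("You are Wind Edge, a helpful AI assistant.\nUser: " + user_message, return_tensors="pt")
# do_sample=True is required for temperature and top_p to take effect.
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.6, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))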

| Parameter | Value |
|---|---|
| temperature | 0.0 |
| top_p | 0.95 |
| min_p | 0.05 |
| max_new_tokens | 256-512 |
| repetition_penalty | 1.1 |
| context_limit | 1024-4096 |
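
The parameter values above can be turned into keyword arguments for `model.generate()`. The sketch below is illustrative only: `build_generate_kwargs` is a hypothetical helper, not part of Transformers, and it assumes that a temperature of 0.0 is meant as greedy decoding (sampling in Transformers needs a strictly positive temperature). The default of 512 for `max_new_tokens` is likewise an assumed choice, and `min_p` needs a reasonably recent Transformers release.

```python
def build_generate_kwargs(temperature=0.0, top_p=0.95, min_p=0.05,
                          max_new_tokens=512, repetition_penalty=1.1):
    """Translate sampling settings into keyword arguments for model.generate()."""
    kwargs = {
        "max_new_tokens": max_new_tokens,
        "repetition_penalty": repetition_penalty,
    }
    if temperature > 0:
        # Sampling mode: temperature, top_p and min_p only apply with do_sample=True.
        kwargs.update(do_sample=True, temperature=temperature,
                      top_p=top_p, min_p=min_p)
    else:
        # Assumption: temperature 0.0 means greedy decoding, so the
        # sampling filters are left out entirely.
        kwargs["do_sample"] = False
    return kwargs


greedy = build_generate_kwargs()                              # table defaults
sampled = build_generate_kwargs(temperature=0.6, top_p=0.9)   # values from the usage snippet
print(greedy)
print(sampled)
```

Either dictionary can then be passed as `model.generate(**inputs, **greedy)` or `model.generate(**inputs, **sampled)`.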

GGUF Quantizations

GGUF quants converted from arthu1/Wind-Edge-1.6-Instruct using a Qwen3-compatible tensor layout. The Transformers repo remains canonical — use these for llama.cpp, LM Studio, Ollama-style runtimes, and any other GGUF-compatible inference stack.

Files

| File | Bits per weight | Use |
|---|---|---|
| Wind-Edge-1.6-TQ1_0.gguf | ~1.7 bpw | Experimental 1-bit/ternary. Lowest quality, smallest size. |
| Wind-Edge-1.6-TQ2_0.gguf | ~2.1 bpw | Very small 2-bit/ternary option. |
| Wind-Edge-1.6-IQ3_M.gguf | ~3.7 bpw | Good balance for tiny devices. |
| Wind-Edge-1.6-Q4_K_M.gguf | ~4.6 bpw | Recommended default. |
| Wind-Edge-1.6-Q6_K.gguf | ~6.1 bpw | Higher quality, still compact. |
| Wind-Edge-1.6-Q8_0.gguf | ~8.5 bpw | Near-lossless practical quant. |
| Wind-Edge-1.6-F16.gguf | 16 bpw | Unquantized 16-bit GGUF export. |

Q4_K_M, Q6_K, and Q8_0 are the recommended daily drivers. TQ1_0 and TQ2_0 are included for constrained edge hardware, but expect a measurable loss in reasoning and factual accuracy.
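
To get a feel for what each quant costs in storage, the bits-per-weight figures can be converted into a rough file size. This is a back-of-the-envelope sketch, not a measurement: it assumes exactly 0.4e9 parameters (the card only says ~0.4B) and ignores GGUF metadata and any tensors stored at a different precision, so real files will differ somewhat.

```python
PARAMS = 0.4e9  # assumed parameter count; the card states ~0.4B

# Approximate bits per weight, taken from the file list above.
BPW = {
    "TQ1_0": 1.7,
    "TQ2_0": 2.1,
    "IQ3_M": 3.7,
    "Q4_K_M": 4.6,
    "Q6_K": 6.1,
    "Q8_0": 8.5,
    "F16": 16.0,
}


def estimate_size_mb(bpw, params=PARAMS):
    """Rough size in megabytes: parameters x bits per weight / 8 bits per byte."""
    return params * bpw / 8 / 1e6


for name, bpw in BPW.items():
    print(f"{name:7s} ~{estimate_size_mb(bpw):.0f} MB")
```

Under these assumptions the recommended Q4_K_M file comes out at roughly 230 MB, against about 800 MB for F16.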

llama.cpp

llama-cli \
  -m Wind-Edge-1.6-Q4_K_M.gguf \
  -cnv \
  --temp 0.6 \
  --top-p 0.9 \
  --repeat-penalty 1.06 \
  -n 512

For deterministic output, use --temp 0 and keep prompts short.

Chat Template

The GGUF metadata includes the chat template. If your runtime doesn't apply it automatically, format prompts like this:

<|im_start|>system
You are Wind-Edge-1.6, a compact AI assistant model. You are not a human.<|im_end|>
<|im_start|>user
Who are you?<|im_end|>
<|im_start|>assistant
<think>
</think>
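
For runtimes that take a raw prompt string, the layout above can be assembled by hand. The following is a minimal sketch under that assumption; `render_chatml` is a hypothetical helper that mirrors the example above, including the empty think block that opens the assistant turn, and it is not the template shipped in the GGUF metadata.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the layout shown above."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open the assistant turn with an empty think block, as in the example.
        parts.append("<|im_start|>assistant\n<think>\n</think>\n")
    return "".join(parts)


prompt = render_chatml([
    {"role": "system",
     "content": "You are Wind-Edge-1.6, a compact AI assistant model. You are not a human."},
    {"role": "user", "content": "Who are you?"},
])
print(prompt)
```

Where the runtime exposes the embedded template, prefer that over a hand-built string so that the prompt stays in sync with the model files.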

Model Details

| Property | Value |
|---|---|
| Parameters | ~0.4B |
| Architecture | Causal LM (decoder-only) |
| Context Length | 8192 tokens |
| Quantization | 1- to 16-bit (GGUF) |
| Org | north-ml1 |

License

MIT
