dystrio/Mistral-7B-Instruct-v0.3-sculpt-default

ModelHub XC b81e7bc93c Initialize project; model provided by the ModelHub XC community

Model: dystrio/Mistral-7B-Instruct-v0.3-sculpt-default
Source: Original Platform

2026-06-20 18:42:54 +08:00


dystrio/Mistral-7B-Instruct-v0.3-sculpt-default

11% smaller with lower WikiText-103 perplexity (0.923x baseline), as a drop-in replacement. No custom kernels. No runtime changes.

Dystrio Sculpt structurally compresses transformer models, producing dense models that load with the standard Hugging Face Transformers library: no custom code, no new ops, no deployment friction.

This is the Default tier of Mistral 7B Instruct v0.3.

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "dystrio/Mistral-7B-Instruct-v0.3-sculpt-default"

# Load the weights in bf16; device_map="auto" needs the accelerate package installed.
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="bfloat16", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# Tokenize a plain-text prompt and move the tensors to the model's device.
inputs = tokenizer("The future of AI inference is", return_tensors="pt").to(model.device)
# Generate up to 100 new tokens, then decode them without special tokens.
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Benchmark Results

All tiers were compiled from Mistral 7B Instruct v0.3; results were measured on an A100 80GB in bf16:

Model	PPL	PPL Ratio	Weights (GiB)	Chat Prefill TPS	RAG TTFT p95 (ms)	Decode TPS
Baseline	12.5983	1.0	13.500496	10557.3	133.325	66.8
sculpt-default	11.6283	0.923	12.000496	11594.3	123.069	65.3
sculpt-production	14.2859	1.134	11.250496	12093.9	120.842	66.0
sculpt-throughput	16.3355	1.2966	10.406746	12667.0	112.683	65.8
sculpt-experimental	25.1515	1.9964	9.562996	13595.9	110.293	66.5
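The PPL Ratio column and the size reductions can be rechecked from the raw figures in the table. The following is a minimal sketch, not the authors' benchmark tooling: the names `ROWS` and `relative_to_baseline` are illustrative, and only the numbers are taken from the table.

```python
# Rows copied from the benchmark table: name -> (perplexity, weight size as listed).
ROWS = {
    "Baseline": (12.5983, 13.500496),
    "sculpt-default": (11.6283, 12.000496),
    "sculpt-production": (14.2859, 11.250496),
    "sculpt-throughput": (16.3355, 10.406746),
    "sculpt-experimental": (25.1515, 9.562996),
}


def relative_to_baseline(rows, baseline="Baseline"):
    """Return the PPL ratio and weight-size reduction of every row vs. the baseline row."""
    base_ppl, base_size = rows[baseline]
    summary = {}
    for name, (ppl, size) in rows.items():
        summary[name] = {
            "ppl_ratio": ppl / base_ppl,
            "size_reduction_pct": 100.0 * (1.0 - size / base_size),
        }
    return summary


summary = relative_to_baseline(ROWS)
for name, stats in summary.items():
    print(f"{name:20s} PPL ratio {stats['ppl_ratio']:.4f}  "
          f"size reduction {stats['size_reduction_pct']:.1f}%")
```

Because the size reduction is a ratio, it does not depend on which unit the weight column uses.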

Key Metrics (this model)

Metric	Value
Weights memory	12.000496 GiB (11% smaller)
PPL ratio	0.923
Chat prefill TPS	11594.3 (+10%)
RAG TTFT p95	123.069 ms (-8%)
Decode TPS	65.3 (-2%)
Parameters	6.44B
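The percentages and the parameter count above follow from simple arithmetic on the benchmark figures. The sketch below assumes the weight figure is in binary gigabytes (2^30 bytes) and that each bf16 parameter takes 2 bytes; under those assumptions 12.000496 reproduces the stated 6.44B parameters. The helper names are illustrative.

```python
GIB = 2**30               # bytes per binary gigabyte (assumed unit of the weight figure)
BYTES_PER_BF16_PARAM = 2  # bf16 stores each parameter in 16 bits


def pct_change(new, old):
    """Signed percentage change of `new` relative to `old`."""
    return 100.0 * (new - old) / old


def params_from_weight_gib(weight_gib, bytes_per_param=BYTES_PER_BF16_PARAM):
    """Estimate the parameter count from the weight memory footprint."""
    return weight_gib * GIB / bytes_per_param


prefill_change = pct_change(11594.3, 10557.3)   # chat prefill TPS vs. baseline
ttft_change = pct_change(123.069, 133.325)      # RAG TTFT p95 vs. baseline
decode_change = pct_change(65.3, 66.8)          # decode TPS vs. baseline
params_billion = params_from_weight_gib(12.000496) / 1e9

print(f"prefill {prefill_change:+.1f}%, TTFT {ttft_change:+.1f}%, decode {decode_change:+.1f}%")
print(f"estimated parameters: {params_billion:.2f}B")
```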

All Sculpt Tiers

Tier	HuggingFace	Size	PPL Ratio	Use Case
default	dystrio/Mistral-7B-Instruct-v0.3-sculpt-default 👈 this model	12.000496 GiB	0.923	Zero-regret: quality preserved, smaller footprint
production	dystrio/Mistral-7B-Instruct-v0.3-sculpt-production	11.250496 GiB	1.134	Practical savings with modest quality tradeoff
throughput	dystrio/Mistral-7B-Instruct-v0.3-sculpt-throughput	10.406746 GiB	1.2966	Maximum usable compression for speed/edge
experimental	dystrio/Mistral-7B-Instruct-v0.3-sculpt-experimental	9.562996 GiB	1.9964	Boundary exploration, maximum structural compression
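One way to choose between the tiers is to fix a perplexity budget and take the smallest tier that stays within it. The sketch below illustrates that idea only: `TIERS` and `smallest_tier_within` are hypothetical names, and PPL ratio is a single quality signal, so downstream task scores may move differently.

```python
# (tier name, weight size as listed in the table, PPL ratio) copied from the tier table.
TIERS = [
    ("default", 12.000496, 0.923),
    ("production", 11.250496, 1.134),
    ("throughput", 10.406746, 1.2966),
    ("experimental", 9.562996, 1.9964),
]


def smallest_tier_within(max_ppl_ratio, tiers=TIERS):
    """Return the name of the smallest tier whose PPL ratio is within budget, or None."""
    eligible = [tier for tier in tiers if tier[2] <= max_ppl_ratio]
    if not eligible:
        return None
    # Among eligible tiers, pick the one with the smallest weight footprint.
    return min(eligible, key=lambda tier: tier[1])[0]


for budget in (1.0, 1.2, 1.3, 2.0):
    print(f"PPL ratio budget {budget}: {smallest_tier_within(budget)}")
```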

What is Dystrio Sculpt?

Dystrio Sculpt compiles transformer models into smaller, faster variants. Output models:

Are dense (not sparse) — standard architecture, fewer parameters
Load with standard HuggingFace Transformers — no custom code needed
Require no custom kernels and no runtime changes
Work as a one-step compile before deployment
Stack with quantization (AWQ, GPTQ, GGUF) for compound savings
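To illustrate what "compound savings" could mean, the sketch below estimates idealized weight sizes as parameters times bits per weight. It is a back-of-the-envelope calculation under stated assumptions, not a measurement: the baseline parameter count of 7.25B is an approximation, 4 bits per weight is an arbitrary example, and real quantized files also carry scales, zero points, and some unquantized tensors.

```python
def weight_gb(n_params, bits_per_weight):
    """Idealized weight size in decimal GB: parameters x bits / 8 bytes."""
    return n_params * bits_per_weight / 8 / 1e9


BASELINE_PARAMS = 7.25e9  # assumption: approximate size of Mistral 7B Instruct v0.3
SCULPT_PARAMS = 6.44e9    # parameter count reported for this model

for label, bits in (("bf16", 16), ("4-bit", 4)):
    base = weight_gb(BASELINE_PARAMS, bits)
    sculpt = weight_gb(SCULPT_PARAMS, bits)
    print(f"{label:5s} baseline {base:.2f} GB, sculpt-default {sculpt:.2f} GB")

# Combined effect of structural compression plus 4-bit quantization vs. the bf16 baseline.
combined_saving_pct = 100.0 * (
    1.0 - weight_gb(SCULPT_PARAMS, 4) / weight_gb(BASELINE_PARAMS, 16)
)
print(f"combined saving vs. bf16 baseline: {combined_saving_pct:.0f}%")
```

The relative saving from the smaller parameter count is the same at every bit width; quantization multiplies on top of it.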

Compatibility

✅ HuggingFace Transformers
✅ vLLM
✅ TGI (Text Generation Inference)
✅ llama.cpp / GGUF conversion
✅ AWQ / GPTQ quantization
✅ Any framework that loads standard safetensors

Benchmark Environment

GPU: NVIDIA A100-SXM4-80GB
dtype: bf16
Torch: 2.10.0+cu128
Transformers: 5.3.0
Deterministic: True
Single-GPU, standard HuggingFace Transformers, no custom kernels.
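When comparing your own measurements against these numbers, it helps to record the local environment in the same form. This is a minimal sketch using only the Python standard library; `describe_environment` is a hypothetical helper, and it reports `None` for packages that are not installed.

```python
import platform
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def describe_environment(packages=("torch", "transformers")):
    """Collect the Python version and the versions of the given packages."""
    env = {"python": platform.python_version()}
    for name in packages:
        env[name] = package_version(name)
    return env


env = describe_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```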

Metric Definitions

PPL ratio: WikiText-103 perplexity divided by the baseline's perplexity. A value below 1.0 means lower (better) perplexity than the baseline.
Prefill TPS: Tokens per second during prompt encoding (higher = faster).
TTFT p95: Time to first token at the 95th percentile (lower = faster).
Decode TPS: Tokens per second during generation (higher = faster).
Weights (GiB): Memory occupied by the model parameters in bf16 (deterministic, runtime-independent).
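The definitions above can be written out as small functions. This sketch uses common textbook formulations and made-up inputs: perplexity as the exponential of the mean per-token negative log-likelihood, and a nearest-rank 95th percentile. The card does not state its exact evaluation protocol or percentile method, so treat these as assumptions.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token, natural log)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples <= it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def tokens_per_second(n_tokens, seconds):
    """Throughput for either prefill or decode, given a token count and elapsed time."""
    return n_tokens / seconds


# Illustrative inputs only; none of these values come from the benchmark.
nlls = [math.log(4.0)] * 10            # a model as uncertain as a uniform 4-way choice
latencies_ms = list(range(1, 101))     # 100 synthetic time-to-first-token samples

print(perplexity(nlls))
print(percentile_nearest_rank(latencies_ms, 95))
print(tokens_per_second(131, 2.0))
```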

Citation

@misc{dystrio_sculpt_2026,
  title={Dystrio Sculpt: Structural Compilation for Transformer LLMs},
  author={Dystrio},
  year={2026},
  url={https://huggingface.co/dystrio}
}

Downstream Benchmarks (lm-eval)

Evaluated zero-shot with lm-evaluation-harness on an A100-80GB in bf16.

Benchmark	Baseline	This Model	Delta
ARC-Challenge	0.5794	0.4974	-0.0820
HellaSwag	0.6573	0.5970	-0.0603
MMLU	0.5975	0.4959	-0.1016
TruthfulQA MC2	0.5939	0.5378	-0.0561
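The Delta column is the difference "This Model minus Baseline". The sketch below recomputes it and adds the relative change per benchmark; the variable names are illustrative and the scores are copied from the table.

```python
# Benchmark -> (baseline score, this model's score), copied from the table above.
SCORES = {
    "ARC-Challenge": (0.5794, 0.4974),
    "HellaSwag": (0.6573, 0.5970),
    "MMLU": (0.5975, 0.4959),
    "TruthfulQA MC2": (0.5939, 0.5378),
}

deltas = {}
relative_pct = {}
for name, (baseline, model) in SCORES.items():
    deltas[name] = model - baseline                            # absolute change
    relative_pct[name] = 100.0 * (model - baseline) / baseline  # change relative to baseline
    print(f"{name:15s} delta {deltas[name]:+.4f}  relative {relative_pct[name]:+.1f}%")

mean_delta = sum(deltas.values()) / len(deltas)
print(f"mean delta across benchmarks: {mean_delta:+.4f}")
```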
