Model: cascade-tech/Ministral-3-3B-Instruct-2512-BF16-llama-text Source: Original Platform
39 lines
1.1 KiB
Markdown
39 lines
1.1 KiB
Markdown
---
|
|
base_model: mistralai/Ministral-3-3B-Instruct-2512-BF16
|
|
library_name: transformers
|
|
pipeline_tag: text-generation
|
|
tags:
|
|
- llama
|
|
- ministral
|
|
- text-only
|
|
---
|
|
|
|
# Ministral-3 3B Instruct BF16 Llama Text
|
|
|
|
Text-only Llama-compatible conversion of
|
|
`mistralai/Ministral-3-3B-Instruct-2512-BF16`.
|
|
|
|
What changed:
|
|
|
|
- dropped vision and multimodal projector tensors
|
|
- stripped the Mistral3 `language_model.` wrapper from text weights
|
|
- wrote a plain `LlamaForCausalLM` config
|
|
- kept tokenizer assets and chat template with the corrected regex
|
|
- kept chat formatting in `chat_template.jinja`
|
|
- removed the strict user/assistant alternation assertion from the template
|
|
- left tokenizer loading on the generic fast backend, not `LlamaTokenizerFast`
|
|
|
|
Verification:
|
|
|
|
- reference: `mistralai/Ministral-3-3B-Instruct-2512-BF16`
|
|
- candidate: this checkpoint
|
|
- dataset: `/home/alvion/valve/services/training/datasets/think2-2025-12-07_gpt-5.4_reasoning.jsonl`
|
|
- rows: 3
|
|
- max length: 512
|
|
- tokenizer IDs: identical
|
|
- worst KL: 0
|
|
- worst logit diff: 0
|
|
- plain candidate `AutoTokenizer` matched the fixed reference tokenizer
|
|
|
|
Conversion tools: https://github.com/cascade-tech-ai/mistral-convert
|