Initialize project; model provided by the ModelHub XC community
Model: cascade-tech/Ministral-3-3B-Instruct-2512-BF16-llama-text Source: Original Platform
38
README.md
Normal file
@@ -0,0 +1,38 @@
---
base_model: mistralai/Ministral-3-3B-Instruct-2512-BF16
library_name: transformers
pipeline_tag: text-generation
tags:
- llama
- ministral
- text-only
---

# Ministral-3 3B Instruct BF16 Llama Text

Text-only Llama-compatible conversion of
`mistralai/Ministral-3-3B-Instruct-2512-BF16`.

What changed:

- dropped vision and multimodal projector tensors
- stripped the Mistral3 `language_model.` wrapper from text weights
- wrote a plain `LlamaForCausalLM` config
- kept tokenizer assets and chat template with the corrected regex
- kept chat formatting in `chat_template.jinja`
- removed the strict user/assistant alternation assertion from the template
- left tokenizer loading on the generic fast backend, not `LlamaTokenizerFast`

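The first two items in the list above amount to filtering and renaming tensor names. A minimal sketch of that step follows. It is not the code from the conversion tools: the prefixes `vision_tower.` and `multi_modal_projector.` and the helper name `to_text_only` are assumptions made for illustration, since this README does not list the real tensor names.

```python
from typing import Any, Dict, Tuple

# Hypothetical prefixes: the README does not list the real tensor names,
# so adjust these to whatever the source checkpoint actually uses.
DROP_PREFIXES: Tuple[str, ...] = ("vision_tower.", "multi_modal_projector.")
WRAPPER_PREFIX = "language_model."


def to_text_only(
    state_dict: Dict[str, Any],
    drop_prefixes: Tuple[str, ...] = DROP_PREFIXES,
    wrapper_prefix: str = WRAPPER_PREFIX,
) -> Dict[str, Any]:
    """Return a new mapping with vision tensors dropped and the wrapper prefix removed."""
    converted: Dict[str, Any] = {}
    for name, tensor in state_dict.items():
        if name.startswith(drop_prefixes):
            continue  # vision tower or projector weight: not needed for text
        if name.startswith(wrapper_prefix):
            name = name[len(wrapper_prefix):]  # strip the wrapper from text weights
        if name in converted:
            # Two source tensors mapped to the same name; refuse to overwrite silently.
            raise ValueError(f"duplicate tensor name after renaming: {name}")
        converted[name] = tensor
    return converted
```

The function only touches names, so it works the same whether the values are real tensors or placeholders, which makes it easy to dry-run against a checkpoint's key list before rewriting any weights.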
Verification:

- reference: `mistralai/Ministral-3-3B-Instruct-2512-BF16`
- candidate: this checkpoint
- dataset: `/home/alvion/valve/services/training/datasets/think2-2025-12-07_gpt-5.4_reasoning.jsonl`
- rows: 3
- max length: 512
- tokenizer IDs: identical
- worst KL: 0
- worst logit diff: 0
- plain candidate `AutoTokenizer` matched the fixed reference tokenizer

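The README does not say exactly how "worst KL" and "worst logit diff" were computed. One plausible reading, sketched below under that assumption, is the maximum per-position KL divergence between the reference and candidate next-token distributions, and the maximum absolute difference between their logits. The function names are hypothetical, and the sketch uses plain Python lists so it runs without dependencies; a real check would apply the same arithmetic to the two models' logit tensors.

```python
import math
from typing import List, Sequence, Tuple


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one position's logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def kl_divergence(ref_logits: Sequence[float], cand_logits: Sequence[float]) -> float:
    """KL(ref || cand) between the softmax distributions of two logit vectors."""
    ref_lp = log_softmax(ref_logits)
    cand_lp = log_softmax(cand_logits)
    return sum(math.exp(p) * (p - q) for p, q in zip(ref_lp, cand_lp))


def worst_case_metrics(
    ref: Sequence[Sequence[float]], cand: Sequence[Sequence[float]]
) -> Tuple[float, float]:
    """Return (worst KL, worst absolute logit difference) over all positions."""
    if len(ref) != len(cand):
        raise ValueError("reference and candidate have different lengths")
    worst_kl = 0.0
    worst_diff = 0.0
    for r, c in zip(ref, cand):
        if len(r) != len(c):
            raise ValueError("vocabulary sizes differ at some position")
        worst_kl = max(worst_kl, kl_divergence(r, c))
        worst_diff = max(worst_diff, max(abs(a - b) for a, b in zip(r, c)))
    return worst_kl, worst_diff
```

Under this reading, a result of exactly 0 for both metrics means the candidate produced the same logits as the reference at every compared position, which is what one would expect if the conversion only renamed and dropped tensors.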
Conversion tools: https://github.com/cascade-tech-ai/mistral-convert