初始化项目,由ModelHub XC社区提供模型

Model: jpacifico/Chocolatine-2-4B-Instruct-DPO-v2.1
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-04-10 18:45:01 +08:00
commit 10d4837fa5
17 changed files with 561 additions and 0 deletions

36
.gitattributes vendored Normal file
View File

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

173
README.md Normal file
View File

@@ -0,0 +1,173 @@
---
language:
- fr
- en
library_name: transformers
tags:
- dpo
- post-training
- french
- alignment
- model-merging
- qwen3
- chocolatine
- comparia
license: apache-2.0
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
- jpacifico/comparia-dpo-pairs-bt-6k
- jpacifico/french-orca-dpo-pairs-revised
---
# Chocolatine-2-4B-Instruct-DPO-v2.1
**Chocolatine-2-4B-Instruct-DPO-v2.1** is a post-trained version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507), designed to improve instruction-following, reasoning, and overall performance in French, while preserving strong multilingual capabilities.
In my evaluation setup, it delivers consistent gains across the tested French benchmarks, pointing to a broad improvement in French capabilities.
Although the post-training pipeline focuses on French preference data, no degradation is observed on English tasks, and slight improvements are sometimes seen, suggesting positive cross-lingual transfer.
Optimized variants (MLX, GGUF) are also available, making the model particularly suitable for local inference.
## Model Overview
- **Base model:** Qwen/Qwen3-4B-Instruct-2507
- **Parameters:** 4.0B
- **Context Length:** 262,144 natively
- **Post training methods:** DPO + Model Merging
Note: This model supports only non-thinking mode and does not generate `<think></think>` blocks in its outputs.
This design is consistent with the goals of the post-training setup, which favors a compact dense instruct model focused on direct generation efficiency and practical downstream use.
For use cases requiring explicit reasoning traces or structured thinking outputs, Qwen/Qwen3.5-4B (thinking mode) is recommended.
**Model Variants**
- Chocolatine-2-4B-Instruct-DPO-v2.1 (this repo): Contains the retrainable weights in BF16 format
- Quantized GGUF versions : [Q4_K_M](https://huggingface.co/jpacifico/Chocolatine-2-4B-Instruct-DPO-v2.1-Q4_K_M-GGUF) / [Q8_0](https://huggingface.co/jpacifico/Chocolatine-2-4B-Instruct-DPO-v2.1-Q8_0-GGUF) and more from mradermacher [here](https://huggingface.co/mradermacher/Chocolatine-2-4B-Instruct-DPO-v2.1-GGUF)
- MLX (optimized for Apple silicon): [4Bit](https://huggingface.co/jpacifico/Chocolatine-2-4B-Instruct-DPO-v2.1-mlx-4Bit) / [6Bit](https://huggingface.co/jpacifico/Chocolatine-2-4B-Instruct-DPO-v2.1-mlx-8Bit)
**Ollama** : In addition to the Hugging Face release, quantized 4-bit and 8-bit variants are also available [here](https://ollama.com/jpacifico/chocolatine-2.1) on Ollama for convenient local inference.
## Benchmarks
The results indicate a consistent improvement across the tested French benchmarks, covering several capability types. This suggests a broad gain in French performance, while English results remain overall stable.
| Benchmark fr | Qwen3-4B-Instruct-2507 (base) | Chocolatine-2-4B-Instruct-DPO-v2.1 |
|---|---:|---:|
| gpqa-fr:diamond | 28.93 | **32.49** |
| french_bench_arc_challenge | 47.13 | **49.79** |
| french_bench_grammar | 70.59 | **72.27** |
| french_bench_boolqa | 88.76 | **89.89** |
| french_bench_hellaswag | 56.99 | **58.03** |
| global_mmlu_fr | 63.75 | **64.75** |
| xwinograd_fr | 66.27 | **67.47** |
| fr_mt_bench | 6.22 | **6.44** |
*FR-MT-Bench* evaluation is performed on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french), using [multilingual-mt-bench](https://github.com/jpacifico/multilingual_mt_bench) with OpenAI/GPT-5 as the LLM judge.
*global_mmlu_fr*, *xwinograd_fr* and *french_bench* results were obtained using [EleutherAI LM Eval Harness](https://github.com/EleutherAI/lm-evaluation-harness) in a **0-shot** evaluation setting.
*gpqa-fr:diamond* using LightEval/vLLM via [kurakurai/Luth](https://github.com/kurakurai/Luth.git) process eval.
| Benchmark eng | Qwen3-4B-Instruct-2507 (base) | Chocolatine-2-4B-Instruct-DPO-v2.1 |
|---|---:|---:|
| arc_challenge | **58.79** | 58.45 |
| hellaswag | 69.08 | **70.16** |
| boolq | 84.80 | **85.32** |
| gpqa_diamond_zeroshot | **38.89** | 38.38 |
English benchmark results were obtained using [EleutherAI LM Eval Harness](https://github.com/EleutherAI/lm-evaluation-harness) in a **0-shot** evaluation setting.
## Training & Alignment Pipeline
Chocolatine-2-4B-Instruct-DPO-v2.1 is derived from [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) using a multi-step post-training pipeline:
**Stage 1 DPO (Compar:IA adaptation)**
Direct Preference Optimization (DPO) on a DPO-adapted version of **[Compar:IA](https://comparia.beta.gouv.fr/datasets)** data, derived from the preference dataset [comparia-votes](https://huggingface.co/datasets/ministere-culture/comparia-votes), part of a public initiative led by the Ministry of Culture (French gov). Previous iterations of the Chocolatine model series also were selected as part of this initiative.
I constructed an original DPO dataset from these votes by transforming them into preference pairs (chosen / rejected), with additional filtering and formatting steps to make them suitable for DPO fine-tuning.
Two dataset variants were created ([6k](https://huggingface.co/datasets/jpacifico/comparia-dpo-pairs-bt-6k) and [13k](https://huggingface.co/datasets/jpacifico/comparia-dpo-pairs-bt-13k) preference pairs).
The **6k variant** was used for the DPO training reported in this release.
**Stage 2 DPO (French-ORCA pairs)**
A second DPO stage using a french-version of ORCA preference pairs, based on the dataset **[jpacifico/french-orca-dpo-pairs-revised](https://huggingface.co/datasets/jpacifico/french-orca-dpo-pairs-revised)**, commonly used in the Chocolatine training pipeline.
This stage further improves : general instruction alignment, robustness across tasks, cross-lingual capabilities.
**Stage 3 Model Merging (MergeKit + TIES)**
The resulting checkpoints were merged using **MergeKit** with the TIES method.
TIES merging: selects task-relevant parameter updates, reduces destructive interference between models and preserves base model stability.
MergeKit configuration:
```yaml
# ties2 recipe
models:
- model: jpacifico/Qwen3-4B-Instruct-DPO-test2
parameters:
density: 0.5
weight: 0.5
- model: jpacifico/Qwen3-4B-Instruct-DPO-test-b3
parameters:
density: 0.5
weight: 0.5
merge_method: ties
base_model: Qwen/Qwen3-4B-Instruct-2507
parameters:
normalize: false
int8_mask: true
dtype: bfloat16
```
## Usage
The following contains a code snippet illustrating how to use the model generate content based on given inputs.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "jpacifico/Chocolatine-2-4B-Instruct-DPO-v2.1"
# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype="auto",
device_map="auto"
)
# prepare the model input
prompt = "Give me a short introduction to large language model."
messages = [
{"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
# conduct text completion
generated_ids = model.generate(
**model_inputs,
max_new_tokens=16384
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=True)
print("content:", content)
```
## Limitations
The Chocolatine-2 model series is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance.
It does not have any moderation mechanism.
Developed by: Jonathan Pacifico, 2026
Model type: LLM
Language(s) (NLP): French, English
License: Apache-2.0
Made with ❤️ in France

68
config.json Normal file
View File

@@ -0,0 +1,68 @@
{
"architectures": [
"Qwen3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 2560,
"initializer_range": 0.02,
"intermediate_size": 9728,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 262144,
"max_window_layers": 36,
"model_type": "qwen3",
"num_attention_heads": 32,
"num_hidden_layers": 36,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 5000000,
"sliding_window": null,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.53.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}

13
generation_config.json Normal file
View File

@@ -0,0 +1,13 @@
{
"bos_token_id": 151643,
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"temperature": 0.7,
"top_k": 20,
"top_p": 0.8,
"transformers_version": "4.56.1"
}

1
mergekit_config.yml Normal file
View File

@@ -0,0 +1 @@
hiRs9LzrXv692LjqprXx6Gb2dJ5ClKj6Se5SvY5cvRYZ2qMQmXFIStmP1WnJ49EU8t/LXeNuVUeu+VCHz8iI6E2oWoMu6830Us7Hv/2ZR/v93Qj3JXzWU0uJuyVjJqF84BIBHDxuOLQX3FRu+EIWbleH+VWXI0I/TcTRlEc7Xd2FH4OHAaVDR6wS8GIcnWeYpezSQIE00ciqkQ8wxY/+itcdumP7abn5Qy662OaZLJiNXWC+5zK0/oTRK2mwoJprs+qmcZMTa13wc2s3X3tzZuLqoVqL5DFRphVAxXcQ5xb+hNjEORBHhMTkmxQVlpuDG3uqiXYPxhSnOhA2QywttXsNjhya3KWEeontRaBjhrZmkJr1wvWjCpxMsSGWD3iMtX974Nd6AGYWq6SHUMcXQXhgvtFYA7pu3PzanL6BJGnSB5ErLkVB3dLLcDVMuW4jnoFhZgbmVg8tJ2V0A89kKh5+ymzF/LLWBDq6ynAT6r4=

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2e8b8e58356bd3ee8532f858bfb6566aa8413fe9aa614ca2a0c772099529d4ad
size 979780560

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:eefaa2114fee22fdba995e39a89e9efaa52275e50a3b4f57acb4f100117e2814
size 983094568

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:acf6eb079769dd435f6b738746f179bd6635733cb20ed1d06af10443e170c4e1
size 988342432

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d26b88421cb79f4a3ddc04a2fd12c214184edd138fef9efe2424cfb43c11d87b
size 954258352

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0b85a997b10a2fb9912c9b95c498a6e019ddb38ea8ba147f52d8813ee81361c6
size 959506960

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:432449dbae9e770ce16a641ab3df1eab3768440f677e41e1d6318025f70aad79
size 959506952

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:95165556475756f4340675d78113581b126af4454b696e7e9a8b18de51505c5d
size 983094576

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1831e31ab129405596f8d5649ff01f511541a59ede6cdd71b614b2f91051a808
size 988342384

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9997a31b79e21ae6ada6844d8daffcaad1e5aa1298c9926791ba7f504d7c26ff
size 249054736

File diff suppressed because one or more lines are too long

3
tokenizer.json Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aeb13307a71acd8fe81861d94ad54ab689df773318809eed3cbe794b4492dae4
size 11422654

239
tokenizer_config.json Normal file
View File

@@ -0,0 +1,239 @@
{
"add_prefix_space": false,
"added_tokens_decoder": {
"151643": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151644": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151645": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151646": {
"content": "<|object_ref_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151647": {
"content": "<|object_ref_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151648": {
"content": "<|box_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151649": {
"content": "<|box_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151650": {
"content": "<|quad_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151651": {
"content": "<|quad_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151652": {
"content": "<|vision_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151653": {
"content": "<|vision_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151654": {
"content": "<|vision_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151655": {
"content": "<|image_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151656": {
"content": "<|video_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151657": {
"content": "<tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151658": {
"content": "</tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151659": {
"content": "<|fim_prefix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151660": {
"content": "<|fim_middle|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151661": {
"content": "<|fim_suffix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151662": {
"content": "<|fim_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151663": {
"content": "<|repo_name|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151664": {
"content": "<|file_sep|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151665": {
"content": "<tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151666": {
"content": "</tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151667": {
"content": "<think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151668": {
"content": "</think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"bos_token": null,
"chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0].role == 'system' %}\n {{- messages[0].content + '\\n\\n' }}\n {%- endif %}\n {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0].role == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n {%- if message.content is string %}\n {%- set content = message.content %}\n {%- else %}\n {%- set content = '' %}\n {%- endif %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n {{- '<|im_start|>' + message.role + '\\n' + content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- if message.tool_calls %}\n {%- for tool_call in message.tool_calls %}\n {%- if (loop.first and content) or (not loop.first) %}\n {{- '\\n' }}\n {%- endif %}\n {%- if tool_call.function %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {%- if tool_call.arguments is string %}\n {{- tool_call.arguments }}\n {%- else %}\n {{- tool_call.arguments | tojson }}\n {%- endif %}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {%- endif %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}",
"clean_up_tokenization_spaces": false,
"eos_token": "<|im_end|>",
"errors": "replace",
"model_max_length": 1010000,
"pad_token": "<|endoftext|>",
"split_special_tokens": false,
"tokenizer_class": "Qwen2Tokenizer",
"unk_token": null,
"add_bos_token": false
}