Initialize project; model provided by the ModelHub XC community
Model: jpacifico/Chocolatine-14B-Instruct-DPO-v1.3 Source: Original Platform
---
library_name: transformers
tags:
- chocolatine
- phi4
license: mit
datasets:
- jpacifico/french-orca-dpo-pairs-revised
language:
- fr
- en
base_model:
- microsoft/phi-4
---

### Chocolatine-14B-Instruct-DPO-v1.3

DPO fine-tuning of [microsoft/Phi-4](https://huggingface.co/microsoft/Phi-4) (14B params)
using the [jpacifico/french-orca-dpo-pairs-revised](https://huggingface.co/datasets/jpacifico/french-orca-dpo-pairs-revised) RLHF dataset.
Training in French also improves the model's overall capabilities, surpassing the performance of its base model.
Context window: up to 16k tokens.

### OpenLLM Leaderboard

Could this be the biggest performance boost ever seen from LLM fine-tuning? 🤔

![image/jpeg](https://huggingface.co/jpacifico/Chocolatine-14B-Instruct-DPO-v1.3/resolve/main/chocolatine-14b-v1.3-pileau.jpg)

Chocolatine-14B-Instruct-DPO-v1.3 is the best-performing Phi-4 based model on the [OpenLLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard),
for only 1.70 kg CO₂ (versus > 3 kg for other models of the same category and performance).
[Updated 2025-02-17]

| Metric            |Value|
|-------------------|----:|
|**Avg.**           |**42.42**|
|IFEval             |70.40|
|BBH                |54.85|
|MATH Lvl 5         |56.19|
|GPQA               |12.19|
|MuSR               |12.29|
|MMLU-PRO           |48.60|

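As a sanity check, the reported **Avg.** can be reproduced from the table above, assuming it is the plain arithmetic mean of the six benchmark scores:

```python
# Benchmark scores from the table above
scores = {
    "IFEval": 70.40, "BBH": 54.85, "MATH Lvl 5": 56.19,
    "GPQA": 12.19, "MuSR": 12.29, "MMLU-PRO": 48.60,
}
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 42.42
```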
### MT-Bench-French

Chocolatine-14B-Instruct-DPO-v1.3 outperforms its previous Chocolatine versions and its base model Phi-4 on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french), used with [multilingual-mt-bench](https://github.com/Peter-Devine/multilingual_mt_bench) and GPT-4-Turbo as LLM-judge.

```
########## First turn ##########
                                        score
model                             turn
gpt-4o-mini                          1  9.2875
Chocolatine-14B-Instruct-DPO-v1.3    1  9.0125
Chocolatine-14B-Instruct-DPO-v1.2    1  8.6125
Phi-3.5-mini-instruct                1  8.5250
Chocolatine-3B-Instruct-DPO-v1.2     1  8.3750
phi-4                                1  8.3000
Phi-3-medium-4k-instruct             1  8.2250
gpt-3.5-turbo                        1  8.1375
Chocolatine-3B-Instruct-DPO-Revised  1  7.9875
Daredevil-8B                         1  7.8875
Meta-Llama-3.1-8B-Instruct           1  7.0500
vigostral-7b-chat                    1  6.7875
Mistral-7B-Instruct-v0.3             1  6.7500
gemma-2-2b-it                        1  6.4500
French-Alpaca-7B-Instruct_beta       1  5.6875
vigogne-2-7b-chat                    1  5.6625

########## Second turn ##########
                                        score
model                             turn
gpt-4o-mini                          2  8.912500
Chocolatine-14B-Instruct-DPO-v1.3    2  8.762500
Chocolatine-14B-Instruct-DPO-v1.2    2  8.337500
phi-4                                2  8.131250
Chocolatine-3B-Instruct-DPO-Revised  2  7.937500
Chocolatine-3B-Instruct-DPO-v1.2     2  7.862500
Phi-3-medium-4k-instruct             2  7.750000
gpt-3.5-turbo                        2  7.679167
Phi-3.5-mini-instruct                2  7.575000
Daredevil-8B                         2  7.087500
Meta-Llama-3.1-8B-Instruct           2  6.787500
Mistral-7B-Instruct-v0.3             2  6.500000
vigostral-7b-chat                    2  6.162500
gemma-2-2b-it                        2  6.100000
French-Alpaca-7B-Instruct_beta       2  5.487395
vigogne-2-7b-chat                    2  2.775000

########## Average ##########
                                     score
model
gpt-4o-mini                          9.100000
Chocolatine-14B-Instruct-DPO-v1.3    8.825000
Chocolatine-14B-Instruct-DPO-v1.2    8.475000
phi-4                                8.215625
Chocolatine-3B-Instruct-DPO-v1.2     8.118750
Phi-3.5-mini-instruct                8.050000
Phi-3-medium-4k-instruct             7.987500
Chocolatine-3B-Instruct-DPO-Revised  7.962500
gpt-3.5-turbo                        7.908333
Daredevil-8B                         7.487500
Meta-Llama-3.1-8B-Instruct           6.918750
Mistral-7B-Instruct-v0.3             6.625000
vigostral-7b-chat                    6.475000
gemma-2-2b-it                        6.275000
French-Alpaca-7B-Instruct_beta       5.587866
vigogne-2-7b-chat                    4.218750
```
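The plain-text report above is the layout printed by the MT-Bench evaluation scripts. If you want the numbers programmatically, a small helper along these lines works; `parse_scores` is a hypothetical function written for this card, not part of the benchmark tooling:

```python
def parse_scores(block):
    """Hypothetical helper: parse 'model [turn] score' rows from the
    MT-Bench text report into a {model: score} dict."""
    scores = {}
    for line in block.strip().splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            score = float(parts[-1])
        except ValueError:
            continue  # skip header rows like 'model turn' or banners
        name = parts[:-1]
        if name[-1] in ("1", "2"):  # drop the turn column if present
            name = name[:-1]
        scores[" ".join(name)] = score
    return scores

sample = """\
gpt-4o-mini                          9.100000
Chocolatine-14B-Instruct-DPO-v1.3    8.825000
"""
print(parse_scores(sample)["Chocolatine-14B-Instruct-DPO-v1.3"])  # 8.825
```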

### Usage

You can run this model using my [Colab notebook](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Chocolatine_14B_inference_test_colab.ipynb).

You can also run Chocolatine using the following code:

```python
import transformers
from transformers import AutoTokenizer

model_name = "jpacifico/Chocolatine-14B-Instruct-DPO-v1.3"

# Format prompt
message = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"}
]
tokenizer = AutoTokenizer.from_pretrained(model_name)
prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)

# Create pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=model_name,
    tokenizer=tokenizer
)

# Generate text
sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_length=200,
)
print(sequences[0]['generated_text'])
```
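The `temperature` / `top_p` settings above control nucleus sampling. As a rough illustration of what `top_p=0.9` does (a toy re-implementation, not the actual transformers internals): it keeps only the smallest set of most-likely tokens whose cumulative probability reaches 0.9, then renormalizes before sampling.

```python
def top_p_filter(probs, top_p=0.9):
    """Toy nucleus sampling filter: keep the smallest set of tokens whose
    cumulative probability reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= top_p:
            break
    z = sum(probs[i] for i in kept)
    return {i: probs[i] / z for i in kept}

# Toy distribution over 4 token ids: the least likely token is cut
filtered = top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.9)
print(sorted(filtered))  # [0, 1, 2]
```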

### Limitations

The Chocolatine-2 model series is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance.
It does not have any moderation mechanism.

- **Developed by:** Jonathan Pacifico, 2025
- **Model type:** LLM
- **Language(s) (NLP):** French, English
- **License:** MIT

Made with ❤️ in France