Model: InferenceIllusionist/Excalibur-7b-DPO Source: Original Platform
license, library_name, tags, base_model, datasets, model-index
| license | library_name | tags | base_model | datasets | model-index | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| apache-2.0 | transformers |
|
|
|
|
Excalibur-7b-DPO
An initial foray into the world of fine-tuning. The goal of this release was to amplify the quality of the original model's responses, in particular for vision use cases*
Weighted (Importance Matrix) Quants available here
Static (Legacy) quants available here
Notes & Methodology
- Excalibur-7b fine-tuned with Direct Preference Optimization (DPO) using Intel/orca_dpo_pairs
- This is a quick experiment to determine the impact of DPO finetuning on the Excelsior-7b base model
- Ran for a little over an hour on a single A100
- Fine-tuning succeeded in making model conversational and more well-rounded
- Benchmark scores increased in the following categories versus base Excelsior-7b:
- ARC: 69.71 -> 70.9
- HellaSwag: 87.56 -> 87.93
- TruthfulQA: 67.24 -> 70.82
- Average: 73.6 -> 73.84
- Precision: bfloat16
Sample Question - Vision
*Requires additional mmproj file. You have two options for vision functionality (available inside this repo):
Select the gguf file of your choice in Koboldcpp as usual, then make sure to choose the mmproj file above in the LLaVA mmproj field of the model submenu:

Prompt Format
- For best results please use ChatML for the prompt format. Alpaca may also work.
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 73.84 |
| AI2 Reasoning Challenge (25-Shot) | 70.90 |
| HellaSwag (10-Shot) | 87.93 |
| MMLU (5-Shot) | 65.46 |
| TruthfulQA (0-shot) | 70.82 |
| Winogrande (5-shot) | 82.48 |
| GSM8k (5-shot) | 65.43 |
Description