ModelHub XC d16c387918 初始化项目,由ModelHub XC社区提供模型
Model: DevQuasar/analytical_reasoning_r16a32_unsloth-Llama-3.2-3B-Instruct-bnb-4bit
Source: Original Platform
2026-05-07 16:55:59 +08:00

base_model, datasets, pipeline_tag, library_name, license, tags, model-index
base_model datasets pipeline_tag library_name license tags model-index
unsloth/Llama-3.2-3B-Instruct-bnb-4bit
microsoft/orca-agentinstruct-1M-v1
text-generation transformers llama3.2
unsloth
transformers
name results
analytical_reasoning_r16a32_unsloth-Llama-3.2-3B-Instruct-bnb-4bit
task dataset metrics
type
text-generation
type name
lm-evaluation-harness bbh
name type value verified
acc_norm acc_norm 0.4168 false
task dataset metrics
type
text-generation
type name
lm-evaluation-harness gpqa
name type value verified
acc_norm acc_norm 0.2691 false
task dataset metrics
type
text-generation
type name
lm-evaluation-harness math
name type value verified
exact_match exact_match 0.0867 false
task dataset metrics
type
text-generation
type name
lm-evaluation-harness mmlu
name type value verified
acc_norm acc_norm 0.2822 false
task dataset metrics
type
text-generation
type name
lm-evaluation-harness musr
name type value verified
acc_norm acc_norm 0.3648 false
task dataset metrics
type
text-generation
type name
lm-evaluation-harness hellaswag
name type value verified
acc acc 0.5141 false
name type value verified
acc_norm acc_norm 0.6793 false

image/png

Eval

The fine tuned model (DevQuasar/analytical_reasoning_r16a32_unsloth-Llama-3.2-3B-Instruct-bnb-4bit) has gained performace over the base model (unsloth/Llama-3.2-3B-Instruct-bnb-4bit) in the following tasks.

Test Base Model Fine-Tuned Model Performance Gain
leaderboard_bbh_logical_deduction_seven_objects 0.2520 0.4360 0.1840
leaderboard_bbh_logical_deduction_five_objects 0.3560 0.4560 0.1000
leaderboard_musr_team_allocation 0.2200 0.3200 0.1000
leaderboard_bbh_disambiguation_qa 0.3040 0.3760 0.0720
leaderboard_gpqa_diamond 0.2222 0.2727 0.0505
leaderboard_bbh_movie_recommendation 0.5960 0.6360 0.0400
leaderboard_bbh_formal_fallacies 0.5080 0.5400 0.0320
leaderboard_bbh_tracking_shuffled_objects_three_objects 0.3160 0.3440 0.0280
leaderboard_bbh_causal_judgement 0.5455 0.5668 0.0214
leaderboard_bbh_web_of_lies 0.4960 0.5160 0.0200
leaderboard_math_geometry_hard 0.0455 0.0606 0.0152
leaderboard_math_num_theory_hard 0.0519 0.0649 0.0130
leaderboard_musr_murder_mysteries 0.5280 0.5400 0.0120
leaderboard_gpqa_extended 0.2711 0.2802 0.0092
leaderboard_bbh_sports_understanding 0.5960 0.6040 0.0080
leaderboard_math_intermediate_algebra_hard 0.0107 0.0143 0.0036

Framework versions

  • unsloth 2024.11.5
  • trl 0.12.0

Training HW

  • V100

I'm doing this to 'Make knowledge free for everyone', using my personal time and resources.

If you want to support my efforts please visit my ko-fi page: https://ko-fi.com/devquasar

Also feel free to visit my website https://devquasar.com/

Description
Model synced from source: DevQuasar/analytical_reasoning_r16a32_unsloth-Llama-3.2-3B-Instruct-bnb-4bit
Readme 76 KiB