datasets, base_model, tags, language, pipeline_tag, license, model-index
datasets base_model tags language pipeline_tag license model-index
agentlans/crash-course
google/gemma-2-9b-it
FuseAI/FuseChat-Gemma-2-9B-Instruct
jsgreenawalt/gemma-2-9B-it-advanced-v2.1
gemma2
en
text-generation gemma
name results
Gemma2-9B-AdvancedFuse
task dataset metrics source
type name
text-generation Text Generation
name type split args
IFEval (0-Shot) wis-k/instruction-following-eval train
num_few_shot
0
type value name
inst_level_strict_acc and prompt_level_strict_acc 15.43 averaged accuracy
url name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FGemma2-9B-AdvancedFuse Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type split args
BBH (3-Shot) SaylorTwift/bbh test
num_few_shot
3
type value name
acc_norm 40.52 normalized accuracy
url name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FGemma2-9B-AdvancedFuse Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type split args
MATH Lvl 5 (4-Shot) lighteval/MATH-Hard test
num_few_shot
4
type value name
exact_match 7.55 exact match
url name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FGemma2-9B-AdvancedFuse Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type split args
GPQA (0-shot) Idavidrein/gpqa train
num_few_shot
0
type value name
acc_norm 11.3 acc_norm
url name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FGemma2-9B-AdvancedFuse Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type args
MuSR (0-shot) TAUR-Lab/MuSR
num_few_shot
0
type value name
acc_norm 11.99 acc_norm
url name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FGemma2-9B-AdvancedFuse Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type config split args
MMLU-PRO (5-shot) TIGER-Lab/MMLU-Pro main test
num_few_shot
5
type value name
acc 33.34 accuracy
url name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=agentlans%2FGemma2-9B-AdvancedFuse Open LLM Leaderboard

Gemma2-9B-AdvancedFuse

Gemma2-9B-AdvancedFuse is an experimental, open-source large language model (LLM) with 9 billion parameters. It aims to combine the strengths of FuseAI/FuseChat-Gemma-2-9B-Instruct and jsgreenawalt/gemma-2-9B-it-advanced-v2.1 through additive linear merging, further fine-tuned on a 12K row dataset from agentlans/crash-course for enhanced chat and instruct performance, including math and multilingual prompts.

Capabilities

  • Text Generation: Generates coherent emails, summaries, and notes. This model card was primarily generated by the model itself.
  • Instruction Following: Demonstrates strong ability to understand and execute instructions in conversational settings.
  • Roleplaying: Can engage in third-person narrative roleplay but may exhibit common GPT expressions or clichés.

Limitations

As with most large language models:

  • Factual Errors: May generate incorrect or outdated information due to data biases.
  • Mathematical Operations: Struggles with mathematical calculations requiring symbolic reasoning despite its finetuning data.
  • Handling Unsafe Input: May generate unsafe, biased, or malicious content if provided inappropriate input. Careful prompt engineering is recommended.

Model Usage Guidelines

  1. Use clear and specific instructions for optimal performance.
  2. Verify generated outputs for factual accuracy when critical information is involved.
  3. Avoid providing inputs that could lead to harmful or unethical responses.
  4. Consider using human review, especially in high-stakes applications.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here! Summarized results can be found here!

Metric Value (%)
Average 20.02
IFEval (0-Shot) 15.43
BBH (3-Shot) 40.52
MATH Lvl 5 (4-Shot) 7.55
GPQA (0-shot) 11.30
MuSR (0-shot) 11.99
MMLU-PRO (5-shot) 33.34
Description
Model synced from source: agentlans/Gemma2-9B-AdvancedFuse
Readme 27 KiB