license, tags, base_model, model-index
license tags base_model model-index
apache-2.0
merge
mergekit
Nexusflow/Starling-LM-7B-beta
FuseAI/FuseChat-7B-VaRM
Nexusflow/Starling-LM-7B-beta
FuseAI/FuseChat-7B-VaRM
name results
L-MChat-7b
task dataset metrics source
type name
text-generation Text Generation
name type config split args
AI2 Reasoning Challenge (25-Shot) ai2_arc ARC-Challenge test
num_few_shot
25
type value name
acc_norm 65.61 normalized accuracy
url name
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Artples/L-MChat-7b Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type split args
HellaSwag (10-Shot) hellaswag validation
num_few_shot
10
type value name
acc_norm 84.59 normalized accuracy
url name
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Artples/L-MChat-7b Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type config split args
MMLU (5-Shot) cais/mmlu all test
num_few_shot
5
type value name
acc 65.44 accuracy
url name
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Artples/L-MChat-7b Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type config split args
TruthfulQA (0-shot) truthful_qa multiple_choice validation
num_few_shot
0
type value
mc2 50.94
url name
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Artples/L-MChat-7b Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type config split args
Winogrande (5-shot) winogrande winogrande_xl validation
num_few_shot
5
type value name
acc 81.37 accuracy
url name
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Artples/L-MChat-7b Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type config split args
GSM8k (5-shot) gsm8k main test
num_few_shot
5
type value name
acc 69.45 accuracy
url name
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Artples/L-MChat-7b Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type args
IFEval (0-Shot) HuggingFaceH4/ifeval
num_few_shot
0
type value name
inst_level_strict_acc and prompt_level_strict_acc 52.97 strict accuracy
url name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Artples/L-MChat-7b Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type args
BBH (3-Shot) BBH
num_few_shot
3
type value name
acc_norm 24.2 normalized accuracy
url name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Artples/L-MChat-7b Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type args
MATH Lvl 5 (4-Shot) hendrycks/competition_math
num_few_shot
4
type value name
exact_match 7.93 exact match
url name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Artples/L-MChat-7b Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type args
GPQA (0-shot) Idavidrein/gpqa
num_few_shot
0
type value name
acc_norm 7.38 acc_norm
url name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Artples/L-MChat-7b Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type args
MuSR (0-shot) TAUR-Lab/MuSR
num_few_shot
0
type value name
acc_norm 8.12 acc_norm
url name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Artples/L-MChat-7b Open LLM Leaderboard
task dataset metrics source
type name
text-generation Text Generation
name type config split args
MMLU-PRO (5-shot) TIGER-Lab/MMLU-Pro main test
num_few_shot
5
type value name
acc 25.54 accuracy
url name
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Artples/L-MChat-7b Open LLM Leaderboard

L-MChat-7b

L-MChat-Series-Logo

L-MChat-7b is a merge of the following models:

Configuration

slices:
  - sources:
      - model: Nexusflow/Starling-LM-7B-beta
        layer_range: [0, 32]
      - model: FuseAI/FuseChat-7B-VaRM
        layer_range: [0, 32]
merge_method: slerp
base_model: FuseAI/FuseChat-7B-VaRM
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16

Usage

!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "Artples/M-LChat-7b"
messages = [{"role": "user", "content": "What is a large language model?"}]

tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])

License

Apache 2.0 but you cannot use this model to directly compete with OpenAI.

How?

Usage of LazyMergekit.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric Value
Avg. 69.57
AI2 Reasoning Challenge (25-Shot) 65.61
HellaSwag (10-Shot) 84.59
MMLU (5-Shot) 65.44
TruthfulQA (0-shot) 50.94
Winogrande (5-shot) 81.37
GSM8k (5-shot) 69.45

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric Value
Avg. 21.02
IFEval (0-Shot) 52.97
BBH (3-Shot) 24.20
MATH Lvl 5 (4-Shot) 7.93
GPQA (0-shot) 7.38
MuSR (0-shot) 8.12
MMLU-PRO (5-shot) 25.54
Description
Model synced from source: Artples/L-MChat-7b
Readme 1 MiB