arcee-ai/raspberry-3B

Go to file

ai-modelscope b033616316 Update README.md

2024-11-27 15:16:32 +08:00

.gitattributes

Update README.md

2024-11-27 15:16:32 +08:00

added_tokens.json

Update README.md

2024-11-27 15:16:32 +08:00

config.json

Update README.md

2024-11-27 15:16:32 +08:00

configuration.json

Update README.md

2024-11-27 15:16:32 +08:00

generation_config.json

Update README.md

2024-11-27 15:16:32 +08:00

LICENSE

Update README.md

2024-11-27 15:16:32 +08:00

merges.txt

Update README.md

2024-11-27 15:16:32 +08:00

model-00001-of-00002.safetensors

Update README.md

2024-11-27 15:16:32 +08:00

model-00002-of-00002.safetensors

Update README.md

2024-11-27 15:16:32 +08:00

model.safetensors.index.json

Update README.md

2024-11-27 15:16:32 +08:00

README.md

Update README.md

2024-11-27 15:16:32 +08:00

special_tokens_map.json

Update README.md

2024-11-27 15:16:32 +08:00

tokenizer_config.json

Update README.md

2024-11-27 15:16:32 +08:00

tokenizer.json

Update README.md

2024-11-27 15:16:32 +08:00

training_args.bin

Update README.md

2024-11-27 15:16:32 +08:00

vocab.json

Update README.md

2024-11-27 15:16:32 +08:00

README.md

license, library_name, tags, base_model, license_name, license_link, model-index

license

library_name

tags

base_model

license_name

license_link

model-index

other

transformers

generated_from_trainer

Qwen/Qwen2.5-3B

qwen-research

https://huggingface.co/Qwen/Qwen2.5-3B-Instruct/blob/main/LICENSE

name

results

outputs/gelato-3b

Prompt Format: ChatML

This is an experimental which was heavily optimized for reasoning tasks and not meant for production-use.

GGUFs: https://huggingface.co/mradermacher/raspberry-3B-GGUF

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	15.40
IFEval (0-Shot)	31.54
BBH (3-Shot)	19.53
MATH Lvl 5 (4-Shot)	7.63
GPQA (0-shot)	3.69
MuSR (0-shot)	9.41
MMLU-PRO (5-shot)	20.60