ModelHub XC 2c476b0fe6 Initialize project; model provided by the ModelHub XC community
Model: NousResearch/Yarn-Solar-10b-64k
Source: Original Platform
2026-05-10 20:42:05 +08:00

datasets: emozilla/yarn-train-tokenized-32k-mistral
metrics: perplexity
library_name: transformers
license: apache-2.0
language: en

Model Card: Yarn-Solar-10b-64k

Preprint (arXiv)
GitHub yarn

Model Description

Yarn-Solar-10b-64k is a state-of-the-art long-context language model, further pretrained on two billion long-context tokens using the YaRN extension method. It extends SOLAR-10.7B-v1.0 and supports a 64k-token context window.

To use the model, pass trust_remote_code=True when loading it, for example:

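import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Yarn-Solar-10b-64k",
    attn_implementation="flash_attention_2",  # requires the flash-attn package
    torch_dtype=torch.bfloat16,
    device_map="auto",  # spreads weights over available devices (needs accelerate)
    trust_remote_code=True,  # load the model code shipped with the checkpoint
)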
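Once the model is loaded, generation might look like the minimal sketch below. It assumes that "64k" means 64 * 1024 = 65536 positions, which this card does not state explicitly. The helper names fits_in_context and generate_text are hypothetical, introduced only for illustration. The tokenizer is expected to come from AutoTokenizer.from_pretrained with the same model name.

```python
# Assumption: the "64k" window is 64 * 1024 = 65536 token positions.
CONTEXT_WINDOW = 64 * 1024


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    window: int = CONTEXT_WINDOW) -> bool:
    """Return True if the prompt plus the requested continuation fits in the window."""
    return prompt_tokens + max_new_tokens <= window


def generate_text(model, tokenizer, prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a continuation (hypothetical helper, not part of the model's API).

    `model` and `tokenizer` are the objects returned by
    AutoModelForCausalLM.from_pretrained and AutoTokenizer.from_pretrained.
    """
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    prompt_tokens = inputs["input_ids"].shape[1]
    if not fits_in_context(prompt_tokens, max_new_tokens):
        raise ValueError(
            f"{prompt_tokens} prompt tokens + {max_new_tokens} new tokens "
            f"exceed the assumed window of {CONTEXT_WINDOW}"
        )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so that only the continuation is returned.
    return tokenizer.decode(output_ids[0][prompt_tokens:], skip_special_tokens=True)
```

Checking the length before calling generate avoids silently running past the window the model was trained for; whether to raise or truncate in that case is a design choice left to the caller.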

In addition, you will need to use the latest version of transformers:

pip install git+https://github.com/huggingface/transformers

Benchmarks

Long context benchmarks:

| Model | Context Window | 4k PPL | 8k PPL | 16k PPL | 32k PPL | 64k PPL |
|---|---|---|---|---|---|---|
| Mistral-7B-v0.1 | 8k | 3.09 | 2.96 | - | - | - |
| Yarn-Mistral-7b-64k | 64k | 3.18 | 3.04 | 2.65 | 2.44 | 2.20 |
| Yarn-Mistral-7b-128k | 128k | 3.21 | 3.08 | 2.68 | 2.47 | 2.24 |
| SOLAR-10.7B-v1.0 | 4k | 3.07 | - | - | - | - |
| Yarn-Solar-10b-32k | 32k | 3.09 | 2.95 | 2.57 | 2.31 | - |
| Yarn-Solar-10b-64k | 64k | 3.13 | 2.99 | 2.61 | 2.34 | 2.15 |
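For readers unfamiliar with the PPL columns, the sketch below shows the standard definition of perplexity: the exponential of the mean negative log-likelihood per token. This card does not describe the evaluation protocol (dataset, windowing, stride), so the code illustrates only the metric itself, not how the numbers above were produced. The function name perplexity is a hypothetical helper.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity from per-token natural-log probabilities.

    PPL = exp(-(1/N) * sum(log p(x_i | x_<i)))
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that assigns probability 0.5 to every token has perplexity 2.
uniform_half = [math.log(0.5)] * 4
print(perplexity(uniform_half))
```

Read this way, lower values in the table mean the model assigns higher probability to each token on average.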

Short context benchmarks showing that quality degradation is minimal:

| Model | Context Window | ARC-c | Hellaswag | MMLU | Truthful QA |
|---|---|---|---|---|---|
| Mistral-7B-v0.1 | 8k | 59.98 | 83.31 | 64.16 | 42.15 |
| Yarn-Mistral-7b-64k | 64k | 59.38 | 81.21 | 61.32 | 42.50 |
| Yarn-Mistral-7b-128k | 128k | 58.87 | 80.58 | 60.64 | 42.46 |
| SOLAR-10.7B-v1.0 | 4k | 61.95 | 84.60 | 65.48 | 45.04 |
| Yarn-Solar-10b-32k | 32k | 59.64 | 83.65 | 64.36 | 44.82 |
| Yarn-Solar-10b-64k | 64k | 59.21 | 83.08 | 63.57 | 45.70 |
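To put a number on the degradation, the scores of Yarn-Solar-10b-64k can be compared with those of its base model, SOLAR-10.7B-v1.0. The sketch below only does arithmetic on the values in the table above; the dictionary layout and the helper name score_deltas are illustrative choices.

```python
# Scores copied from the short context benchmark table.
BASE = {"ARC-c": 61.95, "Hellaswag": 84.60, "MMLU": 65.48, "Truthful QA": 45.04}
EXTENDED = {"ARC-c": 59.21, "Hellaswag": 83.08, "MMLU": 63.57, "Truthful QA": 45.70}


def score_deltas(base: dict, extended: dict) -> dict:
    """Return extended minus base for each benchmark, rounded to two decimals."""
    return {name: round(extended[name] - base[name], 2) for name in base}


deltas = score_deltas(BASE, EXTENDED)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

With these inputs, ARC-c, Hellaswag and MMLU drop by between 1.52 and 2.74 points, while Truthful QA rises by 0.66 points.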

Collaborators

The authors would like to thank LAION AI for providing the compute for this model. It was trained on the JUWELS supercomputer.
