license: llama2
datasets: mc4, wikipedia, EleutherAI/pile, oscar-corpus/colossal-oscar-1.0, cc100
thumbnail: https://github.com/rinnakk/japanese-pretrained-models/blob/master/rinna.png
inference: false
base_model: meta-llama/Llama-2-7b-hf

model-index: youri-7b
task: text-generation (Text Generation)
source: Open LLM Leaderboard (https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=rinna/youri-7b)

Benchmark	Dataset	Config	Split	num_few_shot	Metric	Value
AI2 Reasoning Challenge (25-Shot)	ai2_arc	ARC-Challenge	test	25	acc_norm (normalized accuracy)	49.06
HellaSwag (10-Shot)	hellaswag	(none)	validation	10	acc_norm (normalized accuracy)	74.89
MMLU (5-Shot)	cais/mmlu	all	test	5	acc (accuracy)	42.22
TruthfulQA (0-shot)	truthful_qa	multiple_choice	validation	0	mc2	36.03
Winogrande (5-shot)	winogrande	winogrande_xl	validation	5	acc (accuracy)	71.82
GSM8k (5-shot)	gsm8k	main	test	5	acc (accuracy)	8.64

`rinna/youri-7b`

Overview

We continually pre-trained llama2-7b on around 40B tokens from a mixture of Japanese and English datasets. The continual pre-training significantly improves the model's performance on Japanese tasks.

The name youri comes from the Japanese word 妖狸/ようり/Youri, which is a kind of Japanese mythical creature (妖怪/ようかい/Youkai).

Library

The model was trained using code based on EleutherAI/gpt-neox.

Model architecture

A 32-layer, 4096-hidden-size transformer-based language model. Refer to the llama2 paper for architecture details.
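As a rough sanity check on the "7b" in the model name, the parameter count can be estimated from the two figures above. This is only a sketch: the feed-forward size (11008), the vocabulary size (32000), untied input/output embeddings, bias-free projections, and standard multi-head attention are assumptions taken from the public Llama 2 7B configuration, not from this model card, and `llama_param_count` is a hypothetical helper name.

```python
def llama_param_count(num_layers, hidden_size, intermediate_size, vocab_size):
    """Estimate the parameter count of a Llama-style decoder (no biases)."""
    embed = vocab_size * hidden_size            # input token embeddings
    attn = 4 * hidden_size * hidden_size        # q, k, v, and output projections
    mlp = 3 * hidden_size * intermediate_size   # gate, up, and down projections
    norms = 2 * hidden_size                     # two RMSNorm weight vectors per layer
    final_norm = hidden_size                    # RMSNorm before the output head
    lm_head = vocab_size * hidden_size          # untied output projection (assumption)
    return embed + num_layers * (attn + mlp + norms) + final_norm + lm_head


# 32 layers and hidden size 4096 come from the text above;
# 11008 and 32000 are assumed from the Llama 2 7B configuration.
total = llama_param_count(
    num_layers=32, hidden_size=4096, intermediate_size=11008, vocab_size=32000
)
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the estimate comes to roughly 6.74 billion parameters, which is consistent with the 7B label.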

Continual pre-training

The model was initialized with the llama2-7b model and continually trained on around 40B tokens from a mixture of the following corpora:
- Japanese CC-100
- Japanese C4
- Japanese OSCAR
- The Pile
- Wikipedia
- rinna curated Japanese dataset

Contributors

Tianyu Zhao, Akio Kaga, Kei Sawada

Release date

October 31, 2023

Benchmarking

Please refer to rinna's LM benchmark page (Sheet 20231031).

How to use the model

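import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Download (or load from the local cache) the tokenizer and model weights.
tokenizer = AutoTokenizer.from_pretrained("rinna/youri-7b")
model = AutoModelForCausalLM.from_pretrained("rinna/youri-7b")

# Move the model to the GPU when one is available.
if torch.cuda.is_available():
    model = model.to("cuda")

# Encode the prompt as a tensor of token IDs, without adding special tokens such as BOS.
text = "西田幾多郎は、"
token_ids = tokenizer.encode(text, add_special_tokens=False, return_tensors="pt")

# Sample exactly 200 new tokens (min_new_tokens == max_new_tokens) with nucleus sampling.
with torch.no_grad():
    output_ids = model.generate(
        token_ids.to(model.device),
        max_new_tokens=200,
        min_new_tokens=200,
        do_sample=True,
        temperature=1.0,
        top_p=0.95,
        pad_token_id=tokenizer.pad_token_id,
        bos_token_id=tokenizer.bos_token_id,
        eos_token_id=tokenizer.eos_token_id
    )

# Decode the full sequence (prompt plus continuation) back to text.
output = tokenizer.decode(output_ids.tolist()[0])
print(output)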
"""
西田幾多郎は、プラトンの復権を主張し、対する従来の西洋哲学は、近代の合理主義哲学に委ね、「従来の哲学は破 壊されてしまった」と述べている。 西田幾多郎は、西洋近代哲学の「徹底的な検討」を拒んだ。それは、「現代的理解の脆弱性を補う筈の、従来のヨーロッパに伝わる哲学的な方法では到底それができなかったからである」とい
"""
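The `do_sample=True`, `temperature=1.0`, and `top_p=0.95` arguments above request temperature-scaled nucleus (top-p) sampling. The following is a minimal, dependency-free sketch of that idea on a toy logit vector. It is an illustration only, not the `transformers` implementation; `top_p_filter`, `sample_token`, and `toy_logits` are hypothetical names introduced here.

```python
import math
import random


def top_p_filter(logits, temperature=1.0, top_p=0.95):
    """Return {token_index: renormalized probability} for the nucleus."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                              # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]               # softmax over the scaled logits

    # Walk tokens from most to least likely until the kept mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}       # renormalize over the kept tokens


def sample_token(logits, temperature=1.0, top_p=0.95, rng=random):
    """Draw one token index from the nucleus distribution."""
    nucleus = top_p_filter(logits, temperature, top_p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


toy_logits = [2.0, 1.0, 0.5, -3.0]                  # made-up values for illustration
nucleus = top_p_filter(toy_logits)
print(sorted(nucleus))                              # indices that survive the top-p cut
print(sample_token(toy_logits))
```

With these toy logits the least likely token falls outside the 0.95 nucleus and is never sampled. Lowering `temperature` sharpens the distribution, so fewer tokens are needed to reach the same cumulative mass.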

Tokenization

The model uses the original llama-2 tokenizer.

How to cite

@misc{rinna-youri-7b,
    title = {rinna/youri-7b},
    author = {Zhao, Tianyu and Kaga, Akio and Sawada, Kei},
    url = {https://huggingface.co/rinna/youri-7b}
}

@inproceedings{sawada2024release,
    title = {Release of Pre-Trained Models for the {J}apanese Language},
    author = {Sawada, Kei and Zhao, Tianyu and Shing, Makoto and Mitsui, Kentaro and Kaga, Akio and Hono, Yukiya and Wakatsuki, Toshiaki and Mitsuda, Koh},
    booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
    month = {5},
    year = {2024},
    pages = {13898--13905},
    url = {https://aclanthology.org/2024.lrec-main.1213},
    note = {\url{https://arxiv.org/abs/2404.01657}}
}

References

@software{gpt-neox-library,
    title = {{GPT}-{N}eo{X}: Large Scale Autoregressive Language Modeling in {P}y{T}orch},
    author = {Andonian, Alex and Anthony, Quentin and Biderman, Stella and Black, Sid and Gali, Preetham and Gao, Leo and Hallahan, Eric and Levy-Kramer, Josh and Leahy, Connor and Nestler, Lucas and Parker, Kip and Pieler, Michael and Purohit, Shivanshu and Songz, Tri and Phil, Wang and Weinbach, Samuel},
    doi = {10.5281/zenodo.5879544},
    month = {8},
    year = {2021},
    version = {0.0.1},
    url = {https://www.github.com/eleutherai/gpt-neox}
}

License

The llama2 license

Open LLM Leaderboard Evaluation Results

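Detailed results are available on the Open LLM Leaderboard (https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=rinna/youri-7b).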

Metric	Value
Avg.	47.11
AI2 Reasoning Challenge (25-Shot)	49.06
HellaSwag (10-Shot)	74.89
MMLU (5-Shot)	42.22
TruthfulQA (0-shot)	36.03
Winogrande (5-shot)	71.82
GSM8k (5-shot)	8.64
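The "Avg." row appears to be the unweighted mean of the six benchmark scores. A short check, using only the values from the table above (the `scores` dictionary is just a convenient restatement of them):

```python
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 49.06,
    "HellaSwag (10-Shot)": 74.89,
    "MMLU (5-Shot)": 42.22,
    "TruthfulQA (0-shot)": 36.03,
    "Winogrande (5-shot)": 71.82,
    "GSM8k (5-shot)": 8.64,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")
```

The result rounds to the 47.11 reported in the table.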