ModelHub XC c84161a9ed Initialize project; model provided by the ModelHub XC community

Model: lunahr/thea-3b-25r
Source: Original Platform

2026-05-26 12:38:17 +08:00

License: llama3.2
Tags: text-generation-inference, transformers, llama, trl, sft, reasoning, llama-3
Base model: chuanli11/Llama-3.2-3B-Instruct-uncensored
Datasets: KingNish/reasoning-base-20k, lunahr/thea-name-overrides
New version: lunahr/thea-3b-50r-u1

Model index: thea-3b-25r
Task (all results): text-generation (Text Generation)
Source (all results): Open LLM Leaderboard, https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=lunahr/thea-3b-25r

Benchmark	Dataset	Few-shot	Metric	Value
IFEval (0-Shot)	HuggingFaceH4/ifeval	0	strict accuracy (inst_level_strict_acc and prompt_level_strict_acc)	73.44
BBH (3-Shot)	BBH	3	normalized accuracy (acc_norm)	22.55
MATH Lvl 5 (4-Shot)	hendrycks/competition_math	4	exact match (exact_match)	16.31
GPQA (0-shot)	Idavidrein/gpqa	0	acc_norm	2.35
MuSR (0-shot)	TAUR-Lab/MuSR	0	acc_norm	3.57
MMLU-PRO (5-shot)	TIGER-Lab/MMLU-Pro (config main, split test)	5	accuracy (acc)	24.25
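As a quick way to summarize the six benchmark values listed in the metadata, the sketch below computes their unweighted mean. The dictionary name `scores` and the choice of a plain arithmetic mean are assumptions made for illustration; the leaderboard may aggregate its results differently.

```python
# Benchmark values copied from the model-index metadata of this card.
scores = {
    "IFEval (0-Shot)": 73.44,
    "BBH (3-Shot)": 22.55,
    "MATH Lvl 5 (4-Shot)": 16.31,
    "GPQA (0-shot)": 2.35,
    "MuSR (0-shot)": 3.57,
    "MMLU-PRO (5-shot)": 24.25,
}

# Unweighted arithmetic mean (an assumption, not necessarily the leaderboard's aggregate).
average = sum(scores.values()) / len(scores)

# Print each benchmark followed by the mean.
for name, value in scores.items():
    print(f"{name:<22} {value:6.2f}")
print(f"{'Unweighted mean':<22} {average:6.3f}")
```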

Model Description

An uncensored reasoning Llama 3.2 3B model trained on reasoning data.

It was trained with improved training code and gives improved performance. Use the following inference code, which first generates the reasoning and then generates the answer from it:

from transformers import AutoModelForCausalLM, AutoTokenizer

MAX_REASONING_TOKENS = 1024
MAX_RESPONSE_TOKENS = 512

model_name = "lunahr/thea-3b-25r"

model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Which is greater 9.9 or 9.11 ??"
messages = [
    {"role": "user", "content": prompt}
]

# Generate reasoning
reasoning_template = tokenizer.apply_chat_template(messages, tokenize=False, add_reasoning_prompt=True)
reasoning_inputs = tokenizer(reasoning_template, return_tensors="pt").to(model.device)
reasoning_ids = model.generate(**reasoning_inputs, max_new_tokens=MAX_REASONING_TOKENS)
reasoning_output = tokenizer.decode(reasoning_ids[0, reasoning_inputs.input_ids.shape[1]:], skip_special_tokens=True)

print("REASONING: " + reasoning_output)

# Generate answer
messages.append({"role": "reasoning", "content": reasoning_output})
response_template = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response_inputs = tokenizer(response_template, return_tensors="pt").to(model.device)
response_ids = model.generate(**response_inputs, max_new_tokens=MAX_RESPONSE_TOKENS)
response_output = tokenizer.decode(response_ids[0, response_inputs.input_ids.shape[1]:], skip_special_tokens=True)

print("ANSWER: " + response_output)
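The two passes above can be wrapped in a small helper, which makes the control flow easier to reuse and to test without loading the model. This is a minimal sketch, not part of the original card: `two_stage_chat` and the `generate` callable are hypothetical names, and `generate(messages, reasoning)` is assumed to wrap the `apply_chat_template`, `model.generate`, and `tokenizer.decode` calls shown above, using `add_reasoning_prompt=True` when `reasoning` is true and `add_generation_prompt=True` otherwise.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def two_stage_chat(
    prompt: str,
    generate: Callable[[List[Message], bool], str],
) -> Dict[str, object]:
    """Run a reasoning pass, then an answer pass conditioned on that reasoning.

    `generate` is a hypothetical callable supplied by the caller. It receives
    the message list and a flag that is True for the reasoning pass and False
    for the answer pass, and returns the decoded text for that pass.
    """
    messages: List[Message] = [{"role": "user", "content": prompt}]

    # Pass 1: produce the reasoning text for the user prompt.
    reasoning = generate(messages, True)

    # Feed the reasoning back under the "reasoning" role, as in the code above.
    messages.append({"role": "reasoning", "content": reasoning})

    # Pass 2: produce the final answer given the prompt and the reasoning.
    answer = generate(messages, False)

    return {"reasoning": reasoning, "answer": answer, "messages": messages}


if __name__ == "__main__":
    # Stand-in generator so the sketch runs without downloading the model.
    def echo_generate(messages: List[Message], reasoning: bool) -> str:
        return "stub reasoning" if reasoning else "stub answer"

    result = two_stage_chat("Which is greater, 9.9 or 9.11?", echo_generate)
    print("REASONING:", result["reasoning"])
    print("ANSWER:", result["answer"])
```

Keeping the model calls behind a single callable means the token limits (`MAX_REASONING_TOKENS`, `MAX_RESPONSE_TOKENS`) can be applied inside it per pass.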

Trained by: Piotr Zalewski
License: llama3.2
Finetuned from model: chuanli11/Llama-3.2-3B-Instruct-uncensored
Dataset used: KingNish/reasoning-base-20k

This Llama model was trained faster than Unsloth using custom training code.

Visit https://www.kaggle.com/code/piotr25691/distributed-llama-training-with-2xt4 to find out how to finetune your models using both of the Kaggle-provided GPUs.