suayptalha/DeepSeek-R1-Distill-Llama-3B

Cherrytest 6219c53e72 Upload folder using ModelScope SDK

2025-05-16 02:59:21 +00:00

license: mit
library_name: transformers
tags: reasoning, axolotl
base_model: meta-llama/Llama-3.2-3B-Instruct
datasets: ServiceNow-AI/R1-Distill-SFT
pipeline_tag: text-generation

model-index: DeepSeek-R1-Distill-Llama-3B (task type: text-generation, task name: Text Generation)

Benchmark	Dataset	num_few_shot	Metric type	Metric name	Value
IFEval (0-Shot)	HuggingFaceH4/ifeval	0	inst_level_strict_acc and prompt_level_strict_acc	strict accuracy	70.93
BBH (3-Shot)	BBH	3	acc_norm	normalized accuracy	21.45
MATH Lvl 5 (4-Shot)	hendrycks/competition_math	4	exact_match	exact match	20.92
GPQA (0-shot)	Idavidrein/gpqa	0	acc_norm	acc_norm	1.45
MuSR (0-shot)	TAUR-Lab/MuSR	0	acc_norm	acc_norm	2.91
MMLU-PRO (5-shot)	TIGER-Lab/MMLU-Pro (config: main, split: test)	5	acc	accuracy	21.98

Source for all results: Open LLM Leaderboard, https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=suayptalha/DeepSeek-R1-Distill-Llama-3B
DeepSeek-R1-Distill-Llama-3B

This model is a distilled version of DeepSeek-R1, built on Llama-3.2-3B-Instruct and fine-tuned on the ServiceNow-AI/R1-Distill-SFT dataset.

See axolotl config

base_model: unsloth/Llama-3.2-3B-Instruct
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: true
load_in_4bit: false
strict: false

chat_template: llama3
datasets:
  - path: ./custom_dataset.json
    type: chat_template
    conversation: chatml
    ds_type: json

add_bos_token: true
add_eos_token: true
use_default_system_prompt: false

special_tokens:
  bos_token: "<|begin_of_text|>"
  eos_token: "<|eot_id|>"
  pad_token: "<|eot_id|>"
  additional_special_tokens:
    - "<|begin_of_text|>"
    - "<|eot_id|>"

adapter: lora
lora_model_dir:
lora_r: 16
lora_alpha: 32
lora_dropout: 0.1
lora_target_linear: true

hub_model_id: suayptalha/DeepSeek-R1-Distill-Llama-3B

sequence_len: 2048
sample_packing: false
pad_to_sequence_len: true
micro_batch_size: 2
gradient_accumulation_steps: 8
num_epochs: 1
learning_rate: 2e-5
optimizer: paged_adamw_8bit
lr_scheduler: cosine

train_on_inputs: false
group_by_length: false
bf16: false
fp16: true
tf32: false

gradient_checkpointing: true
flash_attention: false

logging_steps: 50
warmup_steps: 100
saves_per_epoch: 1

output_dir: ./finetune-sft-results
save_safetensors: true
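
A few quantities follow directly from the config above, and a short sketch may make them easier to check. The values are copied from the config; the `num_devices` parameter and the helper names are hypothetical, since the card does not state how many GPUs were used. With a single device, the result agrees with the `batch_size: 16` listed under Parameters.

```python
# Values copied from the axolotl config above.
config = {
    "micro_batch_size": 2,
    "gradient_accumulation_steps": 8,
    "sequence_len": 2048,
    "lora_r": 16,
    "lora_alpha": 32,
}


def effective_batch_size(cfg: dict, num_devices: int = 1) -> int:
    """Sequences contributing to one optimizer step.

    num_devices=1 is an assumption; the card does not state the GPU count.
    """
    return cfg["micro_batch_size"] * cfg["gradient_accumulation_steps"] * num_devices


def lora_scale(cfg: dict) -> float:
    """Standard LoRA scaling factor alpha / r applied to the low-rank update."""
    return cfg["lora_alpha"] / cfg["lora_r"]


batch = effective_batch_size(config)
# pad_to_sequence_len is true, so every sample occupies sequence_len positions
# (padding included) in each optimizer step.
positions_per_step = batch * config["sequence_len"]

print(batch, lora_scale(config), positions_per_step)
```

Under the single-device assumption this prints `16 2.0 32768`.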

Prompt Template

Use the Llama 3 prompt template with this model:

Llama3

<|start_header_id|>system<|end_header_id|>
{system}<|eot_id|>

<|start_header_id|>user<|end_header_id|>
{user}<|eot_id|>

<|start_header_id|>assistant<|end_header_id|>
{assistant}<|eot_id|>
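
To make the layout above concrete, here is a minimal sketch that fills the template by hand for one system turn and one user turn, then opens the assistant turn for generation. The function name `render_llama3_prompt` is hypothetical, and the spacing simply mirrors the template as printed on this card; in practice the tokenizer's own `apply_chat_template` is the authoritative way to build the prompt.

```python
def render_llama3_prompt(system: str, user: str) -> str:
    """Fill the card's template and leave the assistant turn open."""

    def turn(role: str, content: str) -> str:
        # One completed turn: header line, content, end-of-turn marker.
        return f"<|start_header_id|>{role}<|end_header_id|>\n{content}<|eot_id|>"

    # The assistant header is emitted without content or <|eot_id|>,
    # so the model continues from this point.
    assistant_header = "<|start_header_id|>assistant<|end_header_id|>\n"
    return "\n\n".join([turn("system", system), turn("user", user), assistant_header])


prompt = render_llama3_prompt("Be brief.", "Which one is larger? 9.11 or 9.9?")
print(prompt)
```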

Example usage:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "suayptalha/DeepSeek-R1-Distill-Llama-3B",
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained("suayptalha/DeepSeek-R1-Distill-Llama-3B")

SYSTEM_PROMPT = """Respond in the following format:
<think>
You should reason between these tags.
</think>

Answer goes here...

Always use <think> </think> tags even if they are not necessary.
"""

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Which one is larger? 9.11 or 9.9?"},
]
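# Build the Llama 3 prompt from the messages and open the assistant turn.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)  # follow device_map="auto" instead of hard-coding "cuda"
# Sampling must be enabled for temperature to take effect.
output = model.generate(input_ids=inputs, max_new_tokens=256, use_cache=True, do_sample=True, temperature=0.7)
# Keep special tokens so the turn markers stay visible in the printed text.
decoded_output = tokenizer.decode(output[0], skip_special_tokens=False)
print(decoded_output)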

Output:

<think>
First, I need to compare the two numbers 9.11 and 9.9. 

Next, I'll analyze each number. The first digit after the decimal point in 9.11 is 1, and in 9.9, it's 9. 

Since 9 is greater than 1, 9.9 is larger than 9.11.
</think>

To determine which number is larger, let's compare the two numbers:

**9.11** and **9.9**

1. **Identify the Decimal Places:**
   - Both numbers have two decimal places.
   
2. **Compare the Tens Place (Right of the Decimal Point):**
   - **9.11:** The tens place is 1.
   - **9.9:** The tens place is 9.
   
3. **Conclusion:**
   - Since 9 is greater than 1, the number with the larger tens place is 9.9.
   
**Answer:** **9.9** is larger than **9.11**.
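
Since the output separates reasoning from the final answer with `<think>` tags, downstream code may want to split the two. The sketch below shows one way to do that with a regular expression; `split_reasoning` is a hypothetical helper, not part of the model or of transformers.

```python
import re

# Non-greedy match across newlines for the first <think>...</think> block.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no tags are found."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()


sample = "<think>\n9 is greater than 1.\n</think>\n\n**Answer:** **9.9** is larger."
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

Apply this to the newly generated text only. The decoded output in the usage example also contains the prompt, and the suggested system prompt itself includes a `<think>` block that would be matched first.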

Suggested system prompt:

Respond in the following format:
<think>
You should reason between these tags.
</think>

Answer goes here...

Always use <think> </think> tags even if they are not necessary.

Parameters

lr: 2e-5
epochs: 1
batch_size: 16
optimizer: paged_adamw_8bit

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	23.27
IFEval (0-Shot)	70.93
BBH (3-Shot)	21.45
MATH Lvl 5 (4-Shot)	20.92
GPQA (0-shot)	1.45
MuSR (0-shot)	2.91
MMLU-PRO (5-shot)	21.98
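
As a quick consistency check, the Avg. row can be reproduced from the six benchmark scores. The sketch assumes Avg. is the unweighted mean of the rows listed in the table.

```python
# Scores copied from the table above.
scores = {
    "IFEval (0-Shot)": 70.93,
    "BBH (3-Shot)": 21.45,
    "MATH Lvl 5 (4-Shot)": 20.92,
    "GPQA (0-shot)": 1.45,
    "MuSR (0-shot)": 2.91,
    "MMLU-PRO (5-shot)": 21.98,
}

# Assumption: Avg. is the plain arithmetic mean of the six benchmarks.
average = round(sum(scores.values()) / len(scores), 2)
print(average)
```

This prints `23.27`, matching the reported average.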
