ModelHub XC c5d102c035 Initialize project; model provided by the ModelHub XC community

Model: HumanLLMs/Human-Like-LLama3-8B-Instruct
Source: Original Platform

2026-05-06 07:00:38 +08:00


license: llama3
tags: axolotl, dpo, trl
base_model: meta-llama/Meta-Llama-3-8B-Instruct
datasets: HumanLLMs/Human-Like-DPO-Dataset
pipeline_tag: text-generation
library_name: transformers
model-index: Humanish-LLama3.1-8B-Instruct

Results (task type: text-generation, task name: Text Generation)

Benchmark	Dataset	num_few_shot	Metric type	Metric name	Value
IFEval (0-Shot)	HuggingFaceH4/ifeval	0	inst_level_strict_acc and prompt_level_strict_acc	strict accuracy	64.98
BBH (3-Shot)	BBH	3	acc_norm	normalized accuracy	28.01
MATH Lvl 5 (4-Shot)	hendrycks/competition_math	4	exact_match	exact match	8.46
GPQA (0-shot)	Idavidrein/gpqa	0	acc_norm	acc_norm	0.78
MuSR (0-shot)	TAUR-Lab/MuSR	0	acc_norm	acc_norm	2
MMLU-PRO (5-shot)	TIGER-Lab/MMLU-Pro (config: main, split: test)	5	acc	accuracy	30.02

Source for all results: Open LLM Leaderboard
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=HumanLLMs/Humanish-LLama3.1-8B-Instruct

Enhancing Human-Like Responses in Large Language Models

| 🤗 Models | 📊 Dataset | 📄Paper |

📢 The paper associated with this model has been accepted to the AAAI-26 Workshop on Personalization in the Era of Large Foundation Models (PerFM).

🚀 Human-Like-Llama3-8B-Instruct

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct, specifically optimized to generate more human-like and conversational responses.

The fine-tuning process employed both Low-Rank Adaptation (LoRA) and Direct Preference Optimization (DPO) to enhance natural language understanding, conversational coherence, and emotional intelligence in interactions.

The process of creating this model is detailed in the research paper “Enhancing Human-Like Responses in Large Language Models”.
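For readers unfamiliar with DPO, the per-pair objective can be sketched in a few lines. This is the standard DPO loss written in plain Python, not code taken from this project; the function name `dpo_loss` and the value `beta=0.1` are assumptions for illustration, since the training config does not state a beta.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (beta is an assumed value).

    Each argument is the summed log-probability of a full response under
    either the trainable policy or the frozen reference model.
    """
    # Implicit rewards: how much more the policy likes each response
    # than the reference model does.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# The loss drops when the policy favors the chosen response more than the
# reference model does.
neutral = dpo_loss(-10.0, -10.0, -10.0, -10.0)
improved = dpo_loss(-8.0, -12.0, -10.0, -10.0)
```

With LoRA, only the low-rank adapter weights receive gradients from this loss, while the reference log-probabilities come from the frozen base model.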

🛠️ Training Configuration

Base Model: Llama3-8B-Instruct
Framework: Axolotl v0.4.1
Hardware: 2x NVIDIA A100 (80 GB) GPUs
Training Time: ~2 hours 20 minutes
Dataset: Synthetic dataset with ≈11,000 samples across 256 diverse topics

See axolotl config

axolotl version: 0.4.1

base_model: meta-llama/Meta-Llama-3-8B-Instruct
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: true
load_in_4bit: false
strict: false

chat_template: llama3
rl: dpo
datasets:
  - path: HumanLLMs/humanish-dpo-project
    type: llama3.prompt_pairs
    chat_template: llama3

dataset_prepared_path:
val_set_size: 0.05
output_dir: ./humanish-llama3-8b-instruct

sequence_len: 8192
sample_packing: false
pad_to_sequence_len: true

adapter: lora
lora_model_dir:
lora_r: 8
lora_alpha: 4
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: Humanish-DPO
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

hub_model_id: HumanLLMs/Humanish-LLama3.1-8B-Instruct

gradient_accumulation_steps: 8
micro_batch_size: 2
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
s2_attention:

warmup_steps: 10
evals_per_epoch: 2
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:

save_safetensors: true
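A few derived quantities follow from the config values above. The sketch below is a back-of-the-envelope calculation, assuming both A100 GPUs ran data-parallel and that the 10,884-sample dataset described in this card was used with the 5% validation split; the helper names are hypothetical and the step count is only an estimate.

```python
import math


def effective_batch_size(micro_batch_size, gradient_accumulation_steps, num_gpus):
    """Preference pairs consumed per optimizer step (assumes data parallelism)."""
    return micro_batch_size * gradient_accumulation_steps * num_gpus


def lora_scale(lora_alpha, lora_r):
    """Scaling factor applied to the low-rank update: alpha / r."""
    return lora_alpha / lora_r


def optimizer_steps(num_samples, val_set_size, batch, num_epochs=1):
    """Rough number of optimizer steps after holding out the validation split."""
    train_samples = num_samples - int(num_samples * val_set_size)
    return math.ceil(train_samples / batch) * num_epochs


# Values copied from the axolotl config; num_gpus=2 comes from the hardware line.
batch = effective_batch_size(micro_batch_size=2, gradient_accumulation_steps=8, num_gpus=2)
scale = lora_scale(lora_alpha=4, lora_r=8)
steps = optimizer_steps(num_samples=10884, val_set_size=0.05, batch=batch)
print(batch, scale, steps)
```

Under these assumptions, `warmup_steps: 10` covers only a small fraction of the run, and the LoRA update is scaled down by half relative to the raw product of the adapter matrices.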

💬 Prompt Template

You can use the Llama3 prompt template with this model:

Llama3

<|start_header_id|>system<|end_header_id|>
{system}<|eot_id|>

<|start_header_id|>user<|end_header_id|>
{user}<|eot_id|>

<|start_header_id|>assistant<|end_header_id|>
{assistant}<|eot_id|>

This prompt template is available as a chat template, which means you can format messages using the tokenizer.apply_chat_template() method:

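# Assumes `tokenizer` and `model` are already loaded for this model.
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"}
]
# add_generation_prompt=True appends the assistant header so the model starts its reply;
# return_dict=True returns input_ids and attention_mask, so ** unpacking works.
gen_input = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
)
model.generate(**gen_input)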
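To make the template layout concrete, here is a minimal sketch that renders a message list by hand, following the layout shown in this section. The function `render_llama3_prompt` is hypothetical and written only for illustration; the tokenizer's built-in chat template remains authoritative and may differ in leading special tokens and whitespace, so prefer `apply_chat_template()` in real use.

```python
def render_llama3_prompt(messages, add_generation_prompt=True):
    """Render chat messages following the template layout shown in this card."""
    parts = []
    for message in messages:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n"
            f"{message['content']}<|eot_id|>"
        )
    prompt = "\n\n".join(parts)
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        prompt += "\n\n<|start_header_id|>assistant<|end_header_id|>\n"
    return prompt


messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"},
]
prompt = render_llama3_prompt(messages)
print(prompt)
```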

🤖 Models

Model	Download
Human-Like-Llama-3-8B-Instruct	🤗 HuggingFace
Human-Like-Qwen-2.5-7B-Instruct	🤗 HuggingFace
Human-Like-Mistral-Nemo-Instruct	🤗 HuggingFace

🔄 Quantized versions

GGUF @bartowski

🎯 Benchmark Results

Group	Model	Average	IFEval	BBH	MATH Lvl 5	GPQA	MuSR	MMLU-PRO
Llama Models	Human-Like-Llama-3-8B-Instruct	22.37	64.97	28.01	8.45	0.78	2.00	30.01
	Llama-3-8B-Instruct	23.57	74.08	28.24	8.68	1.23	1.60	29.60
	Difference (Human-Like)	-1.20	-9.11	-0.23	-0.23	-0.45	+0.40	+0.41
Qwen Models	Human-Like-Qwen-2.5-7B-Instruct	26.66	72.84	34.48	0.00	6.49	8.42	37.76
	Qwen-2.5-7B-Instruct	26.86	75.85	34.89	0.00	5.48	8.45	36.52
	Difference (Human-Like)	-0.20	-3.01	-0.41	0.00	+1.01	-0.03	+1.24
Mistral Models	Human-Like-Mistral-Nemo-Instruct	22.88	54.51	32.70	7.62	5.03	9.39	28.00
	Mistral-Nemo-Instruct	23.53	63.80	29.68	5.89	5.37	8.48	27.97
	Difference (Human-Like)	-0.65	-9.29	+3.02	+1.73	-0.34	+0.91	+0.03
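Each Difference row is the fine-tuned model's score minus the base model's score on the same benchmark. The sketch below reproduces the per-benchmark differences for the Llama pair, using values copied from the table; the helper name `score_differences` is hypothetical.

```python
def score_differences(fine_tuned, base):
    """Per-benchmark difference (fine-tuned minus base), rounded to 2 decimals."""
    return {name: round(fine_tuned[name] - base[name], 2) for name in fine_tuned}


# Scores copied from the Llama rows of the benchmark table.
human_like_llama = {
    "IFEval": 64.97, "BBH": 28.01, "MATH Lvl 5": 8.45,
    "GPQA": 0.78, "MuSR": 2.00, "MMLU-PRO": 30.01,
}
base_llama = {
    "IFEval": 74.08, "BBH": 28.24, "MATH Lvl 5": 8.68,
    "GPQA": 1.23, "MuSR": 1.60, "MMLU-PRO": 29.60,
}

diffs = score_differences(human_like_llama, base_llama)
print(diffs)
```

Negative values mean the fine-tuned model scored lower than its base model on that benchmark.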

📊 Dataset

The dataset used for fine-tuning was generated using LLaMA 3 models. The dataset includes 10,884 samples across 256 distinct topics such as technology, daily life, science, history, and arts. Each sample consists of:

Human-like responses: Natural, conversational answers mimicking human dialogue.
Formal responses: Structured and precise answers with a more formal tone.

The dataset has been open-sourced and is available at:

👉 Human-Like-DPO-Dataset

More details on the dataset creation process can be found in the accompanying research paper.
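As an illustration of how such a sample could feed DPO training, the sketch below pairs a human-like and a formal response to the same prompt. The field names `prompt`, `chosen`, and `rejected` follow a common preference-data convention and are an assumption here, as is treating the human-like response as the preferred one; the sample text is invented and not taken from the dataset.

```python
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """One sample: a prompt with a human-like and a formal response."""
    prompt: str
    human_like: str
    formal: str

    def to_dpo_record(self) -> dict:
        # Assumed mapping: the human-like answer is the preferred response.
        return {
            "prompt": self.prompt,
            "chosen": self.human_like,
            "rejected": self.formal,
        }


# Invented example text, for illustration only.
pair = PreferencePair(
    prompt="What do you think about rainy days?",
    human_like="Honestly, I love them. Perfect excuse for tea and a good book!",
    formal="Rainy days are characterized by precipitation and reduced sunlight.",
)
record = pair.to_dpo_record()
print(record)
```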

📝 Citation

@misc{çalık2025enhancinghumanlikeresponseslarge,
      title={Enhancing Human-Like Responses in Large Language Models}, 
      author={Ethem Yağız Çalık and Talha Rüzgar Akkuş},
      year={2025},
      eprint={2501.05032},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.05032}, 
}
