ModelHub XC 62a6fcea67 Initialize project; model provided by the ModelHub XC community
Model: prithivMLmods/Llama-3.1-8B-4bit-axium
Source: Original Platform
2026-05-21 21:09:13 +08:00

| language | license | tags | base_model |
|---|---|---|---|
| en | apache-2.0 | text-generation-inference, transformers, unsloth, llama, trl, sft, axium | unsloth/meta-llama-3.1-8b-bnb-4bit |

About the uploaded model

  • Developed by: prithivMLmods
  • License: apache-2.0
  • Fine-tuned from model: unsloth/meta-llama-3.1-8b-bnb-4bit

The model is still in training. This is not the final version; it may contain artifacts and perform poorly in some cases.

Trainer Configuration

| Parameter | Value |
|---|---|
| Model | `model` |
| Tokenizer | `tokenizer` |
| Train Dataset | `dataset` |
| Dataset Text Field | `text` |
| Max Sequence Length | `max_seq_length` |
| Dataset Number of Processes | 2 |
| Packing | False (enabling packing can make training 5x faster for short sequences) |
| Training Arguments | |
| - Per Device Train Batch Size | 2 |
| - Gradient Accumulation Steps | 4 |
| - Warmup Steps | 5 |
| - Number of Train Epochs | 1 (set this for one full training run) |
| - Max Steps | 60 |
| - Learning Rate | 2e-4 |
| - FP16 | `not is_bfloat16_supported()` |
| - BF16 | `is_bfloat16_supported()` |
| - Logging Steps | 1 |
| - Optimizer | adamw_8bit |
| - Weight Decay | 0.01 |
| - LR Scheduler Type | linear |
| - Seed | 3407 |
| - Output Directory | outputs |
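
To make the Training Arguments above more concrete, the following is a minimal, dependency-free sketch of what they imply: the effective batch size per optimizer step, the number of sequences seen in the run, and the shape of a linear learning-rate schedule with warmup. The helper names (`effective_batch_size`, `linear_lr`) and the single-device assumption (`NUM_DEVICES = 1`) are illustrative and do not come from the model card. The schedule follows the usual warmup-then-linear-decay formula and is not the trainer's own code. Note also that in the Hugging Face `Trainer`, a positive `max_steps` takes precedence over the number of epochs, so this sketch treats the run as 60 optimizer steps.

```python
# Values copied from the Training Arguments listed above.
PER_DEVICE_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 5
MAX_STEPS = 60
PEAK_LR = 2e-4
NUM_DEVICES = 1  # assumption: single GPU; the model card does not say


def effective_batch_size(per_device, accum_steps, devices=1):
    """Number of sequences that contribute to one optimizer update."""
    return per_device * accum_steps * devices


def linear_lr(step, peak=PEAK_LR, warmup=WARMUP_STEPS, total=MAX_STEPS):
    """Learning rate after `step` optimizer steps.

    Linear ramp from 0 to `peak` over `warmup` steps, then linear
    decay from `peak` to 0 at `total` steps.
    """
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total - step) / max(1, total - warmup))


batch = effective_batch_size(PER_DEVICE_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES)
sequences_seen = batch * MAX_STEPS

# Learning rate at every step of the run, e.g. for plotting or inspection.
schedule = [linear_lr(s) for s in range(MAX_STEPS + 1)]
```

Under these assumptions each optimizer update averages over 8 sequences, and the 60-step run sees 480 sequences in total. That is far less than a full epoch on most datasets, which fits the note that this is not the final version.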

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.