ModelHub XC 62a6fcea67 Initialize project; model provided by the ModelHub XC community

Model: prithivMLmods/Llama-3.1-8B-4bit-axium
Source: Original Platform

2026-05-21 21:09:13 +08:00

About the uploaded model

The model is still in the training phase. This is not the final version and may contain artifacts and perform poorly in some cases.

Trainer Configuration

| Parameter | Value |
|---|---|
| Model | `model` |
| Tokenizer | `tokenizer` |
| Train Dataset | `dataset` |
| Dataset Text Field | `text` |
| Max Sequence Length | `max_seq_length` |
| Dataset Number of Processes | `2` |
| Packing | `False` (setting this to `True` can make training 5x faster for short sequences) |
| Training Arguments | |
| - Per Device Train Batch Size | `2` |
| - Gradient Accumulation Steps | `4` |
| - Warmup Steps | `5` |
| - Number of Train Epochs | `1` (one full pass over the dataset; overridden by Max Steps when that value is positive) |
| - Max Steps | `60` |
| - Learning Rate | `2e-4` |
| - FP16 | `not is_bfloat16_supported()` |
| - BF16 | `is_bfloat16_supported()` |
| - Logging Steps | `1` |
| - Optimizer | `adamw_8bit` |
| - Weight Decay | `0.01` |
| - LR Scheduler Type | `linear` |
| - Seed | `3407` |
| - Output Directory | `outputs` |
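
To make the numbers in the configuration above more concrete, here is a minimal sketch in plain Python that works out the effective batch size and the learning rate at a given step. The dict keys mirror the Hugging Face `TrainingArguments` keyword names. The helpers `effective_batch_size` and `linear_lr` are hypothetical, written only for illustration, and the single-device default is an assumption, since the card does not state the hardware. `linear_lr` re-implements linear warmup followed by linear decay, which is what the `linear` scheduler type is generally understood to do. It is not the code that trained this model.

```python
# Values copied from the Trainer Configuration table; keys follow
# Hugging Face TrainingArguments keyword names.
training_args = {
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "warmup_steps": 5,
    "max_steps": 60,
    "learning_rate": 2e-4,
    "lr_scheduler_type": "linear",
}


def effective_batch_size(args, num_devices=1):
    """Sequences consumed per optimizer step.

    num_devices=1 is an assumption; the model card does not state it.
    """
    return (
        args["per_device_train_batch_size"]
        * args["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, args):
    """Learning rate at optimizer step `step`: linear warmup, then linear decay to 0."""
    warmup = args["warmup_steps"]
    total = args["max_steps"]
    peak = args["learning_rate"]
    if step < warmup:
        # Ramp from 0 up to the peak learning rate over the warmup steps.
        return peak * step / max(1, warmup)
    # Decay linearly from the peak down to 0 at the final step.
    return peak * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    batch = effective_batch_size(training_args)
    print("effective batch size:", batch)
    print("sequences seen in run:", batch * training_args["max_steps"])
    for step in (0, 1, 5, 30, 60):
        print(step, linear_lr(step, training_args))
```

Under these assumptions each optimizer step consumes 2 × 4 = 8 sequences, so the 60-step run sees 480 sequences in total. The learning rate reaches its peak of `2e-4` at step 5 and falls back to zero at step 60.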

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.