commit 96910917b94e171adbe78f7e978f1011af6b6913
Author: ModelHub XC
Date:   Mon Apr 20 13:53:07 2026 +0800

    Initialize project; model provided by the ModelHub XC community
    Model: SURESHBEEKHANI/Llama_3_2_3B_SFT_GGUF
    Source: Original Platform

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..4b8a3c4
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,38 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+unsloth.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+unsloth.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+unsloth.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..2f080ac
--- /dev/null
+++ b/README.md
@@ -0,0 +1,67 @@
+---
+license: mit
+datasets:
+- mlabonne/FineTome-100k
+language:
+- en
+base_model:
+- unsloth/Llama-3.2-3B-Instruct
+pipeline_tag: question-answering
+---
+# Llama-3.2-3B-Instruct Fine-Tuning on Custom Dataset
+
+## Overview
+
+This repository demonstrates fine-tuning the **Llama-3.2-3B-Instruct** model with the **Unsloth** library. The model is trained on a 500-record subset of the **FineTome-100k** dataset for **60 steps**. Key optimizations include:
+
+- **4-bit quantization** to reduce memory usage
+- **LoRA (Low-Rank Adaptation)** for efficient fine-tuning
+- Techniques for improving inference speed and generating text with the model
+
+## Model Details
+
+- **Model Name**: Llama-3.2-3B-Instruct
+- **Pretrained Weights**: Unsloth’s release of Llama-3.2-3B-Instruct (`unsloth/Llama-3.2-3B-Instruct`)
+- **Quantization**: 4-bit quantization (set via `load_in_4bit=True`) for reduced memory usage
+
+### LoRA Configuration:
+- **Rank**: 16
+- **Target Modules**:
+  - q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
+- **LoRA Alpha**: 16
+- **LoRA Dropout**: 0
+
+### Gradient Checkpointing:
+- **Use Gradient Checkpointing**: `"unsloth"`, for improved long-context training
+
+## Training
+
+- **Dataset**: FineTome-100k (first 500 records selected)
+- **Loss Function**: Standard cross-entropy loss for causal language modeling (next-token prediction)
+- **Training Steps**: 60 steps with a per-device batch size of 2 and 4 gradient accumulation steps (effective batch size of 8)
+- **Optimizer**: AdamW 8-bit
+
+### Training Parameters:
+- **Max Sequence Length**: 2048 tokens
+- **Learning Rate**: 2e-4
+- **Gradient Accumulation Steps**: 4
+- **Total Steps**: 60
+- **Epochs**: about 1 (`max_steps=60` caps training at 60 × 8 = 480 of the 500 selected examples)
+- **Training Precision**: FP16 or BF16, depending on GPU support
+
+## Performance
+
+- **GPU Used**: Tesla T4 (14.7 GB max memory)
+
+### Peak Memory Usage:
+- **Total Reserved Memory**: 3.855 GB
+- **Memory Used for LoRA**: 1.312 GB
+- **Memory Utilization**: 26.1% (peak) of available memory
+
+## Conclusion
+
+This repository showcases an efficient approach to fine-tuning large language models with **LoRA** and **4-bit quantization**, which reduce memory usage and training cost. The **Unsloth** library speeds up training and inference, making this setup practical on limited GPU resources such as a single Tesla T4.
+
+## Notebook
+
+The implementation notebook for this model is available [here](https://github.com/SURESHBEEKHANI/Advanced-LLM-Fine-Tuning/blob/main/Llama_3_2_3B_SFT_GGUF.ipynb). It provides detailed steps for fine-tuning and deploying the model.
\ No newline at end of file
diff --git a/config.json b/config.json
new file mode 100644
index 0000000..a4ba21b
--- /dev/null
+++ b/config.json
@@ -0,0 +1,3 @@
+{
+  "model_type": "llama"
+}
\ No newline at end of file
diff --git a/unsloth.Q4_K_M.gguf b/unsloth.Q4_K_M.gguf
new file mode 100644
index 0000000..7d3d982
--- /dev/null
+++ b/unsloth.Q4_K_M.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fd1b7a5b47c31e6c97101152e3211621aae8e73df67cf1d4f042a1191a38c77b
+size 2019377984
diff --git a/unsloth.Q5_K_M.gguf b/unsloth.Q5_K_M.gguf
new file mode 100644
index 0000000..9460d5b
--- /dev/null
+++ b/unsloth.Q5_K_M.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f31cbfb1ce72126ff65987a3888c63218691e98606faaef2f8937905dbb30be5
+size 2322154304
diff --git a/unsloth.Q8_0.gguf b/unsloth.Q8_0.gguf
new file mode 100644
index 0000000..ddeddf4
--- /dev/null
+++ b/unsloth.Q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f5c1f3c53dc5060973341e3e74bdeb44ac9e78f5a7e5386b6cfc61c6878eaec5
+size 3421899584
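The three `.gguf` entries in this commit are Git LFS pointer files, not the model weights: each records only a spec version, a SHA-256 object ID, and a byte size. As a minimal sketch of how a downloaded file could be checked against its pointer, the Python below parses a pointer and compares size and digest. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of this repository or of Git LFS; only the pointer text is taken from the diff above.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and digest recorded in `pointer`."""
    path = Path(path)
    # Cheap check first: a size mismatch rules the file out without hashing it.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in chunks so multi-gigabyte GGUF files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents for unsloth.Q4_K_M.gguf, copied from the diff above.
Q4_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:fd1b7a5b47c31e6c97101152e3211621aae8e73df67cf1d4f042a1191a38c77b
size 2019377984
"""

pointer = parse_lfs_pointer(Q4_POINTER)
```

Assuming the weights were fetched to a local file named `unsloth.Q4_K_M.gguf`, calling `matches_pointer("unsloth.Q4_K_M.gguf", pointer)` would report whether the download is complete and uncorrupted. The size comparison runs before hashing so that truncated downloads fail quickly.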