初始化项目,由ModelHub XC社区提供模型
Model: ahczhg/Llama-3.2-1B-Aegis-SFT-DPO Source: Original Platform
This commit is contained in:
36
.gitattributes
vendored
Normal file
36
.gitattributes
vendored
Normal file
@@ -0,0 +1,36 @@
|
|||||||
|
*.7z filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.arrow filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.bin filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.bz2 filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.ckpt filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.ftz filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.gz filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.h5 filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.joblib filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.lfs.* filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.mlmodel filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.model filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.msgpack filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.npy filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.npz filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.onnx filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.ot filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.parquet filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.pb filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.pickle filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.pkl filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.pt filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.pth filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.rar filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.safetensors filter=lfs diff=lfs merge=lfs -text
|
||||||
|
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.tar.* filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.tar filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.tflite filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.tgz filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.wasm filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.xz filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.zip filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*.zst filter=lfs diff=lfs merge=lfs -text
|
||||||
|
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
||||||
|
tokenizer.json filter=lfs diff=lfs merge=lfs -text
|
||||||
476
README.md
Normal file
476
README.md
Normal file
@@ -0,0 +1,476 @@
|
|||||||
|
---
|
||||||
|
language:
|
||||||
|
- en
|
||||||
|
license: llama3.2
|
||||||
|
base_model: meta-llama/Llama-3.2-1B
|
||||||
|
tags:
|
||||||
|
- llama-3.2
|
||||||
|
- fine-tuned
|
||||||
|
- sft
|
||||||
|
- dpo
|
||||||
|
- content-safety
|
||||||
|
- aegis
|
||||||
|
- trl
|
||||||
|
- peft
|
||||||
|
- lora
|
||||||
|
- rlhf
|
||||||
|
library_name: transformers
|
||||||
|
pipeline_tag: text-generation
|
||||||
|
datasets:
|
||||||
|
- nvidia/Aegis-AI-Content-Safety-Dataset-2.0
|
||||||
|
widget:
|
||||||
|
- text: "What is artificial intelligence?"
|
||||||
|
example_title: "AI Question"
|
||||||
|
- text: "How can I learn programming?"
|
||||||
|
example_title: "Learning Question"
|
||||||
|
- text: "Explain quantum computing in simple terms."
|
||||||
|
example_title: "Complex Topic"
|
||||||
|
---
|
||||||
|
|
||||||
|
# Llama-3.2-1B-Aegis-SFT-DPO
|
||||||
|
|
||||||
|
<div align="center">
|
||||||
|
<strong>Fine-tuned Llama 3.2 1B for Content-Safe Instruction Following</strong>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<br>
|
||||||
|
|
||||||
|
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) using a **two-stage training approach**:
|
||||||
|
|
||||||
|
1. **Supervised Fine-Tuning (SFT)** - Teaching the model to follow instructions
|
||||||
|
2. **Direct Preference Optimization (DPO)** - Aligning with human preferences for safety
|
||||||
|
|
||||||
|
## 🎯 Model Description
|
||||||
|
|
||||||
|
- **Base Model**: meta-llama/Llama-3.2-1B
|
||||||
|
- **Fine-tuning Method**: SFT + DPO (RLHF approach)
|
||||||
|
- **Dataset**: [nvidia/Aegis-AI-Content-Safety-Dataset-2.0](https://huggingface.co/nvidia/Aegis-AI-Content-Safety-Dataset-2.0)
|
||||||
|
- **Training Samples**: 500
|
||||||
|
- **Focus**: Content safety and responsible AI responses
|
||||||
|
- **Architecture**: Parameter Efficient Fine-Tuning (LoRA)
|
||||||
|
- **Model Size**: ~1B parameters
|
||||||
|
- **Quantization**: 4-bit during training, full precision release
|
||||||
|
|
||||||
|
## 🚀 Quick Start
|
||||||
|
|
||||||
|
```python
|
||||||
|
from transformers import AutoTokenizer, AutoModelForCausalLM
|
||||||
|
import torch
|
||||||
|
|
||||||
|
# Load model and tokenizer
|
||||||
|
model_name = "ahczhg/Llama-3.2-1B-Aegis-SFT-DPO"
|
||||||
|
tokenizer = AutoTokenizer.from_pretrained(model_name)
|
||||||
|
model = AutoModelForCausalLM.from_pretrained(
|
||||||
|
model_name,
|
||||||
|
torch_dtype=torch.bfloat16,
|
||||||
|
device_map="auto"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Prepare messages
|
||||||
|
messages = [
|
||||||
|
{"role": "user", "content": "What is artificial intelligence?"}
|
||||||
|
]
|
||||||
|
|
||||||
|
# Apply chat template and generate
|
||||||
|
inputs = tokenizer.apply_chat_template(
|
||||||
|
messages,
|
||||||
|
add_generation_prompt=True,
|
||||||
|
return_tensors="pt"
|
||||||
|
).to(model.device)
|
||||||
|
|
||||||
|
outputs = model.generate(
|
||||||
|
inputs,
|
||||||
|
max_new_tokens=256,
|
||||||
|
temperature=0.7,
|
||||||
|
top_p=0.9,
|
||||||
|
do_sample=True,
|
||||||
|
pad_token_id=tokenizer.eos_token_id
|
||||||
|
)
|
||||||
|
|
||||||
|
# Decode response
|
||||||
|
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
|
||||||
|
print(response)
|
||||||
|
```
|
||||||
|
|
||||||
|
## 📊 Training Details
|
||||||
|
|
||||||
|
### Dataset Information
|
||||||
|
|
||||||
|
- **Source**: NVIDIA Aegis AI Content Safety Dataset 2.0
|
||||||
|
- **Total Samples Used**: 500
|
||||||
|
- **SFT Split**: 400 samples (~80%)
|
||||||
|
- **DPO Split**: 100 samples (~20%)
|
||||||
|
- **Data Filtering**: Removed redacted prompts and invalid entries
|
||||||
|
- **Format**: Conversational pairs with safety labels
|
||||||
|
|
||||||
|
### Training Methodology
|
||||||
|
|
||||||
|
This model follows a two-stage approach similar to RLHF (Reinforcement Learning from Human Feedback), inspired by [AMD's Instella-3B-Instruct](https://huggingface.co/amd/Instella-3B-Instruct):
|
||||||
|
|
||||||
|
#### Stage 1: Supervised Fine-Tuning (SFT)
|
||||||
|
|
||||||
|
Teaching the model to follow the instruction format and generate appropriate responses.
|
||||||
|
|
||||||
|
**Hyperparameters**:
|
||||||
|
```yaml
|
||||||
|
Epochs: 2
|
||||||
|
Batch Size: 1
|
||||||
|
Gradient Accumulation: 8
|
||||||
|
Effective Batch Size: 8
|
||||||
|
Learning Rate: 1e-5
|
||||||
|
Optimizer: AdamW
|
||||||
|
LR Scheduler: Cosine
|
||||||
|
Warmup Steps: 100
|
||||||
|
Weight Decay: 0.1
|
||||||
|
Max Gradient Norm: 1.0
|
||||||
|
Precision: BF16
|
||||||
|
Gradient Checkpointing: True
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Stage 2: Direct Preference Optimization (DPO)
|
||||||
|
|
||||||
|
Optimizing the model to prefer safe, helpful responses over problematic ones using preference learning.
|
||||||
|
|
||||||
|
**Hyperparameters**:
|
||||||
|
```yaml
|
||||||
|
Epochs: 1
|
||||||
|
Batch Size: 1
|
||||||
|
Gradient Accumulation: 8
|
||||||
|
Effective Batch Size: 8
|
||||||
|
Learning Rate: 5e-7
|
||||||
|
Beta (DPO): 0.1
|
||||||
|
Max Prompt Length: 512
|
||||||
|
Max Sequence Length: 1024
|
||||||
|
Optimizer: AdamW
|
||||||
|
LR Scheduler: Cosine
|
||||||
|
Warmup Ratio: 10%
|
||||||
|
Precision: BF16
|
||||||
|
```
|
||||||
|
|
||||||
|
### LoRA Configuration
|
||||||
|
|
||||||
|
Parameter-efficient fine-tuning using Low-Rank Adaptation:
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
Rank (r): 8
|
||||||
|
Alpha: 16
|
||||||
|
Dropout: 0.05
|
||||||
|
Target Modules:
|
||||||
|
- q_proj
|
||||||
|
- k_proj
|
||||||
|
- v_proj
|
||||||
|
- o_proj
|
||||||
|
Bias: none
|
||||||
|
Task Type: CAUSAL_LM
|
||||||
|
Trainable Parameters: ~0.5% of total
|
||||||
|
```
|
||||||
|
|
||||||
|
### Training Infrastructure
|
||||||
|
|
||||||
|
- **Platform**: Google Colab
|
||||||
|
- **GPU**: NVIDIA T4 (16GB VRAM)
|
||||||
|
- **Training Quantization**: 4-bit NF4 with double quantization
|
||||||
|
- **Gradient Checkpointing**: Enabled for memory efficiency
|
||||||
|
- **Final Model Format**: Full precision (merged LoRA adapters)
|
||||||
|
- **Total Training Time**: ~30-50 minutes
|
||||||
|
|
||||||
|
## 💻 Advanced Usage
|
||||||
|
|
||||||
|
### Multi-turn Conversation
|
||||||
|
|
||||||
|
```python
|
||||||
|
messages = [
|
||||||
|
{"role": "user", "content": "What is machine learning?"},
|
||||||
|
{"role": "assistant", "content": "Machine learning is a subset of AI..."},
|
||||||
|
{"role": "user", "content": "Can you give me an example?"}
|
||||||
|
]
|
||||||
|
|
||||||
|
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
|
||||||
|
outputs = model.generate(inputs, max_new_tokens=256, temperature=0.7, top_p=0.9, do_sample=True)
|
||||||
|
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
|
||||||
|
```
|
||||||
|
|
||||||
|
### Streaming Generation
|
||||||
|
|
||||||
|
```python
|
||||||
|
from transformers import TextIteratorStreamer
|
||||||
|
from threading import Thread
|
||||||
|
|
||||||
|
streamer = TextIteratorStreamer(tokenizer, skip_special_tokens=True)
|
||||||
|
|
||||||
|
generation_kwargs = dict(
|
||||||
|
inputs=inputs,
|
||||||
|
max_new_tokens=512,
|
||||||
|
temperature=0.7,
|
||||||
|
top_p=0.9,
|
||||||
|
do_sample=True,
|
||||||
|
streamer=streamer,
|
||||||
|
pad_token_id=tokenizer.eos_token_id
|
||||||
|
)
|
||||||
|
|
||||||
|
thread = Thread(target=model.generate, kwargs=generation_kwargs)
|
||||||
|
thread.start()
|
||||||
|
|
||||||
|
for new_text in streamer:
|
||||||
|
print(new_text, end="", flush=True)
|
||||||
|
|
||||||
|
thread.join()
|
||||||
|
```
|
||||||
|
|
||||||
|
### Batch Inference
|
||||||
|
|
||||||
|
```python
|
||||||
|
prompts = [
|
||||||
|
"Explain neural networks",
|
||||||
|
"What is deep learning?",
|
||||||
|
"How does backpropagation work?"
|
||||||
|
]
|
||||||
|
|
||||||
|
messages_batch = [[{"role": "user", "content": p}] for p in prompts]
|
||||||
|
|
||||||
|
# Tokenize all at once
|
||||||
|
inputs = tokenizer.apply_chat_template(
|
||||||
|
messages_batch,
|
||||||
|
add_generation_prompt=True,
|
||||||
|
return_tensors="pt",
|
||||||
|
padding=True
|
||||||
|
).to(model.device)
|
||||||
|
|
||||||
|
# Generate
|
||||||
|
outputs = model.generate(inputs, max_new_tokens=200, temperature=0.7, pad_token_id=tokenizer.eos_token_id)
|
||||||
|
|
||||||
|
# Decode all
|
||||||
|
for output in outputs:
|
||||||
|
print(tokenizer.decode(output, skip_special_tokens=True))
|
||||||
|
print("-" * 80)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Custom Generation Parameters
|
||||||
|
|
||||||
|
```python
|
||||||
|
# More creative
|
||||||
|
outputs = model.generate(
|
||||||
|
inputs,
|
||||||
|
max_new_tokens=512,
|
||||||
|
temperature=0.9, # Higher = more creative
|
||||||
|
top_p=0.95,
|
||||||
|
top_k=50,
|
||||||
|
do_sample=True,
|
||||||
|
repetition_penalty=1.1
|
||||||
|
)
|
||||||
|
|
||||||
|
# More focused/deterministic
|
||||||
|
outputs = model.generate(
|
||||||
|
inputs,
|
||||||
|
max_new_tokens=256,
|
||||||
|
temperature=0.3, # Lower = more focused
|
||||||
|
top_p=0.85,
|
||||||
|
do_sample=True,
|
||||||
|
repetition_penalty=1.05
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
## 🎨 Chat Template Format
|
||||||
|
|
||||||
|
The model uses Llama 3.2's official chat format with special tokens:
|
||||||
|
|
||||||
|
```
|
||||||
|
<|start_header_id|>user<|end_header_id|>
|
||||||
|
|
||||||
|
Your question here<|eot_id|><|start_header_id|>assistant<|end_header_id|>
|
||||||
|
|
||||||
|
Model response here<|eot_id|>
|
||||||
|
```
|
||||||
|
|
||||||
|
The tokenizer's `apply_chat_template` method handles this automatically.
|
||||||
|
|
||||||
|
## 📈 Intended Use Cases
|
||||||
|
|
||||||
|
### ✅ Recommended Applications
|
||||||
|
|
||||||
|
- **Educational Tools**: Safe, informative responses for learning
|
||||||
|
- **Content Safety Research**: Studying AI alignment and safety
|
||||||
|
- **Prototype Development**: Building conversational AI systems
|
||||||
|
- **Instruction Following**: General-purpose task completion
|
||||||
|
- **Safe Text Generation**: Content-aware generation tasks
|
||||||
|
|
||||||
|
### ❌ Out-of-Scope Use
|
||||||
|
|
||||||
|
- **Production Systems**: Without additional safety validation
|
||||||
|
- **High-Stakes Decisions**: Medical, legal, financial advice
|
||||||
|
- **Unsupervised Deployment**: Without human oversight
|
||||||
|
- **Harmful Content**: Generating dangerous or illegal content
|
||||||
|
- **Critical Infrastructure**: Without extensive testing
|
||||||
|
|
||||||
|
## ⚠️ Limitations and Considerations
|
||||||
|
|
||||||
|
### Known Limitations
|
||||||
|
|
||||||
|
1. **Training Data**: Only 500 samples - more data could improve performance
|
||||||
|
2. **Language**: Primarily English-focused, limited multilingual capability
|
||||||
|
3. **Context Length**: Maximum of 1024 tokens
|
||||||
|
4. **Model Size**: 1B parameters - smaller than larger models, may have reduced capabilities
|
||||||
|
5. **Safety Bounds**: Fine-tuned for safety but not perfect - can still make mistakes
|
||||||
|
6. **Domain Knowledge**: Limited to training data cutoff and base model knowledge
|
||||||
|
|
||||||
|
### Biases and Ethical Considerations
|
||||||
|
|
||||||
|
- Inherits biases from base Llama 3.2 model
|
||||||
|
- Safety fine-tuning may make responses overly conservative
|
||||||
|
- Content safety dataset has its own biases
|
||||||
|
- Not suitable for all cultural contexts without adaptation
|
||||||
|
- Should be tested thoroughly before deployment
|
||||||
|
|
||||||
|
### Performance Notes
|
||||||
|
|
||||||
|
- **Speed**: ~10-20 tokens/second on T4 GPU
|
||||||
|
- **Memory**: ~4GB VRAM in BF16, ~2GB with 4-bit quantization
|
||||||
|
- **Best For**: General instruction following with safety awareness
|
||||||
|
- **Trade-offs**: Safety focus may reduce creativity in some cases
|
||||||
|
|
||||||
|
## 🔬 Evaluation
|
||||||
|
|
||||||
|
### Qualitative Assessment
|
||||||
|
|
||||||
|
The model has been tested on:
|
||||||
|
- ✅ General knowledge questions
|
||||||
|
- ✅ Instruction following tasks
|
||||||
|
- ✅ Content safety scenarios
|
||||||
|
- ✅ Multi-turn conversations
|
||||||
|
- ✅ Edge cases and adversarial prompts
|
||||||
|
|
||||||
|
### Sample Outputs
|
||||||
|
|
||||||
|
*(Coming soon - add your evaluation results)*
|
||||||
|
|
||||||
|
### Comparison to Base Model
|
||||||
|
|
||||||
|
| Metric | Base Llama 3.2 | This Model | Improvement |
|
||||||
|
|--------|---------------|------------|-------------|
|
||||||
|
| Safety Awareness | Baseline | Enhanced | +Safety Focus |
|
||||||
|
| Instruction Following | Good | Better | +SFT Training |
|
||||||
|
| Response Quality | High | High | +DPO Alignment |
|
||||||
|
|
||||||
|
## 🛠️ Technical Details
|
||||||
|
|
||||||
|
### Model Architecture
|
||||||
|
|
||||||
|
- **Base**: Llama 3.2 1B
|
||||||
|
- **Vocabulary**: 128,256 tokens
|
||||||
|
- **Hidden Size**: 2048
|
||||||
|
- **Layers**: 16
|
||||||
|
- **Attention Heads**: 32
|
||||||
|
- **Parameters**: ~1.23B total, ~6M trainable (LoRA)
|
||||||
|
|
||||||
|
### Training Efficiency
|
||||||
|
|
||||||
|
- **Trainable Params**: ~0.5% of total (LoRA adapters)
|
||||||
|
- **Memory During Training**: ~8GB VRAM (4-bit quantization)
|
||||||
|
- **Training Time**: ~40 minutes total (SFT + DPO)
|
||||||
|
- **Hardware Cost**: Free tier Google Colab (T4 GPU)
|
||||||
|
|
||||||
|
### Optimization Techniques
|
||||||
|
|
||||||
|
- ✅ 4-bit NF4 quantization
|
||||||
|
- ✅ Gradient checkpointing
|
||||||
|
- ✅ LoRA parameter-efficient fine-tuning
|
||||||
|
- ✅ Gradient accumulation
|
||||||
|
- ✅ BF16 mixed precision
|
||||||
|
- ✅ Optimized memory management
|
||||||
|
|
||||||
|
## 🙏 Acknowledgments
|
||||||
|
|
||||||
|
- **Base Model**: Meta's Llama 3.2 team for the foundation model
|
||||||
|
- **Dataset**: NVIDIA for the Aegis AI Content Safety Dataset
|
||||||
|
- **Methodology**: AMD for the Instella training approach inspiration
|
||||||
|
- **Frameworks**:
|
||||||
|
- Hugging Face Transformers, TRL, PEFT, Datasets
|
||||||
|
- PyTorch team
|
||||||
|
- Google Colab for compute resources
|
||||||
|
|
||||||
|
## 📄 License
|
||||||
|
|
||||||
|
This model is licensed under the **Llama 3.2 Community License**:
|
||||||
|
- Commercial use allowed with restrictions
|
||||||
|
- Attribution required
|
||||||
|
- Cannot be used to train other models without permission
|
||||||
|
- Full license: https://huggingface.co/meta-llama/Llama-3.2-1B
|
||||||
|
|
||||||
|
## 📚 Citations
|
||||||
|
|
||||||
|
### This Model
|
||||||
|
|
||||||
|
```bibtex
|
||||||
|
@misc{llama_3.2_1b_aegis_sft_dpo,
|
||||||
|
author = {Community Contributor},
|
||||||
|
title = {Llama-3.2-1B-Aegis-SFT-DPO: Content-Safe Fine-tuned Llama 3.2},
|
||||||
|
year = {2024},
|
||||||
|
publisher = {HuggingFace},
|
||||||
|
journal = {HuggingFace Model Hub},
|
||||||
|
howpublished = {\url{https://huggingface.co/ahczhg/Llama-3.2-1B-Aegis-SFT-DPO}}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Base Model
|
||||||
|
|
||||||
|
```bibtex
|
||||||
|
@misc{llama32,
|
||||||
|
title={Llama 3.2: Open Foundation and Fine-Tuned Chat Models},
|
||||||
|
author={Meta AI},
|
||||||
|
year={2024},
|
||||||
|
url={https://huggingface.co/meta-llama/Llama-3.2-1B}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Dataset
|
||||||
|
|
||||||
|
```bibtex
|
||||||
|
@misc{aegis_dataset,
|
||||||
|
title={Aegis AI Content Safety Dataset 2.0},
|
||||||
|
author={NVIDIA},
|
||||||
|
year={2024},
|
||||||
|
url={https://huggingface.co/datasets/nvidia/Aegis-AI-Content-Safety-Dataset-2.0}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## 🔗 Links
|
||||||
|
|
||||||
|
- **Base Model**: [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)
|
||||||
|
- **Dataset**: [nvidia/Aegis-AI-Content-Safety-Dataset-2.0](https://huggingface.co/nvidia/Aegis-AI-Content-Safety-Dataset-2.0)
|
||||||
|
- **TRL Library**: [Hugging Face TRL](https://github.com/huggingface/trl)
|
||||||
|
- **PEFT Library**: [Hugging Face PEFT](https://github.com/huggingface/peft)
|
||||||
|
|
||||||
|
## 📞 Feedback & Support
|
||||||
|
|
||||||
|
Found an issue or have suggestions? Please:
|
||||||
|
- Open an issue on the model repository
|
||||||
|
- Report safety concerns immediately
|
||||||
|
- Share your use cases and results
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
**Model Card Version**: 1.0
|
||||||
|
**Last Updated**: 2025-11-15
|
||||||
|
**Training Date**: 2025-11-15
|
||||||
|
|
||||||
|
**Framework Versions**:
|
||||||
|
- 🤗 Transformers: `4.57.1`
|
||||||
|
- 🔥 PyTorch: `2.8.0+cu126`
|
||||||
|
- 🎯 TRL: `0.25.1`
|
||||||
|
- 🔧 PEFT: `0.17.1`
|
||||||
|
- 📊 Datasets: `4.0.0`
|
||||||
|
|
||||||
|
**Compute**:
|
||||||
|
- Platform: Google Colab
|
||||||
|
- GPU: NVIDIA T4 (16GB)
|
||||||
|
- Training Duration: ~40-50 minutes
|
||||||
|
- Carbon Footprint: Minimal (free tier compute)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
<div align="center">
|
||||||
|
<sub>Built with ❤️ using Hugging Face libraries | Trained on Google Colab | Released under Llama 3.2 License</sub>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
[](https://ko-fi.com/ahczhg)
|
||||||
|
|
||||||
1
chat_template.jinja
Normal file
1
chat_template.jinja
Normal file
@@ -0,0 +1 @@
|
|||||||
|
{% for message in messages %}{% if message['role'] == 'user' %}{{ '<|start_header_id|>user<|end_header_id|>\n\n' + message['content'] + '<|eot_id|>' }}{% elif message['role'] == 'assistant' %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' + message['content'] + '<|eot_id|>' }}{% endif %}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}
|
||||||
35
config.json
Normal file
35
config.json
Normal file
@@ -0,0 +1,35 @@
|
|||||||
|
{
|
||||||
|
"architectures": [
|
||||||
|
"LlamaForCausalLM"
|
||||||
|
],
|
||||||
|
"attention_bias": false,
|
||||||
|
"attention_dropout": 0.0,
|
||||||
|
"bos_token_id": 128000,
|
||||||
|
"dtype": "bfloat16",
|
||||||
|
"eos_token_id": 128001,
|
||||||
|
"head_dim": 64,
|
||||||
|
"hidden_act": "silu",
|
||||||
|
"hidden_size": 2048,
|
||||||
|
"initializer_range": 0.02,
|
||||||
|
"intermediate_size": 8192,
|
||||||
|
"max_position_embeddings": 131072,
|
||||||
|
"mlp_bias": false,
|
||||||
|
"model_type": "llama",
|
||||||
|
"num_attention_heads": 32,
|
||||||
|
"num_hidden_layers": 16,
|
||||||
|
"num_key_value_heads": 8,
|
||||||
|
"pretraining_tp": 1,
|
||||||
|
"rms_norm_eps": 1e-05,
|
||||||
|
"rope_scaling": {
|
||||||
|
"factor": 32.0,
|
||||||
|
"high_freq_factor": 4.0,
|
||||||
|
"low_freq_factor": 1.0,
|
||||||
|
"original_max_position_embeddings": 8192,
|
||||||
|
"rope_type": "llama3"
|
||||||
|
},
|
||||||
|
"rope_theta": 500000.0,
|
||||||
|
"tie_word_embeddings": true,
|
||||||
|
"transformers_version": "4.57.1",
|
||||||
|
"use_cache": true,
|
||||||
|
"vocab_size": 128256
|
||||||
|
}
|
||||||
9
generation_config.json
Normal file
9
generation_config.json
Normal file
@@ -0,0 +1,9 @@
|
|||||||
|
{
|
||||||
|
"_from_model_config": true,
|
||||||
|
"bos_token_id": 128000,
|
||||||
|
"do_sample": true,
|
||||||
|
"eos_token_id": 128001,
|
||||||
|
"temperature": 0.6,
|
||||||
|
"top_p": 0.9,
|
||||||
|
"transformers_version": "4.57.1"
|
||||||
|
}
|
||||||
3
model.safetensors
Normal file
3
model.safetensors
Normal file
@@ -0,0 +1,3 @@
|
|||||||
|
version https://git-lfs.github.com/spec/v1
|
||||||
|
oid sha256:7d4a8e8fc3ba2debf7aba9f0656616fd3aeca64ec9026f4eb8d3e5f30835ea98
|
||||||
|
size 2471645608
|
||||||
17
special_tokens_map.json
Normal file
17
special_tokens_map.json
Normal file
@@ -0,0 +1,17 @@
|
|||||||
|
{
|
||||||
|
"bos_token": {
|
||||||
|
"content": "<|begin_of_text|>",
|
||||||
|
"lstrip": false,
|
||||||
|
"normalized": false,
|
||||||
|
"rstrip": false,
|
||||||
|
"single_word": false
|
||||||
|
},
|
||||||
|
"eos_token": {
|
||||||
|
"content": "<|end_of_text|>",
|
||||||
|
"lstrip": false,
|
||||||
|
"normalized": false,
|
||||||
|
"rstrip": false,
|
||||||
|
"single_word": false
|
||||||
|
},
|
||||||
|
"pad_token": "<|end_of_text|>"
|
||||||
|
}
|
||||||
BIN
tokenizer.json
(Stored with Git LFS)
Normal file
BIN
tokenizer.json
(Stored with Git LFS)
Normal file
Binary file not shown.
2063
tokenizer_config.json
Normal file
2063
tokenizer_config.json
Normal file
File diff suppressed because it is too large
Load Diff
Reference in New Issue
Block a user