Initialize project; model provided by the ModelHub XC community
Model: ermiaazarkhalili/Llama-3-8B-Instruct_Function_Calling_xLAM · Source: Original Platform

---
license: llama3
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- metallama38binstruct
- sft
- fine-tuned
- trl
- lora
- text-generation
- conversational
- instruction-following
base_model: meta-llama/Meta-Llama-3-8B-Instruct
datasets:
- Salesforce/xlam-function-calling-60k
model-index:
- name: Llama-3-8B-Function-Calling-xLAM
  results: []
---

# Llama-3-8B-Function-Calling-xLAM

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) trained on the [Salesforce/xlam-function-calling-60k](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) dataset using **SFT** with LoRA adapters.

## Overview

**Llama-3-8B-Function-Calling-xLAM** is a language model optimized using supervised fine-tuning (SFT), which trains the model to follow instructions by learning from high-quality demonstration data.

### Key Features

- **High-Quality Fine-Tuning**: Trained on carefully curated function-calling examples
- **Efficient Training**: Uses LoRA (Low-Rank Adaptation) with 4-bit quantization
- **Optimized for Inference**: Available in multiple formats, including GGUF quantizations
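
As a concrete picture of what SFT looks like here, the sketch below runs TRL's `SFTTrainer` over the xLAM data. It is a minimal illustration rather than the author's actual training script; the record-to-text formatting and every unspecified setting are assumptions (the hyperparameters actually used appear in the tables further down):

```python
# Minimal SFT sketch with TRL (illustrative only, not the release script).
from datasets import load_dataset
from trl import SFTTrainer

dataset = load_dataset("Salesforce/xlam-function-calling-60k", split="train")

def to_text(example):
    # Collapse one xLAM record into a single training string
    # (field names and layout are assumptions; check the dataset card).
    return {"text": f"Query: {example['query']}\n"
                    f"Tools: {example['tools']}\n"
                    f"Answer: {example['answers']}"}

dataset = dataset.map(to_text, remove_columns=dataset.column_names)

trainer = SFTTrainer(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # base model from this card
    train_dataset=dataset,  # SFTTrainer trains on the "text" column by default
)
trainer.train()
```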

## Model Details

| Property | Value |
|----------|-------|
| **Developed by** | [ermiaazarkhalili](https://huggingface.co/ermiaazarkhalili) |
| **License** | Llama 3 |
| **Language** | English |
| **Base Model** | [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) |
| **Model Size** | 8B parameters |
| **Tensor Type** | BF16 |
| **Context Length** | 2,048 tokens (fine-tuning sequence length) |
| **Training Method** | SFT with LoRA |

## Training Information

### Training Configuration

| Parameter | Value |
|-----------|-------|
| Learning Rate | 0.0002 |
| Batch Size | 2 per device |
| Gradient Accumulation Steps | 8 |
| Effective Batch Size | 16 |
| Number of Epochs | 1 |
| Max Sequence Length | 2,048 tokens |
| LR Scheduler | Linear warmup + cosine annealing |
| Warmup Ratio | 0.1 |
| Precision | BF16 mixed precision |
| Gradient Checkpointing | Enabled |
| Random Seed | 42 |
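
For reference, the configuration above maps onto a TRL `SFTConfig` roughly as follows. This is a minimal sketch with assumed argument names from recent TRL releases (e.g. `max_length`), not the author's original training script:

```python
# Hypothetical reconstruction of the training arguments above
# (illustrative; not the release training script).
from trl import SFTConfig

training_args = SFTConfig(
    output_dir="llama3-8b-xlam-sft",   # hypothetical output path
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,     # 2 x 8 = effective batch size 16
    num_train_epochs=1,
    max_length=2048,                   # max sequence length in tokens
    lr_scheduler_type="cosine",        # cosine annealing after warmup
    warmup_ratio=0.1,                  # linear warmup over first 10% of steps
    bf16=True,                         # BF16 mixed precision
    gradient_checkpointing=True,
    seed=42,
)
```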

### LoRA Configuration

| Parameter | Value |
|-----------|-------|
| LoRA Rank (r) | 64 |
| LoRA Alpha | 128 |
| LoRA Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Quantization | 4-bit NF4 |
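
A matching PEFT and bitsandbytes setup would look roughly like the sketch below. It is an illustrative reconstruction of the table, not the exact code used for this release:

```python
# Illustrative QLoRA setup matching the table above (assumed, not verbatim).
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization for the frozen base weights
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapters on all attention and MLP projection layers
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```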

### Training Metrics

| Metric | Value |
|--------|-------|
| Hardware | NVIDIA H100 MIG |

## Dataset

This model was trained on the [Salesforce/xlam-function-calling-60k](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) dataset.

| Split | Samples |
|-------|---------|
| Training | N/A |
| Evaluation | N/A |
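
The split sizes were not recorded in this card, but they can be recovered directly from the dataset (field names below are assumptions; check the dataset card):

```python
# Inspect the dataset splits and schema directly.
from datasets import load_dataset

ds = load_dataset("Salesforce/xlam-function-calling-60k")
print(ds)                      # available splits and their row counts
print(ds["train"][0].keys())   # per-record fields, e.g. query / tools / answers
```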

## Usage

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM"

# Load the tokenizer and the BF16 model across available devices
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the sum of 2 + 2?"}
]

# Format the chat with the model's template and generate a reply
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
```
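
Because the model is tuned for function calling, a typical call describes the available tools in the prompt and parses a JSON function call from the output. The exact prompt format this fine-tune expects is not documented in this card, so the xLAM-style layout below (tool schemas in the system message, a JSON array as the answer) is an assumption to adapt; it reuses `model` and `tokenizer` from the Quick Start above:

```python
# Hedged function-calling sketch: the system-prompt layout is an assumption
# based on the xLAM data format, and get_weather is a made-up example tool.
import json

tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {"city": {"type": "string", "description": "City name"}},
}]

messages = [
    {"role": "system", "content": (
        "You are a function-calling assistant. Available tools:\n"
        + json.dumps(tools)
        + '\nRespond with a JSON array: [{"name": ..., "arguments": {...}}]'
    )},
    {"role": "user", "content": "What's the weather in Toronto?"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
raw = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

calls = json.loads(raw)  # raises ValueError if the model strays from JSON
```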

### Using Pipeline

```python
from transformers import pipeline

# device_map="auto" spreads the model across available GPUs/CPU
generator = pipeline(
    "text-generation",
    model="ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM",
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain the concept of machine learning."}]
output = generator(messages, max_new_tokens=256, return_full_text=False)
print(output[0]["generated_text"])
```

### 4-bit Quantized Inference

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# NF4 4-bit quantization with BF16 compute, matching the training setup
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4"
)

model = AutoModelForCausalLM.from_pretrained(
    "ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM",
    quantization_config=quantization_config,
    device_map="auto"
)
```

## GGUF Versions

For CPU or mixed CPU/GPU inference, GGUF quantized versions are available at
[ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM-GGUF](https://huggingface.co/ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM-GGUF).

### Using with Ollama

```bash
ollama pull hf.co/ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM-GGUF:Q4_K_M
ollama run hf.co/ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM-GGUF:Q4_K_M "Hello!"
```

## Limitations

- **Language**: Primarily trained on English data
- **Knowledge Cutoff**: Limited to the base model's training data cutoff
- **Hallucinations**: May generate plausible-sounding but incorrect information
- **Context Length**: Fine-tuned with a 2,048-token sequence limit
- **Safety**: Not extensively safety-tuned; use with appropriate guardrails

## Intended Use

### Recommended Uses

- Research on language model fine-tuning
- Educational purposes
- Personal projects
- Prototyping conversational AI

### Out-of-Scope Uses

- Production systems without additional safety measures
- Medical, legal, or financial advice
- Generating harmful or misleading content

## Training Framework

- **TRL**: 0.24.0
- **Transformers**: 4.57.3
- **PyTorch**: 2.9.0
- **Datasets**: 4.3.0
- **PEFT**: 0.18.0
- **BitsAndBytes**: 0.49.0
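
To reproduce this environment, the versions above can be pinned in a single install (standard PyPI package names assumed):

```bash
# Pin the training stack to the versions listed above
pip install trl==0.24.0 transformers==4.57.3 torch==2.9.0 \
    datasets==4.3.0 peft==0.18.0 bitsandbytes==0.49.0
```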

## Citation

```bibtex
@misc{ermiaazarkhalili_llama_3_8b_function_calling_xlam,
  author = {ermiaazarkhalili},
  title = {Llama-3-8B-Function-Calling-xLAM: Fine-tuned Meta-Llama-3-8B-Instruct on xlam-function-calling-60k},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM}}
}
```

## Acknowledgments

- Base model developers at meta-llama
- [Hugging Face TRL Team](https://github.com/huggingface/trl) for the training library
- Dataset creators and contributors
- Compute Canada / DRAC for HPC resources

## Contact

For questions or issues, please open an issue on the model repository.