Initialize project; model provided by the ModelHub XC community
Model: ermiaazarkhalili/Llama-3-8B-Instruct_Function_Calling_xLAM · Source: Original Platform

---
license: llama3
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- metallama38binstruct
- sft
- fine-tuned
- trl
- lora
- text-generation
- conversational
- instruction-following
base_model: meta-llama/Meta-Llama-3-8B-Instruct
datasets:
- Salesforce/xlam-function-calling-60k
model-index:
- name: Llama-3-8B-Function-Calling-xLAM
  results: []
---

# Llama-3-8B-Function-Calling-xLAM

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) trained on the [Salesforce/xlam-function-calling-60k](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) dataset using **SFT** with LoRA adapters.

## Overview

**Llama-3-8B-Function-Calling-xLAM** is a language model optimized using supervised fine-tuning (SFT), which trains the model to follow instructions by learning from high-quality demonstration data.

### Key Features

- **High-Quality Fine-Tuning**: Trained on carefully curated function-calling examples
- **Efficient Training**: Uses LoRA (Low-Rank Adaptation) with 4-bit quantization
- **Optimized for Inference**: Available in multiple formats, including GGUF quantizations
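
As a concrete picture of what SFT looks like here, the sketch below runs TRL's `SFTTrainer` over the xLAM data. It is a minimal illustration rather than the author's actual training script; the record-to-text formatting and every unspecified setting are assumptions (the hyperparameters actually used appear in the tables further down):

```python
# Minimal SFT sketch with TRL (illustrative only, not the release script).
from datasets import load_dataset
from trl import SFTTrainer

dataset = load_dataset("Salesforce/xlam-function-calling-60k", split="train")

def to_text(example):
    # Collapse one xLAM record into a single training string
    # (field names and layout are assumptions; check the dataset card).
    return {"text": f"Query: {example['query']}\n"
                    f"Tools: {example['tools']}\n"
                    f"Answer: {example['answers']}"}

dataset = dataset.map(to_text, remove_columns=dataset.column_names)

trainer = SFTTrainer(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # base model from this card
    train_dataset=dataset,  # SFTTrainer trains on the "text" column by default
)
trainer.train()
```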

## Model Details

| Property | Value |
|----------|-------|
| **Developed by** | [ermiaazarkhalili](https://huggingface.co/ermiaazarkhalili) |
| **License** | Llama 3 |
| **Language** | English |
| **Base Model** | [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) |
| **Model Size** | 8B parameters |
| **Tensor Type** | BF16 |
| **Context Length** | 2,048 tokens (fine-tuning sequence length) |
| **Training Method** | SFT with LoRA |

## Training Information

### Training Configuration

| Parameter | Value |
|-----------|-------|
| Learning Rate | 0.0002 |
| Batch Size | 2 per device |
| Gradient Accumulation Steps | 8 |
| Effective Batch Size | 16 |
| Number of Epochs | 1 |
| Max Sequence Length | 2,048 tokens |
| LR Scheduler | Linear warmup + cosine annealing |
| Warmup Ratio | 0.1 |
| Precision | BF16 mixed precision |
| Gradient Checkpointing | Enabled |
| Random Seed | 42 |
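
For reference, the configuration above maps onto a TRL `SFTConfig` roughly as follows. This is a minimal sketch with assumed argument names from recent TRL releases (e.g. `max_length`), not the author's original training script:

```python
# Hypothetical reconstruction of the training arguments above
# (illustrative; not the release training script).
from trl import SFTConfig

training_args = SFTConfig(
    output_dir="llama3-8b-xlam-sft",   # hypothetical output path
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,     # 2 x 8 = effective batch size 16
    num_train_epochs=1,
    max_length=2048,                   # max sequence length in tokens
    lr_scheduler_type="cosine",        # cosine annealing after warmup
    warmup_ratio=0.1,                  # linear warmup over first 10% of steps
    bf16=True,                         # BF16 mixed precision
    gradient_checkpointing=True,
    seed=42,
)
```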

### LoRA Configuration

| Parameter | Value |
|-----------|-------|
| LoRA Rank (r) | 64 |
| LoRA Alpha | 128 |
| LoRA Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Quantization | 4-bit NF4 |
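
A matching PEFT and bitsandbytes setup would look roughly like the sketch below. It is an illustrative reconstruction of the table, not the exact code used for this release:

```python
# Illustrative QLoRA setup matching the table above (assumed, not verbatim).
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization for the frozen base weights
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapters on all attention and MLP projection layers
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```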

### Training Metrics

| Metric | Value |
|--------|-------|
| Hardware | NVIDIA H100 MIG |

## Dataset

This model was trained on the [Salesforce/xlam-function-calling-60k](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) dataset.

| Split | Samples |
|-------|---------|
| Training | N/A |
| Evaluation | N/A |
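
The split sizes were not recorded in this card, but they can be recovered directly from the dataset (field names below are assumptions; check the dataset card):

```python
# Inspect the dataset splits and schema directly.
from datasets import load_dataset

ds = load_dataset("Salesforce/xlam-function-calling-60k")
print(ds)                      # available splits and their row counts
print(ds["train"][0].keys())   # per-record fields, e.g. query / tools / answers
```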

## Usage

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM"

# Load the tokenizer and the BF16 model across available devices
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the sum of 2 + 2?"}
]

# Format the chat with the model's template and generate a reply
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
```
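
Because the model is tuned for function calling, a typical call describes the available tools in the prompt and parses a JSON function call from the output. The exact prompt format this fine-tune expects is not documented in this card, so the xLAM-style layout below (tool schemas in the system message, a JSON array as the answer) is an assumption to adapt; it reuses `model` and `tokenizer` from the Quick Start above:

```python
# Hedged function-calling sketch: the system-prompt layout is an assumption
# based on the xLAM data format, and get_weather is a made-up example tool.
import json

tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {"city": {"type": "string", "description": "City name"}},
}]

messages = [
    {"role": "system", "content": (
        "You are a function-calling assistant. Available tools:\n"
        + json.dumps(tools)
        + '\nRespond with a JSON array: [{"name": ..., "arguments": {...}}]'
    )},
    {"role": "user", "content": "What's the weather in Toronto?"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
raw = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

calls = json.loads(raw)  # raises ValueError if the model strays from JSON
```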

### Using Pipeline

```python
from transformers import pipeline

# device_map="auto" spreads the model across available GPUs/CPU
generator = pipeline(
    "text-generation",
    model="ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM",
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain the concept of machine learning."}]
output = generator(messages, max_new_tokens=256, return_full_text=False)
print(output[0]["generated_text"])
```

### 4-bit Quantized Inference

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# NF4 4-bit quantization with BF16 compute, matching the training setup
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4"
)

model = AutoModelForCausalLM.from_pretrained(
    "ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM",
    quantization_config=quantization_config,
    device_map="auto"
)
```

## GGUF Versions

For CPU or mixed CPU/GPU inference, GGUF quantized versions are available at
[ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM-GGUF](https://huggingface.co/ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM-GGUF).

### Using with Ollama

```bash
ollama pull hf.co/ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM-GGUF:Q4_K_M
ollama run hf.co/ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM-GGUF:Q4_K_M "Hello!"
```

## Limitations

- **Language**: Primarily trained on English data
- **Knowledge Cutoff**: Limited to the base model's training data cutoff
- **Hallucinations**: May generate plausible-sounding but incorrect information
- **Context Length**: Fine-tuned with a 2,048-token sequence limit
- **Safety**: Not extensively safety-tuned; use with appropriate guardrails

## Intended Use

### Recommended Uses

- Research on language model fine-tuning
- Educational purposes
- Personal projects
- Prototyping conversational AI

### Out-of-Scope Uses

- Production systems without additional safety measures
- Medical, legal, or financial advice
- Generating harmful or misleading content

## Training Framework

- **TRL**: 0.24.0
- **Transformers**: 4.57.3
- **PyTorch**: 2.9.0
- **Datasets**: 4.3.0
- **PEFT**: 0.18.0
- **BitsAndBytes**: 0.49.0
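
To reproduce this environment, the versions above can be pinned in a single install (standard PyPI package names assumed):

```bash
# Pin the training stack to the versions listed above
pip install trl==0.24.0 transformers==4.57.3 torch==2.9.0 \
    datasets==4.3.0 peft==0.18.0 bitsandbytes==0.49.0
```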

## Citation

```bibtex
@misc{ermiaazarkhalili_llama_3_8b_function_calling_xlam,
  author = {ermiaazarkhalili},
  title = {Llama-3-8B-Function-Calling-xLAM: Fine-tuned Meta-Llama-3-8B-Instruct on xlam-function-calling-60k},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ermiaazarkhalili/Llama-3-8B-Function-Calling-xLAM}}
}
```

## Acknowledgments

- Base model developers at meta-llama
- [Hugging Face TRL Team](https://github.com/huggingface/trl) for the training library
- Dataset creators and contributors
- Compute Canada / DRAC for HPC resources

## Contact

For questions or issues, please open an issue on the model repository.