---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- qwen2514binstruct
- sft
- fine-tuned
- trl
- lora
- text-generation
- conversational
- instruction-following
base_model: Qwen/Qwen2.5-14B-Instruct
datasets:
- Salesforce/xlam-function-calling-60k
model-index:
- name: Qwen2.5-14B-Function-Calling-xLAM
results: []
---
# Qwen2.5-14B-Function-Calling-xLAM
This model is a fine-tuned version of [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) trained on the [Salesforce/xlam-function-calling-60k](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) dataset using **SFT** with LoRA adapters.
## Overview
**Qwen2.5-14B-Function-Calling-xLAM** is a language model specialized for function calling through Supervised Fine-Tuning (SFT), which teaches the model to follow instructions by learning from high-quality demonstration data.
### Key Features
- **High-Quality Fine-Tuning**: Trained on carefully curated function-calling demonstrations from the xLAM dataset
- **Efficient Training**: Uses LoRA (Low-Rank Adaptation) with 4-bit quantization
- **Function Calling**: Tuned to produce structured tool calls from natural-language queries
- **Optimized for Inference**: Available in multiple formats, including GGUF quantizations
## Model Details
| Property | Value |
|----------|-------|
| **Developed by** | [ermiaazarkhalili](https://huggingface.co/ermiaazarkhalili) |
| **License** | Apache-2.0 |
| **Language** | English |
| **Base Model** | [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) |
| **Model Size** | 14B parameters |
| **Tensor Type** | BF16 |
| **Context Length** | 2,048 tokens |
| **Training Method** | SFT with LoRA |
## Training Information
### Training Configuration
| Parameter | Value |
|-----------|-------|
| Learning Rate | 0.0002 |
| Batch Size | 2 per device |
| Gradient Accumulation Steps | 8 |
| Effective Batch Size | 16 |
| Number of Epochs | 1 |
| Max Sequence Length | 2,048 tokens |
| LR Scheduler | Linear warmup + Cosine annealing |
| Warmup Ratio | 0.1 |
| Precision | BF16 mixed precision |
| Gradient Checkpointing | Enabled |
| Random Seed | 42 |
### LoRA Configuration
| Parameter | Value |
|-----------|-------|
| LoRA Rank (r) | 64 |
| LoRA Alpha | 128 |
| LoRA Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Quantization | 4-bit NF4 |
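The hyperparameters in the two tables above can be collected into a plain-Python sketch (the dictionary keys are illustrative, not TRL/PEFT argument names) to sanity-check two derived quantities, the effective batch size and the LoRA scaling factor:

```python
# Hyperparameters transcribed from the Training and LoRA configuration tables.
training = {
    "learning_rate": 2e-4,
    "per_device_batch_size": 2,
    "gradient_accumulation_steps": 8,
    "num_epochs": 1,
    "max_seq_length": 2048,
    "warmup_ratio": 0.1,
    "seed": 42,
}
lora = {
    "r": 64,
    "alpha": 128,
    "dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                       "gate_proj", "up_proj", "down_proj"],
}

# Effective batch size = per-device batch * accumulation steps (single device).
effective_batch_size = training["per_device_batch_size"] * training["gradient_accumulation_steps"]
# Scaling factor applied to the LoRA update: alpha / r.
lora_scaling = lora["alpha"] / lora["r"]
print(effective_batch_size, lora_scaling)  # 16 2.0
```

With alpha = 2r, the adapter update is scaled by 2, a common default for this rank.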
### Training Hardware
| Resource | Value |
|----------|-------|
| GPU | NVIDIA H100 (MIG) |
## Dataset
This model was trained on the [Salesforce/xlam-function-calling-60k](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) dataset.
The dataset contains 60,000 function-calling examples; the exact train/evaluation split sizes used for this run were not recorded.
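Each record pairs a natural-language query with available tool schemas and the expected calls; to the best of our understanding the released dataset stores `tools` and `answers` as JSON-encoded strings, as in this illustrative record:

```python
import json

# A record shaped like the xLAM function-calling data (illustrative values;
# "tools" and "answers" are JSON-encoded strings in the released dataset).
record = {
    "query": "What is the weather in Berlin today?",
    "tools": json.dumps([{
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {"city": {"type": "str", "description": "City name"}},
    }]),
    "answers": json.dumps([{
        "name": "get_weather",
        "arguments": {"city": "Berlin"},
    }]),
}

# Decode the nested JSON fields before use.
tools = json.loads(record["tools"])
answers = json.loads(record["answers"])
print(answers[0]["name"], answers[0]["arguments"])  # get_weather {'city': 'Berlin'}
```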
## Usage
### Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ermiaazarkhalili/Qwen2.5-14B-Function-Calling-xLAM"

# Load the tokenizer and model (BF16 weights, automatic device placement)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the sum of 2 + 2?"},
]

# Render the chat template and generate a response
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
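Since the model is tuned for function calling, a more representative use is passing tool schemas via the chat template (recent `transformers` versions accept a `tools=` argument to `apply_chat_template`) and parsing the structured calls the model emits. The sketch below skips generation and shows only the parsing step, assuming the Qwen2.5 convention of wrapping each call's JSON in `<tool_call>…</tool_call>` tags; the helper name `extract_tool_calls` is ours:

```python
import json
import re

# Example model output in the Qwen2.5 tool-call style (assumed format:
# a JSON object wrapped in <tool_call>...</tool_call> tags).
raw_output = (
    "I will look that up.\n"
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"city": "Paris"}}\n'
    "</tool_call>"
)

def extract_tool_calls(text: str) -> list[dict]:
    """Pull every <tool_call> JSON payload out of a generated string."""
    pattern = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)
    return [json.loads(m) for m in pattern.findall(text)]

calls = extract_tool_calls(raw_output)
print(calls)  # [{'name': 'get_weather', 'arguments': {'city': 'Paris'}}]
```

Each parsed call can then be dispatched to the matching function and its result fed back to the model as a `tool` role message.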
### Using Pipeline
```python
from transformers import pipeline
generator = pipeline("text-generation", model="ermiaazarkhalili/Qwen2.5-14B-Function-Calling-xLAM", device_map="auto")
messages = [{"role": "user", "content": "Explain the concept of machine learning."}]
output = generator(messages, max_new_tokens=256, return_full_text=False)
print(output[0]["generated_text"])
```
### 4-bit Quantized Inference
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "ermiaazarkhalili/Qwen2.5-14B-Function-Calling-xLAM"

# NF4 4-bit quantization with BF16 compute and double quantization
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",
)
```
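A back-of-envelope estimate shows why 4-bit loading matters for a 14B-parameter model; the figures below count weight storage only and ignore activations, the KV cache, and quantization metadata such as scales:

```python
# Rough weight-memory footprint for a 14B-parameter model.
params = 14e9
bytes_per_param = {"bf16": 2.0, "nf4": 0.5}

for name, b in bytes_per_param.items():
    gib = params * b / 2**30
    print(f"{name}: ~{gib:.1f} GiB")  # bf16: ~26.1 GiB, nf4: ~6.5 GiB
```

In practice, NF4 brings the weights within reach of a single 24 GB consumer GPU, whereas BF16 does not.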
## GGUF Versions
For CPU or mixed CPU/GPU inference, GGUF quantized versions are available at:
[ermiaazarkhalili/Qwen2.5-14B-Function-Calling-xLAM-GGUF](https://huggingface.co/ermiaazarkhalili/Qwen2.5-14B-Function-Calling-xLAM-GGUF)
### Using with Ollama
```bash
ollama pull hf.co/ermiaazarkhalili/Qwen2.5-14B-Function-Calling-xLAM-GGUF:Q4_K_M
ollama run hf.co/ermiaazarkhalili/Qwen2.5-14B-Function-Calling-xLAM-GGUF:Q4_K_M "Hello!"
```
## Limitations
- **Language**: Primarily trained on English data
- **Knowledge Cutoff**: Limited to base model's training data cutoff
- **Hallucinations**: May generate plausible-sounding but incorrect information
- **Context Length**: Fine-tuned with a 2,048-token sequence limit
- **Safety**: Not extensively safety-tuned; use with appropriate guardrails
## Intended Use
### Recommended Uses
- Research on language model fine-tuning
- Educational purposes
- Personal projects
- Prototyping conversational AI
### Out-of-Scope Uses
- Production systems without additional safety measures
- Medical, legal, or financial advice
- Generating harmful or misleading content
## Training Framework
- **TRL**: 0.24.0
- **Transformers**: 4.57.3
- **PyTorch**: 2.9.0
- **Datasets**: 4.3.0
- **PEFT**: 0.18.0
- **BitsAndBytes**: 0.49.0
## Citation
```bibtex
@misc{ermiaazarkhalili_qwen2.5_14b_function_calling_xlam,
  author       = {ermiaazarkhalili},
  title        = {Qwen2.5-14B-Function-Calling-xLAM: Fine-tuned Qwen2.5-14B-Instruct on xlam-function-calling-60k},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ermiaazarkhalili/Qwen2.5-14B-Function-Calling-xLAM}}
}
```
## Acknowledgments
- Base model developers at Qwen
- [Hugging Face TRL Team](https://github.com/huggingface/trl) for the training library
- Dataset creators and contributors
- Compute Canada / DRAC for HPC resources
## Contact
For questions or issues, please open an issue on the model repository.