Initialize project; model provided by the ModelHub XC community
Model: Featherlabs/Aura-7b-GGUF Source: Original Platform
40
.gitattributes
vendored
Normal file
@@ -0,0 +1,40 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
aura-7b-f16.gguf filter=lfs diff=lfs merge=lfs -text
aura-7b-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
aura-7b-q6_k.gguf filter=lfs diff=lfs merge=lfs -text
aura-7b-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
aura-7b-q2_k.gguf filter=lfs diff=lfs merge=lfs -text
154
README.md
Normal file
@@ -0,0 +1,154 @@
---
language:
- en
license: apache-2.0
base_model: Featherlabs/Aura-7b
tags:
- gguf
- qwen2
- agentic
- function-calling
- tool-use
- conversational
- featherlabs
- llama-cpp
- ollama
- lm-studio
pipeline_tag: text-generation
---

<div align="center">

# ⚡ Aura-7b GGUF

### *A small model that punches above its weight — now optimized for local inference*

**Agentic · Tool Use · Function Calling · Reasoning**

[License: Apache-2.0](https://opensource.org/licenses/Apache-2.0)
[Model: Featherlabs/Aura-7b](https://huggingface.co/Featherlabs/Aura-7b)

*Built by [Featherlabs](https://huggingface.co/Featherlabs) · Operated by Owlkun*

</div>

---

## ✨ Overview

This repository contains **GGUF quantized versions** of **[Featherlabs/Aura-7b](https://huggingface.co/Featherlabs/Aura-7b)** — an agentic 7B language model fine-tuned from Qwen2.5-7B-Instruct by [Featherlabs](https://huggingface.co/Featherlabs).

These models are optimized for efficient local execution on consumer hardware using CPU or GPU acceleration. They are fully compatible with [llama.cpp](https://github.com/ggerganov/llama.cpp), [Ollama](https://ollama.com), [LM Studio](https://lmstudio.ai), [Jan](https://jan.ai), and other GGUF-based runtimes.

---

## 📦 Available Quantizations

Choose the file that best matches your system's VRAM/RAM capacity:

| Filename | Size | VRAM Req | Quality | Best For |
|:---------|:----:|:--------:|:-------:|:---------|
| `aura-7b-f16.gguf` | ~15.2 GB | ~16 GB | ⭐⭐⭐⭐⭐ | Maximum quality, high-VRAM systems |
| `aura-7b-q8_0.gguf` | ~8.1 GB | ~10 GB | ⭐⭐⭐⭐⭐ | Near-lossless quality |
| `aura-7b-q6_k.gguf` | ~6.25 GB | ~8 GB | ⭐⭐⭐⭐ | Excellent quality, sweet spot for 8GB GPUs |
| `aura-7b-q4_k_m.gguf` | ~4.68 GB | ~6 GB | ⭐⭐⭐⭐ | 🏆 **Recommended for most users** (MacBook Air, RTX 3060/4060) |
| `aura-7b-q2_k.gguf` | ~3.02 GB | ~4 GB | ⭐⭐⭐ | Minimum RAM / CPU-only execution |

> 💡 **Tip:** If you have an 8GB GPU, `Q6_K` will fit while offloading all layers. If you have 6GB or less, use `Q4_K_M`.
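The rule of thumb behind the table can be sketched in a few lines of Python. `pick_quant` is a hypothetical helper, and the ~1 GB overhead for the KV cache and runtime buffers is a rough assumption rather than a measured figure:

```python
# File sizes (GB) from the table above, ordered best quality -> smallest.
QUANT_SIZES_GB = {
    "f16": 15.2,
    "q8_0": 8.1,
    "q6_k": 6.25,
    "q4_k_m": 4.68,
    "q2_k": 3.02,
}

def pick_quant(mem_gb, overhead_gb=1.0):
    """Return the highest-quality quant whose file plus overhead fits in mem_gb.

    overhead_gb is an assumed allowance for KV cache and runtime buffers.
    """
    for name, size_gb in QUANT_SIZES_GB.items():  # dict preserves insertion order
        if size_gb + overhead_gb <= mem_gb:
            return name
    return None  # nothing fits; run CPU-only with system RAM instead

print(pick_quant(8.0))  # -> q6_k
print(pick_quant(6.0))  # -> q4_k_m
```

This matches the tip above: an 8 GB GPU lands on `Q6_K`, 6 GB lands on `Q4_K_M`.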

---

## 🚀 Quick Start / Usage

### 🦙 llama.cpp

The basic command for an interactive terminal chat:

```bash
./llama-cli \
  -m aura-7b-q4_k_m.gguf \
  -p "You are Aura, a helpful agentic AI assistant created by Featherlabs." \
  --ctx-size 8192 \
  -b 512 \
  -n -1 \
  -i --color
```

*(Add `-ngl 99` to offload all layers to your GPU if supported)*
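llama.cpp also ships `llama-server`, which exposes an OpenAI-compatible HTTP API. A minimal sketch, assuming a recent llama.cpp build (flag names and the `/v1/chat/completions` route may differ on older versions):

```shell
# Serve the model on localhost:8080 (add -ngl 99 for full GPU offload).
./llama-server -m aura-7b-q4_k_m.gguf --ctx-size 8192 --port 8080

# Then query it from another terminal with any OpenAI-style client:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "system", "content": "You are Aura, a helpful agentic AI assistant created by Featherlabs."},
          {"role": "user", "content": "Hello!"}
        ]
      }'
```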

### 🦙 Ollama

Creating a custom Ollama model is the easiest way to serve the API locally:

1. Create a file named `Modelfile` in the same directory as the GGUF:

```dockerfile
FROM ./aura-7b-q4_k_m.gguf

# Set the system prompt
SYSTEM "You are Aura, a helpful agentic AI assistant created by Featherlabs."

# Set standard parameters
PARAMETER num_ctx 8192
PARAMETER temperature 0.7
PARAMETER top_p 0.9

# The chat template is usually auto-detected for Qwen2, but you can explicitly set it if needed
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
{{ .Response }}<|im_end|>
"""
```

2. Build and run:

```bash
ollama create aura-7b -f Modelfile
ollama run aura-7b
```
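If you drive a GGUF runtime through a raw completion endpoint instead of a chat API, you have to apply the Qwen2/ChatML template yourself. A minimal sketch of the single-turn formatting that the `TEMPLATE` block above describes (`build_prompt` is a hypothetical helper, not part of any library):

```python
def build_prompt(system: str, user: str) -> str:
    """Render a single-turn Qwen2/ChatML prompt, mirroring the Modelfile TEMPLATE."""
    parts = []
    if system:
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user}<|im_end|>\n")
    # The prompt ends with an open assistant turn; the model generates from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_prompt(
    "You are Aura, a helpful agentic AI assistant created by Featherlabs.",
    "Hello!",
)
print(prompt)
```

When sampling against such a prompt, stop generation on the `<|im_end|>` token so the model does not run into the next turn.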

### 🖥️ LM Studio

1. Open LM Studio and search for `Featherlabs/Aura-7b-GGUF` (or drag and drop the `.gguf` file).
2. Download your preferred quantization (e.g., `Q4_K_M`).
3. Go to the Chat tab and load the model.
4. From the right panel, select the **Qwen2** chat template (or set the system prompt manually).
5. Start chatting!

---

## 📊 Model Details

| Property | Value |
|---|---|
| **Base Model** | [Featherlabs/Aura-7b](https://huggingface.co/Featherlabs/Aura-7b) |
| **Architecture** | Qwen2 |
| **Parameters** | ~7.6B |
| **Context length** | 8192 tokens |
| **Quantization tool** | `llama.cpp` |
| **Format** | GGUF (v3) |
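The file sizes in this repo are consistent with a ~7.6B-parameter model. A quick sanity check, ignoring the small GGUF metadata overhead (F16 stores 2 bytes per weight, so the parameter count can be inferred from the F16 file size):

```python
F16_BYTES = 15_237_853_056      # size of aura-7b-f16.gguf (from the LFS pointer)
Q4_K_M_BYTES = 4_683_073_408    # size of aura-7b-q4_k_m.gguf

# F16 stores 2 bytes per weight, so the parameter count is roughly:
params = F16_BYTES / 2
print(f"~{params / 1e9:.2f}B parameters")  # ~7.62B

# Effective bits per weight of the Q4_K_M file (slightly above 4 bits,
# since the K-quant mix keeps some tensors at higher precision):
bpw = Q4_K_M_BYTES * 8 / params
print(f"Q4_K_M is about {bpw:.2f} bits/weight")
```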

---

## 👑 Original Model (Safetensors)

If you need the full-precision `BF16` weights for fine-tuning, training, or deployment in production clusters (vLLM, TGI, SGLang):

👉 **[Featherlabs/Aura-7b](https://huggingface.co/Featherlabs/Aura-7b)**

---

## 📜 License

Apache 2.0 — consistent with [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct).

---

<div align="center">

**Built with ❤️ by [Featherlabs](https://huggingface.co/Featherlabs)**

*Operated by Owlkun*

</div>
3
aura-7b-f16.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aac8a8b66021e011750035c7616437519eb0b45d7c6139c1437969cc5ba69bc1
size 15237853056
3
aura-7b-q2_k.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9cdab10e6490643c6b485010f6ad3896a7e535f6c3ebbad0aa9b9089e29a23b6
size 3015939968
3
aura-7b-q4_k_m.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c225a3a48b6a7f97ef2621f8394a7d87ff4482a290114373223b237fbec8d3a2
size 4683073408
3
aura-7b-q6_k.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6f4f785362a580ab4d237f64a03faefbe656bfa1cd4c9c49ad4a093dda87f02b
size 6254198656
3
aura-7b-q8_0.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:efc37c84bf62ff3f588071be3033f6aeb88842c71d6c46d51c3c3082fa7db051
size 8098525056