Initialize project; model provided by the ModelHub XC community
Model: Featherlabs/Aura-7b-GGUF Source: Original Platform
40
.gitattributes
vendored
Normal file
@@ -0,0 +1,40 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
aura-7b-f16.gguf filter=lfs diff=lfs merge=lfs -text
aura-7b-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
aura-7b-q6_k.gguf filter=lfs diff=lfs merge=lfs -text
aura-7b-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
aura-7b-q2_k.gguf filter=lfs diff=lfs merge=lfs -text
154
README.md
Normal file
@@ -0,0 +1,154 @@
---
language:
- en
license: apache-2.0
base_model: Featherlabs/Aura-7b
tags:
- gguf
- qwen2
- agentic
- function-calling
- tool-use
- conversational
- featherlabs
- llama-cpp
- ollama
- lm-studio
pipeline_tag: text-generation
---

<div align="center">

# ⚡ Aura-7b GGUF

### *A small model that punches above its weight — now optimized for local inference*

**Agentic · Tool Use · Function Calling · Reasoning**

[License: Apache-2.0](https://opensource.org/licenses/Apache-2.0)
[Model: Featherlabs/Aura-7b](https://huggingface.co/Featherlabs/Aura-7b)

*Built by [Featherlabs](https://huggingface.co/Featherlabs) · Operated by Owlkun*

</div>

---

## ✨ Overview

This repository contains **GGUF quantized versions** of **[Featherlabs/Aura-7b](https://huggingface.co/Featherlabs/Aura-7b)** — an agentic 7B language model fine-tuned from Qwen2.5-7B-Instruct by [Featherlabs](https://huggingface.co/Featherlabs).

These models are optimized for efficient local execution on consumer hardware using CPU or GPU acceleration. They are fully compatible with [llama.cpp](https://github.com/ggerganov/llama.cpp), [Ollama](https://ollama.com), [LM Studio](https://lmstudio.ai), [Jan](https://jan.ai), and other GGUF-based runtimes.

---

## 📦 Available Quantizations

Choose the file that best matches your system's VRAM/RAM capacity:

| Filename | Size | VRAM Req | Quality | Best For |
|:---------|:----:|:--------:|:-------:|:---------|
| `aura-7b-f16.gguf` | ~15.2 GB | ~16 GB | ⭐⭐⭐⭐⭐ | Maximum quality, high-VRAM systems |
| `aura-7b-q8_0.gguf` | ~8.1 GB | ~10 GB | ⭐⭐⭐⭐⭐ | Near-lossless quality |
| `aura-7b-q6_k.gguf` | ~6.25 GB | ~8 GB | ⭐⭐⭐⭐ | Excellent quality, sweet spot for 8GB GPUs |
| `aura-7b-q4_k_m.gguf` | ~4.68 GB | ~6 GB | ⭐⭐⭐⭐ | 🏆 **Recommended for most users** (MacBook Air, RTX 3060/4060) |
| `aura-7b-q2_k.gguf` | ~3.02 GB | ~4 GB | ⭐⭐⭐ | Minimum RAM / CPU-only execution |

> 💡 **Tip:** If you have an 8GB GPU, `Q6_K` will fit while offloading all layers. If you have 6GB or less, use `Q4_K_M`.
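The rule of thumb behind the table can be sketched in a few lines of Python. `pick_quant` is a hypothetical helper, and the ~1 GB overhead for the KV cache and runtime buffers is a rough assumption rather than a measured figure:

```python
# File sizes (GB) from the table above, ordered best quality -> smallest.
QUANT_SIZES_GB = {
    "f16": 15.2,
    "q8_0": 8.1,
    "q6_k": 6.25,
    "q4_k_m": 4.68,
    "q2_k": 3.02,
}

def pick_quant(mem_gb, overhead_gb=1.0):
    """Return the highest-quality quant whose file plus overhead fits in mem_gb.

    overhead_gb is an assumed allowance for KV cache and runtime buffers.
    """
    for name, size_gb in QUANT_SIZES_GB.items():  # dict preserves insertion order
        if size_gb + overhead_gb <= mem_gb:
            return name
    return None  # nothing fits; run CPU-only with system RAM instead

print(pick_quant(8.0))  # -> q6_k
print(pick_quant(6.0))  # -> q4_k_m
```

This matches the tip above: an 8 GB GPU lands on `Q6_K`, 6 GB lands on `Q4_K_M`.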

---

## 🚀 Quick Start / Usage

### 🦙 llama.cpp

The basic command for an interactive terminal chat:

```bash
./llama-cli \
  -m aura-7b-q4_k_m.gguf \
  -p "You are Aura, a helpful agentic AI assistant created by Featherlabs." \
  --ctx-size 8192 \
  -b 512 \
  -n -1 \
  -i --color
```

*(Add `-ngl 99` to offload all layers to your GPU if supported)*
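llama.cpp also ships `llama-server`, which exposes an OpenAI-compatible HTTP API. A minimal sketch, assuming a recent llama.cpp build (flag names and the `/v1/chat/completions` route may differ on older versions):

```shell
# Serve the model on localhost:8080 (add -ngl 99 for full GPU offload).
./llama-server -m aura-7b-q4_k_m.gguf --ctx-size 8192 --port 8080

# Then query it from another terminal with any OpenAI-style client:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "system", "content": "You are Aura, a helpful agentic AI assistant created by Featherlabs."},
          {"role": "user", "content": "Hello!"}
        ]
      }'
```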

### 🦙 Ollama

Creating a custom Ollama model is the easiest way to serve the API locally:

1. Create a file named `Modelfile` in the same directory as the GGUF:

```dockerfile
FROM ./aura-7b-q4_k_m.gguf

# Set the system prompt
SYSTEM "You are Aura, a helpful agentic AI assistant created by Featherlabs."

# Set standard parameters
PARAMETER num_ctx 8192
PARAMETER temperature 0.7
PARAMETER top_p 0.9

# The chat template is usually auto-detected for Qwen2, but you can explicitly set it if needed
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
{{ .Response }}<|im_end|>
"""
```

2. Build and run:

```bash
ollama create aura-7b -f Modelfile
ollama run aura-7b
```
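If you drive a GGUF runtime through a raw completion endpoint instead of a chat API, you have to apply the Qwen2/ChatML template yourself. A minimal sketch of the single-turn formatting that the `TEMPLATE` block above describes (`build_prompt` is a hypothetical helper, not part of any library):

```python
def build_prompt(system: str, user: str) -> str:
    """Render a single-turn Qwen2/ChatML prompt, mirroring the Modelfile TEMPLATE."""
    parts = []
    if system:
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user}<|im_end|>\n")
    # The prompt ends with an open assistant turn; the model generates from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_prompt(
    "You are Aura, a helpful agentic AI assistant created by Featherlabs.",
    "Hello!",
)
print(prompt)
```

When sampling against such a prompt, stop generation on the `<|im_end|>` token so the model does not run into the next turn.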

### 🖥️ LM Studio

1. Open LM Studio and search for `Featherlabs/Aura-7b-GGUF` (or drag and drop the `.gguf` file).
2. Download your preferred quantization (e.g., `Q4_K_M`).
3. Go to the Chat tab and load the model.
4. From the right panel, select the **Qwen2** chat template (or set the system prompt manually).
5. Start chatting!

---

## 📊 Model Details

| Property | Value |
|---|---|
| **Base Model** | [Featherlabs/Aura-7b](https://huggingface.co/Featherlabs/Aura-7b) |
| **Architecture** | Qwen2 |
| **Parameters** | ~7.6B |
| **Context length** | 8192 tokens |
| **Quantization tool** | `llama.cpp` |
| **Format** | GGUF (v3) |
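The file sizes in this repo are consistent with a ~7.6B-parameter model. A quick sanity check, ignoring the small GGUF metadata overhead (F16 stores 2 bytes per weight, so the parameter count can be inferred from the F16 file size):

```python
F16_BYTES = 15_237_853_056      # size of aura-7b-f16.gguf (from the LFS pointer)
Q4_K_M_BYTES = 4_683_073_408    # size of aura-7b-q4_k_m.gguf

# F16 stores 2 bytes per weight, so the parameter count is roughly:
params = F16_BYTES / 2
print(f"~{params / 1e9:.2f}B parameters")  # ~7.62B

# Effective bits per weight of the Q4_K_M file (slightly above 4 bits,
# since the K-quant mix keeps some tensors at higher precision):
bpw = Q4_K_M_BYTES * 8 / params
print(f"Q4_K_M is about {bpw:.2f} bits/weight")
```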

---

## 👑 Original Model (Safetensors)

If you need the full-precision `BF16` weights for fine-tuning, training, or deployment in production clusters (vLLM, TGI, SGLang):

👉 **[Featherlabs/Aura-7b](https://huggingface.co/Featherlabs/Aura-7b)**

---

## 📜 License

Apache 2.0 — consistent with [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct).

---

<div align="center">

**Built with ❤️ by [Featherlabs](https://huggingface.co/Featherlabs)**

*Operated by Owlkun*

</div>
3
aura-7b-f16.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aac8a8b66021e011750035c7616437519eb0b45d7c6139c1437969cc5ba69bc1
size 15237853056
3
aura-7b-q2_k.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9cdab10e6490643c6b485010f6ad3896a7e535f6c3ebbad0aa9b9089e29a23b6
size 3015939968
3
aura-7b-q4_k_m.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c225a3a48b6a7f97ef2621f8394a7d87ff4482a290114373223b237fbec8d3a2
size 4683073408
3
aura-7b-q6_k.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6f4f785362a580ab4d237f64a03faefbe656bfa1cd4c9c49ad4a093dda87f02b
size 6254198656
3
aura-7b-q8_0.gguf
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:efc37c84bf62ff3f588071be3033f6aeb88842c71d6c46d51c3c3082fa7db051
size 8098525056