commit 908f12d12b2e5415479398f39e202cf70491f2e4
Author: ModelHub XC
Date:   Wed Apr 22 02:44:41 2026 +0800

    Initialize project; model provided by the ModelHub XC community

    Model: tokenaii/Horus-1.0-4B-GGUF
    Source: Original Platform

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..725feff
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,44 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+Horus-1.0-4B-F16.gguf filter=lfs diff=lfs merge=lfs -text
+media/main.png filter=lfs diff=lfs merge=lfs -text
+Horus-1.0-4B-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+Horus-1.0-4B-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+Horus-1.0-4B-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+Horus-1.0-4B-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+media/1.png filter=lfs diff=lfs merge=lfs -text
+media/2.png filter=lfs diff=lfs merge=lfs -text
+media/3.png filter=lfs diff=lfs merge=lfs -text
diff --git a/Horus-1.0-4B-F16.gguf b/Horus-1.0-4B-F16.gguf
new file mode 100644
index 0000000..4e33625
--- /dev/null
+++ b/Horus-1.0-4B-F16.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f7fcaf86a54ec1e7f5e9bd0bddcffda42a48e38252df42fe2e3a300a00be303d
+size 9033728832
diff --git a/Horus-1.0-4B-Q4_K_M.gguf b/Horus-1.0-4B-Q4_K_M.gguf
new file mode 100644
index 0000000..bbdfdc0
--- /dev/null
+++ b/Horus-1.0-4B-Q4_K_M.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cde6de087cfd697341a57117fba72add30c340c8b1524353ef00777dbcac5a47
+size 2778282816
diff --git a/Horus-1.0-4B-Q5_K_M.gguf b/Horus-1.0-4B-Q5_K_M.gguf
new file mode 100644
index 0000000..798e78f
--- /dev/null
+++ b/Horus-1.0-4B-Q5_K_M.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:681d2d481795d6519961b785401fbc3d9583480a20734b7e7e1596a4507986fd
+size 3230186304
diff --git a/Horus-1.0-4B-Q6_K.gguf b/Horus-1.0-4B-Q6_K.gguf
new file mode 100644
index 0000000..ec0ad8d
--- /dev/null
+++ b/Horus-1.0-4B-Q6_K.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:91a17404bf62aa9fb4b28ce957a8248ff453fbb561c424a9186f68312ebcdcf7
+size 3710333760
diff --git a/Horus-1.0-4B-Q8_0.gguf b/Horus-1.0-4B-Q8_0.gguf
new file mode 100644
index 0000000..198e734
--- /dev/null
+++ b/Horus-1.0-4B-Q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a4087c49b01e315e60716515646d342b26a51e2bf669e8f96a644d3577dcf43a
+size 4803216192
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..62ce8f3
--- /dev/null
+++ b/README.md
@@ -0,0 +1,222 @@
+---
+license: mit
+language:
+- ar
+- en
+- fr
+- es
+- de
+- it
+- pt
+- tr
+- ur
+- hi
+tags:
+- llama
+- llm
+- text-generation
+- multilingual
+- causal-lm
+- arabic
+- gguf
+- quantized
+- horus
+- tokenai
+- neuralnode
+- tts
+- voice
+base_model: tokenaii/horus
+widget:
+  - text: "### User:\nWhat is the capital of Egypt?\n\n### Assistant:\nThe capital of Egypt is Cairo."
+  - text: "### User:\nمن هو أول رئيس لمصر؟\n\n### Assistant:\nأول رئيس لمصر بعد ثورة 1952 هو محمد نجيب."
+  - text: "### User:\nHello Horus!\n\n### Assistant:\nHello! I'm Horus, an AI assistant developed by TokenAI. How can I help you today?"
+inference: true
+---
+
+# Horus-1.0-4B-GGUF
+
+![Horus Model](media/main.png)
+
+GGUF quantized versions of Horus-1.0-4B by TokenAI.
+
+## Base Model
+
+- **Source:** [tokenaii/horus](https://huggingface.co/tokenaii/horus)
+- **Original Model:** Horus-1.0-4B (4B parameters)
+- **Developer:** [Assem Sabry](https://assem.cloud/) & TokenAI
+- **Organization:** [TokenAI](https://tokenai.cloud/)
+- **Release Date:** April 2026
+- **License:** MIT
+
+## About TokenAI
+
+**TokenAI** is an AI startup founded by [Assem Sabry](https://assem.cloud/) and headquartered in Egypt.
+
+### Mission
+
+TokenAI aims to deliver the strongest language models in the world and in the Arab world through the Horus family of models. The startup bridges the gap between cutting-edge AI capabilities and regional cultural contexts, starting with the Arab world.
+
+### The Horus Family
+
+Horus-1.0-4B is the **first model in the Horus family**. It is the starting point of TokenAI's effort to build a comprehensive suite of AI models serving the Arab region.
+
+# Horus-1.0-4B-GGUF
+
+GGUF quantized versions of Horus-1.0-4B - A 4B parameter multilingual language model optimized for Arabic and English.
+
+## Model Variants & Hardware Requirements
+
+| Format | File Size | Min RAM (CPU) | Min VRAM (GPU) | Quality | Best For |
+|--------|-----------|---------------|----------------|---------|----------|
+| **F16** | 9.03 GB | 12 GB | 10 GB | Maximum quality | High-end GPUs (RTX 3090, A100) |
+| **Q8_0** | 4.8 GB | 6 GB | 5 GB | Near-lossless | RTX 3060 12GB, RTX 4060 |
+| **Q6_K** | 3.71 GB | 5 GB | 4 GB | Excellent | RTX 3060, RTX 4060 Laptop |
+| **Q5_K_M** | 3.23 GB | 4 GB | 3.5 GB | Very Good | GTX 1650, RTX 3050 |
+| **Q4_K_M** | 2.78 GB | 3.5 GB | 3 GB | Good | Entry-level GPUs, CPU-only |
+
+### Detailed Hardware Requirements
+
+#### F16 (FP16 - Half Precision, Unquantized)
+- **File**: `Horus-1.0-4B-F16.gguf` (9.03 GB)
+- **Min System RAM**: 12 GB
+- **Min VRAM**: 10 GB
+- **Recommended**: RTX 3090, RTX 4090, A100, A6000
+- **Use Case**: Maximum quality, research, fine-tuning reference
+
+#### Q8_0 (8-bit Quantization)
+- **File**: `Horus-1.0-4B-Q8_0.gguf` (4.8 GB)
+- **Min System RAM**: 6 GB
+- **Min VRAM**: 5 GB
+- **Recommended**: RTX 3060 12GB, RTX 4060, RTX 4070
+- **Use Case**: Near-lossless quality at roughly half the memory of F16
+
+#### Q6_K (6-bit K-Quant)
+- **File**: `Horus-1.0-4B-Q6_K.gguf` (3.71 GB)
+- **Min System RAM**: 5 GB
+- **Min VRAM**: 4 GB
+- **Recommended**: RTX 3060, RTX 4060 Laptop, GTX 1080 Ti
+- **Use Case**: Excellent quality for most applications
+
+#### Q5_K_M (5-bit K-Quant Medium)
+- **File**: `Horus-1.0-4B-Q5_K_M.gguf` (3.23 GB)
+- **Min System RAM**: 4 GB
+- **Min VRAM**: 3.5 GB
+- **Recommended**: GTX 1650 Super, RTX 3050, RTX 3050 Ti
+- **Use Case**: Balanced quality and performance
+
+#### Q4_K_M (4-bit K-Quant Medium)
+- **File**: `Horus-1.0-4B-Q4_K_M.gguf` (2.78 GB)
+- **Min System RAM**: 3.5 GB
+- **Min VRAM**: 3 GB
+- **Recommended**: GTX 1060 6GB, GTX 1650, Intel Arc A380
+- **Use Case**: Maximum compression, edge devices, CPU inference
+
+## Quick Start
+
+### Using NeuralNode (Recommended)
+
+The easiest way to use Horus GGUF models is with the NeuralNode framework:
+
+```python
+import neuralnode as nn
+
+MODEL_ID = "tokenaii/Hours-1.0-4B-GGUF/Horus-1.0-4B-Q6_K.gguf"
+DEVICE = "cpu"  # Change to "cuda" for GPU acceleration
+
+# Download and load
+model = nn.HorusModel(MODEL_ID, device=DEVICE).load()
+
+# Use immediately
+response = model.chat([{"role": "user", "content": "hi horus im emy"}])
+print(response.content)
+```
+
+### Using llama-cpp-python
+
+For direct llama.cpp integration:
+
+```python
+from llama_cpp import Llama
+
+llm = Llama(
+    model_path="Horus-1.0-4B-Q4_K_M.gguf",
+    n_ctx=4096
+)
+
+output = llm("Hello, how are you?", max_tokens=256)
+print(output['choices'][0]['text'])
+```
+
+## Voice Interface with Replica TTS
+
+Add natural voice output to your Horus GGUF model with Replica TTS:
+
+```python
+import neuralnode as nn
+
+voice_id = "replica-aria-language{en-us}"
+
+MODEL_ID = "tokenaii/Hours-1.0-4B-GGUF/Horus-1.0-4B-F16.gguf"
+DEVICE = "cuda"
+
+# Load model with Replica TTS
+model = nn.HorusModel(
+    MODEL_ID,
+    tts_engine="replica_tts",
+    voice=voice_id,
+    device=DEVICE
+).load()
+
+# Chat and get spoken response
+response = model.chat([{"role": "user", "content": "Hello!"}])
+print(response.content)
+response.play_audio()  # Plays the TTS audio
+```
+
+### Browse All Voices
+
+```python
+import neuralnode as nn
+
+voices = nn.replica_voice_list()
+for voice in voices:
+    print(voice)
+```
+
+---
+
+## Benchmark Results
+
+Below are visual comparisons of Horus-1.0-4B against leading models.
+
+### General Knowledge & Reasoning
+![General Benchmarks](media/1.png)
+
+### Arabic Language & Cultural Benchmarks
+![Arabic Benchmarks](media/2.png)
+
+### Coding & Tool Use Benchmarks
+![Coding Benchmarks](media/3.png)
+
+---
+
+## Model Capabilities
+
+- **Multilingual:** Supports 10+ languages including Arabic, English, French, Spanish, German, Italian, Portuguese, Turkish, Urdu, Hindi
+- **Identity Recognition:** Knows itself as Horus from TokenAI
+- **Reasoning:** Chain-of-thought capabilities
+- **Context Length:** Up to 4096 tokens
+- **Voice Output:** Replica TTS integration for natural speech
+
+---
+
+## Links
+
+- **Base Model:** https://huggingface.co/tokenaii/horus
+- **TokenAI Website:** https://tokenai.cloud/
+- **Developer:** https://assem.cloud/
+- **GitHub:** https://github.com/tokenaii/horus-1.0
+
+---
+
+**Note:** Quantized using llama.cpp for efficient inference. GGUF versions are optimized for local deployment with minimal resource requirements.
diff --git a/media/1.png b/media/1.png
new file mode 100644
index 0000000..5265556
--- /dev/null
+++ b/media/1.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9d29a11ba3de89c559fcbe4e5f9d5ed35781a50b689de45d29d48137fc723bfb
+size 496985
diff --git a/media/2.png b/media/2.png
new file mode 100644
index 0000000..900027e
--- /dev/null
+++ b/media/2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1e7b03f61fc258b17701f9f117eea42e0b5281bf48663900250bff00ffa9908d
+size 512627
diff --git a/media/3.png b/media/3.png
new file mode 100644
index 0000000..21cb971
--- /dev/null
+++ b/media/3.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8fbea089f6acb95c4aa4c1cebbf8112855b97d45b56d94aa80de0f1736778caf
+size 508459
diff --git a/media/main.png b/media/main.png
new file mode 100644
index 0000000..8763229
--- /dev/null
+++ b/media/main.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:224d94528fcf373083cafea19a00fc193ee32857a69026d3bc5d9a8d9f65c183
+size 1238177
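
Note on the LFS pointers: the `.gguf` and `.png` entries in this commit are Git LFS pointer files (a `version` line, an `oid sha256:<digest>` line, and a `size` line in bytes), not the binaries themselves. The following is a minimal sketch, not part of the commit, of how a downloaded file could be checked against its pointer. The helper names `parse_lfs_pointer` and `verify_lfs_object` are hypothetical and chosen for illustration; only the Python standard library is used.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_lfs_object(pointer_text: str, path: Path, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` matches the pointer's size and SHA-256 digest."""
    pointer = parse_lfs_pointer(pointer_text)
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported oid algorithm: {pointer['algo']}")
    # Cheap check first: a size mismatch means the download is incomplete or wrong.
    if path.stat().st_size != pointer["size"]:
        return False
    # Hash in chunks so multi-gigabyte GGUF files are never loaded into memory at once.
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    # Pointer text copied from the media/3.png entry in this commit.
    example = (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:8fbea089f6acb95c4aa4c1cebbf8112855b97d45b56d94aa80de0f1736778caf\n"
        "size 508459\n"
    )
    print(parse_lfs_pointer(example))
```

To check a real download, one would pass the pointer text from the diff together with the local path, for example `verify_lfs_object(pointer_text, Path("Horus-1.0-4B-Q4_K_M.gguf"))`. Comparing the size before hashing avoids reading a truncated multi-gigabyte file in full.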