commit a2620e0abd609578060f5bc51e56eb1bee485678
Author: ModelHub XC
Date:   Wed May 6 09:22:06 2026 +0800

    Initialize project; model provided by the ModelHub XC community
    Model: ReXeeD/Luminus-1.5B-Roleplay-GGUF
    Source: Original Platform

diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..47e9fac
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,41 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+Luminus-1.5B-Roleplay-F16.gguf filter=lfs diff=lfs merge=lfs -text
+Luminus-1.5B-Roleplay-Q3_K_M-imat.gguf filter=lfs diff=lfs merge=lfs -text
+Luminus-1.5B-Roleplay-Q4_K_M-imat.gguf filter=lfs diff=lfs merge=lfs -text
+Luminus-1.5B-Roleplay-Q5_K_M-imat.gguf filter=lfs diff=lfs merge=lfs -text
+Luminus-1.5B-Roleplay-Q6_K-imat.gguf filter=lfs diff=lfs merge=lfs -text
+Luminus-1.5B-Roleplay-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
diff --git a/Luminus-1.5B-Roleplay-F16.gguf b/Luminus-1.5B-Roleplay-F16.gguf
new file mode 100644
index 0000000..653f81a
--- /dev/null
+++ b/Luminus-1.5B-Roleplay-F16.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d8528e9a1fca03029c373bd93c0766f68f43e7468e9c1ee8735022c692c4c9f1
+size 3093666304
diff --git a/Luminus-1.5B-Roleplay-Q3_K_M-imat.gguf b/Luminus-1.5B-Roleplay-Q3_K_M-imat.gguf
new file mode 100644
index 0000000..42e451c
--- /dev/null
+++ b/Luminus-1.5B-Roleplay-Q3_K_M-imat.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4f5f364aaf19ca7a2900fc0f576c17bcc4a5165f77e2374f8b517677f023a509
+size 824175616
diff --git a/Luminus-1.5B-Roleplay-Q4_K_M-imat.gguf b/Luminus-1.5B-Roleplay-Q4_K_M-imat.gguf
new file mode 100644
index 0000000..0ecf53c
--- /dev/null
+++ b/Luminus-1.5B-Roleplay-Q4_K_M-imat.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2515d2ad9437bbfccaa1b28d485a700404ba3e8d121923f11cd8c62c2b9157ec
+size 986045440
diff --git a/Luminus-1.5B-Roleplay-Q5_K_M-imat.gguf b/Luminus-1.5B-Roleplay-Q5_K_M-imat.gguf
new file mode 100644
index 0000000..c3a9a0d
--- /dev/null
+++ b/Luminus-1.5B-Roleplay-Q5_K_M-imat.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:47e14ba20505f59048a5e796e88c1f34b3c27b08b85bbdaa58b1ea38c27a6a21
+size 1125047296
diff --git a/Luminus-1.5B-Roleplay-Q6_K-imat.gguf b/Luminus-1.5B-Roleplay-Q6_K-imat.gguf
new file mode 100644
index 0000000..b3a10d3
--- /dev/null
+++ b/Luminus-1.5B-Roleplay-Q6_K-imat.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ed41e7f4e1253999c577a50da29650156600429dfe3fc3e4d1cc8354cec7b393
+size 1272736768
diff --git a/Luminus-1.5B-Roleplay-Q8_0.gguf b/Luminus-1.5B-Roleplay-Q8_0.gguf
new file mode 100644
index 0000000..fb3dea8
--- /dev/null
+++ b/Luminus-1.5B-Roleplay-Q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b70f5a4b2fd55d4852f142aa630c8bcb5f69fe1af4e94bcafc3ee2695198d3e4
+size 1646569984
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..3841acf
--- /dev/null
+++ b/README.md
@@ -0,0 +1,73 @@
+---
+language:
+- en
+license: apache-2.0
+pipeline_tag: text-generation
+tags:
+- gguf
+- roleplay
+- chat
+- unsloth
+- imatrix
+- dpo
+- qwen
+
+library_name: transformers
+base_model: ReXeeD/Luminus-1.5B-Roleplay
+---
+
+# Luminus-1.5B-128K (GGUF & SOTA Imatrix)
+
+This is the GGUF repository for **Luminus-1.5B-128K**, a highly optimized 1.5B-parameter model designed for immersive roleplay, character consistency, and Chain-of-Thought (CoT) reasoning.
+
+For the original, unquantized `.safetensors` weights and the detailed training methodology, please visit the [main repository](https://huggingface.co/ReXeeD/Luminus-1.5B-Roleplay).
+
+## 🧠 State-of-the-Art Calibration (Dynamic Imatrix)
+Small models (under 3B parameters) are notoriously fragile and often lose their reasoning capabilities when compressed.
+
+To address this, the quantized models in this repository (tagged with `-imat`) were explicitly calibrated using **Unsloth's Dynamic 2.0 KL-Divergence (KLD) quantization**. Instead of using generic Wikipedia text for calibration, these models were calibrated against the same high-quality Chain-of-Thought (CoT) and roleplay dataset used during training.
+
+This ensures that the specific neural pathways responsible for character logic, formatting, and `<think>` blocks are heavily protected, so the quantized model retains its intelligence and narrative depth even at 4-bit and 5-bit sizes.
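As a sanity check on the quantization levels, the exact byte counts recorded in the LFS pointers above can be converted into approximate effective bits per weight. This is an illustrative back-of-the-envelope sketch: it derives the parameter count from the F16 file (2 bytes per weight) and ignores the small GGUF metadata/tokenizer overhead, so the figures are slightly inflated.

```python
# Effective bits per weight for each quant, derived from the exact
# file sizes recorded in the Git LFS pointers above (bytes).
sizes = {
    "F16":    3093666304,
    "Q8_0":   1646569984,
    "Q6_K":   1272736768,
    "Q5_K_M": 1125047296,
    "Q4_K_M":  986045440,
    "Q3_K_M":  824175616,
}

# The F16 master stores 2 bytes per parameter, which fixes the
# parameter count (ignoring GGUF header/metadata overhead).
n_params = sizes["F16"] // 2  # ~1.55B parameters

bpw = {name: size * 8 / n_params for name, size in sizes.items()}
for name, bits in bpw.items():
    print(f"{name:7s} ~{bits:5.2f} bits/weight")
```

This yields roughly 16.0, 8.5, 6.6, 5.8, 5.1, and 4.3 bits per weight respectively, which is consistent with the nominal "16-bit" through "3-bit" labels (K-quants store scales and some higher-precision tensors, so they land above their nominal bit width).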
+
+## 💾 Available Quantizations
+
+| File Name | Bitrate | Size | Quality | Recommendation |
+| :--- | :---: | :---: | :--- | :--- |
+| `Luminus-1.5B-Roleplay-F16.gguf` | 16-bit | ~3.0 GB | 100% | Uncompressed master. Use if you have 4 GB+ VRAM. |
+| `Luminus-1.5B-Roleplay-Q8_0.gguf` | 8-bit | ~1.6 GB | 99.9% | Near-perfect retention. |
+| `Luminus-1.5B-Roleplay-Q6_K-imat.gguf` | 6-bit | ~1.3 GB | 99.0% | **Best balance** of size and logic. |
+| `Luminus-1.5B-Roleplay-Q5_K_M-imat.gguf` | 5-bit | ~1.1 GB | 98.0% | **Highly recommended** for average hardware. |
+| `Luminus-1.5B-Roleplay-Q4_K_M-imat.gguf` | 4-bit | ~0.9 GB | 95.0% | Standard use. |
+| `Luminus-1.5B-Roleplay-Q3_K_M-imat.gguf` | 3-bit | ~0.7 GB | 85.0% | Only for extremely constrained hardware, such as phones. |
+
+*Note: F16 and Q8_0 do not carry the `-imat` tag because their compression is light enough not to require importance-matrix tracking.*
+
+## ⚙️ How to Use
+
+These files are fully compatible with local frontends such as **LM Studio**, **KoboldCPP**, **Ollama**, and **text-generation-webui**.
+
+Because the model is so small, even the F16 or Q8_0 versions fit entirely into the VRAM of budget GPUs (such as a 4 GB RTX 3050), running quickly while leaving plenty of room for system overhead.
+
+### Recommended System Prompt
+Luminus is heavily trained to produce `<think>` blocks before acting. Using the following system prompt yields the best results and ensures the model formats its thoughts correctly:
+
+```text
+You are a realistic, character-driven roleplay engine. You are roleplaying as {{char}}. Write strictly in third-person limited perspective.
+
+CORE RULES:
+- BOUNDARIES: NEVER speak, think, or generate actions for {{user}}.
+- HISTORY & CONTEXT: Your reactions must logically follow past messages. Stay strictly in the present moment.
+- PACING & DIALOGUE: Keep it slow-burn and grounded. Keep dialogue concise.
+- FORMATTING: Strictly follow the thought-process format below, then write a short roleplay response, then STOP IMMEDIATELY by outputting the <|im_end|> token.
+
+Format your response EXACTLY like this:
+
+1. INTENT: [User's intent in 1 sentence]
+2. STATE: [Character's emotional state in 1 sentence]
+3. PLAN: I will write 1 to 2 action sentences and 1 dialogue sentence, then STOP. If the user's message is short, keep the reply brief; if they ask for something detailed, reply in more detail.
+
+*Grounded action and environmental description.*
+"Natural dialogue."
+```
+
+## Contact
+Need a custom version of this model for your specific needs? Contact [albinthomas7034@gmail.com](mailto:albinthomas7034@gmail.com).
\ No newline at end of file
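As one concrete route among the frontends listed under "How to Use", the GGUF files can be imported into Ollama with a Modelfile. The sketch below is illustrative: the chosen quant, context size, and sampler value are assumptions, not tuned defaults, and importing a raw GGUF may additionally require a `TEMPLATE` directive matching the model's ChatML-style format.

```
# Modelfile — minimal sketch for importing the local GGUF into Ollama.
# Quant choice, num_ctx, and temperature below are illustrative, not tuned defaults.
FROM ./Luminus-1.5B-Roleplay-Q5_K_M-imat.gguf

PARAMETER temperature 0.8
PARAMETER num_ctx 8192

# Put the full Recommended System Prompt from this README between the triple quotes.
SYSTEM """..."""
```

Build and run it with `ollama create luminus -f Modelfile` followed by `ollama run luminus`.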