---
license: apache-2.0
base_model: Qwen/Qwen2.5-1.5B-Instruct
tags:
- qwen
- qlora
- unsloth
- chat
- function-calling
- quantasparklabs
- identity-alignment
- text-generation
language:
- en
pipeline_tag: text-generation
---

# NYXIS-1.1B: Identity-Aligned Lightweight Language Model by QuantaSparkLabs

All New NYXIS 2B!

> [!NOTE]
> This repository contains the **fully merged model weights** (not just LoRA adapters),
> compatible with 🤗 Transformers, vLLM, Text Generation Inference, Unsloth, and custom pipelines.
> Featherless AI's inference servers have not yet been updated with the current model weights, so some features or responses served through that provider may be broken or unstable.

---

## 📋 Overview

**NYXIS-1.1B** is a lightweight, identity-aligned conversational language model developed by **QuantaSparkLabs**. It is fine-tuned from **[Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)** using **QLoRA + Unsloth** on a custom curated dataset, and was trained entirely on a T4 GPU.

NYXIS is designed for **stable persona consistency**, **instruction following**, **web-search tool calling**, and **efficient edge deployment**, all while keeping a small VRAM footprint.

---

## 🎯 Design Goals

| 🎯 Goal | 📌 Detail |
|--------|----------|
| 🪪 Identity Alignment | Consistent "I'm NYXIS, created by QuantaSparkLabs" across all contexts |
| 🌐 Tool Calling | Trained web-search function-call pattern built in |
| ⚡ Efficiency | Runs on a T4 or an 8 GB VRAM GPU without quantization |
| 🔧 Plug & Play | Fully merged weights; no adapter loading needed |
| 🧠 Knowledge Retention | Custom dataset designed to preserve Qwen2.5 base knowledge |

---

## ✨ Core Capabilities

| Capability | Description |
|-----------|-------------|
| 🧠 **Conversational AI** | Chat-optimized with Qwen2.5 `<\|im_start\|>` / `<\|im_end\|>` template |
| 🪪 **Identity Alignment** | Consistent "NYXIS by QuantaSparkLabs" persona under all prompts |
| 📚 **Instruction Following** | Supports reasoning, explanation, summarization, and coding |
| 🌐 **Web Search Tool** | Emits `web_search(query)` function calls when external info is needed |
| ⚡ **Lightweight** | Runs on 6–8 GB VRAM in FP16 |
| 🔧 **Fully Merged Weights** | Standalone model; no LoRA adapter required at runtime |

---

## 🏗️ Model Architecture

### 🔩 Base Model

| Field | Value |
|-------|-------|
| **Backbone** | `Qwen/Qwen2.5-1.5B-Instruct` |
| **Framework** | Hugging Face Transformers + Unsloth |
| **Fine-tuning** | QLoRA (rank 16) → Full Weight Merge |
| **Chat Template** | Qwen2.5 ChatML (`<\|im_start\|>` / `<\|im_end\|>`) |

### 🔄 Training Pipeline

```
Qwen2.5-1.5B-Instruct (Base)
        ↓
QLoRA Fine-Tuning (rank 16, Unsloth)
        ↓
Custom 500-example Identity + Chat + Tool Dataset
        ↓
Full Weight Merge (adapter baked into model)
        ↓
NYXIS-1.1B, deployed on Hugging Face 🚀
```

---

## 📊 Technical Specifications

| ⚙️ Parameter | 📌 Value |
|-------------|---------|
| **Model Name** | NYXIS-1.1B |
| **Organization** | QuantaSparkLabs |
| **Base Model** | `Qwen/Qwen2.5-1.5B-Instruct` |
| **Total Parameters** | ~1.56 Billion |
| **Trainable Parameters** | 18.5M (1.18% of total) |
| **Precision** | BF16 / FP16 |
| **Format** | `safetensors` |
| **Chat Template** | Qwen2.5 ChatML (Jinja) |
| **Inference Mode** | Causal LM |
| **File Size** | ~2.0–2.2 GB |

---

## 🧬 Training Details

### ⚡ Fine-Tuning Method

| 🔬 Setting | 📌 Value |
|-----------|---------|
| **Technique** | QLoRA (Quantized Low-Rank Adaptation) |
| **Library** | [Unsloth](https://github.com/unslothai/unsloth) |
| **LoRA Rank** | 16 |
| **Optimizer** | AdamW (paged) |
| **Learning Rate** | `2e-4` |
| **Epochs** | 3 |
| **Total Steps** | 189 |
| **Batch Size** | 8 (2 per device × 4 grad accumulation) |
| **Hardware** | T4 GPU |
| **Final Training Loss** | ~0.08 ✅ |
| **Merge Strategy** | Full weight merge (adapter baked in) |

### 📂 Dataset Composition (500 examples)

| 🗂️ Category | 📊 Proportion | 📝 Description |
|------------|-------------|---------------|
| 🪪 **Identity** | 10% (50 examples) | Identity questions answered with the NYXIS / QuantaSparkLabs persona |
| 💬 **Open Chat** | 70% (350 examples) | Diverse assistant responses: science, jokes, coding, daily life, etc. |
| 🌐 **Web Search Tool** | 20% (100 examples) | Function-calling pattern: model requests `web_search(query)` when it needs external info |

> The dataset was **custom-built** to preserve Qwen2.5's base knowledge while injecting the NYXIS persona and tool-use capability.

---

## 💻 Quick Start

### 🔧 Installation

```bash
# Option A: Standard Transformers
pip install transformers accelerate torch

# Option B: Unsloth (recommended for speed + memory efficiency)
pip install unsloth
```

### 🚀 Load & Chat: Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

MODEL_ID = "QuantaSparkLabs/NYXIS-1.1B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto"
)
model.eval()

messages = [
    {"role": "system", "content": "You are NYXIS, a helpful AI created by QuantaSparkLabs."},
    {"role": "user", "content": "Hello NYXIS! Who are you?"}
]

# Render the conversation with the Qwen2.5 ChatML template
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,
        do_sample=True,  # temperature / top_p only take effect when sampling
        temperature=0.6,
        top_p=0.9,
        repetition_penalty=1.15,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id
    )

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True
)
print("NYXIS:", response)
```

### ⚡ Load with Unsloth (Recommended)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="QuantaSparkLabs/NYXIS-1.1B",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)
```

### 🖊️ Manual Qwen2.5 Chat Prompt Format

NYXIS uses the standard Qwen2.5 ChatML tokens.
Build your prompt manually like this:

```python
messages = [
    {"role": "system", "content": "You are NYXIS, a helpful AI created by QuantaSparkLabs."},
    {"role": "user", "content": "What is a black hole?"}
]

prompt = ""
for msg in messages:
    prompt += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
prompt += "<|im_start|>assistant\n"
```

Then tokenize and generate normally.

---

## 🌐 Web Search Tool Pattern

When a system prompt mentions that a `web_search` tool is available, NYXIS may emit a function call instead of answering directly:

```
<|im_start|>assistant
[{"type": "function", "function": {"name": "web_search", "arguments": {"query": "latest news on AI"}}}]
<|im_end|>
```

You can intercept this, run an actual search, and feed the result back as a `tool` message to get the final answer.

> ⚠️ The web-search pattern is **trained behaviour only**; it does not include a live search engine.
> You need to implement the tool runner yourself (e.g. using SerpAPI, DuckDuckGo, or Tavily).
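
### 🔁 Tool-Call Round Trip (Sketch)

The interception step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: `extract_tool_calls`, `run_web_search_calls`, and `fake_search` are hypothetical helper names, the search backend is a stub, and the exact fields of the `tool` message used in training are not documented in this card, so the `{"role": "tool", "content": ...}` shape is an assumption.

```python
import json


def extract_tool_calls(text):
    """Return the function calls found in a model reply, or [] for a plain answer."""
    # Strip ChatML markers in case the reply was decoded with special tokens kept.
    cleaned = text.replace("<|im_start|>assistant", "").replace("<|im_end|>", "").strip()
    try:
        parsed = json.loads(cleaned)
    except json.JSONDecodeError:
        return []  # not JSON, so treat the reply as a normal answer
    if not isinstance(parsed, list):
        return []
    calls = []
    for item in parsed:
        if isinstance(item, dict) and item.get("type") == "function":
            fn = item.get("function")
            if not isinstance(fn, dict):
                continue
            args = fn.get("arguments")
            if not isinstance(args, dict):
                args = {}
            calls.append({"name": fn.get("name"), "arguments": args})
    return calls


def run_web_search_calls(calls, search_fn):
    """Run each web_search call and return `tool` messages to append to the chat."""
    tool_messages = []
    for call in calls:
        if call["name"] != "web_search":
            continue  # this card documents only the web_search tool
        result = search_fn(call["arguments"].get("query", ""))
        # Assumed message shape; adjust if your chat template expects other fields.
        tool_messages.append({"role": "tool", "content": result})
    return tool_messages


def fake_search(query):
    # Stand-in for a real backend such as SerpAPI, DuckDuckGo, or Tavily.
    return f"(search results for: {query})"


reply = '[{"type": "function", "function": {"name": "web_search", "arguments": {"query": "latest news on AI"}}}]'
calls = extract_tool_calls(reply)
tool_messages = run_web_search_calls(calls, fake_search)
print(tool_messages)
```

Append the returned messages to the conversation after the assistant turn that contained the call, then run generation again to obtain the final answer. Swapping `fake_search` for a real search client is the part your own tool runner has to supply.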
---

## ⚡ Hardware Requirements

| 🖥️ Hardware | 🚦 Performance |
|------------|--------------|
| T4 GPU (16GB) | ✅ **Optimal** (the training hardware) |
| RTX 3060 (12GB) | ✅ **Smooth** FP16 |
| 8GB VRAM GPU | ⚠️ **Usable**, FP16 recommended |
| 4GB VRAM GPU | 🔶 **Use 4-bit** via Unsloth / BitsAndBytes |
| CPU Only | 🐌 **Slow** but functional |

---

## 📁 Repository Structure

```
NYXIS-1.1B/
├── model.safetensors          # Full merged weights (~2.2 GB)
├── config.json                # Model architecture config
├── tokenizer.json             # Qwen2.5 tokenizer
├── tokenizer_config.json      # Chat template config
├── generation_config.json     # Default generation settings
├── chat_template.jinja        # Jinja chat template
└── README.md
```

---

## ⚠️ Known Limitations

| ⚠️ Issue | 📝 Notes |
|---------|---------|
| 🔁 Hallucination | May occasionally hallucinate or oversimplify (1.5B scale) |
| 🗣️ Identity Bias | May append *"How can I help you today?"* to replies; reduce via system prompt tuning |
| 🔢 Math Reasoning | Limited complex math ability (small model) |
| 🌍 Language | Primarily English-focused |
| 🚫 Critical Use | Not suitable for medical, legal, or safety-critical applications |
| 🔍 Web Search | Tool pattern only; no live search engine included |

---

## 🔒 Safety & Alignment

NYXIS is trained with:

- ✅ Identity alignment dataset (consistent persona)
- ✅ Instruction-balanced samples (diverse and safe)
- ✅ Controlled decoding configuration (anti-loop)

**Recommended generation settings:**

```python
temperature = 0.6
top_p = 0.9
repetition_penalty = 1.1  # any value from 1.1 to 1.2
no_repeat_ngram_size = 3
```

---

## 🚀 Version History

| 🏷️ Version | 📅 Date | 📝 Notes |
|-----------|--------|---------|
| **v1.0** | Early 2025 | Initial LoRA fine-tune on TinyLlama |
| **v1.1 (NYXIS 2.1)** | 2025 | Rebuilt on Qwen2.5-1.5B-Instruct · QLoRA · Unsloth · 500 examples · Web-search tool · Full merge · HF deployment |

---

## 📜 License

This model is licensed under the **[Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0)**, following the original `Qwen2.5-1.5B-Instruct` license terms.

---

NYXIS • Built by QuantaSparkLabs • 2025–2026
_Lightweight • Identity-Aligned • Efficient • Open Source_

If you find NYXIS useful, give the repo a ❤️ and share your creations!