---
license: apache-2.0
base_model: Qwen/Qwen2.5-1.5B-Instruct
tags:
- qwen
- qlora
- unsloth
- chat
- function-calling
- quantasparklabs
- identity-alignment
- text-generation
language:
- en
pipeline_tag: text-generation
---
# NYXIS-1.1B: Identity-Aligned Lightweight Language Model by QuantaSparkLabs
All New NYXIS 2B!
> [!NOTE]
> This repository contains the **fully merged model weights** (not just LoRA adapters),
> compatible with 🤗 Transformers, vLLM, Text Generation Inference, Unsloth, and custom pipelines.
> The inference providers at Featherless AI have not yet updated their servers and model weights, so some features or responses may be broken or unstable there.

---

## Overview

**NYXIS-1.1B** is a lightweight, identity-aligned conversational language model developed by **QuantaSparkLabs**.
It is fine-tuned from **[Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)** using **QLoRA + Unsloth** on a custom curated dataset, and was built entirely on a T4 GPU.

NYXIS is designed for **stable persona consistency**, **instruction following**, **web-search tool calling**, and **efficient edge deployment**, all while keeping a small VRAM footprint.

---

## Design Goals

| Goal | Detail |
|------|--------|
| Identity Alignment | Consistent "I'm NYXIS, created by QuantaSparkLabs" across all contexts |
| Tool Calling | Trained web-search function-call pattern built in |
| Efficiency | Runs on T4 / 8GB VRAM without quantization |
| Plug & Play | Fully merged weights, no adapter loading needed |
| Knowledge Retention | Custom dataset preserves Qwen2.5 base knowledge |

---

## Core Capabilities

| Capability | Description |
|-----------|-------------|
| **Conversational AI** | Chat-optimized with Qwen2.5 `<\|im_start\|>` / `<\|im_end\|>` template |
| **Identity Alignment** | Consistent "NYXIS by QuantaSparkLabs" persona under all prompts |
| **Instruction Following** | Supports reasoning, explanation, summarization, and coding |
| **Web Search Tool** | Emits `web_search(query)` function calls when external info is needed |
| **Lightweight** | Runs on 6-8 GB VRAM in FP16 |
| **Fully Merged Weights** | Standalone model, no LoRA adapter required at runtime |

---

## Model Architecture

### Base Model

| Field | Value |
|-------|-------|
| **Backbone** | `Qwen/Qwen2.5-1.5B-Instruct` |
| **Framework** | Hugging Face Transformers + Unsloth |
| **Fine-tuning** | QLoRA (rank 16) → Full Weight Merge |
| **Chat Template** | Qwen2.5 ChatML (`<\|im_start\|>` / `<\|im_end\|>`) |

### Training Pipeline

```
Qwen2.5-1.5B-Instruct (Base)
        ↓
QLoRA Fine-Tuning (rank 16, Unsloth)
        ↓
Custom 500-example Identity + Chat + Tool Dataset
        ↓
Full Weight Merge (adapter baked into model)
        ↓
NYXIS-1.1B, deployed on Hugging Face
```

---

## Technical Specifications

| Parameter | Value |
|-------------|---------|
| **Model Name** | NYXIS-1.1B |
| **Organization** | QuantaSparkLabs |
| **Base Model** | `Qwen/Qwen2.5-1.5B-Instruct` |
| **Total Parameters** | ~1.56 Billion |
| **Trainable Parameters** | 18.5M (1.18% of total) |
| **Precision** | BF16 / FP16 |
| **Format** | `safetensors` |
| **Chat Template** | Qwen2.5 ChatML (Jinja) |
| **Inference Mode** | Causal LM |
| **File Size** | ~2.0-2.2 GB |

---

## Training Details

### Fine-Tuning Method

| Setting | Value |
|-----------|---------|
| **Technique** | QLoRA (Quantized Low-Rank Adaptation) |
| **Library** | [Unsloth](https://github.com/unslothai/unsloth) |
| **LoRA Rank** | 16 |
| **Optimizer** | AdamW (paged) |
| **Learning Rate** | `2e-4` |
| **Epochs** | 3 |
| **Total Steps** | 189 |
| **Batch Size** | 8 (2 per device × 4 gradient accumulation steps) |
| **Hardware** | T4 GPU |
| **Final Training Loss** | ~0.08 |
| **Merge Strategy** | Full weight merge (adapter baked in) |

### Dataset Composition (500 examples)

| Category | Proportion | Description |
|------------|-------------|---------------|
| **Identity** | 10% (50 examples) | Prompts where the model states its identity |
| **Open Chat** | 70% (350 examples) | Diverse assistant responses: science, jokes, coding, daily life, etc. |
| **Web Search Tool** | 20% (100 examples) | Function-calling pattern: model requests `web_search(query)` when it needs external info |

> The dataset was **custom-built** to preserve Qwen2.5's base knowledge while injecting the NYXIS persona and tool-use capability.

---

## Quick Start

### Installation

```bash
# Option A: Standard Transformers
pip install transformers accelerate torch

# Option B: Unsloth (recommended for speed + memory efficiency)
pip install unsloth
```

### Load & Chat with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

MODEL_ID = "QuantaSparkLabs/NYXIS-1.1B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto"
)
model.eval()

messages = [
    {"role": "system", "content": "You are NYXIS, a helpful AI created by QuantaSparkLabs."},
    {"role": "user", "content": "Hello NYXIS! Who are you?"}
]

# Render the conversation with the Qwen2.5 ChatML template
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,
        do_sample=True,  # sampling must be enabled for temperature/top_p to take effect
        temperature=0.6,
        top_p=0.9,
        repetition_penalty=1.15,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id
    )

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True
)
print("NYXIS:", response)
```

### Load with Unsloth (Recommended)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="QuantaSparkLabs/NYXIS-1.1B",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)
```

### Manual Qwen2.5 Chat Prompt Format

NYXIS uses the standard Qwen2.5 ChatML tokens.
Build your prompt manually like this:

```python
messages = [
    {"role": "system", "content": "You are NYXIS, a helpful AI created by QuantaSparkLabs."},
    {"role": "user", "content": "What is a black hole?"}
]

prompt = ""
for msg in messages:
    prompt += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
prompt += "<|im_start|>assistant\n"
```

Then tokenize and generate normally.

---

## Web Search Tool Pattern

When a system prompt mentions that a `web_search` tool is available, NYXIS may emit a function call instead of answering directly:

```
<|im_start|>assistant
[{"type": "function", "function": {"name": "web_search", "arguments": {"query": "latest news on AI"}}}]
<|im_end|>
```

You can intercept this, run an actual search, and feed the result back as a `tool` message to get the final answer.

> ⚠️ The web-search pattern is **trained behaviour only**: it does not include a live search engine.
> You need to implement the tool runner yourself (e.g. using SerpAPI, DuckDuckGo, or Tavily).
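
The intercept-and-feed-back loop described above could be sketched roughly as follows. This is a minimal illustration, not the authors' tool runner: `extract_tool_calls`, `run_tool_turn`, and `fake_web_search` are hypothetical names, the search backend is a placeholder, and the exact shape of the `tool` message NYXIS was trained on is not specified in this card, so the `{"role": "tool", "content": ...}` form is an assumption.

```python
import json


def strip_chatml(text):
    """Remove ChatML markers in case the reply was decoded with special tokens kept."""
    return text.replace("<|im_start|>assistant", "").replace("<|im_end|>", "").strip()


def extract_tool_calls(text):
    """Return a list of (name, arguments) pairs found in a reply, or [] for plain text."""
    try:
        parsed = json.loads(strip_chatml(text))
    except json.JSONDecodeError:
        return []  # ordinary answer, not a tool call
    if not isinstance(parsed, list):
        return []
    calls = []
    for item in parsed:
        if not isinstance(item, dict) or item.get("type") != "function":
            continue
        fn = item.get("function")
        if not isinstance(fn, dict):
            continue
        args = fn.get("arguments", {})
        if isinstance(args, dict):
            calls.append((fn.get("name"), args))
    return calls


def fake_web_search(query):
    """Hypothetical stand-in; swap in a real search backend here."""
    return f"(placeholder results for: {query})"


def run_tool_turn(messages, reply, search_fn=fake_web_search):
    """Append the assistant reply and, for each web_search call, a `tool` message.

    Returns (updated_messages, used_tool). The assumed `tool` message shape
    is not confirmed by the model card.
    """
    updated = messages + [{"role": "assistant", "content": strip_chatml(reply)}]
    used_tool = False
    for name, args in extract_tool_calls(reply):
        if name == "web_search":
            updated.append({"role": "tool", "content": search_fn(args.get("query", ""))})
            used_tool = True
    return updated, used_tool


reply = '[{"type": "function", "function": {"name": "web_search", "arguments": {"query": "latest news on AI"}}}]'
history = [{"role": "user", "content": "What is new in AI?"}]
history, used_tool = run_tool_turn(history, reply)
print(used_tool, history[-1]["role"])
```

If `used_tool` is true, the updated message list can be passed through `tokenizer.apply_chat_template` again and generation re-run to obtain the final answer.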
---

## Hardware Requirements

| Hardware | Performance |
|------------|--------------|
| T4 GPU (16GB) | **Optimal** (trained on this) |
| RTX 3060 (12GB) | **Smooth** FP16 |
| 8GB VRAM GPU | **Usable**, FP16 recommended |
| 4GB VRAM GPU | **Use 4-bit** via Unsloth / BitsAndBytes |
| CPU Only | **Slow** but functional |

---

## Repository Structure

```
NYXIS-1.1B/
├── model.safetensors          # Full merged weights (~2.2 GB)
├── config.json                # Model architecture config
├── tokenizer.json             # Qwen2.5 tokenizer
├── tokenizer_config.json      # Chat template config
├── generation_config.json     # Default generation settings
├── chat_template.jinja        # Jinja chat template
└── README.md
```

---

## Known Limitations

| Issue | Notes |
|---------|---------|
| Hallucination | May occasionally hallucinate or oversimplify (1.5B scale) |
| Identity Bias | May append *"How can I help you today?"*; reduce via system prompt tuning |
| Math Reasoning | Limited complex math ability (small model) |
| Language | Primarily English-focused |
| Critical Use | Not suitable for medical, legal, or safety-critical applications |
| Web Search | Tool pattern only, no live search engine included |

---

## Safety & Alignment

NYXIS is trained with:

- Identity alignment dataset (consistent persona)
- Instruction-balanced samples (diverse and safe)
- Controlled decoding configuration (anti-loop)

**Recommended generation settings:**

```python
temperature = 0.6
top_p = 0.9
repetition_penalty = 1.1   # 1.1 to 1.2
no_repeat_ngram_size = 3
```

---

## Version History

| Version | Date | Notes |
|-----------|--------|---------|
| **v1.0** | Early 2025 | Initial LoRA fine-tune on TinyLlama |
| **v1.1 (NYXIS 2.1)** | 2025 | Rebuilt on Qwen2.5-1.5B-Instruct · QLoRA · Unsloth · 500 examples · Web-search tool · Full merge · HF deployment |

---

## License

This model is licensed under the **[Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0)**, following the original `Qwen2.5-1.5B-Instruct` license terms.

---
NYXIS • Built by QuantaSparkLabs • 2025-2026

Lightweight • Identity-Aligned • Efficient • Open Source

If you find NYXIS useful, give the repo a ❤️ and share your creations!