---
library_name: transformers
license: apache-2.0
language:
- en
pipeline_tag: text-generation
base_model: HuggingFaceTB/SmolLM2-1.7B-Instruct
tags:
- finetuned
- sft
- smollm2
- sovereign-ai
- safetensors
- onnx
- transformers.js
---

# Cygnis Alpha Instruct 

<div align="center" style="background:#06090f; border-radius:14px; border:1px solid #0f1e30; overflow:hidden; margin-bottom:20px;">
  <img src="https://huggingface.co/cygnisai/Cygnis-Alpha-1.7B-v0.1-Instruct/resolve/main/Cygnis-Alpha-Instruct.png" width="100%" style="display:block;">
</div>

## Table of Contents

1. [Model Summary](#model-summary)
2. [Evaluation](#evaluation)
3. [Examples](#examples)
4. [Limitations](#limitations)
5. [Training](#training)
6. [License](#license)
7. [Citation](#citation)

## Model Summary

**Cygnis Alpha Instruct** is a 1.7B-parameter language model fine-tuned from **SmolLM2-1.7B-Instruct**. Rather than a quantized re-release of the base, it is a full-weight supervised fine-tune (SFT) intended to combine low-latency local inference with high-quality instruction following.

The model has been refined to embody a **Sovereign AI** identity, which makes it well suited to private, on-device deployment. Its strengths are following complex instructions, rewriting text, and maintaining a consistent persona.

### How to use

#### Transformers
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "cygnisai/Cygnis-Alpha-1.7B-v0.1-Instruct"
# Use the GPU when one is available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

messages = [
    {"role": "system", "content": "You are Cygnis Alpha, a sovereign AI assistant designed by Simonc-44."},
    {"role": "user", "content": "What is the core philosophy of sovereign AI?"}
]
# Render the conversation with the model's chat template, then tokenize it
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
# Sample up to 150 new tokens
outputs = model.generate(inputs, max_new_tokens=150, temperature=0.7, top_p=0.9, do_sample=True)

# The decoded text contains the prompt followed by the model's reply
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
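
The `temperature` and `top_p` arguments passed to `generate` control how each next token is sampled. The following pure-Python sketch illustrates the idea (temperature-scaled softmax followed by nucleus filtering). It is a simplified illustration, not the code that `transformers` runs, and the function names `softmax_with_temperature` and `top_p_filter` are made up for this example.

```python
import math

def softmax_with_temperature(logits, temperature=0.7):
    """Turn raw scores into probabilities; temperature < 1 sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p=0.9):
    """Return indices of the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept

# Toy scores for a four-token vocabulary (hypothetical values)
probs = softmax_with_temperature([2.0, 1.0, 0.0, -1.0], temperature=0.7)
candidates = top_p_filter(probs, top_p=0.9)
print(candidates)  # indices of the tokens that remain eligible for sampling
```

Lowering `temperature` concentrates probability on the top-scoring tokens, and lowering `top_p` shrinks the candidate set, so both make the output more deterministic.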

#### Transformers.js
```javascript
import { pipeline } from "@huggingface/transformers";

const generator = await pipeline(
  "text-generation",
  "cygnisai/Cygnis-Alpha-1.7B-v0.1-Instruct",
);

const messages = [
  { role: "system", content: "You are Cygnis Alpha, a sovereign AI assistant." },
  { role: "user", content: "Hello! Who are you?" },
];

const output = await generator(messages, { max_new_tokens: 128 });
console.log(output[0].generated_text.at(-1).content);
```

---

## Evaluation

Cygnis Alpha inherits the benchmark profile of its SmolLM2-1.7B-Instruct base. The scores in the first column are the ones reported for that base model.

| Metric                       | Cygnis Alpha (1.7B) | Llama-1B-Instruct | Qwen2.5-1.5B-Instruct |
|:-----------------------------|:-------------------:|:-----------------:|:---------------------:|
| **IFEval** (Avg prompt/inst) | **56.7**            | 53.5              | 47.4                  |
| **MT-Bench**                 | 6.13                | 5.48              | **6.52**              |
| **HellaSwag**                | **66.1**            | 56.1              | 60.9                  |
| **ARC (Average)**            | **51.7**            | 41.6              | 46.2                  |
| **GSM8K (5-shot)**           | **48.2**            | 26.8              | 42.8                  |
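
To make the comparison easier to read, the short script below recomputes, for each benchmark, which column scores highest and by how much the first column leads or trails the better of the other two. The numbers are copied from the table. The dictionary layout and variable names are just one possible arrangement.

```python
# Scores copied from the table: (Cygnis Alpha, Llama-1B-Instruct, Qwen2.5-1.5B-Instruct)
scores = {
    "IFEval": (56.7, 53.5, 47.4),
    "MT-Bench": (6.13, 5.48, 6.52),
    "HellaSwag": (66.1, 56.1, 60.9),
    "ARC (Average)": (51.7, 41.6, 46.2),
    "GSM8K (5-shot)": (48.2, 26.8, 42.8),
}
models = ("Cygnis Alpha (1.7B)", "Llama-1B-Instruct", "Qwen2.5-1.5B-Instruct")

margins = {}
for metric, row in scores.items():
    best_other = max(row[1:])                      # stronger of the two comparison models
    margins[metric] = round(row[0] - best_other, 2)
    leader = models[row.index(max(row))]           # column with the highest score
    print(f"{metric:<16} leader: {leader:<22} margin vs best other: {margins[metric]:+.2f}")

# Benchmarks where the first column is ahead of both comparison models
wins = [metric for metric, delta in margins.items() if delta > 0]
print(f"{len(wins)} of {len(scores)} benchmarks led by {models[0]}")
```

MT-Bench is the only row where the first column trails, which matches the bolding in the table.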

---

## Examples

### Identity Alignment
**User:** "Who created you?"

**Cygnis:** "I am Cygnis Alpha, a sovereign AI assistant developed and fine-tuned by Simonc-44. My purpose is to provide efficient and private intelligence."

### Professional Rewriting
**System Prompt:** "Rewrite the following email to be more professional."

**User:** "Hey, I'm late for the meeting, sorry."

**Cygnis:** "I would like to sincerely apologize for my late arrival to today's meeting. I appreciate your patience."
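
Both exchanges map directly onto the `messages` list used in the Transformers snippet. A minimal sketch, assuming a hypothetical helper `build_messages` (not part of any library) that places the optional system prompt first and the user turn after it:

```python
def build_messages(user_text, system_prompt=None):
    """Assemble a chat in the role/content format used by chat templates."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_text})
    return messages

# Identity Alignment example: user turn only
identity_chat = build_messages("Who created you?")

# Professional Rewriting example: system prompt plus user turn
rewrite_chat = build_messages(
    "Hey, I'm late for the meeting, sorry.",
    system_prompt="Rewrite the following email to be more professional.",
)
print(rewrite_chat)
```

Either list can be passed to `tokenizer.apply_chat_template` in place of `messages`. Because generation is sampled, the replies will not necessarily match the examples word for word.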

---

## Limitations

Cygnis Alpha Instruct primarily understands and generates content in **English**. Although capable for its size (1.7B parameters), it may struggle with highly specialized scientific tasks or very long-form reasoning compared with 70B+ models.

## Training

### Model Specifications
- **Architecture:** Transformer Decoder (Llama-like)
- **Base Model:** SmolLM2-1.7B-Instruct
- **Precision:** bfloat16
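
As a rough guide to on-device requirements, the weight footprint follows from the parameter count and the precision listed above. The sketch below assumes exactly 1.7 billion parameters (the figure in the model name is rounded) and counts weights only; activations, the KV cache, and framework overhead are not included. `weight_memory_gb` is an illustrative helper, not a library function.

```python
# Bytes needed to store one parameter at each precision
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}

def weight_memory_gb(num_params, dtype="bfloat16"):
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9

print(f"bfloat16: {weight_memory_gb(1.7e9):.1f} GB")
print(f"float32:  {weight_memory_gb(1.7e9, 'float32'):.1f} GB")
```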

### Software & Hardware
- **Alignment:** Supervised Fine-Tuning via `alignment-handbook`.
- **Infrastructure:** The base model was trained on high-performance GPU clusters; the custom SFT stage on top of it was carried out by Simonc-44.

## License

This model is licensed under **Apache 2.0**.

## Citation
```bibtex
@misc{allal2025smollm2smolgoesbig,
      title={SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model}, 
      author={Loubna Ben Allal and others},
      year={2025},
      eprint={2502.02737},
      archivePrefix={arXiv},
}
```

---
**Creator:** [Simonc-44](https://huggingface.co/Simonc-44)