---
base_model:
- Qwen/Qwen3-1.7B
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
license: other
license_name: anvdl-1.0
license_link: https://huggingface.co/apexion-ai/Nous-V1-8B/blob/main/LICENSE.md
language:
  - en
  - fr
  - pt
  - de
  - ro
  - sv
  - da
  - bg
  - ru
  - cs
  - el
  - uk
  - es
  - nl
  - sk
  - hr
  - pl
  - lt
  - nb
  - nn
  - fa
  - sl
  - gu
  - lv
  - it
  - oc
  - ne
  - mr
  - be
  - sr
  - lb
  - vec
  - as
  - cy
  - szl
  - ast
  - hne
  - awa
  - mai
  - bho
  - sd
  - ga
  - fo
  - hi
  - pa
  - bn
  - or
  - tg
  - yi
  - lmo
  - lij
  - scn
  - fur
  - sc
  - gl
  - ca
  - is
  - sq
  - li
  - prs
  - af
  - mk
  - si
  - ur
  - mag
  - bs
  - hy
  - zh
  - yue
  - my
  - ar
  - he
  - mt
  - id
  - ms
  - tl
  - ceb
  - jv
  - su
  - min
  - ban
  - pag
  - ilo
  - war
  - ta
  - te
  - kn
  - ml
  - tr
  - az
  - uz
  - kk
  - ba
  - tt
  - th
  - lo
  - fi
  - et
  - hu
  - vi
  - km
  - ja
  - ko
  - ka
  - eu
  - ht
  - pap
  - kea
  - tpi
  - sw

---
![banner](https://huggingface.co/NoemaResearch/Apollo-1-4B/resolve/main/img/banner.png)
# Apollo-1-2B

[![Model](https://img.shields.io/badge/Model-Apollo--1--2B-blue)](https://huggingface.co/NoemaResearch/Apollo-1-2B)
[![Base](https://img.shields.io/badge/Base-Qwen3--1.7B-green)](https://huggingface.co/Qwen/Qwen3-1.7B)
[![License](https://img.shields.io/badge/License-anvdl--1.0-yellow)](LICENSE)

Apollo-1-2B is a **2-billion-parameter instruction-tuned model** developed by **Noema Research**.  
It is based on [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) and optimized for **general reasoning, language understanding, and lightweight deployment**.  

This model is the first release in the **Apollo series**, intended as a foundation for scalable experimentation and real-world applications in constrained environments.  

---

## Model Overview

- **Base model:** `Qwen3-1.7B`  
- **Architecture:** Decoder-only transformer  
- **Parameters:** ~2B  
- **Context length:** up to 32k tokens (inherits Qwen3 long-context support)  
- **Domain:** General-purpose reasoning and instruction following  
- **Primary applications:**  
  - Conversational AI  
  - Lightweight reasoning tasks  
  - Education and tutoring  
  - Prototype agents and assistants  
- **License:** anvdl-1.0
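
The context limit above also has to cover the tokens the model generates, so long conversations need trimming before each call. The sketch below is one possible way to do that, not part of the Apollo release: `trim_history`, `rough_token_count`, and `CONTEXT_LIMIT` are hypothetical names, 32k is assumed to mean 32,768 tokens, and the whitespace-based counter is only a stand-in for a real tokenizer.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

# Assumption: "up to 32k tokens" is read as 32,768.
CONTEXT_LIMIT = 32_768


def rough_token_count(text: str) -> int:
    """Hypothetical stand-in: counts whitespace-separated words, not real tokens."""
    return len(text.split())


def trim_history(
    messages: List[Message],
    max_new_tokens: int = 512,
    count_tokens: Callable[[str], int] = rough_token_count,
    context_limit: int = CONTEXT_LIMIT,
) -> List[Message]:
    """Drop the oldest non-system turns until the prompt fits the budget.

    System messages are always kept and placed first in the result.
    """
    budget = context_limit - max_new_tokens
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    def total(msgs: List[Message]) -> int:
        return sum(count_tokens(m["content"]) for m in msgs)

    # Keep at least the most recent turn, even if it alone exceeds the budget.
    while len(turns) > 1 and total(system + turns) > budget:
        turns.pop(0)
    return system + turns
```

In practice, `count_tokens` could be replaced with a function that measures length using the model's own tokenizer, since word counts underestimate real token counts.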

---

## Key Features

- **Instruction tuned**: More reliable responses in conversational and task-oriented settings  
- **Lightweight deployment**: Optimized for environments with limited compute or memory resources  
- **Extended context**: Inherits long-context capability from Qwen3 base  
- **Balanced outputs**: Improved refusal behaviors and reduced hallucinations compared to the base model  
- **Multilingual ability**: Retains multilingual knowledge from Qwen3 family  

---

## Usage

The model is available in Hugging Face Transformers format. Example:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "NoemaResearch/Apollo-1-2B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

messages = [
    {"role":"system", "content":"You are Apollo, a reasoning assistant."},
    {"role":"user", "content":"Explain the difference between supervised and unsupervised learning."}
]

# return_dict=True yields input_ids and attention_mask, so the result can be unpacked into generate()
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# do_sample=True is needed for temperature and top_p to take effect
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9)

# Decode only the newly generated tokens, not the prompt
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

**Recommended settings:**

* `temperature=0.5–0.9`
* `top_p=0.85–0.95`
* For structured outputs (e.g. JSON), use lower temperatures for stability
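
Even at low temperatures, a small model may wrap its JSON in extra prose, so it can help to extract and validate the payload rather than parse the raw reply. The following is a minimal sketch using only the Python standard library; `extract_first_json` is a hypothetical helper, and the `reply` string is a made-up example, not real Apollo output.

```python
import json
from typing import Any, Optional


def extract_first_json(text: str) -> Optional[Any]:
    """Return the first JSON object or array embedded in `text`, or None."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores any trailing text after it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this position; keep scanning
        return value
    return None


# Made-up reply used only for illustration.
reply = 'Sure, here is the result:\n{"label": "supervised", "confidence": 0.9}\nAnything else?'
print(extract_first_json(reply))
```

If the helper returns `None`, one option is to retry the generation with a lower temperature or a stricter prompt.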

---

## Evaluation

Apollo-1-2B has been evaluated internally on a range of reasoning and language tasks. Key findings:

* Improved **instruction following** relative to Qwen3-1.7B
* More **concise and accurate responses** in structured tasks
* Maintains **multilingual performance** from the base model
* Effective for **lightweight assistant applications**

Future work will include publishing comprehensive benchmark comparisons against other models in the 1–3B parameter range.

---

## Limitations

* **Reasoning depth**: As a 2B-parameter model, Apollo cannot match larger-scale LLMs on complex reasoning tasks
* **Knowledge coverage**: May lack depth in specialized or low-resource domains
* **Hallucinations**: Although reduced, the model may still generate incorrect or fabricated information
* **Sensitivity to prompts**: Outputs vary with prompt phrasing; careful prompt design is recommended

---

## Responsible Use

* Do not rely on Apollo for critical decision-making without human oversight
* Generated outputs may contain inaccuracies; verification is required for factual or sensitive use cases
* Avoid providing personal, private, or sensitive information in prompts
* This model should not be used to generate disallowed, unsafe, or harmful content

---

## Model Variants

* **Full precision (safetensors)** — research and full-fidelity inference
* **bf16 / fp16** — optimized for inference on GPUs/TPUs
* **Quantized versions (int8 / int4)** — for deployment in constrained hardware environments
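
To get a rough sense of how these variants differ in footprint, the weight memory can be estimated as parameter count times bytes per parameter. This is a back-of-the-envelope sketch under stated assumptions: "full precision" is taken to mean fp32, the parameter count is the card's approximate 2B figure, and the estimate covers weights only (no KV cache, activations, or quantization metadata). `weight_memory_gib` is a hypothetical helper.

```python
# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# Assumption: "~2B" from this card; the exact parameter count is not stated here.
NUM_PARAMS = 2e9

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gib(NUM_PARAMS, dtype):.1f} GiB")
```

Actual memory use at inference time will be higher, and it grows with context length because of the KV cache.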

---

## Citation

If you use this model, please cite both Apollo-1-2B and the Qwen3 base model:

```bibtex
@misc{noema2025apollo,
  title={Apollo-1-2B},
  author={Noema Research},
  year={2025},
  howpublished={\url{https://huggingface.co/NoemaResearch/Apollo-1-2B}}
}
```

---

## Acknowledgements

Apollo-1-2B builds upon the [Qwen3](https://huggingface.co/Qwen) series of models.
We thank the Qwen team for making their work openly available under permissive terms, which enabled this derivative research.

---