---
base_model: Qwen/Qwen2.5-3B-Instruct
library_name: transformers
license: apache-2.0
tags:
- qwen
- conversational
- reasoning
- math
- code-generation
- css
- javascript
- html
- physics
---
# 💎 Geode Onyx 2 (3B)

Onyx 2 is a 3-billion-parameter conversational AI model, fine-tuned as part of the second generation of the Geode model family.
## Model Details

- **Base Model:** Qwen 2.5 3B Instruct
- **Parameters:** 3 Billion
- **Fine-Tuning:** LoRA (r=32, alpha=64)
- **Training Loss:** 0.40
- **Precision:** FP16
- **License:** Apache 2.0
## The Geode Family (Second Generation)

The Geode family is Genue AI's lineup of locally runnable conversational models. In the second generation, Beryl has been retired and replaced by Pyrite, a specialized coding model:

| Model | Parameters | Role |
|-------|------------|------|
| Pyrite | 7B | Coding specialist |
| Onyx | 3B | Balanced logic & personality |
| Thaumite | 8B | Flagship, highest capability |

**Note:** Beryl (0.5B) was the lightweight experimental model of the first generation. Pyrite takes its place in the lineup and focuses specifically on code generation tasks.
## Usage

Onyx 2 uses the Qwen Instruct (ChatML) prompt format, in which each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model in half precision; device_map="auto" places it on the available device(s).
model = AutoModelForCausalLM.from_pretrained(
    "GenueAI/Geode-Onyx-2",
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("GenueAI/Geode-Onyx-2")

# Single user turn followed by an open assistant turn for the model to complete.
prompt = "<|im_start|>user\nWhat is your name?<|im_end|>\n<|im_start|>assistant\n"
# Move the inputs to wherever the model was placed instead of hard-coding "cuda".
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is set explicitly so that temperature takes effect.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
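
The prompt string in the usage example follows the ChatML layout used by Qwen Instruct models. As a minimal sketch of how a multi-turn conversation could be flattened into that layout, the helper below builds the string by hand. `build_chatml_prompt` is a hypothetical name introduced here for illustration; it is not part of `transformers` or of this repository.

```python
def build_chatml_prompt(messages):
    """Flatten a list of {"role", "content"} dicts into a ChatML prompt string.

    Hypothetical helper for illustration only. It assumes the same
    <|im_start|> / <|im_end|> markers as the single-turn usage example.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    # Leave an open assistant turn so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [{"role": "user", "content": "What is your name?"}]
prompt = build_chatml_prompt(messages)
print(repr(prompt))
```

In practice, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` is probably the more robust route, since it uses the template shipped with the tokenizer. That template may also add a default system prompt, so its output can differ from the hand-built string.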
## Training Data

Fine-tuned on a curated dataset of 1,013 examples covering:

- **Identity & self-awareness** - AI assistant identity and capabilities
- **Mathematical reasoning** - Arithmetic, algebra, word problems
- **General knowledge** - Broad factual knowledge
- **HTML/CSS/JavaScript code generation** - Web development tasks
- **Physics problems** - Falling objects, thermodynamics
- **Genue AI ecosystem knowledge** - Company information, model family details
- **Conversational generalization** - Natural dialogue patterns
- **Anti-hallucination training** - Proper handling of unknown information (time, location, preferences)
## Model Architecture

- Base: Qwen 2.5 3B Instruct
- Adapter: LoRA with r=32, alpha=64
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Trainable parameters: 59.9M (1.9% of total)
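
The trainable-parameter figure can be checked with a little arithmetic. A LoRA adapter on a linear layer adds two matrices of shapes `r x in_features` and `out_features x r`, so each target module contributes `r * (in_features + out_features)` parameters. The sketch below applies this to the seven target modules listed above. The layer dimensions are assumptions taken from the public Qwen2.5-3B base configuration; they are not stated in this card.

```python
# Assumed Qwen2.5-3B dimensions (from the public base-model config,
# not from this model card).
HIDDEN = 2048         # hidden_size
INTERMEDIATE = 11008  # intermediate_size (MLP width)
KV_DIM = 256          # num_key_value_heads (2) * head_dim (128)
NUM_LAYERS = 36       # num_hidden_layers

RANK = 32   # LoRA r, from this card
ALPHA = 64  # LoRA alpha, from this card

# (in_features, out_features) for each targeted linear layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """Parameters added by one adapter: A is rank x in, B is out x rank."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGET_SHAPES.values())
total = per_layer * NUM_LAYERS

print(f"per layer: {per_layer:,}")
print(f"total:     {total:,} (~{total / 1e6:.1f}M)")
print(f"LoRA scaling factor alpha / r: {ALPHA / RANK}")
```

Under these assumptions the count comes to 59,867,136, which rounds to the 59.9M listed above and is in line with the stated share of roughly 1.9% for a model of about 3 billion parameters.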
## Training Details

- **Training regime:** FP16 mixed precision
- **Epochs:** 2
- **Batch size:** 8
- **Learning rate:** 2e-4
- **Training time:** ~8 minutes on RTX 3090
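
For a sense of scale, the hyperparameters above imply a fairly short run. The sketch below estimates the number of optimizer steps from the dataset size given in the Training Data section. It assumes no gradient accumulation and that the final partial batch is kept, neither of which is stated in this card.

```python
import math

NUM_EXAMPLES = 1013  # dataset size, from the Training Data section
BATCH_SIZE = 8       # from this card
EPOCHS = 2           # from this card

# Assumption: no gradient accumulation, and the last partial batch is kept.
steps_per_epoch = math.ceil(NUM_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps:     {total_steps}")
```

With these assumptions the run is about 254 optimizer steps, so the reported time of roughly 8 minutes works out to a little under 2 seconds per step.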
## Developed By

Genue AI, founded by Brybod123 (Bradar)
## Model Card Contact

For questions or issues, contact Genue AI through the Hugging Face repository.