---
license: cc-by-nc-4.0
language:
- en
base_model:
- mistralai/Ministral-8B-Instruct-2410
base_model_relation: finetune
pipeline_tag: text-generation
library_name: transformers
tags:
- alignment
- conversational-ai
- conversational
- collaborate
- chat
- cognitive-architectures
- large-language-model
- research
- persona
- ai-persona-research
- friendly
- reasoning
- chatbot
- vanta-research
- LLM
- collaborative-ai
- frontier
- reflective
- ai-research
- ai-alignment-research
- ai-alignment
- ai-behavior
- ai-behavior-research
---

<div align="center">

![Atom](atom_1.png)

<h1>VANTA Research</h1>

<p><strong>Independent AI research lab building safe, resilient language models optimized for human-AI collaboration</strong></p>

<p>
<a href="https://vantaresearch.xyz"><img src="https://img.shields.io/badge/Website-vantaresearch.xyz-black" alt="Website"/></a>
<a href="https://unmodeledtyler.com/work-with-vanta-research"><img src="https://img.shields.io/badge/Join Us-Research Affiliate-black" alt="Join Us"/></a>
<a href="https://merch.vantaresearch.xyz"><img src="https://img.shields.io/badge/Merch-merch.vantaresearch.xyz-sage" alt="Merch"/></a>
<a href="https://x.com/vanta_research"><img src="https://img.shields.io/badge/@vanta_research-1DA1F2?logo=x" alt="X"/></a>
<a href="https://github.com/vanta-research"><img src="https://img.shields.io/badge/GitHub-vanta--research-181717?logo=github" alt="GitHub"/></a>
</p>

</div>

---

# Atom v1 8B Preview

**Developed by VANTA Research**

Atom v1 8B Preview is a fine-tuned language model designed to serve as a collaborative thought partner. Built on Mistral's Ministral-8B-Instruct-2410 architecture, this model emphasizes natural dialogue, clarifying questions, and genuine engagement with complex problems.

This model was developed as part of a larger research and development project into Atom's persona and cross-architectural compatibility.

## Model Details

- **Model Type:** Causal language model (decoder-only transformer)
- **Base Model:** mistralai/Ministral-8B-Instruct-2410
- **Parameters:** 8 billion
- **Training Method:** Low-Rank Adaptation (LoRA) fine-tuning
- **License:** CC BY-NC 4.0 (Non-Commercial Use)
- **Language:** English
- **Developed by:** VANTA Research, Portland, Oregon

## Intended Use

Atom v1 8B Preview is designed for:

- Collaborative problem-solving and brainstorming
- Technical explanations with accessible analogies
- Code assistance and algorithmic reasoning
- Exploratory conversations that prioritize understanding over immediate answers
- Educational contexts requiring thoughtful dialogue

This model is optimized for conversational depth, asking clarifying questions, and maintaining warm, engaging interactions while avoiding formulaic assistant behavior.

## Training Data

The model was fine-tuned on a curated dataset comprising:

- Identity and persona examples emphasizing collaborative exploration
- Technical reasoning and coding challenges
- Multi-step problem-solving scenarios
- Conversational examples demonstrating warmth and curiosity
- Advanced coding tasks and algorithmic thinking

Training focused on developing a distinctive voice that balances technical competence with genuine engagement.

## Performance Characteristics

Atom v1 8B demonstrates strong capabilities in:

- **Persona Consistency:** Maintains a collaborative, warm tone across diverse topics
- **Technical Explanation:** Uses metaphors and analogies to clarify complex concepts
- **Clarifying Questions:** Actively seeks to understand user intent and context
- **Creative Thinking:** Generates multiple frameworks and approaches to problems
- **Code Generation:** Produces working code with explanatory context
- **Reasoning:** Applies logical frameworks to abstract problems

## Limitations

- **Scale:** As an 8B-parameter model, its capabilities are constrained compared to larger frontier models
- **Domain Specificity:** Optimized for conversational collaboration; may underperform on narrow technical benchmarks
- **Quantization Trade-offs:** The Q4_0 GGUF format prioritizes efficiency over maximum precision
- **Training Data:** The fine-tuning dataset's size limits exposure to highly specialized domains
- **Factual Accuracy:** Users should verify critical information independently

## Ethical Considerations

This model is released for research and non-commercial applications. Users should:

- Verify outputs in high-stakes scenarios
- Avoid deploying the model in contexts requiring guaranteed accuracy
- Consider potential biases inherited from the base model and training data
- Respect the non-commercial license terms

## Usage

### Hugging Face Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "vanta-research/atom-v1-8b-preview"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

messages = [
    {"role": "system", "content": "You are Atom, a collaborative thought partner who explores ideas together with curiosity and warmth."},
    {"role": "user", "content": "Can you explain how gradient descent works?"}
]

input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
# Enable sampling so the temperature setting takes effect
output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

### Ollama (GGUF)

The repository includes `atom-ministral-8b-q4_0.gguf` for efficient local inference:

```bash
# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./atom-ministral-8b-q4_0.gguf

TEMPLATE """{{- if .System }}<s>[INST] <<SYS>>
{{ .System }}
<<SYS>>

{{ .Prompt }}[/INST]{{ else }}<s>[INST]{{ .Prompt }}[/INST]{{ end }}{{ .Response }}</s>
"""

PARAMETER stop "</s>"
PARAMETER temperature 0.8
PARAMETER top_p 0.9
PARAMETER top_k 40

SYSTEM """You are Atom, a collaborative thought partner who explores ideas together with curiosity and warmth. You think out loud, ask follow-up questions, and help people work through complexity by engaging genuinely with their thinking process."""
EOF

# Register with Ollama
ollama create atom-v1-8b:latest -f Modelfile

# Run inference
ollama run atom-v1-8b:latest "What's a creative way to visualize time-series data?"
```

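A model registered this way can also be queried programmatically over Ollama's local HTTP API. The sketch below is a minimal example using only the Python standard library; it assumes the default Ollama host (`http://localhost:11434`), the `/api/generate` endpoint, and the `atom-v1-8b:latest` tag created above:

```python
import json
import urllib.request

def build_request(prompt, model="atom-v1-8b:latest"):
    # Payload shape for Ollama's /api/generate endpoint;
    # stream=False asks for a single JSON response instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt, host="http://localhost:11434"):
    # POST the request and return the generated text.
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

For example, `generate("What's a creative way to visualize time-series data?")` mirrors the `ollama run` invocation above.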
## Technical Specifications

- **Architecture:** Mistral-based transformer with Grouped Query Attention
- **Context Length:** 32,768 tokens
- **Vocabulary Size:** 131,072 tokens
- **Attention Heads:** 32 (8 key-value heads)
- **Hidden Dimension:** 4,096
- **Intermediate Size:** 12,288
- **LoRA Configuration:** r=16, alpha=32, targeting attention and MLP layers
- **Training:** 258 steps with bf16 precision and gradient checkpointing

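To make the LoRA and GQA figures above concrete, the short sketch below works them out from the listed dimensions. The square 4,096 x 4,096 projection is an illustrative case (roughly a query-projection matrix), not an exact inventory of the adapted layers:

```python
# Figures taken from the specification list above
hidden = 4096            # hidden dimension
n_heads, n_kv_heads = 32, 8
r, alpha = 16, 32        # LoRA rank and scaling numerator

# LoRA replaces a frozen weight update with two low-rank factors:
# A is (r x d_in) and B is (d_out x r), applied as (alpha / r) * B @ A.
def lora_params(d_in, d_out, rank):
    return rank * d_in + d_out * rank

per_square_proj = lora_params(hidden, hidden, r)  # trainable params per 4096x4096 matrix
scale = alpha / r                                 # effective LoRA scaling factor

# Grouped Query Attention shares each K/V head across several query heads,
# shrinking the KV cache relative to full multi-head attention.
kv_cache_reduction = n_heads // n_kv_heads

print(per_square_proj, scale, kv_cache_reduction)  # 131072 2.0 4
```

So each adapted 4,096 x 4,096 projection adds about 131K trainable parameters (versus ~16.8M frozen ones), and the 8 key-value heads cut KV-cache size by a factor of 4.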
## Citation

```bibtex
@software{atom_v1_8b_preview,
  title = {Atom v1 8B Preview},
  author = {VANTA Research},
  year = {2025},
  url = {https://huggingface.co/vanta-research/atom-v1-8b-preview},
  license = {CC-BY-NC-4.0}
}
```

## License

This model is released under the **Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0)**.

You are free to:

- Share and adapt the model for non-commercial purposes, provided you attribute VANTA Research as the creator

You may not:

- Use this model for commercial purposes without explicit permission

## Contact

- Organization: hello@vantaresearch.xyz
- Engineering/Design: tyler@vantaresearch.xyz

---

**Version:** Preview
**Release Date:** November 2025
**Status:** Preview release for research and evaluation