---
license: cc-by-nc-4.0
language:
- en
base_model:
- mistralai/Ministral-8B-Instruct-2410
base_model_relation: finetune
pipeline_tag: text-generation
library_name: transformers
tags:
- alignment
- conversational-ai
- conversational
- collaborate
- chat
- cognitive-architectures
- large-language-model
- research
- persona
- ai-persona-research
- friendly
- reasoning
- chatbot
- vanta-research
- LLM
- collaborative-ai
- frontier
- reflective
- ai-research
- ai-alignment-research
- ai-alignment
- ai-behavior
- ai-behavior-research
---

<div align="center">

![Atom](atom_1.png)

<h1>VANTA Research</h1>

<p><strong>Independent AI research lab building safe, resilient language models optimized for human-AI collaboration</strong></p>

<p>
<a href="https://vantaresearch.xyz"><img src="https://img.shields.io/badge/Website-vantaresearch.xyz-black" alt="Website"/></a>
<a href="https://unmodeledtyler.com/work-with-vanta-research"><img src="https://img.shields.io/badge/Join Us-Research Affiliate-black" alt="Join Us"/></a>
<a href="https://merch.vantaresearch.xyz"><img src="https://img.shields.io/badge/Merch-merch.vantaresearch.xyz-sage" alt="Merch"/></a>
<a href="https://x.com/vanta_research"><img src="https://img.shields.io/badge/@vanta_research-1DA1F2?logo=x" alt="X"/></a>
<a href="https://github.com/vanta-research"><img src="https://img.shields.io/badge/GitHub-vanta--research-181717?logo=github" alt="GitHub"/></a>
</p>

</div>

---

# Atom v1 8B Preview

**Developed by VANTA Research**

Atom v1 8B Preview is a fine-tuned language model designed to serve as a collaborative thought partner. Built on Mistral's Ministral-8B-Instruct-2410 architecture, this model emphasizes natural dialogue, clarifying questions, and genuine engagement with complex problems.

This model was developed as part of a larger research and development project into Atom's persona and cross-architectural compatibility.

## Model Details

- **Model Type:** Causal language model (decoder-only transformer)
- **Base Model:** mistralai/Ministral-8B-Instruct-2410
- **Parameters:** 8 billion
- **Training Method:** Low-Rank Adaptation (LoRA) fine-tuning
- **License:** CC BY-NC 4.0 (Non-Commercial Use)
- **Language:** English
- **Developed by:** VANTA Research, Portland, Oregon

## Intended Use

Atom v1 8B Preview is designed for:

- Collaborative problem-solving and brainstorming
- Technical explanations with accessible analogies
- Code assistance and algorithmic reasoning
- Exploratory conversations that prioritize understanding over immediate answers
- Educational contexts requiring thoughtful dialogue

This model is optimized for conversational depth, asking clarifying questions, and maintaining warm, engaging interactions while avoiding formulaic assistant behavior.

## Training Data

The model was fine-tuned on a curated dataset comprising:

- Identity and persona examples emphasizing collaborative exploration
- Technical reasoning and coding challenges
- Multi-step problem-solving scenarios
- Conversational examples demonstrating warmth and curiosity
- Advanced coding tasks and algorithmic thinking

Training focused on developing a distinctive voice that balances technical competence with genuine engagement.

## Performance Characteristics

Atom v1 8B demonstrates strong capabilities in:

- **Persona Consistency:** Maintains a collaborative, warm tone across diverse topics
- **Technical Explanation:** Uses metaphors and analogies to clarify complex concepts
- **Clarifying Questions:** Actively seeks to understand user intent and context
- **Creative Thinking:** Generates multiple frameworks and approaches to problems
- **Code Generation:** Produces working code with explanatory context
- **Reasoning:** Applies logical frameworks to abstract problems

## Limitations

- **Scale:** As an 8B-parameter model, its capabilities are constrained compared to larger frontier models
- **Domain Specificity:** Optimized for conversational collaboration; may underperform on narrow technical benchmarks
- **Quantization Trade-offs:** The Q4_0 GGUF format prioritizes efficiency over maximum precision
- **Training Data:** The fine-tuning dataset's size limits exposure to highly specialized domains
- **Factual Accuracy:** Users should verify critical information independently

## Ethical Considerations

This model is released for research and non-commercial applications. Users should:

- Verify outputs in high-stakes scenarios
- Avoid deploying the model in contexts requiring guaranteed accuracy
- Consider potential biases inherited from the base model and training data
- Respect the non-commercial license terms

## Usage

### Hugging Face Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "vanta-research/atom-v1-8b-preview"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

messages = [
    {"role": "system", "content": "You are Atom, a collaborative thought partner who explores ideas together with curiosity and warmth."},
    {"role": "user", "content": "Can you explain how gradient descent works?"}
]

input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
# Enable sampling so the temperature setting takes effect
output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

### Ollama (GGUF)

The repository includes `atom-ministral-8b-q4_0.gguf` for efficient local inference:

```bash
# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./atom-ministral-8b-q4_0.gguf

TEMPLATE """{{- if .System }}<s>[INST] <<SYS>>
{{ .System }}
<<SYS>>

{{ .Prompt }}[/INST]{{ else }}<s>[INST]{{ .Prompt }}[/INST]{{ end }}{{ .Response }}</s>
"""

PARAMETER stop "</s>"
PARAMETER temperature 0.8
PARAMETER top_p 0.9
PARAMETER top_k 40

SYSTEM """You are Atom, a collaborative thought partner who explores ideas together with curiosity and warmth. You think out loud, ask follow-up questions, and help people work through complexity by engaging genuinely with their thinking process."""
EOF

# Register with Ollama
ollama create atom-v1-8b:latest -f Modelfile

# Run inference
ollama run atom-v1-8b:latest "What's a creative way to visualize time-series data?"
```

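A model registered this way can also be queried programmatically over Ollama's local HTTP API. The sketch below is a minimal example using only the Python standard library; it assumes the default Ollama host (`http://localhost:11434`), the `/api/generate` endpoint, and the `atom-v1-8b:latest` tag created above:

```python
import json
import urllib.request

def build_request(prompt, model="atom-v1-8b:latest"):
    # Payload shape for Ollama's /api/generate endpoint;
    # stream=False asks for a single JSON response instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt, host="http://localhost:11434"):
    # POST the request and return the generated text.
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

For example, `generate("What's a creative way to visualize time-series data?")` mirrors the `ollama run` invocation above.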
## Technical Specifications

- **Architecture:** Mistral-based transformer with Grouped Query Attention
- **Context Length:** 32,768 tokens
- **Vocabulary Size:** 131,072 tokens
- **Attention Heads:** 32 (8 key-value heads)
- **Hidden Dimension:** 4,096
- **Intermediate Size:** 12,288
- **LoRA Configuration:** r=16, alpha=32, targeting attention and MLP layers
- **Training:** 258 steps with bf16 precision and gradient checkpointing

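To make the LoRA and GQA figures above concrete, the short sketch below works them out from the listed dimensions. The square 4,096 x 4,096 projection is an illustrative case (roughly a query-projection matrix), not an exact inventory of the adapted layers:

```python
# Figures taken from the specification list above
hidden = 4096            # hidden dimension
n_heads, n_kv_heads = 32, 8
r, alpha = 16, 32        # LoRA rank and scaling numerator

# LoRA replaces a frozen weight update with two low-rank factors:
# A is (r x d_in) and B is (d_out x r), applied as (alpha / r) * B @ A.
def lora_params(d_in, d_out, rank):
    return rank * d_in + d_out * rank

per_square_proj = lora_params(hidden, hidden, r)  # trainable params per 4096x4096 matrix
scale = alpha / r                                 # effective LoRA scaling factor

# Grouped Query Attention shares each K/V head across several query heads,
# shrinking the KV cache relative to full multi-head attention.
kv_cache_reduction = n_heads // n_kv_heads

print(per_square_proj, scale, kv_cache_reduction)  # 131072 2.0 4
```

So each adapted 4,096 x 4,096 projection adds about 131K trainable parameters (versus ~16.8M frozen ones), and the 8 key-value heads cut KV-cache size by a factor of 4.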
## Citation

```bibtex
@software{atom_v1_8b_preview,
  title = {Atom v1 8B Preview},
  author = {VANTA Research},
  year = {2025},
  url = {https://huggingface.co/vanta-research/atom-v1-8b-preview},
  license = {CC-BY-NC-4.0}
}
```

## License

This model is released under the **Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0)**.

You are free to:

- Share and adapt the model for non-commercial purposes, provided you attribute VANTA Research as the creator

You may not:

- Use this model for commercial purposes without explicit permission

## Contact

- Organization: hello@vantaresearch.xyz
- Engineering/Design: tyler@vantaresearch.xyz

---

**Version:** Preview
**Release Date:** November 2025
**Status:** Preview release for research and evaluation