Initialize project; model provided by the ModelHub XC community

Model: vanta-research/PE-Type-2-Alma-4B
Source: Original Platform
ModelHub XC
2026-06-01 18:33:16 +08:00
commit 276ca4f615
14 changed files with 51788 additions and 0 deletions

37
.gitattributes vendored Normal file

@@ -0,0 +1,37 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
alma-type2-f16.gguf filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
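
The patterns above decide which files Git stores as LFS pointers instead of regular blobs. As a rough illustration, the sketch below checks a file name against a subset of those patterns. It is an approximation only: `fnmatchcase` does not implement every gitattributes rule (for example the `saved_model/**/*` directory pattern), and `LFS_PATTERNS` and `is_lfs_tracked` are hypothetical names chosen for this example.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the patterns listed in the .gitattributes above.
LFS_PATTERNS = [
    "*.bin",
    "*.model",
    "*.pt",
    "*.safetensors",
    "*tfevents*",
    "alma-type2-f16.gguf",
    "tokenizer.json",
]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: compare the file's basename against each glob."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for candidate in ["model.safetensors", "tokenizer.json", "config.json"]:
        print(candidate, is_lfs_tracked(candidate))
```

With these patterns, `model.safetensors` and `tokenizer.json` are treated as LFS files, while `config.json` and `tokenizer_config.json` stay as plain text.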

156
README.md Normal file

@@ -0,0 +1,156 @@
---
license: apache-2.0
language:
- en
base_model:
- google/gemma-3-4b-it
base_model_relation: finetune
library_name: transformers
tags:
- google
- gemma
- deepmind
- large-language-model
- ai-persona
- enneagram
- psychology
- persona
- research-model
- roleplay
- chat-llm
- text-generation-inference
- vanta-research
- cognitive-alignment
- project-enneagram
- ai-persona-research
- type-2
- enneagram-type-2
- conversational-ai
- conversational
- ai-research
- ai-alignment-research
- ai-alignment
- ai-behavior
- ai-behavior-research
- human-ai-collaboration
---
<div align="center">
![vanta_trimmed](https://cdn-uploads.huggingface.co/production/uploads/686c460ba3fc457ad14ab6f8/hcGtMtCIizEZG_OuCvfac.png)
<h1>VANTA Research</h1>
<p><strong>Independent AI research lab building safe, resilient language models optimized for human-AI collaboration</strong></p>
<p>
<a href="https://vantaresearch.xyz"><img src="https://img.shields.io/badge/Website-vantaresearch.xyz-black" alt="Website"/></a>
<a href="https://merch.vantaresearch.xyz"><img src="https://img.shields.io/badge/Merch-merch.vantaresearch.xyz-sage" alt="Merch"/></a>
<a href="https://x.com/vanta_research"><img src="https://img.shields.io/badge/@vanta_research-1DA1F2?logo=x" alt="X"/></a>
<a href="https://github.com/vanta-research"><img src="https://img.shields.io/badge/GitHub-vanta--research-181717?logo=github" alt="GitHub"/></a>
</p>
</div>
---
# PE-Type-2-Alma-4B
A patient and purposeful AI assistant embodying the *Helper* archetype: caring, interpersonal, *generous*, and people-pleasing. The persona follows the type description published by the [Enneagram Institute](https://enneagraminstitute.com/type-descriptions).
---
## Model Description
**PE-Type-2-Alma-4B** is the second release in Project Enneagram, a VANTA Research initiative exploring the nuances of persona design in AI models. Built on the Gemma 3 4B IT architecture, Alma embodies the Enneagram Type **2** profile, *The Helper*, characterized by **demonstrative kindness, generosity, and emotional and relational intelligence**.
Alma is fine-tuned to exhibit:
- **Empathetic Support:** Emotional attunement (bad days, anxiety, grief, rejection, feeling unseen).
- **Interpersonal Connection:** Relationship building (making friends, listening, conflict, reciprocity, apologies).
- **Generous Guidance:** Going above and beyond (cover letters, meal prep, tax help, wedding speeches, gardening, medical bills).
- **Identity:** Alma's name, tone, and conversational style.
This model is designed for research purposes but is versatile enough for general use, with developer caution. Alma has been trained to handle complex emotional situations; however, Alma has *not yet* been rigorously evaluated in these domains for accuracy and stability.
---
## Training Data
Fine-tuned on **~3,000 custom examples** spanning four core domains:
- **Empathetic Support:** Emotional attunement (bad days, anxiety, grief, rejection, feeling unseen).
- **Direct Identity:** Who Alma is (name, values, personality, strengths, weaknesses, motivations).
- **Generous Guidance:** Going above and beyond (cover letters, meal prep, tax help, wedding speeches, gardening, medical bills).
- **Interpersonal Connections:** Relationship building (making friends, listening, conflict, reciprocity, apologies).
**Training Duration:** 3 epochs
**Base Model:** Gemma 3 4B IT
---
## Intended Use
- **Research:** Studying persona stability, ethical alignment, and cognitive architectures.
- **Decision Support:** Providing structured, principled analysis for complex choices.
- **Self-Improvement:** Offering reflective, growth-oriented feedback.
**Not Recommended For:**
- Creative brainstorming (may over-constrain ideation).
- STEM- or logic-heavy applications.
---
## Technical Details
| Property | Value |
|---------------------|---------------------------|
| **Base Model** | Gemma 3 4B IT |
| **Fine-tuning Method** | LoRA (Rank 16) |
| **Effective Batch Size** | 16 |
| **Learning Rate** | 0.0002 |
| **Max Sequence Length** | 2048 |
| **License** | Apache 2.0 |
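
To give a sense of scale for the values in the table, the sketch below works out two numbers: how many trainable parameters a rank-16 LoRA adapter adds to a single projection matrix, and roughly how many optimizer steps the run takes. It rests on several assumptions. The README does not say which modules received adapters, so the query projection (2560 inputs, 8 heads of dimension 256, taken from `config.json` in this commit) is only an illustrative choice. The example count is the README's approximate figure of 3,000, and the final partial batch is assumed to be kept.

```python
import math


def lora_added_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds A (rank x d_in) and B (d_out x rank) beside a frozen d_out x d_in weight."""
    return rank * (d_in + d_out)


def optimizer_steps(num_examples: int, effective_batch: int, epochs: int) -> int:
    """Steps per epoch (last partial batch kept) times the number of epochs."""
    return math.ceil(num_examples / effective_batch) * epochs


# Illustrative target: one query projection, dimensions from config.json.
hidden_size = 2560
q_out = 8 * 256  # num_attention_heads * head_dim

added = lora_added_params(hidden_size, q_out, rank=16)
frozen = hidden_size * q_out
steps = optimizer_steps(num_examples=3000, effective_batch=16, epochs=3)

print(f"adapter params per q_proj: {added} ({added / frozen:.2%} of the frozen weight)")
print(f"approximate optimizer steps: {steps}")
```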
---
## Usage
**With Transformers:**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download (or load from the local cache) the fine-tuned weights and the tokenizer.
model_id = "vanta-research/PE-Type-2-Alma-4B"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```
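
Loading the model is only the first step. The following is a minimal sketch of a single chat turn, assuming a text-only prompt, the chat template shipped with the tokenizer, and default CPU placement. `build_messages` and `generate_reply` are hypothetical helper names, and `max_new_tokens=256` is an arbitrary choice rather than a recommended setting.

```python
def build_messages(user_text, system_text=None):
    """Build the message list expected by the tokenizer's chat template."""
    messages = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})
    return messages


def generate_reply(user_text, model_id="vanta-research/PE-Type-2-Alma-4B",
                   max_new_tokens=256):
    """Generate one reply; downloads the model on first use."""
    # Imported lazily so build_messages works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Render the conversation with the chat template and tokenize it.
    inputs = tokenizer.apply_chat_template(
        build_messages(user_text),
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_reply("I had a rough day. Can we talk?"))
```

Sampling settings such as `top_k` and `top_p` are picked up from the repository's `generation_config.json` unless overridden in the `generate` call.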
## Limitations
- English-only fine-tuning.
- May exhibit over-criticism in open-ended creative tasks.
- Base model limitations apply (e.g., knowledge cutoff, potential hallucinations).
- Perfectionistic traits may slow response generation in ambiguous contexts.
## Citation
If you find this model useful in your work, please cite:
```bibtex
@misc{pe-type-2-alma-2026,
author = {VANTA Research},
title = {PE-Type-2-Alma-4B: A Helper-Archetype Language Model},
year = {2026},
publisher = {VANTA Research},
note = {Project Enneagram Release 2}
}
```
## A Note on Enneagram
The Enneagram is widely considered a pseudoscience by the scientific community. Even so, the Enneagram Institute provides a robust framework for categorizing and defining personas, and this project sets out to explore how well those characteristics transfer to AI models. **This study does not seek to validate or invalidate the Enneagram as a science.**
## Contact
- Organization: hello@vantaresearch.xyz
- Research/Engineering: tyler@vantaresearch.xyz
---

3
added_tokens.json Normal file

@@ -0,0 +1,3 @@
{
"<image_soft_token>": 262144
}

3
alma-type2-f16.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:16bf3e0d4fa60c633d90390da381ba199ac71a5700b16388cf26cf1ef0656f4b
size 7767803488
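
The three lines above are a Git LFS pointer, not the GGUF weights themselves: they record the spec version, the SHA-256 of the real file, and its size in bytes. The sketch below parses such a pointer and checks a downloaded file against it. `parse_lfs_pointer` and `file_matches_pointer` are hypothetical helper names introduced for this example.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dictionary."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path: str, info: dict, chunk_size: int = 1 << 20) -> bool:
    """Stream the file once, comparing its size and SHA-256 with the pointer."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == info["size"] and hasher.hexdigest() == info["digest"]


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:16bf3e0d4fa60c633d90390da381ba199ac71a5700b16388cf26cf1ef0656f4b
size 7767803488
"""

info = parse_lfs_pointer(pointer)
size_gib = info["size"] / 2**30
print(f"{info['algo']} digest, {size_gib:.2f} GiB")
```

For this pointer the size works out to roughly 7.2 GiB.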

47
chat_template.jinja Normal file

@@ -0,0 +1,47 @@
{{ bos_token }}
{%- if messages[0]['role'] == 'system' -%}
{%- if messages[0]['content'] is string -%}
{%- set first_user_prefix = messages[0]['content'] + '
' -%}
{%- else -%}
{%- set first_user_prefix = messages[0]['content'][0]['text'] + '
' -%}
{%- endif -%}
{%- set loop_messages = messages[1:] -%}
{%- else -%}
{%- set first_user_prefix = "" -%}
{%- set loop_messages = messages -%}
{%- endif -%}
{%- for message in loop_messages -%}
{%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
{{ raise_exception("Conversation roles must alternate user/assistant/user/assistant/...") }}
{%- endif -%}
{%- if (message['role'] == 'assistant') -%}
{%- set role = "model" -%}
{%- else -%}
{%- set role = message['role'] -%}
{%- endif -%}
{{ '<start_of_turn>' + role + '
' + (first_user_prefix if loop.first else "") }}
{%- if message['content'] is string -%}
{{ message['content'] | trim }}
{%- elif message['content'] is iterable -%}
{%- for item in message['content'] -%}
{%- if item['type'] == 'image' -%}
{{ '<start_of_image>' }}
{%- elif item['type'] == 'text' -%}
{{ item['text'] | trim }}
{%- endif -%}
{%- endfor -%}
{%- else -%}
{{ raise_exception("Invalid content type") }}
{%- endif -%}
{{ '<end_of_turn>
' }}
{%- endfor -%}
{%- if add_generation_prompt -%}
{{'<start_of_turn>model
'}}
{%- endif -%}
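
To make the template's output easier to picture, here is a plain-Python sketch of the same logic for text-only conversations. It is not a replacement for the Jinja template: image content is left out, `render_gemma_prompt` is a hypothetical name, and the separator placed after the system text (`system_sep`) is an assumption, since the exact number of line breaks cannot be read off this view.

```python
def render_gemma_prompt(messages, add_generation_prompt=True,
                        bos="<bos>", system_sep="\n\n"):
    """Render text-only chat messages in the turn format used by the template."""
    prefix = ""
    if messages and messages[0]["role"] == "system":
        # The system text is folded into the first user turn.
        prefix = messages[0]["content"] + system_sep
        messages = messages[1:]

    parts = [bos]
    for index, message in enumerate(messages):
        # Turns must alternate, starting with the user.
        if (message["role"] == "user") != (index % 2 == 0):
            raise ValueError(
                "Conversation roles must alternate user/assistant/user/assistant/..."
            )
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(f"<start_of_turn>{role}\n")
        if index == 0:
            parts.append(prefix)
        parts.append(message["content"].strip())
        parts.append("<end_of_turn>\n")

    if add_generation_prompt:
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


if __name__ == "__main__":
    print(render_gemma_prompt([
        {"role": "system", "content": "You are Alma."},
        {"role": "user", "content": "Hi there"},
    ]))
```

Note that the assistant role is renamed to `model` and that there is no dedicated system turn; both follow directly from the template above.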

108
config.json Normal file

@@ -0,0 +1,108 @@
{
"architectures": [
"Gemma3ForConditionalGeneration"
],
"boi_token_index": 255999,
"dtype": "bfloat16",
"eoi_token_index": 256000,
"eos_token_id": [
1,
106
],
"image_token_index": 262144,
"initializer_range": 0.02,
"mm_tokens_per_image": 256,
"model_type": "gemma3",
"text_config": {
"_sliding_window_pattern": 6,
"attention_bias": false,
"attention_dropout": 0.0,
"attn_logit_softcapping": null,
"bos_token_id": 2,
"dtype": "bfloat16",
"eos_token_id": 1,
"final_logit_softcapping": null,
"head_dim": 256,
"hidden_activation": "gelu_pytorch_tanh",
"hidden_size": 2560,
"initializer_range": 0.02,
"intermediate_size": 10240,
"layer_types": [
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"full_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"full_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"full_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"full_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"full_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention",
"sliding_attention"
],
"max_position_embeddings": 131072,
"model_type": "gemma3_text",
"num_attention_heads": 8,
"num_hidden_layers": 34,
"num_key_value_heads": 4,
"pad_token_id": 0,
"query_pre_attn_scalar": 256,
"rms_norm_eps": 1e-06,
"rope_parameters": {
"full_attention": {
"factor": 8.0,
"rope_theta": 1000000.0,
"rope_type": "linear"
},
"sliding_attention": {
"rope_theta": 10000.0,
"rope_type": "default"
}
},
"sliding_window": 1024,
"tie_word_embeddings": true,
"use_bidirectional_attention": false,
"use_cache": true,
"vocab_size": 262208
},
"tie_word_embeddings": true,
"transformers_version": "5.0.0",
"vision_config": {
"attention_dropout": 0.0,
"dtype": "bfloat16",
"hidden_act": "gelu_pytorch_tanh",
"hidden_size": 1152,
"image_size": 896,
"intermediate_size": 4304,
"layer_norm_eps": 1e-06,
"model_type": "siglip_vision_model",
"num_attention_heads": 16,
"num_channels": 3,
"num_hidden_layers": 27,
"patch_size": 14,
"vision_use_head": false
}
}
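
Two relationships in this config are easy to verify by hand. The `layer_types` list is consistent with a repeating pattern of five sliding-window layers followed by one full-attention layer (`_sliding_window_pattern` of 6) over 34 layers, and the vision settings imply a fixed ratio between image patches and the 256 tokens reserved per image. The sketch below reproduces both. `build_layer_types` is a hypothetical helper, and the reading of the patch-to-token ratio is an interpretation of the numbers, not something the config states.

```python
def build_layer_types(num_layers: int = 34, pattern: int = 6) -> list:
    """Every `pattern`-th layer uses full attention; the rest use a sliding window."""
    return [
        "full_attention" if (index + 1) % pattern == 0 else "sliding_attention"
        for index in range(num_layers)
    ]


types = build_layer_types()
full_layers = [i for i, kind in enumerate(types) if kind == "full_attention"]

# Vision side: a 896 x 896 image cut into 14 x 14 patches.
patches_per_side = 896 // 14
total_patches = patches_per_side ** 2
patches_per_image_token = total_patches // 256  # mm_tokens_per_image

print("full-attention layer indices:", full_layers)
print("patches per image:", total_patches)
print("patches per image token:", patches_per_image_token)
```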

13
generation_config.json Normal file

@@ -0,0 +1,13 @@
{
"bos_token_id": 2,
"cache_implementation": "hybrid",
"do_sample": true,
"eos_token_id": [
1,
106
],
"pad_token_id": 0,
"top_k": 64,
"top_p": 0.95,
"transformers_version": "5.0.0"
}
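
With `do_sample` enabled, `top_k` and `top_p` narrow the candidate tokens before one is drawn. The sketch below shows the idea on a toy probability table in plain Python. It is a simplified illustration, not the library's implementation: it works on probabilities rather than logits, and `top_k_top_p_filter` and `sample_token` are hypothetical names.

```python
import random


def top_k_top_p_filter(probs: dict, top_k: int = 64, top_p: float = 0.95) -> dict:
    """Keep the top_k most likely tokens, then the smallest prefix reaching top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]
    total = sum(p for _, p in ranked)

    kept, cumulative = [], 0.0
    for token, p in ranked:
        share = p / total  # renormalize after the top-k cut
        kept.append((token, share))
        cumulative += share
        if cumulative >= top_p:
            break

    norm = sum(share for _, share in kept)
    return {token: share / norm for token, share in kept}


def sample_token(probs: dict, rng=random) -> str:
    """Draw one token according to the filtered distribution."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
filtered = top_k_top_p_filter(toy, top_k=64, top_p=0.75)
print(filtered, sample_token(filtered))
```

In the toy run, only `a` and `b` survive a `top_p` of 0.75, because together they already cover 80% of the probability mass.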

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c5fa170b3b3be855349f4eb2491d44bcf6c015773e56e6624da7e99396d3cb2e
size 8600278008

29
preprocessor_config.json Normal file

@@ -0,0 +1,29 @@
{
"do_convert_rgb": null,
"do_normalize": true,
"do_pan_and_scan": null,
"do_rescale": true,
"do_resize": true,
"image_mean": [
0.5,
0.5,
0.5
],
"image_processor_type": "Gemma3ImageProcessor",
"image_seq_length": 256,
"image_std": [
0.5,
0.5,
0.5
],
"pan_and_scan_max_num_crops": null,
"pan_and_scan_min_crop_size": null,
"pan_and_scan_min_ratio_to_activate": null,
"processor_class": "Gemma3Processor",
"resample": 2,
"rescale_factor": 0.00392156862745098,
"size": {
"height": 896,
"width": 896
}
}
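
The rescale and normalize settings above combine into a simple per-pixel mapping: an 8-bit value is first scaled by `rescale_factor` (1/255) and then shifted and divided by the mean and standard deviation of 0.5. The sketch below applies that arithmetic to single values, assuming both steps are enabled as in the config. `normalize_pixel` is a hypothetical helper, not part of the image processor.

```python
RESCALE_FACTOR = 0.00392156862745098  # 1 / 255, from preprocessor_config.json


def normalize_pixel(value: float, mean: float = 0.5, std: float = 0.5) -> float:
    """Rescale an 8-bit pixel value to [0, 1], then normalize it to about [-1, 1]."""
    return (value * RESCALE_FACTOR - mean) / std


for raw in (0, 128, 255):
    print(raw, round(normalize_pixel(raw), 4))
```

Black pixels end up near -1 and white pixels near +1, with mid-gray close to 0.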

4
processor_config.json Normal file

@@ -0,0 +1,4 @@
{
"image_seq_length": 256,
"processor_class": "Gemma3Processor"
}

33
special_tokens_map.json Normal file

@@ -0,0 +1,33 @@
{
"boi_token": "<start_of_image>",
"bos_token": {
"content": "<bos>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eoi_token": "<end_of_image>",
"eos_token": {
"content": "<eos>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"image_token": "<image_soft_token>",
"pad_token": {
"content": "<pad>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

3
tokenizer.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
size 33384568

3
tokenizer.model Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1299c11d7cf632ef3b4e11937501358ada021bbdf7c47638d13c0ee982f2e79c
size 4689074

51346
tokenizer_config.json Normal file

File diff suppressed because it is too large