Initialize project; model provided by the ModelHub XC community

Model: prithivMLmods/Bellatrix-Tiny-1B-v2
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-04-16 17:53:04 +08:00
commit ee7dbf0716
19 changed files with 2305 additions and 0 deletions

38
.gitattributes vendored Normal file

@@ -0,0 +1,38 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
onnx/model_fp16.onnx_data filter=lfs diff=lfs merge=lfs -text
onnx/model.onnx_data filter=lfs diff=lfs merge=lfs -text

78
README.md Normal file

@@ -0,0 +1,78 @@
---
license: llama3.2
language:
- en
base_model:
- meta-llama/Llama-3.2-1B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- reason
- tiny
- llama3.2
- cot
- bellatrix
datasets:
- Magpie-Align/Magpie-Reasoning-V2-250K-CoT-QwQ
- ngxson/MiniThinky-dataset
- prithivMLmods/Deepthink-Reasoning
- Magpie-Align/Magpie-Reasoning-V1-150K-CoT-QwQ
---
![logo.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/Rqm-Qx8AvbHFFbFbVY93X.png)
<pre align="center">
____ ____ __ __ __ ____ ____ ____ _ _
( _ \( ___)( ) ( ) /__\ (_ _)( _ \(_ _)( \/ )
) _ < )__) )(__ )(__ /(__)\ )( ) / _)(_ ) (
(____/(____)(____)(____)(__)(__)(__) (_)\_)(____)(_/\_)
</pre>
# **Bellatrix-Tiny-1B-v2**
Bellatrix is a reasoning-focused model fine-tuned on QwQ synthetic dataset entries. These instruction-tuned, text-only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization, and outperform many available open-source options. Bellatrix is an auto-regressive language model built on an optimized transformer architecture; the tuned versions use supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).
# **Use with transformers**
With `transformers >= 4.43.0`, you can run conversational inference using the Transformers `pipeline` abstraction or by leveraging the Auto classes with the `generate()` function.
Make sure your transformers installation is up to date via `pip install --upgrade transformers`.
```python
import torch
from transformers import pipeline

model_id = "prithivMLmods/Bellatrix-Tiny-1B-v2"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])
```
Note: You can also find detailed recipes on how to use the model locally, with `torch.compile()`, assisted generation, quantization, and more at [`huggingface-llama-recipes`](https://github.com/huggingface/huggingface-llama-recipes).
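The same chat can also be run with the Auto classes and `generate()` mentioned above. A minimal sketch (this downloads the ~2.5 GB checkpoint on first run; the sampling values mirror `generation_config.json`):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "prithivMLmods/Bellatrix-Tiny-1B-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Render the chat into the Llama 3.2 prompt format and generate a reply
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(
    input_ids, max_new_tokens=256, do_sample=True, temperature=0.6, top_p=0.9
)
# Decode only the newly generated tokens
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```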
# **Intended Use**
Bellatrix is designed for applications that require advanced reasoning and multilingual dialogue capabilities. It is particularly suitable for:
- **Agentic Retrieval**: Enabling intelligent retrieval of relevant information in a dialogue or query-response system.
- **Summarization Tasks**: Condensing large bodies of text into concise summaries for easier comprehension.
- **Multilingual Use Cases**: Supporting conversations in multiple languages with high accuracy and coherence.
- **Instruction-Based Applications**: Following complex, context-aware instructions to generate precise outputs in a variety of scenarios.
# **Limitations**
Despite its capabilities, Bellatrix has some limitations:
1. **Domain Specificity**: While it performs well on general tasks, its performance may degrade with highly specialized or niche datasets.
2. **Dependence on Training Data**: It is only as good as the quality and diversity of its training data, which may lead to biases or inaccuracies.
3. **Computational Resources**: The model's optimized transformer architecture can be resource-intensive, requiring significant computational power for fine-tuning and inference.
4. **Language Coverage**: While multilingual, some languages or dialects may have limited support or lower performance compared to widely used ones.
5. **Real-World Contexts**: It may struggle with understanding nuanced or ambiguous real-world scenarios not covered during training.

46
config.json Normal file

@@ -0,0 +1,46 @@
{
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"eos_token_id": [
128001,
128008,
128009
],
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 8192,
"max_position_embeddings": 131072,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 16,
"num_key_value_heads": 8,
"pad_token_id": 128004,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": {
"factor": 32.0,
"high_freq_factor": 4.0,
"low_freq_factor": 1.0,
"original_max_position_embeddings": 8192,
"rope_type": "llama3"
},
"rope_theta": 500000.0,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers.js_config": {
"kv_cache_dtype": {
"fp16": "float16",
"q4f16": "float16"
}
},
"transformers_version": "4.47.1",
"use_cache": true,
"vocab_size": 128256
}
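The dimensions in `config.json` are mutually consistent; a quick pure-Python check (values copied from the file above):

```python
# Key dimensions from config.json
hidden_size = 2048
num_attention_heads = 32
num_key_value_heads = 8
head_dim = 64
num_hidden_layers = 16

# head_dim is hidden_size split evenly across the query heads
assert head_dim == hidden_size // num_attention_heads

# Grouped-query attention: 32 query heads share 8 KV heads
queries_per_kv = num_attention_heads // num_key_value_heads
print(queries_per_kv)  # 4

# KV cache elements per token: 2 (K and V) * layers * kv_heads * head_dim,
# a 4x reduction versus full multi-head attention
kv_elems_per_token = 2 * num_hidden_layers * num_key_value_heads * head_dim
print(kv_elems_per_token)  # 16384
```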

1
configuration.json Normal file

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}

14
generation_config.json Normal file

@@ -0,0 +1,14 @@
{
"bos_token_id": 128000,
"do_sample": true,
"eos_token_id": [
128001,
128008,
128009
],
"max_length": 131072,
"pad_token_id": 128004,
"temperature": 0.6,
"top_p": 0.9,
"transformers_version": "4.47.1"
}
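`generation_config.json` defaults to sampling with temperature 0.6 and nucleus (top-p) filtering at 0.9. A minimal sketch of what those two knobs do to raw logits (toy values, not the model's actual logits):

```python
import math

def sample_filter(logits, temperature=0.6, top_p=0.9):
    """Apply temperature scaling, then nucleus (top-p) filtering, to raw logits."""
    # Temperature < 1 sharpens the distribution
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Keep the smallest set of tokens whose cumulative probability reaches top_p
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalize over the surviving tokens
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

dist = sample_filter([2.0, 1.0, 0.2, -1.0])
print(sorted(dist))  # [0, 1] -- only the two most likely tokens survive
```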

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:15f7de2e8d9c7a8db87691f36066bf76ac5b276400b88c9f0ed10ebfc21a1f20
size 2471645608
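The three lines above are a Git LFS pointer file, not the weights themselves: a regular clone stores this small text stub, and `git lfs pull` replaces it with the real blob. A small parser sketch (hypothetical helper, pointer text copied from above):

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:15f7de2e8d9c7a8db87691f36066bf76ac5b276400b88c9f0ed10ebfc21a1f20
size 2471645608
"""
info = parse_lfs_pointer(pointer)
algo, digest = info["oid"].split(":")
# The oid lets you verify the downloaded blob's integrity (sha256, ~2.47 GB)
print(algo, int(info["size"]))
```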

3
onnx/model.onnx Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5979818b25758b4eb68a68c1019d19b7fefcdfb6c5d0c9b4f40b9bb7c6a0d1f8
size 291652

3
onnx/model.onnx_data Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:da1afbbd051a785ce18222ba53e02b79a1764a6923cdab2a46ceb0e3c993da41
size 4976812032

3
onnx/model_bnb4.onnx Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2edd5e43ff1a83c5ef49dc6b585ff587db9064498ce591d2105e102ab425938e
size 1632159815

3
onnx/model_fp16.onnx Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:168c8d58f64d122af7061a69dcada5465ec3af63b866aaf19b25f47c834bed71
size 398882714

3
onnx/model_fp16.onnx_data Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9c7717e320e4f5fccf344cdc8956dcf66aede9826f8f34b92b4ac300578d1346
size 2089811968

3
onnx/model_int8.onnx Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:85d79af17aa267a2d5e7c870944b942cfe32b8200d2d896f3597c3ddb9a3d3a2
size 1269982484

3
onnx/model_q4.onnx Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:24eceb73708302ba138fbbc8313d0207789f7881db89d2c67c612650b7682080
size 1692976359

3
onnx/model_q4f16.onnx Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7f9ea154a73edc51c78c09f4ee299807365481d141d0bc1c3daf6b26092ffc64
size 1089909914


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:16c07caf25d766351ba5cd35ed5970ff141b18efafdfa4bb601cf28faf5c3d2e
size 1269982544

3
onnx/model_uint8.onnx Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:16c07caf25d766351ba5cd35ed5970ff141b18efafdfa4bb601cf28faf5c3d2e
size 1269982544

23
special_tokens_map.json Normal file

@@ -0,0 +1,23 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|eot_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|finetune_right_pad_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

3
tokenizer.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:84a2ead05482bb55f7b2c440aaa5a1d3df7d5e17041948c9bc052f7863229cb5
size 17209886

2069
tokenizer_config.json Normal file

File diff suppressed because it is too large