Initialize project; model provided by the ModelHub XC community

Model: prithivMLmods/Oganesson-TinyLlama-1.2B
Source: Original Platform
Author: ModelHub XC
Date: 2026-05-04 05:28:53 +08:00
commit 0b6a23d3b9
10 changed files with 2324 additions and 0 deletions

.gitattributes (vendored, Normal file, 49 lines)

@@ -0,0 +1,49 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zstandard filter=lfs diff=lfs merge=lfs -text
*.tfevents* filter=lfs diff=lfs merge=lfs -text
*.db* filter=lfs diff=lfs merge=lfs -text
*.ark* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*data* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.index filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.gguf* filter=lfs diff=lfs merge=lfs -text
*.ggml filter=lfs diff=lfs merge=lfs -text
*.llamafile* filter=lfs diff=lfs merge=lfs -text
*.pt2 filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
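Each pattern above routes matching files through Git LFS (`filter=lfs diff=lfs merge=lfs`), so Git stores only a small pointer file in place of the binary. As a rough sketch, you can check which file names a pattern set would capture with Python's `fnmatch`; the list below is an abridged subset of the rules above, and fnmatch only approximates gitattributes glob semantics (the `**/` patterns behave differently):

```python
from fnmatch import fnmatch

# Abridged subset of the LFS patterns above; for simple one-level globs,
# fnmatch behaves like gitattributes matching (the `**/` patterns differ).
lfs_patterns = ["*.bin", "*.safetensors", "*.gguf*", "*tfevents*", "tokenizer.json"]

def stored_in_lfs(filename: str) -> bool:
    """True if any LFS pattern matches, so Git keeps only an LFS pointer."""
    return any(fnmatch(filename, pattern) for pattern in lfs_patterns)

print(stored_in_lfs("model.safetensors"))  # True: tracked by LFS
print(stored_in_lfs("config.json"))        # False: stored as a normal blob
```

This matches the repository contents below: `model.safetensors` and `tokenizer.json` are LFS pointer files, while the JSON configs are ordinary blobs.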

README.md (Normal file, 114 lines)

@@ -0,0 +1,114 @@
---
library_name: transformers
tags:
- text-generation-inference
- code
- llama-3.2
- math
- general-purpose
license: llama3.2
language:
- en
base_model:
- meta-llama/Llama-3.2-1B
pipeline_tag: text-generation
---
![8.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/K_bYZlzTOZjl5YJnjEy8j.png)

# **Oganesson-TinyLlama-1.2B**

> **Oganesson-TinyLlama-1.2B** is a lightweight and efficient language model built on the **LLaMA 3.2 1.2B** architecture. Fine-tuned for **general-purpose inference**, **mathematical reasoning**, and **code generation**, it's ideal for edge devices, personal assistants, and educational applications requiring a compact yet capable model.

> [!NOTE]
> GGUF: [https://huggingface.co/prithivMLmods/Oganesson-TinyLlama-1.2B-GGUF](https://huggingface.co/prithivMLmods/Oganesson-TinyLlama-1.2B-GGUF)

---

## **Key Features**

1. **LLaMA 3.2 1.2B Core**
   Powered by the **TinyLLaMA (1.2B)** variant of Meta's LLaMA 3.2, offering modern instruction-following and multilingual capabilities in a very small footprint.
2. **Modular Fine-Tuning**
   Trained on a handcrafted modular dataset covering general-purpose reasoning, programming problems, and mathematical challenges.
3. **Mathematical Competence**
   Solves equations, explains concepts, and performs symbolic logic in algebra, geometry, and calculus, making it well suited to lightweight tutoring use cases.
4. **Code Understanding & Generation**
   Produces clean, interpretable code in Python, JavaScript, and more. Useful for micro-agents, code assistants, and embedded development tools.
5. **Versatile Output Formats**
   Handles JSON, Markdown, LaTeX, and structured data output, enabling integration into tools and platforms that need formatted results.
6. **Edge-Optimized**
   At only 1.2B parameters, this model is built for **local inference**, **on-device usage**, and **battery-efficient environments**.

---
## **Quickstart with Transformers**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Oganesson-TinyLlama-1.2B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Write a Python function to compute the Fibonacci sequence."
messages = [
    {"role": "system", "content": "You are a helpful coding and math assistant."},
    {"role": "user", "content": prompt}
]

# Render the chat template into a single prompt string.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)

# Strip the prompt tokens so only the newly generated text is decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
---

## **Intended Use**

* Lightweight reasoning for embedded and edge AI
* Basic math tutoring and symbolic computation
* Code generation and explanation for small apps
* Technical content in Markdown, JSON, and LaTeX
* Educational tools, personal agents, and low-power deployments

---

## **Limitations**

* Smaller context window than 7B+ models
* Less suitable for abstract reasoning or creative writing
* May require prompt engineering for complex technical queries
* Knowledge is limited to pretraining and fine-tuning datasets

---

## **References**

1. [LLaMA 3 Technical Report (Meta)](https://ai.meta.com/llama/)
2. [YaRN: Efficient Context Window Extension of Large Language Models](https://arxiv.org/pdf/2309.00071)

chat_template.jinja (Normal file, 7 lines)

@@ -0,0 +1,7 @@
{{bos_token}}{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|start_header_id|>system<|end_header_id|>
You are MiniThinky, a helpful AI assistant. You always think before giving the answer. Use <|thinking|> before thinking and <|answer|> before giving the answer.<|eot_id|>' }}{% endif %}{{ '<|start_header_id|>' + message['role'] + '<|end_header_id|>
' + message['content'] + '<|eot_id|>' }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>
' }}{% endif %}
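The template above injects a default system prompt whenever the caller does not supply one, then wraps each turn in LLaMA-3-style header tokens. A quick way to see the rendered prompt format is to feed the same template string to `jinja2` directly (the file's literal newlines are written as `\n` escapes here):

```python
from jinja2 import Template

# The template from chat_template.jinja, with its literal newlines
# written as \n escapes inside the Jinja string literals.
CHAT_TEMPLATE = (
    "{{bos_token}}{% for message in messages %}"
    "{% if loop.first and messages[0]['role'] != 'system' %}"
    "{{ '<|start_header_id|>system<|end_header_id|>\n"
    "You are MiniThinky, a helpful AI assistant. You always think before "
    "giving the answer. Use <|thinking|> before thinking and <|answer|> "
    "before giving the answer.<|eot_id|>' }}{% endif %}"
    "{{ '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n"
    "' + message['content'] + '<|eot_id|>' }}{% endfor %}"
    "{% if add_generation_prompt %}"
    "{{ '<|start_header_id|>assistant<|end_header_id|>\n' }}{% endif %}"
)

prompt = Template(CHAT_TEMPLATE).render(
    bos_token="<|begin_of_text|>",
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
    add_generation_prompt=True,
)
print(prompt)
```

Rendered, the prompt starts with the BOS token and the injected MiniThinky system block, then the user turn, and ends with an open assistant header for the model to complete. In practice `tokenizer.apply_chat_template` performs this same rendering.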

config.json (Normal file, 45 lines)

@@ -0,0 +1,45 @@
{
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"eos_token_id": [
128001,
128008,
128009
],
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 8192,
"max_position_embeddings": 131072,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 16,
"num_key_value_heads": 8,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": {
"factor": 32.0,
"high_freq_factor": 4.0,
"low_freq_factor": 1.0,
"original_max_position_embeddings": 8192,
"rope_type": "llama3"
},
"rope_theta": 500000.0,
"tie_word_embeddings": true,
"torch_dtype": "float16",
"transformers.js_config": {
"kv_cache_dtype": {
"fp16": "float16",
"q4f16": "float16"
}
},
"transformers_version": "4.53.0.dev0",
"use_cache": true,
"vocab_size": 128256
}
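The sizes in this config pin down the parameter count. A back-of-the-envelope check, a sketch that assumes the standard LLaMA layer layout (tied embeddings, grouped-query attention, gated SiLU MLP, RMSNorm weights):

```python
# Values copied from config.json above.
hidden, layers = 2048, 16
heads, kv_heads, head_dim = 32, 8, 64
intermediate, vocab = 8192, 128256

embed = vocab * hidden                      # tied with the LM head, counted once
attn = hidden * heads * head_dim * 2        # q_proj + o_proj
attn += hidden * kv_heads * head_dim * 2    # k_proj + v_proj (grouped-query)
mlp = hidden * intermediate * 3             # gate_proj, up_proj, down_proj
norms = 2 * hidden                          # input + post-attention RMSNorm
total = embed + layers * (attn + mlp + norms) + hidden  # + final norm

print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

This comes to about 1.24B parameters, consistent with the "1.2B" in the model name and, at 2 bytes per float16 parameter, with the roughly 2.47 GB `model.safetensors` file below.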

configuration.json (Normal file, 1 line)

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}

generation_config.json (Normal file, 12 lines)

@@ -0,0 +1,12 @@
{
"bos_token_id": 128000,
"do_sample": true,
"eos_token_id": [
128001,
128008,
128009
],
"temperature": 0.6,
"top_p": 0.9,
"transformers_version": "4.53.0.dev0"
}
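`do_sample: true` with these values means each token is drawn from a temperature-scaled distribution truncated to the smallest set of tokens whose cumulative probability reaches `top_p` (nucleus sampling). A minimal pure-Python sketch of that decoding rule, with hypothetical helper names:

```python
import math
import random

def sample_top_p(logits, temperature=0.6, top_p=0.9, rng=None):
    """Nucleus sampling: temperature-scale, softmax, truncate, renormalize."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]          # numerically stable softmax
    probs = [e / sum(exps) for e in exps]

    # Keep the most likely tokens until their cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break

    # Sample from the renormalized truncated distribution.
    rng = rng or random.Random(0)
    mass = sum(probs[i] for i in kept)
    r, acc = rng.random() * mass, 0.0
    for i in kept:
        acc += probs[i]
        if r <= acc:
            return i
    return kept[-1]
```

With a strongly peaked distribution the nucleus collapses to a single token, so sampling becomes effectively greedy; flatter distributions keep more candidates, which is where `temperature=0.6` and `top_p=0.9` trade diversity against coherence.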

model.safetensors (Normal file, 3 lines)

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c8ccd9d443eec340f24bc1336d2013c27a096304ea699d8b7183304552bc4e09
size 2471645464

special_tokens_map.json (Normal file, 23 lines)

@@ -0,0 +1,23 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|eot_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|finetune_right_pad_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

tokenizer.json (Normal file, 3 lines)

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:84a2ead05482bb55f7b2c440aaa5a1d3df7d5e17041948c9bc052f7863229cb5
size 17209886

tokenizer_config.json (Normal file, 2067 lines)

File diff suppressed because it is too large