Initialize the project; model provided by the ModelHub XC community

Model: WithinUsAI/Llama-3.2-OctoThinker-iNano-1B
Source: Original Platform
ModelHub XC · 2026-04-22 02:57:33 +08:00 · commit d45c13ce4e
8 changed files with 2287 additions and 0 deletions

.gitattributes (vendored) · Normal file · 36 lines

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

README.md · Normal file · 120 lines

@@ -0,0 +1,120 @@
---
license: other
language:
- en
pipeline_tag: text-generation
tags:
- llama
- llama-3.2
- merge
- slerp
- reasoning
- instruct
- chat
- coding
- 1b
- gss1147
base_model:
- NeuraLakeAi/iSA-02-Nano-Llama-3.2-1B
- OctoThinker/OctoThinker-1B-Hybrid-Base
- meta-llama/Llama-3.2-1B-Instruct
library_name: transformers
---
# Llama-3.2-OctoThinker-iNano-1B
**Llama-3.2-OctoThinker-iNano-1B** is a compact 1B-parameter merged language model built from three Llama 3.2-based components:
- `NeuraLakeAi/iSA-02-Nano-Llama-3.2-1B`
- `OctoThinker/OctoThinker-1B-Hybrid-Base`
- `meta-llama/Llama-3.2-1B-Instruct`
This model was merged using the **SLERP** merge method, combining instruction-following behavior, reasoning-oriented characteristics, and hybrid assistant-style generation into a single lightweight checkpoint.
## Model Summary
This checkpoint is designed to provide a balanced small-model experience across:
- instruction following
- reasoning-style responses
- conversational usability
- lightweight local inference
- general-purpose text generation
- basic coding assistance
The goal of this merge is to blend the structured usability of an instruct model with the reasoning and hybrid-response traits of the other source checkpoints.
## Merge Details
### Source Models
- **NeuraLakeAi/iSA-02-Nano-Llama-3.2-1B**
- **OctoThinker/OctoThinker-1B-Hybrid-Base**
- **meta-llama/Llama-3.2-1B-Instruct**
### Merge Method
This model was created using **SLERP**.
**SLERP** (Spherical Linear Interpolation) is a model merging method used to interpolate between checkpoints in a way that better preserves directional relationships in weight space than simple linear averaging. It is often used to combine useful traits from multiple models while reducing some of the rough edges of naive blends.
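As a rough illustration only (not the exact mergekit implementation, which operates tensor-by-tensor with per-layer interpolation factors), SLERP between two flattened weight vectors can be sketched as:

```python
import numpy as np

def slerp(a: np.ndarray, b: np.ndarray, t: float) -> np.ndarray:
    """Spherical linear interpolation between two weight vectors."""
    a_n = a / np.linalg.norm(a)
    b_n = b / np.linalg.norm(b)
    # Angle between the two directions in weight space.
    omega = np.arccos(np.clip(np.dot(a_n, b_n), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1 - t) * a + t * b
    so = np.sin(omega)
    return (np.sin((1 - t) * omega) / so) * a + (np.sin(t * omega) / so) * b

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid = slerp(a, b, 0.5)  # stays on the unit arc rather than cutting through it
```

Unlike naive averaging, the interpolated point keeps the magnitude structure of the endpoints, which is the property that motivates SLERP for weight merging.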
## Intended Use
This model is intended for:
- assistant-style chat
- general text generation
- lightweight reasoning tasks
- brainstorming
- summarization
- simple coding help
- prompt experimentation
- local low-resource inference
## Out-of-Scope Use
This model is not intended for:
- medical advice
- legal advice
- financial decision-making
- autonomous high-risk use
- safety-critical production systems without extensive evaluation
## Strength Profile
This merged checkpoint is aimed at users who want a compact model that blends:
- instruction tuning
- reasoning flavor
- hybrid assistant behavior
- fast inference in constrained environments
Because it is a 1B model, it is best suited for short-to-medium tasks rather than very deep long-context reasoning.
## Limitations
Like other small language models, this model may:
- hallucinate facts
- produce inconsistent reasoning
- struggle with harder multi-step coding tasks
- lose reliability on long prompts
- behave differently depending on prompt formatting
- reflect bias inherited from source models and training data
It should be tested carefully before use in any workflow requiring strong factual reliability or safety guarantees.
## Prompting Tips
For best results:
- use clear, direct prompts
- request concise or step-by-step answers explicitly
- keep tasks focused
- avoid overly ambiguous instructions
### Example Prompt
```text
You are a compact reasoning assistant. Answer clearly and step by step.
Explain recursion in Python and provide a simple example.
```
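A minimal loading sketch with transformers, assuming the repository id matches the model name above (heavy work is kept behind a main guard so the definitions can be inspected without triggering a download):

```python
# Minimal inference sketch; model_id is assumed from the model card above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "WithinUsAI/Llama-3.2-OctoThinker-iNano-1B"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Explain recursion in Python and provide a simple example."))
```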

config.json · Normal file · 38 lines

@@ -0,0 +1,38 @@
{
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"dtype": "float16",
"eos_token_id": 128001,
"head_dim": 64,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 8192,
"is_llama_config": true,
"max_position_embeddings": 131072,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 16,
"num_key_value_heads": 8,
"pad_token_id": null,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_interleaved": false,
"rope_parameters": {
"factor": 32.0,
"high_freq_factor": 4.0,
"low_freq_factor": 1.0,
"original_max_position_embeddings": 8192,
"rope_theta": 500000.0,
"rope_type": "llama3"
},
"tie_word_embeddings": true,
"transformers_version": "5.3.0",
"use_cache": true,
"vocab_size": 128256
}
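The attention geometry in this config is internally consistent; a quick sanity check using the numbers from the file above (inlined here, not loaded from disk):

```python
# Values copied from config.json above.
config = {
    "hidden_size": 2048,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "head_dim": 64,
}

# Per-head dimension equals hidden_size split evenly across the attention heads.
assert config["hidden_size"] // config["num_attention_heads"] == config["head_dim"]

# Grouped-query attention: 32 query heads share 8 KV heads, i.e. 4 queries per KV head.
gqa_group = config["num_attention_heads"] // config["num_key_value_heads"]
```

This 4:1 grouping is the standard Llama 3.2 1B layout, which is expected since all three source checkpoints share that architecture.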

mergekit_config.yml · Normal file · 9 lines

@@ -0,0 +1,9 @@
merge_method: slerp
base_model: X:\Amalgamation AI Universal GUI\_prepared_models\OctoThinker-OctoThinker-1B-Hybrid-Base_5b07aeb080
models:
- model: C:\Users\GSS1147\Desktop\Safetensors (Models)\meta-llamaLlama--3.2-1B (MODELS)\NeuraLakeAi-iSA-02-Nano-Llama-3.2-1B
parameters:
weight: 1.0
dtype: float16
parameters:
t: 0.55

model.safetensors · Normal file · 3 lines

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f05221450b8b64b645d90e809f0c426401a540e54e108498539b3030385a9c9c
size 2996982200
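The file above is a Git LFS pointer, not the weights themselves; the actual ~3.0 GB blob is fetched at checkout, and its SHA-256 must match the pointer's `oid`. A small sketch of that check (demonstrated on sample bytes rather than the real file):

```python
import hashlib

def lfs_oid(data: bytes) -> str:
    """Compute the sha256 oid that Git LFS records in its pointer files."""
    return hashlib.sha256(data).hexdigest()

# After downloading model.safetensors, lfs_oid(file_bytes) should equal the
# pointer's oid (f05221...9c9c). Shown here on empty input for illustration.
digest = lfs_oid(b"")
```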

special_tokens_map.json · Normal file · 16 lines

@@ -0,0 +1,16 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|end_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}
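Tokenizer loaders read this map to learn the sentinel tokens; a minimal sketch of parsing it directly, with the file's content inlined from above:

```python
import json

# Content of special_tokens_map.json, inlined for illustration.
raw = '''{
  "bos_token": {"content": "<|begin_of_text|>", "lstrip": false,
                "normalized": false, "rstrip": false, "single_word": false},
  "eos_token": {"content": "<|end_of_text|>", "lstrip": false,
                "normalized": false, "rstrip": false, "single_word": false}
}'''

special = json.loads(raw)
bos = special["bos_token"]["content"]
eos = special["eos_token"]["content"]
```

Note that the config.json above sets `eos_token_id: 128001`, which corresponds to `<|end_of_text|>` in the standard Llama 3 vocabulary.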

tokenizer.json · Normal file · 3 lines

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6b9e4e7fb171f92fd137b777cc2714bf87d11576700a1dcd7a399e7bbe39537b
size 17209920

tokenizer_config.json · Normal file · 2062 lines
File diff suppressed because it is too large