初始化项目,由ModelHub XC社区提供模型

Model: andrijdavid/Llama3-2B-Base
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-31 19:09:21 +08:00
commit ccdfd21a57
8 changed files with 412751 additions and 0 deletions

35
.gitattributes vendored Normal file
View File

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

42
README.md Normal file
View File

@@ -0,0 +1,42 @@
---
license: cc-by-4.0
language:
- en
pipeline_tag: text-generation
---
# Llama-3-2B-Base
Llama3-2b is a trimmed version of the official [Llama-3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) base model from [Meta](https://huggingface.co/meta-llama).
It has been reduced in size to ~2 billion parameters, making it more computationally efficient while still retaining a significant portion of the original model's capabilities.
This model is intended to serve as a base model and has not been further fine-tuned for any specific task.
It is specifically designed to bring the power of LLMs (Large Language Models) to environments with limited computational resources. This model offers a balance between performance and resource usage, serving as an efficient alternative for users who cannot leverage the larger, resource-intensive versions from Meta.
**Important**: This project is not affiliated with Meta.
## Uses
This model can be fine-tuned for a variety of natural language processing tasks, including:
- Text generation
- Question answering
- Sentiment analysis
- Translation
- Summarization
## Bias, Risks, and Limitations
While Llama3-2b is a powerful model, it is important to be aware of its limitations and potential biases.
As with any language model, this model may generate outputs that are factually incorrect or biased.
It is also possible that the model may produce offensive or inappropriate content.
Users and Developers should be aware of these risks and take appropriate measures to mitigate them.
## How to Use
To use Llama3-2b, you can load the model using the Hugging Face Transformers library in Python:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("andrijdavid/Llama-3-2B-Base/")
model = AutoModelForCausalLM.from_pretrained("andrijdavid/Llama-3-2B-Base/")
```

29
config.json Normal file
View File

@@ -0,0 +1,29 @@
{
"_name_or_path": "meta-llama/Meta-Llama-3-8B-Instruct",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"eos_token_id": 128009,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 8192,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 6,
"num_key_value_heads": 8,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 500000.0,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.2",
"use_cache": true,
"vocab_size": 128256
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4f36405d81e6f39e7723fbfa9b42de0c7ef80afa1497b2eff41d1082bb688a16
size 4718705072

View File

@@ -0,0 +1 @@
{"metadata": {"total_size": 4718698496}, "weight_map": {"lm_head.weight": "model-00001-of-00001.safetensors", "model.embed_tokens.weight": "model-00001-of-00001.safetensors", "model.layers.0.input_layernorm.weight": "model-00001-of-00001.safetensors", "model.layers.0.mlp.down_proj.weight": "model-00001-of-00001.safetensors", "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00001.safetensors", "model.layers.0.mlp.up_proj.weight": "model-00001-of-00001.safetensors", "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00001.safetensors", "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00001.safetensors", "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00001.safetensors", "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00001.safetensors", "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00001.safetensors", "model.layers.1.input_layernorm.weight": "model-00001-of-00001.safetensors", "model.layers.1.mlp.down_proj.weight": "model-00001-of-00001.safetensors", "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00001.safetensors", "model.layers.1.mlp.up_proj.weight": "model-00001-of-00001.safetensors", "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00001.safetensors", "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00001.safetensors", "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00001.safetensors", "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00001.safetensors", "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00001.safetensors", "model.layers.2.input_layernorm.weight": "model-00001-of-00001.safetensors", "model.layers.2.mlp.down_proj.weight": "model-00001-of-00001.safetensors", "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00001.safetensors", "model.layers.2.mlp.up_proj.weight": "model-00001-of-00001.safetensors", "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00001.safetensors", "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00001.safetensors", "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00001.safetensors", "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00001.safetensors", "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00001.safetensors", "model.layers.3.input_layernorm.weight": "model-00001-of-00001.safetensors", "model.layers.3.mlp.down_proj.weight": "model-00001-of-00001.safetensors", "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00001.safetensors", "model.layers.3.mlp.up_proj.weight": "model-00001-of-00001.safetensors", "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00001.safetensors", "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00001.safetensors", "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00001.safetensors", "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00001.safetensors", "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00001.safetensors", "model.layers.4.input_layernorm.weight": "model-00001-of-00001.safetensors", "model.layers.4.mlp.down_proj.weight": "model-00001-of-00001.safetensors", "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00001.safetensors", "model.layers.4.mlp.up_proj.weight": "model-00001-of-00001.safetensors", "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00001.safetensors", "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00001.safetensors", "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00001.safetensors", "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00001.safetensors", "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00001.safetensors", "model.layers.5.input_layernorm.weight": "model-00001-of-00001.safetensors", "model.layers.5.mlp.down_proj.weight": "model-00001-of-00001.safetensors", "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00001.safetensors", "model.layers.5.mlp.up_proj.weight": "model-00001-of-00001.safetensors", "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00001.safetensors", "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00001.safetensors", "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00001.safetensors", "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00001.safetensors", "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00001.safetensors", "model.norm.weight": "model-00001-of-00001.safetensors"}}

16
special_tokens_map.json Normal file
View File

@@ -0,0 +1,16 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|eot_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

410563
tokenizer.json Normal file

File diff suppressed because it is too large Load Diff

2062
tokenizer_config.json Normal file

File diff suppressed because it is too large Load Diff