Initialize project; model provided by the ModelHub XC community

Model: Eurdem/Defne-llama3.1-8B
Source: Original Platform
ModelHub XC
2026-05-14 05:35:55 +08:00
commit 868dcd5714
11 changed files with 412760 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
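
The patterns above route large binary formats through Git LFS. As a rough illustration only, the sketch below approximates that matching with Python's `fnmatch` on a handful of the patterns. Real gitattributes matching has its own rules, and both the helper name `tracked_by_lfs` and the example file names are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A few of the patterns from the .gitattributes above (not the full list).
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.tar.*", "*tfevents*", "saved_model/**/*"]


def tracked_by_lfs(path: str) -> bool:
    """Approximate check: does `path` match any of the LFS patterns?

    Patterns without a slash are compared against the file name only;
    patterns with a slash are compared against the whole path.
    This is a simplification of the real gitattributes rules.
    """
    name = PurePosixPath(path).name
    for pattern in LFS_PATTERNS:
        target = path if "/" in pattern else name
        if fnmatchcase(target, pattern):
            return True
    return False


print(tracked_by_lfs("weights.safetensors"))  # True
print(tracked_by_lfs("config.json"))          # False
```

For an authoritative answer in a real checkout, `git check-attr filter -- <path>` reports the attribute that Git itself resolves.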

88
README.md Normal file

@@ -0,0 +1,88 @@
---
license: llama3.1
language:
- en
- tr
- de
- fr
- it
- es
library_name: transformers
pipeline_tag: text-generation
tags:
- llama-3
- safetensors
---
Fine-tuned version of meta-llama/Meta-Llama-3.1-8B-Instruct, trained on a Turkish dataset (~4 million tokens)
and then merged with VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct.
## 💻 Kullanım/How to Use
```python
# In a notebook, install the dependencies first:
# !pip install -qU transformers bitsandbytes accelerate
import transformers
import torch

model_id = "Eurdem/Defne-llama3.1-8B"

# Load the model in 8-bit (requires bitsandbytes) and let accelerate place it on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16, "load_in_8bit": True},
    device_map="auto",
)

# For English
messages = [
    {"role": "system", "content": "You are a helpful chatbot, named Defne, who always responds friendly."},
    {"role": "user", "content": "Answer the questions: 1) Who are you? 2) f(x)=3x^2+4x+12 so what is f(3)?"},
]

# For Turkish (this reassigns `messages`; keep only the variant you want to run)
messages = [
    {"role": "system", "content": "Sen, Defne isimli Türkçe konuşan bir chatbotsun."},
    {"role": "user", "content": "Soruları numaralandırarak cevapla. 1) Sen kimsin? 2) f(x)=3x^2+4x+12 ise f(3) kaçtır?"},
]

outputs = pipeline(
    messages,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.5,
    top_p=0.5,
    top_k=100,
)

# The pipeline returns the whole conversation; the last message is the model's reply.
print(outputs[0]["generated_text"][-1]["content"])
```
### English Output
```
Hello there! I'm Defne, a friendly chatbot here to help you with any questions or tasks you might have.
Now, let's answer your questions:
1. I am Defne, a helpful chatbot designed to provide friendly and informative responses to your queries. I'm always happy to assist you with anything you need help with!
2. To find f(3) for the given function f(x) = 3x^2 + 4x + 12, we simply need to substitute x with 3 in the function.
f(3) = 3(3)^2 + 4(3) + 12
= 3(9) + 12 + 12
= 27 + 12 + 12
= 51
So, f(3) equals 51.
```
### Turkish Output
```
1) Ben Defne, Türkçe konuşan bir chatbotum. İnsanlarla sohbet etmek ve onlara yardımcı olmak için tasarlanmış bir yapay zekâ uygulamasıyım.
2) f(x) = 3x^2 + 4x + 12 formülüne x = 3 değerini koyarsak:
f(3) = 3(3)^2 + 4(3) + 12
f(3) = 3(9) + 12 + 12
f(3) = 27 + 12 + 12
f(3) = 51
Sonuç olarak, f(3) = 51'dir.
```
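
Both sample outputs arrive at f(3) = 51. A minimal check of that arithmetic, independent of the model (the function name `f` simply mirrors the prompt):

```python
def f(x: int) -> int:
    """The polynomial from the example prompt: f(x) = 3x^2 + 4x + 12."""
    return 3 * x**2 + 4 * x + 12


print(f(3))  # 51
```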

40
config.json Normal file

@@ -0,0 +1,40 @@
{
"_name_or_path": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"eos_token_id": [
128001,
128008,
128009
],
"pad_token_id": 128009,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 131072,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": {
"factor": 8.0,
"high_freq_factor": 4.0,
"low_freq_factor": 1.0,
"original_max_position_embeddings": 8192,
"rope_type": "llama3"
},
"rope_theta": 500000.0,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.43.3",
"use_cache": true,
"vocab_size": 128256
}
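
A few quantities follow directly from the values in config.json above. The sketch below derives the attention head size, the grouped-query ratio, an estimated parameter count, and the KV-cache footprint per token. It assumes the usual `LlamaForCausalLM` weight layout (bias-free q/k/v/o projections, a gate/up/down MLP, two RMSNorm vectors per layer and one final norm); the variable names are illustrative and the result is an estimate, not a figure taken from the repository.

```python
# Values copied from the config.json above.
hidden_size = 4096
intermediate_size = 14336
num_hidden_layers = 32
num_attention_heads = 32
num_key_value_heads = 8
vocab_size = 128256

head_dim = hidden_size // num_attention_heads                      # 128
queries_per_kv_head = num_attention_heads // num_key_value_heads   # 4
kv_dim = num_key_value_heads * head_dim                            # 1024

# Assumed per-layer weights: q/o projections (hidden x hidden), k/v projections
# (hidden x kv_dim), three MLP matrices, and two RMSNorm vectors.
attention = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
mlp = 3 * hidden_size * intermediate_size
norms = 2 * hidden_size
per_layer = attention + mlp + norms

# Input embeddings and output head are separate ("tie_word_embeddings": false).
embeddings = 2 * vocab_size * hidden_size
total_params = embeddings + num_hidden_layers * per_layer + hidden_size  # + final norm

# KV cache per token in bfloat16: keys and values, every layer, 2 bytes each.
kv_cache_bytes_per_token = 2 * num_hidden_layers * kv_dim * 2

print(head_dim, queries_per_kv_head)  # 128 4
print(total_params)                   # 8030261248
print(kv_cache_bytes_per_token)       # 131072
```

At the configured `max_position_embeddings` of 131072, that per-token figure works out to 16 GiB of KV cache for a single full-length sequence, which is why shorter contexts are the norm on a single GPU.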


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:01f4287f2808d46b9f9f978ad15707737608c82a048556a3c51422f034fc99a6
size 4953586384


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:281e8b0b6e456a9fd7ee139dfba86aafbc62f2f77d45771fe69facee99809056
size 4999819336


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3fe00179a43042677cc250f3568cf4b2034bd1f3c1c0ee4a0b9f08f69bb1bd90
size 4915916144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:df06334b01eeaa0e802eec81b65f8f4a3914de248bbcbde050a9625dbc57a5f3
size 1191234472
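
The four Git LFS pointers above carry no file names in this view. Assuming they are the model's weight shards, stored in bfloat16 as `torch_dtype` in config.json suggests, their sizes give a quick sanity check on the model scale:

```python
# Sizes in bytes, copied from the four LFS pointers above.
shard_sizes = [4953586384, 4999819336, 4915916144, 1191234472]

BYTES_PER_PARAM = 2  # bfloat16; an assumption based on "torch_dtype" in config.json

total_bytes = sum(shard_sizes)
approx_params = total_bytes // BYTES_PER_PARAM  # slight overestimate: files also hold headers

print(total_bytes)                    # 16060556336
print(f"{approx_params / 1e9:.2f}B")  # 8.03B
```

The estimate slightly overshoots the true parameter count because each file also contains metadata, but it agrees with the roughly 8B scale implied by the model name.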

File diff suppressed because one or more lines are too long

17
special_tokens_map.json Normal file

@@ -0,0 +1,17 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|eot_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": "<|eot_id|>"
}
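
This map sets both the EOS and the padding token to `<|eot_id|>`, and config.json lists three stop ids (128001, 128008, 128009) with `pad_token_id` 128009. Because padding and end-of-turn share an id, trailing padding in a batch looks identical to a stop token. A minimal sketch of trimming a generated id sequence at the first stop id follows; the helper name `strip_after_eos` and the sample ids are hypothetical.

```python
# Stop ids copied from "eos_token_id" in config.json.
EOS_TOKEN_IDS = {128001, 128008, 128009}


def strip_after_eos(token_ids: list[int]) -> list[int]:
    """Return the ids that precede the first stop token.

    Anything after the first stop id (including padding, which shares
    id 128009 with the end-of-turn token here) is discarded.
    """
    kept = []
    for token_id in token_ids:
        if token_id in EOS_TOKEN_IDS:
            break
        kept.append(token_id)
    return kept


# Two hypothetical content ids followed by end-of-turn and padding.
print(strip_after_eos([9906, 1917, 128009, 128009]))  # [9906, 1917]
```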

410504
tokenizer.json Normal file

File diff suppressed because it is too large Load Diff

2063
tokenizer_config.json Normal file

File diff suppressed because it is too large Load Diff