初始化项目,由ModelHub XC社区提供模型

Model: ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-19 02:56:16 +08:00
commit a936531643
13 changed files with 412859 additions and 0 deletions

36
.gitattributes vendored Normal file
View File

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
cosmosLLaMa2_r2.png filter=lfs diff=lfs merge=lfs -text

129
README.md Normal file
View File

@@ -0,0 +1,129 @@
---
license: llama3
language:
- tr
pipeline_tag: text-generation
base_model: meta-llama/Meta-Llama-3-8B
tags:
- Turkish
- turkish
- Llama
- Llama3
---
<img src="./cosmosLLaMa2_r2.png"/>
# Cosmos LLaMa Instruct
This model is a fully fine-tuned version of the "meta-llama/Meta-Llama-3-8B-Instruct" model with a 30GB Turkish dataset.
The Cosmos LLaMa Instruct is designed for text generation tasks, providing the ability to continue a given text snippet in a coherent and contextually relevant manner. Due to the diverse nature of the training data, which includes websites, books, and other text sources, this model can exhibit biases. Users should be aware of these biases and use the model responsibly.
#### Transformers pipeline
```python
import transformers
import torch
model_id = "ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1"
pipeline = transformers.pipeline(
"text-generation",
model=model_id,
model_kwargs={"torch_dtype": torch.bfloat16},
device_map="auto",
)
messages = [
{"role": "system", "content": "Sen bir yapay zeka asistanısın. Kullanıcı sana bir görev verecek. Amacın görevi olabildiğince sadık bir şekilde tamamlamak. Görevi yerine getirirken adım adım düşün ve adımlarını gerekçelendir."},
{"role": "user", "content": "Soru: Bir arabanın deposu 60 litre benzin alabiliyor. Araba her 100 kilometrede 8 litre benzin tüketiyor. Depo tamamen doluyken araba kaç kilometre yol alabilir?"},
]
terminators = [
pipeline.tokenizer.eos_token_id,
pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
]
outputs = pipeline(
messages,
max_new_tokens=256,
eos_token_id=terminators,
do_sample=True,
temperature=0.6,
top_p=0.9,
)
print(outputs[0]["generated_text"][-1])
```
#### Transformers AutoModelForCausalLM
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_id = "ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.bfloat16,
device_map="auto",
)
messages = [
{"role": "system", "content": "Sen bir yapay zeka asistanısın. Kullanıcı sana bir görev verecek. Amacın görevi olabildiğince sadık bir şekilde tamamlamak. Görevi yerine getirirken adım adım düşün ve adımlarını gerekçelendir."},
{"role": "user", "content": "Soru: Bir arabanın deposu 60 litre benzin alabiliyor. Araba her 100 kilometrede 8 litre benzin tüketiyor. Depo tamamen doluyken araba kaç kilometre yol alabilir?"},
]
input_ids = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
return_tensors="pt"
).to(model.device)
terminators = [
tokenizer.eos_token_id,
tokenizer.convert_tokens_to_ids("<|eot_id|>")
]
outputs = model.generate(
input_ids,
max_new_tokens=256,
eos_token_id=terminators,
do_sample=True,
temperature=0.6,
top_p=0.9,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
# Acknowledgments
- Thanks to the generous support from the Hugging Face team, it is possible to download models from their S3 storage 🤗
- Computing resources used in this work were provided by the National Center for High Performance Computing of Turkey (UHeM) under grant numbers 1016912023 and
1018512024
- Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC)
### Contact
COSMOS AI Research Group, Yildiz Technical University Computer Engineering Department <br>
https://cosmos.yildiz.edu.tr/ <br>
cosmos@yildiz.edu.tr
# Citation
```bibtex
@inproceedings{kesgin2024optimizing,
title={Optimizing Large Language Models for Turkish: New Methodologies in Corpus Selection and Training},
author={Kesgin, H Toprak and Yuce, M Kaan and Dogan, Eren and Uzun, M Egemen and Uz, Atahan and {\.I}nce, Elif and Erdem, Yusuf and Shbib, Osama and Zeer, Ahmed and Amasyali, M Fatih},
booktitle={2024 Innovations in Intelligent Systems and Applications Conference (ASYU)},
pages={1--6},
year={2024},
organization={IEEE}
}
```
---
license: llama3
---

28
config.json Normal file
View File

@@ -0,0 +1,28 @@
{
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"eos_token_id": 128009,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 8192,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 500000.0,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.2",
"use_cache": true,
"vocab_size": 128256
}

3
cosmosLLaMa2_r2.png Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bc8a1bec227761ecd08c337cd4e1807c14245aa23fcc1cca64d67c25b75be44c
size 3122329

9
generation_config.json Normal file
View File

@@ -0,0 +1,9 @@
{
"bos_token_id": 128000,
"do_sample": true,
"eos_token_id": 128001,
"max_length": 4096,
"temperature": 0.6,
"top_p": 0.9,
"transformers_version": "4.41.2"
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:822e4e8d52212a0d1f98466449e12dd9af1a9f267d9ff495137a1825062bd748
size 4953586384

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d3bd9236d2ccbf9aa8f59bd57aa357c91388dbd51c01d5bdea7162c2c84a61d4
size 4999819336

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:25577846ed5a453019fadc020334d1ce12d81ad3fbc1a6d3c40e7f5753f78732
size 4915916144

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:726d1640725f19b1c41b9e557115e38845bbb8de460fa7223fd313f0edb2a033
size 1191234472

File diff suppressed because one or more lines are too long

16
special_tokens_map.json Normal file
View File

@@ -0,0 +1,16 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|end_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

410563
tokenizer.json Normal file

File diff suppressed because it is too large Load Diff

2062
tokenizer_config.json Normal file

File diff suppressed because it is too large Load Diff