Compare commits: 263ed8eb1f...main (10 commits)

| SHA1 |
|---|
| ccd7b54d9f |
| b17df1328a |
| d49be2d4f8 |
| 7c75df30ed |
| b83f9e21f0 |
| f52ce40934 |
| db6a8f3b99 |
| d678c617a0 |
| 6720fca4a6 |
| 29ae7551e1 |
.gitattributes (vendored, +3)

@@ -53,3 +53,6 @@ Turkish-Llama-8b-Instruct-v0.1.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
 Turkish-Llama-8b-Instruct-v0.1.Q5_K.gguf filter=lfs diff=lfs merge=lfs -text
 Turkish-Llama-8b-Instruct-v0.1.Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
 Turkish-Llama-8b-Instruct-v0.1.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+Turkish-Llama-8b-Instruct-v0.1.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+Turkish-Llama-8b-Instruct-v0.1.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+Turkish-Llama-8b-Instruct-v0.1-F16.gguf filter=lfs diff=lfs merge=lfs -text
README.md (new file, +110)

---
base_model: ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1
license: llama3
language:
- tr
- en
tags:
- gguf
- ggml
- llama3
- cosmosllama
- turkish llama
---

# CosmosLLaMa GGUFs

## Objective

Due to the need for quantized models in real-time applications, we introduce our GGUF-formatted models. These models are part of the GGML project, in the hope of democratizing the use of large models. Depending on the quantization type, there are 20+ models.
### Features

* All quantization details are listed on the right by Hugging Face.
* All the models have been tested in `llama.cpp` environments, with both `llama-cli` and `llama-server`.
* Furthermore, a YouTube video introduces the basics of using `lmstudio` with these models. 👇

[Watch the introduction video on YouTube](https://www.youtube.com/watch?v=JRID-6sRl7I)

### Code Example

Usage example with `llama-cpp-python`:

```py
from llama_cpp import Llama

# Define the inference parameters
inference_params = {
    "n_threads": 4,
    "n_predict": -1,
    "top_k": 40,
    "min_p": 0.05,
    "top_p": 0.95,
    "temp": 0.8,
    "repeat_penalty": 1.1,
    "input_prefix": "<|start_header_id|>user<|end_header_id|>\n\n",
    "input_suffix": "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
    "antiprompt": [],
    # System prompt: "You are an AI assistant. The user will give you a task.
    # Your goal is to complete the task as faithfully as possible."
    "pre_prompt": "Sen bir yapay zeka asistanısın. Kullanıcı sana bir görev verecek. Amacın görevi olabildiğince sadık bir şekilde tamamlamak.",
    "pre_prompt_suffix": "<|eot_id|>",
    "pre_prompt_prefix": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n",
    "seed": -1,
    "tfs_z": 1,
    "typical_p": 1,
    "repeat_last_n": 64,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "n_keep": 0,
    "logit_bias": {},
    "mirostat": 0,
    "mirostat_tau": 5,
    "mirostat_eta": 0.1,
    "memory_f16": True,
    "multiline_input": False,
    "penalize_nl": True
}

# Initialize the Llama model with the specified inference parameters
llama = Llama.from_pretrained(
    repo_id="ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1-GGUF",
    filename="*Q4_K.gguf",
    verbose=False
)

# Example input ("What is the capital of Türkiye?")
user_input = "Türkiyenin başkenti neresidir?"

# Construct the prompt: system turn (closed by pre_prompt_suffix), then user turn
prompt = f"{inference_params['pre_prompt_prefix']}{inference_params['pre_prompt']}{inference_params['pre_prompt_suffix']}{inference_params['input_prefix']}{user_input}{inference_params['input_suffix']}"

# Generate the response (the default max_tokens of 16 would truncate the answer)
response = llama(prompt, max_tokens=256)

# Output the response
print(response['choices'][0]['text'])
```

The quantization was done using `llama.cpp`; in our experience, this method tends to give the most stable results.

As expected, we observed better inference quality from the higher-bit models, while inference time tends to be similar across the low-bit models.

Each model's memory footprint can be anticipated from the quantization docs of either [Hugging Face](https://huggingface.co/docs/transformers/main/en/quantization/overview) or [llama.cpp](https://github.com/ggerganov/llama.cpp/tree/master/examples/quantize).
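As a rough cross-check, the effective bits per weight of each file can be estimated directly from the LFS sizes added in this commit. This is a sketch under one assumption: a parameter count of roughly 8.03B for Llama-3-8B (the exact count, plus GGUF metadata and mixed-precision tensors, makes these estimates approximate).

```python
# Estimate effective bits per weight from GGUF file sizes.
# N_PARAMS is an assumption (~8.03B for Llama-3-8B), so the results
# are rough estimates, not exact figures.
N_PARAMS = 8.03e9

def bits_per_weight(file_size_bytes: int) -> float:
    return file_size_bytes * 8 / N_PARAMS

# Sizes taken from the LFS pointer files in this commit
sizes = {
    "F16":  16_068_890_880,
    "Q8_0":  8_540_770_560,
    "Q6_K":  6_596_006_144,
}

for name, size in sizes.items():
    print(f"{name}: ~{bits_per_weight(size):.2f} bits/weight")
```

The estimates land near the nominal widths (about 16.0, 8.5, and 6.6 bits), so file size, and hence memory footprint, scales almost linearly with quantization width.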

# Acknowledgments

- Research supported with Cloud TPUs from [Google's TensorFlow Research Cloud](https://sites.research.google/trc/about/) (TFRC). Thanks for providing access to the TFRC ❤️
- Thanks to the generous support from the Hugging Face team, it is possible to download models from their S3 storage 🤗

# Citation

```bibtex
@inproceedings{kesgin2024optimizing,
  title={Optimizing Large Language Models for Turkish: New Methodologies in Corpus Selection and Training},
  author={Kesgin, H Toprak and Yuce, M Kaan and Dogan, Eren and Uzun, M Egemen and Uz, Atahan and {\.I}nce, Elif and Erdem, Yusuf and Shbib, Osama and Zeer, Ahmed and Amasyali, M Fatih},
  booktitle={2024 Innovations in Intelligent Systems and Applications Conference (ASYU)},
  pages={1--6},
  year={2024},
  organization={IEEE}
}
```

## Contact

COSMOS AI Research Group, Yildiz Technical University Computer Engineering Department

https://cosmos.yildiz.edu.tr/

cosmos@yildiz.edu.tr
Turkish-Llama-8b-Instruct-v0.1-F16.gguf (new file, +3)

version https://git-lfs.github.com/spec/v1
oid sha256:ae6366fbcbd5a7a20b05bece7e59c79b8199f74669c5f946f909b579eabf737c
size 16068890880
Turkish-Llama-8b-Instruct-v0.1.Q6_K.gguf (new file, +3)

version https://git-lfs.github.com/spec/v1
oid sha256:3ce9442d5a7fe21eb9f79df7b30c70dbd4e5e7e29ee10ae447a40b5595c7ea9e
size 6596006144
Turkish-Llama-8b-Instruct-v0.1.Q8_0.gguf (new file, +3)

version https://git-lfs.github.com/spec/v1
oid sha256:9e15b58cb80e3ef24d5acba418441b0c2bf69853d37403d2c4c5ea2b63b34de5
size 8540770560
cosmos_lm_studio.preset.json (new file, +49)

{
  "name": "cosmos_lm_studio",
  "load_params": {
    "n_ctx": 2048,
    "n_batch": 512,
    "rope_freq_base": 0,
    "rope_freq_scale": 0,
    "n_gpu_layers": 10,
    "use_mlock": true,
    "main_gpu": 0,
    "tensor_split": [
      0
    ],
    "seed": -1,
    "f16_kv": true,
    "use_mmap": true,
    "no_kv_offload": false,
    "num_experts_used": 0
  },
  "inference_params": {
    "n_threads": 4,
    "n_predict": -1,
    "top_k": 40,
    "min_p": 0.05,
    "top_p": 0.95,
    "temp": 0.8,
    "repeat_penalty": 1.1,
    "input_prefix": "<|start_header_id|>user<|end_header_id|>\\n\\n",
    "input_suffix": "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\\n\\n",
    "antiprompt": [],
    "pre_prompt": "Sen bir yapay zeka asistanısın. Kullanıcı sana bir görev verecek. Amacın görevi olabildiğince sadık bir şekilde tamamlamak. Görevi yerine getirirken adım adım düşün ve adımlarını gerekçelendir.",
    "pre_prompt_suffix": "<|eot_id|>",
    "pre_prompt_prefix": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\\n\\n",
    "seed": -1,
    "tfs_z": 1,
    "typical_p": 1,
    "repeat_last_n": 64,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "n_keep": 0,
    "logit_bias": {},
    "mirostat": 0,
    "mirostat_tau": 5,
    "mirostat_eta": 0.1,
    "memory_f16": true,
    "multiline_input": false,
    "penalize_nl": true
  }
}
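How the preset's prompt fields combine into a final prompt depends on the consuming tool; the sketch below assumes the literal `\n` sequences in the prefix/suffix fields are unescaped into real newlines (as `llama.cpp` does with its `-e` flag). The English system prompt is a stand-in for the preset's Turkish one ("You are an AI assistant. The user will give you a task. Your goal is to complete the task as faithfully as possible. Think step by step and justify your steps.").

```python
import codecs

# Prompt-related fields from the preset (subset; the system prompt is an
# English stand-in for the original Turkish text).
params = {
    "pre_prompt_prefix": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\\n\\n",
    "pre_prompt": "You are an AI assistant.",
    "pre_prompt_suffix": "<|eot_id|>",
    "input_prefix": "<|start_header_id|>user<|end_header_id|>\\n\\n",
    "input_suffix": "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\\n\\n",
}

def unescape(s: str) -> str:
    # Turn literal "\n" sequences into real newlines (ASCII-only fields).
    return codecs.decode(s, "unicode_escape")

def build_prompt(p: dict, user_input: str) -> str:
    # System turn, then user turn, then an open assistant header for generation.
    return (
        unescape(p["pre_prompt_prefix"]) + p["pre_prompt"] + p["pre_prompt_suffix"]
        + unescape(p["input_prefix"]) + user_input + unescape(p["input_suffix"])
    )

prompt = build_prompt(params, "Hello")
print(prompt)
```

This mirrors the prompt assembly in the README's `llama-cpp-python` example, with the preset's extended system prompt in place of the shorter one.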