---
language:
- el
- en
license: apache-2.0
pipeline_tag: text-generation
tags:
- finetuned
inference: true
base_model:
- ilsp/Meltemi-7B-Instruct-v1.5
---

# Meltemi llamafile & gguf

This repo contains `llamafile` and `gguf` file format models for [Meltemi 7B Instruct v1.5](https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1.5), the first Greek Large Language Model (LLM),
trained by the Institute for Language and Speech Processing at Athena Research & Innovation Center.

llamafile is a file format introduced by Mozilla Ocho on Nov 20th 2023,
and it collapses the complexity of an LLM into a single executable file.
This gives you the easiest, fastest way to use Meltemi on Linux, macOS, Windows, FreeBSD, OpenBSD, and NetBSD systems you control on both AMD64 and ARM64.

It's as simple as this:

```shell
wget https://huggingface.co/Florents-Tselai/Meltemi-llamafile/resolve/main/Meltemi-7B-Instruct-v1.5-Q8_0.llamafile
chmod +x Meltemi-7B-Instruct-v1.5-Q8_0.llamafile
```

```shell
./Meltemi-7B-Instruct-v1.5-Q8_0.llamafile
```

This will open a tab with a chatbot and completion interface in your browser.
For additional help on how it may be used, pass the `--help` flag.

## API

The server also exposes an OpenAI API-compatible chat completions endpoint.

```shell
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer no-key" \
  -d '{
    "model": "LLaMA_CPP",
    "messages": [
      {
        "role": "system",
        "content": "Είσαι ένας φωτεινός παντογνώστης"
      },
      {
        "role": "user",
        "content": "Γράψε μου μια ιστορία για έναν βάτραχο που έγινε αρνάκι"
      }
    ]
  }'
```

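
The same request can be sent from a script. The following is a minimal Python sketch using only the standard library. It assumes the llamafile server is already running at `http://localhost:8080` as in the curl example, and that the response follows the usual OpenAI chat completion shape (`choices[0].message.content`). The helper names `build_chat_request`, `extract_reply`, and `chat` are illustrative and not part of llamafile.

```python
import json
import urllib.request

# Assumption: the llamafile server is running locally on the port used
# in the curl example.
API_URL = "http://localhost:8080/v1/chat/completions"


def build_chat_request(system_prompt, user_prompt, model="LLaMA_CPP", url=API_URL):
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer no-key",
        },
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text from an OpenAI-style chat completion response."""
    return response_body["choices"][0]["message"]["content"]


def chat(system_prompt, user_prompt, timeout=120):
    """Send the request to the local server and return the reply text."""
    request = build_chat_request(system_prompt, user_prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(json.load(response))
```

With the server running, `print(chat("Είσαι ένας φωτεινός παντογνώστης", "Γράψε μου μια ιστορία για έναν βάτραχο που έγινε αρνάκι"))` should print the model's reply. Building the request separately from sending it keeps the payload easy to inspect without a live server.
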
## CLI

An advanced CLI mode is provided that's useful for shell scripting.
You can use it by passing the `--cli` flag. For additional help on how it may be used, pass the `--help` flag.

```shell
./Meltemi-7B-Instruct-v1.5-Q8_0.llamafile --cli -p 'Ποιό είναι το νόημα της ζωής;'
```

To see all available options:

```shell
./Meltemi-7B-Instruct-v1.5-Q8_0.llamafile --help
```

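
Since the CLI mode is aimed at scripting, it can also be driven from another program. The sketch below wraps the one-shot invocation in Python. It assumes the llamafile sits in the current directory and is executable, and it uses only the `--cli` and `-p` flags mentioned above. The names `build_cli_command` and `run_prompt` are hypothetical helpers, not part of llamafile.

```python
import subprocess

# Assumption: the llamafile from the download step is in the current directory.
LLAMAFILE = "./Meltemi-7B-Instruct-v1.5-Q8_0.llamafile"


def build_cli_command(prompt, llamafile=LLAMAFILE):
    """Return the argument list for a one-shot CLI run with the given prompt."""
    # --cli and -p are the flags described in this section.
    return [llamafile, "--cli", "-p", prompt]


def run_prompt(prompt, llamafile=LLAMAFILE):
    """Run the llamafile once and return whatever it writes to stdout."""
    result = subprocess.run(
        build_cli_command(prompt, llamafile),
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,
        encoding="utf-8",     # prompts and replies are expected to be Greek
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with Greek punctuation such as `;` in the prompt.
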
## gguf

`gguf` files are also available if you're working with [llama.cpp](https://github.com/ggerganov/llama.cpp).

llama.cpp offers many options; refer to its documentation for details.

### Basic Usage

```shell
llama-cli -m ./Meltemi-7B-Instruct-v1.5-F16.gguf -p "Ποιό είναι το νόημα της ζωής;" -n 128
```

### Conversation Mode

```shell
llama-cli -m ./Meltemi-7B-Instruct-v1.5-F16.gguf --conv
```

### Web Server

```shell
llama-server -m ./Meltemi-7B-Instruct-v1.5-F16.gguf --port 8080
```

# Model Information

- Vocabulary extension of the Mistral 7B tokenizer with Greek tokens for lower costs and faster inference (**1.52** vs. 6.80 tokens/word for Greek)
- 8192 context length

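
To make the tokenizer figures above concrete, here is a small sketch of the arithmetic in Python. It assumes the quoted tokens/word values are averages over Greek text, so real documents will vary, and the function names are illustrative only. Under that assumption an 8192-token context holds roughly 5,389 Greek words with the extended tokenizer versus about 1,204 with the original one, which is about 4.5 times more text per request.

```python
CONTEXT_TOKENS = 8192            # context length listed above
MELTEMI_TOKENS_PER_WORD = 1.52   # extended tokenizer, Greek text
MISTRAL_TOKENS_PER_WORD = 6.80   # original Mistral 7B tokenizer, Greek text


def words_in_context(tokens_per_word, context_tokens=CONTEXT_TOKENS):
    """Approximate number of Greek words that fit in the context window."""
    return int(context_tokens / tokens_per_word)


def tokenizer_gain(baseline=MISTRAL_TOKENS_PER_WORD, extended=MELTEMI_TOKENS_PER_WORD):
    """How many times fewer tokens the extended tokenizer needs per word."""
    return baseline / extended


print(words_in_context(MELTEMI_TOKENS_PER_WORD))  # 5389
print(words_in_context(MISTRAL_TOKENS_PER_WORD))  # 1204
print(round(tokenizer_gain(), 2))                 # 4.47
```
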
For more details, please refer to the original model card: [Meltemi 7B Instruct v1.5](https://huggingface.co/ilsp/Meltemi-7B-Instruct-v1.5).