Model: duyntnet/Yarn-Llama-2-7b-128k-imatrix-GGUF Source: Original Platform
| license | language | pipeline_tag | inference | tags |
|---|---|---|---|---|
| other | | text-generation | false | |
Quantizations of https://huggingface.co/NousResearch/Yarn-Llama-2-7b-128k
From original readme
Usage and Prompt Format
Install FlashAttention-2 (FA2) and the rotary extensions:
pip install flash-attn --no-build-isolation
pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary
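After running the two commands above, it can be useful to confirm that both packages are importable before loading the model. The following is a minimal sketch, not part of the original readme. The module names are assumptions: `flash_attn` is the usual import name for the `flash-attn` package, and `rotary_emb` is assumed to be the module built from the `csrc/rotary` subdirectory. `check_extensions` is a hypothetical helper name.

```python
import importlib.util

# Assumed import names (not stated in the readme):
#   flash_attn -> installed by `pip install flash-attn`
#   rotary_emb -> assumed name of the module built from csrc/rotary
REQUIRED_MODULES = ("flash_attn", "rotary_emb")


def check_extensions(modules=REQUIRED_MODULES):
    """Return {module_name: bool}, True when the module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_extensions().items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

The check only locates the modules and does not import them, so it does not need a GPU to run.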
There are no specific prompt formats as this is a pretrained base model.
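Because a pretrained base model simply continues the text it is given, any structure has to come from the prompt itself. One common approach, sketched here under assumptions, is to concatenate a few worked examples as plain text and leave the last one open for the model to complete. The `Input:` / `Output:` labels and the helper name `build_completion_prompt` are purely illustrative and are not a format this model requires.

```python
def build_completion_prompt(examples, query):
    """Join (input, output) pairs and a final open query into one plain-text prompt.

    The "Input:"/"Output:" labels are hypothetical; a base model has no
    built-in template, so any consistent plain-text layout can be used.
    """
    blocks = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    # Leave the final output empty so the model continues from here.
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    prompt = build_completion_prompt([("2 + 2", "4"), ("3 + 5", "8")], "7 + 1")
    print(prompt)
```

The resulting string is passed to the model as-is, with no system message or chat markers.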
Description