---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- Yarn-Llama-2-7b-128k
---
Quantizations of https://huggingface.co/NousResearch/Yarn-Llama-2-7b-128k

# From original readme

## Usage and Prompt Format

Install FA2 and Rotary Extensions:
```
pip install flash-attn --no-build-isolation
pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary
```

There are no specific prompt formats as this is a pretrained base model.
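
After running the two `pip install` commands, it can help to confirm that both packages are importable before loading the model. The snippet below is a minimal sketch, not part of the original readme. It assumes the import names are `flash_attn` (FA2) and `rotary_emb` (the rotary extension built from `csrc/rotary`); the helper `missing_modules` is a hypothetical name used only for illustration.

```python
import importlib.util

# Assumed import names (not stated in the readme):
# `flash_attn` for FA2, `rotary_emb` for the rotary extension.
REQUIRED_MODULES = ["flash_attn", "rotary_emb"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("FA2 and rotary extensions appear to be installed.")
```

The check only looks the modules up and does not import them, so it runs quickly and does not need a GPU.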