---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- Yarn-Llama-2-7b-128k
---
Quantizations of https://huggingface.co/NousResearch/Yarn-Llama-2-7b-128k

# From original readme

## Usage and Prompt Format

Install FA2 and Rotary Extensions:
```
pip install flash-attn --no-build-isolation
pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary
```

There are no specific prompt formats as this is a pretrained base model.
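To confirm that the two install commands above took effect, a quick import check along these lines may help. It is a minimal sketch, not part of the original readme: the module names `flash_attn` and `rotary_emb` are assumptions about what those packages install, and `check_extensions` is a hypothetical helper name.

```python
import importlib.util

# Assumed import names (not stated in the readme):
# "flash_attn" for FA2 and "rotary_emb" for the rotary extension.
REQUIRED_MODULES = ("flash_attn", "rotary_emb")


def check_extensions(modules=REQUIRED_MODULES):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Report each module without importing it, so a broken build cannot crash the check.
    for name, found in check_extensions().items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

The check only looks modules up and never imports them, so it runs on machines without a GPU. A module reported as missing suggests re-running the matching `pip install` line.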