Initialize project; model provided by the ModelHub XC community
Model: duyntnet/Yarn-Llama-2-7b-128k-imatrix-GGUF Source: Original Platform
26
README.md
Normal file
@@ -0,0 +1,26 @@
---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- Yarn-Llama-2-7b-128k
---
Quantizations of https://huggingface.co/NousResearch/Yarn-Llama-2-7b-128k

# From original readme

## Usage and Prompt Format

Install FA2 and Rotary Extensions:
```
pip install flash-attn --no-build-isolation
pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary
```

There are no specific prompt formats as this is a pretrained base model.
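
Because the model is a pretrained base model, a prompt is just the text to be continued. The sketch below illustrates that with the `llama-cpp-python` package, which is an assumption here: the README does not name a runtime for the GGUF files. The helper names `completion_kwargs` and `complete`, the model path, and the `n_ctx` value are all hypothetical.

```python
from typing import Any, Dict


def completion_kwargs(text: str, max_tokens: int = 64) -> Dict[str, Any]:
    """Build arguments for a plain completion call.

    The prompt is the raw text: no system prompt, role tags, or chat template.
    """
    return {"prompt": text, "max_tokens": max_tokens}


def complete(model_path: str, text: str, max_tokens: int = 64) -> str:
    """Continue `text` using a local GGUF file (assumes llama-cpp-python)."""
    # Imported lazily so completion_kwargs works without the package installed.
    from llama_cpp import Llama

    # n_ctx=4096 is an arbitrary illustrative value, not a recommendation.
    llm = Llama(model_path=model_path, n_ctx=4096)
    result = llm(**completion_kwargs(text, max_tokens))
    return result["choices"][0]["text"]
```

A call might look like `complete("path/to/quantized-model.gguf", "The capital of France is")`, where the path is a placeholder for whichever quantization file was downloaded.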