Update metadata with huggingface_hub

ai-modelscope committed on 2024-11-18 19:09:24 +08:00
parent 17091aeb3d
commit 9bf5b86c10
28 changed files with 239 additions and 55 deletions

.gitattributes

@@ -1,38 +1,60 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zstandard filter=lfs diff=lfs merge=lfs -text
*.tfevents* filter=lfs diff=lfs merge=lfs -text
*.db* filter=lfs diff=lfs merge=lfs -text
*.ark* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*data* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.index filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.gguf* filter=lfs diff=lfs merge=lfs -text
*.ggml filter=lfs diff=lfs merge=lfs -text
*.llamafile* filter=lfs diff=lfs merge=lfs -text
*.pt2 filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q6_K_L.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q5_K_L.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q4_K_L.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q4_0_8_8.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q4_0_4_8.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q4_0_4_4.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-IQ4_XS.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q3_K_XL.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-IQ3_M.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-IQ3_XS.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q2_K_L.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-IQ2_M.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B-f16.gguf filter=lfs diff=lfs merge=lfs -text
MathCoder2-Llama-3-8B.imatrix filter=lfs diff=lfs merge=lfs -text

MathCoder2-Llama-3-8B-IQ2_M.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:76a6ed87f988a501d25e5262cdc51a028200fb1ae61d38ca09f511e2e0795374
size 2948281504

MathCoder2-Llama-3-8B-IQ3_M.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b4155c5848c764cfe9a45353ba4a6551854e88a9dc67ff0f97f917ef995bede3
size 3784823968

MathCoder2-Llama-3-8B-IQ3_XS.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0ffb0867e31e148130816da44c6117f73de2a793845c846297ef00f8bce9e5a1
size 3518747808

MathCoder2-Llama-3-8B-IQ4_XS.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0acc954608450800f06b50d08276704b6e74d6524bdd2f14e0d75c8a651caa67
size 4447663264

MathCoder2-Llama-3-8B-Q2_K.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7eb97040609eca94acc3547f74619a7e4d0069eddc9d7d99f933cbf233cc0d60
size 3179132064

MathCoder2-Llama-3-8B-Q2_K_L.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6631ed8fcc95ca51e6b38c819c35809ceb1235ff55ce89c09ad559d0e832c4ac
size 3692156064

MathCoder2-Llama-3-8B-Q3_K_L.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:528f9075020b857a3e6a35f68e4eac12aa50f64d74bddcc68fd98b65a9f7f245
size 4321957024

MathCoder2-Llama-3-8B-Q3_K_M.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a9cd4671b800ea46f2f58c49a74c6fd3cbbc3074aa8140d663d81390c3d2d6bf
size 4018918560

MathCoder2-Llama-3-8B-Q3_K_S.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f359ff2e3282830687ecd25f87233931e09014c6a007cc185c6755319094c100
size 3664499872

MathCoder2-Llama-3-8B-Q3_K_XL.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f84b5013653dea4d55dae6d34bc252d544df1f1baca845c58c9aea0927a33132
size 4781626528

MathCoder2-Llama-3-8B-Q4_0.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cff219417d315e0e9b7e303a21eaf50f51120802a2191867dec667c0e00ef2af
size 4675892384

MathCoder2-Llama-3-8B-Q4_0_4_4.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:15d27aa194d9f4a4b6c20e13a9cff9ed5da17acfa40c7e334e92a1544432450e
size 4661212320

MathCoder2-Llama-3-8B-Q4_0_4_8.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1e847f82196f409270c25a70356f31c2724545d3b8558da8b5528ae78303dbff
size 4661212320

MathCoder2-Llama-3-8B-Q4_0_8_8.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f2e2a2f102a9bacf4838d6e0177590d75cbd2dd7e0a4872d7aad844e4460ddac
size 4661212320

MathCoder2-Llama-3-8B-Q4_K_L.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9adfc12aa1d251864b4851d2dbdfe55b7b30e5e5deba1545ab0b6f21b11b536d
size 5310633120

MathCoder2-Llama-3-8B-Q4_K_M.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6db5ea4b41e2c5ed22e787093de1a235e60cab9ed322567f709631248c3706c9
size 4920734880

MathCoder2-Llama-3-8B-Q4_K_S.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4e120a310fcdf668c03800ad2b3d40d393414069bce37895099f54397a96b041
size 4692669600

MathCoder2-Llama-3-8B-Q5_K_L.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1cf034be90d287200f81d538f03361afa409d6cab63a66b8f02de2a93979a628
size 6057219232

MathCoder2-Llama-3-8B-Q5_K_M.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cb0db32b74650f8ca359d2b79826893826403b9e44b5e709354c49134de1e9bd
size 5732988064

MathCoder2-Llama-3-8B-Q5_K_S.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9126307146da8acfd9c9762f61c68589dd599e5226edb895f467324b9f4a6408
size 5599294624

MathCoder2-Llama-3-8B-Q6_K.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:beb748e3d1e190e8c8d17770175a5521ddd671cca76aec8d540bdcc58b15665c
size 6596007072

MathCoder2-Llama-3-8B-Q6_K_L.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:127eb9ee5d41dc98cf9c38fb703b85070defd215fc6e11cebe65e46cce3f2f2e
size 6850466976

MathCoder2-Llama-3-8B-Q8_0.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:83bf82efc14bea08fa6248fa8e564e5db8ae3e0b4dcb917f9d10c3e3f1ff11e2
size 8540771488

MathCoder2-Llama-3-8B-f16.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d32e7ac139d48b0b7324bf99750b428c99d82d46f2136cb66a36ab0753396fae
size 16068891552

MathCoder2-Llama-3-8B.imatrix Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3c8d3165fdbcb984ccfc112d6d5e9fe328041faf1850893327e68778e5788cae
size 4988170

README.md

@@ -1,47 +1,133 @@
---
license: Apache License 2.0
#model-type:
## e.g. gpt, phi, llama, chatglm, baichuan, etc.
#- gpt
#domain:
## e.g. nlp, cv, audio, multi-modal
#- nlp
#language:
## list of language codes: https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa
#- cn
#metrics:
## e.g. CIDEr, BLEU, ROUGE, etc.
#- CIDEr
#tags:
## custom tags, including training methods such as pretrained, fine-tuned, instruction-tuned, RL-tuned, and others
#- pretrained
#tools:
## e.g. vllm, fastchat, llamacpp, AdaSeq, etc.
#- vllm
base_model: MathGenie/MathCoder2-Llama-3-8B
datasets:
- MathGenie/MathCode-Pile
language:
- en
license: apache-2.0
metrics:
- accuracy
pipeline_tag: text-generation
tags:
- math
quantized_by: bartowski
---
### The contributors to this model have not yet provided a more detailed model description. Model files and weights can be browsed on the "Model Files" page.
#### You can download the model with the git clone command below or with the ModelScope SDK
SDK download
```bash
# Install ModelScope
pip install modelscope
```
```python
# Download the model with the ModelScope SDK
from modelscope import snapshot_download
model_dir = snapshot_download('bartowski/MathCoder2-Llama-3-8B-GGUF')
```
Git download
```bash
# Download the model with git
git clone https://www.modelscope.cn/bartowski/MathCoder2-Llama-3-8B-GGUF.git
```
<p style="color: lightgrey;">If you are a contributor to this model, we invite you to complete the model card promptly, following the <a href="https://modelscope.cn/docs/ModelScope%E6%A8%A1%E5%9E%8B%E6%8E%A5%E5%85%A5%E6%B5%81%E7%A8%8B%E6%A6%82%E8%A7%88" style="color: lightgrey; text-decoration: underline;">model contribution documentation</a>.</p>
## Llamacpp imatrix Quantizations of MathCoder2-Llama-3-8B
Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3901">b3901</a> for quantization.
Original model: https://huggingface.co/MathGenie/MathCoder2-Llama-3-8B
All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)
Run them in [LM Studio](https://lmstudio.ai/)
## Prompt format
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```
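The template above is plain text, so you can fill it without any tokenizer helpers. A minimal Python sketch (the helper function and example strings are illustrative, not part of this repo):
```python
# Minimal sketch: fill the Llama 3 prompt template above by hand.
def build_prompt(system_prompt: str, prompt: str) -> str:
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(build_prompt("You are a helpful math assistant.", "What is 12 * 13?"))
```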
## Download a file (not the whole branch) from below:
| Filename | Quant type | File Size | Split | Description |
| -------- | ---------- | --------- | ----- | ----------- |
| [MathCoder2-Llama-3-8B-f16.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-f16.gguf) | f16 | 16.07GB | false | Full F16 weights. |
| [MathCoder2-Llama-3-8B-Q8_0.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q8_0.gguf) | Q8_0 | 8.54GB | false | Extremely high quality, generally unneeded but max available quant. |
| [MathCoder2-Llama-3-8B-Q6_K_L.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q6_K_L.gguf) | Q6_K_L | 6.85GB | false | Uses Q8_0 for embed and output weights. Very high quality, near perfect, *recommended*. |
| [MathCoder2-Llama-3-8B-Q6_K.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q6_K.gguf) | Q6_K | 6.60GB | false | Very high quality, near perfect, *recommended*. |
| [MathCoder2-Llama-3-8B-Q5_K_L.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q5_K_L.gguf) | Q5_K_L | 6.06GB | false | Uses Q8_0 for embed and output weights. High quality, *recommended*. |
| [MathCoder2-Llama-3-8B-Q5_K_M.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q5_K_M.gguf) | Q5_K_M | 5.73GB | false | High quality, *recommended*. |
| [MathCoder2-Llama-3-8B-Q5_K_S.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q5_K_S.gguf) | Q5_K_S | 5.60GB | false | High quality, *recommended*. |
| [MathCoder2-Llama-3-8B-Q4_K_L.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q4_K_L.gguf) | Q4_K_L | 5.31GB | false | Uses Q8_0 for embed and output weights. Good quality, *recommended*. |
| [MathCoder2-Llama-3-8B-Q4_K_M.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q4_K_M.gguf) | Q4_K_M | 4.92GB | false | Good quality, default size for most use cases, *recommended*. |
| [MathCoder2-Llama-3-8B-Q3_K_XL.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q3_K_XL.gguf) | Q3_K_XL | 4.78GB | false | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
| [MathCoder2-Llama-3-8B-Q4_K_S.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q4_K_S.gguf) | Q4_K_S | 4.69GB | false | Slightly lower quality with more space savings, *recommended*. |
| [MathCoder2-Llama-3-8B-Q4_0.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q4_0.gguf) | Q4_0 | 4.68GB | false | Legacy format, generally not worth using over similarly sized formats. |
| [MathCoder2-Llama-3-8B-Q4_0_8_8.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q4_0_8_8.gguf) | Q4_0_8_8 | 4.66GB | false | Optimized for ARM inference. Requires 'sve' support (see link below). *Don't use on Mac or Windows*. |
| [MathCoder2-Llama-3-8B-Q4_0_4_8.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q4_0_4_8.gguf) | Q4_0_4_8 | 4.66GB | false | Optimized for ARM inference. Requires 'i8mm' support (see link below). *Don't use on Mac or Windows*. |
| [MathCoder2-Llama-3-8B-Q4_0_4_4.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q4_0_4_4.gguf) | Q4_0_4_4 | 4.66GB | false | Optimized for ARM inference. Should work well on all ARM chips, pick this if you're unsure. *Don't use on Mac or Windows*. |
| [MathCoder2-Llama-3-8B-IQ4_XS.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-IQ4_XS.gguf) | IQ4_XS | 4.45GB | false | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
| [MathCoder2-Llama-3-8B-Q3_K_L.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q3_K_L.gguf) | Q3_K_L | 4.32GB | false | Lower quality but usable, good for low RAM availability. |
| [MathCoder2-Llama-3-8B-Q3_K_M.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q3_K_M.gguf) | Q3_K_M | 4.02GB | false | Low quality. |
| [MathCoder2-Llama-3-8B-IQ3_M.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-IQ3_M.gguf) | IQ3_M | 3.78GB | false | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
| [MathCoder2-Llama-3-8B-Q2_K_L.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q2_K_L.gguf) | Q2_K_L | 3.69GB | false | Uses Q8_0 for embed and output weights. Very low quality but surprisingly usable. |
| [MathCoder2-Llama-3-8B-Q3_K_S.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q3_K_S.gguf) | Q3_K_S | 3.66GB | false | Low quality, not recommended. |
| [MathCoder2-Llama-3-8B-IQ3_XS.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-IQ3_XS.gguf) | IQ3_XS | 3.52GB | false | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
| [MathCoder2-Llama-3-8B-Q2_K.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-Q2_K.gguf) | Q2_K | 3.18GB | false | Very low quality but surprisingly usable. |
| [MathCoder2-Llama-3-8B-IQ2_M.gguf](https://huggingface.co/bartowski/MathCoder2-Llama-3-8B-GGUF/blob/main/MathCoder2-Llama-3-8B-IQ2_M.gguf) | IQ2_M | 2.95GB | false | Relatively low quality, uses SOTA techniques to be surprisingly usable. |
## Embed/output weights
Some of these quants (Q3_K_XL, Q4_K_L, etc.) use the standard quantization method with the embedding and output weights quantized to Q8_0 instead of their normal default.
Some say this improves the quality, while others don't notice any difference. If you use these models, PLEASE COMMENT with your findings. I would like feedback that these are actually used and useful so I don't keep uploading quants no one is using.
Thanks!
## Downloading using huggingface-cli
First, make sure you have huggingface-cli installed:
```bash
pip install -U "huggingface_hub[cli]"
```
Then, you can target the specific file you want:
```bash
huggingface-cli download bartowski/MathCoder2-Llama-3-8B-GGUF --include "MathCoder2-Llama-3-8B-Q4_K_M.gguf" --local-dir ./
```
If the model is bigger than 50GB, it will have been split into multiple files. In order to download them all to a local folder, run:
```bash
huggingface-cli download bartowski/MathCoder2-Llama-3-8B-GGUF --include "MathCoder2-Llama-3-8B-Q8_0/*" --local-dir ./
```
You can either specify a new local-dir (MathCoder2-Llama-3-8B-Q8_0) or download them all in place (./).
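The same downloads also work from Python via `huggingface_hub`; here is a minimal sketch (the quant chosen is just an example):
```python
# Minimal sketch: download one quant file with huggingface_hub instead of the CLI.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="bartowski/MathCoder2-Llama-3-8B-GGUF",
    filename="MathCoder2-Llama-3-8B-Q4_K_M.gguf",  # example; pick any file from the table
    local_dir=".",
)
print(path)
```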
## Q4_0_X_X
These are *NOT* for Metal (Apple) offloading, only ARM chips.
If you're using an ARM chip, the Q4_0_X_X quants will have a substantial speedup. Check out Q4_0_4_4 speed comparisons [on the original pull request](https://github.com/ggerganov/llama.cpp/pull/5780#pullrequestreview-21657544660)
To check which one would work best for your ARM chip, you can check [AArch64 SoC features](https://gpages.juszkiewicz.com.pl/arm-socs-table/arm-socs.html) (thanks EloyOn!).
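On Linux you can also read the relevant CPU flags directly from /proc/cpuinfo; a minimal sketch (the flag-to-variant mapping follows the table descriptions above, as a rough heuristic rather than an official detection tool):
```python
# Minimal sketch: suggest a Q4_0_X_X variant from ARM CPU flags (Linux only).
def cpu_flags() -> set:
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("Features"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
if "sve" in flags:
    print("Q4_0_8_8 (sve)")
elif "i8mm" in flags:
    print("Q4_0_4_8 (i8mm)")
else:
    print("Q4_0_4_4 (baseline ARM, or not an ARM chip)")
```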
## Which file should I choose?
A great write-up with charts showing various performances is provided by Artefact2 [here](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9).
The first thing to figure out is how big a model you can run. To do this, you'll need to figure out how much RAM and/or VRAM you have.
If you want your model running as FAST as possible, you'll want to fit the whole thing on your GPU's VRAM. Aim for a quant with a file size 1-2GB smaller than your GPU's total VRAM.
If you want the absolute maximum quality, add both your system RAM and your GPU's VRAM together, then similarly grab a quant with a file size 1-2GB smaller than that total.
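That sizing rule is easy to script; a minimal sketch (sizes copied from the download table above, the 1.5GB headroom reflects the 1-2GB guideline, and the helper itself is hypothetical):
```python
# Minimal sketch: pick the largest quant that fits a memory budget with headroom.
QUANT_SIZES_GB = {  # a subset of the download table above
    "Q8_0": 8.54, "Q6_K_L": 6.85, "Q6_K": 6.60, "Q5_K_M": 5.73,
    "Q4_K_M": 4.92, "IQ4_XS": 4.45, "Q3_K_M": 4.02, "IQ2_M": 2.95,
}

def pick_quant(memory_gb: float, headroom_gb: float = 1.5) -> str:
    budget = memory_gb - headroom_gb
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s <= budget}
    if not fitting:
        raise ValueError("No quant fits; consider a smaller model.")
    return max(fitting, key=fitting.get)

print(pick_quant(8.0))  # 8GB of VRAM -> Q5_K_M (5.73GB)
```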
Next, you'll need to decide if you want to use an 'I-quant' or a 'K-quant'.
If you don't want to think too much, grab one of the K-quants. These are in format 'QX_K_X', like Q5_K_M.
If you want to get more into the weeds, you can check out this extremely useful feature chart:
[llama.cpp feature matrix](https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix)
But basically, if you're aiming for below Q4, and you're running cuBLAS (Nvidia) or rocBLAS (AMD), you should look towards the I-quants. These are in format IQX_X, like IQ3_M. These are newer and offer better performance for their size.
These I-quants can also be used on CPU and Apple Metal, but will be slower than their K-quant equivalents, so speed versus performance is a tradeoff you'll have to decide on.
The I-quants are *not* compatible with Vulkan, which also targets AMD, so if you have an AMD card double check whether you're using the rocBLAS build or the Vulkan build. At the time of writing, LM Studio has a preview with ROCm support, and other inference engines have specific builds for ROCm.
## Credits
Thank you kalomaze and Dampf for assistance in creating the imatrix calibration dataset.
Thank you ZeroWw for the inspiration to experiment with embed/output weights.
Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski

configuration.json Normal file

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}