Update metadata with huggingface_hub

This commit is contained in:
ai-modelscope
2024-08-14 02:33:33 +08:00
parent fdf3ac3834
commit 9346c90366
25 changed files with 114 additions and 67 deletions
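The commit title refers to `huggingface_hub`'s metadata-update flow, which merges new keys (here `base_model`) into the model card's YAML frontmatter and pushes the result as a commit like this one. A minimal sketch of just the merge step, using illustrative keys (the real call is `huggingface_hub.metadata_update`, which needs network access and is not shown):

```python
def merge_metadata(existing: dict, updates: dict) -> dict:
    """Merge update keys into existing card metadata; update values win."""
    merged = dict(existing)
    merged.update(updates)
    return merged

# Illustrative frontmatter keys matching the README diff in this commit.
card = {"license": "mit", "quantized_by": "bartowski"}
card = merge_metadata(card, {"base_model": "microsoft/Phi-3-medium-128k-instruct"})
```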

.gitattributes (vendored, 7 additions)

@@ -58,3 +58,10 @@ Phi-3-medium-128k-instruct-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
 Phi-3-medium-128k-instruct-f32.gguf/Phi-3-medium-128k-instruct-f32-00001-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
 Phi-3-medium-128k-instruct-f32.gguf/Phi-3-medium-128k-instruct-f32-00002-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
 Phi-3-medium-128k-instruct.imatrix filter=lfs diff=lfs merge=lfs -text
+Phi-3-medium-128k-instruct-Q6_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+Phi-3-medium-128k-instruct-Q5_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+Phi-3-medium-128k-instruct-Q4_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+Phi-3-medium-128k-instruct-Q3_K_XL.gguf filter=lfs diff=lfs merge=lfs -text
+Phi-3-medium-128k-instruct-Q2_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+Phi-3-medium-128k-instruct-f32/Phi-3-medium-128k-instruct-f32-00001-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+Phi-3-medium-128k-instruct-f32/Phi-3-medium-128k-instruct-f32-00002-of-00002.gguf filter=lfs diff=lfs merge=lfs -text


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4ac128777bcf71ddf40d72bfbfdf5c028b15a19d76a2cbc3a0a101560a3f1ff9
-size 4717006368
+oid sha256:e4bca6c2b6bc2795200836dd61816e1fdba6ca405197ce6a5959a43a14b1f06d
+size 4717006912
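Each of these hunks updates a Git LFS pointer file: three `key value` text lines recording the LFS spec version, a `sha256` oid, and the payload size in bytes. A minimal parser sketch (the function name is hypothetical):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "oid": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),
    }

# The new pointer contents from the hunk above.
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:e4bca6c2b6bc2795200836dd61816e1fdba6ca405197ce6a5959a43a14b1f06d\n"
    "size 4717006912\n"
)
info = parse_lfs_pointer(pointer)
```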


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9dc512a2ddaed5576fe7ca1a981963c83c04079db3c9c8eabf87ea6690fa74f2
-size 3715757088
+oid sha256:8acde21414d9bf67d727b6a9c851d3de814084277e03053157df5c6ce34aff44
+size 3715757632


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a4a1df7bac1fe1a9991388bc78e5eb057999e9ade7e96d832cbdc377e4a267a8
-size 6473977888
+oid sha256:4012fcc9f5d3f1bc74be6197f64e6ff76a03f667d793f6d470ad732753c65b81
+size 6473978432


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b4a79e3eaaf7daf0653a8b8cadd3b6e9d71be71cc0abbce723170cf8ded3acd4
-size 5806841888
+oid sha256:c1eac74568ef60c5377eb9652250ca10e7854611903ce88fca339f2c01ce5d38
+size 5806842432


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a9098e8e89e69b71592a1f86d06d835652a42e844f0eafad99f71ebbebef379d
-size 7466011168
+oid sha256:56d050a3b7f7e46a9e704d50d905d96458ee746992da80abae04829d5dd62907
+size 7466011712


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:902f408660621f211a4cc11387e86754d1e5cbf15341a076a7342b1e9461a30b
-size 5143000608
+oid sha256:5b98588e7614999356cbf4870d954462c7919af5618796fa305d392d1ec90d80
+size 5143001152


@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b846825b4f9a538de034d883b90fd91b8be102e26cc3ec81c51e286146d54457
+size 5303321152


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d2c7f00f5b2ba6bbd185a9b404f02a2705a8c4709513734372f0d32efc20ab98
-size 7490297888
+oid sha256:cbdedbff19cf9a06a7b3adf6e28f51bbbc75d0099ec55306112949643cd242d6
+size 7490298432


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:38b3f72bdd536b163168583ce8deef6ac4966444c67c699b1ee2e51e82d3fc42
-size 6923411488
+oid sha256:2fff291dc1f2831d4b4beaf1a41c05cb4cc879ee2abdcc85468f2b6d6e9e4d1d
+size 6923412032


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7c323ef2d5a5e9bc2a7bbb90a460301c90e271d9404545b8b1ab134ab228a87d
-size 6064889888
+oid sha256:a5834dd17ba9f837cb17fccaa8ee5e7671db140527778a4f3bb26dad050bc3bd
+size 6064890432


@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8d2563b861c2b419af34a96f0dc26aa5a182cd77624ee7dcebf4b16b1c77d7fb
+size 7633945152


@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:88633f358967eb7057ef9d95ea44b1362450282d64cb98ba3c8c8a9c5014d67a
+size 8688665152


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d8ed734336b8b83977874afc85f11674bf3b49a92e8e02ac4197e6812f10c242
-size 8566821408
+oid sha256:6adc92faaf06bca4f706f279a569402b727be5a02ec8600d52fac3dce23ae43d
+size 8566821952


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d39678f87e2c17fccc545955f5d61cc26f3a592c56d503cfea0d7d0e7ac6cf81
-size 7954469408
+oid sha256:8dd7fee0b0184a9766bd2e96327773dc22366a342aa908df5ace93dddd0f7c3d
+size 7954469952


@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8a64dc50d4aa2a33fddacd7fb58888aa60fe5b99baff292e4e364b4ab534c683
+size 10175513152


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5881f155da96899eb0639ae35b8b8f072267b96ca0955926cd5ef8bf6cee9fbd
-size 10074190368
+oid sha256:d06da23c184ad7a0e7c2a030a621907816e6efc4c15acf9612e10f8ba7f81d4f
+size 10074190912


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a0da98dc6e5f1929e4a4343b4983e345e0202ce5ffdac578ee5947c79fdd01f9
-size 9621582368
+oid sha256:d2289f1c08637e3d2583f4b57c6d1b462f1360a2c8e4bedaba6b06104b7f9383
+size 9621582912


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fb6337cbbe910dd9a8eb5078b30e09174853a2b5bbf8cebe0fff97d2e8ab9106
-size 11453817888
+oid sha256:806f6fa6b9776843f655a845ce269e208fb4ccb432aa29bc3481281c051543f2
+size 11453818432


@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e0cb8b3cd4209825c9917855b8d590f47aaed543f9163fd436799a88c3704a0e
+size 11533337152


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3d344e88ec30ae94b506159a43106ce55126687a2427ff7821f1ebf5dd42a648
-size 14834712608
+oid sha256:5b38b0907c2c72c679b847801ffa214f33f08818b835a057af6932690c189157
+size 14834713152


@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cccfd7a472e2ee107db900c9fdf3b5ce01b55d3c505d06f749ac5d4dc35bc4ca
+size 39455887040


@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:292e6ddb549b091bca18727230bad79b5968412064895282af7efd6e20096544
+size 16385807136


@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:499a8bb5daebfc977e82ecc53018aa645e92dff3d869c2ef74eddf68f04c858f
-size 5330287
+oid sha256:34604d58d7a6b2d3ab9f45abf13961d9180228cd0ec943b1ff655f9422f0a988
+size 5330288


@@ -1,31 +1,32 @@
 ---
-license: mit
-license_link: https://huggingface.co/microsoft/Phi-3-medium-128k-instruct/resolve/main/LICENSE
+base_model: microsoft/Phi-3-medium-128k-instruct
 language:
 - multilingual
+license: mit
+license_link: https://huggingface.co/microsoft/Phi-3-medium-128k-instruct/resolve/main/LICENSE
 pipeline_tag: text-generation
 tags:
 - phi3
 - nlp
 - code
+quantized_by: bartowski
 inference:
   parameters:
     temperature: 0.7
 widget:
-  - messages:
+- messages:
   - role: user
     content: Can you provide ways to eat combinations of bananas and dragonfruits?
-quantized_by: bartowski
 ---
 ## Llamacpp imatrix Quantizations of Phi-3-medium-128k-instruct
-Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> pull request <a href="https://github.com/ggerganov/llama.cpp/pull/7225">7225</a> for quantization.
+Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3561">b3561</a> for quantization.
 Original model: https://huggingface.co/microsoft/Phi-3-medium-128k-instruct
-All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/b6ac44691e994344625687afe3263b3a)
+All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)
 Run them in [LM Studio](https://lmstudio.ai/)
 ## Prompt format
@@ -33,32 +34,49 @@ All quants made using imatrix option with dataset from [here](https://gist.githu
 <|user|> {prompt}<|end|><|assistant|><|end|>
 ```
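The template above can be filled in programmatically; a minimal sketch (the function name is illustrative, and the trailing `<|end|>` shown in the template is the stop token the model emits, so the prompt sent to the model ends at `<|assistant|>`):

```python
def build_phi3_prompt(user_message: str) -> str:
    """Format a single-turn prompt using the Phi-3 chat template above."""
    return f"<|user|> {user_message}<|end|><|assistant|>"
```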
+## What's new:
+Updating to latest llama.cpp for rope fixes (thanks Niluayuk)
 ## Download a file (not the whole branch) from below:
-| Filename | Quant type | File Size | Description |
-| -------- | ---------- | --------- | ----------- |
-| [Phi-3-medium-128k-instruct-Q8_0.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q8_0.gguf) | Q8_0 | 14.83GB | Extremely high quality, generally unneeded but max available quant. |
-| [Phi-3-medium-128k-instruct-Q6_K.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q6_K.gguf) | Q6_K | 11.45GB | Very high quality, near perfect, *recommended*. |
-| [Phi-3-medium-128k-instruct-Q5_K_M.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q5_K_M.gguf) | Q5_K_M | 10.07GB | High quality, *recommended*. |
-| [Phi-3-medium-128k-instruct-Q5_K_S.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q5_K_S.gguf) | Q5_K_S | 9.62GB | High quality, *recommended*. |
-| [Phi-3-medium-128k-instruct-Q4_K_M.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q4_K_M.gguf) | Q4_K_M | 8.56GB | Good quality, uses about 4.83 bits per weight, *recommended*. |
-| [Phi-3-medium-128k-instruct-Q4_K_S.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q4_K_S.gguf) | Q4_K_S | 7.95GB | Slightly lower quality with more space savings, *recommended*. |
-| [Phi-3-medium-128k-instruct-IQ4_NL.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ4_NL.gguf) | IQ4_NL | 7.89GB | Decent quality, slightly smaller than Q4_K_S with similar performance *recommended*. |
-| [Phi-3-medium-128k-instruct-IQ4_XS.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ4_XS.gguf) | IQ4_XS | 7.46GB | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
-| [Phi-3-medium-128k-instruct-Q3_K_L.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q3_K_L.gguf) | Q3_K_L | 7.49GB | Lower quality but usable, good for low RAM availability. |
-| [Phi-3-medium-128k-instruct-Q3_K_M.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q3_K_M.gguf) | Q3_K_M | 6.92GB | Even lower quality. |
-| [Phi-3-medium-128k-instruct-IQ3_M.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ3_M.gguf) | IQ3_M | 6.47GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
-| [Phi-3-medium-128k-instruct-IQ3_S.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ3_S.gguf) | IQ3_S | 6.06GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
-| [Phi-3-medium-128k-instruct-Q3_K_S.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q3_K_S.gguf) | Q3_K_S | 6.06GB | Low quality, not recommended. |
-| [Phi-3-medium-128k-instruct-IQ3_XS.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ3_XS.gguf) | IQ3_XS | 5.80GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
-| [Phi-3-medium-128k-instruct-IQ3_XXS.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ3_XXS.gguf) | IQ3_XXS | 5.45GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
-| [Phi-3-medium-128k-instruct-Q2_K.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q2_K.gguf) | Q2_K | 5.14GB | Very low quality but surprisingly usable. |
-| [Phi-3-medium-128k-instruct-IQ2_M.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ2_M.gguf) | IQ2_M | 4.71GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
-| [Phi-3-medium-128k-instruct-IQ2_S.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ2_S.gguf) | IQ2_S | 4.33GB | Very low quality, uses SOTA techniques to be usable. |
-| [Phi-3-medium-128k-instruct-IQ2_XS.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ2_XS.gguf) | IQ2_XS | 4.12GB | Very low quality, uses SOTA techniques to be usable. |
-| [Phi-3-medium-128k-instruct-IQ2_XXS.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ2_XXS.gguf) | IQ2_XXS | 3.71GB | Lower quality, uses SOTA techniques to be usable. |
-| [Phi-3-medium-128k-instruct-IQ1_M.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ1_M.gguf) | IQ1_M | 3.24GB | Extremely low quality, *not* recommended. |
-| [Phi-3-medium-128k-instruct-IQ1_S.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ1_S.gguf) | IQ1_S | 2.95GB | Extremely low quality, *not* recommended. |
+| Filename | Quant type | File Size | Split | Description |
+| -------- | ---------- | --------- | ----- | ----------- |
+| [Phi-3-medium-128k-instruct-f32.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/tree/main/Phi-3-medium-128k-instruct-f32) | f32 | 55.84GB | true | Full F32 weights. |
+| [Phi-3-medium-128k-instruct-Q8_0.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q8_0.gguf) | Q8_0 | 14.83GB | false | Extremely high quality, generally unneeded but max available quant. |
+| [Phi-3-medium-128k-instruct-Q6_K_L.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q6_K_L.gguf) | Q6_K_L | 11.53GB | false | Uses Q8_0 for embed and output weights. Very high quality, near perfect, *recommended*. |
+| [Phi-3-medium-128k-instruct-Q6_K.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q6_K.gguf) | Q6_K | 11.45GB | false | Very high quality, near perfect, *recommended*. |
+| [Phi-3-medium-128k-instruct-Q5_K_L.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q5_K_L.gguf) | Q5_K_L | 10.18GB | false | Uses Q8_0 for embed and output weights. High quality, *recommended*. |
+| [Phi-3-medium-128k-instruct-Q5_K_M.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q5_K_M.gguf) | Q5_K_M | 10.07GB | false | High quality, *recommended*. |
+| [Phi-3-medium-128k-instruct-Q5_K_S.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q5_K_S.gguf) | Q5_K_S | 9.62GB | false | High quality, *recommended*. |
+| [Phi-3-medium-128k-instruct-Q4_K_L.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q4_K_L.gguf) | Q4_K_L | 8.69GB | false | Uses Q8_0 for embed and output weights. Good quality, *recommended*. |
+| [Phi-3-medium-128k-instruct-Q4_K_M.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q4_K_M.gguf) | Q4_K_M | 8.57GB | false | Good quality, default size for most use cases, *recommended*. |
+| [Phi-3-medium-128k-instruct-Q4_K_S.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q4_K_S.gguf) | Q4_K_S | 7.95GB | false | Slightly lower quality with more space savings, *recommended*. |
+| [Phi-3-medium-128k-instruct-Q3_K_XL.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q3_K_XL.gguf) | Q3_K_XL | 7.63GB | false | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
+| [Phi-3-medium-128k-instruct-Q3_K_L.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q3_K_L.gguf) | Q3_K_L | 7.49GB | false | Lower quality but usable, good for low RAM availability. |
+| [Phi-3-medium-128k-instruct-IQ4_XS.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ4_XS.gguf) | IQ4_XS | 7.47GB | false | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
+| [Phi-3-medium-128k-instruct-Q3_K_M.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q3_K_M.gguf) | Q3_K_M | 6.92GB | false | Low quality. |
+| [Phi-3-medium-128k-instruct-IQ3_M.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ3_M.gguf) | IQ3_M | 6.47GB | false | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
+| [Phi-3-medium-128k-instruct-Q3_K_S.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q3_K_S.gguf) | Q3_K_S | 6.06GB | false | Low quality, not recommended. |
+| [Phi-3-medium-128k-instruct-IQ3_XS.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ3_XS.gguf) | IQ3_XS | 5.81GB | false | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
+| [Phi-3-medium-128k-instruct-Q2_K_L.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q2_K_L.gguf) | Q2_K_L | 5.30GB | false | Uses Q8_0 for embed and output weights. Very low quality but surprisingly usable. |
+| [Phi-3-medium-128k-instruct-Q2_K.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-Q2_K.gguf) | Q2_K | 5.14GB | false | Very low quality but surprisingly usable. |
+| [Phi-3-medium-128k-instruct-IQ2_M.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ2_M.gguf) | IQ2_M | 4.72GB | false | Relatively low quality, uses SOTA techniques to be surprisingly usable. |
+| [Phi-3-medium-128k-instruct-IQ2_XXS.gguf](https://huggingface.co/bartowski/Phi-3-medium-128k-instruct-GGUF/blob/main/Phi-3-medium-128k-instruct-IQ2_XXS.gguf) | IQ2_XXS | 3.72GB | false | Very low quality, uses SOTA techniques to be usable. |
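The file sizes in the table translate into rough bits-per-weight figures; a back-of-the-envelope sketch, assuming a ~14B parameter count for Phi-3-medium (an assumption, since the card does not state the exact count):

```python
PARAM_COUNT = 14e9  # assumed parameter count for Phi-3-medium

def bits_per_weight(file_size_gb: float) -> float:
    """Estimate mean bits per weight from a quant's file size in GB."""
    return file_size_gb * 1e9 * 8 / PARAM_COUNT

q4_k_m = bits_per_weight(8.57)  # roughly 4.9 under these assumptions
```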
+## Embed/output weights
+Some of these quants (Q3_K_XL, Q4_K_L etc) are the standard quantization method with the embeddings and output weights quantized to Q8_0 instead of what they would normally default to.
+Some say that this improves the quality, others don't notice any difference. If you use these models PLEASE COMMENT with your findings. I would like feedback that these are actually used and useful so I don't keep uploading quants no one is using.
+Thanks!
+## Credits
+Thank you kalomaze and Dampf for assistance in creating the imatrix calibration dataset
+Thank you ZeroWw for the inspiration to experiment with embed/output
 ## Downloading using huggingface-cli
@@ -77,7 +95,7 @@ huggingface-cli download bartowski/Phi-3-medium-128k-instruct-GGUF --include "Ph
 If the model is bigger than 50GB, it will have been split into multiple files. In order to download them all to a local folder, run:
 ```
-huggingface-cli download bartowski/Phi-3-medium-128k-instruct-GGUF --include "Phi-3-medium-128k-instruct-Q8_0.gguf/*" --local-dir Phi-3-medium-128k-instruct-Q8_0
+huggingface-cli download bartowski/Phi-3-medium-128k-instruct-GGUF --include "Phi-3-medium-128k-instruct-Q8_0/*" --local-dir ./
 ```
 You can either specify a new local-dir (Phi-3-medium-128k-instruct-Q8_0) or download them all in place (./)
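The `--include` flag selects repository files by glob pattern. The same selection can be reproduced with stdlib `fnmatch` (shown here on hypothetical shard names to avoid a network call):

```python
from fnmatch import fnmatch

# Hypothetical file listing for a split quant directory.
repo_files = [
    "Phi-3-medium-128k-instruct-Q8_0/shard-00001-of-00002.gguf",
    "Phi-3-medium-128k-instruct-Q8_0/shard-00002-of-00002.gguf",
    "README.md",
]

# Same glob as the --include argument above.
selected = [f for f in repo_files if fnmatch(f, "Phi-3-medium-128k-instruct-Q8_0/*")]
```

With `huggingface_hub` installed, roughly the same filter can be applied while downloading via `snapshot_download(repo_id=..., allow_patterns=["Phi-3-medium-128k-instruct-Q8_0/*"], local_dir=".")`.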
@@ -107,3 +125,4 @@ These I-quants can also be used on CPU and Apple Metal, but will be slower than
 The I-quants are *not* compatible with Vulkan, which also supports AMD cards, so if you have an AMD card double check whether you're using the rocBLAS build or the Vulkan build. At the time of writing this, LM Studio has a preview with ROCm support, and other inference engines have specific builds for ROCm.
 Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski