diff --git a/.gitattributes b/.gitattributes index 53d7257..f1ebb47 100644 --- a/.gitattributes +++ b/.gitattributes @@ -1,47 +1,59 @@ *.7z filter=lfs diff=lfs merge=lfs -text *.arrow filter=lfs diff=lfs merge=lfs -text *.bin filter=lfs diff=lfs merge=lfs -text -*.bin.* filter=lfs diff=lfs merge=lfs -text *.bz2 filter=lfs diff=lfs merge=lfs -text +*.ckpt filter=lfs diff=lfs merge=lfs -text *.ftz filter=lfs diff=lfs merge=lfs -text *.gz filter=lfs diff=lfs merge=lfs -text *.h5 filter=lfs diff=lfs merge=lfs -text *.joblib filter=lfs diff=lfs merge=lfs -text *.lfs.* filter=lfs diff=lfs merge=lfs -text +*.mlmodel filter=lfs diff=lfs merge=lfs -text *.model filter=lfs diff=lfs merge=lfs -text *.msgpack filter=lfs diff=lfs merge=lfs -text +*.npy filter=lfs diff=lfs merge=lfs -text +*.npz filter=lfs diff=lfs merge=lfs -text *.onnx filter=lfs diff=lfs merge=lfs -text *.ot filter=lfs diff=lfs merge=lfs -text *.parquet filter=lfs diff=lfs merge=lfs -text *.pb filter=lfs diff=lfs merge=lfs -text +*.pickle filter=lfs diff=lfs merge=lfs -text +*.pkl filter=lfs diff=lfs merge=lfs -text *.pt filter=lfs diff=lfs merge=lfs -text *.pth filter=lfs diff=lfs merge=lfs -text *.rar filter=lfs diff=lfs merge=lfs -text +*.safetensors filter=lfs diff=lfs merge=lfs -text saved_model/**/* filter=lfs diff=lfs merge=lfs -text *.tar.* filter=lfs diff=lfs merge=lfs -text +*.tar filter=lfs diff=lfs merge=lfs -text *.tflite filter=lfs diff=lfs merge=lfs -text *.tgz filter=lfs diff=lfs merge=lfs -text +*.wasm filter=lfs diff=lfs merge=lfs -text *.xz filter=lfs diff=lfs merge=lfs -text *.zip filter=lfs diff=lfs merge=lfs -text -*.zstandard filter=lfs diff=lfs merge=lfs -text -*.tfevents* filter=lfs diff=lfs merge=lfs -text -*.db* filter=lfs diff=lfs merge=lfs -text -*.ark* filter=lfs diff=lfs merge=lfs -text -**/*ckpt*data* filter=lfs diff=lfs merge=lfs -text -**/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text -**/*ckpt*.index filter=lfs diff=lfs merge=lfs -text -*.safetensors filter=lfs diff=lfs 
merge=lfs -text -*.ckpt filter=lfs diff=lfs merge=lfs -text -*.gguf* filter=lfs diff=lfs merge=lfs -text -*.ggml filter=lfs diff=lfs merge=lfs -text -*.llamafile* filter=lfs diff=lfs merge=lfs -text -*.pt2 filter=lfs diff=lfs merge=lfs -text -*.mlmodel filter=lfs diff=lfs merge=lfs -text -*.npy filter=lfs diff=lfs merge=lfs -text -*.npz filter=lfs diff=lfs merge=lfs -text -*.pickle filter=lfs diff=lfs merge=lfs -text -*.pkl filter=lfs diff=lfs merge=lfs -text -*.tar filter=lfs diff=lfs merge=lfs -text -*.wasm filter=lfs diff=lfs merge=lfs -text *.zst filter=lfs diff=lfs merge=lfs -text -*tfevents* filter=lfs diff=lfs merge=lfs -text \ No newline at end of file +*tfevents* filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-f16.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q6_K_L.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q5_K_L.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q4_K_L.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q4_0.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-IQ4_NL.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-IQ4_XS.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q3_K_XL.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-IQ3_M.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text 
+Falcon3-10B-Instruct-IQ3_XS.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q2_K_L.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-Q2_K.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-IQ2_M.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct-f32.gguf filter=lfs diff=lfs merge=lfs -text +Falcon3-10B-Instruct.imatrix filter=lfs diff=lfs merge=lfs -text diff --git a/Falcon3-10B-Instruct-IQ2_M.gguf b/Falcon3-10B-Instruct-IQ2_M.gguf new file mode 100644 index 0000000..3ea8e75 --- /dev/null +++ b/Falcon3-10B-Instruct-IQ2_M.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:338604bfb7d62b2c9d6f5fe1eaca3644697fd779307864fdbda8427c3ff18c32 +size 3592343904 diff --git a/Falcon3-10B-Instruct-IQ3_M.gguf b/Falcon3-10B-Instruct-IQ3_M.gguf new file mode 100644 index 0000000..599d1b1 --- /dev/null +++ b/Falcon3-10B-Instruct-IQ3_M.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8cad36eb51e20406871cb0807c57f2223121ccf3a3c0273666845a1ec79a8eaf +size 4704985440 diff --git a/Falcon3-10B-Instruct-IQ3_XS.gguf b/Falcon3-10B-Instruct-IQ3_XS.gguf new file mode 100644 index 0000000..04c98d4 --- /dev/null +++ b/Falcon3-10B-Instruct-IQ3_XS.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:71644650f2cd847f74ec7289baf6e7b8ce1536dd179a13bf6bd8749ed5d5c8f8 +size 4368478560 diff --git a/Falcon3-10B-Instruct-IQ4_NL.gguf b/Falcon3-10B-Instruct-IQ4_NL.gguf new file mode 100644 index 0000000..9b519e0 --- /dev/null +++ b/Falcon3-10B-Instruct-IQ4_NL.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6178ad26bc9393a6309da530f849933c0fc989232da33db074ece60a2d9c42f9 +size 5906346336 diff --git a/Falcon3-10B-Instruct-IQ4_XS.gguf b/Falcon3-10B-Instruct-IQ4_XS.gguf new file mode 100644 index 0000000..dfda94f --- /dev/null +++ b/Falcon3-10B-Instruct-IQ4_XS.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:b80226e41f7c89da8cdf5f6d5cd72e1796ee707b6e986f15b00226d3b9b55266 +size 5596885344 diff --git a/Falcon3-10B-Instruct-Q2_K.gguf b/Falcon3-10B-Instruct-Q2_K.gguf new file mode 100644 index 0000000..bc6d62e --- /dev/null +++ b/Falcon3-10B-Instruct-Q2_K.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:70ffc9c01582c3d340bf26f9bece33022c0925418db63e63ba214049ee3377d0 +size 3924046176 diff --git a/Falcon3-10B-Instruct-Q2_K_L.gguf b/Falcon3-10B-Instruct-Q2_K_L.gguf new file mode 100644 index 0000000..67c6af1 --- /dev/null +++ b/Falcon3-10B-Instruct-Q2_K_L.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cf8351b8e66dde26af67a696eec53fa21d791991dee329c432cf8fcf9942e938 +size 4317262176 diff --git a/Falcon3-10B-Instruct-Q3_K_L.gguf b/Falcon3-10B-Instruct-Q3_K_L.gguf new file mode 100644 index 0000000..f42cabe --- /dev/null +++ b/Falcon3-10B-Instruct-Q3_K_L.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ed0020ab55f571aaed11f1cc78c404bec9ca41a815cc69eb6ce805763fd0e93c +size 5450805600 diff --git a/Falcon3-10B-Instruct-Q3_K_M.gguf b/Falcon3-10B-Instruct-Q3_K_M.gguf new file mode 100644 index 0000000..3135b7b --- /dev/null +++ b/Falcon3-10B-Instruct-Q3_K_M.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0d2a6502eb596e228722b7198674c23303dba1b75118ff22a385f23551a22284 +size 5052477792 diff --git a/Falcon3-10B-Instruct-Q3_K_S.gguf b/Falcon3-10B-Instruct-Q3_K_S.gguf new file mode 100644 index 0000000..cc21ad9 --- /dev/null +++ b/Falcon3-10B-Instruct-Q3_K_S.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7fb30ebfc8c7c089bed413e01a7eeb0dee10c8d313d580ee7e1affdf051cafcc +size 4591137120 diff --git a/Falcon3-10B-Instruct-Q3_K_XL.gguf b/Falcon3-10B-Instruct-Q3_K_XL.gguf new file mode 100644 index 0000000..acf6b46 --- /dev/null +++ b/Falcon3-10B-Instruct-Q3_K_XL.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:39d17d242f70dbc6391acf247f90036bee105798ec6d9ec2a8cb445c76e61a0d +size 5803127136 diff --git a/Falcon3-10B-Instruct-Q4_0.gguf b/Falcon3-10B-Instruct-Q4_0.gguf new file mode 100644 index 0000000..814b244 --- /dev/null +++ b/Falcon3-10B-Instruct-Q4_0.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:38e78c65a0b177930751b9fa952c07cb5ea9dc452115551b115ed73f8ab20773 +size 5928464736 diff --git a/Falcon3-10B-Instruct-Q4_K_L.gguf b/Falcon3-10B-Instruct-Q4_K_L.gguf new file mode 100644 index 0000000..4390218 --- /dev/null +++ b/Falcon3-10B-Instruct-Q4_K_L.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fe1121611ab643dadab522d5dded26da60f96b457e71f37e81b63741694f114d +size 6586364256 diff --git a/Falcon3-10B-Instruct-Q4_K_M.gguf b/Falcon3-10B-Instruct-Q4_K_M.gguf new file mode 100644 index 0000000..3db5d4f --- /dev/null +++ b/Falcon3-10B-Instruct-Q4_K_M.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6d54a35d740a616061d6c7d7740d64f4339410e58aaba985aa9e1ea79c7e882a +size 6287520096 diff --git a/Falcon3-10B-Instruct-Q4_K_S.gguf b/Falcon3-10B-Instruct-Q4_K_S.gguf new file mode 100644 index 0000000..e4ce72d --- /dev/null +++ b/Falcon3-10B-Instruct-Q4_K_S.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:703cfbde06e7b7916fc22eb6d3fe98a5834a9cacf04a88fbabbbe4a7e4821e1a +size 5952156000 diff --git a/Falcon3-10B-Instruct-Q5_K_L.gguf b/Falcon3-10B-Instruct-Q5_K_L.gguf new file mode 100644 index 0000000..f5b364c --- /dev/null +++ b/Falcon3-10B-Instruct-Q5_K_L.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e0574ced7e6705358de7b26de93faf02043af4c0b8d82a2d869e4fb0fbebc519 +size 7589065056 diff --git a/Falcon3-10B-Instruct-Q5_K_M.gguf b/Falcon3-10B-Instruct-Q5_K_M.gguf new file mode 100644 index 0000000..04a0f8b --- /dev/null +++ b/Falcon3-10B-Instruct-Q5_K_M.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:536489b6ec6159beebc74e245b4b2d9f84b80fed4162fea6f50d604cb728858f +size 7340552544 diff --git a/Falcon3-10B-Instruct-Q5_K_S.gguf b/Falcon3-10B-Instruct-Q5_K_S.gguf new file mode 100644 index 0000000..89ae420 --- /dev/null +++ b/Falcon3-10B-Instruct-Q5_K_S.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7f04a35603029bd1972fa8816405e95c3dc4e6fda9f25e8fc8bd2a5c8fc0d6d0 +size 7144190304 diff --git a/Falcon3-10B-Instruct-Q6_K.gguf b/Falcon3-10B-Instruct-Q6_K.gguf new file mode 100644 index 0000000..5fa54b0 --- /dev/null +++ b/Falcon3-10B-Instruct-Q6_K.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:44496b253716f229de0e9b33cffb748b2509afc53495b0c2e5eef3b32e8274c6 +size 8459399520 diff --git a/Falcon3-10B-Instruct-Q6_K_L.gguf b/Falcon3-10B-Instruct-Q6_K_L.gguf new file mode 100644 index 0000000..18b5952 --- /dev/null +++ b/Falcon3-10B-Instruct-Q6_K_L.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:88ff637ab1830664765d7da62f947c6fa4c24a3b7d38d4357cfb12dcc985af55 +size 8654434656 diff --git a/Falcon3-10B-Instruct-Q8_0.gguf b/Falcon3-10B-Instruct-Q8_0.gguf new file mode 100644 index 0000000..57e8fcf --- /dev/null +++ b/Falcon3-10B-Instruct-Q8_0.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:98892efdf8233741cbaaa5f14a11b441a20fe8bd5962762bb2f3c3fa657b22b0 +size 10955239776 diff --git a/Falcon3-10B-Instruct-f16.gguf b/Falcon3-10B-Instruct-f16.gguf new file mode 100644 index 0000000..c512028 --- /dev/null +++ b/Falcon3-10B-Instruct-f16.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0bd98a38cbb5319d42b9d0b1e0880972e95e54c03c251989096931af6c85266e +size 20616556896 diff --git a/Falcon3-10B-Instruct-f32.gguf b/Falcon3-10B-Instruct-f32.gguf new file mode 100644 index 0000000..85ee0f5 --- /dev/null +++ b/Falcon3-10B-Instruct-f32.gguf @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:efae50128b34ba1d339a8364401e9977df973bf878fab1afe6f26a14c2407cd7 +size 41227366464 diff --git a/Falcon3-10B-Instruct.imatrix b/Falcon3-10B-Instruct.imatrix new file mode 100644 index 0000000..24459e1 --- /dev/null +++ b/Falcon3-10B-Instruct.imatrix @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:aa5e0f3b3fe2c742dc66d0b1d6139e82dbda6d5ace7274af9979f7e071c984b7 +size 6644818 diff --git a/README.md b/README.md index c27f984..83faf52 100644 --- a/README.md +++ b/README.md @@ -1,47 +1,171 @@ --- -license: Apache License 2.0 - -#model-type: -##e.g. gpt, phi, llama, chatglm, baichuan -#- gpt - -#domain: -##e.g. nlp, cv, audio, multi-modal -#- nlp - -#language: -##list of language codes: https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa -#- cn - -#metrics: -##e.g. CIDEr, BLEU, ROUGE -#- CIDEr - -#tags: -##custom tags, including training methods such as pretrained, fine-tuned, instruction-tuned, RL-tuned, and others -#- pretrained - -#tools: -##e.g. vllm, fastchat, llamacpp, AdaSeq -#- vllm +quantized_by: bartowski +pipeline_tag: text-generation +tags: +- falcon3 +license: other +base_model: tiiuae/Falcon3-10B-Instruct +license_name: falcon-llm-license +license_link: https://falconllm.tii.ae/falcon-terms-and-conditions.html --- -### The contributor of this model has not provided a more detailed model description. Model files and weights are available on the "Model files" page. -#### You can download the model with the git clone command below, or via the ModelScope SDK -SDK download -```bash -#Install ModelScope -pip install modelscope +## Llamacpp imatrix Quantizations of Falcon3-10B-Instruct + +Using llama.cpp release b4341 for quantization. 
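For context, an imatrix quant like the ones in this repo is produced in two steps with the llama.cpp tools. This is only an illustrative sketch: the tool names `llama-imatrix` and `llama-quantize` ship with recent llama.cpp releases, but the calibration file name and paths below are placeholders, not the exact commands used here.

```shell
# 1. Collect an importance matrix from a calibration text file (illustrative path)
./llama-imatrix -m Falcon3-10B-Instruct-f16.gguf -f calibration_data.txt \
  -o Falcon3-10B-Instruct.imatrix

# 2. Quantize the f16 weights, weighting sensitive tensors by that matrix
./llama-quantize --imatrix Falcon3-10B-Instruct.imatrix \
  Falcon3-10B-Instruct-f16.gguf Falcon3-10B-Instruct-Q4_K_M.gguf Q4_K_M
```

Running these requires a llama.cpp build and the f16 GGUF locally; the imatrix mainly helps the smaller quants (Q3 and below) retain quality.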
+ +Original model: https://huggingface.co/tiiuae/Falcon3-10B-Instruct + +All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8) + +Run them in [LM Studio](https://lmstudio.ai/) + +## Prompt format + ``` -```python -#SDK模型下载 -from modelscope import snapshot_download -model_dir = snapshot_download('bartowski/Falcon3-10B-Instruct-GGUF') -``` -Git下载 -``` -#Git模型下载 -git clone https://www.modelscope.cn/bartowski/Falcon3-10B-Instruct-GGUF.git +<|system|> +{system_prompt} +<|user|> +{prompt} +<|assistant|> ``` -
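As a quick illustration (not part of the original card), the template above can be filled in by hand when you are sending raw prompts; the variable names here are made up for this sketch.

```shell
# Fill the Falcon3 chat template with a system message and a user message
SYSTEM_PROMPT="You are a helpful assistant."
USER_PROMPT="What is a GGUF file?"

printf '<|system|>\n%s\n<|user|>\n%s\n<|assistant|>\n' \
  "$SYSTEM_PROMPT" "$USER_PROMPT"
```

In practice llama.cpp and LM Studio apply this template automatically from the chat-template metadata stored in the GGUF, so building it manually is only needed for raw-prompt use.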
If you are a contributor to this model, we invite you to promptly complete the model card according to the model contribution documentation.
\ No newline at end of file +## Download a file (not the whole branch) from below: + +| Filename | Quant type | File Size | Split | Description | +| -------- | ---------- | --------- | ----- | ----------- | +| [Falcon3-10B-Instruct-f32.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-f32.gguf) | f32 | 41.23GB | false | Full F32 weights. | +| [Falcon3-10B-Instruct-f16.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-f16.gguf) | f16 | 20.62GB | false | Full F16 weights. | +| [Falcon3-10B-Instruct-Q8_0.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q8_0.gguf) | Q8_0 | 10.96GB | false | Extremely high quality, generally unneeded but max available quant. | +| [Falcon3-10B-Instruct-Q6_K_L.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q6_K_L.gguf) | Q6_K_L | 8.65GB | false | Uses Q8_0 for embed and output weights. Very high quality, near perfect, *recommended*. | +| [Falcon3-10B-Instruct-Q6_K.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q6_K.gguf) | Q6_K | 8.46GB | false | Very high quality, near perfect, *recommended*. | +| [Falcon3-10B-Instruct-Q5_K_L.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q5_K_L.gguf) | Q5_K_L | 7.59GB | false | Uses Q8_0 for embed and output weights. High quality, *recommended*. | +| [Falcon3-10B-Instruct-Q5_K_M.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q5_K_M.gguf) | Q5_K_M | 7.34GB | false | High quality, *recommended*. | +| [Falcon3-10B-Instruct-Q5_K_S.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q5_K_S.gguf) | Q5_K_S | 7.14GB | false | High quality, *recommended*. 
| +| [Falcon3-10B-Instruct-Q4_K_L.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q4_K_L.gguf) | Q4_K_L | 6.59GB | false | Uses Q8_0 for embed and output weights. Good quality, *recommended*. | +| [Falcon3-10B-Instruct-Q4_K_M.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q4_K_M.gguf) | Q4_K_M | 6.29GB | false | Good quality, default size for most use cases, *recommended*. | +| [Falcon3-10B-Instruct-Q4_K_S.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q4_K_S.gguf) | Q4_K_S | 5.95GB | false | Slightly lower quality with more space savings, *recommended*. | +| [Falcon3-10B-Instruct-Q4_0.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q4_0.gguf) | Q4_0 | 5.93GB | false | Legacy format, offers online repacking for ARM and AVX CPU inference. | +| [Falcon3-10B-Instruct-IQ4_NL.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-IQ4_NL.gguf) | IQ4_NL | 5.91GB | false | Similar to IQ4_XS, but slightly larger. Offers online repacking for ARM CPU inference. | +| [Falcon3-10B-Instruct-Q3_K_XL.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q3_K_XL.gguf) | Q3_K_XL | 5.80GB | false | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. | +| [Falcon3-10B-Instruct-IQ4_XS.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-IQ4_XS.gguf) | IQ4_XS | 5.60GB | false | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. | +| [Falcon3-10B-Instruct-Q3_K_L.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q3_K_L.gguf) | Q3_K_L | 5.45GB | false | Lower quality but usable, good for low RAM availability. 
| +| [Falcon3-10B-Instruct-Q3_K_M.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q3_K_M.gguf) | Q3_K_M | 5.05GB | false | Low quality. | +| [Falcon3-10B-Instruct-IQ3_M.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-IQ3_M.gguf) | IQ3_M | 4.70GB | false | Medium-low quality, new method with decent performance comparable to Q3_K_M. | +| [Falcon3-10B-Instruct-Q3_K_S.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q3_K_S.gguf) | Q3_K_S | 4.59GB | false | Low quality, not recommended. | +| [Falcon3-10B-Instruct-IQ3_XS.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-IQ3_XS.gguf) | IQ3_XS | 4.37GB | false | Lower quality, new method with decent performance, slightly better than Q3_K_S. | +| [Falcon3-10B-Instruct-Q2_K_L.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q2_K_L.gguf) | Q2_K_L | 4.32GB | false | Uses Q8_0 for embed and output weights. Very low quality but surprisingly usable. | +| [Falcon3-10B-Instruct-Q2_K.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-Q2_K.gguf) | Q2_K | 3.92GB | false | Very low quality but surprisingly usable. | +| [Falcon3-10B-Instruct-IQ2_M.gguf](https://huggingface.co/bartowski/Falcon3-10B-Instruct-GGUF/blob/main/Falcon3-10B-Instruct-IQ2_M.gguf) | IQ2_M | 3.59GB | false | Relatively low quality, uses SOTA techniques to be surprisingly usable. | + +## Embed/output weights + +Some of these quants (Q3_K_XL, Q4_K_L etc) are the standard quantization method with the embeddings and output weights quantized to Q8_0 instead of what they would normally default to. + +## Downloading using huggingface-cli + +