---
quantized_by: Pomni
language:
- en
base_model:
- Pomni/OWoTGPT-1.3
pipeline_tag: text-generation
tags:
- gpt2
- slm
- owot
- gpt
- gguf
---

# OWoTGPT-1.3 quants

This is a repository of **GGUF quants for [OWoTGPT-1.3](https://huggingface.co/Pomni/OWoTGPT-1.3).**

If you are looking for a program to run this model with, I would recommend [LM Studio](https://lmstudio.ai/), as it is user-friendly, has a GUI, and is very powerful.
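
If you prefer the command line instead, a downloaded quant can also be run directly with llama.cpp's `llama-cli`. This is a minimal sketch, and the filename below is a placeholder; substitute whichever quant file you actually downloaded:

```shell
# Generate text from a local GGUF quant with llama.cpp.
# "OWoTGPT-1.3-F32.gguf" is a placeholder; use the quant file you downloaded.
llama-cli -m OWoTGPT-1.3-F32.gguf -p "Hello" -n 64
```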

## List of Quants

Sorry, there are too many quants for me to list here. Go to the [files page](https://huggingface.co/Pomni/owotgpt1.3-gguf/tree/main) to download them.

The MXFP4_MOE and TQx_0 quants are experimental. Additionally, I would not go below F16 for a model this small. F32 is the way to go here.

## Questions you may have

### What program did you use to make these quants?

I used [llama.cpp b8352](https://github.com/ggml-org/llama.cpp/releases/tag/b8352) on Windows x64, leveraging CUDA 12.4.
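
For reference, GGUF quants like these are generally produced with the `llama-quantize` tool that ships with llama.cpp. A rough sketch of the process, with placeholder filenames:

```shell
# Convert a full-precision GGUF into a smaller quant type
# (filenames are placeholders; Q8_0 is just an example target type).
llama-quantize OWoTGPT-1.3-F32.gguf OWoTGPT-1.3-Q8_0.gguf Q8_0
```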

### One or more of the quants are not working for me.

[Open a new discussion](https://huggingface.co/Pomni/owotgpt1.3-gguf/discussions) in the community tab about this, and I will look into the issue.