Initialize project; model provided by the ModelHub XC community
Model: Avtrkrb/granite-claude-h-350m-GGUF Source: Original Platform
88
README.md
Normal file
@@ -0,0 +1,88 @@
---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- granite
- gguf
- llama-cpp
- reasoning
- quantized
- local-llm

base_model: Avtrkrb/granite-claude-h-350m

library_name: gguf
---

# granite-claude-h-350m-GGUF

GGUF quantizations of:

`Avtrkrb/granite-claude-h-350m`

These files are intended for inference using:

- llama.cpp
- LM Studio
- Open WebUI
- Jan
- KoboldCpp
- GPT4All
- Ollama (after conversion/import)
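
For the Ollama entry, one common way to import a local GGUF file is through a Modelfile. The sketch below assumes the Q4_K_M file is already in the current directory and that your Ollama build supports this model's architecture. The local model name `granite-claude-h-350m` is an illustrative choice, not something this repository defines.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Assumption: the GGUF file has already been downloaded to the current directory.
GGUF_FILE="granite-claude-h-350m-Q4_K_M.gguf"
# Hypothetical local name for the imported model; any name works.
OLLAMA_NAME="granite-claude-h-350m"

# Write a Modelfile with a single FROM line pointing at the local GGUF file.
printf 'FROM ./%s\n' "$GGUF_FILE" > Modelfile

# Only call Ollama when it is installed and the model file is present.
if command -v ollama >/dev/null 2>&1 && [ -f "$GGUF_FILE" ]; then
  ollama create "$OLLAMA_NAME" -f Modelfile
  ollama run "$OLLAMA_NAME" "Explain quantum tunneling."
else
  echo "Skipping import: ollama or $GGUF_FILE not found."
fi
```

If the import succeeds, the model should then appear in `ollama list` under the chosen name.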

---

## Available Quantizations

Typical variants included:

| Quant  | Use Case                          |
|--------|-----------------------------------|
| Q4_K_M | Best size / quality balance       |
| Q5_K_M | Higher quality                    |
| Q6_K   | Near-lossless for most use cases  |
| Q8_0   | Highest quality quantized version |
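
As a small convenience, a shell helper can map a quant label from the table to a file name. This is only a sketch: it assumes every file follows the naming pattern of the Q4_K_M file used in this README's llama.cpp example, which may not hold for all variants. `quant_file` and `MODEL_PREFIX` are hypothetical names introduced here.

```bash
#!/usr/bin/env bash

# Assumption: all quant files share the naming pattern of the Q4_K_M file
# shown in this README's llama.cpp example.
MODEL_PREFIX="granite-claude-h-350m"

# Print the assumed GGUF file name for a quant label.
# Defaults to Q4_K_M, the variant this README recommends.
quant_file() {
  local quant="${1:-Q4_K_M}"
  case "$quant" in
    Q4_K_M|Q5_K_M|Q6_K|Q8_0)
      printf '%s-%s.gguf\n' "$MODEL_PREFIX" "$quant"
      ;;
    *)
      echo "Unknown quant: $quant" >&2
      return 1
      ;;
  esac
}

quant_file        # default variant
quant_file Q8_0   # highest quality variant in the table
```

Labels outside the table are rejected with a non-zero exit status, so a typo fails early instead of producing a path to a file that does not exist.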

---

## Source Model

Merged model:

https://huggingface.co/Avtrkrb/granite-claude-h-350m

Dataset:

https://huggingface.co/datasets/Avtrkrb/combined-reasoning-claude

---

## Example llama.cpp Usage

```bash
./llama-cli \
  -m granite-claude-h-350m-Q4_K_M.gguf \
  -p "Explain quantum tunneling."
```
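
The same call can be wrapped in a short script that limits output length and sets a sampling temperature. This is a sketch rather than a recommended configuration: the values `256` and `0.7` are illustrative assumptions, and `LLAMA_CLI` is a hypothetical variable for pointing at your own build.

```bash
#!/usr/bin/env bash
set -euo pipefail

MODEL_FILE="granite-claude-h-350m-Q4_K_M.gguf"
# Assumption: a local build in the current directory; override via LLAMA_CLI.
LLAMA_CLI="${LLAMA_CLI:-./llama-cli}"

# Keep arguments in an array so the prompt stays a single argument.
# -n caps the number of generated tokens; --temp sets the sampling temperature.
ARGS=(
  -m "$MODEL_FILE"
  -p "Explain quantum tunneling."
  -n 256
  --temp 0.7
)

# Run only when both the binary and the model file are available.
if command -v "$LLAMA_CLI" >/dev/null 2>&1 && [ -f "$MODEL_FILE" ]; then
  "$LLAMA_CLI" "${ARGS[@]}"
else
  echo "Would run: $LLAMA_CLI ${ARGS[*]}"
fi
```

When the binary or the model file is missing, the script only prints the command it would have run, which makes it safe to try before anything is downloaded.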

---

## Recommended Quant

For most users, **Q4_K_M** offers the best balance between:

- quality
- speed
- memory usage

---

## License

This repository follows the licensing terms of the original Granite model.