# N-Bot-Int/OpenElla3-Llama3.2B-V2-GGUF

---
license: apache-2.0
datasets:
- openerotica/mixed-rp
- kingbri/PIPPA-shareGPT
- flammenai/character-roleplay-DPO
language:
- en
base_model:
- N-Bot-Int/OpenElla3-Llama3.2B
pipeline_tag: text-generation
tags:
- unsloth
- Uncensored
- text-generation-inference
- transformers
- llama
- trl
- roleplay
- conversational
---

## Support Us Through


## GGUF Version

GGUF with quants, letting you run the model in KoboldCPP and other AI environments!

### Quantizations

| Quant Type | Benefits | Cons |
|------------|----------|------|
| Q4_K_M | Smallest size (fastest inference); requires the least VRAM/RAM; ideal for edge devices & low-resource setups | Lowest accuracy compared to other quants; may struggle with complex reasoning; can produce slightly degraded text quality |
| Q5_K_M | Better accuracy than Q4 while still compact; good balance between speed and precision; works well on mid-range GPUs | Slightly larger model size than Q4; needs a bit more VRAM than Q4; still not as accurate as higher-bit models |
| Q8_0 | Highest accuracy (closest to the full model); best for complex reasoning & detailed outputs; suitable for high-end GPUs & serious workloads | Requires significantly more VRAM/RAM; slower inference compared to Q4 & Q5; larger file size (takes more storage) |
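If you're unsure which quant fits your hardware, a back-of-the-envelope size estimate helps: on-disk size is roughly parameter count times bits-per-weight divided by 8. The sketch below assumes a ~3.2B-parameter model (per the model name) and uses approximate average bits-per-weight figures for llama.cpp k-quants; actual GGUF files will differ somewhat because of embeddings and metadata.

```python
# Rough GGUF size estimate: params * bits-per-weight / 8 bytes.
# Bits-per-weight values are approximate averages for llama.cpp quant
# types (assumption); real file sizes vary with architecture details.
PARAMS = 3.2e9  # ~3.2B parameters, inferred from the model name

BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q5_K_M": 5.69, "Q8_0": 8.5}

def approx_size_gib(quant: str, params: float = PARAMS) -> float:
    """Approximate on-disk size in GiB for a given quant type."""
    return params * BITS_PER_WEIGHT[quant] / 8 / 2**30

for quant in BITS_PER_WEIGHT:
    print(f"{quant}: ~{approx_size_gib(quant):.1f} GiB")
```

As a rule of thumb, budget a bit more RAM/VRAM than the file size to leave room for the KV cache and runtime overhead.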

## Model Details

Read the full model details on the Hugging Face model page.