---
base_model: haykgrigorian/TimeCapsuleLLM-v2-llama-1.2B
language:
- en
library_name: transformers
license: mit
datasets:
- postgrammar/london-llm-1800
quantized_by: ncky
tags:
- text-generation-inference
- transformers
- llama
- gguf
- historical
---

## About

Static and imatrix-assisted GGUF quants of https://huggingface.co/haykgrigorian/TimeCapsuleLLM-v2-llama-1.2B

Generated with `llama.cpp` build `8044` (`91ea5d67f`).

`IQ4_XS` was quantized with an imatrix generated on 19th-century public-domain English text.

Note: the FFN dimension of this model (`5504`) is not divisible by `256`, the super-block size used by K-quants and `IQ4_XS`, so `llama.cpp` applied fallback quantization to 22 tensors for the K/IQ quant types.

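The divisibility note can be checked with a few lines of arithmetic. This is a minimal sketch, not part of the quantization pipeline. The 32-element fallback block size and the idea that the 22 affected tensors are the per-layer tensors whose row length equals the FFN dimension are assumptions made for illustration, not statements from the quantization log.

```python
# Dimensions taken from the base model card.
HIDDEN_SIZE = 2048
INTERMEDIATE_SIZE = 5504
N_LAYERS = 22

SUPER_BLOCK = 256      # super-block size used by K-quants and IQ4_XS
FALLBACK_BLOCK = 32    # assumption: fallback formats use 32-element blocks


def fits(row_len: int, block: int) -> bool:
    """Return True if a tensor row of row_len splits evenly into blocks."""
    return row_len % block == 0


for name, dim in [("hidden", HIDDEN_SIZE), ("ffn", INTERMEDIATE_SIZE)]:
    whole, rest = divmod(dim, SUPER_BLOCK)
    print(f"{name}: {dim} = {whole} * {SUPER_BLOCK} + {rest}, "
          f"fits 256: {fits(dim, SUPER_BLOCK)}, "
          f"fits 32: {fits(dim, FALLBACK_BLOCK)}")
```

The hidden size divides evenly (2048 = 8 * 256), while 5504 leaves a remainder of 128, so any tensor whose rows are 5504 elements long cannot use a 256-element super-block. One such tensor per layer across 22 layers would be consistent with the 22 fallbacks mentioned above, though that mapping is an inference.
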
## Base Model Info (from original model card)

Source: https://huggingface.co/haykgrigorian/TimeCapsuleLLM-v2-llama-1.2B

| Detail | Value |
| :--- | :--- |
| Model Architecture | LlamaForCausalLM (decoder-only transformer) |
| Parameter Count | ~1.22B |
| Training Type | Trained from scratch (random initialization) |
| Tokenizer | Custom BPE, vocab size 32,000 |
| Sequence Length | 2048 |
| Attention Type | Grouped Query Attention (16 Q heads / 8 KV heads) |
| Hidden Size | 2048 |
| Intermediate Size | 5504 |
| Layers | 22 |

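To make the attention row of the table concrete, the sketch below estimates the KV cache size at the full 2048-token context. It assumes a head dimension of hidden size divided by Q heads (128) and a 16-bit cache; neither is stated in the source card, and runtimes may use a different cache type.

```python
# Values from the table above.
HIDDEN_SIZE = 2048
N_Q_HEADS = 16
N_KV_HEADS = 8
N_LAYERS = 22
CONTEXT = 2048


def kv_cache_bytes(n_kv_heads: int, context: int = CONTEXT,
                   bytes_per_value: int = 2) -> int:
    """Estimate KV cache size: keys and values for every layer and token."""
    head_dim = HIDDEN_SIZE // N_Q_HEADS  # assumption: 128 per head
    per_token = 2 * N_LAYERS * n_kv_heads * head_dim * bytes_per_value
    return per_token * context


gqa = kv_cache_bytes(N_KV_HEADS)   # grouped query attention, 8 KV heads
mha = kv_cache_bytes(N_Q_HEADS)    # hypothetical: one KV head per Q head

print(f"GQA cache: {gqa / 2**20:.0f} MiB")  # 176 MiB
print(f"MHA cache: {mha / 2**20:.0f} MiB")  # 352 MiB
```

Under these assumptions, sharing each KV head between two query heads halves the cache compared with a hypothetical full multi-head layout.
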
Training details reported by the source model card:
- Final training loss: 3.3951
- Start training loss: 10.7932
- Training steps: 182,000
- Epochs: 0.4997
- Training time: 117h 51m
- Reported training cost: $340.97 on an H100 SXM (RunPod)

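A few derived figures follow directly from the reported numbers. This is a rough sketch: the perplexity line assumes the loss is mean cross-entropy per token in nats, which the source card does not state, and the hourly rate is simply cost divided by wall-clock time.

```python
import math

# Figures reported by the source model card.
FINAL_LOSS = 3.3951
STEPS = 182_000
EPOCHS = 0.4997
HOURS = 117 + 51 / 60
COST_USD = 340.97


def derived_stats() -> dict:
    """Compute back-of-the-envelope figures from the reported training run."""
    return {
        "usd_per_hour": COST_USD / HOURS,
        "seconds_per_step": HOURS * 3600 / STEPS,
        "steps_per_epoch": STEPS / EPOCHS,
        # assumption: loss is natural-log cross-entropy per token
        "perplexity": math.exp(FINAL_LOSS),
    }


for key, value in derived_stats().items():
    print(f"{key}: {value:,.2f}")
```

This works out to roughly $2.89 per GPU-hour, about 2.3 seconds per step, around 364,000 steps for a full epoch, and a final training perplexity near 30 under the stated assumption.
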
## Usage

If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details.

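As one possible starting point, the sketch below downloads a quant and runs a plain text completion. It assumes the `huggingface_hub` and `llama-cpp-python` packages are installed; the helper names (`quant_filename`, `download_quant`, `complete`) are hypothetical, and the file naming pattern is taken from the links in the Provided Quants table. Because the base model was trained from scratch on historical text rather than instruction-tuned, the example uses raw completion instead of a chat template.

```python
REPO_ID = "ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF"
QUANTS = {"Q2_K", "Q3_K_S", "Q3_K_M", "Q3_K_L", "IQ4_XS", "Q4_K_S",
          "Q4_K_M", "Q5_K_S", "Q5_K_M", "Q6_K", "Q8_0", "f16"}


def quant_filename(quant: str) -> str:
    """Build the GGUF file name for a quant type listed in this repo."""
    if quant not in QUANTS:
        raise ValueError(f"unknown quant type: {quant}")
    return f"TimeCapsuleLLM-v2-llama-1.2B.{quant}.gguf"


def download_quant(quant: str = "Q4_K_M") -> str:
    """Download one quant from the Hub and return its local path."""
    from huggingface_hub import hf_hub_download  # imported lazily
    return hf_hub_download(repo_id=REPO_ID, filename=quant_filename(quant))


def complete(model_path: str, prompt: str, max_tokens: int = 64) -> str:
    """Run a plain completion with llama-cpp-python."""
    from llama_cpp import Llama  # imported lazily
    llm = Llama(model_path=model_path, n_ctx=2048)  # 2048 = trained length
    result = llm(prompt, max_tokens=max_tokens)
    return result["choices"][0]["text"]
```

A typical call would be `complete(download_quant("Q4_K_M"), "It was a cold morning in London, and")`. The prompt here is only an illustration.
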
## Provided Quants

(sorted by size, not necessarily quality)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q2_K.gguf) | Q2_K | 0.5 | smallest |
| [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q3_K_S.gguf) | Q3_K_S | 0.6 | low VRAM |
| [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q3_K_M.gguf) | Q3_K_M | 0.6 | balanced low size |
| [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q3_K_L.gguf) | Q3_K_L | 0.6 | better than Q3_K_M |
| [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.IQ4_XS.gguf) | IQ4_XS | 0.6 | imatrix, recommended at this size |
| [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q4_K_S.gguf) | Q4_K_S | 0.7 | fast, recommended |
| [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q4_K_M.gguf) | Q4_K_M | 0.7 | fast, recommended |
| [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q5_K_S.gguf) | Q5_K_S | 0.8 | higher quality |
| [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q5_K_M.gguf) | Q5_K_M | 0.9 | higher quality |
| [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q6_K.gguf) | Q6_K | 1.0 | very good quality |
| [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q8_0.gguf) | Q8_0 | 1.2 | fast, best quality |
| [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.f16.gguf) | f16 | 2.3 | 16 bpw, overkill |