--- base_model: haykgrigorian/TimeCapsuleLLM-v2-llama-1.2B language: - en library_name: transformers license: mit datasets: - postgrammar/london-llm-1800 quantized_by: ncky tags: - text-generation-inference - transformers - llama - gguf - historical --- ## About static and imatrix-assisted GGUF quants of https://huggingface.co/haykgrigorian/TimeCapsuleLLM-v2-llama-1.2B. Generated with `llama.cpp` build `8044` (`91ea5d67f`). `IQ4_XS` was quantized with an imatrix generated on 19th-century public-domain English text. Note: this model has FFN dimensions (`5504`) not divisible by `256`, so `llama.cpp` applied fallback quantization to 22 tensors for K/IQ quant types. ## Base Model Info (from original model card) Source: https://huggingface.co/haykgrigorian/TimeCapsuleLLM-v2-llama-1.2B | Detail | Value | | :--- | :--- | | Model Architecture | LlamaForCausalLM (decoder-only transformer) | | Parameter Count | ~1.22B | | Training Type | Trained from scratch (random initialization) | | Tokenizer | Custom BPE, vocab size 32,000 | | Sequence Length | 2048 | | Attention Type | Grouped Query Attention (16 Q heads / 8 KV heads) | | Hidden Size | 2048 | | Intermediate Size | 5504 | | Layers | 22 | Training details reported by the source model card: - Final training loss: 3.3951 - Start training loss: 10.7932 - Training steps: 182,000 - Epochs: 0.4997 - Training time: 117h 51m - Reported training cost: $340.97 on an H100 SXM (RunPod) ## Usage If you are unsure how to use GGUF files, refer to one of [TheBloke's READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for more details. ## Provided Quants (sorted by size, not necessarily quality) | Link | Type | Size/GB | Notes | |:-----|:-----|--------:|:------| | [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q2_K.gguf) | Q2_K | 0.5 | smallest | | [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q3_K_S.gguf) | Q3_K_S | 0.6 | low VRAM | | [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q3_K_M.gguf) | Q3_K_M | 0.6 | balanced low size | | [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q3_K_L.gguf) | Q3_K_L | 0.6 | better than Q3_K_M | | [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.IQ4_XS.gguf) | IQ4_XS | 0.6 | imatrix, recommended at this size | | [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q4_K_S.gguf) | Q4_K_S | 0.7 | fast, recommended | | [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q4_K_M.gguf) | Q4_K_M | 0.7 | fast, recommended | | [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q5_K_S.gguf) | Q5_K_S | 0.8 | higher quality | | [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q5_K_M.gguf) | Q5_K_M | 0.9 | higher quality | | [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q6_K.gguf) | Q6_K | 1.0 | very good quality | | [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.Q8_0.gguf) | Q8_0 | 1.2 | fast, best quality | | [GGUF](https://huggingface.co/ncky/TimeCapsuleLLM-v2-llama-1.2B-GGUF/resolve/main/TimeCapsuleLLM-v2-llama-1.2B.f16.gguf) | f16 | 2.3 | 16 bpw, overkill |