ModelHub XC e88d430147 Initialize project; model provided by the ModelHub XC community
Model: fdtn-ai/Foundation-Sec-8B-Reasoning-Q8_0-GGUF
Source: Original Platform
2026-04-22 14:13:10 +08:00


---
base_model: fdtn-ai/Foundation-Sec-8B-Reasoning
language:
- en
library_name: transformers
license: other
pipeline_tag: text-generation
tags:
- security
- llama
- llama-cpp
- gguf-my-repo
---
# Foundation-Sec-8B-Reasoning-Q8_0-GGUF
**This model was quantized from fdtn-ai/Foundation-Sec-8B-Reasoning to an 8-bit (Q8_0) GGUF checkpoint using llama.cpp. It retains the cybersecurity specialization of the original 8-billion-parameter model while reducing the inference memory footprint from approximately 16 GB (BF16) to around 8.54 GB (Q8_0).**
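
The footprint figures above can be roughly reproduced from the Q8_0 block layout used by llama.cpp, in which every 32 weights share one 16-bit scale (8.5 bits per weight on average). The sketch below is a back-of-the-envelope estimate only; the parameter count of about 8.03 billion is an assumption taken from the published Llama 3.1 8B size, not a figure stated in this card.

```python
# Rough size estimate for BF16 versus Q8_0 weights.
# Assumption: ~8.03e9 parameters (published Llama 3.1 8B count). The real GGUF
# also stores metadata, and some small tensors may be kept at higher precision,
# so the result is approximate.
N_PARAMS = 8.03e9

BLOCK_SIZE = 32        # weights per Q8_0 block
BLOCK_BYTES = 2 + 32   # one fp16 scale plus 32 int8 values


def q8_0_bits_per_weight() -> float:
    """Average storage cost of one weight in Q8_0, in bits."""
    return BLOCK_BYTES * 8 / BLOCK_SIZE


def size_gb(n_params: float, bits_per_weight: float) -> float:
    """Storage size in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    print(f"Q8_0 bits per weight: {q8_0_bits_per_weight()}")
    print(f"BF16 estimate: {size_gb(N_PARAMS, 16):.2f} GB")
    print(f"Q8_0 estimate: {size_gb(N_PARAMS, q8_0_bits_per_weight()):.2f} GB")
```

Under these assumptions the estimate lands at roughly 16 GB for BF16 and about 8.5 GB for Q8_0, in line with the sizes quoted above.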
## Model Description
`fdtn-ai/Foundation-Sec-8B-Reasoning-Q8_0-GGUF` is an 8-bit quantized variant of **Foundation-Sec-8B-Reasoning**, an 8B-parameter Llama 3.1-based model that extends the **Foundation-Sec-8B** base model with instruction-following and reasoning capabilities. The base model was produced by continued pretraining on a curated corpus of cybersecurity-specific text (e.g., CVEs, threat intel reports, exploit write-ups, compliance guides). Foundation-Sec-8B-Reasoning is optimized for three core categories of use cases:
- **SOC Acceleration**: Automating triage, summarization, case note generation, and evidence collection.
- **Proactive Threat Defense**: Simulating attacks, prioritizing vulnerabilities, mapping TTPs, and modeling attacker behavior.
- **Engineering Enablement**: Providing security assistance, validating configurations, assessing compliance evidence, and improving security posture.
This card does not repeat the full training details. Please refer to the original model card for the architecture, training data, evaluation results, and known limitations.
## Quantization Details
- **Quantization Scheme:** 8-bit, "Q8_0" (8-bit quantization with minimal precision loss)
- **Toolchain:** Converted via [llama.cpp's export utilities](https://github.com/ggml-org/llama.cpp) (commit `v0.1.81` or newer) to GGUF format.
- **Resulting File Size:** ~ 8.54 GB on disk (raw GGUF blob)
- **Runtime Footprint:**
  - Memory: ≈ 8.54 GB of RAM when loaded on CPU with llama.cpp
- **Format:**
  - File extension: `.gguf`
  - Internally contains:
    1. Metadata (architecture, tokenizer vocab, hyperparameters)
    2. Vocabulary list (BPE tokens)
    3. Weight tensors (for each layer and head) stored in 8-bit quantized form
  - Compatible with the llama-cpp-python wrapper (`llama_cpp`) and the llama.cpp C++ CLI (`llama-cli`) inference engines
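
As a quick sanity check on a downloaded file, the fixed-size GGUF header can be read directly. The sketch below assumes the little-endian GGUF v2/v3 layout (4-byte `GGUF` magic, `uint32` version, `uint64` tensor count, `uint64` metadata key/value count); the helper name `read_gguf_header` is hypothetical, and the file name is the one used in the run command in this card.

```python
import os
import struct


def read_gguf_header(path: str) -> dict:
    """Read the fixed-size GGUF header: magic, version, tensor and metadata counts."""
    with open(path, "rb") as f:
        raw = f.read(24)
    if len(raw) < 24 or raw[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    # Little-endian: uint32 version, uint64 tensor count, uint64 metadata KV count
    version, n_tensors, n_kv = struct.unpack("<IQQ", raw[4:24])
    return {"version": version, "tensor_count": n_tensors, "metadata_kv_count": n_kv}


if __name__ == "__main__":
    model_path = "foundation-sec-8b-reasoning-q8_0.gguf"
    if os.path.exists(model_path):
        print(read_gguf_header(model_path))
```

This only inspects the header; it does not validate the metadata entries or the tensor data that follow it.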
## How to Use
[The cookbook](https://github.com/cisco-foundation-ai/cookbook) provides example use cases, code samples for adoption, and references.
### Install llama.cpp on Mac
Use Homebrew:
```bash
brew install llama-cpp
```
Or build from source:
```bash
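# Install build dependencies
brew install cmake
# Clone llama.cpp
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
# Build with CMake (recent llama.cpp versions no longer support a plain `make` build)
cmake -B build
cmake --build build --config Release
# Add to PATH (optional); CMake places the binaries in build/bin
sudo cp build/bin/llama-cli /usr/local/bin/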
```
### Run the Model
```bash
llama-cli -m foundation-sec-8b-reasoning-q8_0.gguf -p "CVE-2021-44228 is a remote code execution flaw in Apache Log4j2 via unsafe JNDI lookups (\"Log4Shell\"). The CWE is CWE-502.\n\nCVE-2017-0144 is a remote code execution vulnerability in Microsoft's SMBv1 server (\"EternalBlue\") due to a buffer overflow. The CWE is CWE-119.\n\nCVE-2014-0160 is an information-disclosure bug in OpenSSL's heartbeat extension (\"Heartbleed\") due to out-of-bounds reads. The CWE is CWE-125.\n\nCVE-2017-5638 is a remote code execution issue in Apache Struts 2's Jakarta Multipart parser stemming from improper input validation of the Content-Type header. The CWE is CWE-20.\n\nCVE-2019-0708 is a remote code execution vulnerability in Microsoft's Remote Desktop Services (\"BlueKeep\") triggered by a use-after-free. The CWE is CWE-416.\n\nCVE-2015-10011 is a vulnerability about OpenDNS OpenResolve improper log output neutralization. The CWE is" -n 128
```
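
The same few-shot completion can also be run from Python through the `llama_cpp` wrapper (`pip install llama-cpp-python`). This is a minimal sketch, not an official recipe: the helper names `build_cwe_prompt` and `complete` are hypothetical, the context size of 2048 is an arbitrary choice, and only two of the examples from the command above are reproduced to keep it short.

```python
import os

# Few-shot examples copied from the llama-cli prompt: (description, CWE label)
EXAMPLES = [
    ('CVE-2014-0160 is an information-disclosure bug in OpenSSL\'s heartbeat '
     'extension ("Heartbleed") due to out-of-bounds reads.', "CWE-125"),
    ('CVE-2019-0708 is a remote code execution vulnerability in Microsoft\'s Remote '
     'Desktop Services ("BlueKeep") triggered by a use-after-free.', "CWE-416"),
]
QUERY = ("CVE-2015-10011 is a vulnerability about OpenDNS OpenResolve "
         "improper log output neutralization.")


def build_cwe_prompt(examples, query):
    """Join labeled examples and an unlabeled query into one completion prompt."""
    shots = [f"{desc} The CWE is {cwe}." for desc, cwe in examples]
    return "\n\n".join(shots + [f"{query} The CWE is"])


def complete(model_path, prompt, max_tokens=128):
    """Run a plain text completion with llama-cpp-python."""
    from llama_cpp import Llama  # imported lazily so the prompt helper works without it
    llm = Llama(model_path=model_path, n_ctx=2048, verbose=False)
    out = llm(prompt, max_tokens=max_tokens)
    return out["choices"][0]["text"]


if __name__ == "__main__":
    model_path = "foundation-sec-8b-reasoning-q8_0.gguf"
    if os.path.exists(model_path):
        print(complete(model_path, build_cwe_prompt(EXAMPLES, QUERY)))
```

Keeping prompt construction separate from the model call makes it easy to swap in other examples or queries without reloading the model.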
## References
1. **Original Model Card:**
   [fdtn-ai/Foundation-Sec-8B-Reasoning](https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Reasoning) (January 28, 2026)
2. **llama.cpp GGUF Quantization:**
   Gerganov, G. (2023). _llama.cpp: LLM inference in C/C++_. GitHub repository.
3. **ZeroQuant:**
   Yao, Z. et al. (2022). "ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers." arXiv:2206.01861.
4. **SmoothQuant:**
   Xiao, G. et al. (2022). "SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models." arXiv:2211.10438.

**License:** Apache 2.0 (same as base)

**Contact:** For questions about usage, quantization details, or license terms, please open an issue on the Hugging Face repo or contact `blainen@cisco.com`.