Initialize project; model provided by the ModelHub XC community

Model: fdtn-ai/Foundation-Sec-8B-Reasoning-Q8_0-GGUF Source: Original Platform
2026-04-22 14:13:10 +08:00
commit e88d430147
8 changed files with 2455 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,37 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+foundation-sec-8b-reasoning-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,87 @@
+---
+base_model: fdtn-ai/Foundation-Sec-8B-Reasoning
+language:
+- en
+library_name: transformers
+license: other
+pipeline_tag: text-generation
+tags:
+- security
+- llama
+- llama-cpp
+- gguf-my-repo
+---
+# Foundation-Sec-8B-Reasoning-Q8_0-GGUF
+
+**This model was quantized from fdtn-ai/Foundation-Sec-8B-Reasoning to an 8-bit (Q8_0) GGUF checkpoint using llama.cpp. It retains the cybersecurity specialization of the original 8-billion-parameter model while reducing the memory footprint from approximately 16 GB (BF16) to about 8.54 GB (Q8_0) for inference.**
+
+## Model Description
+
+`fdtn-ai/Foundation-Sec-8B-Reasoning-Q8_0-GGUF` is an 8-bit quantized variant of **Foundation-Sec-8B-Reasoning**, an 8B-parameter Llama 3.1-based model that extends the **Foundation-Sec-8B** base model with instruction-following and reasoning capabilities. The base model received continued pretraining on a curated corpus of cybersecurity-specific text (e.g., CVEs, threat intel reports, exploit write-ups, compliance guides). Foundation-Sec-8B-Reasoning is optimized for three core categories of use cases:
+
+- **SOC Acceleration**: Automating triage, summarization, case note generation, and evidence collection.
+- **Proactive Threat Defense**: Simulating attacks, prioritizing vulnerabilities, mapping TTPs, and modeling attacker behavior.
+- **Engineering Enablement**: Providing security assistance, validating configurations, assessing compliance evidence, and improving security posture.
+
+This card does not repeat the training details. Refer to the original model card for the architecture, training data, evaluation results, and known limitations.
+
+## Quantization Details
+
+- **Quantization Scheme:** 8-bit, "Q8_0" (8-bit quantization with minimal precision loss)  
+- **Toolchain:** Converted via [llama.cpp's export utilities](https://github.com/ggml-org/llama.cpp) (commit `v0.1.81` or newer) to GGUF format.
+- **Resulting File Size:** ~8.54 GB on disk (raw GGUF blob)
+- **Runtime Footprint:**
+  - Memory: ≈ 8.54 GB of RAM for the weights when loaded on CPU with llama.cpp; the KV cache and compute buffers add to this, depending on context length
+- **Format:**
+  - File extension: `.gguf`  
+  - Internally contains:  
+    1. Metadata (architecture, tokenizer vocab, hyperparameters)  
+    2. Vocabulary list (BPE tokens)  
+    3. Weight tensors (for each layer and head) stored in 8-bit quantized form  
+  - Compatible with the llama-cpp-python wrapper (`llama_cpp`) and the llama.cpp C++ CLI (`llama-cli`)
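+
+As a rough sanity check on the file size above, the sketch below estimates the Q8_0 size from the hyperparameters in `config.json`. It assumes llama.cpp's Q8_0 layout (blocks of 32 int8 weights plus one fp16 scale, i.e. 34 bytes per block), that every 2-D weight matrix is stored as Q8_0, and that the 1-D norm weights stay in F32. The function names are illustrative and not part of any library.
+
+```python
+# Hyperparameters copied from config.json in this repository.
+CFG = {
+    "vocab_size": 128276,
+    "hidden_size": 4096,
+    "intermediate_size": 14336,
+    "num_hidden_layers": 32,
+    "num_attention_heads": 32,
+    "num_key_value_heads": 8,
+    "head_dim": 128,
+}
+
+Q8_0_BYTES_PER_WEIGHT = 34 / 32  # assumed: 32 int8 values + one fp16 scale per block
+F32_BYTES = 4                    # assumed: 1-D norm weights are kept in F32
+
+
+def count_params(cfg):
+    """Return (matrix_params, norm_params) for an untied Llama-style decoder."""
+    h, ffn = cfg["hidden_size"], cfg["intermediate_size"]
+    q_dim = cfg["num_attention_heads"] * cfg["head_dim"]
+    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]
+    attn = h * q_dim + 2 * h * kv_dim + q_dim * h   # q, k, v, o projections
+    mlp = 3 * h * ffn                                # gate, up, down projections
+    embeddings = 2 * cfg["vocab_size"] * h           # input embeddings + output head
+    matrices = cfg["num_hidden_layers"] * (attn + mlp) + embeddings
+    norms = cfg["num_hidden_layers"] * 2 * h + h     # two RMSNorms per layer + final norm
+    return matrices, norms
+
+
+def estimate_q8_0_bytes(cfg):
+    """Estimate the tensor payload of a Q8_0 GGUF file, ignoring metadata."""
+    matrices, norms = count_params(cfg)
+    return int(matrices * Q8_0_BYTES_PER_WEIGHT + norms * F32_BYTES)
+
+
+matrices, norms = count_params(CFG)
+print(f"parameters: {(matrices + norms) / 1e9:.2f} B")
+print(f"BF16 size:  {(matrices + norms) * 2 / 1e9:.2f} GB")
+print(f"Q8_0 size:  {estimate_q8_0_bytes(CFG) / 1e9:.2f} GB")
+```
+
+Under these assumptions the estimate lands within about 0.1% of the 8,540,948,640-byte file recorded for this repository. The small remainder is presumably GGUF metadata such as the tokenizer vocabulary.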
+
+## How to Use
+[The cookbook](https://github.com/cisco-foundation-ai/cookbook) provides example use cases, code samples for adoption, and references.
+
+### Install llama.cpp on Mac
+
+Use Homebrew:
+```bash
+brew install llama-cpp
+```
+
+or install from scratch:
+
+```bash
+# Install dependencies
+brew install cmake
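+# Clone and build llama.cpp (recent versions build with CMake rather than plain make)
+git clone https://github.com/ggml-org/llama.cpp.git
+cd llama.cpp
+cmake -B build
+cmake --build build --config Release
+# Add to PATH (optional)
+sudo cp build/bin/llama-cli /usr/local/bin/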
+```
+
+### Run the Model
+
+```bash
+llama-cli -m foundation-sec-8b-reasoning-q8_0.gguf -p "CVE-2021-44228 is a remote code execution flaw in Apache Log4j2 via unsafe JNDI lookups (\"Log4Shell\"). The CWE is CWE-502.\n\nCVE-2017-0144 is a remote code execution vulnerability in Microsoft's SMBv1 server (\"EternalBlue\") due to a buffer overflow. The CWE is CWE-119.\n\nCVE-2014-0160 is an information-disclosure bug in OpenSSL's heartbeat extension (\"Heartbleed\") due to out-of-bounds reads. The CWE is CWE-125.\n\nCVE-2017-5638 is a remote code execution issue in Apache Struts 2's Jakarta Multipart parser stemming from improper input validation of the Content-Type header. The CWE is CWE-20.\n\nCVE-2019-0708 is a remote code execution vulnerability in Microsoft's Remote Desktop Services (\"BlueKeep\") triggered by a use-after-free. The CWE is CWE-416.\n\nCVE-2015-10011 is a vulnerability about OpenDNS OpenResolve improper log output neutralization. The CWE is" -n 128
+```
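+
+The same few-shot completion can also be run from Python. This is a minimal sketch, assuming the `llama-cpp-python` package is installed and the GGUF file sits in the working directory. The sampling values mirror `generation_config.json`; the helper names, the context size, and the shortened example list are illustrative choices rather than anything prescribed by the model authors.
+
+```python
+import os
+
+# Sampling defaults copied from generation_config.json; keys are renamed to the
+# keyword arguments accepted by llama-cpp-python.
+SAMPLING = {"temperature": 0.6, "top_p": 0.95, "repeat_penalty": 1.1}
+
+MODEL_PATH = "foundation-sec-8b-reasoning-q8_0.gguf"  # assumed local path
+
+EXAMPLES = [
+    ("CVE-2014-0160 is an information-disclosure bug in OpenSSL's heartbeat "
+     "extension (\"Heartbleed\") due to out-of-bounds reads.", "CWE-125"),
+    ("CVE-2019-0708 is a remote code execution vulnerability in Microsoft's "
+     "Remote Desktop Services (\"BlueKeep\") triggered by a use-after-free.", "CWE-416"),
+]
+QUERY = ("CVE-2015-10011 is a vulnerability about OpenDNS OpenResolve "
+         "improper log output neutralization.")
+
+
+def build_prompt(examples, query):
+    """Join few-shot 'description + CWE' lines and end with the open query."""
+    shots = [f"{text} The CWE is {cwe}." for text, cwe in examples]
+    return "\n\n".join(shots + [f"{query} The CWE is"])
+
+
+def complete(prompt, max_tokens=128):
+    """Load the model and return the generated continuation as a string."""
+    from llama_cpp import Llama  # imported lazily; requires llama-cpp-python
+
+    llm = Llama(model_path=MODEL_PATH, n_ctx=4096, verbose=False)
+    out = llm(prompt, max_tokens=max_tokens, **SAMPLING)
+    return out["choices"][0]["text"]
+
+
+prompt = build_prompt(EXAMPLES, QUERY)
+if os.path.exists(MODEL_PATH):
+    print(complete(prompt))
+else:
+    print(f"{MODEL_PATH} not found; the prompt would be:\n{prompt}")
+```
+
+Loading the model inside `complete` keeps the sketch short; a longer-running script would create the `Llama` object once and reuse it across prompts.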
+
+## References
+
+1. **Original Model Card:**  
+   [fdtn-ai/Foundation-Sec-8B-Reasoning](https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Reasoning) (January 28, 2026)
+
+2. **Llama-cpp GGUF Quantization:**  
+   Gerganov, G. (2023). _llama.cpp: LLM inference in C/C++_. GitHub repository.
+
+3. **ZeroQuant:**  
+   Yao, Z. et al. (2022). "ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers." arXiv: 2206.01861.
+
+4. **SmoothQuant:**  
+   Xiao, G. et al. (2022). "SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models." arXiv: 2211.10438.
+
+**License:** Apache 2.0 (same as base)  
+**Contact:** For questions about usage, quantization details, or license terms, please open an issue on the Hugging Face repo or contact `blainen@cisco.com`.
--- a/config.json
+++ b/config.json
@@ -0,0 +1,38 @@
+{
+  "_fused_kernels_backend": "flash_attn",
+  "_use_fused_kernels": true,
+  "architectures": [
+    "LlamaForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "bos_token_id": 128000,
+  "eos_token_id": 128001,
+  "head_dim": 128,
+  "hidden_act": "silu",
+  "hidden_size": 4096,
+  "initializer_range": 0.02,
+  "intermediate_size": 14336,
+  "max_position_embeddings": 131072,
+  "mlp_bias": false,
+  "model_type": "llama",
+  "num_attention_heads": 32,
+  "num_hidden_layers": 32,
+  "num_key_value_heads": 8,
+  "pad_token_id": 128275,
+  "pretraining_tp": 1,
+  "rms_norm_eps": 1e-05,
+  "rope_scaling": {
+    "factor": 8.0,
+    "high_freq_factor": 4.0,
+    "low_freq_factor": 1.0,
+    "original_max_position_embeddings": 8192,
+    "rope_type": "llama3"
+  },
+  "rope_theta": 500000.0,
+  "tie_word_embeddings": false,
+  "torch_dtype": "bfloat16",
+  "transformers_version": "4.51.3",
+  "use_cache": true,
+  "vocab_size": 128276
+}
--- a/foundation-sec-8b-reasoning-q8_0.gguf
+++ b/foundation-sec-8b-reasoning-q8_0.gguf
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cd21e8dbb0f3b66156c864e2c33b4edc390070a142e6fab82bc0f61fa157c5f4
+size 8540948640
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,14 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 128000,
+  "do_sample": true,
+  "eos_token_id": [
+    128001,
+    128009
+    ],
+  "pad_token_id": 128275,
+  "temperature": 0.6,
+  "top_p": 0.95,
+  "repetition_penalty": 1.1,
+  "transformers_version": "4.51.3"
+}
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,37 @@
+{
+  "additional_special_tokens": [
+    "<|system|>",
+    "<|user|>",
+    "<|assistant|>",
+    "<think>",
+    "</think>",
+    "<available_tools>",
+    "</available_tools>",
+    "<toolcall>",
+    "</toolcall>",
+    "<tool_response>",
+    "</tool_response>",
+    "<pad>"
+  ],
+  "bos_token": {
+    "content": "<|begin_of_text|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "<|end_of_text|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<pad>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
--- a/tokenizer.json
+++ b/tokenizer.json
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:88fb81f6493ea9017c2b2e25c95beab7b8d961a9aac5c6a5a5b5a01dc7be902d
+size 17213681
--- a/tokenizer_config.json
+++ b/tokenizer_config.json