Initialize project; model provided by the ModelHub XC community

Model: fdtn-ai/Foundation-Sec-8B-Reasoning-Q8_0-GGUF Source: Original Platform
2026-04-22 14:13:10 +08:00
commit e88d430147
8 changed files with 2455 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,37 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 foundation-sec-8b-reasoning-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,87 @@
 ---
 base_model: fdtn-ai/Foundation-Sec-8B-Reasoning
 language:
 - en
 library_name: transformers
 license: other
 pipeline_tag: text-generation
 tags:
 - security
 - llama
 - llama-cpp
 - gguf-my-repo
 ---
 # Foundation-Sec-8B-Reasoning-Q8_0-GGUF
 **This model was quantized from fdtn-ai/Foundation-Sec-8B-Reasoning to an 8-bit (Q8_0) GGUF checkpoint using llama.cpp. It retains the cybersecurity specialization of the original 8-billion-parameter model while reducing the memory footprint from approximately 16 GB (BF16) to around 8.54 GB (Q8_0) for inference.**
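 The 16 GB and 8.54 GB figures can be sanity-checked with a rough parameter count. The sketch below assumes the hyperparameters listed in this repository's `config.json` and assumes every weight is stored at the stated precision; the helper names are illustrative only.

```python
# Rough size estimate from the hyperparameters in this repository's config.json.
HIDDEN = 4096
INTERMEDIATE = 14336
LAYERS = 32
HEADS = 32
KV_HEADS = 8
HEAD_DIM = 128
VOCAB = 128276


def count_parameters() -> int:
    """Count the weights of a Llama-style decoder with untied embeddings."""
    embeddings = 2 * VOCAB * HIDDEN  # input embedding plus separate output head
    q_and_o = 2 * HIDDEN * HEADS * HEAD_DIM
    k_and_v = 2 * HIDDEN * KV_HEADS * HEAD_DIM  # grouped-query attention
    mlp = 3 * HIDDEN * INTERMEDIATE  # gate, up, and down projections
    norms = 2 * HIDDEN  # two RMSNorm weight vectors per layer
    per_layer = q_and_o + k_and_v + mlp + norms
    return embeddings + LAYERS * per_layer + HIDDEN  # plus the final RMSNorm


def size_gb(params: int, bits_per_weight: float) -> float:
    """Convert a weight count and a per-weight bit width to decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


params = count_parameters()
# Q8_0 stores each block of 32 weights as 32 int8 values plus one 16-bit scale.
Q8_0_BITS = (32 * 8 + 16) / 32  # 8.5 bits per weight
print(f"parameters: {params:,}")
print(f"BF16 estimate: {size_gb(params, 16):.2f} GB")
print(f"Q8_0 estimate: {size_gb(params, Q8_0_BITS):.2f} GB")
```

 The Q8_0 estimate lands slightly below the 8,540,948,640-byte file in this repository; the remainder is plausibly metadata, the embedded tokenizer, and tensors kept at higher precision, though this sketch does not verify that.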
 ## Model Description
 `fdtn-ai/Foundation-Sec-8B-Reasoning-Q8_0-GGUF` is an 8-bit quantized variant of **Foundation-Sec-8B-Reasoning** — an 8B-parameter LLaMA 3.1–based model that extends the **Foundation-Sec-8B** base model with instruction-following and reasoning capabilities. The base model was continued-pretrained on a curated corpus of cybersecurity-specific text (e.g., CVEs, threat intel reports, exploit write-ups, compliance guides). Foundation-Sec-8B-Reasoning is optimized for three core use case categories:
 - **SOC Acceleration**: Automating triage, summarization, case note generation, and evidence collection.
 - **Proactive Threat Defense**: Simulating attacks, prioritizing vulnerabilities, mapping TTPs, and modeling attacker behavior.
 - **Engineering Enablement**: Providing security assistance, validating configurations, assessing compliance evidence, and improving security posture.
 This card does not repeat the full training details. Refer to the original model card for the architecture, training data, evaluation results, and known limitations.
 ## Quantization Details
 - **Quantization Scheme:** Q8_0 (8-bit quantization with minimal precision loss)  
 - **Toolchain:** Converted via [llama.cpp's export utilities](https://github.com/ggml-org/llama.cpp) (commit `v0.1.81` or newer) to GGUF format.
 - **Resulting File Size:** ~ 8.54 GB on disk (raw GGUF blob)   
 - **Runtime Footprint:** 
  - Memory: ≈ 8.54 GB of RAM for the weights when loaded on CPU with llama.cpp; the KV cache adds to this and grows with context length  
 - **Format:**
  - File extension: `.gguf`  
  - Internally contains:  
    1. Metadata (architecture, tokenizer vocab, hyperparameters)  
    2. Vocabulary list (BPE tokens)  
    3. Weight tensors (for each layer and head) stored in 8-bit quantized form  
  - Compatible with the llama-cpp-python wrapper (`llama_cpp`) and the llama.cpp C++ CLI (`llama-cli`)  
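 As a quick integrity check before loading, the fixed-size header at the start of a GGUF file can be read directly. This is a minimal sketch that assumes the version 2+ header layout (4-byte magic, little-endian `uint32` version, then `uint64` tensor and metadata counts); `read_gguf_header` is a hypothetical helper, and the counts in the demo are made-up values, not those of this model.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4-byte magic + uint32 version + two uint64 counts


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header (assumes the version 2+ layout)."""
    if len(data) < HEADER_SIZE:
        raise ValueError(f"need at least {HEADER_SIZE} bytes, got {len(data)}")
    if data[:4] != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic is {data[:4]!r}")
    # Little-endian: uint32 version, uint64 tensor count, uint64 metadata count.
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


# Demo on a synthetic header; the counts here are illustrative only.
fake_header = struct.pack("<4sIQQ", GGUF_MAGIC, 3, 291, 30)
header = read_gguf_header(fake_header)
print(header)
```

 For a real file, pass the first 24 bytes, for example `read_gguf_header(open(path, "rb").read(24))`. Note that a Git LFS pointer file (a few lines of text starting with `version https://git-lfs...`) would fail the magic check, which is a useful hint that the actual blob has not been downloaded.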
 ## How to Use
 [The cookbook](https://github.com/cisco-foundation-ai/cookbook) provides example use cases, code samples for adoption, and references.
 ### Install llama.cpp on Mac
 Use Homebrew:
 ```bash
 brew install llama-cpp
 ```
 or build from source:
 ```bash
 # Install build dependencies
 brew install cmake
 # Clone and build llama.cpp (current releases build with CMake rather than make)
 git clone https://github.com/ggml-org/llama.cpp.git
 cd llama.cpp
 cmake -B build
 cmake --build build --config Release
 # Add to PATH (optional)
 sudo cp build/bin/llama-cli /usr/local/bin/
 ```
 ### Run the Model
 ```bash
 llama-cli -m foundation-sec-8b-reasoning-q8_0.gguf -p "CVE-2021-44228 is a remote code execution flaw in Apache Log4j2 via unsafe JNDI lookups (\"Log4Shell\"). The CWE is CWE-502.\n\nCVE-2017-0144 is a remote code execution vulnerability in Microsoft's SMBv1 server (\"EternalBlue\") due to a buffer overflow. The CWE is CWE-119.\n\nCVE-2014-0160 is an information-disclosure bug in OpenSSL's heartbeat extension (\"Heartbleed\") due to out-of-bounds reads. The CWE is CWE-125.\n\nCVE-2017-5638 is a remote code execution issue in Apache Struts 2's Jakarta Multipart parser stemming from improper input validation of the Content-Type header. The CWE is CWE-20.\n\nCVE-2019-0708 is a remote code execution vulnerability in Microsoft's Remote Desktop Services (\"BlueKeep\") triggered by a use-after-free. The CWE is CWE-416.\n\nCVE-2015-10011 is a vulnerability about OpenDNS OpenResolve improper log output neutralization. The CWE is" -n 128
 ```
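 The same few-shot prompt can be assembled programmatically, and the sampling defaults from this repository's `generation_config.json` can be passed as `llama-cli` flags. The sketch below is one possible way to do that; `build_prompt` and `build_command` are hypothetical helpers, only two of the examples from the prompt above are included for brevity, and the mapping of `repetition_penalty` to `--repeat-penalty` is an assumption about equivalent behavior.

```python
import shlex

MODEL_PATH = "foundation-sec-8b-reasoning-q8_0.gguf"

# Values copied from generation_config.json in this repository.
SAMPLING = {"temperature": 0.6, "top_p": 0.95, "repetition_penalty": 1.1}

# (description, CWE) pairs taken from the prompt above; shortened to two here.
FEW_SHOT = [
    (
        "CVE-2014-0160 is an information-disclosure bug in OpenSSL's "
        'heartbeat extension ("Heartbleed") due to out-of-bounds reads.',
        "CWE-125",
    ),
    (
        "CVE-2019-0708 is a remote code execution vulnerability in Microsoft's "
        'Remote Desktop Services ("BlueKeep") triggered by a use-after-free.',
        "CWE-416",
    ),
]

QUERY = (
    "CVE-2015-10011 is a vulnerability about OpenDNS OpenResolve "
    "improper log output neutralization."
)


def build_prompt(examples, query):
    """Join solved examples and leave the final CWE open for the model."""
    shots = [f"{text} The CWE is {cwe}." for text, cwe in examples]
    return "\n\n".join(shots + [f"{query} The CWE is"])


def build_command(prompt, max_tokens=128):
    """Build the llama-cli argument list with the repository's sampling defaults."""
    return [
        "llama-cli", "-m", MODEL_PATH, "-p", prompt, "-n", str(max_tokens),
        "--temp", str(SAMPLING["temperature"]),
        "--top-p", str(SAMPLING["top_p"]),
        "--repeat-penalty", str(SAMPLING["repetition_penalty"]),
    ]


prompt = build_prompt(FEW_SHOT, QUERY)
command = build_command(prompt)
print(shlex.join(command))  # shell-quoted form, safe to paste into a terminal
```

 To run it directly from Python, the argument list can be handed to `subprocess.run(command, check=True)`, which avoids shell quoting issues with the embedded quotes and newlines.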
 ## References
 1. **Original Model Card:**  
   [fdtn-ai/Foundation-Sec-8B-Reasoning](https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Reasoning) (January 28, 2026)
 2. **Llama-cpp GGUF Quantization:**  
   Gerganov, G. (2023). _Llama.cpp: Llama inference in pure C/C++/Assembly/GGUF_. GitHub repository.
 3. **ZeroQuant:**  
   Yao, Z. et al. (2022). "ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers." arXiv: 2206.01861.
 4. **SmoothQuant:**  
   Xiao, G. et al. (2022). "SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models." arXiv: 2211.10438.
 **License:** Apache 2.0 (same as base)  
 **Contact:** For questions about usage, quantization details, or license terms, please open an issue on the Hugging Face repo or contact `blainen@cisco.com`.
--- a/config.json
+++ b/config.json
@@ -0,0 +1,38 @@
 {
  "_fused_kernels_backend": "flash_attn",
  "_use_fused_kernels": true,
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 128000,
  "eos_token_id": 128001,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 131072,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "pad_token_id": 128275,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": {
    "factor": 8.0,
    "high_freq_factor": 4.0,
    "low_freq_factor": 1.0,
    "original_max_position_embeddings": 8192,
    "rope_type": "llama3"
  },
  "rope_theta": 500000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.51.3",
  "use_cache": true,
  "vocab_size": 128276
 }
--- a/foundation-sec-8b-reasoning-q8_0.gguf
+++ b/foundation-sec-8b-reasoning-q8_0.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:cd21e8dbb0f3b66156c864e2c33b4edc390070a142e6fab82bc0f61fa157c5f4
 size 8540948640
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,14 @@
 {
  "_from_model_config": true,
  "bos_token_id": 128000,
  "do_sample": true,
  "eos_token_id": [
    128001,
    128009
    ],
  "pad_token_id": 128275,
  "temperature": 0.6,
  "top_p": 0.95,
  "repetition_penalty": 1.1,
  "transformers_version": "4.51.3"
 }
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,37 @@
 {
  "additional_special_tokens": [
    "<|system|>",
    "<|user|>",
    "<|assistant|>",
    "<think>",
    "</think>",
    "<available_tools>",
    "</available_tools>",
    "<toolcall>",
    "</toolcall>",
    "<tool_response>",
    "</tool_response>",
    "<pad>"
  ],
  "bos_token": {
    "content": "<|begin_of_text|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|end_of_text|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
 }
--- a/tokenizer.json
+++ b/tokenizer.json
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:88fb81f6493ea9017c2b2e25c95beab7b8d961a9aac5c6a5a5b5a01dc7be902d
 size 17213681
--- a/tokenizer_config.json
+++ b/tokenizer_config.json