Initialize project; model provided by the ModelHub XC community

Model: Ayansk11/FinSenti-Qwen3-8B-GGUF
Source: Original Platform
ModelHub XC
2026-05-09 14:16:36 +08:00
commit 9c0c65ba02
8 changed files with 321 additions and 0 deletions

38
.gitattributes vendored Normal file

@@ -0,0 +1,38 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
FinSenti-Qwen3-8B.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
FinSenti-Qwen3-8B.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
FinSenti-Qwen3-8B.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text

3
FinSenti-Qwen3-8B.Q4_K_M.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f5737269177350215f933a36f141c66ccf8cfbbe70b9d262606f3a24e27eceb6
size 5027783872

3
FinSenti-Qwen3-8B.Q5_K_M.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9812f66bfe5de94fd347165485ae78eb6b2b8f5adfadf9705a624efabab8b107
size 5851112640

3
FinSenti-Qwen3-8B.Q8_0.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d0a074d413509db7a7ad359d6425fa583284c1f25763ba23314f588f908efed7
size 8709518528

31
Modelfile.Q4_K_M Normal file

@@ -0,0 +1,31 @@
FROM ./FinSenti-Qwen3-8B.Q4_K_M.gguf
TEMPLATE \"\"\"<|im_start|>system
You are a financial sentiment analyst. Analyze the given financial text and provide:
1. Your reasoning in <reasoning> tags
2. Your sentiment classification (positive, negative, or neutral) in <answer> tags
Always use this exact format:
<reasoning>
[Your step-by-step analysis]
</reasoning>
<answer>[positive/negative/neutral]</answer><|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
<think>
</think>
\"\"\"
PARAMETER stop "<|im_end|>"
PARAMETER stop "</answer>"
PARAMETER stop "<|endoftext|>"
PARAMETER stop "<|im_start|>"
PARAMETER temperature 0.3
PARAMETER top_p 0.9
PARAMETER top_k 40
PARAMETER repeat_penalty 1.15
PARAMETER num_ctx 1024
PARAMETER num_predict 512

31
Modelfile.Q5_K_M Normal file

@@ -0,0 +1,31 @@
FROM ./FinSenti-Qwen3-8B.Q5_K_M.gguf
TEMPLATE \"\"\"<|im_start|>system
You are a financial sentiment analyst. Analyze the given financial text and provide:
1. Your reasoning in <reasoning> tags
2. Your sentiment classification (positive, negative, or neutral) in <answer> tags
Always use this exact format:
<reasoning>
[Your step-by-step analysis]
</reasoning>
<answer>[positive/negative/neutral]</answer><|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
<think>
</think>
\"\"\"
PARAMETER stop "<|im_end|>"
PARAMETER stop "</answer>"
PARAMETER stop "<|endoftext|>"
PARAMETER stop "<|im_start|>"
PARAMETER temperature 0.3
PARAMETER top_p 0.9
PARAMETER top_k 40
PARAMETER repeat_penalty 1.15
PARAMETER num_ctx 1024
PARAMETER num_predict 512

31
Modelfile.Q8_0 Normal file

@@ -0,0 +1,31 @@
FROM ./FinSenti-Qwen3-8B.Q8_0.gguf
TEMPLATE \"\"\"<|im_start|>system
You are a financial sentiment analyst. Analyze the given financial text and provide:
1. Your reasoning in <reasoning> tags
2. Your sentiment classification (positive, negative, or neutral) in <answer> tags
Always use this exact format:
<reasoning>
[Your step-by-step analysis]
</reasoning>
<answer>[positive/negative/neutral]</answer><|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
<think>
</think>
\"\"\"
PARAMETER stop "<|im_end|>"
PARAMETER stop "</answer>"
PARAMETER stop "<|endoftext|>"
PARAMETER stop "<|im_start|>"
PARAMETER temperature 0.3
PARAMETER top_p 0.9
PARAMETER top_k 40
PARAMETER repeat_penalty 1.15
PARAMETER num_ctx 1024
PARAMETER num_predict 512

181
README.md Normal file

@@ -0,0 +1,181 @@
---
license: apache-2.0
language:
- en
base_model: Ayansk11/FinSenti-Qwen3-8B
datasets:
- Ayansk11/FinSenti-Dataset
pipeline_tag: text-generation
library_name: gguf
tags:
- finance
- financial-sentiment
- chain-of-thought
- reasoning
- gguf
- llama-cpp
- ollama
- quantized
- finsenti
---
# FinSenti-Qwen3-8B - GGUF
GGUF builds of [FinSenti-Qwen3-8B](https://huggingface.co/Ayansk11/FinSenti-Qwen3-8B)
for use with [Ollama](https://ollama.com), [llama.cpp](https://github.com/ggerganov/llama.cpp),
LM Studio, KoboldCpp, and other GGUF-compatible runtimes.
This is the same model as the SafeTensors repo, just converted and
quantized so you can run it on a CPU or a small GPU without pulling in
PyTorch.
## Files in this repo
| File | Quant | Size | Notes |
|------|-------|------|-------|
| `FinSenti-Qwen3-8B.Q4_K_M.gguf` | Q4_K_M | 4.70 GB | Smallest, mild quality dip. Default pick for laptops. |
| `FinSenti-Qwen3-8B.Q5_K_M.gguf` | Q5_K_M | 5.40 GB | Balanced quality and size. |
| `FinSenti-Qwen3-8B.Q8_0.gguf` | Q8_0 | 8.20 GB | Closest to bf16, biggest file. |
If you're not sure which to pick: **start with Q4_K_M**. It's the smallest
file, it runs everywhere, and the quality drop versus the original bf16
weights is small for a model this size.
## Quick start (llama.cpp)
```bash
# Download the Q4_K_M file (or pick a different quant from the table above)
huggingface-cli download Ayansk11/FinSenti-Qwen3-8B-GGUF FinSenti-Qwen3-8B.Q4_K_M.gguf --local-dir .
# Run it
./llama-cli -m FinSenti-Qwen3-8B.Q4_K_M.gguf \
--system "You are a financial sentiment analyst. For each headline you receive, write a short reasoning chain inside <reasoning>...</reasoning> tags, then give a single label inside <answer>...</answer> tags. The label must be exactly one of: positive, negative, neutral." \
-p "Apple beats Q4 estimates as iPhone sales jump 12% year over year." \
-n 256
```
## Quick start (Ollama)
This repo ships a `Modelfile` for each quant. To register the Q4_K_M build
under the name `finsenti-qwen3-8b`:
```bash
huggingface-cli download Ayansk11/FinSenti-Qwen3-8B-GGUF \
FinSenti-Qwen3-8B.Q4_K_M.gguf Modelfile.Q4_K_M --local-dir ./finsenti-tmp
cd finsenti-tmp
ollama create finsenti-qwen3-8b -f Modelfile.Q4_K_M
# Then chat with it
ollama run finsenti-qwen3-8b "Apple beats Q4 estimates as iPhone sales jump 12% year over year."
```
You should see output like:
```
<reasoning>
Beating estimates is a positive earnings surprise. A 12% YoY iPhone sales jump in the company's biggest product line points to demand strength. Both signals push the read positive.
</reasoning>
<answer>positive</answer>
```
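If you'd rather call the registered model from code instead of the CLI, here is a minimal sketch against Ollama's local REST API. It assumes `ollama serve` is running on the default port and that the model was created as `finsenti-qwen3-8b` per the step above:
```python
import requests

# Minimal sketch: query the locally registered Ollama model over its REST API.
# Assumes the Ollama server is listening on the default port 11434 and the
# model name matches the `ollama create` step above.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "finsenti-qwen3-8b",
        "prompt": "Apple beats Q4 estimates as iPhone sales jump 12% year over year.",
        "stream": False,  # return a single JSON object instead of a token stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])  # contains the <reasoning>/<answer> blocks
```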
## Quick start (Python via llama-cpp-python)
```python
from llama_cpp import Llama
llm = Llama(
    model_path="./FinSenti-Qwen3-8B.Q4_K_M.gguf",
    n_ctx=2048,     # context window; a headline plus the reasoning chain fits easily
    n_threads=8,    # match your physical core count
)
system = (
    "You are a financial sentiment analyst. For each headline you receive, "
    "write a short reasoning chain inside <reasoning>...</reasoning> tags, "
    "then give a single label inside <answer>...</answer> tags. The label "
    "must be exactly one of: positive, negative, neutral."
)
resp = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": system},
        {"role": "user", "content": "Apple beats Q4 estimates as iPhone sales jump 12% year over year."},
    ],
    max_tokens=256,
    temperature=0.0,  # deterministic output for classification
)
print(resp["choices"][0]["message"]["content"])
```
## Hardware
The Q4_K_M build is about 4.70 GB on disk and needs
roughly 6 GB of free RAM at runtime. On a modern laptop
CPU you should see 15-40 tokens per second depending on the quant you pick
and your core count. Throwing it on a small GPU (Apple Silicon, a
6-8 GB NVIDIA card) gets you considerably faster generation.
If you need more headroom, the Q5_K_M and Q8_0 files are progressively
closer to the original bf16 quality at the cost of size.
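If you go the GPU route through llama-cpp-python (as in the Python quick start), layer offload is a single constructor argument. A minimal sketch, assuming you installed a CUDA or Metal build of the library:
```python
from llama_cpp import Llama

# Sketch: offload layers to the GPU. n_gpu_layers=-1 requests all layers;
# without a CUDA/Metal build of llama-cpp-python this silently falls back to CPU.
llm = Llama(
    model_path="./FinSenti-Qwen3-8B.Q4_K_M.gguf",
    n_ctx=2048,
    n_gpu_layers=-1,
)
```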
## Picking a quant
- **Q4_K_M** (4.70 GB): the default for laptops
and small servers. Mild quality dip versus full precision but fits
almost anywhere.
- **Q5_K_M** (5.40 GB): a step up if you have
the RAM. Most people won't notice the difference from Q8.
- **Q8_0** (8.20 GB): closest to the bf16 weights.
Use this if you want the cleanest output and have the disk space.
## Prompt format
Same as the base model. Use the system prompt verbatim, put the headline
or short snippet in the user turn, and parse the `<answer>...</answer>`
block for the label.
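As an example, a small parsing helper (a sketch, not shipped with this repo) can pull the label out of a completion. Note that the Ollama Modelfiles above stop on `</answer>`, so the closing tag may be cut off; the regex therefore only requires the opening tag:
```python
import re

# Sketch: extract the sentiment label from a completion. The closing </answer>
# tag can be truncated by the "</answer>" stop parameter in the Modelfiles,
# so match on the opening tag only.
_ANSWER = re.compile(r"<answer>\s*(positive|negative|neutral)", re.IGNORECASE)

def parse_label(completion: str) -> str | None:
    match = _ANSWER.search(completion)
    return match.group(1).lower() if match else None

print(parse_label("<reasoning>Beat on earnings.</reasoning>\n<answer>positive"))  # -> positive
```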
## Limitations
GGUF is a faithful conversion of the base model, so the same caveats apply:
- English only
- Short text only (training context was 2048 tokens)
- Three labels: positive, negative, neutral
- It explains its read but it isn't doing finance research; don't use the
reasoning chain as investment advice
Quantization adds a small extra error on top of the base model. For
Q4_K_M on a model this size you'll see occasional disagreement with the
bf16 model on borderline headlines, usually neutral-vs-positive flips.
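If you want a feel for how often that happens on your own headlines, a rough sketch along these lines (hypothetical headlines, llama-cpp-python as in the quick start; loading both quants at once needs roughly 13 GB of free RAM) compares Q4_K_M and Q8_0 labels directly:
```python
import re
from llama_cpp import Llama

# Rough sketch: run the same headlines through two quants and flag label flips.
# Assumes both GGUF files are in the current directory; the headlines are examples.
SYSTEM = (
    "You are a financial sentiment analyst. For each headline you receive, "
    "write a short reasoning chain inside <reasoning>...</reasoning> tags, "
    "then give a single label inside <answer>...</answer> tags. The label "
    "must be exactly one of: positive, negative, neutral."
)
HEADLINES = [
    "Apple beats Q4 estimates as iPhone sales jump 12% year over year.",
    "Retailer guides full-year margins slightly below consensus.",
]

def label(llm: Llama, headline: str) -> str | None:
    out = llm.create_chat_completion(
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": headline},
        ],
        max_tokens=256,
        temperature=0.0,
    )["choices"][0]["message"]["content"]
    m = re.search(r"<answer>\s*(positive|negative|neutral)", out, re.IGNORECASE)
    return m.group(1).lower() if m else None

q4 = Llama(model_path="./FinSenti-Qwen3-8B.Q4_K_M.gguf", n_ctx=2048, verbose=False)
q8 = Llama(model_path="./FinSenti-Qwen3-8B.Q8_0.gguf", n_ctx=2048, verbose=False)
for h in HEADLINES:
    a, b = label(q4, h), label(q8, h)
    print("agree" if a == b else "FLIP ", a, "vs", b, "-", h)
```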
## Related FinSenti models
Other sizes and bases trained with the same recipe:
- **Qwen3**: [Qwen3-0.6B](https://huggingface.co/Ayansk11/FinSenti-Qwen3-0.6B), [Qwen3-1.7B](https://huggingface.co/Ayansk11/FinSenti-Qwen3-1.7B), [Qwen3-4B](https://huggingface.co/Ayansk11/FinSenti-Qwen3-4B)
- **Qwen3.5**: [Qwen3.5-0.8B](https://huggingface.co/Ayansk11/FinSenti-Qwen3.5-0.8B), [Qwen3.5-2B](https://huggingface.co/Ayansk11/FinSenti-Qwen3.5-2B), [Qwen3.5-4B](https://huggingface.co/Ayansk11/FinSenti-Qwen3.5-4B), [Qwen3.5-9B](https://huggingface.co/Ayansk11/FinSenti-Qwen3.5-9B)
- **DeepSeek**: [DeepSeek-R1-1.5B](https://huggingface.co/Ayansk11/FinSenti-DeepSeek-R1-1.5B)
- **MobileLLM**: [MobileLLM-R1-950M](https://huggingface.co/Ayansk11/FinSenti-MobileLLM-R1-950M)
- **Tiny-LLM**: [Tiny-LLM-10M](https://huggingface.co/Ayansk11/FinSenti-Tiny-LLM-10M)
- **Llama-3**: [Llama-3.2-1B](https://huggingface.co/Ayansk11/FinSenti-Llama-3.2-1B)
- **SmolLM**: [SmolLM-1.7B](https://huggingface.co/Ayansk11/FinSenti-SmolLM-1.7B)
The full-precision SafeTensors version of this model is at
[Ayansk11/FinSenti-Qwen3-8B](https://huggingface.co/Ayansk11/FinSenti-Qwen3-8B), and the
training data is at
[Ayansk11/FinSenti-Dataset](https://huggingface.co/datasets/Ayansk11/FinSenti-Dataset).
## Citation
```bibtex
@misc{shaikh2026finsenti,
title = {FinSenti: Small Language Models for Financial Sentiment with Chain-of-Thought Reasoning},
author = {Shaikh, Ayan},
year = {2026},
url = {https://huggingface.co/collections/Ayansk11/finsenti},
note = {Indiana University}
}
```
## License
Apache 2.0.