rLLM-FinQA-4B/README.md

---
license: apache-2.0
library_name: transformers
datasets:
- rLLM/rLLM-FinQA-Dataset
language:
- en
base_model:
- Qwen/Qwen3-4B-Instruct-2507
pipeline_tag: text-generation
tags:
- finance
- tool-use
- agent
---
<div align="center">
<span style="font-family: default; font-size: 1.5em;">FinQA</span>
<div>
Training Financial Agents with Reinforcement Learning
</div>
</div>
<br>
<div align="center" style="line-height: 1;">
  <a href="https://github.com/rllm-org/rllm" style="margin: 2px;">
    <img alt="Code" src="https://img.shields.io/badge/FinQA-000000?style=for-the-badge&logo=github&logoColor=000&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="https://rllm-project.com/post.html?post=finqa.md" target="_blank" style="margin: 2px;">
    <img alt="Blog" src="https://img.shields.io/badge/Blog-%23000000.svg?style=for-the-badge&logo=notion&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="https://x.com/rllm_project" style="margin: 2px;">
    <img alt="X.ai" src="https://img.shields.io/badge/rLLM-white?style=for-the-badge&logo=X&logoColor=000&color=000&labelColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="https://huggingface.co/rLLM" style="margin: 2px;">
    <img alt="Hugging Face" src="https://img.shields.io/badge/rLLM-fcd022?style=for-the-badge&logo=huggingface&logoColor=000&labelColor" style="display: inline-block; vertical-align: middle;"/>
  </a>
</div>
</div>
</div>

## FinQA Overview

FinQA is a financial question-answering agent fine-tuned from Qwen3-4B-Instruct-2507 using reinforcement learning (RL). The model answers questions about SEC 10-K financial statements using specialized tools (SQL queries, table lookup, calculators), achieving 59.70% accuracy on Snorkel Finance Benchmark and 26.6% on Snorkel Finance Reasoning.

## Data

Our training dataset is built from SEC 10-K filings and consists of 5,110 question-answer pairs across:
- **207 companies** spanning multiple sectors
- **6,923 financial tables** extracted from 10-K filings
- **Single-table questions**: Direct lookups and calculations from individual tables
- **Multi-table questions**: Cross-table reasoning requiring data from multiple sources

The dataset is available on [HuggingFace](https://huggingface.co/datasets/rLLM/rLLM-FinQA-Dataset).

## Tools

The agent uses 4 specialized tools for financial analysis:

| Tool | Description |
|------|-------------|
| `get_table_names` | List available tables for a given company |
| `get_table_info` | Get table metadata, columns, dtypes, and sample values |
| `sql_query` | Execute SQL queries on financial tables (SQLite) |
| `calculator` | Evaluate mathematical expressions |

## Training

We fine-tune Qwen3-4B-Instruct-2507 using GRPO with LLM-as-judge rewards for correctness evaluation. A more detailed description of the training recipe can be found in our [documentation](https://rllm-project.readthedocs.io/en/latest/projects/finqa/).

## Evaluation

| Model | FinQA | FinQA Reasoning |
|-------|-------|-----------------|
| Qwen3-4B-Instruct-2507 (Base) | 27.90% | 13.90% |
| gpt-5-nano-2025-08-07 | 50.00% | 26.60% |
| Qwen3-235B-A22B | 51.37% | 18.90% |
| **rLLM-FinQA-4B (Ours)** | **59.70%** | **26.60%** |
| Gemini-2.5-Pro-Preview | 60.60% | 34.60% |
| GPT-4.1-2025-04-14 | 62.70% | 37.90% |
| o3-mini-2025-01-31 | 63.79% | 30.37% |


## Serving FinQA

Start a vLLM server and run the agent:

```bash
python -m vllm.entrypoints.openai.api_server \
    --model rLLM/rLLM-FinQA-4B \
    --host 0.0.0.0 \
    --port 30000 \
    --dtype bfloat16

python -m projects.finqa.run_finqa
```

For detailed setup instructions, see the [project README](https://github.com/rllm-org/rllm/tree/main/projects/finqa).

## Acknowledgement

- This is a joint collaboration between the [rLLM](https://github.com/rllm-org/rllm) team at UC Berkeley and [Snorkel AI](https://snorkel.ai/).
- Our model is trained on top of [`Qwen3-4B-Instruct-2507`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
- Our work is done as part of [Berkeley Sky Computing Lab](https://skycomputing.berkeley.edu/).

## Citation

```bibtex
@misc{rllm2026finqa,
  title={FinQA: Training Financial Agents with Reinforcement Learning},
  author={Manan Roongta and Sijun Tan and Bhavishya Pohani and Charles Dickens and Christopher Glaze},
  year={2026},
  howpublished={\url{https://rllm-project.com/post.html?post=finqa.md}},
  note={Blog Post}
}
```
初始化项目，由ModelHub XC社区提供模型 Model: rLLM/rLLM-FinQA-4B Source: Original Platform 2026-05-25 07:31:18 +08:00			`---`
			`license: apache-2.0`
			`library_name: transformers`
			`datasets:`
			`- rLLM/rLLM-FinQA-Dataset`
			`language:`
			`- en`
			`base_model:`
			`- Qwen/Qwen3-4B-Instruct-2507`
			`pipeline_tag: text-generation`
			`tags:`
			`- finance`
			`- tool-use`
			`- agent`
			`---`
			`<div align="center">`
			`<span style="font-family: default; font-size: 1.5em;">FinQA</span>`
			`<div>`
			`Training Financial Agents with Reinforcement Learning`
			`</div>`
			`</div>`
			`<br>`
			`<div align="center" style="line-height: 1;">`
			`<a href="https://github.com/rllm-org/rllm" style="margin: 2px;">`
			`<img alt="Code" src="https://img.shields.io/badge/FinQA-000000?style=for-the-badge&logo=github&logoColor=000&logoColor=white" style="display: inline-block; vertical-align: middle;"/>`
			`</a>`
			`<a href="https://rllm-project.com/post.html?post=finqa.md" target="_blank" style="margin: 2px;">`
			`<img alt="Blog" src="https://img.shields.io/badge/Blog-%23000000.svg?style=for-the-badge&logo=notion&logoColor=white" style="display: inline-block; vertical-align: middle;"/>`
			`</a>`
			`<a href="https://x.com/rllm_project" style="margin: 2px;">`
			`<img alt="X.ai" src="https://img.shields.io/badge/rLLM-white?style=for-the-badge&logo=X&logoColor=000&color=000&labelColor=white" style="display: inline-block; vertical-align: middle;"/>`
			`</a>`
			`<a href="https://huggingface.co/rLLM" style="margin: 2px;">`
			`<img alt="Hugging Face" src="https://img.shields.io/badge/rLLM-fcd022?style=for-the-badge&logo=huggingface&logoColor=000&labelColor" style="display: inline-block; vertical-align: middle;"/>`
			`</a>`
			`</div>`
			`</div>`
			`</div>`

			`## FinQA Overview`

			`FinQA is a financial question-answering agent fine-tuned from Qwen3-4B-Instruct-2507 using reinforcement learning (RL). The model answers questions about SEC 10-K financial statements using specialized tools (SQL queries, table lookup, calculators), achieving 59.70% accuracy on Snorkel Finance Benchmark and 26.6% on Snorkel Finance Reasoning.`

			`## Data`

			`Our training dataset is built from SEC 10-K filings and consists of 5,110 question-answer pairs across:`
			`- 207 companies spanning multiple sectors`
			`- 6,923 financial tables extracted from 10-K filings`
			`- Single-table questions: Direct lookups and calculations from individual tables`
			`- Multi-table questions: Cross-table reasoning requiring data from multiple sources`

			`The dataset is available on [HuggingFace](https://huggingface.co/datasets/rLLM/rLLM-FinQA-Dataset).`

			`## Tools`

			`The agent uses 4 specialized tools for financial analysis:`

			`\| Tool \| Description \|`
			`\|------\|-------------\|`
			\| `get_table_names` \| List available tables for a given company \|
			\| `get_table_info` \| Get table metadata, columns, dtypes, and sample values \|
			\| `sql_query` \| Execute SQL queries on financial tables (SQLite) \|
			\| `calculator` \| Evaluate mathematical expressions \|

			`## Training`

			`We fine-tune Qwen3-4B-Instruct-2507 using GRPO with LLM-as-judge rewards for correctness evaluation. A more detailed description of the training recipe can be found in our [documentation](https://rllm-project.readthedocs.io/en/latest/projects/finqa/).`

			`## Evaluation`

			`\| Model \| FinQA \| FinQA Reasoning \|`
			`\|-------\|-------\|-----------------\|`
			`\| Qwen3-4B-Instruct-2507 (Base) \| 27.90% \| 13.90% \|`
			`\| gpt-5-nano-2025-08-07 \| 50.00% \| 26.60% \|`
			`\| Qwen3-235B-A22B \| 51.37% \| 18.90% \|`
			`\| rLLM-FinQA-4B (Ours) \| 59.70% \| 26.60% \|`
			`\| Gemini-2.5-Pro-Preview \| 60.60% \| 34.60% \|`
			`\| GPT-4.1-2025-04-14 \| 62.70% \| 37.90% \|`
			`\| o3-mini-2025-01-31 \| 63.79% \| 30.37% \|`


			`## Serving FinQA`

			`Start a vLLM server and run the agent:`

			```bash
			`python -m vllm.entrypoints.openai.api_server \`
			`--model rLLM/rLLM-FinQA-4B \`
			`--host 0.0.0.0 \`
			`--port 30000 \`
			`--dtype bfloat16`

			`python -m projects.finqa.run_finqa`
			```

			`For detailed setup instructions, see the [project README](https://github.com/rllm-org/rllm/tree/main/projects/finqa).`

			`## Acknowledgement`

			`- This is a joint collaboration between the [rLLM](https://github.com/rllm-org/rllm) team at UC Berkeley and [Snorkel AI](https://snorkel.ai/).`
			- Our model is trained on top of [`Qwen3-4B-Instruct-2507`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
			`- Our work is done as part of [Berkeley Sky Computing Lab](https://skycomputing.berkeley.edu/).`

			`## Citation`

			```bibtex
			`@misc{rllm2026finqa,`
			`title={FinQA: Training Financial Agents with Reinforcement Learning},`
			`author={Manan Roongta and Sijun Tan and Bhavishya Pohani and Charles Dickens and Christopher Glaze},`
			`year={2026},`
			`howpublished={\url{https://rllm-project.com/post.html?post=finqa.md}},`
			`note={Blog Post}`
			`}`
			```