
---
language:
- en
license: other
base_model: Qwen/Qwen2.5-1.5B-Instruct
library_name: transformers
pipeline_tag: text-generation
tags:
- crca
- causal-reasoning
- qwen2
- 1.5b
- finetuned
---
# CR-CA 1.5B Full Finetune
## Overview
CR-CA (Causal Reasoning and Counterfactual Analysis) is a reasoning-focused stack
targeting structured causal analysis, counterfactuals, and multi-step reasoning.
This 1.5B-parameter model is a full finetune of Qwen2.5-1.5B-Instruct, built on the
Qwen2 architecture (`Qwen2ForCausalLM`) and optimized for CR-CA reasoning.
## Model Details
- **Model type:** `qwen2`
- **Architecture:** `Qwen2ForCausalLM`
- **Hidden size:** `1536`
- **Layers:** `28`
- **Attention heads:** `12` (KV heads: `2`)
- **Max position embeddings:** `32768`
- **Vocab size:** `151936`
- **Dtype:** `float16`
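A minimal loading sketch with `transformers`, assuming the checkpoint is published as a standard repo; the model id below is a placeholder, not the actual repo path:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id -- substitute the actual Hugging Face repo or a local checkpoint path.
model_id = "path/to/CR-CA-1.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the checkpoint dtype listed above
    device_map="auto",
)

messages = [{"role": "user", "content": "If tariffs rise 10%, what happens to import prices?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```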
## Training Summary
This model was produced via full finetuning for CR-CA reasoning. Training metadata
is stored in `training_args.bin`.
Key training parameters:
- **Per-device batch size:** 8
- **Gradient accumulation:** 16
- **Epochs:** 2
- **Learning rate:** 5e-4
- **Precision:** FP16
- **DeepSpeed config:** `training/deepspeed_zero2_1_5b.json`
- **Scheduler:** cosine
- **Warmup steps:** 100
- **Save steps:** 200
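The effective batch size per optimizer step follows from the per-device batch size and gradient accumulation; a quick sanity check (the GPU count is an assumption, shown as a variable):

```python
per_device_batch = 8
grad_accum = 16
num_gpus = 1  # assumption: set to the actual DeepSpeed world size

# Sequences consumed per optimizer step.
effective_batch = per_device_batch * grad_accum * num_gpus
print(effective_batch)  # 128 on a single GPU
```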
## Training Data
The training data uses a prompt/response JSONL format:
```json
{"prompt": "...", "response": "..."}
```
The dataset includes public reasoning data (e.g., GSM8K-style math word problems).
This is used to strengthen multi-step reasoning, structured derivations, and final
answer formatting.
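Each JSONL line can be mapped to a chat-format training example; a stdlib-only sketch (the field names match the format above, the function name is illustrative):

```python
import json

def load_examples(path):
    """Parse prompt/response JSONL into chat-format message lists."""
    examples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:  # skip blank lines
                continue
            record = json.loads(line)
            examples.append([
                {"role": "user", "content": record["prompt"]},
                {"role": "assistant", "content": record["response"]},
            ])
    return examples
```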
## Evaluation Report (Real-World Causal Tasks)
Evaluation was run on 2026-02-01 using GPT-4o-mini over 6 real-world causal tasks.
Overall score: **48.3%**.
Per-task scores:
- Monetary Policy Counterfactual (US Macro 2025): **55/100**
- Tariff Pass-Through and Pricing (Beige Book + Firm Data): **55/100**
- Supply Chain Reroute Counterfactual (Port Disruption): **45/100**
- Inventory & Stockout Causal Impact (Retail): **25/100**
- Inflation Drivers (World Bank CPI Data): **65/100**
- Workforce Training Program (Labor Market Causal Impact): **45/100**
Key strengths observed:
- Clear task framing and attempt at counterfactual reasoning.
- Some identification of confounders and causal factors.
Key limitations observed:
- Inconsistent causal graphs and directional effects.
- Weak counterfactual grounding and numerical reasoning errors.
- Limited depth and rigor on confounder adjustment strategies.
## Intended Use
For causal reasoning, counterfactual analysis, structured CR-CA reasoning prompts,
and multi-step reasoning tasks.
## Generation Settings
Default generation parameters are stored in `generation_config.json`:
- `do_sample`: `true`
- `temperature`: `0.7`
- `top_p`: `0.8`
- `top_k`: `20`
- `repetition_penalty`: `1.1`
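To illustrate what `top_k` and `top_p` do at each decoding step, here is a pure-Python sketch of the combined filtering (not the actual `transformers` implementation, which operates on logit tensors):

```python
def filter_top_k_top_p(probs, top_k=20, top_p=0.8):
    """Keep the top_k most probable tokens, then trim to the smallest set
    whose cumulative probability reaches top_p; renormalize the survivors."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cum = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cum += p
        if cum >= top_p:  # the token crossing the threshold is kept
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}
```

With `top_p = 0.8`, a distribution like `{"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}` is cut after `b` (cumulative 0.8), and sampling proceeds over the renormalized `{a, b}`.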
## Limitations
- Outputs should be validated for factual correctness.
- The model may hallucinate causal claims without evidence.
## License
Follow the base model and dataset licenses used for training. Add your explicit
license here if required.