---
license: apache-2.0
base_model:
- ThaiLLM/ThaiLLM-8B
- Qwen/Qwen3-8B
- Qwen/Qwen3-8B-Base
pipeline_tag: text-generation
language:
- en
- th
tags:
- finance
- mergekit
- merge
---

# THaLLE-ThaiLLM: Domain-Specialized Small LLMs for Finance and Thai

## Model Overview

This 8B language model extends ThaiLLM-8B, with a focus on enhancing instruction-following capabilities and financial knowledge. The model is constructed with [mergekit](https://github.com/arcee-ai/mergekit), which integrates ThaiLLM-8B with Qwen3-8B and THaLLE, the latter of which was trained on 80 CFA examination sets.

**THaLLE-0.2-ThaiLLM-8B-fa** has the following features:

- **Supports switching between thinking and non-thinking modes**, similar to [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B).
- **Offers enhanced Thai language understanding** from [ThaiLLM-8B](https://huggingface.co/ThaiLLM/ThaiLLM-8B).
- **Incorporates the financial knowledge and understanding** expected of THaLLE fine-tuning.

## Usage

### Requirements

Since `KBTG-Labs/THaLLE-0.2-ThaiLLM-8B-fa` is fine-tuned from Qwen3-8B, you will need `transformers>=4.51.0`.

### Running using Transformers

Running the script below generates a response to the given input messages.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID: str = "KBTG-Labs/THaLLE-0.2-ThaiLLM-8B-fa"


def inference(messages: list[dict[str, str]], model, tokenizer) -> str:
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,  # Switch between thinking and non-thinking modes.
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
    generated_ids = model.generate(
        model_inputs.input_ids,
        max_new_tokens=768,
        do_sample=False,  # Greedy decoding; sampling parameters are unset.
        temperature=None,
        top_p=None,
        top_k=None,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Strip the prompt tokens, keeping only the newly generated tokens.
    generated_ids = [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response


if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    messages = [{"role": "user", "content": "สวัสดี!"}]
    print(inference(messages, model, tokenizer))
```

## Results

For more details, see our [Technical Report](https://arxiv.org/abs/2601.04597).

| Model                      | M3 Exam   | M6 Exam   | Flare CFA* | IC        |
| -------------------------- | --------- | --------- | ---------- | --------- |
| Non-Thinking               |           |           |            |           |
| `Qwen3-8B`                 | 0.660     | 0.545     | 0.753      | 0.640     |
| `ThaiLLM-8B-Instruct`**    | 0.707     | **0.623** | 0.762      | **0.720** |
| `THaLLE-0.2-ThaiLLM-8B-fa` | **0.725** | 0.572     | **0.771**  | **0.720** |
| Thinking                   |           |           |            |           |
| `Qwen3-8B`                 | 0.706     | 0.590     | 0.806      | 0.600     |
| `ThaiLLM-8B-Instruct`**    | 0.720     | 0.661     | 0.820      | 0.720     |
| `THaLLE-0.2-ThaiLLM-8B-fa` | **0.779** | **0.678** | **0.852**  | **0.840** |

[*] Flare CFA is `"TheFinAI/flare-cfa"`

[**] `"ThaiLLM-8B-Instruct"` is [KBTG-Labs/ThaiLLM-8B-Instruct](https://huggingface.co/KBTG-Labs/ThaiLLM-8B-Instruct)

[vLLM](https://github.com/vllm-project/vllm) was used for evaluations; results might vary.
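When `enable_thinking=True`, Qwen3-style models emit their reasoning inside `<think>…</think>` tags before the final answer. Below is a minimal sketch of a post-processing helper for separating the two; the `split_thinking` name and the plain string handling are our own assumptions, not part of the model's API (Qwen3's official examples locate the closing `</think>` token by token id instead).

```python
def split_thinking(response: str) -> tuple[str, str]:
    """Split a thinking-mode response into (reasoning, final answer).

    Assumes the Qwen3 convention of wrapping reasoning in <think>...</think>.
    In non-thinking mode the tags are absent, so the reasoning part is empty
    and the whole response is returned as the answer.
    """
    marker = "</think>"
    if marker in response:
        thinking, answer = response.split(marker, 1)
        return thinking.replace("<think>", "", 1).strip(), answer.strip()
    return "", response.strip()
```

For example, `split_thinking("<think>2+2=4</think>The answer is 4.")` returns `("2+2=4", "The answer is 4.")`, while a non-thinking response passes through unchanged.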
## Citation

If you find our work useful, please cite:

```
@misc{labs2026thallethaillmdomainspecializedsmallllms,
      title={THaLLE-ThaiLLM: Domain-Specialized Small LLMs for Finance and Thai -- Technical Report},
      author={KBTG Labs and : and Anuruth Lertpiya and Danupat Khamnuansin and Kantapong Sucharitpongpan and Pornchanan Balee and Tawunrat Chalothorn and Thadpong Pongthawornkamol and Monchai Lertsutthiwong},
      year={2026},
      eprint={2601.04597},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2601.04597},
}
```