---
license: mit
datasets:
- hongzhouyu/FineMed-SFT
- hongzhouyu/FineMed-DPO
language:
- en
- zh
base_model:
- meta-llama/Llama-3.1-8B
- hongzhouyu/FineMedLM
library_name: transformers
tags:
- medical
---

<div align="center">
<h1>
  FineMedLM-o1
</h1>
</div>

<div align="center">
<a href="https://github.com/hongzhouyu/FineMed" target="_blank">GitHub</a> | <a href="https://arxiv.org/abs/2501.09213" target="_blank">Paper</a>
</div>

# <span>Introduction</span>
**FineMedLM-o1** is a specialized medical LLM engineered for advanced medical reasoning. It reasons in multiple steps, iteratively reflecting on and refining its thinking before delivering a final response.

For more information, visit our <a href="https://github.com/hongzhouyu/FineMed" target="_blank">GitHub repository</a>.

# <span>Usage</span>
You can use FineMedLM-o1 in the same way as `Llama-3.1-8B-Instruct`:

(⚠️**Note**: For better inference results, use the system prompt provided in the example below.)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hugging Face repo id of this model (weights and tokenizer are loaded from it)
main_model_name = "hongzhouyu/FineMedLM-o1"
model = AutoModelForCausalLM.from_pretrained(main_model_name, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(main_model_name)

prompt = (
    """The following are multiple choice questions (with answers) about health. Think step by step and then finish your answer with "the answer is (X)" where X is the correct letter choice.


Question:
Polio can be eradicated by which of the following?
Options:
A. Herbal remedies
B. Use of antibiotics
C. Regular intake of vitamins
D. Administration of tetanus vaccine
E. Attention to sewage control and hygiene
F. Natural immunity acquired through exposure
G. Use of antiviral drugs
Answer: Let's think step by step.
"""
)

messages = [
    {"role": "system", "content": """You are a helpful professional doctor. You need to generate an answer based on the given problem and thoroughly explore the problem through a systematic and long-term thinking process to provide a final and accurate solution. This requires a comprehensive cycle of analysis, summary, exploration, re-evaluation, reflection, backtracking and iteration to form a thoughtful thinking process. Use the background information provided in the text to assist in formulating the answer. Follow these answer guidelines:
1. Please structure your response into two main sections: **Thought** and **Summarization**.
2. During the **Thought** phase, think step by step based on the given text content. If the text content is used, it must be expressed.
3. During the **Summarization** phase, based on the thinking process in the thinking phase, give the final answer to the question.
Here is the question: """},
    {"role": "user", "content": prompt}
]

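# Render the chat messages into one prompt string using the tokenizer's chat template
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(text)

model_inputs = tokenizer(text, return_tensors="pt").to(model.device)

print("-----start generate-----")
generated_ids = model.generate(
    **model_inputs,  # pass input_ids together with the attention mask
    max_new_tokens=2048,
    eos_token_id=tokenizer.eos_token_id
)

# Decode the full sequence (prompt + completion); special tokens are kept so turn boundaries stay visible
answer = tokenizer.decode(generated_ids[0], skip_special_tokens=False)
print(answer)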
```
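
When many questions are run through the model, the prompt layout used in the example above can be assembled programmatically. The following is a minimal sketch and not part of the released code: `HEADER` and `build_prompt` are hypothetical names, and the layout simply mirrors the example prompt (instruction line, two blank lines, question, lettered options, trailing answer cue).

```python
from string import ascii_uppercase
from typing import Sequence

# Instruction line copied from the example prompt above.
HEADER = (
    "The following are multiple choice questions (with answers) about health. "
    'Think step by step and then finish your answer with "the answer is (X)" '
    "where X is the correct letter choice."
)


def build_prompt(question: str, options: Sequence[str]) -> str:
    """Assemble a multiple-choice prompt in the layout of the example (hypothetical helper)."""
    if len(options) > len(ascii_uppercase):
        raise ValueError("at most 26 options can be lettered A-Z")
    lines = [HEADER, "", "", "Question:", question, "Options:"]
    # Letter the options A, B, C, ... in the order given.
    lines += [f"{letter}. {option}" for letter, option in zip(ascii_uppercase, options)]
    lines.append("Answer: Let's think step by step.")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_prompt("Polio can be eradicated by which of the following?",
                       ["Herbal remedies", "Use of antibiotics"]))
```

The returned string can be used as the `content` of the `user` message in place of the hard-coded `prompt`.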

FineMedLM-o1 adopts a *slow-thinking* approach, with outputs formatted as:

```
**Thought**
[Reasoning process]

**Summarization**
[Output]
```
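
For automatic scoring, the two sections can be separated with plain string handling. This is a minimal sketch, assuming the markers appear literally as `**Thought**` and `**Summarization**` and that the summary follows the `the answer is (X)` convention requested in the prompt. `split_response` and `extract_choice` are hypothetical helper names, and the sample string is hand-written rather than real model output.

```python
import re
from typing import Optional, Tuple

THOUGHT_MARK = "**Thought**"
SUMMARY_MARK = "**Summarization**"


def split_response(text: str) -> Tuple[str, str]:
    """Return (thought, summary); uses the last markers so an echoed prompt is skipped."""
    s_idx = text.rfind(SUMMARY_MARK)
    if s_idx == -1:
        # No summary marker found: treat the whole text as the summary.
        return "", text.strip()
    t_idx = text.rfind(THOUGHT_MARK, 0, s_idx)
    start = t_idx + len(THOUGHT_MARK) if t_idx != -1 else 0
    thought = text[start:s_idx].strip()
    summary = text[s_idx + len(SUMMARY_MARK):].strip()
    return thought, summary


def extract_choice(summary: str) -> Optional[str]:
    """Return the letter from 'the answer is (X)', or None if the phrase is absent."""
    match = re.search(r"the answer is \(([A-Z])\)", summary, re.IGNORECASE)
    return match.group(1).upper() if match else None


if __name__ == "__main__":
    # Hand-written sample in the documented format (not real model output).
    sample = "**Thought**\nFirst step. Second step.\n\n**Summarization**\nThe answer is (B)."
    thought, summary = split_response(sample)
    print(thought)
    print(extract_choice(summary))
```

Searching from the end of the string is a deliberate choice: the system prompt itself mentions both markers, so the helper still finds the model's own sections when the decoded text includes the prompt.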

# <span>Citation</span>
```
@misc{yu2025finemedlmo1enhancingmedicalreasoning,
    title={FineMedLM-o1: Enhancing the Medical Reasoning Ability of LLM from Supervised Fine-Tuning to Test-Time Training}, 
    author={Hongzhou Yu and Tianhao Cheng and Ying Cheng and Rui Feng},
    year={2025},
    eprint={2501.09213},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2501.09213}, 
}
```
Initial project commit; model provided by the ModelHub XC community. Model: hongzhouyu/FineMedLM-o1. Source: Original Platform. 2026-05-05 14:31:42 +08:00