---
license: cc-by-4.0
library_name: transformers
base_model: Qwen/Qwen3-4B
tags:
- insight-anticipation
- scientific-literature
---

# GIANTS-4B

GIANTS-4B is a language model for **insight anticipation** from scientific literature, introduced in the paper:

> **GIANTS: Generative Insight Anticipation from Scientific Literature**

Given summaries of two parent papers, GIANTS-4B generates the key insight of a downstream paper that builds on both parent papers. It is fine-tuned from [Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B).

## Training Data

This model was trained on [GiantsBench-train](https://huggingface.co/datasets/giants2026/GiantsBench-train).

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from datasets import load_dataset

# Load model and tokenizer
model_name = "giants2026/GIANTS-4B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Load a sample prompt from the GiantsBench test set
dataset = load_dataset("giants2026/GiantsBench-test", split="train")
query = dataset[0]["query"]

# Format as chat messages
messages = [{"role": "user", "content": query}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Generate
output = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=True,  # enable sampling so the settings below take effect
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    min_p=0.0,
)

# Decode only the newly generated tokens (skip the prompt) and print the insight
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True)
print(response)
```
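
The base model, Qwen3-4B, can emit a reasoning segment wrapped in `<think>...</think>` before its answer, so the decoded `response` may contain such a segment ahead of the insight. Whether GIANTS-4B keeps this behavior is an assumption here, not something this card states. The following is a minimal sketch of a hypothetical helper, `split_thinking`, that separates the two parts with plain string handling and returns the text unchanged when no closing tag is present. The sample string is made up for illustration.

```python
def split_thinking(text: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Split decoded output into (reasoning, answer).

    Assumption: the model may wrap its reasoning in <think>...</think>,
    as the Qwen3 base model can. If the closing tag is absent, the
    reasoning part is returned as an empty string.
    """
    head, sep, tail = text.rpartition(close_tag)
    if not sep:
        # rpartition leaves the whole string in `tail` when the tag is missing
        return "", text.strip()
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


# Made-up string standing in for `response` from the usage example
sample = "<think>\nBoth parents study X.\n</think>\n\nThe downstream paper combines X and Y."
reasoning, insight = split_thinking(sample)
print(insight)
```

In practice, `split_thinking(response)` could be called on the decoded output to keep only the insight text for downstream comparison.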

## Evaluation

See [GiantsBench-test](https://huggingface.co/datasets/giants2026/GiantsBench-test) for the evaluation benchmark.

## License

This model is released under the [Creative Commons Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/) license.