---
library_name: transformers
license: llama3.2
base_model: unsloth/Llama-3.2-3B-Instruct
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: text2graph-llama-3.2-3b
  results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# text2graph-llama-3.2-3b
This model is a fine-tuned version of [unsloth/Llama-3.2-3B-Instruct](https://huggingface.co/unsloth/Llama-3.2-3B-Instruct), trained on a text2triple dataset curated by Sonnet-3.5.
Inference is much faster than with our previously trained [T5-based model](https://huggingface.co/pat-jj/text2triple-flan-t5), and the model performs better on longer inputs (> 512 tokens).
# Example Input:
"William Gerald Standridge (November 27, 1953 – April 12, 2014) was an American stock car racing driver. He was a competitor in the NASCAR Winston Cup Series and Busch Series."
# Example Output:
(S> William gerald standridge| P> Nationality| O> American),
\
(S> William gerald standridge| P> Occupation| O> Stock car racing driver),
\
(S> William gerald standridge| P> Competitor| O> Busch series),
\
(S> William gerald standridge| P> Competitor| O> Nascar winston cup series),
\
(S> William gerald standridge| P> Birth date| O> November 27, 1953),
\
(S> William gerald standridge| P> Death date| O> April 12, 2014)
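
The output format above is regular enough to parse with a regular expression. The following is a minimal sketch, not part of the model's own tooling: `TRIPLE_RE` and `parse_triples` are hypothetical names introduced here, and the pattern assumes that every triple sits on one line and that the object contains no closing parenthesis.

```python
import re

# Pattern for one triple in the format shown above: (S> subject| P> predicate| O> object)
# ASSUMPTION: the object text contains no ")" and a triple does not span lines.
TRIPLE_RE = re.compile(r"\(S>\s*(.*?)\s*\|\s*P>\s*(.*?)\s*\|\s*O>\s*(.*?)\s*\)")


def parse_triples(text):
    """Return a list of (subject, predicate, object) tuples found in `text`."""
    return [tuple(match.groups()) for match in TRIPLE_RE.finditer(text)]


example_output = (
    "(S> William gerald standridge| P> Nationality| O> American),\n"
    "(S> William gerald standridge| P> Birth date| O> November 27, 1953)"
)
triples = parse_triples(example_output)
for subject, predicate, obj in triples:
    print(f"{subject} --[{predicate}]--> {obj}")
```

Text that does not match the pattern is skipped silently, so a caller that needs to detect malformed output should compare the number of parsed triples against the number of `(S>` occurrences.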
# How to Use?
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "pat-jj/text2graph-llama-3.2-3b"


def load_model_and_tokenizer():
    # Load the model in bfloat16; device_map="auto" places it on the available device(s)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.bfloat16,
        device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # The prompt is built with the tokenizer's chat template, so fail early if it is missing
    if not tokenizer.chat_template:
        raise ValueError(f"Tokenizer for {model_name} has no chat template")
    return model, tokenizer


def generate_triples(model, tokenizer, input_text, max_length=2048):
    # Format the input using the chat template
    messages = [{
        "role": "user",
        "content": f"Convert the following text to triples:\n\nText: {input_text}"
    }]
    prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    # Tokenize the input
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Fall back to the EOS token for padding when the tokenizer defines no pad token
    pad_token_id = tokenizer.pad_token_id
    if pad_token_id is None:
        pad_token_id = tokenizer.eos_token_id
    # Generate the response; max_length counts prompt tokens plus generated tokens
    outputs = model.generate(
        **inputs,
        max_length=max_length,
        num_return_sequences=1,
        temperature=0.7,
        do_sample=True,
        pad_token_id=pad_token_id
    )
    # Decode only the newly generated tokens, not the echoed prompt
    prompt_length = inputs["input_ids"].shape[1]
    response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
    return response


def main():
    print("Loading model and tokenizer...")
    model, tokenizer = load_model_and_tokenizer()
    print("\nModel loaded! Enter text to convert to triples (type 'quit' to exit):")
    while True:
        user_input = input("\nEnter text: ")
        if user_input.lower() == 'quit':
            break
        print("\nGenerating triples...")
        response = generate_triples(model, tokenizer, user_input)
        print("\nResponse:", response)


if __name__ == "__main__":
    main()
```
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 3
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 3
- gradient_accumulation_steps: 3
- total_train_batch_size: 27
- total_eval_batch_size: 3
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
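
The totals in the list follow from the per-device values, and the scheduler settings describe a linear warmup followed by a cosine decay. The sketch below reproduces the batch-size arithmetic and illustrates the general shape of such a schedule. It is not taken from the training code: `lr_at_step` is a hypothetical helper, the decay is assumed to end at zero, and the total step count is invented for illustration because the card does not state the dataset size.

```python
import math

# Values taken from the hyperparameter list above
per_device_train_batch_size = 3
per_device_eval_batch_size = 1
num_devices = 3
gradient_accumulation_steps = 3
peak_lr = 2e-5
warmup_ratio = 0.1

# Effective batch sizes: gradient accumulation applies to training only
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
total_eval_batch_size = per_device_eval_batch_size * num_devices
print(total_train_batch_size, total_eval_batch_size)  # 27 3


def lr_at_step(step, total_steps):
    """Linear warmup to peak_lr, then half-cosine decay to zero (assumed shape)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# ASSUMPTION: 1000 optimizer steps is a made-up figure for illustration only.
total_steps = 1000
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at_step(step, total_steps))
```

With these assumptions the learning rate climbs from zero to 2e-05 over the first 10% of the steps and then decays smoothly back toward zero by the end of the single epoch.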
### Framework versions
- Transformers 4.48.1
- Pytorch 2.1.2+cu121
- Datasets 2.21.0
- Tokenizers 0.21.0
## **Cite Our [Paper](https://arxiv.org/abs/2502.10996)**
```
@misc{jiang2025rasretrievalandstructuringknowledgeintensivellm,
      title={RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation},
      author={Pengcheng Jiang and Lang Cao and Ruike Zhu and Minhao Jiang and Yunyi Zhang and Jimeng Sun and Jiawei Han},
      year={2025},
      eprint={2502.10996},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.10996},
}
```