---
library_name: transformers
license: apache-2.0
language:
- ja
base_model:
- Qwen/Qwen2.5-32B-Instruct
- abeja/ABEJA-Qwen2.5-32b-Japanese-v0.1
pipeline_tag: text-generation
---
## ABEJA-Qwen2.5-32b-Japanese-v1.0
`ABEJA-Qwen2.5-32b-Japanese-v1.0` is a model obtained by post-training `abeja/ABEJA-Qwen2.5-32b-Japanese-v0.1` with SFT and DPO. The v0.1 model was itself built from `Qwen/Qwen2.5-32B-Instruct` by continued pretraining focused primarily on Japanese.
For details, see the ABEJA tech blog (in Japanese):
https://tech-blog.abeja.asia/entry/geniac2-qwen25-32b-v1.0
## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "abeja/ABEJA-Qwen2.5-32b-Japanese-v1.0"

# Load the model in its native dtype and spread it across available devices
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "人とAIが協調するためには"
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Render the chat messages into the model's expected prompt format
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Drop the prompt tokens so only the newly generated tokens remain
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
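The list comprehension above slices the prompt tokens off each generated sequence before decoding, since `model.generate` returns the prompt followed by the new tokens. A minimal stand-alone sketch of that trimming step, using made-up token ids in place of real tokenizer output:

```python
# Dummy token-id sequences standing in for model_inputs.input_ids and
# the output of model.generate (prompt tokens followed by new tokens)
input_ids_batch = [[101, 5, 7], [101, 9]]
generated_batch = [[101, 5, 7, 42, 43], [101, 9, 77]]

# Keep only the tokens generated after each prompt
new_tokens = [
    out[len(inp):] for inp, out in zip(input_ids_batch, generated_batch)
]
print(new_tokens)  # → [[42, 43], [77]]
```

Without this step, `batch_decode` would echo the prompt and chat-template markup back as part of the response.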
## Developers
- Hiroshi Kiyota
- Keisuke Fujimoto
- Kentaro Nakanishi
- Kyo Hattori
- Shinya Otani
- Shogo Muranushi
- Takuma Kume
- Tomoki Manabe
(*) Listed in alphabetical order