---
license: mit
language:
- ja
library_name: transformers
pipeline_tag: text-generation
tags:
- gpt_neox
- gpt-neox
- japanese
inference:
  parameters:
    max_new_tokens: 32
    do_sample: false
    repetition_penalty: 1.1
---

# stockmark/gpt-neox-japanese-1.4b

This repository provides a GPT-NeoX-based model with 1.4B parameters, pre-trained on a Japanese corpus of about 20B tokens. The model was developed by [Stockmark Inc.](https://stockmark.co.jp/)

## How to use

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use torch.bfloat16 on GPUs that support it (e.g. A100) and torch.float16 on older GPUs
torch_dtype = torch.bfloat16 if torch.cuda.is_available() and hasattr(torch.cuda, "is_bf16_supported") and torch.cuda.is_bf16_supported() else torch.float16

# Load the model and tokenizer; device_map="auto" places the weights on the available device(s)
model = AutoModelForCausalLM.from_pretrained("stockmark/gpt-neox-japanese-1.4b", device_map="auto", torch_dtype=torch_dtype)
tokenizer = AutoTokenizer.from_pretrained("stockmark/gpt-neox-japanese-1.4b")

# Tokenize the prompt and move the tensors to the model's device
inputs = tokenizer("自然言語処理は", return_tensors="pt").to(model.device)
with torch.no_grad():
    tokens = model.generate(
        **inputs,
        max_new_tokens=128,
        repetition_penalty=1.1
    )

# Decode the generated token IDs back into text
output = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(output)
```

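As a rough guide to what the half-precision load above needs, the weights alone take about two bytes per parameter. The sketch below is back-of-the-envelope arithmetic, assuming exactly 1.4B parameters (the rounded figure from this card). It ignores activations, the KV cache, and framework overhead, so real usage will be higher. The names `N_PARAMS`, `BYTES_PER_PARAM`, and `weight_memory_gb` are illustrative only.

```python
# Rough weight-memory estimate; illustrative arithmetic, not a measurement.
N_PARAMS = 1.4e9  # assumption: rounded parameter count from this model card

# Storage size of one parameter for each dtype, in bytes
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Return the size of the weights alone in GB (10**9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(N_PARAMS, dtype):.1f} GB")
```

Under these assumptions, the float16 or bfloat16 weights come to roughly 2.8 GB, half of what float32 would need.
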
## Example

- LoRA tuning: https://huggingface.co/stockmark/gpt-neox-japanese-1.4b/blob/main/notebooks/LoRA.ipynb

## Training dataset

- Japanese Web Corpus (ja): 8.6B tokens (This dataset will not be released.)
- Wikipedia (ja): 0.88B tokens
- CC100 (ja): 10.5B tokens

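The three figures above add up to the "about 20B tokens" quoted at the top of this card. The short check below copies the counts from the list and also prints each corpus's share by raw token count. This card does not say whether the corpora were reweighted or repeated during training, so the shares describe the data sizes only. The variable names are illustrative.

```python
# Token counts in billions, copied from the list above.
corpus_tokens_b = {
    "Japanese Web Corpus (ja)": 8.6,
    "Wikipedia (ja)": 0.88,
    "CC100 (ja)": 10.5,
}

total_b = sum(corpus_tokens_b.values())
print(f"total: {total_b:.2f}B tokens")  # total: 19.98B tokens

# Share of each corpus by raw token count (not necessarily the sampling ratio)
for name, tokens_b in corpus_tokens_b.items():
    print(f"{name}: {tokens_b / total_b:.1%}")
```
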
## Training setting

- Trained using Hugging Face Trainer and DeepSpeed (ZeRO-2)
- 8 A100 GPUs (40GB) at ABCI
- Mixed precision (BF16)

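The actual training configuration is not published in this card. As a minimal sketch of what a ZeRO-2 setup with BF16 mixed precision can look like, the snippet below builds a DeepSpeed config as a Python dict and serializes it to JSON. Every value here is an assumption chosen for illustration, not the authors' settings; the `"auto"` entries rely on the Hugging Face Trainer integration filling them in from its own arguments.

```python
import json

# Hypothetical ZeRO-2 + BF16 DeepSpeed config; NOT the authors' actual settings.
ds_config = {
    "bf16": {"enabled": True},          # BF16 mixed precision
    "zero_optimization": {"stage": 2},  # ZeRO stage 2: shard optimizer states and gradients
    # "auto" lets the Hugging Face Trainer derive these from its TrainingArguments
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "train_batch_size": "auto",
}

# Serialize to JSON so it can be saved to a file and handed to the Trainer
config_json = json.dumps(ds_config, indent=2)
print(config_json)
```

Saved to a file (for example a hypothetical `ds_config_zero2.json`), a config like this can be passed to the Trainer through the `deepspeed` argument of `TrainingArguments`, with the training script started by the `deepspeed` launcher on 8 GPUs.
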
## License

[The MIT license](https://opensource.org/licenses/MIT)

## Developed by

[Stockmark Inc.](https://stockmark.co.jp/)

## Author

[Takahiro Omi](https://huggingface.co/omitakahiro)