初始化项目,由ModelHub XC社区提供模型
Model: benchang1110/Taiwan-tinyllama-v1.1-base Source: Original Platform
This commit is contained in:
65
README.md
Normal file
65
README.md
Normal file
@@ -0,0 +1,65 @@
|
||||
---
|
||||
base_model:
|
||||
- TinyLlama/TinyLlama_v1.1
|
||||
datasets:
|
||||
- benchang1110/Taiwan-pretrain-9B
|
||||
- benchang1110/Taiwan-book-1B
|
||||
language:
|
||||
- zh
|
||||
library_name: transformers
|
||||
license: apache-2.0
|
||||
---
|
||||
|
||||
# Model Card for Model ID
|
||||
|
||||

|
||||
This is a continue-pretrained version of [Tinyllama-v1.1](TinyLlama/TinyLlama_v1.1) tailored for traditional Chinese. The continue-pretraining dataset contains over 10B tokens. Using bfloat16, the VRAM required during inference is only around 3GB!!!
|
||||
|
||||
# Usage
|
||||
**This is a causal language model not a chat model !** It is not designed to generate human-like responses. It is designed to generate text based on previous text.
|
||||
|
||||
```python
|
||||
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
|
||||
import torch
|
||||
from transformers import TextStreamer
|
||||
|
||||
def generate_response(input):
|
||||
'''
|
||||
simple test for the model
|
||||
'''
|
||||
# tokenzize the input
|
||||
tokenized_input = tokenizer.encode_plus(input, return_tensors='pt').to(device)
|
||||
print(tokenized_input['input_ids'])
|
||||
# generate the response
|
||||
_ = model.generate(
|
||||
input_ids=tokenized_input['input_ids'],
|
||||
attention_mask=tokenized_input['attention_mask'],
|
||||
pad_token_id=tokenizer.pad_token_id,
|
||||
do_sample=True,
|
||||
repetition_penalty=1.0,
|
||||
max_length=2048,
|
||||
streamer=streamer,
|
||||
)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
device = 'cuda' if torch.cuda.is_available() else 'cpu'
|
||||
model = AutoModelForCausalLM.from_pretrained("benchang1110/Taiwan-tinyllama-v1.1-base",attn_implementation="flash_attention_2",device_map=device,torch_dtype=torch.bfloat16)
|
||||
tokenizer = AutoTokenizer.from_pretrained("benchang1110/Taiwan-tinyllama-v1.1-base",use_fast=True)
|
||||
streamer = TextStreamer(tokenizer)
|
||||
while(True):
|
||||
text = input("input a simple prompt:")
|
||||
generate_response(text)
|
||||
```
|
||||
|
||||
### Training Procedure
|
||||
|
||||
The following training hyperparameters are used:
|
||||
|
||||
| Data size | Global Batch Size | Learning Rate | Epochs | Max Length | Weight Decay |
|
||||
|--------------|-------------------|---------------|--------|------------|--------------|
|
||||
| 10B | 32 | 5e-5 | 1 | 2048 | 1e-4 |
|
||||
|
||||

|
||||
### Compute Infrastructure
|
||||
1xA100(80GB), took approximately 200 GPU hours.
|
||||
Reference in New Issue
Block a user