---
library_name: transformers
license: apache-2.0
datasets:
- pythainlp/han-instruct-dataset-v2.0
language:
- th
pipeline_tag: text-generation
---
# Model Card for Han LLM 7B v3

Han LLM 7B v3 is a model trained on the han-instruct-dataset v2.0 and additional data. The model works with Thai.
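The instruction data is available on the Hugging Face Hub. As a quick way to inspect it, here is a minimal sketch using the 🤗 `datasets` library; the `train` split name is an assumption about the dataset layout:

```python
# Minimal sketch for inspecting the instruction data.
# The "train" split name is an assumption about the dataset layout.
from datasets import load_dataset

dataset = load_dataset("pythainlp/han-instruct-dataset-v2.0", split="train")
print(dataset)     # columns and row count
print(dataset[0])  # one instruction/response example
```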
Base model: [scb10x/typhoon-7b](https://huggingface.co/scb10x/typhoon-7b)

[Google Colab: Demo Han LLM 7B v3](https://colab.research.google.com/drive/1eC3dIWjBgM2v_UyCopMLawvqqcnQFvmI?usp=sharing)

Thank you Kaggle for the free GPU!
## Model Details

### Model Description

The model was fine-tuned with LoRA.
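For readers unfamiliar with LoRA, a minimal sketch of what such a setup can look like with the 🤗 PEFT library; the rank, alpha, and target modules below are illustrative assumptions, not the configuration actually used to train this model:

```python
# Illustrative only: hyperparameters and target modules are assumptions,
# not the configuration used to train Han LLM 7B v3.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("scb10x/typhoon-7b")
lora_config = LoraConfig(
    r=16,                                 # assumed LoRA rank
    lora_alpha=32,                        # assumed scaling factor
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices train
```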
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

- **Developed by:** Wannaphong Phatthiyaphaibun
- **Model type:** text-generation
- **Language(s) (NLP):** Thai
- **License:** apache-2.0
- **Finetuned from model:** [scb10x/typhoon-7b](https://huggingface.co/scb10x/typhoon-7b)
## Uses

The model is intended for Thai-speaking users and Thai-language text generation.
### Out-of-Scope Use

Math, coding, and languages other than Thai.
## Bias, Risks, and Limitations

The model may carry biases from its training dataset. Use at your own risk!
## How to Get Started with the Model

Use the code below to get started with the model.

**Example**
```python
# !pip install accelerate sentencepiece transformers bitsandbytes
import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="wannaphong/han-llm-7b-v3", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {"role": "user", "content": "แมวคืออะไร"},  # "What is a cat?"
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=300, do_sample=True, temperature=0.9, top_k=50, top_p=0.95, no_repeat_ngram_size=2, typical_p=1.0)
print(outputs[0]["generated_text"])
```
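The install line above includes `bitsandbytes`, which the pipeline example does not use directly. If GPU memory is limited, loading the model in 4-bit is one option; a minimal sketch, with assumed (not author-recommended) quantization settings:

```python
# A minimal 4-bit loading sketch using bitsandbytes; the settings below
# are common defaults, not a configuration recommended by the author.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # matches the dtype in the example above
)
tokenizer = AutoTokenizer.from_pretrained("wannaphong/han-llm-7b-v3")
model = AutoModelForCausalLM.from_pretrained(
    "wannaphong/han-llm-7b-v3",
    quantization_config=bnb_config,
    device_map="auto",
)
```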