---
language:
- en
- vi
library_name: transformers
tags:
- chat
- llama
- finetune
- peft
base_model: duyhv1411/Llama-3.2-1B-en-vi
model_name: Llama-3.2-1B-en-vi
pipeline_tag: text-generation
inference: false
---
# duyhv1411/Llama-3.2-1B-en-vi

This model is a fine-tuned version of `meta-llama/Llama-3.2-1B-Instruct`, trained to improve its general-domain chat capabilities in English and Vietnamese.

# How to use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load the model and tokenizer directly
model_id = "duyhv1411/Llama-3.2-1B-en-vi"
merged_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

chat = [{"role": "user", "content": "Cách tính lương gross?"}]  # "How is gross salary calculated?"

# Render the chat template to a prompt string; both examples below reuse it
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# Assuming the chat template already emits the BOS token (as the Llama 3 templates do),
# skip special tokens here so it is not added a second time
tokenized_chat = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt").to(merged_model.device)

outputs = merged_model.generate(tokenized_chat, max_new_tokens=1024, do_sample=True, temperature=0.9)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(outputs[0][tokenized_chat.shape[-1]:], skip_special_tokens=True))

# Use a pipeline as a high-level helper (the model is already placed on its device above)
pipe = pipeline(task="text-generation", model=merged_model, tokenizer=tokenizer, return_full_text=False)
print(pipe(prompt, max_new_tokens=1024, do_sample=True, temperature=0.9)[0]["generated_text"])
```
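
The snippets above pass a single-turn `chat` list to `tokenizer.apply_chat_template`. For a multi-turn conversation, the same list simply grows with alternating `user` and `assistant` entries. Below is a minimal sketch of a helper that assembles such a list. `build_chat` and its parameters are hypothetical names introduced here for illustration; they are not part of `transformers` or of this model's repository. The optional system message assumes the model's chat template accepts a `system` role.

```python
def build_chat(user_message, history=None, system_prompt=None):
    """Assemble a message list in the role/content format used above.

    history: optional list of (user_text, assistant_text) pairs from earlier turns.
    system_prompt: optional system instruction (assumes the chat template
                   accepts a "system" role).
    """
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    # Replay earlier turns in order, user first, then the assistant reply
    for user_text, assistant_text in history or []:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The new user message always comes last so the model answers it
    messages.append({"role": "user", "content": user_message})
    return messages


chat = build_chat(
    "Cách tính lương gross?",
    history=[("Xin chào", "Chào bạn!")],
)
print(chat)
```

The returned list can be passed to `tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)` in the same way as the single-turn `chat` above.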