---
language:
- en
- vi
library_name: transformers
tags:
- chat
- llama
- finetune
- peft
base_model: duyhv1411/Llama-3.2-1B-en-vi
model_name: Llama-3.2-1B-en-vi
pipeline_tag: text-generation
inference: false
---

# duyhv1411/Llama-3.2-1B-en-vi

This model is a fine-tuned version of `meta-llama/Llama-3.2-1B-Instruct`, trained to improve its general-domain chat capabilities in English and Vietnamese.

# How to use

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "duyhv1411/Llama-3.2-1B-en-vi"

# Option 1: load the model and tokenizer directly and call generate()
merged_model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

chat = [{"role": "user", "content": "Cách tính lương gross?"}]  # "How is gross salary calculated?"

# Apply the chat template and tokenize in one step so the BOS token is not added twice
tokenized_chat = tokenizer.apply_chat_template(
    chat, tokenize=True, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(merged_model.device)
outputs = merged_model.generate(**tokenized_chat, max_new_tokens=1024, do_sample=True, temperature=0.9)

# Decode only the newly generated tokens, skipping the prompt
prompt_len = tokenized_chat["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))

# Option 2: use a pipeline as a high-level helper (reuses the model loaded above)
pipe = pipeline(task="text-generation", model=merged_model, tokenizer=tokenizer, return_full_text=False)
# Passing the message list lets the pipeline apply the chat template itself
print(pipe(chat, max_new_tokens=1024, do_sample=True, temperature=0.9)[0]["generated_text"])
```
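
The direct `generate` call returns the prompt tokens followed by the newly generated ones, which is why the example above slices the output before decoding. The following is a minimal sketch of that slicing step, using plain Python lists as stand-ins for token ID tensors. The helper name `strip_prompt` and the token IDs are hypothetical and chosen only for illustration.

```python
def strip_prompt(output_ids, prompt_ids):
    """Return only the token IDs that follow the prompt (hypothetical helper)."""
    return output_ids[len(prompt_ids):]


# Made-up token IDs for illustration; real IDs come from the tokenizer
prompt_ids = [101, 7, 42]
output_ids = [101, 7, 42, 9, 13, 2]

new_ids = strip_prompt(output_ids, prompt_ids)
print(new_ids)
```

The same indexing works on a PyTorch tensor row, since slicing from the prompt length onward keeps only the continuation.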