---
base_model: MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.2
library_name: transformers
tags:
- axolotl
- finetune
- facebook
- meta
- pytorch
- llama
- llama-3
language:
- en
pipeline_tag: text-generation
license: other
license_name: llama3
license_link: LICENSE
inference: false
model_creator: MaziyarPanahi
model_name: Llama-3-8B-Instruct-v0.1
quantized_by: MaziyarPanahi
---

<img src="./llama-3-merges.webp" alt="Llama-3 DPO Logo" width="500" style="margin-left:'auto' margin-right:'auto' display:'block'"/>

# Llama-3-8B-Instruct-v0.1

This model builds on the `MaziyarPanahi/Llama-3-8B-Instruct-DPO` series.

# Quantized GGUF

All GGUF models are available here: [MaziyarPanahi/Llama-3-8B-Instruct-v0.1-GGUF](https://huggingface.co/MaziyarPanahi/Llama-3-8B-Instruct-v0.1-GGUF)
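
To run one of the quants locally, a minimal sketch with `huggingface_hub` and `llama-cpp-python` follows; the quant filename below is an assumption, so check the GGUF repository's file list for the names actually published.

```python
# Download one GGUF quant and chat with it via llama-cpp-python.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="MaziyarPanahi/Llama-3-8B-Instruct-v0.1-GGUF",
    filename="Llama-3-8B-Instruct-v0.1.Q4_K_M.gguf",  # hypothetical filename
)

llm = Llama(model_path=gguf_path, n_ctx=4096)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Who are you?"}]
)
print(out["choices"][0]["message"]["content"])
```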

# Prompt Template

This model uses the `ChatML` prompt template:

```
<|im_start|>system
{System}
<|im_end|>
<|im_start|>user
{User}
<|im_end|>
<|im_start|>assistant
{Assistant}
```
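
For reference, here is a minimal sketch of expanding this template by hand for a single turn; the `chatml_prompt` helper is hypothetical, and in practice you should prefer the tokenizer's `apply_chat_template` shown below:

```python
# Hand-rolled ChatML formatting; prefer tokenizer.apply_chat_template,
# which uses the template shipped with the model.
def chatml_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}\n<|im_end|>\n"
        f"<|im_start|>user\n{user}\n<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(chatml_prompt("You are a pirate chatbot.", "Who are you?"))
```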

# How to use

You can use this model by passing `MaziyarPanahi/Llama-3-8B-Instruct-v0.1` as the model name to Hugging Face's `transformers` library.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
from transformers import pipeline
import torch

model_id = "MaziyarPanahi/Llama-3-8B-Instruct-v0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
    # attn_implementation="flash_attention_2"
)

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True
)

# Stream tokens to stdout as they are generated
streamer = TextStreamer(tokenizer)

# Name the pipeline object `pipe` so it does not shadow the imported
# `pipeline` factory function
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.bfloat16},
    streamer=streamer
)

# Then you can use the pipeline to generate text.

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Render the chat into a single prompt string using the model's chat template
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Stop generation on any of the known end-of-turn tokens
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|im_end|>"),
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = pipe(
    prompt,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)

# Drop the prompt from the returned text and print only the completion
print(outputs[0]["generated_text"][len(prompt):])
```
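
If the model does not fit in GPU memory in `bfloat16`, a common alternative is 4-bit quantized loading via `bitsandbytes`; the settings below are common defaults, not values recommended by the model author:

```python
# A minimal 4-bit loading sketch (requires the bitsandbytes package);
# quantization settings here are illustrative defaults.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "MaziyarPanahi/Llama-3-8B-Instruct-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
```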