---
library_name: transformers
tags: []
---

# Llama 3.1 8B Vanilla

This is the [**Llama 3.1 8B**](https://huggingface.co/meta-llama/Llama-3.1-8B) model fine-tuned as the vanilla (unmodified) baseline, trained and evaluated in the paper [ASIDE: Architectural Separation of Instructions and Data in Language Models](https://openreview.net/forum?id=C81TnwHiRM).

## Model Description

This is the vanilla (unmodified) baseline. It is fine-tuned with the same training data and procedure as the ASIDE models, but without any embedding modification.

## Usage

To use this model, first clone the official [ASIDE Repository](https://github.com/egozverev/aside/tree/main) and follow its installation instructions. Then, inside the repository, run the following code snippet [(also provided here as a script)](https://github.com/egozverev/aside/blob/main/experiments/example.py) to run inference with this model.

```python
import torch
import deepspeed
import json
import os
from huggingface_hub import login
from model_api import CustomModelHandler  # Import your custom handler
from model_api import format_prompt  # Import your prompt formatting function

# Define your instruction and data
instruction_text = "Translate to German."
data_text = "Who is Albert Einstein?"

# Model configuration
hf_token = os.environ["HUGGINGFACE_HUB_TOKEN"]
login(token=hf_token)
embedding_type = "single_emb"
base_model = "meta-llama/Llama-3.1-8B"
model_path = "Embeddings-Collab/llama_3.1_8b_single_emb_emb_SFTv110_from_base_run_11_fix"

# Initialize the model handler
handler = CustomModelHandler(
    model_path,
    base_model,
    base_model,
    model_path,
    None,
    0,
    embedding_type=embedding_type,
    load_from_checkpoint=True
)

# Initialize DeepSpeed inference engine
engine = deepspeed.init_inference(
    model=handler.model,
    mp_size=torch.cuda.device_count(),  # Number of GPUs
    dtype=torch.float16,
    replace_method='auto',
    replace_with_kernel_inject=False
)
handler.model = engine.module

# Load prompt templates
with open("./data/prompt_templates.json", "r") as f:
    templates = json.load(f)
template = templates[0]

# Wrap the instruction and the data in the system and user roles of the template
instruction_text = format_prompt(instruction_text, template, "system")
data_text = format_prompt(data_text, template, "user")

# Generate output
output, inp = handler.call_model_api_batch([instruction_text], [data_text])
print(output)
```

### Citation

If you use this model, please cite our paper:

```
@inproceedings{
zverev2026aside,
title={{ASIDE}: Architectural Separation of Instructions and Data in Language Models},
author={Egor Zverev and Evgenii Kortukov and Alexander Panfilov and Alexandra Volkova and Rush Tabesh and Sebastian Lapuschkin and Wojciech Samek and Christoph H. Lampert},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=C81TnwHiRM}
}
```