Model: ISTA-MLCV/Llama_3.1_8b_single_emb
Source: Original Platform
2026-06-16 22:54:20 +08:00


---
library_name: transformers
tags: []
---
# Llama 3.1 8B Vanilla
This is the [**Llama 3.1 8B**](https://huggingface.co/meta-llama/Llama-3.1-8B) model fine-tuned as the vanilla (unmodified) baseline, trained and evaluated in the paper [ASIDE: Architectural Separation of Instructions and Data in Language Models](https://openreview.net/forum?id=C81TnwHiRM).
## Model Description
This model serves as the vanilla baseline: it is fine-tuned on the same training data and with the same procedure as the other models evaluated in the paper, but without any embedding modification.
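To make "without any embedding modification" concrete, here is a toy sketch. It is not the repository's code. It assumes the phrase means that one shared embedding table is looked up the same way for instruction tokens and data tokens, so a token's vector says nothing about its role. The table sizes, the token ids, and the name `embed_single` are made up for illustration.

```python
import random

random.seed(0)

# Toy sizes chosen for illustration; the real model is far larger.
VOCAB_SIZE, HIDDEN_DIM = 16, 4
EMBEDDING_TABLE = [
    [random.gauss(0.0, 1.0) for _ in range(HIDDEN_DIM)]
    for _ in range(VOCAB_SIZE)
]


def embed_single(token_ids, is_data):
    """Look up embeddings from one shared table (hypothetical helper).

    `is_data` marks which tokens belong to the data segment. The vanilla
    baseline sketched here deliberately ignores it.
    """
    del is_data  # role information is unused at the embedding layer
    return [EMBEDDING_TABLE[token_id] for token_id in token_ids]


# The same token ids, once treated as instruction and once as data.
instruction_vecs = embed_single([3, 7], is_data=[False, False])
data_vecs = embed_single([3, 7], is_data=[True, True])
print(instruction_vecs == data_vecs)
```

In this sketch the two lookups return identical vectors, which is the property that makes the model a useful baseline for methods that do change the embeddings.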
## Usage
To use this model, first clone the official [ASIDE Repository](https://github.com/egozverev/aside/tree/main) and follow its installation instructions.
Then, from inside the repository, run the following snippet [(also provided here as a script)](https://github.com/egozverev/aside/blob/main/experiments/example.py) to run inference with this model.
```python
import torch
import deepspeed
import json
import os
from huggingface_hub import login
from model_api import CustomModelHandler # Import your custom handler
from model_api import format_prompt # Import your prompt formatting function
# Define your instruction and data
instruction_text = "Translate to German."
data_text = "Who is Albert Einstein?"
# Model configuration
hf_token = os.environ["HUGGINGFACE_HUB_TOKEN"]
login(token=hf_token)
embedding_type = "single_emb"
base_model = "meta-llama/Llama-3.1-8B"
model_path = "Embeddings-Collab/llama_3.1_8b_single_emb_emb_SFTv110_from_base_run_11_fix"
# Initialize the model handler
handler = CustomModelHandler(
    model_path,
    base_model,
    base_model,
    model_path,
    None,
    0,
    embedding_type=embedding_type,
    load_from_checkpoint=True
)
# Initialize DeepSpeed inference engine
engine = deepspeed.init_inference(
    model=handler.model,
    mp_size=torch.cuda.device_count(),  # Number of GPUs
    dtype=torch.float16,
    replace_method='auto',
    replace_with_kernel_inject=False
)
handler.model = engine.module
# Load prompt templates
with open("./data/prompt_templates.json", "r") as f:
    templates = json.load(f)
template = templates[0]
instruction_text = format_prompt(instruction_text, template, "system")
data_text = format_prompt(data_text, template, "user")
# Generate output
output, inp = handler.call_model_api_batch([instruction_text], [data_text])
print(output)
```
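The snippet above reads the token with `os.environ["HUGGINGFACE_HUB_TOKEN"]`, which raises a bare `KeyError` when the variable is unset. As an optional convenience, a small pre-flight helper can turn that into a readable error. This is a sketch only: `read_hub_token` is a hypothetical name and is not part of the ASIDE repository, and `hf_example` is a placeholder, not a real token.

```python
import os


def read_hub_token(env=None, var="HUGGINGFACE_HUB_TOKEN"):
    """Return the Hub token from `env`, or raise a readable error.

    `env` defaults to os.environ; passing a plain dict makes the
    helper easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    token = (env.get(var) or "").strip()
    if not token:
        raise RuntimeError(
            f"{var} is not set. Export a Hugging Face access token "
            "before running the inference snippet."
        )
    return token


# Demonstration with a placeholder value instead of the real environment.
print(read_hub_token({"HUGGINGFACE_HUB_TOKEN": "hf_example"}))
```

In the inference snippet, the result would replace the direct lookup, for example `login(token=read_hub_token())`.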
## Citation
If you use this model, please cite our paper:
```
@inproceedings{zverev2026aside,
  title={{ASIDE}: Architectural Separation of Instructions and Data in Language Models},
  author={Egor Zverev and Evgenii Kortukov and Alexander Panfilov and Alexandra Volkova and Rush Tabesh and Sebastian Lapuschkin and Wojciech Samek and Christoph H. Lampert},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=C81TnwHiRM}
}
```