Model: WestCode1357/gpt-sw3-126m-instruct Source: Original Platform
| language | tags | license | library_name |
|---|---|---|---|
| | | other | transformers |
gpt-sw3-126m-instruct
Smallest GPT-SW3 instruct model (126M parameters). It loads quickly, which makes it well suited to testing and prototyping.
Size: 126M | Type: instruct | Languages: Swedish, Norwegian, Danish, Icelandic, English
Community mirror of AI-Sweden-Models/gpt-sw3-126m-instruct
Warning and Disclaimer
This model is provided as-is for research and educational purposes. It is a community redistribution of AI Sweden's GPT-SW3 under the same modified RAIL license.
You are responsible for any content you create using this model. Use responsibly.
The model may reflect biases from its training data and may generate inaccurate, offensive, or inappropriate content. Neither the uploader nor AI Sweden is liable for downstream misuse. Review the AI Sweden RAIL license before any production deployment.
Usage
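import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "WestCode1357/gpt-sw3-126m-instruct"

# Pick the best available device: Apple MPS, then CUDA, then CPU.
device = "mps" if torch.backends.mps.is_available() else "cuda" if torch.cuda.is_available() else "cpu"
# float16 suits accelerators; on CPU it can be slow or unsupported, so fall back to float32.
dtype = torch.float32 if device == "cpu" else torch.float16

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
model.to(device)

prompt = "Träd är fina för att"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
# Sample up to 150 new tokens; a lower temperature gives more conservative text.
out = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(out[0]))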
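The usage snippet can load the weights in float16. As a rough illustration of what that choice means for a model of this size, the sketch below estimates the memory taken by the weights alone. The helper `weight_memory_mib` and the constant `NOMINAL_PARAMS` are hypothetical names introduced here, and the figure of 126 million parameters is the nominal size from this card, not an exact count; activations, the tokenizer, and framework overhead are not included.

```python
# Rough estimate of weight memory for a given parameter count and dtype.
# Assumption: NOMINAL_PARAMS is the nominal "126M" from the model card,
# not the exact number of parameters in the checkpoint.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}

NOMINAL_PARAMS = 126_000_000


def weight_memory_mib(num_params: int, dtype: str) -> float:
    """Return the approximate size of the weights in MiB."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 2)


for dtype in ("float32", "float16"):
    print(f"{dtype}: ~{weight_memory_mib(NOMINAL_PARAMS, dtype):.0f} MiB")
```

Under these assumptions the weights come to a few hundred MiB in either precision, which is consistent with the model being practical for quick tests on a laptop.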
Chat / instruct format
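GPT-SW3 instruct expects a chat prompt built from special tokens: the conversation starts with <|endoftext|>, and each turn is introduced by <s>. The format is: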
<|endoftext|><s>User: [your message]<s>Bot: [response]<s>...
eos = "<|endoftext|>"  # marks the start of the conversation
seg = "<s>"  # introduces each turn
# Leave the Bot turn open so the model writes the reply.
prompt = f"{eos}{seg}User: Vad är huvudstaden i Sverige?{seg}Bot: "
inputs = tokenizer(prompt, return_tensors="pt").to(device)
out = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    eos_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=False))
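The same format extends to multi-turn conversations. The following is a minimal sketch, assuming the single-line layout shown above; `build_prompt` and `extract_reply` are hypothetical helpers written for this example, not part of transformers or of the GPT-SW3 release. Because the output is decoded with `skip_special_tokens=False`, the generated text may run on into a further turn, so the second helper cuts the reply at the first special token it finds.

```python
EOS = "<|endoftext|>"  # starts the conversation
SEG = "<s>"            # introduces each turn


def build_prompt(turns):
    """Build a prompt from (role, text) pairs, ending with an open Bot turn.

    Hypothetical helper: roles are limited to "User" and "Bot",
    matching the format described in this card.
    """
    parts = [EOS]
    for role, text in turns:
        if role not in ("User", "Bot"):
            raise ValueError(f"unknown role: {role!r}")
        parts.append(f"{SEG}{role}: {text}")
    parts.append(f"{SEG}Bot: ")
    return "".join(parts)


def extract_reply(generated):
    """Keep only the text before the first turn separator or end token."""
    cut = len(generated)
    for marker in (SEG, EOS):
        idx = generated.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return generated[:cut].strip()


history = [
    ("User", "Hej!"),
    ("Bot", "Hej! Hur kan jag hjälpa dig?"),
    ("User", "Vad är huvudstaden i Sverige?"),
]
print(build_prompt(history))
```

The string returned by `build_prompt` can be passed to the tokenizer in place of `prompt`, and `extract_reply` can be applied to the decoded output before appending it to the history as the next Bot turn.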
Intended Use
⚠️ These models contain extreme bias and are NOT intended for commercial use. For scientific and research use only.
GPT-SW3 was trained on large-scale web data and may reflect harmful societal biases present in that data. It has not been aligned or safety-tuned beyond its original training. Use strictly in controlled research settings. Do not deploy in any consumer-facing or commercial product without thorough evaluation and additional safety measures.
About GPT-SW3
GPT-SW3 is developed by AI Sweden in collaboration with RISE and the WASP WARA for Media and Language. It was trained on 320B tokens of Swedish, Norwegian, Danish, Icelandic, English, and programming code.
- Original models: https://huggingface.co/AI-Sweden-Models
- Project page: https://www.ai.se/en/project/gpt-sw3