ModelHub XC 1f5181dafd Initialize project; model provided by the ModelHub XC community

Model: WestCode1357/gpt-sw3-126m-instruct
Source: Original Platform

2026-06-10 14:50:18 +08:00


gpt-sw3-126m-instruct

Smallest GPT-SW3 instruct model (126M parameters). It loads quickly, which makes it well suited to testing and prototyping.

Size: 126M | Type: instruct | Languages: Swedish, Norwegian, Danish, Icelandic, English

Community mirror of AI-Sweden-Models/gpt-sw3-126m-instruct
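
As a rough illustration of why a model this size is quick to load, the weight footprint can be estimated from the parameter count alone. The sketch below assumes exactly 126 million parameters and counts only the weights (no activations, tokenizer files, or framework overhead). `weight_megabytes` is a hypothetical helper written for this example, not part of any library.

```python
# Back-of-the-envelope weight footprint. Assumes exactly 126M parameters
# and ignores activations, buffers, and framework overhead.
N_PARAMS = 126_000_000

# Bytes needed to store one parameter in each floating-point format.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_megabytes(n_params: int, dtype: str) -> float:
    """Return the approximate size of the weights in decimal megabytes."""
    return n_params * BYTES_PER_PARAM[dtype] / 1_000_000


for dtype in ("float32", "float16"):
    print(f"{dtype}: ~{weight_megabytes(N_PARAMS, dtype):.0f} MB")
# float32: ~504 MB
# float16: ~252 MB
```

Under these assumptions the half-precision weights fit in roughly a quarter of a gigabyte. Actual memory use will be higher once runtime overhead is included.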

Warning and Disclaimer

This model is provided as-is for research and educational purposes. It is a community redistribution of AI Sweden's GPT-SW3, released under the same modified RAIL license.

You are responsible for any content you create using this model. Use responsibly.

The model may reflect biases from its training data and may generate inaccurate, offensive, or inappropriate content. Neither the uploader nor AI Sweden is liable for downstream misuse. Review the AI Sweden RAIL license before any production deployment.

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "WestCode1357/gpt-sw3-126m-instruct"

# Pick the best available device: Apple MPS, then CUDA, then CPU.
device = "mps" if torch.backends.mps.is_available() else "cuda" if torch.cuda.is_available() else "cpu"
# float16 can be slow or unsupported for some operations on CPU, so use float32 there.
dtype = torch.float32 if device == "cpu" else torch.float16

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
model.to(device)

prompt = "Träd är fina för att"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
# Sample up to 150 new tokens; output varies between runs because do_sample=True.
out = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(out[0]))

Chat / instruct format

GPT-SW3 instruct uses special tokens. The format is:

<|endoftext|><s>User: [your message]<s>Bot: [response]<s>...

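# Reuses tokenizer, model, and device from the Usage example above.
eos = "<|endoftext|>"  # marks the start of the conversation
seg = "<s>"            # separates turns
prompt = f"{eos}{seg}User: Vad är huvudstaden i Sverige?{seg}Bot: "
inputs = tokenizer(prompt, return_tensors="pt").to(device)
out = model.generate(
    **inputs, max_new_tokens=200,
    do_sample=True, temperature=0.7, top_p=0.95,
    eos_token_id=tokenizer.eos_token_id
)
# Decode only the newly generated tokens, keeping special tokens visible.
print(tokenizer.decode(out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=False))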
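
For multi-turn conversations, the format above can be assembled programmatically. The following is a minimal sketch, not an official API: `build_prompt` and `extract_reply` are hypothetical helpers that work on plain strings only. `extract_reply` assumes that the decoded output shows the `<s>` and `<|endoftext|>` markers as literal text (as with `skip_special_tokens=False`) and that the model closes its turn with one of them.

```python
EOS = "<|endoftext|>"  # conversation start marker, as in the format above
SEG = "<s>"            # turn separator


def build_prompt(turns):
    """Build a prompt from (speaker, text) pairs, ending with an open Bot turn.

    `turns` is a list such as [("User", "..."), ("Bot", "..."), ("User", "...")].
    """
    parts = [EOS]
    for speaker, text in turns:
        if speaker not in ("User", "Bot"):
            raise ValueError(f"unknown speaker: {speaker!r}")
        parts.append(f"{SEG}{speaker}: {text}")
    parts.append(f"{SEG}Bot: ")  # leave the final Bot turn open for generation
    return "".join(parts)


def extract_reply(generated):
    """Keep only the text before the first turn separator or end marker."""
    for marker in (SEG, EOS):
        generated = generated.split(marker, 1)[0]
    return generated.strip()


# Hypothetical conversation history used only for illustration.
history = [
    ("User", "Hej!"),
    ("Bot", "Hej! Hur kan jag hjälpa dig?"),
    ("User", "Vad är huvudstaden i Sverige?"),
]
print(build_prompt(history))
```

The string returned by `build_prompt` can be passed to the tokenizer in place of the hand-written `prompt`, and `extract_reply` can be applied to the decoded continuation to discard anything the model generates after its own turn.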

Intended Use

⚠️ These models contain extreme bias and are NOT intended for commercial use. For scientific and research use only.

GPT-SW3 was trained on large-scale web data and may reflect harmful societal biases present in that data. It has not been aligned or safety-tuned beyond its original training. Use strictly in controlled research settings. Do not deploy in any consumer-facing or commercial product without thorough evaluation and additional safety measures.

About GPT-SW3

GPT-SW3 is developed by AI Sweden in collaboration with RISE and WASP WARA for Media and Language. It was trained on 320B tokens of Swedish, Norwegian, Danish, Icelandic, English, and code.

Original models: https://huggingface.co/AI-Sweden-Models
Project page: https://www.ai.se/en/project/gpt-sw3
