---
language:
- sv
- "no"
- da
- is
- en
tags:
- text-generation
- swedish
- nordic
- gpt-sw3
- AI-Sweden
- conversational
license: other
library_name: transformers
---

# gpt-sw3-126m-instruct

The smallest GPT-SW3 instruct model (126M parameters). It loads quickly, which makes it well suited to testing and prototyping.

**Size:** 126M | **Type:** instruct | **Languages:** Swedish, Norwegian, Danish, Icelandic, English

> Community mirror of [AI-Sweden-Models/gpt-sw3-126m-instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m-instruct)

---

## Warning and Disclaimer

This model is provided as-is for research and educational purposes. It is a community redistribution of AI Sweden's GPT-SW3, released under the same modified RAIL license.

**You are responsible for any content you create using this model. Use responsibly.**

The model may reflect biases from its training data and may generate inaccurate, offensive, or inappropriate content. Neither the uploader nor AI Sweden is liable for downstream misuse. Review the [AI Sweden RAIL license](LICENSE) before any production deployment.

> *"You are responsible for any content you create using this model. Enjoy responsibly."*

---

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "WestCode1357/gpt-sw3-126m-instruct"

# Prefer Apple MPS, then CUDA, then fall back to CPU.
device = "mps" if torch.backends.mps.is_available() else "cuda" if torch.cuda.is_available() else "cpu"
# Half precision is poorly supported on CPU, so use float32 there.
dtype = torch.float32 if device == "cpu" else torch.float16

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
model.to(device)

prompt = "Träd är fina för att"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
out = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(out[0]))
```

### Chat / instruct format

GPT-SW3 instruct models use special tokens to delimit conversation turns. The format is:

```
<|endoftext|><s>User: [your message]<s>Bot: [response]<s>...
```

```python
# Reuses `tokenizer`, `model`, and `device` from the example above.
eos = "<|endoftext|>"
seg = "<s>"  # turn separator token
prompt = f"{eos}{seg}User: Vad är huvudstaden i Sverige?{seg}Bot: "
inputs = tokenizer(prompt, return_tensors="pt").to(device)
out = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    eos_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=False))
```

## Intended Use

> ⚠️ **These models contain extreme bias and are NOT intended for commercial use.**
> **For scientific and research use only.**

GPT-SW3 was trained on large-scale web data and may reflect harmful societal biases present in that data. It has not been aligned or safety-tuned beyond its original training. Use it strictly in controlled research settings. Do not deploy it in any consumer-facing or commercial product without thorough evaluation and additional safety measures.

## About GPT-SW3

GPT-SW3 is developed by AI Sweden in collaboration with RISE and WASP WARA for Media and Language. It was trained on 320B tokens of Swedish, Norwegian, Danish, Icelandic, English, and code.

- **Original models:** https://huggingface.co/AI-Sweden-Models
- **Project page:** https://www.ai.se/en/project/gpt-sw3
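
## Building multi-turn prompts

The chat format described under "Chat / instruct format" extends to conversations with several turns. The following is a minimal sketch rather than part of the original model card: `build_prompt`, `EOS`, and `SEP` are hypothetical names, and the helper assumes that `<s>` is the turn separator and that a prompt should always end with an open `Bot: ` turn for the model to complete.

```python
from typing import List, Tuple

EOS = "<|endoftext|>"  # marks the start of a conversation
SEP = "<s>"            # assumed turn separator token


def build_prompt(turns: List[Tuple[str, str]]) -> str:
    """Join (role, text) turns into one prompt string.

    Roles must be "User" or "Bot". The result ends with an open
    "Bot: " turn so that generation continues as the bot.
    """
    parts = [EOS]
    for role, text in turns:
        if role not in ("User", "Bot"):
            raise ValueError(f"unknown role: {role!r}")
        parts.append(f"{SEP}{role}: {text.strip()}")
    parts.append(f"{SEP}Bot: ")
    return "".join(parts)


history = [
    ("User", "Vad är huvudstaden i Sverige?"),
    ("Bot", "Stockholm."),
    ("User", "Och i Norge?"),
]
print(build_prompt(history))
```

The returned string can be passed to `tokenizer(...)` in place of a hand-written `prompt`. Keeping the history as a list of pairs makes it easy to append each generated reply before building the next prompt.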