---
language:
- sv
- "no"
- da
- is
- en
tags:
- text-generation
- swedish
- nordic
- gpt-sw3
- AI-Sweden
- conversational
license: other
library_name: transformers
---

# gpt-sw3-126m-instruct

The smallest GPT-SW3 instruct model (126M parameters). It loads quickly, which makes it well suited to testing and prototyping.

**Size:** 126M | **Type:** instruct | **Languages:** Swedish, Norwegian, Danish, Icelandic, English

> Community mirror of [AI-Sweden-Models/gpt-sw3-126m-instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m-instruct)

---

## Warning and Disclaimer

This model is provided as-is for research and educational purposes.
It is a community redistribution of AI Sweden's GPT-SW3, released under the same modified RAIL license.

**You are responsible for any content you create using this model. Use responsibly.**

The model may reflect biases from its training data and may generate inaccurate, offensive,
or inappropriate content. Neither the uploader nor AI Sweden is liable for downstream misuse.
Review the [AI Sweden RAIL license](LICENSE) before any production deployment.

> *"You are responsible for any content you create using this model. Enjoy responsibly."*

---

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "WestCode1357/gpt-sw3-126m-instruct"

# Pick the best available device: Apple MPS, then CUDA, then CPU.
if torch.backends.mps.is_available():
    device = "mps"
elif torch.cuda.is_available():
    device = "cuda"
else:
    device = "cpu"

# Use half precision on GPU backends; fall back to float32 on CPU,
# where float16 can be slow or unsupported for some operations.
dtype = torch.float32 if device == "cpu" else torch.float16

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
model.to(device)

prompt = "Träd är fina för att"  # Swedish: "Trees are nice because"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
out = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(out[0]))
```

### Chat / instruct format

GPT-SW3 instruct expects prompts built from two special tokens: `<|endoftext|>` opens the conversation and `<s>` precedes each turn. The format is:

```
<|endoftext|><s>User: [your message]<s>Bot: [response]<s>...
```

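For multi-turn conversations, the same pattern repeats once per turn. The helper below is a minimal sketch that assembles a prompt string in the single-line format shown above. `build_prompt` and its `turns` argument are hypothetical names introduced for illustration; they are not part of `transformers` or the GPT-SW3 release.

```python
EOS = "<|endoftext|>"  # opens the conversation
SEP = "<s>"            # precedes each turn


def build_prompt(turns):
    """Build a prompt from (speaker, text) pairs, ending with an open Bot turn.

    Hypothetical helper: `speaker` must be "User" or "Bot", matching the
    format described in this section.
    """
    parts = [EOS]
    for speaker, text in turns:
        if speaker not in ("User", "Bot"):
            raise ValueError(f"unknown speaker: {speaker!r}")
        parts.append(f"{SEP}{speaker}: {text.strip()}")
    # Leave the final Bot turn open so the model writes the reply.
    parts.append(f"{SEP}Bot: ")
    return "".join(parts)


prompt = build_prompt([
    ("User", "Hej!"),
    ("Bot", "Hej! Hur kan jag hjälpa dig?"),
    ("User", "Vad är huvudstaden i Sverige?"),
])
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a hand-written prompt. Keeping the markers in one place makes it easier to adjust the template if the upstream format differs from the one documented here.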
```python
# Reuses tokenizer, model, and device from the Usage example above.
eos = "<|endoftext|>"  # opens the conversation
seg = "<s>"            # precedes each turn
# Swedish: "What is the capital of Sweden?"
prompt = f"{eos}{seg}User: Vad är huvudstaden i Sverige?{seg}Bot: "
inputs = tokenizer(prompt, return_tensors="pt").to(device)
out = model.generate(
    **inputs, max_new_tokens=200,
    do_sample=True, temperature=0.7, top_p=0.95,
    eos_token_id=tokenizer.eos_token_id
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=False))
```

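Because the example decodes with `skip_special_tokens=False`, the printed text may contain a trailing `<s>` or `<|endoftext|>` and, if generation runs on, further turns. The sketch below trims a decoded string at the first such marker. `extract_reply` is a hypothetical helper, and it assumes the markers appear as literal text in the decoded output.

```python
def extract_reply(generated, stop_markers=("<s>", "<|endoftext|>")):
    """Return the text before the first stop marker, stripped of whitespace.

    Hypothetical helper: assumes the markers show up literally in `generated`.
    """
    cut = len(generated)
    for marker in stop_markers:
        idx = generated.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return generated[:cut].strip()


# Made-up decoded string for illustration (not real model output).
sample = "Stockholm.<s>User: Tack!<s>Bot: Varsågod."
print(extract_reply(sample))  # Stockholm.
```

If no marker is present, the whole string is returned unchanged apart from whitespace stripping.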
## Intended Use

> ⚠️ **These models contain extreme bias and are NOT intended for commercial use.**
> **For scientific and research use only.**

GPT-SW3 was trained on large-scale web data and may reflect harmful societal biases present in that data. It has not been aligned or safety-tuned beyond its original training. Use strictly in controlled research settings. Do not deploy in any consumer-facing or commercial product without thorough evaluation and additional safety measures.

## About GPT-SW3

GPT-SW3 is developed by AI Sweden in collaboration with RISE and WASP WARA for Media and Language.
It was trained on 320B tokens of Swedish, Norwegian, Danish, Icelandic, English, and code.

- **Original models:** https://huggingface.co/AI-Sweden-Models
- **Project page:** https://www.ai.se/en/project/gpt-sw3