Files
OmniTranslate-1.1/README.md
ModelHub XC 77ff80b795 初始化项目,由ModelHub XC社区提供模型
Model: MihaiPopa-1/OmniTranslate-1.1
Source: Original Platform
2026-06-06 16:07:18 +08:00

7.6 KiB

base_model, tags, license, language, pipeline_tag, datasets
base_model tags license language pipeline_tag datasets
MihaiPopa-1/OmniTranslate-1.0
text-generation-inference
translation
transformers
unsloth
qwen3
omnitranslate
apache-2.0
abk
abq
abs
acm
adh
adi
ady
aeb
afr
agx
aii
aim
ain
ajz
akb
aln
als
alt
amh
anp
aoz
apc
apt
arb
arg
arq
ars
ary
arz
asm
ast
atb
ava
awa
ayp
ayr
azb
azj
bak
bam
ban
bar
bas
bbc
bbk
bcl
bdq
bel
ben
bew
bho
bhp
bis
biu
bjn
bod
bos
brh
brx
bts
btx
bug
bul
bwi
bxr
cat
cbk
ccp
ceb
ces
cfm
cha
che
chr
chu
chv
cjs
ckb
ckt
cmn
cnh
cnw
cos
crh
crj
crk
crl
crs
csb
csw
csy
ctd
cym
czt
dak
dan
dar
deu
dik
diu
div
dje
dks
dln
dng
dnw
doi
dru
dsb
dtp
dty
dzo
ekk
ell
emj
enl
enm
epo
ess
eus
eve
ewo
ext
fao
fas
ffm
fij
fil
fin
fit
fkv
fmu
fra
fro
frp
fry
fuf
fur
fuv
gag
gaz
gcf
gla
gle
glg
glk
glv
gmh
gnb
goh
gom
gos
grc
gsw
gug
guj
guz
hac
hae
hak
hat
hau
haw
hbo
heb
her
hif
hil
hin
hmr
hne
hns
hrv
hrx
hsb
hun
hwc
hye
hyw
iba
ibg
ibo
ife
ike
ikt
ilo
ina
ind
inh
isl
ita
ivv
jav
jpn
jun
kaa
kab
kac
kak
kal
kam
kan
kas
kat
kaz
kbd
kca
kdh
kdr
kea
kei
kgp
kha
khk
khm
kik
kin
kir
kiu
kjb
kjh
kmr
knc
koi
kor
kos
kpv
krj
krl
kru
ksh
ksw
ktj
ktz
kua
kum
kwn
kyu
kzj
lad
lao
lat
lbe
ldn
lew
lez
lfn
lim
lin
lis
lit
lki
lld
lmk
lnd
lrc
ltg
ltz
lud
lug
luo
lus
lvs
lwg
lzh
mag
mah
mai
mak
mal
mar
mas
mbf
mdf
mer
mfe
mfg
mfy
mhi
mhr
mhy
min
mip
mjw
mkd
mlt
mni
mnk
mns
mnw
moh
mph
mqy
mri
mrj
mrw
mtg
mui
mup
mus
mvp
mwf
mwl
mww
mya
myv
myx
mzh
nah
nan
nap
naq
nbu
nde
ndo
nds
new
nio
njn
njo
nld
nmf
nmz
nno
nob
nog
non
npi
npo
nrf
nri
nrm
nse
nus
nya
nyn
nzm
obo
oci
ojb
olo
orv
ory
oss
ota
oto
otw
pam
pan
pap
pbt
pcd
pck
pcm
pfl
plt
pmq
pmx
pnb
pnt
pol
por
pov
ppk
pps
prg
pui
pxm
quc
qul
qup
qus
quz
raw
rcf
rel
rhg
ria
rjs
rmc
rml
rmn
rmy
rnl
roh
ron
rtm
rue
run
rus
sah
san
sat
sck
scn
sda
sdc
sdh
ses
sgc
sgh
sid
sin
sju
skr
slk
slv
sma
sme
smj
smn
smo
sms
smt
sna
snd
som
sot
spa
srd
srp
ssw
sul
sun
swe
swg
swh
syc
syl
szl
tab
tam
taq
tat
tcy
tcz
tel
tet
tgk
tha
thl
tig
tir
tkl
tkr
tlh
tly
tok
ton
tpi
tpw
trc
trp
trs
ttj
tuk
tur
tuv
twx
tyv
tzl
tzm
udm
uig
ukr
urd
uzn
uzs
vap
vie
vot
vro
war
way
wba
wbm
wes
whk
wlx
wol
wsg
wwa
xal
xho
xmm
xmv
xog
yaz
ydd
yor
yrk
yrl
yua
yue
zea
zgh
zom
zsm
zul
translation
MihaiPopa-1/OmniSurgical-1.1

OmniTranslate 1.1

OmniTranslate 1.1 is a massively multilingual machine translation model supporting over 500 languages. Fine-tuned from Qwen 3 0.6B (with Unsloth), this model is designed for translation tasks on any device!

Features

  • 500+ Languages Supported: The broadest coverage of languages supported for a translation model that's under 1 billion parameters!
  • Tiny Size: Beats any other large model on speed and memory usage. No other model is able to compete with this!

Improvements over 1.0

  • OmniTranslate now makes less hiccups when translating to Romanian (like "ami"), and the diacritic bug on Romanian translations has been mostly fixed!

There's a tiny chance that the model will spit out without diacritics (mostly due to seeds) though, so try a different one.

Experimental Features

  • We added 2 new languages, Emoji and Sulfuristic Speak (my own language for OmniTranslate 1.1 to quite fit the Chaos Cubed Minecraft vibe!). Try these out:

Emoji

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# 1. Load from your Hugging Face Repo
model_id = "MihaiPopa-1/OmniTranslate-1.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32, # Standard for CPU
    device_map="cpu"           # Forces CPU usage
)

# 2. Translate to Emoji
prompt = "<|im_start|>user\nTranslate to emj_Emoj: We love the world!<|im_end|>\n<|im_start|>assistant\n<think>\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cpu")

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64, temperature=0.1)
    
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Sulfuristic Speak

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# 1. Load from your Hugging Face Repo
model_id = "MihaiPopa-1/OmniTranslate-1.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32, # Standard for CPU
    device_map="cpu"           # Forces CPU usage
)

# 2. Translate to Sulfuristic Speak ("Translate to Sulfuristic Speak" also works too!)
prompt = "<|im_start|>user\nTranslate to sul_Latn: Let's ride a Sulfur Cube!<|im_end|>\n<|im_start|>assistant\n<think>\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cpu")

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128, temperature=0.1)
    
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Notes

OmniTranslate 1.1 is still a experimental model and shouldn't be used for tasks where accurate translations matter.

Notes

Providing the ISO code instead of the language name can improve the results a lot.

Usage

Code is by Gemini 3 Flash (then some little modifications by myself):

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# 1. Load from your Hugging Face Repo
model_id = "MihaiPopa-1/OmniTranslate-1.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32, # Standard for CPU
    device_map="cpu"           # Forces CPU usage
)

# 2. Translate (replace ron_Latn with your language here)
prompt = "<|im_start|>user\nTranslate to ron_Latn: OmniTranslate is a massively multilingual machine translation model supporting over 500 languages!<|im_end|>\n<|im_start|>assistant\n<think>\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cpu")

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1)
    
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Data Used

I used my own OmniSurgical 1.1, which the dataset itself contains a part of HF's FineTranslations


Uploaded finetuned model

  • Developed by: MihaiPopa-1
  • License: apache-2.0
  • Finetuned from model : unsloth/qwen3-0.6b-unsloth-bnb-4bit

This qwen3 model was trained 2x faster with Unsloth and Huggingface's TRL library.