ModelHub XC 77ff80b795 Initialize project; model provided by the ModelHub XC community

Model: MihaiPopa-1/OmniTranslate-1.1
Source: Original Platform

2026-06-06 16:07:18 +08:00


OmniTranslate 1.1

OmniTranslate 1.1 is a massively multilingual machine translation model supporting over 500 languages. Fine-tuned from Qwen3 0.6B (with Unsloth), it is small enough to run translation tasks on CPU-only devices!

Features

500+ Languages Supported: Very broad language coverage for a translation model with under 1 billion parameters!
Tiny Size: At roughly 0.6 billion parameters, it needs far less memory and compute than large translation models, so it stays fast even on modest hardware.

Improvements over 1.0

OmniTranslate now makes fewer hiccups when translating to Romanian (like "ami"), and the diacritic bug in Romanian translations has been mostly fixed!

There is still a small chance that the model outputs text without diacritics (this mostly depends on the random seed), so if that happens, try a different seed.
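
One way to act on the seed advice is to retry generation with a different seed whenever the output contains no Romanian diacritics. The sketch below is only an illustration and is not part of OmniTranslate: `has_romanian_diacritics`, `translate_with_retry`, and the `generate_fn` callback are hypothetical names. A correct Romanian sentence can legitimately contain no diacritics at all, so treat the check as a rough heuristic.

```python
from typing import Callable, Iterable

# Romanian letters with diacritics, including the legacy cedilla forms (ş, ţ).
ROMANIAN_DIACRITICS = set("ăâîșțĂÂÎȘȚşţŞŢ")


def has_romanian_diacritics(text: str) -> bool:
    """Return True if the text contains at least one Romanian diacritic."""
    return any(ch in ROMANIAN_DIACRITICS for ch in text)


def translate_with_retry(generate_fn: Callable[[int], str],
                         seeds: Iterable[int] = (0, 1, 2)) -> str:
    """Call generate_fn(seed) for each seed until the output has diacritics.

    generate_fn is a hypothetical callback: it is assumed to seed the RNG
    (for example with transformers.set_seed(seed)), run model.generate,
    and return the decoded translation as a string.
    """
    result = ""
    for seed in seeds:
        result = generate_fn(seed)
        if has_romanian_diacritics(result):
            break
    # Falls back to the last attempt if no output contained diacritics.
    return result
```

The callback keeps the retry logic separate from model loading, so the same loop works with any of the generation snippets in this README.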

Experimental Features

We added two new languages: Emoji and Sulfuristic Speak (a language I made up for OmniTranslate 1.1 to fit the Chaos Cubed Minecraft vibe!). Try them out:

Emoji

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# 1. Load the model and tokenizer from the Hugging Face Hub
model_id = "MihaiPopa-1/OmniTranslate-1.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32, # Standard for CPU
    device_map="cpu"           # Forces CPU usage
)

# 2. Translate to Emoji
prompt = "<|im_start|>user\nTranslate to emj_Emoj: We love the world!<|im_end|>\n<|im_start|>assistant\n<think>\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cpu")

with torch.no_grad():
    # temperature only applies when sampling is enabled, so set do_sample explicitly
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.1)
    
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Sulfuristic Speak

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# 1. Load the model and tokenizer from the Hugging Face Hub
model_id = "MihaiPopa-1/OmniTranslate-1.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32, # Standard for CPU
    device_map="cpu"           # Forces CPU usage
)

# 2. Translate to Sulfuristic Speak ("Translate to Sulfuristic Speak" also works!)
prompt = "<|im_start|>user\nTranslate to sul_Latn: Let's ride a Sulfur Cube!<|im_end|>\n<|im_start|>assistant\n<think>\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cpu")

with torch.no_grad():
    # temperature only applies when sampling is enabled, so set do_sample explicitly
    outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.1)
    
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Notes

OmniTranslate 1.1 is still an experimental model and shouldn't be used for tasks where accurate translations matter.

Providing the language code (for example, ron_Latn rather than "Romanian") instead of the language name can noticeably improve the results.
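
To make switching language codes less error-prone, the prompt string used in this README's examples can be assembled by a small helper. This is a minimal sketch: `build_prompt` is a hypothetical name, and the template is copied from the example prompts in this README rather than from any official chat-template API.

```python
def build_prompt(target_code: str, text: str) -> str:
    """Build the chat-style prompt used in this README's examples.

    target_code is a language code such as "ron_Latn"; text is the
    sentence to translate.
    """
    return (
        "<|im_start|>user\n"
        f"Translate to {target_code}: {text}<|im_end|>\n"
        "<|im_start|>assistant\n<think>\n"
    )


# Example: the same sentence, targeting Romanian.
prompt = build_prompt("ron_Latn", "We love the world!")
print(prompt)
```

The returned string can be passed to the tokenizer in place of the hard-coded `prompt` variable.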

Usage

The code below was written by Gemini 3 Flash, with a few small modifications of my own:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# 1. Load the model and tokenizer from the Hugging Face Hub
model_id = "MihaiPopa-1/OmniTranslate-1.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32, # Standard for CPU
    device_map="cpu"           # Forces CPU usage
)

# 2. Translate (replace ron_Latn with your target language code)
prompt = "<|im_start|>user\nTranslate to ron_Latn: OmniTranslate is a massively multilingual machine translation model supporting over 500 languages!<|im_end|>\n<|im_start|>assistant\n<think>\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cpu")

with torch.no_grad():
    # temperature only applies when sampling is enabled, so set do_sample explicitly
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.1)
    
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
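
The decoded string printed above includes the prompt as well as the generated text, because `outputs[0]` holds both. A rough way to keep only the translation is to cut everything up to the last think marker. This sketch rests on an unverified assumption: that the `<think>` or `</think>` marker survives decoding and that the translation follows it. `extract_translation` is a hypothetical helper name.

```python
def extract_translation(decoded: str) -> str:
    """Return the text after the last think marker, stripped of whitespace.

    Prefers the closing marker if the model emitted one; otherwise falls
    back to the opening marker that ends the prompt. If neither is found,
    the whole string is returned unchanged apart from stripping.
    """
    for marker in ("</think>", "<think>"):
        if marker in decoded:
            return decoded.rsplit(marker, 1)[1].strip()
    return decoded.strip()
```

It would be applied to the decoded string, for example `extract_translation(tokenizer.decode(outputs[0], skip_special_tokens=True))`.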

Data Used

I used my own OmniSurgical 1.1 dataset, which includes part of Hugging Face's FineTranslations.

Uploaded finetuned model

Developed by: MihaiPopa-1
License: apache-2.0
Finetuned from model: unsloth/qwen3-0.6b-unsloth-bnb-4bit

This Qwen3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
