Initialize project; model provided by the ModelHub XC community
Model: trillionlabs/Tri-1.8B-Translation Source: Original Platform
142
README.md
Normal file
@@ -0,0 +1,142 @@
---
library_name: transformers
tags:
- translation
- multilingual
- trillionlabs
- preview
license: apache-2.0
language:
- ko
- en
- zh
- ja
---

# Tri-1.8B Translation

We release **Tri-1.8B Translation**, a lightweight multilingual translation model from **Trillion Labs**.

Tri-1.8B Translation is trained through pretraining and supervised fine-tuning (SFT), and was distilled from our larger Tri-21B model to preserve strong translation quality in a much smaller, deployment-friendly 1.8B-parameter model. It supports all translation directions among English, Korean, Japanese, and Chinese.

## ✨ Highlights

* **Compact & efficient:** \~1.8B params, easy to serve on a single GPU.
* **Multilingual:** Fully bidirectional **EN ↔ KO ↔ JA ↔ ZH**.
* **Simple prompts:** Works with a short **task instruction + `<lang>` tag**.
* **Research-ready:** Suitable for domain SFT or lightweight adapters.

---

## 🧾 Prompt format

```
Translate the following {SRC_NAME} text into {TGT_NAME}:
{TEXT} <{lang_tag}>
```

Where `{lang_tag} ∈ { en, ko, ja, zh }` is the tag of the target language, and `{SRC_NAME}` and `{TGT_NAME}` are the names of the source and target languages (for example, `Korean` and `English`).

---

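The prompt format described above can be assembled programmatically. The following is a minimal sketch, not part of the model's API: the helper name `build_prompt` and the `LANG_NAMES` mapping from language tags to English language names are assumptions made for illustration.

```python
# Hypothetical helper (not shipped with the model): fills in the prompt template.
# Assumption: language names are written in English, as in the Korean -> English example.
LANG_NAMES = {"en": "English", "ko": "Korean", "ja": "Japanese", "zh": "Chinese"}


def build_prompt(text: str, src: str, tgt: str) -> str:
    """Return the translation prompt for the language tags `src` -> `tgt`."""
    if src not in LANG_NAMES or tgt not in LANG_NAMES:
        raise ValueError(f"unsupported language tag: {src!r} or {tgt!r}")
    if src == tgt:
        raise ValueError("source and target languages must differ")
    return (
        f"Translate the following {LANG_NAMES[src]} text into {LANG_NAMES[tgt]}:\n"
        f"{text} <{tgt}>"
    )


print(build_prompt("안녕하세요", "ko", "en"))
```

The returned string is meant to be used as the `content` of a single user message.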
## 🔧 Usage

### 1) 🤗 Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "trillionlabs/Tri-1.8B-Translation"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

text = "안녕하세요"
messages = [
    {"role": "user", "content": f"Translate the following Korean text into English:\n{text} <en>"}
]

# Apply the chat template and move the token IDs to the model's device
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

# Greedy decoding
outputs = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id
)

# Prompt plus translation
full_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
# Only the newly generated tokens, i.e. the translation
translation = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)

print(f"Korean: {text}")
print(f"English: {translation}")
```

---

### 2) Local vLLM

```python
from vllm import LLM, SamplingParams

llm = LLM(model="trillionlabs/Tri-1.8B-Translation")
sp = SamplingParams(temperature=0.3, max_tokens=512)

def translate(text, target="en"):
    # Short task instruction followed by the target-language tag
    prompt = f"Translate into {target}:\n{text} <{target}>"
    out = llm.chat([{"role": "user", "content": prompt}], sampling_params=sp)
    return out[0].outputs[0].text.strip()

print(translate("안녕하세요", "en"))
```

---

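To translate one input into several target languages with the setup above, one option is to build a separate single-turn conversation per target. The sketch below is illustrative only: `build_conversations` and `SUPPORTED` are hypothetical names, and it reuses the short prompt form from the `translate` function rather than anything defined by the model itself.

```python
# Hypothetical helper: one single-turn conversation per target language.
SUPPORTED = ("en", "ko", "ja", "zh")


def build_conversations(text, source, targets=None):
    """Return (target, messages) pairs for translating `text` out of `source`."""
    if targets is None:
        # Default: every supported language except the source
        targets = [code for code in SUPPORTED if code != source]
    pairs = []
    for target in targets:
        if target not in SUPPORTED:
            raise ValueError(f"unsupported language tag: {target!r}")
        prompt = f"Translate into {target}:\n{text} <{target}>"
        pairs.append((target, [{"role": "user", "content": prompt}]))
    return pairs


for target, messages in build_conversations("안녕하세요", "ko"):
    print(target, repr(messages[0]["content"]))
```

Each `messages` list can be passed to `llm.chat(messages, sampling_params=sp)` in turn. Some vLLM versions may also accept a list of conversations in one call; that is worth checking against the installed version before relying on it.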
### 3) API client (OpenAI-compatible)

```python
import openai

# Assumes an OpenAI-compatible server is serving the model at this address
client = openai.OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

def translate(text, target="en"):
    # Short task instruction followed by the target-language tag
    prompt = f"Translate into {target}:\n{text} <{target}>"
    resp = client.chat.completions.create(
        model="trillionlabs/Tri-1.8B-Translation",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.3,
        max_tokens=512,
    )
    return resp.choices[0].message.content.strip()

print(translate("안녕하세요", "en"))
```

## 📜 License

Apache-2.0 (for model weights & code). Please verify data licenses for your use.

## 🙏 Acknowledgments

* Thanks to the **ByteDance Seed team** for releasing **Seed-X**; our prompt template and some training design were adapted from their paper.

## 📚 Citation

If you use **Tri-1.8B Translation**, please cite:

```bibtex
@misc{suk2025tri18b,
  title        = {Tri-1.8B Translation: A Lightweight Multilingual Translation Model},
  author       = {Juyoung Suk and Trillion Labs},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/trillionlabs/Tri-1.8B-Translation}}
}
```