Files

ModelHub XC 5197abcc1a 初始化项目，由ModelHub XC社区提供模型

Model: lianghsun/Llama-3.2-Taiwan-3B-Instruct
Source: Original Platform

2026-05-22 22:02:30 +08:00

48 KiB

Raw Blame History

license, language, base_model, datasets, pipeline_tag, library_name, tags, model-index, widget, metrics

license

language

base_model

datasets

pipeline_tag

library_name

tags

model-index

widget

metrics

llama3.2

meta-llama/Llama-3.2-3B

lianghsun/Llama-3.2-Taiwan-3B

lianghsun/tw-emergency-medicine-bench

lianghsun/tw-legal-nlp

lianghsun/tw-legal-synthetic-qa

lianghsun/tw-law-article-qa

lianghsun/tw-judgment-qa

lianghsun/tw-judgment-gist-chat

lianghsun/tw-bar-examination-2020-chat

lianghsun/tw-structured-law-article

lianghsun/tw-judgment-gist-chat

lianghsun/tw-contract-review-chat

lianghsun/reasoning-base-20k-chat

lianghsun/vulnerability-mitigation-qa-zh_tw

lianghsun/tw-instruct

rombodawg/Everything_Instruct_Multilingual

xzuyn/manythings-translations-alpaca

neural-bridge/rag-dataset-12000

minyichen/glaive_toolcall_zh_tw

text-generation

transformers

Taiwan

ROC

zh-tw

instruct

chat

llama3.2

SLM

name

results

Llama-3.2-Taiwan-3B-Instruct

task

dataset

metrics

type	name
text-generation	Single Choice Question

type	name
lianghsun/tw-legal-benchmark-v1	tw-legal-benchmark-v1

name	type	value
single choice	accuracy	31.1

task

dataset

metrics

type	name
text-generation	Single Choice Question

type	name	config	split	revision
lianghsun/Formosa-bench	(Society) Formosa Taiwan Knowledge Bench	society	test	v2024.11.27

name	type	value
single choice	accuracy	60.42

task

dataset

metrics

type	name
text-generation	Single Choice Question

type	name	config	split	revision
lianghsun/Formosa-bench	(Governmnt) Formosa Taiwan Knowledge Bench	governmnt	test	v2024.11.27

name	type	value
single choice	accuracy	44.25

task

dataset

metrics

type	name
text-generation	Single Choice Question

type	name	config	split	revision
lianghsun/Formosa-bench	(Geography) Formosa Taiwan Knowledge Bench	geography	test	v2024.11.27

name	type	value
single choice	accuracy	47.54

task

dataset

metrics

type	name
text-generation	Single Choice Question

type	name	config	split	revision
lianghsun/Formosa-bench	(History) Formosa Taiwan Knowledge Bench	history	test	v2024.11.27

name	type	value
single choice	accuracy	60

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(geography_of_taiwan) tmmlu++	geography_of_taiwan	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	36.2

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(dentistry) tmmlu++	dentistry	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	33.83

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(technical) tmmlu++	technical	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	35.07

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(statistics_and_machine_learning) tmmlu++	statistics_and_machine_learning	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	28.57

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(clinical_psychology) tmmlu++	clinical_psychology	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	29.6

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(tve_design) tmmlu++	tve_design	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	38.54

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(three_principles_of_people) tmmlu++	three_principles_of_people	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	48.2

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(introduction_to_law) tmmlu++	introduction_to_law	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	29.96

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(linear_algebra) tmmlu++	linear_algebra	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	21.43

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(agriculture) tmmlu++	agriculture	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	24.5

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(jce_humanities) tmmlu++	jce_humanities	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	38.89

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(music) tmmlu++	music	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	25.9

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(secondary_physics) tmmlu++	secondary_physics	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	33.04

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(physics) tmmlu++	physics	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	27.84

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(advance_chemistry) tmmlu++	advance_chemistry	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	27.64

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(junior_science_exam) tmmlu++	junior_science_exam	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	30.05

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(veterinary_pathology) tmmlu++	veterinary_pathology	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	25.09

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(financial_analysis) tmmlu++	financial_analysis	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	25.13

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(national_protection) tmmlu++	national_protection	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	42.65

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(macroeconomics) tmmlu++	macroeconomics	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	26.76

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(politic_science) tmmlu++	politic_science	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	27.44

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(ttqav2) tmmlu++	ttqav2	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	61.06

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(junior_chinese_exam) tmmlu++	junior_chinese_exam	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	30.86

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(traditional_chinese_medicine_clinical_medicine) tmmlu++	traditional_chinese_medicine_clinical_medicine	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	25.9

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(junior_math_exam) tmmlu++	junior_math_exam	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	21.71

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(auditing) tmmlu++	auditing	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	21.82

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(anti_money_laundering) tmmlu++	anti_money_laundering	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	37.31

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(pharmacology) tmmlu++	pharmacology	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	30.68

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(trust_practice) tmmlu++	trust_practice	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	28.18

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(tve_mathematics) tmmlu++	tve_mathematics	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	18.67

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(human_behavior) tmmlu++	human_behavior	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	32.04

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(pharmacy) tmmlu++	pharmacy	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	22.76

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(tve_chinese_language) tmmlu++	tve_chinese_language	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	36.65

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(optometry) tmmlu++	optometry	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	25.11

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(physical_education) tmmlu++	physical_education	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	30.73

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(organic_chemistry) tmmlu++	organic_chemistry	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	35.78

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(tve_natural_sciences) tmmlu++	tve_natural_sciences	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	33.73

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(education) tmmlu++	education	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	37.9

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(mechanical) tmmlu++	mechanical	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	42.37

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(taiwanese_hokkien) tmmlu++	taiwanese_hokkien	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	14.73

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(nautical_science) tmmlu++	nautical_science	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	30.49

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(business_management) tmmlu++	business_management	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	39.57

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(logic_reasoning) tmmlu++	logic_reasoning	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	27.34

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(marketing_management) tmmlu++	marketing_management	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	39.78

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(economics) tmmlu++	economics	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	25.95

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(basic_medical_science) tmmlu++	basic_medical_science	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	28.41

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(occupational_therapy_for_psychological_disorders) tmmlu++	occupational_therapy_for_psychological_disorders	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	35.73

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(general_principles_of_law) tmmlu++	general_principles_of_law	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	31.13

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(junior_chemistry) tmmlu++	junior_chemistry	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	24.88

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(veterinary_pharmacology) tmmlu++	veterinary_pharmacology	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	36.3

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(educational_psychology) tmmlu++	educational_psychology	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	33.52

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(finance_banking) tmmlu++	finance_banking	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	32.59

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(official_document_management) tmmlu++	official_document_management	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	32.43

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(fire_science) tmmlu++	fire_science	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	30.65

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(junior_social_studies) tmmlu++	junior_social_studies	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	47.62

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(accounting) tmmlu++	accounting	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	20.94

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(engineering_math) tmmlu++	engineering_math	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	27.18

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(education_(profession_level)) tmmlu++	education_(profession_level)	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	24.07

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(chinese_language_and_literature) tmmlu++	chinese_language_and_literature	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	27.64

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(management_accounting) tmmlu++	management_accounting	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	24.19

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(culinary_skills) tmmlu++	culinary_skills	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	39.38

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(administrative_law) tmmlu++	administrative_law	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	25.71

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(insurance_studies) tmmlu++	insurance_studies	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	33.42

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(real_estate) tmmlu++	real_estate	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	22.83

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(computer_science) tmmlu++	computer_science	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	31.61

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(taxation) tmmlu++	taxation	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	27.47

task

dataset

metrics

type	name
question-answering	Single Choice Question

type	name	config	split	revision
ikala/tmmluplus	(trade) tmmlu++	trade	test	c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

name	type	value
single choice	accuracy	20.32

text
中華民國憲法第一條

accuracy

Model Card for lianghsun/Llama-3.2-Taiwan-3B-Instruct

[👋 歡迎加入 Discord 討論 🎉]

圖像生成來自 OpenArt：An anime-style 🦙 standing proudly atop the summit of Taiwan’s Yushan (Jade Mountain), gazing forward.

採用 lianghsun/Llama-3.2-Taiwan-3B 為基礎模型（foundation model），使用大量中華民國台灣的繁體中文對話集和多國語言對話集進行模型指令微調（instruction fine-tuning）和多輪迭代直接偏好優化（direct preference optimization, DPO），旨在訓練出具有中華民國台灣知識及風格的小語言模型（small langugae model, SLM）之對話模型。

Model Change Log

Update Date	Model Version	Key Changes
2025/01/01	v2025.01.01	Fine-tuning is based on the foundation model version v2024.12.28, and it uses self-prepared instruction datasets for this round of fine-tuning.
2024/12/13	v2024.12.13	Completed 1st round DPO training (10/10 epochs). Preparing for next round DPO training.
2024/11/27	v2024.11.27	Completed SFT training (5/5 epochs). Preparing for multi-round DPO training.
2024/11/25	v2024.11.25	Updated model version to v2024.11.25, training progressed to (3/5) epochs. Still in SFT stage, DPO training remains pending.
2024/11/22	v2024.11.22	Initial upload: Model version v2024.11.22, training completed up to (1/5) epochs. Currently trained only on SFT, DPO training not yet performed.

Model Details

Model Description

Developed by: Huang Liang Hsun
Model type: LlamaForCausalLM
Language(s) (NLP): Tranditional Chinese (zh-tw), English
License: llama3.2
Fine-tuned from model: lianghsun/Llama-3.2-Taiwan-3B

Model Sources

Repository: lianghsun/Llama-3.2-Taiwan-3B
Paper: (WIP, show me the time)
Playground: 🦉 Tawian SmolLM Chat 👈🏼 來玩看看 😻
Demo:

user: 請介紹台灣
assistant: 台灣，位於亞洲東部，地處太平洋與菲律賓海之間，面積約36,000平方公里，人口約2,300萬，是民主自由的國家，經濟實力強勁，擁有世界第10大經濟體。台灣以美食、文化、自然美景著稱，還有豐富的歷史與傳統，吸引全球遊客。台灣語為官方語言，但中文也廣為使用，英語也常用於國際交流。台灣政治多元，執政黨為民進黨，台灣是全球科技產業的重鎮，擁有先進的製造業與服務業。台灣氣候溫暖潮濕，四季分明，夏季炎熱，冬季涼爽，雨季則在5月至10月。台灣的美食以小吃為主，如滷肉飯、珍珠

Uses

Direct Use

本模型已經具備有繁體中文對話能力，使用者可以直接部署推論端點使用。

Downstream Use

若需強化模型在特定領域的知識，可透過微調進一步提升其性能與專業能力。

Out-of-Scope Use

本模型旨在提供資訊，不參與任何政治或法律問題的評斷或立場表達。

Bias, Risks, and Limitations

語言模型的生成內容可能因訓練集的多樣性而帶有偏見、特定立場，或包含與事實不符的言論，請使用者務必在使用過程中仔細確認內容的準確性與中立性。

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

要使用 vLLM Docker image 來啟動此模型，您可以按照以下操作：

docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
    -p 8000:8000 \
    --ipc=host \
    vllm/vllm-openai:latest \
    --model lianghsun/Llama-3.2-Taiwan-3B-Instruct

請注意，如果想要使用不同版本的 checkpoint，請加上 --revision <tag_name>

docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
    -p 8000:8000 \
    --ipc=host \
    vllm/vllm-openai:latest \
    --model lianghsun/Llama-3.2-Taiwan-3B-Instruct --revision <tag_name>

Training Details

Training Data

繁體中文對話資料集

多國語系對話資料集

Training Procedure

Preprocessing

(WIP)

Training Hyperparameters

SFT stage for v2024.11.27

Note: 以下包含 v2024.11.22 和 v2025.11.25 的超參數設定

learning_rate: 5e-05
min_learning_rate: 5e-07
train_batch_size: 105
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 50
total_train_batch_size: 21,000
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.01
num_epochs: 5.0
global_step: 590

Speeds, Sizes, Times

SFT stage for v2024.11.27

Note: 以下包含 v2024.11.22 和 v2025.11.25 的超參數設定

Duration: 5 days, 16:15:11.17
Train runtime: 490,511.1789
Train samples per second: 25.37
Train steps per second: 0.001
Total training FLOPs: 26,658,386,120,540,160
Train loss: 0.8533

Evaluation

Testing Data, Factors & Metrics

Formosa Taiwan Knowledge Bench

Testing Data

lianghsun/Formosa-bench

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

lianghsun/tw-legal-benchmark-v1

Testing Data

Dataset: lianghsun/tw-legal-benchmark-v1
Revision: 66c3a5f3ff2298f6a1cf23201070b5317bdd1893

Factors

[More Information Needed]

Metrics

Accuracy

Results

Model Revision: v2024.11.27

Subset	Split	Score
lianghsun/tw-legal-benchmark-v1	train	31.1

Summary

tmmlu++

Testing Data

Dataset: ikala/tmmluplus
Revision: c0e8ae955997300d5dbf0e382bf0ba5115f85e8c

Factors

[More Information Needed]

Metrics

Accuracy

Results

Model Revision: v2024.11.27

Subset	Split	Score
geography_of_taiwan	test	36.2
dentistry	test	33.83
technical	test	35.07
statistics_and_machine_learning	test	28.57
clinical_psychology	test	29.6
tve_design	test	38.54
three_principles_of_people	test	48.2
introduction_to_law	test	29.96
linear_algebra	test	21.43
agriculture	test	24.5
jce_humanities	test	38.89
music	test	25.9
secondary_physics	test	33.04
physics	test	27.84
advance_chemistry	test	27.64
junior_science_exam	test	30.05
veterinary_pathology	test	25.09
financial_analysis	test	25.13
national_protection	test	42.65
macroeconomics	test	26.76
politic_science	test	27.44
ttqav2	test	61.06
junior_chinese_exam	test	30.86
traditional_chinese_medicine_clinical_medicine	test	25.9
junior_math_exam	test	21.71
auditing	test	21.82
anti_money_laundering	test	37.31
pharmacology	test	30.68
trust_practice	test	28.18
tve_mathematics	test	18.67
human_behavior	test	32.04
pharmacy	test	22.76
tve_chinese_language	test	36.65
optometry	test	25.11
physical_education	test	30.73
organic_chemistry	test	35.78
tve_natural_sciences	test	33.73
education	test	37.9
mechanical	test	42.37
taiwanese_hokkien	test	14.73
nautical_science	test	30.49
business_management	test	39.57
logic_reasoning	test	27.34
marketing_management	test	39.78
economics	test	25.95
basic_medical_science	test	28.41
occupational_therapy_for_psychological_disorders	test	35.73
general_principles_of_law	test	31.13
junior_chemistry	test	24.88
veterinary_pharmacology	test	36.3
educational_psychology	test	33.52
finance_banking	test	32.59
official_document_management	test	32.43
fire_science	test	30.65
junior_social_studies	test	47.62
accounting	test	20.94
engineering_math	test	27.18
education_(profession_level)	test	24.07
chinese_language_and_literature	test	27.64
management_accounting	test	24.19
culinary_skills	test	39.38
administrative_law	test	25.71
insurance_studies	test	33.42
real_estate	test	22.83
computer_science	test	31.61
taxation	test	27.47
trade	test	20.32

Summary

模型版號 v2024.11.27，無論是基礎模型（lianghsun/Llama-3.2-Taiwan-3B）還是指令微調模型（lianghsun/Llama-3.2-Taiwan-3B-Instruct），均未接受過 tmmlu++ 資料集的訓練，以確保測試的公平性。經測試，目前該模型在 tmmlu++ 上表現普遍不佳，未達及格分數，可能需要加入專業領域的資料集來強化基礎模型能力。

Model Examination [optional]

[More Information Needed]

Environmental Impact

Hardware Type: 🚀
Hours used: ⏳⏳⌛
Cloud Provider: 鴻鵠國際股份有限公司
Compute Region: 🇹🇼
Carbon Emitted: ♻️

Technical Specifications

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

CPU count: 32
Logical CPU count: 64
GPU count: 4
GPU type: NVIDIA H100 NVL

Software

OS version: Linux-5.15.0-124-generic-x86_64-with-glibc2.35
Python version: 3.12.7

Citation

@misc{lianghsun2024llama32taiwan3binstruct,
  author = {Huang, Liang Hsun},
  title = {Llama-3.2-Taiwan-3B-Instruct},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/lianghsun/Llama-3.2-Taiwan-3B-Instruct}},
  note = {Accessed: 2024-11-25}
}

Glossary [optional]

N/A

More Information

Acknowledge

在此致謝鴻鵠國際股份有限公司蔡長明先生無償地贊助算力，以及曾經幫忙過：廖振翔、chweng、Ben、kevin、Maxxchu、Lam 和陳林彥…等朋友們，才能讓這個模型得以訓練完成，提供算力者乃人生父母。

Usage

如果你基於此指令模型進行微調，希望能不吝嗇在 模型卡片(model card) 裡標註 基礎模型 為：

base_model: lianghsun/Llama-3.2-Taiwan-3B-Instruct

標註和 ❤️ 是給予我們最大的鼓勵，謝謝。😀

Model Card Authors

Huang Liang Hsun

Model Card Contact

Huang Liang Hsun

Framework versions

Transformers 4.45.2
Pytorch 2.4.1+cu121
Datasets 2.21.0
Tokenizers 0.20.0

48 KiB Raw Blame History Unescape Escape

Model Card for lianghsun/Llama-3.2-Taiwan-3B-Instruct

Model Details

Model Description

Model Sources

Uses

Direct Use

Downstream Use

Out-of-Scope Use

Bias, Risks, and Limitations

Recommendations

How to Get Started with the Model

Training Details

Training Data

Training Procedure

Preprocessing

Training Hyperparameters

Speeds, Sizes, Times

Evaluation

Testing Data, Factors & Metrics

Testing Data

Factors

Metrics

Results

Summary

Testing Data

Factors

Metrics

Results

Summary

Testing Data

Factors

Metrics

Results

Summary

Model Examination [optional]

Environmental Impact

Technical Specifications

Model Architecture and Objective

Compute Infrastructure

Hardware

Software

Citation

Glossary [optional]

More Information

Acknowledge

Usage

Model Card Authors

Model Card Contact

Framework versions

48 KiB

Raw Blame History