---
license: mit
base_model: Qwen/Qwen3-1.7B
tags:
- empathy
- reinforcement-learning
- grpo
- dialogue
- mint
- emotional-support
language:
- en
pipeline_tag: text-generation
---

# MINT-empathy-Qwen3-1.7B

This model is the `Q + D_KL` MINT checkpoint fine-tuned from [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) for multi-turn empathic dialogue.

MINT, short for Multi-turn Inter-tactic Novelty Training, is a reinforcement learning framework that optimizes empathic response quality together with cross-turn discourse-move novelty. At the 1.7B scale, this checkpoint provides the strongest aggregate empathy performance among the MINT variants reported in the paper.
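Reading the checkpoint name together with the training summary below, a plausible form of the objective (our reconstruction, not notation taken from the paper) combines an empathy-quality reward $Q$ with a cross-turn tactic-diversity reward $D$ at weight 1.0, optimized by GRPO under a KL penalty with coefficient 0.01 toward the base policy:

$$
r(y) = Q(y) + 1.0 \cdot D(y), \qquad
\max_{\theta}\; \mathbb{E}_{y \sim \pi_\theta}\!\big[r(y)\big] \;-\; 0.01\, D_{\mathrm{KL}}\!\big(\pi_\theta \,\|\, \pi_{\mathrm{ref}}\big)
$$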

## Key Results

On the Lend-an-Ear test set reported in the paper, which contains 315 supporter turns across 50 conversations:

  1. Aggregate empathy improves from 3.60 to 4.54 relative to the vanilla Qwen3-1.7B baseline.
  2. In Table 2 of the paper, tactic stickiness is 0.51 for both the vanilla baseline and this checkpoint.
  3. This checkpoint should therefore be understood primarily as the strongest 1.7B empathy checkpoint, while the clearest reduction in tactic repetition appears at the 4B scale.

## Training Summary

| Setting | Value |
|---|---|
| Method | GRPO via VERL |
| Reward | Empathy quality + cross-turn tactic diversity |
| Base model | Qwen/Qwen3-1.7B |
| KL coefficient | 0.01 |
| Diversity weight | 1.0 |
| Response length | 2048 tokens |
| Rollouts | n = 8 per prompt |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("hongli-zhan/MINT-empathy-Qwen3-1.7B")
tokenizer = AutoTokenizer.from_pretrained("hongli-zhan/MINT-empathy-Qwen3-1.7B")
```
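The checkpoint presumably keeps the base model's chat template. A minimal generation sketch (the example message and decoding settings below are illustrative assumptions, not the paper's evaluation setup):

```python
# Illustrative single-turn query; message content and decoding settings
# are assumptions, not the paper's exact inference configuration.
messages = [
    {"role": "user", "content": "I've been feeling really overwhelmed at work lately."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```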

With vLLM:

```python
from vllm import LLM

llm = LLM(model="hongli-zhan/MINT-empathy-Qwen3-1.7B")
```
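A single-turn query through vLLM's chat interface; the sampling parameters here are illustrative, not recommended settings:

```python
from vllm import SamplingParams

# Illustrative sampling settings; tune for your use case.
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.chat(
    [{"role": "user", "content": "I just lost my job and I don't know what to do."}],
    params,
)
print(outputs[0].outputs[0].text)
```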

## Intended Use and Limitations

This model is intended for research on empathic dialogue, discourse diversity, and supportive response generation. It is a research artifact, not a therapy system, and at the 1.7B scale the main improvement is empathy quality rather than a substantial reduction in tactic repetition.

## Citation

```bibtex
@article{zhan2026discourse,
  title={Discourse Diversity in Multi-Turn Empathic Dialogue},
  author={Zhan, Hongli and Gueorguieva, Emma S and Hernandez, Javier and Suh, Jina and Ong, Desmond C and Li, Junyi Jessy},
  journal={arXiv preprint arXiv:2604.11742},
  year={2026}
}
```