license, base_model, language, tags, pipeline_tag
apache-2.0
Qwen/Qwen3-1.7B
turn-detection
call-center
code-switching
multilingual
text-generation
Turn Detector Qwen3-1.7B
Fine-tuned Qwen3-1.7B for real-time turn-end detection in multilingual call center conversations.
The model predicts P(<|im_end|>), the probability that a speaker has finished their turn. It is designed for low-latency voice agent pipelines (e.g. LiveKit), where it decides when the agent should respond.
How It Works
Given the conversation so far, the model outputs the probability that <|im_end|> is the next token:
P(im_end) > 0.5 → speaker is done talking (turn complete)
P(im_end) < 0.5 → speaker is still talking (turn incomplete)
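The decision rule above can be sketched in a few lines. This is a minimal illustration, assuming you already have the next-token logits as a plain list of floats; the function names and the toy three-token vocabulary are hypothetical and not part of the released model.

```python
import math

def im_end_probability(next_token_logits, im_end_id):
    """Softmax probability of the token at index im_end_id."""
    peak = max(next_token_logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in next_token_logits]
    return exps[im_end_id] / sum(exps)

def turn_complete(p_im_end, threshold=0.5):
    """Apply the card's rule: above the threshold means the turn is complete."""
    return p_im_end > threshold

# Toy logits for an illustrative 3-token vocabulary (not real model output);
# index 2 plays the role of <|im_end|>.
toy_logits = [0.0, 0.0, 2.0]
p = im_end_probability(toy_logits, im_end_id=2)
print(f"P(im_end) = {p:.4f}, turn complete: {turn_complete(p)}")
```

Raising the threshold trades recall for precision, which matters when a false "turn complete" would make the agent interrupt the caller.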
Usage
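A minimal sketch of how the probability might be read out with the `transformers` library. Several things here are assumptions rather than the authors' documented usage: the prompt is rendered in Qwen's ChatML layout with the final turn left open, the helper names `build_prompt` and `turn_end_probability` are hypothetical, and `model_id` is a placeholder for the repository id of this model, which you must supply.

```python
def build_prompt(messages):
    """Render (role, text) pairs in ChatML, leaving the last turn open.

    Assumption: the model saw ChatML-formatted conversations in training.
    """
    parts = []
    for i, (role, text) in enumerate(messages):
        is_last = i == len(messages) - 1
        closing = "" if is_last else "<|im_end|>\n"
        parts.append(f"<|im_start|>{role}\n{text}{closing}")
    return "".join(parts)

def turn_end_probability(model_id, messages):
    """Return P(<|im_end|>) for the next token after the open final turn."""
    # Imported lazily so build_prompt can be used without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # In a real pipeline, load the tokenizer and model once and reuse them.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    model.eval()

    im_end_id = tokenizer.convert_tokens_to_ids("<|im_end|>")
    inputs = tokenizer(build_prompt(messages), return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits for the next token
    return torch.softmax(logits.float(), dim=-1)[im_end_id].item()

# Example call (model_id is a placeholder, not a real repository name):
# p = turn_end_probability("<model-repo-id>", [("user", "I want to check my")])
# print("turn complete" if p > 0.5 else "still talking")
```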
Eval Results
Test set: 1,200 samples (600 positive, 600 negative), drawn from 50 conversations for each of the 12 language pairs.
Overall (threshold = 0.5)
| Metric | Score |
| --- | --- |
| Accuracy | 96.67% |
| Precision | 99.82% |
| Recall | 93.50% |
| F1 | 96.56% |
Per Language
| Language Pair | Overall | Positive | Negative |
| --- | --- | --- | --- |
| chinese-english | 95.00% | 90.00% | 100.00% |
| chinese-malay | 97.00% | 94.00% | 100.00% |
| chinese-tamil | 97.00% | 94.00% | 100.00% |
| english-chinese | 97.00% | 96.00% | 98.00% |
| english-malay | 94.00% | 88.00% | 100.00% |
| english-tamil | 95.00% | 90.00% | 100.00% |
| malay-chinese | 97.00% | 94.00% | 100.00% |
| malay-english | 96.00% | 92.00% | 100.00% |
| malay-tamil | 97.00% | 94.00% | 100.00% |
| tamil-chinese | 100.00% | 100.00% | 100.00% |
| tamil-english | 97.00% | 94.00% | 100.00% |
| tamil-malay | 98.00% | 96.00% | 100.00% |
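As a consistency check, the per-pair figures can be averaged to recover the overall numbers. This sketch assumes every language pair contributes the same number of positive and negative samples, which matches the stated 50 conversations per pair; the variable names are illustrative.

```python
# (positive accuracy %, negative accuracy %) per language pair, from the table.
PER_PAIR = {
    "chinese-english": (90.0, 100.0),
    "chinese-malay": (94.0, 100.0),
    "chinese-tamil": (94.0, 100.0),
    "english-chinese": (96.0, 98.0),
    "english-malay": (88.0, 100.0),
    "english-tamil": (90.0, 100.0),
    "malay-chinese": (94.0, 100.0),
    "malay-english": (92.0, 100.0),
    "malay-tamil": (94.0, 100.0),
    "tamil-chinese": (100.0, 100.0),
    "tamil-english": (94.0, 100.0),
    "tamil-malay": (96.0, 100.0),
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# With equal sample counts per pair, an unweighted mean equals the pooled rate.
positive_acc = mean(pos for pos, _ in PER_PAIR.values())  # pooled recall
negative_acc = mean(neg for _, neg in PER_PAIR.values())
overall_acc = (positive_acc + negative_acc) / 2

print(f"positive: {positive_acc:.2f}%  negative: {negative_acc:.2f}%  overall: {overall_acc:.2f}%")
```

The positive average matches the reported recall, and the overall average matches the reported accuracy.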
Threshold Sweep
| Threshold | Accuracy | Precision | Recall | F1 |
| --- | --- | --- | --- | --- |
| 0.1 | 99.00% | 99.66% | 98.33% | 98.99% |
| 0.2 | 98.67% | 99.66% | 97.67% | 98.65% |
| 0.3 | 98.00% | 99.66% | 96.33% | 97.97% |
| 0.4 | 97.58% | 99.65% | 95.50% | 97.53% |
| 0.5 | 96.67% | 99.82% | 93.50% | 96.56% |
| 0.6 | 95.50% | 99.82% | 91.17% | 95.30% |
| 0.7 | 93.67% | 99.81% | 87.50% | 93.25% |
| 0.8 | 91.17% | 100.00% | 82.33% | 90.31% |
| 0.9 | 83.83% | 100.00% | 67.67% | 80.72% |
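One way to use the sweep is to pick the threshold with the highest recall among those that meet a precision floor. The sketch below is only an illustration of that selection rule; `pick_threshold` is a hypothetical helper, and the rows are copied from the sweep table.

```python
# (threshold, accuracy, precision, recall, f1), all in percent, from the sweep.
SWEEP = [
    (0.1, 99.00, 99.66, 98.33, 98.99),
    (0.2, 98.67, 99.66, 97.67, 98.65),
    (0.3, 98.00, 99.66, 96.33, 97.97),
    (0.4, 97.58, 99.65, 95.50, 97.53),
    (0.5, 96.67, 99.82, 93.50, 96.56),
    (0.6, 95.50, 99.82, 91.17, 95.30),
    (0.7, 93.67, 99.81, 87.50, 93.25),
    (0.8, 91.17, 100.00, 82.33, 90.31),
    (0.9, 83.83, 100.00, 67.67, 80.72),
]

def pick_threshold(rows, min_precision):
    """Return the threshold with the best recall whose precision meets the floor."""
    eligible = [row for row in rows if row[2] >= min_precision]
    if not eligible:
        return None  # no threshold satisfies the precision requirement
    return max(eligible, key=lambda row: row[3])[0]

# An agent that must never cut a caller off needs a strict precision floor;
# one that prioritises responsiveness can relax it.
strict = pick_threshold(SWEEP, min_precision=100.0)
relaxed = pick_threshold(SWEEP, min_precision=99.0)
print(strict, relaxed)
```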
Confusion Matrix (threshold = 0.5)
| | Pred Pos | Pred Neg |
| --- | --- | --- |
| Actual Pos | 561 | 39 |
| Actual Neg | 1 | 599 |
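The headline metrics follow directly from these four counts. A short sketch using the standard definitions; the function name is illustrative.

```python
def metrics(tp, fn, fp, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Counts from the confusion matrix at threshold = 0.5.
m = metrics(tp=561, fn=39, fp=1, tn=599)
for name, value in m.items():
    print(f"{name}: {value:.2%}")
```

The single false positive explains the near-perfect precision, while the 39 missed turn endings account for the lower recall.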
Probability Distribution
| Class | Mean | Median | Min | Max |
| --- | --- | --- | --- | --- |
| Positive (turn complete) | 0.8813 | 0.9673 | 0.0063 | 1.0000 |
| Negative (turn incomplete) | 0.0020 | 0.0000 | 0.0000 | 0.7022 |
Dataset
Tokenized parquet datasets (chinidataset format) are available at Scicom-intl/turn-detector-Qwen3-0.6B-dataset.
Training
Base model: Qwen/Qwen3-1.7B
Training data: Positive samples only (complete conversations ending with <|im_end|>)
Loss: Liger Fused Linear Cross Entropy
Attention: Flash Attention 3
Precision: bfloat16
Block size: 8192 (multipacked)
Batch size: 2, with 16 gradient accumulation steps
Learning rate: 2e-5 (constant)
Epochs: 1
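The hyperparameters above can be collected into a small config object to make the effective batch explicit. This is a sketch, not the authors' training script: the class and field names are hypothetical, the batch size line is read as a micro-batch of 2 with 16 accumulation steps, and the figures are per device because the number of devices is not stated.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    base_model: str = "Qwen/Qwen3-1.7B"
    block_size: int = 8192        # tokens per packed sequence
    micro_batch_size: int = 2     # packed sequences per forward pass
    grad_accum_steps: int = 16
    learning_rate: float = 2e-5   # constant schedule
    epochs: int = 1
    dtype: str = "bfloat16"

    @property
    def sequences_per_step(self) -> int:
        """Packed sequences per optimizer step, per device (assumed)."""
        return self.micro_batch_size * self.grad_accum_steps

    @property
    def tokens_per_step(self) -> int:
        """Upper bound on tokens per optimizer step, per device (assumed)."""
        return self.sequences_per_step * self.block_size

cfg = TrainConfig()
print(cfg.sequences_per_step, cfg.tokens_per_step)
```

Because blocks are multipacked, each 8192-token sequence may hold several conversations, so the number of conversations per step is larger than the sequence count.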
Training Data Sources