---
license: mit
datasets:
- lavita/ChatDoctor-HealthCareMagic-100k
model-index:
- name: doctorLLM10k
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 54.95
      name: normalized accuracy
    source:
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 79.94
      name: normalized accuracy
    source:
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 44.4
      name: accuracy
    source:
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 44.76
    source:
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 70.01
      name: accuracy
    source:
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 10.16
      name: accuracy
    source:
      name: Open LLM Leaderboard
---
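The model was fine-tuned on the `lavita/ChatDoctor-HealthCareMagic-100k` dataset listed above. As a minimal sketch of how one record might be rendered into a training prompt, assuming the common Alpaca/ChatDoctor field names (`instruction`, `input`, `output` — an assumption about the schema, not confirmed by this card):

```python
# Sketch: format one HealthCareMagic-style record into a training prompt.
# The field names ("instruction", "input", "output") follow the common
# Alpaca/ChatDoctor convention and are an assumption about this dataset's schema.
def format_prompt(record: dict) -> str:
    return (
        f"### Instruction:\n{record['instruction']}\n\n"
        f"### Patient:\n{record['input']}\n\n"
        f"### Doctor:\n{record['output']}"
    )

# Hypothetical record for illustration only.
example = {
    "instruction": "If you are a doctor, please answer the medical question.",
    "input": "I have had a mild headache for two days. Should I be worried?",
    "output": "A short-lived mild headache is usually benign...",
}
print(format_prompt(example))
```

The exact prompt template used for fine-tuning is not stated in this card; consult the reference dataset repository for the canonical format.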
Sample input via the Postman API:

- Number of epochs: 10
- Number of data points: 10,000
- Creative Writing: Write a question or instruction that requires a creative medical response from a doctor. The instruction should be reasonable to ask of a person with general medical knowledge and should not require searching. Your prompt should give very specific instructions to follow: constraints, instructions, guidelines, or requirements all work, and the more of them the better.

Reference dataset: https://github.com/Kent0n-Li/ChatDoctor
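The sample input above could be sent as a JSON request body. The following is an illustrative sketch only: the field names (`instruction`, `num_epochs`, `num_data_points`) are assumptions for demonstration, not a documented API schema for this model.

```python
import json

# Hypothetical request body matching the sample input above; the field
# names are illustrative assumptions, not a documented API schema.
def build_payload(instruction: str,
                  num_epochs: int = 10,
                  num_data_points: int = 10000) -> str:
    return json.dumps({
        "instruction": instruction,
        "num_epochs": num_epochs,
        "num_data_points": num_data_points,
    })

body = build_payload(
    "Write a creative medical response a doctor might give to a patient "
    "asking how to manage seasonal allergies without medication."
)
```

A body like this can be pasted directly into Postman's raw JSON request editor against whatever endpoint serves the model.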
Detailed results can be found here.

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 50.70 |
| AI2 Reasoning Challenge (25-Shot) | 54.95 |
| HellaSwag (10-Shot)               | 79.94 |
| MMLU (5-Shot)                     | 44.40 |
| TruthfulQA (0-shot)               | 44.76 |
| Winogrande (5-shot)               | 70.01 |
| GSM8k (5-shot)                    | 10.16 |
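The reported average is the unweighted mean of the six benchmark scores, which can be checked directly:

```python
# Unweighted mean of the six benchmark scores from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 54.95,
    "HellaSwag (10-Shot)": 79.94,
    "MMLU (5-Shot)": 44.40,
    "TruthfulQA (0-shot)": 44.76,
    "Winogrande (5-shot)": 70.01,
    "GSM8k (5-shot)": 10.16,
}
avg = sum(scores.values()) / len(scores)
print(round(avg, 2))  # → 50.7
```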