---
license: mit
datasets:
- lavita/ChatDoctor-HealthCareMagic-100k
model-index:
- name: doctorLLM10k
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 54.95
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=vikash06/doctorLLM10k
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 79.94
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=vikash06/doctorLLM10k
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 44.4
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=vikash06/doctorLLM10k
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 44.76
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=vikash06/doctorLLM10k
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 70.01
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=vikash06/doctorLLM10k
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 10.16
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=vikash06/doctorLLM10k
      name: Open LLM Leaderboard
---

Sample input on Postman API: [image]

Number of epochs: 10

Number of data points: 10,000

Creative Writing: Write a question or instruction that requires a creative medical response from a doctor.

The instruction should be reasonable to ask of a person with general medical knowledge and should not require searching. In this task, your prompt should give very specific instructions to follow. Constraints, instructions, guidelines, or requirements all work, and the more of them the better.

Reference dataset: https://github.com/Kent0n-Li/ChatDoctor

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 50.70 |
| AI2 Reasoning Challenge (25-Shot) | 54.95 |
| HellaSwag (10-Shot) | 79.94 |
| MMLU (5-Shot) | 44.40 |
| TruthfulQA (0-shot) | 44.76 |
| Winogrande (5-shot) | 70.01 |
| GSM8k (5-shot) | 10.16 |
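
The Avg. value is consistent with an unweighted mean of the six benchmark scores listed here. The following is a minimal sketch of that arithmetic, assuming equal weighting across benchmarks; the `scores` dictionary and the `leaderboard_average` helper are illustrative names and not part of the model card or the leaderboard tooling.

```python
# Scores copied from the evaluation results reported in this README.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 54.95,
    "HellaSwag (10-Shot)": 79.94,
    "MMLU (5-Shot)": 44.40,
    "TruthfulQA (0-shot)": 44.76,
    "Winogrande (5-shot)": 70.01,
    "GSM8k (5-shot)": 10.16,
}


def leaderboard_average(results):
    """Unweighted arithmetic mean of the benchmark scores (assumed weighting)."""
    return sum(results.values()) / len(results)


average = leaderboard_average(scores)
print(f"Avg. {average:.2f}")
```

The low GSM8k score pulls the mean down noticeably; dropping it from `scores` shows how much the other five benchmarks contribute on their own.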
Description
Model synced from source: vikash06/doctorLLM10k