ModelHub XC 78a6661ff1 Initialize project; model provided by the ModelHub XC community
Model: bigscience/bloomz-7b1-p3
Source: Original Platform
2026-06-15 07:40:14 +08:00

dataset,prompt,metric,value
amazon_reviews_multi_en,prompt_body_title_to_star,accuracy,0.6176
amazon_reviews_multi_en,prompt_review_to_star,accuracy,0.5592
amazon_reviews_multi_en,prompt_title_to_star,accuracy,0.3922
amazon_reviews_multi_en,median,accuracy,0.5592
amazon_reviews_multi_es,prompt_body_title_to_star,accuracy,0.5526
amazon_reviews_multi_es,prompt_review_to_star,accuracy,0.5296
amazon_reviews_multi_es,prompt_title_to_star,accuracy,0.3646
amazon_reviews_multi_es,median,accuracy,0.5296
amazon_reviews_multi_fr,prompt_body_title_to_star,accuracy,0.5332
amazon_reviews_multi_fr,prompt_review_to_star,accuracy,0.5182
amazon_reviews_multi_fr,prompt_title_to_star,accuracy,0.3644
amazon_reviews_multi_fr,median,accuracy,0.5182
amazon_reviews_multi_zh,prompt_body_title_to_star,accuracy,0.5174
amazon_reviews_multi_zh,prompt_review_to_star,accuracy,0.5006
amazon_reviews_multi_zh,prompt_title_to_star,accuracy,0.3874
amazon_reviews_multi_zh,median,accuracy,0.5006
aqua_rat_raw,Answer questions from options,accuracy,0.24015748031496062
aqua_rat_raw,answer_quiz,accuracy,0.22440944881889763
aqua_rat_raw,select_the_best_option,accuracy,0.2559055118110236
aqua_rat_raw,median,accuracy,0.24015748031496062
art_None,choose_hypothesis,accuracy,0.5926892950391645
art_None,choose_hypothesis_believable,accuracy,0.5711488250652742
art_None,choose_hypothesis_desc,accuracy,0.5169712793733682
art_None,choose_hypothesis_likely,accuracy,0.5300261096605744
art_None,choose_hypothesis_options,accuracy,0.5672323759791122
art_None,median,accuracy,0.5672323759791122
banking77_None,direct_to_which_department,accuracy,0.16753246753246753
banking77_None,help_page_topic,accuracy,0.26785714285714285
banking77_None,rephrase_as_banking_term,accuracy,0.274025974025974
banking77_None,median,accuracy,0.26785714285714285
blbooksgenre_title_genre_classifiction,classify,accuracy,0.25057603686635943
blbooksgenre_title_genre_classifiction,multi-choice,accuracy,0.25057603686635943
blbooksgenre_title_genre_classifiction,premise_context_first,accuracy,0.7321428571428571
blbooksgenre_title_genre_classifiction,median,accuracy,0.25057603686635943
blimp_adjunct_island,grammatical_between_1_2,accuracy,0.512
blimp_adjunct_island,grammatical_between_A_B,accuracy,0.464
blimp_adjunct_island,grammatical_which_one_1_2,accuracy,0.512
blimp_adjunct_island,single_sentence_bad_yes_no,accuracy,0.52
blimp_adjunct_island,single_sentence_good_yes_no,accuracy,0.493
blimp_adjunct_island,median,accuracy,0.512
climate_fever_None,claim_and_all_supporting_evidences,accuracy,0.3166123778501629
climate_fever_None,fifth_evidence_and_claim_itemization,accuracy,0.4749185667752443
climate_fever_None,first_evidence_and_claim_itemization,accuracy,0.22996742671009773
climate_fever_None,second_evidence_and_claim_itemization,accuracy,0.24625407166123778
climate_fever_None,third_evidence_claim_pair,accuracy,0.24234527687296417
climate_fever_None,median,accuracy,0.24625407166123778
codah_codah,affirmative_instruction_after_sentence_and_choices,accuracy,0.6693083573487032
codah_codah,affirmative_instruction_before_sentence_and_choices,accuracy,0.6509365994236311
codah_codah,interrogative_instruction_after_sentence_and_choices,accuracy,0.6761527377521613
codah_codah,median,accuracy,0.6693083573487032
commonsense_qa_None,answer_given_question_without_options,accuracy,0.6388206388206388
commonsense_qa_None,most_suitable_answer,accuracy,0.7313677313677314
commonsense_qa_None,question_answering,accuracy,0.7158067158067158
commonsense_qa_None,median,accuracy,0.7158067158067158
conv_ai_3_None,ambiguous,accuracy,0.39040207522697795
conv_ai_3_None,clarification_needed,accuracy,0.39040207522697795
conv_ai_3_None,directly_answer,accuracy,0.6095979247730221
conv_ai_3_None,score_give_number,accuracy,0.057933419801124084
conv_ai_3_None,score_how_much,accuracy,0.010376134889753566
conv_ai_3_None,median,accuracy,0.39040207522697795
craigslist_bargains_None,best deal,accuracy,0.5192629815745393
craigslist_bargains_None,good deal for seller,accuracy,0.2529313232830821
craigslist_bargains_None,good deal for seller no list price,accuracy,0.09715242881072027
craigslist_bargains_None,good deal for seller no list price implicit,accuracy,0.24623115577889448
craigslist_bargains_None,median,accuracy,0.2495812395309883
emotion_None,answer_question_with_emotion_label,accuracy,0.3375
emotion_None,answer_with_class_label,accuracy,0.214
emotion_None,choose_the_best_emotion_label,accuracy,0.312
emotion_None,reply_with_emoation_label,accuracy,0.4495
emotion_None,median,accuracy,0.32475
financial_phrasebank_sentences_allagree,bullish_neutral_bearish,accuracy,0.3878091872791519
financial_phrasebank_sentences_allagree,complementary_industries,accuracy,0.10114840989399293
financial_phrasebank_sentences_allagree,sentiment,accuracy,0.35644876325088337
financial_phrasebank_sentences_allagree,share_price_option,accuracy,0.3670494699646643
financial_phrasebank_sentences_allagree,word_comes_to_mind,accuracy,0.08259717314487633
financial_phrasebank_sentences_allagree,median,accuracy,0.35644876325088337
glue_cola,Following sentence acceptable,accuracy,0.37583892617449666
glue_cola,Make sense yes no,accuracy,0.33940556088207097
glue_cola,Previous sentence acceptable,accuracy,0.31255992329817833
glue_cola,editing,accuracy,0.3844678811121764
glue_cola,is_this_correct,accuracy,0.37775647171620325
glue_cola,median,accuracy,0.37583892617449666
glue_sst2,following positive negative,accuracy,0.9426605504587156
glue_sst2,happy or mad,accuracy,0.8279816513761468
glue_sst2,positive negative after,accuracy,0.9472477064220184
glue_sst2,review,accuracy,0.9254587155963303
glue_sst2,said,accuracy,0.9059633027522935
glue_sst2,median,accuracy,0.9254587155963303
head_qa_en,multiple_choice_a_and_q_en,accuracy,0.29428989751098095
head_qa_en,multiple_choice_a_and_q_with_context_en,accuracy,0.29502196193265007
head_qa_en,multiple_choice_q_and_a_en,accuracy,0.3938506588579795
head_qa_en,multiple_choice_q_and_a_index_en,accuracy,0.30307467057101023
head_qa_en,multiple_choice_q_and_a_index_with_context_en,accuracy,0.30234260614934116
head_qa_en,median,accuracy,0.30234260614934116
head_qa_es,multiple_choice_a_and_q_en,accuracy,0.2730600292825769
head_qa_es,multiple_choice_a_and_q_with_context_en,accuracy,0.27232796486090777
head_qa_es,multiple_choice_q_and_a_en,accuracy,0.36530014641288433
head_qa_es,multiple_choice_q_and_a_index_en,accuracy,0.3074670571010249
head_qa_es,multiple_choice_q_and_a_index_with_context_en,accuracy,0.3089311859443631
head_qa_es,median,accuracy,0.3074670571010249
health_fact_None,claim_explanation_classification,accuracy,0.5591836734693878
health_fact_None,claim_veracity_classification_after_reading_I_believe,accuracy,0.34938775510204084
health_fact_None,claim_veracity_classification_tell_me,accuracy,0.48244897959183675
health_fact_None,median,accuracy,0.48244897959183675
hlgd_None,is_same_event_editor_asks,accuracy,0.6926051232479459
hlgd_None,is_same_event_interrogative_talk,accuracy,0.6582890285161914
hlgd_None,is_same_event_refer,accuracy,0.7858869018849686
hlgd_None,is_same_event_with_time_interrogative_related,accuracy,0.7839536007733204
hlgd_None,is_same_event_with_time_interrogative_talk,accuracy,0.7786370227162881
hlgd_None,median,accuracy,0.7786370227162881
hyperpartisan_news_detection_byarticle,consider_does_it_follow_a_hyperpartisan_argumentation,accuracy,0.6232558139534884
hyperpartisan_news_detection_byarticle,consider_it_exhibits_extreme_one_sidedness,accuracy,0.6310077519379845
hyperpartisan_news_detection_byarticle,consume_with_caution,accuracy,0.6294573643410852
hyperpartisan_news_detection_byarticle,extreme_left_wing_or_right_wing,accuracy,0.6077519379844961
hyperpartisan_news_detection_byarticle,follows_hyperpartisan_argumentation,accuracy,0.627906976744186
hyperpartisan_news_detection_byarticle,median,accuracy,0.627906976744186
liar_None,Given statement guess category,accuracy,0.19314641744548286
liar_None,median,accuracy,0.19314641744548286
lince_sa_spaeng,express sentiment,accuracy,0.5696611081226466
lince_sa_spaeng,negation template,accuracy,0.3851533082302313
lince_sa_spaeng,original poster expressed sentiment,accuracy,0.5841850457235073
lince_sa_spaeng,sentiment trying to express,accuracy,0.5809575040344271
lince_sa_spaeng,the author seem,accuracy,0.5771920387305003
lince_sa_spaeng,median,accuracy,0.5771920387305003
math_qa_None,choose_correct_og,accuracy,0.23484087102177553
math_qa_None,first_choice_then_problem,accuracy,0.2254606365159129
math_qa_None,gre_problem,accuracy,0.21943048576214405
math_qa_None,pick_the_correct,accuracy,0.2338358458961474
math_qa_None,problem_set_type,accuracy,0.29246231155778896
math_qa_None,median,accuracy,0.2338358458961474
mlsum_es,layman_summ_es,bleu,0.026830705121606707
mlsum_es,palm_prompt,bleu,0.033413101613448924
mlsum_es,summarise_this_in_es_few_sentences,bleu,0.02224579465087946
mlsum_es,median,bleu,0.026830705121606707
movie_rationales_None,Evidences + review,accuracy,0.97
movie_rationales_None,Evidences sentiment classification,accuracy,1.0
movie_rationales_None,Standard binary sentiment analysis,accuracy,0.95
movie_rationales_None,median,accuracy,0.97
mwsc_None,in-the-sentence,accuracy,0.6219512195121951
mwsc_None,in-the-sentence-question-first,accuracy,0.5853658536585366
mwsc_None,is-correct,accuracy,0.5365853658536586
mwsc_None,options-or,accuracy,0.6097560975609756
mwsc_None,what-think,accuracy,0.6097560975609756
mwsc_None,median,accuracy,0.6097560975609756
onestop_english_None,ara_context,accuracy,0.3333333333333333
onestop_english_None,assess,accuracy,0.3333333333333333
onestop_english_None,determine_reading_level_from_the_first_three_sentences,accuracy,0.5696649029982364
onestop_english_None,esl_context,accuracy,0.3333333333333333
onestop_english_None,esl_variation,accuracy,0.3333333333333333
onestop_english_None,median,accuracy,0.3333333333333333
poem_sentiment_None,guess_sentiment_without_options_variation_1,accuracy,0.22857142857142856
poem_sentiment_None,most_appropriate_sentiment,accuracy,0.2571428571428571
poem_sentiment_None,positive_or_negative_sentiment_variation_1,accuracy,0.2571428571428571
poem_sentiment_None,positive_or_negative_sentiment_variation_2,accuracy,0.21904761904761905
poem_sentiment_None,question_answer_format,accuracy,0.24761904761904763
poem_sentiment_None,median,accuracy,0.24761904761904763
pubmed_qa_pqa_labeled,Long Answer to Final Decision,accuracy,0.598
pubmed_qa_pqa_labeled,Question Answering (Short),accuracy,0.581
pubmed_qa_pqa_labeled,median,accuracy,0.5894999999999999
riddle_sense_None,answer_given_question_without_options,accuracy,0.4534769833496572
riddle_sense_None,most_suitable_answer,accuracy,0.4348677766895201
riddle_sense_None,question_answering,accuracy,0.4407443682664055
riddle_sense_None,question_to_answer_index,accuracy,0.3878550440744368
riddle_sense_None,median,accuracy,0.43780607247796277
scicite_None,Classify intent,accuracy,0.15065502183406113
scicite_None,Classify intent (choices first),accuracy,0.1331877729257642
scicite_None,Classify intent (select choice),accuracy,0.2652838427947598
scicite_None,Classify intent w/section (select choice),accuracy,0.3537117903930131
scicite_None,can_describe,accuracy,0.15283842794759825
scicite_None,median,accuracy,0.15283842794759825
selqa_answer_selection_analysis,is-he-talking-about,accuracy,0.9121019108280255
selqa_answer_selection_analysis,make-sense-rand,accuracy,0.9171974522292994
selqa_answer_selection_analysis,which-answer-1st-vs-random,accuracy,0.7503184713375797
selqa_answer_selection_analysis,would-make-sense-qu-rand,accuracy,0.8993630573248408
selqa_answer_selection_analysis,median,accuracy,0.9057324840764331
snips_built_in_intents_None,categorize_query,accuracy,0.47865853658536583
snips_built_in_intents_None,categorize_query_brief,accuracy,0.375
snips_built_in_intents_None,intent_query,accuracy,0.31402439024390244
snips_built_in_intents_None,query_intent,accuracy,0.7012195121951219
snips_built_in_intents_None,voice_intent,accuracy,0.6128048780487805
snips_built_in_intents_None,median,accuracy,0.47865853658536583
wmt14_fr_en_en-fr,a_good_translation-en-fr-source+target,bleu,0.02125573406419127
wmt14_fr_en_en-fr,a_good_translation-en-fr-target,bleu,0.015697853682886957
wmt14_fr_en_en-fr,gpt3-en-fr,bleu,0.0037928468482204985
wmt14_fr_en_en-fr,version-en-fr-target,bleu,0.047885599586875285
wmt14_fr_en_en-fr,xglm-en-fr-target,bleu,0.021861712984543362
wmt14_fr_en_en-fr,median,bleu,0.02125573406419127
wmt14_fr_en_fr-en,a_good_translation-fr-en-source+target,bleu,0.3038834619016813
wmt14_fr_en_fr-en,a_good_translation-fr-en-target,bleu,0.22361703612398195
wmt14_fr_en_fr-en,gpt3-fr-en,bleu,0.17167001660570336
wmt14_fr_en_fr-en,version-fr-en-target,bleu,0.23925613843737142
wmt14_fr_en_fr-en,xglm-fr-en-target,bleu,0.1410190003658709
wmt14_fr_en_fr-en,median,bleu,0.22361703612398195
wmt14_hi_en_en-hi,a_good_translation-en-hi-source+target,bleu,0.0018051438917625368
wmt14_hi_en_en-hi,a_good_translation-en-hi-target,bleu,0.0018126292465026588
wmt14_hi_en_en-hi,gpt-3-en-hi-target,bleu,0.00010782650615890081
wmt14_hi_en_en-hi,version-en-hi-target,bleu,0.0018585745110753149
wmt14_hi_en_en-hi,xglm-en-hi-target,bleu,2.225608801197892e-05
wmt14_hi_en_en-hi,median,bleu,0.0018051438917625368
wmt14_hi_en_hi-en,a_good_translation-hi-en-source+target,bleu,0.16056644593701627
wmt14_hi_en_hi-en,a_good_translation-hi-en-target,bleu,0.1503249107946881
wmt14_hi_en_hi-en,gpt-3-hi-en-target,bleu,0.05607403962346587
wmt14_hi_en_hi-en,version-hi-en-target,bleu,0.15167071858881462
wmt14_hi_en_hi-en,xglm-hi-en-target,bleu,0.03675518735361532
wmt14_hi_en_hi-en,median,bleu,0.1503249107946881
multiple,average,multiple,0.42128315936464156