Model: Locutusque/lr-experiment1-7B
Source: Original Platform
2026-05-06 20:05:26 +08:00

---
library_name: transformers
license: apache-2.0
datasets:
- Locutusque/hercules-v2.0
- CollectiveCognition/chats-data-2023-09-22
language:
- en
---

# lr-experiment1-7B

The lr-experiment model series is a research project I'm conducting to determine the best learning rate for fine-tuning Mistral. This model uses a learning rate of 2e-5 with a cosine scheduler and no warmup steps.

I used Locutusque/Hercules-2.0-Mistral-7B as the base model and further fine-tuned it on CollectiveCognition/chats-data-2023-09-22 using QLoRA for 3 epochs. I will keep track of evaluation results and compare them against upcoming models in the series.
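The schedule described above (peak LR of 2e-5, cosine decay, no warmup) can be sketched as a small pure-Python function. This is an illustrative sketch, not the training code: `total_steps` is a placeholder, since the actual step count depends on dataset size, batch size, and the 3 epochs.

```python
import math

PEAK_LR = 2e-5  # peak learning rate used for this experiment


def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine-annealed learning rate with no warmup.

    Starts at `peak_lr` on step 0 and decays smoothly to 0 by `total_steps`.
    """
    progress = step / total_steps
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


# The schedule starts at the peak and decays to zero:
# cosine_lr(0, 1000)   -> 2e-5 (peak)
# cosine_lr(500, 1000) -> 1e-5 (halfway)
# cosine_lr(1000, 1000) -> ~0.0 (end of training)
```

In practice the same shape is what a cosine scheduler with zero warmup steps produces in common training frameworks.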

## Evals

| Tasks | Version | Filter | n-shot | Metric | Value |   | Stderr |
|---|---|---|---|---|---|---|---|
| agieval_nous | N/A | none | None | acc | 0.3645 | ± | 0.0093 |
|   |   |   |   | acc_norm | 0.3468 | ± | 0.0092 |
| - agieval_aqua_rat | 1 | none | None | acc | 0.2283 | ± | 0.0264 |
|   |   |   |   | acc_norm | 0.2283 | ± | 0.0264 |
| - agieval_logiqa_en | 1 | none | None | acc | 0.2965 | ± | 0.0179 |
|   |   |   |   | acc_norm | 0.3303 | ± | 0.0184 |
| - agieval_lsat_ar | 1 | none | None | acc | 0.2217 | ± | 0.0275 |
|   |   |   |   | acc_norm | 0.1783 | ± | 0.0253 |
| - agieval_lsat_lr | 1 | none | None | acc | 0.4039 | ± | 0.0217 |
|   |   |   |   | acc_norm | 0.3686 | ± | 0.0214 |
| - agieval_lsat_rc | 1 | none | None | acc | 0.4870 | ± | 0.0305 |
|   |   |   |   | acc_norm | 0.4424 | ± | 0.0303 |
| - agieval_sat_en | 1 | none | None | acc | 0.6408 | ± | 0.0335 |
|   |   |   |   | acc_norm | 0.5971 | ± | 0.0343 |
| - agieval_sat_en_without_passage | 1 | none | None | acc | 0.3932 | ± | 0.0341 |
|   |   |   |   | acc_norm | 0.3835 | ± | 0.0340 |
| - agieval_sat_math | 1 | none | None | acc | 0.3455 | ± | 0.0321 |
|   |   |   |   | acc_norm | 0.2727 | ± | 0.0301 |

| Groups | Version | Filter | n-shot | Metric | Value |   | Stderr |
|---|---|---|---|---|---|---|---|
| agieval_nous | N/A | none | None | acc | 0.3645 | ± | 0.0093 |
|   |   |   |   | acc_norm | 0.3468 | ± | 0.0092 |
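The table above follows the output format of EleutherAI's lm-evaluation-harness. A command along these lines would reproduce the `agieval_nous` group numbers; the flags are an illustrative sketch (the exact dtype and batch size used by the author are not stated in the source):

```shell
# Illustrative lm-evaluation-harness invocation (assumed, not from the source).
# Evaluates the model on the AGIEval "nous" task group reported above.
lm_eval --model hf \
  --model_args pretrained=Locutusque/lr-experiment1-7B,dtype=bfloat16 \
  --tasks agieval_nous \
  --batch_size 8
```

Note that this downloads the full 7B checkpoint from the Hugging Face Hub before evaluating.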