Initialize project; model provided by the ModelHub XC community

Model: Corianas/111m Source: Original Platform
2026-05-31 21:59:43 +08:00
commit 92e2bf4b33
11 changed files with 150640 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,201 @@
+---
+license: cc-by-nc-sa-4.0
+datasets:
+- tatsu-lab/alpaca
+- the_pile
+---
+
+# Model Card for Cerebras 111M Dollyfied
+
+This is a finetuned version of the Cerebras 111M model, trained using the Databricks Labs Dolly framework.
+
+## Model Details
+
+### Model Description
+
+This is a finetuned version of Cerebras' 111-million-parameter model that has been trained to follow instructions.
+
+It was finetuned with the Databricks Dolly training tools on the Alpaca dataset for 2 epochs.
+
+- **Developed by:** Finetuned by Corianas (me) using open-source tools
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** EN
+- **License:** cc-by-nc-4.0
+- **Finetuned from model:** https://huggingface.co/cerebras/Cerebras-GPT-111m
+- **Finetuned using:** https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html
+
+## Uses
+
+This is a simple GPT chatbot that has been finetuned to understand instructions.
+Its knowledge of facts about the world should be considered suspect at best.
+
+### Direct Use
+
+If you find a use for it, please let me know.
+
+[More Information Needed]
+
+### Downstream Use [optional]
+
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+[More Information Needed]
+
+### Out-of-Scope Use
+
+Any use where accuracy of any kind is needed.
+FOR THE LOVE OF GOD, DO NOT FOLLOW MEDICAL ADVICE FROM THIS.
+Or financial advice.
+
+[More Information Needed]
+
+## Bias, Risks, and Limitations
+
+Limitations... Yes, I am sure there are so, so many.
+
+[More Information Needed]
+
+## How to Get Started with the Model
+
+Use the code below to get started with the model.
+
+[More Information Needed]
+
+## Training Details
+
+### Training Data
+
+<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+[More Information Needed]
+
+### Training Procedure
+
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+#### Preprocessing [optional]
+
+[More Information Needed]
+
+
+#### Training Hyperparameters
+
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+#### Speeds, Sizes, Times [optional]
+
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+[More Information Needed]
+
+## Evaluation
+
+<!-- This section describes the evaluation protocols and provides the results. -->
+
+### Testing Data, Factors & Metrics
+
+#### Testing Data
+
+<!-- This should link to a Data Card if possible. -->
+
+[More Information Needed]
+
+#### Factors
+
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+[More Information Needed]
+
+#### Metrics
+
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+[More Information Needed]
+
+### Results
+
+[More Information Needed]
+
+#### Summary
+
+
+
+## Model Examination [optional]
+
+<!-- Relevant interpretability work for the model goes here -->
+
+[More Information Needed]
+
+## Environmental Impact
+
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+- **Hardware Type:** 8xA100s (the run completed while I was downloading the model I was actually training)
+- **Minutes used:** 7.5
+- **Cloud Provider:** LambdaGPU
+- **Compute Region:** USA
+- **Carbon Emitted:** [More Information Needed]
+
+## Technical Specifications [optional]
+
+### Model Architecture and Objective
+
+[More Information Needed]
+
+### Compute Infrastructure
+
+[More Information Needed]
+
+#### Hardware
+
+[More Information Needed]
+
+#### Software
+
+[More Information Needed]
+
+## Citation [optional]
+
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+**BibTeX:**
+
+[More Information Needed]
+
+**APA:**
+
+[More Information Needed]
+
+## Glossary [optional]
+
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+[More Information Needed]
+
+## More Information [optional]
+
+[More Information Needed]
+
+## Model Card Authors [optional]
+
+[More Information Needed]
+
+## Model Card Contact
+
+[More Information Needed]
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Corianas__111m)
+
+| Metric              | Value |
+|---------------------|-------|
+| Avg.                | 24.04 |
+| ARC (25-shot)       | 19.71 |
+| HellaSwag (10-shot) | 26.68 |
+| MMLU (5-shot)       | 25.28 |
+| TruthfulQA (0-shot) | 43.72 |
+| Winogrande (5-shot) | 50.2  |
+| GSM8K (5-shot)      | 0.0   |
+| DROP (3-shot)       | 2.69  |
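
As a quick consistency check on the leaderboard table in the README, the `Avg.` row appears to be the unweighted mean of the seven benchmark scores. The sketch below recomputes it from the reported values. The variable names are illustrative only and are not part of any leaderboard tooling.

```python
# Benchmark scores copied from the README's leaderboard table.
scores = {
    "ARC (25-shot)": 19.71,
    "HellaSwag (10-shot)": 26.68,
    "MMLU (5-shot)": 25.28,
    "TruthfulQA (0-shot)": 43.72,
    "Winogrande (5-shot)": 50.2,
    "GSM8K (5-shot)": 0.0,
    "DROP (3-shot)": 2.69,
}

# Unweighted mean over all seven benchmarks, rounded to two decimals
# (assumption: this is how the "Avg." row was derived).
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # prints 24.04
```

The result matches the reported `Avg.` of 24.04, so the near-zero GSM8K and DROP scores are included in the average rather than excluded.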