初始化项目,由ModelHub XC社区提供模型
Model: aixonlab/Aether-12b Source: Original Platform
This commit is contained in:
56
README.md
Normal file
56
README.md
Normal file
@@ -0,0 +1,56 @@
|
||||
---
|
||||
base_model: Xclbr7/Arcanum-12b
|
||||
language:
|
||||
- en
|
||||
license: apache-2.0
|
||||
tags:
|
||||
- text-generation-inference
|
||||
- transformers
|
||||
- unsloth
|
||||
- mistral
|
||||
- trl
|
||||
---
|
||||
<img src="https://cdn-uploads.huggingface.co/production/uploads/66dcee3321f901b049f48002/Fpdr8qCx9Xx4RHWgptCGD.png" width="800"/>
|
||||
|
||||
|
||||
# Aether-12b
|
||||
|
||||
Aether-12b is a fine-tuned large language model based on Arcanum-12b, further trained on the CleverBoi-Data-20k dataset.
|
||||
|
||||
## Model Details 📊
|
||||
- Developed by: AIXON Lab
|
||||
- Model type: Causal Language Model
|
||||
- Language(s): English (primarily), may support other languages
|
||||
- License: apache-2.0
|
||||
- Repository: https://huggingface.co/aixonlab/Aether-12b
|
||||
|
||||
## Model Architecture 🏗️
|
||||
- Base model: Arcanum-12b
|
||||
- Parameter count: ~12 billion
|
||||
- Architecture specifics: Transformer-based language model
|
||||
|
||||
## Open LLM Leaderboard Evaluation Results
|
||||
Coming Soon !
|
||||
|
||||
## Training & Fine-tuning 🔄
|
||||
Aether-12b was fine-tuned on the following dataset:
|
||||
- Dataset: theprint/CleverBoi-Data-20k
|
||||
- Fine-tuning method: TRL SFTTrainer with AdamW optimizer, cosine decay LR scheduler, bfloat16 precision.
|
||||
|
||||
The CleverBoi-Data-20k dataset improved the model in the following ways:
|
||||
1. Enhanced reasoning and problem-solving capabilities
|
||||
2. Broader knowledge across various topics
|
||||
3. Improved performance on specific tasks like writing, analysis, and problem-solving
|
||||
4. Better contextual understanding and response generation
|
||||
|
||||
## Intended Use 🎯
|
||||
As an assistant or specific role bot.
|
||||
|
||||
## Ethical Considerations 🤔
|
||||
As a fine-tuned model based on Arcanum-12b, this model may inherit biases and limitations from its parent model and the fine-tuning dataset. Users should be aware of potential biases in generated content and use the model responsibly.
|
||||
|
||||
|
||||
## Acknowledgments 🙏
|
||||
We acknowledge the contributions of:
|
||||
- theprint for the amazing CleverBoi-Data-20k dataset
|
||||
|
||||
Reference in New Issue
Block a user