Initialize project; model provided by the ModelHub XC community
Model: aixonlab/Grey-12b Source: Original Platform

---
base_model: aixonlab/Aether-12b
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- mistral
---

<img src="https://cdn-uploads.huggingface.co/production/uploads/66dcee3321f901b049f48002/jWXtbknuetFdz5fkFn-ey.png" width="800"/>

# Grey-12b

Grey-12b is a merged language model that combines two 12B models using the della_linear merge method, with aixonlab/Aether-12b as the base model.

## Model Details 📊
- Developed by: AIXON Lab
- Model type: Merged Causal Language Model
- Language(s): English (primarily); may support other languages
- License: apache-2.0
- Repository: https://huggingface.co/aixonlab/Grey-12b

## Model Architecture 🏗️
- Base model: aixonlab/Aether-12b
- Parameter count: ~12 billion
- Architecture specifics: Transformer-based language model
- Merge method: della_linear

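The parameter count gives a rough idea of how much memory the weights need. The sketch below is a back-of-the-envelope estimate, not a measured figure: it assumes exactly 12 billion parameters (the card only says "~12 billion") and 2 bytes per parameter for float16, the dtype listed under Technical Specifications. The helper name `weight_memory_gib` is made up for this illustration, and the estimate covers weights only, not activations or the generation cache.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 12e9  # assumption: the card only says "~12 billion"

fp16_gib = weight_memory_gib(NUM_PARAMS, 2)  # float16: 2 bytes per parameter
fp32_gib = weight_memory_gib(NUM_PARAMS, 4)  # float32: 4 bytes per parameter

print(f"float16 weights: ~{fp16_gib:.1f} GiB")
print(f"float32 weights: ~{fp32_gib:.1f} GiB")
```

Loading in float32 would roughly double the weight footprint, so it is usually worth loading the model in the dtype it was saved in when memory is tight.
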
### Merged Models
1. VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct
   - Weight: 0.33
   - Density: 0.4
2. cognitivecomputations/dolphin-2.9.3-mistral-nemo-12b
   - Weight: 0.77
   - Density: 0.8

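The two weights listed above sum to 1.10 rather than 1. The card does not say whether the merge tool renormalizes them, so the following is only an illustration of the relative share each model would have if they were normalized to sum to 1; it is not a statement about how the merge was actually computed.

```python
# Weights copied from the Merged Models list above.
weights = {
    "VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct": 0.33,
    "cognitivecomputations/dolphin-2.9.3-mistral-nemo-12b": 0.77,
}

total = sum(weights.values())

# Relative share of each model if the weights were normalized (an assumption).
relative = {name: w / total for name, w in weights.items()}

for name, share in relative.items():
    print(f"{name}: {share:.0%}")
```

Under that assumption, the split is roughly 30% SauerkrautLM-Nemo-12b-Instruct to 70% dolphin-2.9.3-mistral-nemo-12b.
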
## Technical Specifications
- Dtype: float16
- Tokenizer source: base (aixonlab/Aether-12b)
- Merge parameters:
  - Epsilon: 0.05
  - Lambda: 1

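Taken together, the settings above describe a merge configuration. The card does not include the original config file or name the tool, so the sketch below is a reconstruction under assumptions: the method name della_linear matches a merge method in the mergekit toolkit, and the key names (`merge_method`, `base_model`, `models`, `parameters`, `dtype`, `tokenizer_source`) follow the layout commonly used in mergekit configs. Only the values are taken from this card.

```python
import json

# Values are copied from the model card. The key names are an assumption
# (mergekit-style layout); the card does not publish the original config.
merge_config = {
    "merge_method": "della_linear",
    "base_model": "aixonlab/Aether-12b",
    "models": [
        {
            "model": "VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct",
            "parameters": {"weight": 0.33, "density": 0.4},
        },
        {
            "model": "cognitivecomputations/dolphin-2.9.3-mistral-nemo-12b",
            "parameters": {"weight": 0.77, "density": 0.8},
        },
    ],
    # Merge-wide parameters listed under "Merge parameters".
    "parameters": {"epsilon": 0.05, "lambda": 1},
    "dtype": "float16",
    "tokenizer_source": "base",
}

# Print the reconstructed settings for inspection.
print(json.dumps(merge_config, indent=2))
```

Anyone trying to reproduce the merge should check the key names against the documentation of the merge tool they use before relying on this layout.
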
## Intended Use 🎯
Grey-12b is intended as a general-purpose language model for natural language processing tasks, including but not limited to text generation, question answering, and analysis.

## Ethical Considerations 🤔
As a merged model based on multiple sources, Grey-12b may inherit biases and limitations from its constituent models. Users should be aware of potential biases in generated content and use the model responsibly.

## Performance and Evaluation
Performance metrics and evaluation results for Grey-12b are not yet available. Users are encouraged to contribute their findings and benchmarks.

## Limitations and Biases
The model may exhibit biases present in its training data and constituent models. It's crucial to critically evaluate the model's outputs and use them in conjunction with human judgment.

## Additional Information
For more details on the base model and constituent models, please refer to their respective model cards and documentation.

## Acknowledgments 🙏
We acknowledge the contributions of:
- VAGOsolutions for the SauerkrautLM-Nemo-12b-Instruct model
- Cognitive Computations for the dolphin-2.9.3-mistral-nemo-12b model

## How to Use
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the merged model and its tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("aixonlab/Grey-12b")
tokenizer = AutoTokenizer.from_pretrained("aixonlab/Grey-12b")

prompt = "Once upon a time"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# max_length counts the prompt tokens plus the newly generated tokens.
generated_ids = model.generate(input_ids, max_length=100)
# generate() returns a batch of sequences; decode the first (and only) one.
generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(generated_text)
```