Initialize project; model provided by the ModelHub XC community
Model: jondurbin/airoboros-l2-70b-gpt4-1.4.1 Source: Original Platform
45
README.md
Normal file
@@ -0,0 +1,45 @@
---
license: other
datasets:
- jondurbin/airoboros-gpt4-1.4.1
---

### Overview

Llama 2 70b fine-tune using https://huggingface.co/datasets/jondurbin/airoboros-gpt4-1.4.1

See the previous llama 65b model card for info:
https://hf.co/jondurbin/airoboros-65b-gpt4-1.4

### Contribute

If you're interested in new functionality, particularly a new "instructor" type to generate a specific type of training data,
take a look at the dataset generation tool repo: https://github.com/jondurbin/airoboros and either make a PR or open an issue with details.

To help me with the OpenAI/compute costs:

- https://bmc.link/jondurbin
- ETH 0xce914eAFC2fe52FdceE59565Dd92c06f776fcb11
- BTC bc1qdwuth4vlg8x37ggntlxu5cjfwgmdy5zaa7pswf

### Licence and usage restrictions

Base model has a custom Meta license:
- See the [meta-license/LICENSE.txt](meta-license/LICENSE.txt) file attached for the original license provided by Meta.
- See also [meta-license/USE_POLICY.md](meta-license/USE_POLICY.md) and [meta-license/Responsible-Use-Guide.pdf](meta-license/Responsible-Use-Guide.pdf), also provided by Meta.

The fine-tuning data was generated by OpenAI API calls to gpt-4, via [airoboros](https://github.com/jondurbin/airoboros).

The ToS for OpenAI API usage has a clause preventing the output from being used to train a model that __competes__ with OpenAI.

- what does *compete* actually mean here?
- these small open source models will not produce output anywhere near the quality of gpt-4, or even gpt-3.5, so I can't imagine this could credibly be considered competing in the first place
- if someone else uses the dataset to do the same, they wouldn't necessarily be violating the ToS because they didn't call the API, so I don't know how that works
- the training data used in essentially all large language models includes a significant amount of copyrighted or otherwise non-permissive licensing in the first place
- other work using the self-instruct method, e.g. the original here: https://github.com/yizhongw/self-instruct released the data and model as apache-2

I am purposely leaving this license ambiguous (other than the fact you must comply with the Meta original license for llama-2) because I am not a lawyer and refuse to attempt to interpret all of the terms accordingly.

Your best bet is probably to avoid using this commercially due to the OpenAI API usage.

Either way, by using this model, you agree to completely indemnify me.