---
license: llama3.2
metrics:
- perplexity
base_model:
- meta-llama/Llama-3.2-3B
pipeline_tag: text-generation
library_name: transformers
tags:
- Web3
- Domain-Specific
- NLP
- Intent Recognition
- Solidity
language:
- en
---

# Model Card for Brian-3B

<img src="brian_llama2_logo.png" alt="Brian Logo" width="600"/>

## Model Details

### Model Description

The **Brian-3B** model is a domain-specific language model tailored for Web3 applications. Built upon Meta's Llama-3.2-3B, it is optimized for tasks involving natural language understanding and intent recognition in the blockchain ecosystem. This includes tasks such as transaction intent parsing, Solidity code generation, and question answering on Web3-related topics.

- **Developed by:** The Brian Team
- **Funded by:** The Brian Team
- **Shared by:** The Brian Team
- **Model type:** Transformer-based autoregressive language model
- **Language(s):** English
- **License:** Llama 3.2 Community License
- **Finetuned from:** meta-llama/Llama-3.2-3B

**Please note:** this is only the first of a series of training phases the model will undergo before it can be used in production (estimated Q1 2025) to power our Intent Recognition Engine. The Brian team is calling on all partners interested in the space: developers, projects, and investors who may want to be involved in future phases of model training. Join our [TG Dev chat](https://t.me/+NJjmAm2Y9p85Mzc0) if you have any questions or want to contribute to the model training.

### Model Sources

- **Repository:** [Hugging Face Repository](https://huggingface.co/brianknowsai/Brian-Llama-3.2-3B)
- **Demo:** This model will soon be integrated to power https://www.brianknows.org/
- **Paper:** Coming soon

## Uses

### Downstream Use

The model is specifically designed to be fine-tuned for downstream tasks such as:

- **Transaction intent recognition**: parsing natural language into structured JSON transaction data.
- **Solidity code generation**: creating smart contracts based on user prompts.
- **Web3 question answering**: answering protocol-specific queries or extracting blockchain-related data.

In the coming months, our team will release these task-specific models. Anyone in the Web3 space can fine-tune the model for other downstream tasks or improve its knowledge of specific ecosystems (e.g., Solana, Farcaster).
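
To make "transaction intent recognition" concrete, the sketch below extracts a JSON object from a model completion and loads it into a Python dict. Both the completion text and the JSON keys (`action`, `amount`, `token_in`, `token_out`, `chain`) are hypothetical illustrations, not the model's documented output format.

```python
import json
import re

def extract_intent(completion: str) -> dict:
    """Pull the first JSON object out of a model completion.

    The key names used below are a hypothetical example schema,
    not an output format the model guarantees.
    """
    match = re.search(r"\{.*\}", completion, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in completion")
    return json.loads(match.group(0))

# A completion a fine-tuned intent parser *might* return for
# the prompt "Swap 10 USDC for ETH on Base":
completion = ('Sure! {"action": "swap", "amount": "10", '
              '"token_in": "USDC", "token_out": "ETH", "chain": "base"}')
intent = extract_intent(completion)
print(intent["action"])  # swap
```

A structured intermediate representation like this lets downstream tooling build the actual transaction deterministically instead of trusting free-form model text.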

### Out-of-Scope Use

- Tasks outside the Web3 domain.
- Generating harmful, unethical, or misleading content.

## Bias, Risks, and Limitations

### Recommendations

While the model performs well in Web3-related domains, users should validate its outputs for critical tasks such as smart contract generation or transaction execution to avoid errors. Fine-tuning is recommended for domain-specific applications.
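
One lightweight way to follow that recommendation is to check generated intent JSON against an expected schema before acting on it. The required fields and types below are illustrative assumptions for a swap-style intent, not a schema the model guarantees; adapt them to your application.

```python
def validate_intent(intent: dict) -> list[str]:
    """Return a list of problems found in a generated intent dict.

    REQUIRED_FIELDS is an illustrative assumption, not a schema
    the model guarantees.
    """
    REQUIRED_FIELDS = {"action": str, "amount": str,
                       "token_in": str, "token_out": str}
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in intent:
            problems.append(f"missing field: {field}")
        elif not isinstance(intent[field], expected_type):
            problems.append(f"wrong type for {field}")
    # Reject non-numeric or non-positive amounts before execution.
    if "amount" in intent:
        try:
            if float(intent["amount"]) <= 0:
                problems.append("amount must be positive")
        except ValueError:
            problems.append("amount is not numeric")
    return problems

ok = {"action": "swap", "amount": "10",
      "token_in": "USDC", "token_out": "ETH"}
bad = {"action": "swap", "amount": "-1", "token_in": "USDC"}
print(validate_intent(ok))   # []
print(validate_intent(bad))  # ['missing field: token_out', 'amount must be positive']
```

Rejecting malformed intents here is far cheaper than discovering the error after a transaction has been signed and broadcast.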

## How to Get Started with the Model

To load and use the Brian-3B model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained("brianknowsai/Brian-Llama-3.2-3B")
tokenizer = AutoTokenizer.from_pretrained("brianknowsai/Brian-Llama-3.2-3B")

# Move the model to GPU if one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

input_text = "A web3 bridge is "

# Tokenize the input text
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)

# Generate output (typical usage for causal language models)
with torch.no_grad():
    outputs = model.generate(input_ids, max_length=80, num_return_sequences=1)

# Decode the generated tokens to text
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Print the result
print(f"Input: {input_text}")
print(f"Generated Brian text: {generated_text}")
```