Initialize project; model provided by the ModelHub XC community
Model: wudejian789/MolMetaLM-base Source: Original Platform
26
README.md
Normal file
@@ -0,0 +1,26 @@
---
pipeline_tag: text-generation
tags:
- Molecule Language Model
- Physicochemical Knowledge
---
Refer to https://github.com/CSUBioGroup/MolMetaLM for more details.

# Usage
## Prepare tokenizer and model
```python
from transformers import AutoTokenizer, AutoModel

# Download (or load from the local cache) the tokenizer and model weights
tokenizer = AutoTokenizer.from_pretrained('wudejian789/MolMetaLM-base')
model = AutoModel.from_pretrained('wudejian789/MolMetaLM-base')
```
## Obtain molecular representations from SMILES
```python
smi = "COc1cc2c(cc1OC)CC([NH3+])C2"

# Separate the characters of the SMILES string with spaces before tokenizing
tokenized_smi = tokenizer(" ".join(smi), return_token_type_ids=False,
                          return_tensors='pt', max_length=512,
                          padding='longest', truncation=True)
emb_smi = model(**tokenized_smi).last_hidden_state
print(emb_smi.shape)  # (batch size, sequence length, embedding size)
```
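The snippet above yields one embedding per token. If a single vector per molecule is needed, one common option is masked mean pooling over the sequence axis. The following is a minimal sketch under stated assumptions: this README does not say how the authors pool token embeddings, and `space_separate` and `mean_pool` are hypothetical helper names introduced here for illustration. Dummy tensors stand in for the model output so the sketch runs without downloading weights.

```python
import torch


def space_separate(smiles: str) -> str:
    """Insert a space between every character, as the usage example above does."""
    return " ".join(smiles)


def mean_pool(last_hidden_state: torch.Tensor,
              attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings per sequence, ignoring padded positions.

    Assumption: mean pooling is an illustrative choice, not a method
    confirmed by the MolMetaLM authors.
    """
    # (batch, seq_len) -> (batch, seq_len, 1) so the mask broadcasts over features
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)   # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1.0)          # avoid division by zero
    return summed / counts


# Dummy data: batch of 2 sequences, 3 tokens each, hidden size 2.
# The last token of the first sequence is padding and must be ignored.
hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],
                       [[5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]])
mask = torch.tensor([[1, 1, 0],
                     [1, 1, 1]])

pooled = mean_pool(hidden, mask)
print(pooled.shape)  # (batch size, embedding size)
```

With the real model, the same helper could be applied as `mean_pool(model(**batch).last_hidden_state, batch["attention_mask"])`, where `batch` is the tokenizer output for a list of strings produced by `space_separate`. This assumes the tokenizer returns an `attention_mask`, which is the default behavior for most `transformers` tokenizers.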