Model: kojima-lab/molcrawl-molecule-nat-lang-mol-instructions-gpt2-small
Source: Original Platform
Tokenizer Note
This model was trained with an internal hash-based tokenizer (vocab_size=50002). The tokenizer is not saved in standard HuggingFace format.
For inference, use either a tokenizer with vocab_size=50002 or the CodeLlama tokenizer
(codellama/CodeLlama-7b-hf), which is the intended base.
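
Whichever tokenizer is substituted, its token IDs have to fit inside the model's embedding table. The snippet below is a minimal sketch of that sanity check in plain Python, assuming only the vocab_size=50002 figure stated above. The helper name out_of_range_ids and the sample IDs are hypothetical and are not part of the model's released code.

```python
# Minimal sketch: check that token IDs fit the model's vocabulary.
# VOCAB_SIZE comes from the tokenizer note; everything else is illustrative.
VOCAB_SIZE = 50002


def out_of_range_ids(token_ids, vocab_size=VOCAB_SIZE):
    """Return the token IDs that fall outside the range [0, vocab_size)."""
    return [t for t in token_ids if not 0 <= t < vocab_size]


# Hypothetical IDs, standing in for the output of whatever tokenizer is used.
sample_ids = [518, 29914, 25580, 29162, 2]

bad = out_of_range_ids(sample_ids)
if bad:
    raise ValueError(f"token IDs outside the model vocabulary: {bad}")
```

Running a check like this before calling the model gives a clear error message in place of an index error inside the embedding lookup.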
Special token IDs:
<pad>: 0
<eos>: 2
[/INST] sequence: [518, 29914, 25580, 29162]
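
One way these IDs might be used when post-processing generated output is sketched below: pad a batch with the pad ID, then keep only the tokens that follow the [/INST] sequence, up to the first end-of-sequence token. This is a hedged illustration, not the authors' pipeline. The function names, the right-padding choice, and the example IDs are assumptions; only the three special-token values come from the list above.

```python
# Minimal sketch of special-token handling, using the IDs listed above.
PAD_ID = 0
EOS_ID = 2
INST_END_IDS = [518, 29914, 25580, 29162]  # the [/INST] sequence


def pad_batch(sequences, pad_id=PAD_ID):
    """Right-pad every sequence to the length of the longest one (assumed side)."""
    width = max(len(seq) for seq in sequences)
    return [seq + [pad_id] * (width - len(seq)) for seq in sequences]


def find_subsequence(ids, pattern):
    """Return the index where pattern first occurs in ids, or -1 if absent."""
    for start in range(len(ids) - len(pattern) + 1):
        if ids[start:start + len(pattern)] == pattern:
            return start
    return -1


def response_ids(ids):
    """Return the tokens after [/INST], stopping at the first <eos>."""
    start = find_subsequence(ids, INST_END_IDS)
    if start < 0:
        return []
    tail = ids[start + len(INST_END_IDS):]
    result = []
    for token in tail:
        if token == EOS_ID:
            break
        if token != PAD_ID:
            result.append(token)
    return result


# Hypothetical generated sequence: prompt, [/INST], two response tokens, <eos>, padding.
generated = [1, 518, 29914, 25580, 29162, 100, 200, 2, 0, 0]
answer = response_ids(generated)
batch = pad_batch([[5, 6, 7], [8]])
```

Matching on the full four-ID sequence, not a single ID, avoids false hits when one of those IDs appears elsewhere in the prompt.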