* llama : add option to override tensor buffers
* ggml : fix possible underflow in ggml_nbytes
llama_vocab