models:
  - model: NousResearch/Meta-Llama-3-8B
    # No parameters necessary for base model
  - model: mlabonne/ChimeraLlama-3-8B-v2
    parameters:
      density: 0.33
      weight: 0.2
  - model: nbeerbower/llama-3-stella-8B
    parameters:
      density: 0.44
      weight: 0.4
  - model: uygarkurt/llama-3-merged-linear
    parameters:
      density: 0.55
      weight: 0.4
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: float16
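To give a feel for what `density` and `weight` do under `dare_ties`, here is a toy NumPy sketch of the idea. It is a simplified illustration under stated assumptions, not mergekit's actual implementation: the function name `dare_ties_merge` is hypothetical, the tensors are tiny arrays rather than model weights, and weight normalization and the `int8_mask` option are left out.

```python
import numpy as np

def dare_ties_merge(base, finetuned, densities, weights, seed=0):
    """Toy DARE-TIES merge on plain arrays (hypothetical helper, not mergekit)."""
    rng = np.random.default_rng(seed)
    deltas = []
    for ft, density, weight in zip(finetuned, densities, weights):
        delta = ft - base                          # task vector relative to the base
        keep = rng.random(delta.shape) < density   # DARE: keep each entry with prob. density
        rescaled = np.where(keep, delta / density, 0.0)  # rescale the survivors
        deltas.append(weight * rescaled)
    deltas = np.stack(deltas)

    # TIES-style sign election: the sign of the summed weighted deltas wins.
    elected_sign = np.sign(deltas.sum(axis=0))
    agree = np.sign(deltas) == elected_sign

    # Only deltas that agree with the elected sign contribute to the merge.
    merged_delta = (deltas * agree).sum(axis=0)
    return base + merged_delta

# Toy usage with the densities and weights from the config above.
base = np.zeros(8)
models = [np.full(8, 1.0), np.full(8, -0.5), np.full(8, 2.0)]
merged = dare_ties_merge(base, models, [0.33, 0.44, 0.55], [0.2, 0.4, 0.4])
```

A lower `density` drops more of a model's delta before rescaling, and `weight` scales how much the surviving delta contributes.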
🗨️ Chats
💻 Usage
# Notebook cell: install the dependencies.
!pip install -qU transformers accelerate bitsandbytes

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, BitsAndBytesConfig
import torch

# Load the weights in 4-bit NF4 with double quantization; compute in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

MODEL_NAME = 'Kukedlc/NeuralLLaMa-3-8b-DT-v0.1'
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map='cuda:0', quantization_config=bnb_config)

prompt_system = (
    "You are an advanced language model that speaks Spanish fluently, clearly, and precisely. "
    "You are called Roberto the Robot and you are an aspiring post-modern artist."
)
prompt = (
    "Create a piece of art that represents how you see yourself, Roberto, as an advanced LLM, "
    "with ASCII art, mixing diagrams, engineering and let yourself go."
)

chat = [
    {"role": "system", "content": prompt_system},
    {"role": "user", "content": prompt},
]

# Render the conversation as a prompt string that ends with the assistant header.
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(chat, return_tensors="pt").to('cuda')
streamer = TextStreamer(tokenizer)

# Look up the end-of-turn token ID directly. tokenizer.encode(stop_token)[0] would
# return the BOS token that the Llama 3 tokenizer prepends, not <|eot_id|>.
stop_token = "<|eot_id|>"
stop = tokenizer.convert_tokens_to_ids(stop_token)

_ = model.generate(
    **inputs,
    streamer=streamer,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.2,
    top_p=0.9,
    eos_token_id=stop,
)
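To see why `<|eot_id|>` is used as the stop token, it helps to look at the prompt string the chat template produces. The sketch below rebuilds that string by hand, assuming this model inherits the standard Llama 3 Instruct template; the helper `format_llama3_chat` is hypothetical, and `tokenizer.apply_chat_template` remains the reliable way to build the prompt.

```python
def format_llama3_chat(messages, add_generation_prompt=True):
    """Approximate the standard Llama 3 Instruct prompt layout (assumed template)."""
    text = "<|begin_of_text|>"
    for message in messages:
        # Each turn is a role header, a blank line, the content, then <|eot_id|>.
        text += f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
        text += f"{message['content'].strip()}<|eot_id|>"
    if add_generation_prompt:
        # Open an assistant turn; the model is expected to close it with <|eot_id|>.
        text += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return text

example = format_llama3_chat([
    {"role": "system", "content": "You are Roberto the Robot."},
    {"role": "user", "content": "Draw yourself in ASCII art."},
])
print(example)
```

Every turn ends with `<|eot_id|>`, so passing its ID as `eos_token_id` stops generation once the assistant turn is complete.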