--- base_model: unsloth/qwen2-0.5b-bnb-4bit language: - en license: apache-2.0 tags: - text-generation-inference - transformers - unsloth - qwen2 - trl - sft --- # 1 - Question : alpaca_prompt = Copied from above FastLanguageModel.for_inference(model) # Enable native 2x faster inference inputs = tokenizer( [ alpaca_prompt.format( "Continue the fibonnaci sequence.", # instruction "1, 1, 2, 3, 5, 8", # input "", # output - leave this blank for generation! ) ], return_tensors = "pt").to("cuda") outputs = model.generate(**inputs, max_new_tokens = 128, use_cache = True) tokenizer.batch_decode(outputs) # 1 - Answer : ['Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Input:\nContinue the fibonnaci sequence.\n\n### Output:\n1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811'] # 2 - Question : alpaca_prompt = Copied from above FastLanguageModel.for_inference(model) # Enable native 2x faster inference inputs = tokenizer( [ alpaca_prompt.format( "What is fibonacci sequence?", # instruction "", # input "", # output - leave this blank for generation! ) ], return_tensors = "pt").to("cuda") from transformers import TextStreamer text_streamer = TextStreamer(tokenizer) _ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128) # 2 - Answer : Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. ### Input: What is fibonacci sequence? ### Output: The Fibonacci sequence is a series of numbers in which each number is the sum of the two preceding ones, usually starting with 0 and 1. The sequence goes like this: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 1 # 3 - Question : if False: from unsloth import FastLanguageModel model, tokenizer = FastLanguageModel.from_pretrained( model_name = "lora_model", # YOUR MODEL YOU USED FOR TRAINING max_seq_length = max_seq_length, dtype = dtype, load_in_4bit = load_in_4bit, ) FastLanguageModel.for_inference(model) # Enable native 2x faster inference alpaca_prompt = You MUST copy from above! inputs = tokenizer( [ alpaca_prompt.format( "I need train a AI offline on my computer, give me a code good for this case.", # instruction "", # input "", # output - leave this blank for generation! ) ], return_tensors = "pt").to("cuda") from transformers import TextStreamer text_streamer = TextStreamer(tokenizer) _ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 4096) # 3 - Answer : Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. ### Input: I need train a AI offline on my computer, give me a code good for this case. ### Output: Sure, here's a simple example of how you can train an AI model on a computer using Python and TensorFlow: ```python import tensorflow as tf from tensorflow import keras from tensorflow.keras import layers # Define the model model = tf.keras.Sequential([ layers.Dense(64, activation='relu', input_shape=(100,)), layers.Dense(64, activation='relu'), layers.Dense(1) ]) # Compile the model model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mean_absolute_error']) # Train the model model.fit(X_train, y_train, epochs=100, batch_size=32) # Evaluate the model model.evaluate(X_test, y_test) ``` In this example, we are using the Keras library to create a sequential model. The model consists of two dense layers with ReLU activation. The first layer has 64 units and the second layer has 64 units. The output layer has 1 unit. The mean squared error is used as the loss function, and the mean absolute error is used as the metric for evaluation. The `adam` optimizer is used for training, and the `mean_squared_error` metric is used for evaluation. Please note that this is a very simple example and you may need to adjust the model architecture, number of layers, number of units, and other parameters depending on your specific use case.<|endoftext|> # Uploaded model - **Developed by:** Ramikan-BR - **License:** apache-2.0 - **Finetuned from model :** unsloth/qwen2-0.5b-bnb-4bit This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library. [](https://github.com/unslothai/unsloth)