--- license: Apache License 2.0 #model-type: ##如 gpt、phi、llama、chatglm、baichuan 等 #- gpt #domain: ##如 nlp、cv、audio、multi-modal #- nlp #language: ##语言代码列表 https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa #- cn #metrics: ##如 CIDEr、Blue、ROUGE 等 #- CIDEr #tags: ##各种自定义,包括 pretrained、fine-tuned、instruction-tuned、RL-tuned 等训练方法和其他 #- pretrained #tools: ##如 vllm、fastchat、llamacpp、AdaSeq 等 #- vllm language: - zh tasks: - text-generation frameworks: PyTorch base_model: - Qwen/Qwen2.5-1.5B-Instruct --- ### 本模型是用数据集FoolBird/GB50016-2014对Qwen2.5-1.5B-Instruct进行预训练,训练轮数250轮 ### 数据集地址https://modelscope.cn/datasets/FoolBird/GB50016-2014 ### 本模型仅供学习使用 #### 您可以通过如下git clone命令,或者ModelScope SDK来下载模型 SDK下载 ```bash #安装ModelScope pip install modelscope ``` ```python #SDK模型下载 from modelscope import snapshot_download model_dir = snapshot_download('FoolBird/Qwen-2.5-1.5b-instruct-JZFH') ``` Git下载 ``` #Git模型下载 git clone https://www.modelscope.cn/FoolBird/Qwen-2.5-1.5b-instruct-JZFH.git ``` 使用本模型进行推理 ```python # 使用本模型进行推理 from vllm import LLM, SamplingParams import os from transformers import AutoModelForCausalLM, AutoTokenizer from modelscope import snapshot_download import torch import time # 调用 ModelScope 模型 model_id = 'FoolBird/Qwen-2.5-1.5b-instruct-JZFH' # ModelScope 上的模型 ID # 初始化全局变量 llm = None tokenizer = None sampling_params = None # 下载 ModelScope 模型 def download_modelscope_model(model_id): # 下载模型并返回本地路径 model_path = snapshot_download(model_id) return model_path # 初始化模型和 tokenizer def qwen_vllm(model_path): # 设置 CUDA 设备为 GPU 2 os.environ["CUDA_VISIBLE_DEVICES"] = "0" global llm, tokenizer sum = 1 # 设置 CUDA 设备为 GPU 2 (在设置的上下文中,它是索引0) torch.cuda.set_device(0) # 加载模型 llm = LLM(model=model_path, tensor_parallel_size=sum, dtype=torch.float16, enforce_eager=True, gpu_memory_utilization=0.8, max_model_len=1024) tokenizer = AutoTokenizer.from_pretrained(model_path) text = "qwen_vllm加载完毕" print(text) return text # 使用 VLLM 进行推理 def qwen2_5_inference(info): messages = [ {"role": "system", "content":' You are Qwen, created by Alibaba Cloud. You are a helpful assistant'}, {"role": "user", "content": info} ] # 聊天模板的消息 text = tokenizer.apply_chat_template( messages, tokenize=False, add_generation_prompt=True ) # 清理未使用的显存 torch.cuda.empty_cache() global sampling_params sampling_params = SamplingParams(temperature=0.7, top_p=0.8, repetition_penalty=1.05, max_tokens=512) outputs = llm.generate([text], sampling_params) # 流式输出结果 for output in outputs: generated_text = output.outputs[0].text for char in generated_text: print(char, end='', flush=True) # 流式输出每个字符 time.sleep(0.05) if __name__ == '__main__': model_path = download_modelscope_model(model_id) qwen_vllm(model_path) while True: user_input = input("请输入您的问题: ") qwen2_5_inference(user_input) print(" ") # 清理未使用的显存 torch.cuda.empty_cache() ```