Update README.md

2025-01-15 22:16:41 +08:00
parent 8f7e9112ec
commit e151ee4b2f
12 changed files with 192 additions and 63 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -1,47 +1,44 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bin.* filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zstandard filter=lfs diff=lfs merge=lfs -text
 *.tfevents* filter=lfs diff=lfs merge=lfs -text
 *.db* filter=lfs diff=lfs merge=lfs -text
 *.ark* filter=lfs diff=lfs merge=lfs -text
 **/*ckpt*data* filter=lfs diff=lfs merge=lfs -text
 **/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text
 **/*ckpt*.index filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.gguf* filter=lfs diff=lfs merge=lfs -text
 *.ggml filter=lfs diff=lfs merge=lfs -text
 *.llamafile* filter=lfs diff=lfs merge=lfs -text
 *.pt2 filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 internlm3-8b-instruct-q2_k.gguf filter=lfs diff=lfs merge=lfs -text
 internlm3-8b-instruct-q3_k_m.gguf filter=lfs diff=lfs merge=lfs -text
 internlm3-8b-instruct-q4_0.gguf filter=lfs diff=lfs merge=lfs -text
 internlm3-8b-instruct-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
 internlm3-8b-instruct-q5_0.gguf filter=lfs diff=lfs merge=lfs -text
 internlm3-8b-instruct-q5_k_m.gguf filter=lfs diff=lfs merge=lfs -text
 internlm3-8b-instruct-q6_k.gguf filter=lfs diff=lfs merge=lfs -text
 internlm3-8b-instruct-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
 internlm3-8b-instruct.gguf filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -1,47 +1,151 @@
 ---
-license: Apache License 2.0
+license: apache-2.0
-
+language:
-#model-type:
+- en
-## e.g. gpt, phi, llama, chatglm, baichuan
+pipeline_tag: text-generation
-#- gpt
+tags:
-
+- chat
 #domain:
 ## e.g. nlp, cv, audio, multi-modal
 #- nlp
 #language:
 ## list of language codes: https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa
 #- cn 
 #metrics:
 ## e.g. CIDEr, BLEU, ROUGE
 #- CIDEr
 #tags:
 ## custom tags, including training methods such as pretrained, fine-tuned, instruction-tuned, RL-tuned, and others
 #- pretrained
 #tools:
 ## e.g. vllm, fastchat, llamacpp, AdaSeq
 #- vllm
 ---
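-### The contributor of this model has not provided a more detailed model description. Model files and weights are available on the "Model Files" page.
+# InternLM3-8B-Instruct GGUF Model
 #### You can download the model with the following git clone command or with the ModelScope SDK
-SDK download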
+## Introduction
-```bash
+
-# Install ModelScope
+The `internlm3-8b-instruct` model in GGUF format can be utilized by [llama.cpp](https://github.com/ggerganov/llama.cpp), a highly popular open-source framework for Large Language Model (LLM) inference, across a variety of hardware platforms, both locally and in the cloud.
-pip install modelscope
+This repository offers `internlm3-8b-instruct` models in GGUF format in both half precision and various low-bit quantized versions, including `q5_0`, `q5_k_m`, `q6_k`, and `q8_0`.
 The following sections first cover installation, then model download, and finally model inference and service deployment through specific examples.
 ## Installation
 We recommend building `llama.cpp` from source. The following code snippet provides an example for the Linux CUDA platform. For instructions on other platforms, please refer to the [official guide](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#build).
 - Step 1: create a conda environment and install cmake
 ```shell
 conda create --name internlm3 python=3.10 -y
 conda activate internlm3
 pip install cmake
 ```
 - Step 2: clone the source code and build the project 
 ```shell
 git clone --depth=1 https://github.com/ggerganov/llama.cpp.git
 cd llama.cpp
 cmake -B build -DGGML_CUDA=ON
 cmake --build build --config Release -j
 ```
 All the built targets can be found in the subdirectory `build/bin`.
 In the following sections, we assume that the working directory is at the root directory of `llama.cpp`.
 ## Download models
 In the [introduction section](#introduction), we mentioned that this repository includes several models with varying levels of computational precision. You can download the appropriate model based on your requirements.
 For instance, the half-precision model `internlm3-8b-instruct.gguf` can be downloaded as follows:
 ```shell
 pip install huggingface-hub
 huggingface-cli download internlm/internlm3-8b-instruct-gguf internlm3-8b-instruct.gguf --local-dir . --local-dir-use-symlinks False
 ```
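The same download can also be scripted from Python. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the file names follow the pattern listed in this repository's `.gitattributes`; the helper names `gguf_filename` and `download` are hypothetical and introduced only for illustration.

```python
# Repository id taken from the huggingface-cli command above.
REPO_ID = "internlm/internlm3-8b-instruct-gguf"

# Quantization suffixes as they appear in this repository's .gitattributes.
QUANTS = ("q2_k", "q3_k_m", "q4_0", "q4_k_m", "q5_0", "q5_k_m", "q6_k", "q8_0")


def gguf_filename(quant=None):
    """Return the GGUF file name for a quantization, or the half-precision file if None."""
    if quant is None:
        return "internlm3-8b-instruct.gguf"
    if quant not in QUANTS:
        raise ValueError(f"unknown quantization: {quant!r}")
    return f"internlm3-8b-instruct-{quant}.gguf"


def download(quant=None, local_dir="."):
    """Download one GGUF file into local_dir and return its local path."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=gguf_filename(quant),
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # Example: fetch the 4-bit k-quant variant into the current directory.
    print(download("q4_k_m"))
```

Lower-bit variants are smaller on disk (the `q2_k` file is about 3.5 GB, the half-precision file about 17.6 GB), so choosing the quantization before downloading saves both time and disk space.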
 ## Inference
 You can use `llama-cli` to run inference. For a detailed explanation of `llama-cli`, please refer to [this guide](https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md).
 ### Chat example
 ```shell
 build/bin/llama-cli \
    --model internlm3-8b-instruct.gguf  \
    --predict 512 \
    --ctx-size 4096 \
    --gpu-layers 48 \
    --temp 0.8 \
    --top-p 0.8 \
    --top-k 50 \
    --seed 1024 \
    --color \
    --prompt "<|im_start|>system\nYou are an AI assistant whose name is InternLM (书生·浦语).\n- InternLM (书生·浦语) is a conversational language model that is developed by Shanghai AI Laboratory (上海人工智能实验室). It is designed to be helpful, honest, and harmless.\n- InternLM (书生·浦语) can understand and communicate fluently in the language chosen by the user such as English and 中文.<|im_end|>\n" \
    --interactive \
    --multiline-input \
    --conversation \
    --verbose \
    --logdir workdir/logdir \
    --in-prefix "<|im_start|>user\n" \
    --in-suffix "<|im_end|>\n<|im_start|>assistant\n"
 ```
 ### Function call example
 `llama-cli` example:
 ```shell
 build/bin/llama-cli \
    --model internlm3-8b-instruct.gguf \
    --predict 512 \
    --ctx-size 4096 \
    --gpu-layers 48 \
    --temp 0.8 \
    --top-p 0.8 \
    --top-k 50 \
    --seed 1024 \
    --color \
    --prompt '<|im_start|>system\nYou are InternLM-Chat, a harmless AI assistant.<|im_end|>\n<|im_start|>system name=<|plugin|>[{"name": "get_current_weather", "parameters": {"required": ["location"], "type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "unit": {"type": "string"}}}, "description": "Get the current weather in a given location"}]<|im_end|>\n<|im_start|>user\n' \
    --interactive \
    --multiline-input \
    --conversation \
    --verbose \
    --in-suffix "<|im_end|>\n<|im_start|>assistant\n" \
    --special
 ```
 Conversation results:
 ```text
 <s><|im_start|>system
 You are InternLM-Chat, a harmless AI assistant.<|im_end|>
 <|im_start|>system name=<|plugin|>[{"name": "get_current_weather", "parameters": {"required": ["location"], "type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "unit": {"type": "string"}}}, "description": "Get the current weather in a given location"}]<|im_end|>
 <|im_start|>user
 > I want to know today's weather in Shanghai
 I need to use the get_current_weather function to get the current weather in Shanghai.<|action_start|><|plugin|>
 {"name": "get_current_weather", "parameters": {"location": "Shanghai"}}<|action_end|>32
 <|im_end|>
 > <|im_start|>environment name=<|plugin|>\n{"temperature": 22}
 The current temperature in Shanghai is 22 degrees Celsius.<|im_end|>
 > 
 ```
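In the transcript above, the model wraps its tool request between `<|action_start|><|plugin|>` and `<|action_end|>`, and the tool result is fed back as an `environment` turn. A minimal sketch of how a client might parse and answer such a call is shown below; the function names are hypothetical, and the exact whitespace around the markers is an assumption based only on this one transcript.

```python
import json
import re

# Matches the JSON payload between the plugin action markers seen in the transcript.
ACTION_RE = re.compile(
    r"<\|action_start\|><\|plugin\|>(.*?)<\|action_end\|>", re.DOTALL
)


def extract_tool_call(text):
    """Return (name, parameters) for the first tool call in text, or None."""
    match = ACTION_RE.search(text)
    if match is None:
        return None
    call = json.loads(match.group(1).strip())
    return call["name"], call.get("parameters", {})


def format_tool_result(result):
    """Format a tool result as the environment turn used in the transcript."""
    # Assumption: a real newline separates the header from the JSON body.
    return "<|im_start|>environment name=<|plugin|>\n" + json.dumps(result)


if __name__ == "__main__":
    output = (
        "I need to use the get_current_weather function.<|action_start|><|plugin|>\n"
        '{"name": "get_current_weather", "parameters": {"location": "Shanghai"}}'
        "<|action_end|>"
    )
    print(extract_tool_call(output))
    print(format_tool_result({"temperature": 22}))
```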
 ## Serving
 `llama.cpp` provides an OpenAI API-compatible server, `llama-server`. You can deploy `internlm3-8b-instruct.gguf` as a service like this:
 ```shell
 ./build/bin/llama-server -m ./internlm3-8b-instruct.gguf -ngl 48
 ```
 On the client side, you can access the service through the OpenAI API:
 ```python
-# Download the model via the SDK
+from openai import OpenAI
-from modelscope import snapshot_download
+client = OpenAI(
-model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm3-8b-instruct-gguf')
+    api_key='YOUR_API_KEY',
     base_url='http://localhost:8080/v1'
 )
 # Use the id of the first model reported by the server
 model_name = client.models.list().data[0].id
 response = client.chat.completions.create(
     model=model_name,
     messages=[
         {"role": "system", "content": "You are a helpful assistant."},
         {"role": "user", "content": "Provide three suggestions about time management"},
     ],
     temperature=0.8,
     top_p=0.8
 )
 print(response)
 ```
 Git download
 ```
 # Download the model with Git
 git clone https://www.modelscope.cn/Shanghai_AI_Laboratory/internlm3-8b-instruct-gguf.git
 ```
 <p style="color: lightgrey;">If you are a contributor to this model, we invite you to complete the model card promptly, following the <a href="https://modelscope.cn/docs/ModelScope%E6%A8%A1%E5%9E%8B%E6%8E%A5%E5%85%A5%E6%B5%81%E7%A8%8B%E6%A6%82%E8%A7%88" style="color: lightgrey; text-decoration: underline;">model contribution documentation</a>.</p>
--- a/configuration.json
+++ b/configuration.json
@@ -0,0 +1 @@
 {"framework": "pytorch", "task": "text-generation", "allow_remote": true}
--- a/internlm3-8b-instruct-q2_k.gguf
+++ b/internlm3-8b-instruct-q2_k.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:bc3ae670c7f8e74b69b6ba4d7c4dc9e7ad123b42215f176200493c09b627354b
 size 3450641600
--- a/internlm3-8b-instruct-q3_k_m.gguf
+++ b/internlm3-8b-instruct-q3_k_m.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:37edd5c2afdd3c11d5b7205a453e9cf054b97ede78706fd0bbb4433b0ac3ec0d
 size 4390280384
--- a/internlm3-8b-instruct-q4_0.gguf
+++ b/internlm3-8b-instruct-q4_0.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:dd38d7164071f9ca559e2e099a9a63f3ee9c0a4d7b5e81067349a25e41a7c915
 size 5092613312
--- a/internlm3-8b-instruct-q4_k_m.gguf
+++ b/internlm3-8b-instruct-q4_k_m.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:e7b10f95f20a5c5a8e6213925c88bc6b02012e41c0c7d7da0b0788c528c0e010
 size 5358623936
--- a/internlm3-8b-instruct-q5_0.gguf
+++ b/internlm3-8b-instruct-q5_0.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:7c1d3b59f10bb3dc940d137f605d69e23c8ed8944db1201eab0620bf2df02d08
 size 6127295680
--- a/internlm3-8b-instruct-q5_k_m.gguf
+++ b/internlm3-8b-instruct-q5_k_m.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:137045fa5d9504762ebdb3029a3780fafc2c2599fd77f99ed71b5124a9dae325
 size 6264331456
--- a/internlm3-8b-instruct-q6_k.gguf
+++ b/internlm3-8b-instruct-q6_k.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:fa2231958b48cca2a53ad98cc0c4caa354845bbc4db14f2c7eda7981f434fab6
 size 7226645696
--- a/internlm3-8b-instruct-q8_0.gguf
+++ b/internlm3-8b-instruct-q8_0.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:4b6cd620c74ff56aa465d31f30b9e54a0a73defe18dd979de84e82d6ef54174b
 size 9358826688
--- a/internlm3-8b-instruct.gguf
+++ b/internlm3-8b-instruct.gguf
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:a75c3ca0b4feefe73223751301f67fe73caf5f2a08d2c8a7d7f4d914fc207a6a
 size 17612430528
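Each `.gguf` entry above is a Git LFS pointer that records the SHA-256 digest and byte size of the real file. A downloaded model can be checked against its pointer; the sketch below uses only the Python standard library, and the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(
        line.strip().split(" ", 1) for line in text.splitlines() if line.strip()
    )
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at path has the size and digest the pointer records."""
    # Cheap size check first, so a truncated download fails without hashing.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Hash in chunks because the model files are several gigabytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:bc3ae670c7f8e74b69b6ba4d7c4dc9e7ad123b42215f176200493c09b627354b\n"
        "size 3450641600\n"
    )
    path = "internlm3-8b-instruct-q2_k.gguf"
    if os.path.exists(path):
        print(matches_pointer(path, pointer))
```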