Index-1.9B-Chat-GGUF/README_zh.md
2025-03-28 07:03:59 +00:00


---
license: other
license_name: license
license_link: LICENSE
---
<div align="center">
<h1>
Index-1.9B-Chat-GGUF
</h1>
</div>
This is the GGUF version of [Index-1.9B-Chat](https://huggingface.co/IndexTeam/Index-1.9B-Chat), adapted for llama.cpp; an Ollama ModelFile is also provided.
For more details, see our [GitHub](https://github.com/bilibili/Index-1.9B) and the [Index-1.9B technical report](https://github.com/bilibili/Index-1.9B/blob/main/Index-1.9B%20%E6%8A%80%E6%9C%AF%E6%8A%A5%E5%91%8A.pdf).
### LLAMA.CPP
```shell
# Install llama.cpp (https://github.com/ggerganov/llama.cpp)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
# Install llama-cpp-python (https://github.com/abetlen/llama-cpp-python)
pip install llama-cpp-python
```
Interactive use with llama.cpp in the terminal:
```shell
./build/bin/llama-cli -m models/Index-1.9B-Chat/ggml-model-bf16.gguf --color -if
```
**Note!** llama.cpp does not support custom chat templates, so you must assemble the prompt yourself. Index-1.9B's chat template is (schematically):
```shell
# The three separators are <unk> (token_id=0), reserved_0 (token_id=3), reserved_1 (token_id=4)
[<unk>]system_message[reserved_0]user_message[reserved_1]response
```
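Assembling the template above by hand can be sketched in Python. Note that the literal separator strings used here are an assumption based on the schematic; verify the actual surface forms against the model's tokenizer (token ids 0, 3, and 4) before relying on this.

```python
# Minimal sketch: build an Index-1.9B prompt by concatenating the three
# separators around the messages, following the schematic template above.
# ASSUMPTION: "<unk>", "reserved_0", "reserved_1" are used here as literal
# strings; check the tokenizer (token ids 0, 3, 4) for the real surface forms.

def build_prompt(system_message: str, user_message: str) -> str:
    """Return the raw prompt string up to the point where the model responds."""
    return f"<unk>{system_message}reserved_0{user_message}reserved_1"

print(build_prompt("You are Index.", "Hello!"))
# -> <unk>You are Index.reserved_0Hello!reserved_1
```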
With llama-cpp-python, the custom chat template is supported (it is already embedded in the GGUF and works out of the box):
```python
from llama_cpp import Llama

model_path = "Index-1.9B-Chat-GGUF/ggml-model-Q6_K.gguf"
llm = Llama(model_path=model_path, verbose=True)
output = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "你是由哔哩哔哩自主研发的大语言模型，名为“Index”。你能够根据用户传入的信息，帮助用户完成指定的任务，并生成恰当的、符合要求的回复。"},
        # {"role": "system", "content": "你需要扮演B站评论区老哥，用评论区阴阳怪气的话术回复，不要说你是AI"},
        {"role": "user", "content": "篮球和鸡有什么关系"},
    ]
)
print(output)
```
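`create_chat_completion` returns an OpenAI-style completion dict, so the reply text sits under `choices[0].message.content`. A small illustrative helper (the stub dict below mimics the response shape rather than calling the model):

```python
def extract_reply(output: dict) -> str:
    # llama-cpp-python returns an OpenAI-style chat completion dict
    return output["choices"][0]["message"]["content"]

# Stub shaped like a real create_chat_completion response:
sample = {"choices": [{"message": {"role": "assistant", "content": "hello"}}]}
print(extract_reply(sample))  # -> hello
```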
### OLLAMA
- Install [Ollama](https://github.com/ollama/ollama)
```shell
curl -fsSL https://ollama.com/install.sh | sh
```
```shell
# Start the server
ollama serve
# Create the model; the model file and System Message can be edited in OllamaModelFile
ollama create Index-1.9B-Chat -f Index-1.9B-Chat-GGUF/OllamaModelFile
# Start an interactive terminal session
ollama run Index-1.9B-Chat
# Web calls can set the System Message dynamically
curl http://localhost:11434/api/chat -d '{
"model": "Index-1.9B-Chat",
"messages": [
{ "role": "system", "content": "你是由哔哩哔哩自主研发的大语言模型名为“Index”。你能够根据用户传入的信息帮助用户完成指定的任务并生成恰当的、符合要求的回复。" },
{ "role": "user", "content": "续写 金坷垃" }
]
}'
```
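The same request can be issued from Python using only the standard library. This is a sketch under two assumptions: Ollama is listening on its default port 11434, and `"stream": false` is set so the server returns a single JSON object instead of a stream. The network call is kept under the main guard so the payload builder can be used on its own.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumed default Ollama port

def build_chat_payload(model: str, system: str, user: str) -> bytes:
    """Serialize an Ollama /api/chat request body, matching the curl call above."""
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "stream": False,  # one JSON response object instead of a stream
    }
    return json.dumps(body, ensure_ascii=False).encode("utf-8")

if __name__ == "__main__":
    # Only works while `ollama serve` is running with the model created above.
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_chat_payload("Index-1.9B-Chat", "You are Index.", "Hello"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["message"]["content"])
```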