ModelHub XC 80e9c175f6 Initialize project; model provided by the ModelHub XC community

Model: shaowenchen/longchat-13b-16k-gguf
Source: Original Platform

2026-05-16 01:02:38 +08:00

inference, language, license, model_creator, model_link, model_name, model_type, pipeline_tag, quantized_by, tasks, tags

Provided files

Usage with a local GGUF file (replace /path/to/models and gguf-model-name.gguf with your own values):

docker run --rm -it -p 8000:8000 -v /path/to/models:/models -e MODEL=/models/gguf-model-name.gguf hubimage/llama-cpp-python:latest

Then open http://localhost:8000/docs to view the Swagger UI.
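
Once the container is up, the server can also be queried from the command line. The following is a minimal sketch, assuming the image serves llama-cpp-python's OpenAI-compatible `/v1/completions` route on the published port 8000. The helper names `build_payload` and `complete`, the example prompt, and the `max_tokens` value are illustrative and not part of the image.

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with the image: build a completion
# request body and POST it to the server started by the docker command above.

BASE_URL="${BASE_URL:-http://localhost:8000}"   # port published with -p 8000:8000

# Print a JSON body for /v1/completions.
# NOTE: the prompt is inserted as-is, so it must not contain quotes or backslashes.
build_payload() {
    printf '{"prompt": "%s", "max_tokens": %d}' "$1" "$2"
}

# Send the request; requires curl and a running server.
complete() {
    curl -sS "$BASE_URL/v1/completions" \
        -H "Content-Type: application/json" \
        -d "$(build_payload "$1" "$2")"
}

# Example (uncomment once the container is running):
# complete "What is a GGUF file?" 64
```

For prompts containing quotes or newlines, a JSON-aware tool is a safer way to build the body than `printf`.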

| Name | Quant method | Compressed Size |
| --- | --- | --- |
| `shaowenchen/longchat-13b-16k-gguf:Q2_K` | Q2_K | 7.47 GB |
| `shaowenchen/longchat-13b-16k-gguf:Q3_K` | Q3_K | 6.11 GB |
| `shaowenchen/longchat-13b-16k-gguf:Q4_K` | Q4_K | 5.29 GB |

Usage with a prebuilt image from the table above:

docker run --rm -p 8000:8000 shaowenchen/longchat-13b-16k-gguf:Q2_K

Then open http://localhost:8000/docs to view the Swagger UI.
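
A 13B model can take a while to load, so the port may not answer immediately after `docker run`. The sketch below polls the server until it responds, assuming the `/v1/models` route of llama-cpp-python's OpenAI-compatible API is served. The function name `wait_for_server` and the retry values are hypothetical choices for illustration.

```shell
#!/bin/sh
# Hypothetical helper: poll the server until it answers or the attempts run out.
# Usage: wait_for_server BASE_URL TRIES DELAY_SECONDS
wait_for_server() {
    url="$1"; tries="$2"; delay="$3"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f makes curl fail on HTTP errors; /v1/models is assumed to be served.
        if curl -fsS --max-time 2 -o /dev/null "$url/v1/models" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1   # server never answered within TRIES attempts
}

# Example: start the container detached, then wait up to about a minute.
# docker run --rm -d -p 8000:8000 shaowenchen/longchat-13b-16k-gguf:Q2_K
# wait_for_server http://localhost:8000 30 2 && echo "server ready"
```

The function returns a non-zero status when the server never answers, so it can gate later steps in a script with `&&`.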