Initialize project; model provided by the ModelHub XC community

Model: RichardErkhov/dfurman_-_LLaMA-7B-gguf
Source: Original Platform
ModelHub XC
2026-06-04 04:48:15 +08:00
commit 6c3b8a0dd1
24 changed files with 279 additions and 0 deletions

57
.gitattributes vendored Normal file

@@ -0,0 +1,57 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.IQ3_XS.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.IQ3_S.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.IQ3_M.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q3_K.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.IQ4_XS.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.IQ4_NL.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q4_K.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q4_1.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q5_0.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q5_K.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q5_1.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
LLaMA-7B.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text

3
LLaMA-7B.IQ3_M.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4003f8d7424c12a6ac056873a82626f5daf18a8753f5ca38bf3d9a937b76d3b0
size 3114864384

3
LLaMA-7B.IQ3_S.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d1b6f9986fc4ef88e4ef8b715ef3ea43a76d74fb4bf687ff69f9e938ab6d0852
size 2948304640

3
LLaMA-7B.IQ3_XS.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a273950488dbcdeb277fa4192f749ba0c2478e1eb354a095ff2b6137a2b6491d
size 2796523264

3
LLaMA-7B.IQ4_NL.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:afa07ed4b603f53cbc841652f51205fc4ce8ea57d40e2f14dad5aff60472926a
size 3848351488

3
LLaMA-7B.IQ4_XS.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bcd37fc40da6aae80e1b42045dbae8cf09a965007df9a64befea5b6d8ba60aee
size 3647516416

3
LLaMA-7B.Q2_K.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e752ddfb69645153ede851cd47304f26ec9fb0c5e5044598adbe650ea00b7658
size 2532863744

3
LLaMA-7B.Q3_K.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9ba4df7884ee7a20ffbe44f52a956e37fe030a0d5cdb119e7b84a5705dc0bc64
size 3298004736

3
LLaMA-7B.Q3_K_L.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1eb3b630ce0b40f0b6eebeb12bb54a62e280841a0a003011852b52cb0073a8e3
size 3597111040

3
LLaMA-7B.Q3_K_M.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9ba4df7884ee7a20ffbe44f52a956e37fe030a0d5cdb119e7b84a5705dc0bc64
size 3298004736

3
LLaMA-7B.Q3_K_S.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:119de0dc4279b8daf0b187d6df92bb78760b8065dec7677888a81991d5aedb8b
size 2948304640

3
LLaMA-7B.Q4_0.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:42525e12bada47c168668fbb5a4f36d1ab32205b272053ddc07e0e565b061038
size 3825807104

3
LLaMA-7B.Q4_1.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f963b38ac74ba224cfe2405a4c6b02f0a5bed94057099db06bbaed4ff09d1fcf
size 4238749440

3
LLaMA-7B.Q4_K.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aa65152b9d72e4a79206616ea8ae7ffc634211170a0c05a36599364c709d8f03
size 4081004288

3
LLaMA-7B.Q4_K_M.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aa65152b9d72e4a79206616ea8ae7ffc634211170a0c05a36599364c709d8f03
size 4081004288

3
LLaMA-7B.Q4_K_S.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7b41672a8bb765be90ee6751da5dae205fb634145cfc79a783b34f0f1da9f4f6
size 3856740096

3
LLaMA-7B.Q5_0.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5b9a4d98ba3d7f1e014e2500b6ae258ace784571626471f554c4b0c19bc98ea7
size 4651691776

3
LLaMA-7B.Q5_1.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:91127ffd3fd3f7937ede0c76fbbdfd2b42d909a4a68c00ca706c76ab0ed30b4d
size 5064634112

3
LLaMA-7B.Q5_K.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bc7b18a55ffe12970cdebe6340b670fd161bfaadd52b527977a85d1c0a40f682
size 4783156992

3
LLaMA-7B.Q5_K_M.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bc7b18a55ffe12970cdebe6340b670fd161bfaadd52b527977a85d1c0a40f682
size 4783156992

3
LLaMA-7B.Q5_K_S.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fa87ae98216c72198bc059fe4ca808c5b6502513b594e9e88186ddce0f762e29
size 4651691776

3
LLaMA-7B.Q6_K.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f329367d1d9ecc26c6c23c44334384163225ac07bedf692b6fedb9ec257ae73e
size 5529194240

3
LLaMA-7B.Q8_0.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:65a33550755426363362cc66a9bd0f2d094faaff74dd1a8a3fed181d5c25b047
size 7161089792

156
README.md Normal file

@@ -0,0 +1,156 @@
Quantization made by Richard Erkhov.
[Github](https://github.com/RichardErkhov)
[Discord](https://discord.gg/pvy7H8DZMG)
[Request more models](https://github.com/RichardErkhov/quant_request)
LLaMA-7B - GGUF
- Model creator: https://huggingface.co/dfurman/
- Original model: https://huggingface.co/dfurman/LLaMA-7B/
| Name | Quant method | Size |
| ---- | ---- | ---- |
| [LLaMA-7B.Q2_K.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q2_K.gguf) | Q2_K | 2.36GB |
| [LLaMA-7B.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.IQ3_XS.gguf) | IQ3_XS | 2.6GB |
| [LLaMA-7B.IQ3_S.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.IQ3_S.gguf) | IQ3_S | 2.75GB |
| [LLaMA-7B.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q3_K_S.gguf) | Q3_K_S | 2.75GB |
| [LLaMA-7B.IQ3_M.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.IQ3_M.gguf) | IQ3_M | 2.9GB |
| [LLaMA-7B.Q3_K.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q3_K.gguf) | Q3_K | 3.07GB |
| [LLaMA-7B.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q3_K_M.gguf) | Q3_K_M | 3.07GB |
| [LLaMA-7B.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q3_K_L.gguf) | Q3_K_L | 3.35GB |
| [LLaMA-7B.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.IQ4_XS.gguf) | IQ4_XS | 3.4GB |
| [LLaMA-7B.Q4_0.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q4_0.gguf) | Q4_0 | 3.56GB |
| [LLaMA-7B.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.IQ4_NL.gguf) | IQ4_NL | 3.58GB |
| [LLaMA-7B.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q4_K_S.gguf) | Q4_K_S | 3.59GB |
| [LLaMA-7B.Q4_K.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q4_K.gguf) | Q4_K | 3.8GB |
| [LLaMA-7B.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q4_K_M.gguf) | Q4_K_M | 3.8GB |
| [LLaMA-7B.Q4_1.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q4_1.gguf) | Q4_1 | 3.95GB |
| [LLaMA-7B.Q5_0.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q5_0.gguf) | Q5_0 | 4.33GB |
| [LLaMA-7B.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q5_K_S.gguf) | Q5_K_S | 4.33GB |
| [LLaMA-7B.Q5_K.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q5_K.gguf) | Q5_K | 4.45GB |
| [LLaMA-7B.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q5_K_M.gguf) | Q5_K_M | 4.45GB |
| [LLaMA-7B.Q5_1.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q5_1.gguf) | Q5_1 | 4.72GB |
| [LLaMA-7B.Q6_K.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q6_K.gguf) | Q6_K | 5.15GB |
| [LLaMA-7B.Q8_0.gguf](https://huggingface.co/RichardErkhov/dfurman_-_LLaMA-7B-gguf/blob/main/LLaMA-7B.Q8_0.gguf) | Q8_0 | 6.67GB |
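
The Size column appears to use binary gigabytes: dividing the byte counts recorded in this commit's Git LFS pointers by 2^30 reproduces the listed figures (for example, 2532863744 bytes for Q2_K gives 2.36). Below is a minimal sketch, using a subset of those byte counts, of how one might pick the largest file under a given size budget. The helper names and the budget values are hypothetical, and on-disk size is only a lower bound on the memory needed at run time.

```python
GIB = 2**30

# Byte sizes copied from the Git LFS pointers in this commit (a subset).
SIZES_BYTES = {
    "Q2_K": 2532863744,
    "Q4_0": 3825807104,
    "Q4_K_M": 4081004288,
    "Q5_K_M": 4783156992,
    "Q6_K": 5529194240,
    "Q8_0": 7161089792,
}


def size_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB, rounded to two decimals."""
    return round(n_bytes / GIB, 2)


def largest_fitting(budget_gib: float, sizes: dict = SIZES_BYTES):
    """Return the quant name of the largest file within the budget, or None."""
    fitting = {name: n for name, n in sizes.items() if n / GIB <= budget_gib}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


for name, n_bytes in SIZES_BYTES.items():
    print(f"{name}: {size_gib(n_bytes)} GiB")
```

Under these assumptions, a 4 GiB budget selects `Q4_K_M`, while a 2 GiB budget fits none of the listed files.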
Original model description:
---
pipeline_tag: text-generation
license: other
---
<div align="center">
<img src="./assets/llama.png" width="150px">
</div>
# LLaMA-7B
LLaMA-7B is a base model for text generation with 6.7B parameters and a 1T token training corpus. It was built and released by the FAIR team at Meta AI alongside the paper "[LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971)".
This model repo was converted to work with the transformers package. It is under a bespoke **non-commercial** license; please see the [LICENSE](https://huggingface.co/dfurman/llama-7b/blob/main/LICENSE) file for more details.
## Model Summary
- **Model Type:** Causal decoder-only.
- **Dataset:** The model was trained on 1T tokens using the following data sources: CCNet [67%], C4 [15%], GitHub [4.5%], Wikipedia [4.5%], Books [4.5%], ArXiv [2.5%], Stack Exchange [2%].
- **Language(s):** The Wikipedia and Books domains include data in the following languages: bg, ca, cs, da, de, en, es, fr, hr, hu, it, nl, pl, pt, ro, ru, sl, sr, sv, uk.
- **License:** Bespoke non-commercial license, see [LICENSE](https://huggingface.co/dfurman/llama-7b/blob/main/LICENSE) file.
- **Model date:** LLaMA was trained between Dec 2022 and Feb 2023.
**Where to send inquiries about the model:**
Questions and comments about LLaMA can be sent via the [GitHub repository](https://github.com/facebookresearch/llama) of the project, by opening an issue.
## Intended use
**Primary intended uses:**
The primary use of LLaMA is research on large language models, including: exploring potential applications such as question answering, natural language understanding, or reading comprehension; understanding the capabilities and limitations of current language models and developing techniques to improve them; and evaluating and mitigating biases, risks, toxic and harmful content generation, and hallucinations.
**Primary intended users:**
The primary intended users of the model are researchers in natural language processing, machine learning and artificial intelligence.
**Out-of-scope use cases:**
LLaMA is a base model, also known as a foundation model. As such, it should not be used in downstream applications without further risk evaluation, mitigation, and additional fine-tuning. In particular, the model has not been trained with human feedback and can thus generate toxic or offensive content, incorrect information, or generally unhelpful answers.
## Factors
**Relevant factors:**
One of the most relevant factors for which model performance may vary is the language used. Although 20 languages were included in the training data, most of the LLaMA dataset consists of English text, so the model is expected to perform better for English than for other languages. Relatedly, previous studies have shown that performance can vary across dialects, which is likely also the case for LLaMA.
**Evaluation factors:**
As LLaMA is trained on data from the Web, it is expected that the model reflects biases from this source. The RAI datasets are thus used to measure biases exhibited by the model for gender, religion, race, sexual orientation, age, nationality, disability, physical appearance and socio-economic status. The toxicity of model generations is also measured, depending on the toxicity of the context used to prompt the model.
## Ethical considerations
**Data:**
The data used to train the model is collected from various sources, mostly from the Web. As such, it contains offensive, harmful and biased content. LLaMA is thus expected to exhibit such biases from the training data.
**Human life:**
The model is not intended to inform decisions about matters central to human life, and should not be used in such a way.
**Mitigations:**
The data was filtered from the Web based on its proximity to Wikipedia text and references, using a Kneser-Ney language model together with a fastText linear classifier.
**Risks and harms:**
Risks and harms of large language models include the generation of harmful, offensive or biased content. These models are often prone to generating incorrect information, sometimes referred to as hallucinations. LLaMA is not expected to be an exception in this regard.
**Use cases:**
LLaMA is a foundational model, and as such, it should not be used for downstream applications without further investigation and mitigation of risks. These risks and potentially fraught use cases include, but are not limited to, the generation of misinformation and the generation of harmful, biased, or offensive content.
## How to Get Started with the Model
### Setup
```python
# Notebook (IPython) syntax; in a plain shell, drop the leading "!".
!pip install -q -U transformers accelerate torch
```
### GPU Inference in fp16
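This requires a GPU with at least 15GB of VRAM. Note that the loading code below requests `torch.bfloat16`, which is also a 16-bit format; pass `torch.float16` instead if fp16 specifically is wanted or the GPU lacks bfloat16 support.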
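
The 15GB figure is consistent with simple arithmetic on the parameter count: 6.7B parameters at 2 bytes each come to roughly 12.5 GiB for the weights alone, before activations and the key-value cache. Below is a rough sketch of that estimate. The helper name is hypothetical, and the headroom needed on top of the weights is an assumption that varies with sequence length and batch size.

```python
GIB = 2**30


def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Memory taken by the raw weights alone, in GiB."""
    return n_params * bytes_per_param / GIB


N_PARAMS = 6.7e9  # parameter count quoted in the model description

print(f"16-bit weights: {weight_memory_gib(N_PARAMS, 2):.1f} GiB")
print(f"32-bit weights: {weight_memory_gib(N_PARAMS, 4):.1f} GiB")
```

The same function shows why full 32-bit loading would need about twice as much memory for the weights.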
### First, Load the Model
```python
import transformers
import torch

model_name = "dfurman/llama-7b"

# The tokenizer and streamer are reused in the generation step.
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
streamer = transformers.TextStreamer(tokenizer)

# Load the weights in 16-bit precision; device_map="auto" relies on accelerate.
model = transformers.LlamaForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```
### Next, Run the Model
```python
prompt = "An increasing sequence: one,"
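# Tokenize the prompt and move the tensors to the GPU.
inputs = tokenizer(
    prompt,
    padding=True,
    truncation=True,
    return_tensors="pt",
    return_token_type_ids=False,
).to("cuda")

# The streamer prints tokens to stdout as they are generated.
_ = model.generate(
    **inputs,
    max_new_tokens=20,
    streamer=streamer,
)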
```
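
The snippets above load the original transformers checkpoint rather than the GGUF files in this repository. As a sketch only, one common way to run a GGUF file is the `llama-cpp-python` package. That choice is an assumption, not something this README specifies, and the helper names below are hypothetical. The file naming follows the quantization table, and 2048 is the context length of the original LLaMA models.

```python
def gguf_filename(quant: str) -> str:
    """File name used in this repository for a given quant method."""
    return f"LLaMA-7B.{quant}.gguf"


def generate_from_gguf(model_path: str, prompt: str, max_tokens: int = 20) -> str:
    """Run a short completion from a local GGUF file."""
    # Imported lazily so this module loads without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, n_ctx=2048, verbose=False)
    result = llm(prompt, max_tokens=max_tokens)
    # The completion text sits in the first entry of "choices".
    return result["choices"][0]["text"]
```

With a file downloaded locally, calling `generate_from_gguf(gguf_filename("Q4_K_M"), "An increasing sequence: one,")` would return the continuation as a string.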