Initialize project; model provided by the ModelHub XC community

Model: bartowski/allura-org_remnant-qwen3-8b-GGUF
Source: Original Platform
ModelHub XC
2026-06-20 23:51:13 +08:00
commit 5bdc2a0715
28 changed files with 301 additions and 0 deletions

49
.gitattributes vendored Normal file

@@ -0,0 +1,49 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zstandard filter=lfs diff=lfs merge=lfs -text
*.tfevents* filter=lfs diff=lfs merge=lfs -text
*.db* filter=lfs diff=lfs merge=lfs -text
*.ark* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*data* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.index filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.gguf* filter=lfs diff=lfs merge=lfs -text
*.ggml filter=lfs diff=lfs merge=lfs -text
*.llamafile* filter=lfs diff=lfs merge=lfs -text
*.pt2 filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
allura-org_remnant-qwen3-8b.imatrix filter=lfs diff=lfs merge=lfs -text
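
The patterns above decide which files are stored through Git LFS rather than as regular blobs. As a rough illustration, the sketch below checks bare file names against a few of those patterns. It assumes that Python's `fnmatch` is a close enough stand-in for gitattributes matching on simple patterns like these (it is not a full implementation of the gitattributes rules), and the helper name `is_lfs_tracked` is made up for this example.

```python
from fnmatch import fnmatchcase

# A few patterns copied from the .gitattributes listing above.
LFS_PATTERNS = [
    "*.gguf*",
    "*.safetensors",
    "*.bin",
    "allura-org_remnant-qwen3-8b.imatrix",
]


def is_lfs_tracked(filename: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: does any pattern match the bare file name?

    Assumption: fnmatch-style globbing only approximates gitattributes
    semantics; path-based patterns such as saved_model/**/* are not handled.
    """
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


print(is_lfs_tracked("allura-org_remnant-qwen3-8b-Q4_K_M.gguf"))  # True
print(is_lfs_tracked("README.md"))  # False
```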

176
README.md Normal file

@@ -0,0 +1,176 @@
---
quantized_by: bartowski
pipeline_tag: text-generation
base_model: allura-org/remnant-qwen3-8b
license: apache-2.0
tags:
- roleplay
- conversational
- axolotl
- qwen
base_model_relation: quantized
---
## Llamacpp imatrix Quantizations of remnant-qwen3-8b by allura-org
Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b5270">b5270</a> for quantization.
Original model: https://huggingface.co/allura-org/remnant-qwen3-8b
All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)
Run them in [LM Studio](https://lmstudio.ai/)
Run them directly with [llama.cpp](https://github.com/ggerganov/llama.cpp), or any other llama.cpp based project
## Prompt format
```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
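
For callers that assemble the prompt string themselves, the template above can be filled in with a small helper. This is a minimal sketch: the function name `build_chatml_prompt` is hypothetical, and the trailing newline after `assistant` is an assumption, since the template does not show whether one is expected.

```python
def build_chatml_prompt(system_prompt: str, prompt: str) -> str:
    """Fill the prompt template shown above with a system and a user message."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        # Assumption: generation starts on the line after the assistant tag.
        "<|im_start|>assistant\n"
    )


text = build_chatml_prompt("You are a helpful assistant.", "Hello!")
print(text)
```

Many llama.cpp-based front ends apply the chat template stored in the GGUF file automatically, so a helper like this is mostly useful with raw completion endpoints.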
## Download a file (not the whole branch) from below:
| Filename | Quant type | File Size | Split | Description |
| -------- | ---------- | --------- | ----- | ----------- |
| [remnant-qwen3-8b-bf16.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-bf16.gguf) | bf16 | 16.39GB | false | Full BF16 weights. |
| [remnant-qwen3-8b-Q8_0.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q8_0.gguf) | Q8_0 | 8.71GB | false | Extremely high quality, generally unneeded but max available quant. |
| [remnant-qwen3-8b-Q6_K_L.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q6_K_L.gguf) | Q6_K_L | 7.03GB | false | Uses Q8_0 for embed and output weights. Very high quality, near perfect, *recommended*. |
| [remnant-qwen3-8b-Q6_K.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q6_K.gguf) | Q6_K | 6.73GB | false | Very high quality, near perfect, *recommended*. |
| [remnant-qwen3-8b-Q5_K_L.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q5_K_L.gguf) | Q5_K_L | 6.24GB | false | Uses Q8_0 for embed and output weights. High quality, *recommended*. |
| [remnant-qwen3-8b-Q5_K_M.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q5_K_M.gguf) | Q5_K_M | 5.85GB | false | High quality, *recommended*. |
| [remnant-qwen3-8b-Q5_K_S.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q5_K_S.gguf) | Q5_K_S | 5.72GB | false | High quality, *recommended*. |
| [remnant-qwen3-8b-Q4_K_L.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q4_K_L.gguf) | Q4_K_L | 5.49GB | false | Uses Q8_0 for embed and output weights. Good quality, *recommended*. |
| [remnant-qwen3-8b-Q4_1.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q4_1.gguf) | Q4_1 | 5.25GB | false | Legacy format, similar performance to Q4_K_S but with improved tokens/watt on Apple silicon. |
| [remnant-qwen3-8b-Q4_K_M.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q4_K_M.gguf) | Q4_K_M | 5.03GB | false | Good quality, default size for most use cases, *recommended*. |
| [remnant-qwen3-8b-Q3_K_XL.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q3_K_XL.gguf) | Q3_K_XL | 4.98GB | false | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
| [remnant-qwen3-8b-Q4_K_S.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q4_K_S.gguf) | Q4_K_S | 4.80GB | false | Slightly lower quality with more space savings, *recommended*. |
| [remnant-qwen3-8b-Q4_0.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q4_0.gguf) | Q4_0 | 4.79GB | false | Legacy format, offers online repacking for ARM and AVX CPU inference. |
| [remnant-qwen3-8b-IQ4_NL.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-IQ4_NL.gguf) | IQ4_NL | 4.79GB | false | Similar to IQ4_XS, but slightly larger. Offers online repacking for ARM CPU inference. |
| [remnant-qwen3-8b-IQ4_XS.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-IQ4_XS.gguf) | IQ4_XS | 4.56GB | false | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
| [remnant-qwen3-8b-Q3_K_L.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q3_K_L.gguf) | Q3_K_L | 4.43GB | false | Lower quality but usable, good for low RAM availability. |
| [remnant-qwen3-8b-Q3_K_M.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q3_K_M.gguf) | Q3_K_M | 4.12GB | false | Low quality. |
| [remnant-qwen3-8b-IQ3_M.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-IQ3_M.gguf) | IQ3_M | 3.90GB | false | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
| [remnant-qwen3-8b-Q2_K_L.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q2_K_L.gguf) | Q2_K_L | 3.89GB | false | Uses Q8_0 for embed and output weights. Very low quality but surprisingly usable. |
| [remnant-qwen3-8b-Q3_K_S.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q3_K_S.gguf) | Q3_K_S | 3.77GB | false | Low quality, not recommended. |
| [remnant-qwen3-8b-IQ3_XS.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-IQ3_XS.gguf) | IQ3_XS | 3.63GB | false | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
| [remnant-qwen3-8b-IQ3_XXS.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-IQ3_XXS.gguf) | IQ3_XXS | 3.37GB | false | Lower quality, new method with decent performance, comparable to Q3 quants. |
| [remnant-qwen3-8b-Q2_K.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-Q2_K.gguf) | Q2_K | 3.28GB | false | Very low quality but surprisingly usable. |
| [remnant-qwen3-8b-IQ2_M.gguf](https://huggingface.co/bartowski/allura-org_remnant-qwen3-8b-GGUF/blob/main/allura-org_remnant-qwen3-8b-IQ2_M.gguf) | IQ2_M | 3.05GB | false | Relatively low quality, uses SOTA techniques to be surprisingly usable. |
## Embed/output weights
Some of these quants (Q3_K_XL, Q4_K_L, etc.) use the standard quantization method, but with the embedding and output weights quantized to Q8_0 instead of their normal default.
## Downloading using huggingface-cli
<details>
<summary>Click to view download instructions</summary>
First, make sure you have huggingface-cli installed:
```
pip install -U "huggingface_hub[cli]"
```
Then, you can target the specific file you want:
```
huggingface-cli download bartowski/allura-org_remnant-qwen3-8b-GGUF --include "allura-org_remnant-qwen3-8b-Q4_K_M.gguf" --local-dir ./
```
If the model is bigger than 50GB, it will have been split into multiple files (per the Split column above, none of the files in this repo are). In order to download them all to a local folder, run:
```
huggingface-cli download bartowski/allura-org_remnant-qwen3-8b-GGUF --include "allura-org_remnant-qwen3-8b-Q8_0/*" --local-dir ./
```
You can either specify a new local-dir (allura-org_remnant-qwen3-8b-Q8_0) or download them all in place (./).
</details>
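
When scripting downloads for several quant types, the single-file command from the instructions above can be generated rather than typed by hand. The sketch below only builds the argument list. It assumes the file naming seen in the table links (`allura-org_remnant-qwen3-8b-<quant>.gguf`), the helper name `download_command` is hypothetical, and only the `--include` and `--local-dir` flags shown above are used.

```python
import shlex

REPO_ID = "bartowski/allura-org_remnant-qwen3-8b-GGUF"
FILE_PREFIX = "allura-org_remnant-qwen3-8b"  # assumed from the table links


def download_command(quant: str, local_dir: str = "./") -> list[str]:
    """Build the huggingface-cli argument list for one quant file."""
    filename = f"{FILE_PREFIX}-{quant}.gguf"
    return [
        "huggingface-cli", "download", REPO_ID,
        "--include", filename,
        "--local-dir", local_dir,
    ]


cmd = download_command("Q4_K_M")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Keeping the command as a list avoids shell quoting issues if it is later passed to `subprocess.run`.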
## ARM/AVX information
Previously, you would download Q4_0_4_4/4_8/8_8, and these would have their weights interleaved in memory in order to improve performance on ARM and AVX machines by loading up more data in one pass.
Now, however, there is something called "online repacking" for weights. Details are in [this PR](https://github.com/ggerganov/llama.cpp/pull/9921). If you use Q4_0 and your hardware would benefit from repacking weights, it will do it automatically on the fly.
As of llama.cpp build [b4282](https://github.com/ggerganov/llama.cpp/releases/tag/b4282) you will not be able to run the Q4_0_X_X files and will instead need to use Q4_0.
Additionally, if you want to get slightly better quality, you can use IQ4_NL thanks to [this PR](https://github.com/ggerganov/llama.cpp/pull/10541) which will also repack the weights for ARM, though only the 4_4 for now. The loading time may be slower but it will result in an overall speed increase.
<details>
<summary>Click to view Q4_0_X_X information (deprecated)</summary>
I'm keeping this section to show the potential theoretical uplift in performance from using the Q4_0 with online repacking.
<details>
<summary>Click to view benchmarks on an AVX2 system (EPYC7702)</summary>
| model | size | params | backend | threads | test | t/s | % (vs Q4_0) |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | ------------: | -------------------: |-------------: |
| qwen2 3B Q4_0 | 1.70 GiB | 3.09 B | CPU | 64 | pp512 | 204.03 ± 1.03 | 100% |
| qwen2 3B Q4_0 | 1.70 GiB | 3.09 B | CPU | 64 | pp1024 | 282.92 ± 0.19 | 100% |
| qwen2 3B Q4_0 | 1.70 GiB | 3.09 B | CPU | 64 | pp2048 | 259.49 ± 0.44 | 100% |
| qwen2 3B Q4_0 | 1.70 GiB | 3.09 B | CPU | 64 | tg128 | 39.12 ± 0.27 | 100% |
| qwen2 3B Q4_0 | 1.70 GiB | 3.09 B | CPU | 64 | tg256 | 39.31 ± 0.69 | 100% |
| qwen2 3B Q4_0 | 1.70 GiB | 3.09 B | CPU | 64 | tg512 | 40.52 ± 0.03 | 100% |
| qwen2 3B Q4_K_M | 1.79 GiB | 3.09 B | CPU | 64 | pp512 | 301.02 ± 1.74 | 147% |
| qwen2 3B Q4_K_M | 1.79 GiB | 3.09 B | CPU | 64 | pp1024 | 287.23 ± 0.20 | 101% |
| qwen2 3B Q4_K_M | 1.79 GiB | 3.09 B | CPU | 64 | pp2048 | 262.77 ± 1.81 | 101% |
| qwen2 3B Q4_K_M | 1.79 GiB | 3.09 B | CPU | 64 | tg128 | 18.80 ± 0.99 | 48% |
| qwen2 3B Q4_K_M | 1.79 GiB | 3.09 B | CPU | 64 | tg256 | 24.46 ± 3.04 | 62% |
| qwen2 3B Q4_K_M | 1.79 GiB | 3.09 B | CPU | 64 | tg512 | 36.32 ± 3.59 | 90% |
| qwen2 3B Q4_0_8_8 | 1.69 GiB | 3.09 B | CPU | 64 | pp512 | 271.71 ± 3.53 | 133% |
| qwen2 3B Q4_0_8_8 | 1.69 GiB | 3.09 B | CPU | 64 | pp1024 | 279.86 ± 45.63 | 100% |
| qwen2 3B Q4_0_8_8 | 1.69 GiB | 3.09 B | CPU | 64 | pp2048 | 320.77 ± 5.00 | 124% |
| qwen2 3B Q4_0_8_8 | 1.69 GiB | 3.09 B | CPU | 64 | tg128 | 43.51 ± 0.05 | 111% |
| qwen2 3B Q4_0_8_8 | 1.69 GiB | 3.09 B | CPU | 64 | tg256 | 43.35 ± 0.09 | 110% |
| qwen2 3B Q4_0_8_8 | 1.69 GiB | 3.09 B | CPU | 64 | tg512 | 42.60 ± 0.31 | 105% |
Q4_0_8_8 offers a nice bump to prompt processing and a small bump to text generation.
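
The `% (vs Q4_0)` column is presumably each row's t/s divided by the Q4_0 t/s for the same test. A minimal sketch of that arithmetic, using two Q4_0_8_8 rows from the table (the function name is illustrative, and rounding to the nearest whole percent is an assumption):

```python
def percent_vs_baseline(tokens_per_s: float, baseline_tokens_per_s: float) -> int:
    """Throughput relative to the baseline, rounded to a whole percent."""
    return round(100 * tokens_per_s / baseline_tokens_per_s)


# Q4_0_8_8 vs Q4_0, values taken from the table above.
print(percent_vs_baseline(43.51, 39.12))    # tg128 -> 111
print(percent_vs_baseline(271.71, 204.03))  # pp512 -> 133
```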
</details>
</details>
## Which file should I choose?
<details>
<summary>Click here for details</summary>
A great write-up with charts showing various performances is provided by Artefact2 [here](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9).
The first thing to figure out is how big a model you can run. To do this, you'll need to figure out how much RAM and/or VRAM you have.
If you want your model running as FAST as possible, you'll want to fit the whole thing on your GPU's VRAM. Aim for a quant with a file size 1-2GB smaller than your GPU's total VRAM.
If you want the absolute maximum quality, add both your system RAM and your GPU's VRAM together, then similarly grab a quant with a file size 1-2GB smaller than that total.
Next, you'll need to decide if you want to use an 'I-quant' or a 'K-quant'.
If you don't want to think too much, grab one of the K-quants. These are in format 'QX_K_X', like Q5_K_M.
If you want to get more into the weeds, you can check out this extremely useful feature chart:
[llama.cpp feature matrix](https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix)
But basically, if you're aiming for below Q4, and you're running cuBLAS (Nvidia) or rocBLAS (AMD), you should look towards the I-quants. These are in format IQX_X, like IQ3_M. These are newer and offer better performance for their size.
These I-quants can also be used on CPU, but will be slower than their K-quant equivalent, so speed vs performance is a tradeoff you'll have to decide.
</details>
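
The sizing rule above (pick a file 1-2GB smaller than the memory you have available) can be sketched as a small lookup. The sizes are copied from a subset of the download table. The helper name `pick_quant` and the default 2GB headroom are assumptions for illustration, and the sketch ignores the I-quant versus K-quant considerations discussed above.

```python
from typing import Optional

# File sizes in GB, copied from a few rows of the download table.
QUANT_SIZES_GB = {
    "Q8_0": 8.71,
    "Q6_K": 6.73,
    "Q5_K_M": 5.85,
    "Q4_K_M": 5.03,
    "IQ4_XS": 4.56,
    "Q3_K_M": 4.12,
    "IQ3_M": 3.90,
    "Q2_K": 3.28,
}


def pick_quant(memory_gb: float, headroom_gb: float = 2.0) -> Optional[str]:
    """Return the largest listed quant that fits in memory_gb minus headroom."""
    budget = memory_gb - headroom_gb
    fitting = {q: size for q, size in QUANT_SIZES_GB.items() if size <= budget}
    if not fitting:
        return None  # nothing in this subset fits
    return max(fitting, key=fitting.get)


print(pick_quant(8.0))   # Q5_K_M (budget 6.0GB)
print(pick_quant(12.0))  # Q8_0 (budget 10.0GB)
print(pick_quant(4.0))   # None (budget 2.0GB)
```

For maximum speed, pass the GPU's VRAM as `memory_gb`. For maximum quality, pass system RAM plus VRAM, as described above.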
## Credits
Thank you kalomaze and Dampf for assistance in creating the imatrix calibration dataset.
Thank you ZeroWw for the inspiration to experiment with embed/output.
Thank you to LM Studio for sponsoring my work.
Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aac0a589b66d8fdf57ab339f943020f673edd196d11e978b191f0653706b9e2e
size 3051910816

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f7ad8d8f593f1408493fbb1b27ebc491a1324e97ae2e4041774d44bb0e1aa1bc
size 3896616608

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a6077db9e903257778822de88e58fea1d88c09830312448c053e4636191a1083
size 3626870432

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c43df02726b32c34043e65248452b238eb0a881da8705dac7a0e1abf8e2d4c15
size 3369629344

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b985bab4ec61df744e10f48a3eeed66b024e178f231e359300b7842a54deabeb
size 4793620128

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:31e955acfa2bfddbe2322bd521e620b755c4864e6ac8566548faae33381a31bc
size 4561835680

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:97bf503c94ba172d95bf384c8e991aceae3c555a86e5168acf54bed22083e438
size 3281729184

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:18cb2c733a4fab7df2e774c16f68589a71609f6d8c8900b2e77bf6f7e23e7461
size 3889473184

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:952d610d441207a505c6e7f46326eba84b0a52dea32f7a126729025a868e2ddb
size 4431390368

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:68cca37efed585d9dc0b15e7262816f0a2125f6dd2f72944c0da8e6ab155e5cc
size 4124157600

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9fe95fb5b473dbb70aaa58dbc0022ebf2e2f8d52c8ce5f03acfcd2a290d7157c
size 3769607840

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f1c50c94976a9b105fede5351955826c371d7fa6424501d84fd620e78bfaa3a0
size 4975928992

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a4da546c72d92b1f768afdc57c715c7410813574fef17d924fca9251f8932b04
size 4787328672

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5c87b0972f378df7cc7135ee6cea7bcd84561faf84b9e36fab08fd7429221b4f
size 5247751840

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b6d13d38f7ff5486dcdde461db547ff47b131a6941b722097763e2c9d0581965
size 5489665696

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:94e179bb1f1fe0069804a7713bd6b1343626ef11d17a67c6990be7b813d26aeb
size 5027780256

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0fae2fad4225372c2e31269e281daa330e958e4f3d367abd08f18189d149673b
size 4802008736

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b07083c760ee0f83f5f35aee748bd1adb58f3fb839d73bf421187578f93134af
size 6235203232

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:44378c30d5fb9cb15c4e1cdb9d6b48dcecde35f7a0c80d489f34ad273015be6d
size 5851109024

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:83e845a3d81ba73f8863537e1cc26da3ca3d6ee3ab8fa8aacb00bbcf0a311f40
size 5720757920

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4f547e64e76c3130e5fd0a13cf5a80fb415ec8df18a0fb549e473cf8ac147a23
size 6725895840

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0297a45a3ad8ace58671169ff2db13187161d565a7099207318936e8a6dfe07b
size 7027336864

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:38a9cfe983cd535f2c1bdd20f963627a2bf8e3ef08b28740f75d83ab8fda12c7
size 8709514912

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:86eb0b6d3b933b3e1ea3f9bfeca981031d2465d2458e7375572362cc9e5711b5
size 16388040064

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:502c8f5364d0fe013b35dc0ffe66f636be383d6a42de3514fae7a869ed33f5fc
size 5316782
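
Each three-line block above is a Git LFS pointer: the repository stores the spec version, a SHA-256 object id and the byte size, while the actual file lives in LFS storage. A minimal sketch of reading one such pointer, using the last block as input (the function name `parse_lfs_pointer` and the returned dictionary layout are illustrative choices, not part of any library):

```python
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:502c8f5364d0fe013b35dc0ffe66f636be383d6a42de3514fae7a869ed33f5fc
size 5316782
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split 'key value' lines of a pointer file into typed fields."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER_TEXT)
print(info["hash_algo"], f"{info['size_bytes'] / 1e6:.2f} MB")  # sha256 5.32 MB
```

The diff view does not show which file each pointer belongs to, so the sizes are the only hint for matching pointers to entries in the README table.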

1
configuration.json Normal file

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}