Initialize project, with the model provided by the ModelHub XC community
Model: numind/NuExtract-2.0-4B Source: Original Platform
37
.gitattributes
vendored
Normal file
@@ -0,0 +1,37 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
nuextract2_bench.png filter=lfs diff=lfs merge=lfs -text
592
README.md
Normal file
@@ -0,0 +1,592 @@
---
library_name: transformers
license: mit
base_model:
- Qwen/Qwen2.5-VL-3B-Instruct
new_version: numind/NuExtract3
pipeline_tag: image-text-to-text
---

<p align="center">
  <a href="https://nuextract.ai/">
    <img src="logo_nuextract.svg" width="200"/>
  </a>
</p>
<p align="center">
  🖥️ <a href="https://nuextract.ai/">API / Platform</a>   |   📑 <a href="https://numind.ai/blog">Blog</a>   |   🗣️ <a href="https://discord.gg/3tsEtJNCDe">Discord</a>   |   🔗 <a href="https://github.com/numindai/nuextract">GitHub</a>
</p>

# NuExtract 2.0 4B by NuMind 📈📈📈

NuExtract 2.0 is a family of models trained specifically for structured information extraction tasks. The models accept both text and image inputs and are multilingual.

We provide several versions of different sizes, all based on pre-trained models from the QwenVL family.
| Model Size | Model Name | Base Model | License | Huggingface Link |
|------------|------------|------------|---------|------------------|
| 2B | NuExtract-2.0-2B | [Qwen2-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct) | MIT | 🤗 [NuExtract-2.0-2B](https://huggingface.co/numind/NuExtract-2.0-2B) |
| 4B | NuExtract-2.0-4B | [Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct) | Qwen Research License | 🤗 [NuExtract-2.0-4B](https://huggingface.co/numind/NuExtract-2.0-4B) |
| 8B | NuExtract-2.0-8B | [Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) | MIT | 🤗 [NuExtract-2.0-8B](https://huggingface.co/numind/NuExtract-2.0-8B) |

❗️Note: `NuExtract-2.0-2B` is based on Qwen2-VL rather than Qwen2.5-VL because the smallest Qwen2.5-VL model (3B) has a more restrictive, non-commercial license. We therefore include `NuExtract-2.0-2B` as a small model option that can be used commercially.

## Benchmark
Performance on a collection of ~1,000 diverse extraction examples containing both text and image inputs.
<a href="https://nuextract.ai/">
  <img src="nuextract2_bench.png" width="500"/>
</a>

## Overview

To use the model, provide an input text/image and a JSON template describing the information you need to extract. The template should be a JSON object, specifying field names and their expected type.

Supported types include:
* `verbatim-string` - instructs the model to extract text that is present verbatim in the input.
* `string` - a generic string field that can incorporate paraphrasing/abstraction.
* `integer` - a whole number.
* `number` - a whole or decimal number.
* `date-time` - ISO formatted date.
* Array of any of the above types (e.g. `["string"]`)
* `enum` - a choice from a set of possible answers (represented in the template as an array of options, e.g. `["yes", "no", "maybe"]`).
* `multi-label` - an enum that can have multiple possible answers (represented in the template as a double-wrapped array, e.g. `[["A", "B", "C"]]`).

If the model does not identify relevant information for a field, it will return `null` or `[]` (for arrays and multi-labels).

The following is an example template:
```json
{
  "first_name": "verbatim-string",
  "last_name": "verbatim-string",
  "description": "string",
  "age": "integer",
  "gpa": "number",
  "birth_date": "date-time",
  "nationality": ["France", "England", "Japan", "USA", "China"],
  "languages_spoken": [["English", "French", "Japanese", "Mandarin", "Spanish"]]
}
```
An example output:
```json
{
  "first_name": "Susan",
  "last_name": "Smith",
  "description": "A student studying computer science.",
  "age": 20,
  "gpa": 3.7,
  "birth_date": "2005-03-01",
  "nationality": "England",
  "languages_spoken": ["English", "French"]
}
```
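
The template rules listed in the Overview can be checked before a template is sent to the model. The sketch below is not part of NuExtract; `check_template` and `SCALAR_TYPES` are hypothetical names, and the rule that a one-element list holding a type name means "array of that type" is an assumption drawn from the `["string"]` example.

```python
import json

# Scalar type names listed in the Overview (copied from the README).
SCALAR_TYPES = {"verbatim-string", "string", "integer", "number", "date-time"}


def check_template(node):
    """Return True if `node` only uses the template constructs described in the Overview."""
    if isinstance(node, str):
        return node in SCALAR_TYPES
    if isinstance(node, dict):
        return all(isinstance(k, str) and check_template(v) for k, v in node.items())
    if isinstance(node, list):
        if not node:
            return False
        # multi-label: a double-wrapped array of string options, e.g. [["A", "B"]]
        if len(node) == 1 and isinstance(node[0], list):
            return bool(node[0]) and all(isinstance(o, str) for o in node[0])
        # array of a type: a one-element list, e.g. ["string"] or [{...}] (assumed rule)
        if len(node) == 1 and (isinstance(node[0], dict) or node[0] in SCALAR_TYPES):
            return check_template(node[0])
        # enum: a flat list of string options, e.g. ["yes", "no", "maybe"]
        return all(isinstance(o, str) for o in node)
    return False


template = json.loads("""{
  "first_name": "verbatim-string",
  "age": "integer",
  "nationality": ["France", "England", "Japan", "USA", "China"],
  "languages_spoken": [["English", "French", "Japanese", "Mandarin", "Spanish"]]
}""")
print(check_template(template))        # True
print(check_template({"age": "int"}))  # False: "int" is not a listed type
```

A check like this catches misspelled type names early, before they reach the model as part of the prompt.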

⚠️ We recommend using NuExtract with a temperature at or very close to 0. Some inference frameworks, such as Ollama, use a default of 0.7, which is not well suited to many extraction tasks.

## Using NuExtract with 🤗 Transformers

```python
import torch
from transformers import AutoProcessor, AutoModelForVision2Seq

model_name = "numind/NuExtract-2.0-2B"
# model_name = "numind/NuExtract-2.0-4B"
# model_name = "numind/NuExtract-2.0-8B"

model = AutoModelForVision2Seq.from_pretrained(model_name,
                                               trust_remote_code=True,
                                               torch_dtype=torch.bfloat16,
                                               attn_implementation="flash_attention_2",
                                               device_map="auto")
processor = AutoProcessor.from_pretrained(model_name,
                                          trust_remote_code=True,
                                          padding_side='left',
                                          use_fast=True)

# You can set min_pixels and max_pixels according to your needs, such as a token range of 256-1280, to balance performance and cost.
# min_pixels = 256*28*28
# max_pixels = 1280*28*28
# processor = AutoProcessor.from_pretrained(model_name, min_pixels=min_pixels, max_pixels=max_pixels)
```
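
To make the commented-out pixel settings concrete, the arithmetic can be spelled out. This is only a sketch: `pixel_budget` is a hypothetical helper, and reading `256*28*28` as "256 visual tokens of 28x28 pixels each" is an assumption based on the comment in the block above.

```python
# Assumed: one visual token covers a 28x28 pixel area, as the
# `256*28*28` expression in the loading example suggests.
PATCH_PIXELS = 28 * 28


def pixel_budget(min_tokens, max_tokens):
    """Convert a visual-token range into (min_pixels, max_pixels)."""
    return min_tokens * PATCH_PIXELS, max_tokens * PATCH_PIXELS


min_pixels, max_pixels = pixel_budget(256, 1280)
print(min_pixels, max_pixels)  # 200704 1003520
```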

You will need the following function to handle loading of image input data:
```python
def process_all_vision_info(messages, examples=None):
    """
    Process vision information from both messages and in-context examples, supporting batch processing.

    Args:
        messages: List of message dictionaries (single input) OR list of message lists (batch input)
        examples: Optional list of example dictionaries (single input) OR list of example lists (batch)

    Returns:
        A flat list of all images in the correct order:
        - For single input: example images followed by message images
        - For batch input: interleaved as (item1 examples, item1 input, item2 examples, item2 input, etc.)
        - Returns None if no images were found
    """
    from qwen_vl_utils import process_vision_info, fetch_image

    # Helper function to extract images from examples
    def extract_example_images(example_item):
        if not example_item:
            return []

        # Handle both list of examples and single example
        examples_to_process = example_item if isinstance(example_item, list) else [example_item]
        images = []

        for example in examples_to_process:
            if isinstance(example.get('input'), dict) and example['input'].get('type') == 'image':
                images.append(fetch_image(example['input']))

        return images

    # Normalize inputs to always be batched format
    is_batch = messages and isinstance(messages[0], list)
    messages_batch = messages if is_batch else [messages]
    is_batch_examples = examples and isinstance(examples, list) and (isinstance(examples[0], list) or examples[0] is None)
    examples_batch = examples if is_batch_examples else ([examples] if examples is not None else None)

    # Ensure examples batch matches messages batch if provided
    if examples and len(examples_batch) != len(messages_batch):
        if not is_batch and len(examples_batch) == 1:
            # Single example set for a single input is fine
            pass
        else:
            raise ValueError("Examples batch length must match messages batch length")

    # Process all inputs, maintaining correct order
    all_images = []
    for i, message_group in enumerate(messages_batch):
        # Get example images for this input
        if examples and i < len(examples_batch):
            input_example_images = extract_example_images(examples_batch[i])
            all_images.extend(input_example_images)

        # Get message images for this input
        input_message_images = process_vision_info(message_group)[0] or []
        all_images.extend(input_message_images)

    return all_images if all_images else None
```

For example, to perform a basic extraction of names from a text document:
```python
template = """{"names": ["string"]}"""
document = "John went to the restaurant with Mary. James went to the cinema."

# prepare the user message content
messages = [{"role": "user", "content": document}]
text = processor.tokenizer.apply_chat_template(
    messages,
    template=template, # template is specified here
    tokenize=False,
    add_generation_prompt=True,
)

print(text)
"""<|im_start|>user
# Template:
{"names": ["string"]}
# Context:
John went to the restaurant with Mary. James went to the cinema.<|im_end|>
<|im_start|>assistant"""

image_inputs = process_all_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    padding=True,
    return_tensors="pt",
).to("cuda")

# we choose greedy sampling here, which works well for most information extraction tasks
generation_config = {"do_sample": False, "num_beams": 1, "max_new_tokens": 2048}

# Inference: Generation of the output
generated_ids = model.generate(
    **inputs,
    **generation_config
)
generated_ids_trimmed = [
    out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)

print(output_text)
# ['{"names": ["John", "Mary", "James"]}']
```
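
Since the decoded output is a JSON string, a small parsing step is usually the next thing needed. The following is a minimal sketch using only the standard library; `parse_extraction` is a hypothetical helper, and the sample string is the output shown in the example above rather than a fresh model run.

```python
import json


def parse_extraction(raw):
    """Parse one decoded model output into a dict, or return None if it is not a JSON object."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


# Value copied from the example output above.
output_text = ['{"names": ["John", "Mary", "James"]}']
result = parse_extraction(output_text[0])
print(result["names"])  # ['John', 'Mary', 'James']
```

Returning `None` on malformed output lets the caller decide whether to retry or skip the document instead of raising mid-batch.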

<details>
<summary>In-Context Examples</summary>
Sometimes the model might not perform as well as we want because our task is challenging or involves some degree of ambiguity. Alternatively, we may want the model to follow some specific formatting, or just give it a bit more help. In cases like this it can be valuable to provide "in-context examples" to help NuExtract better understand the task.

To do so, we can provide a list of examples (dictionaries of input/output pairs). In the example below, we show the model that we want the extracted names to be in capital letters with `-` on either side (for the sake of illustration). Usually providing multiple examples will lead to better results.
```python
template = """{"names": ["string"]}"""
document = "John went to the restaurant with Mary. James went to the cinema."
examples = [
    {
        "input": "Stephen is the manager at Susan's store.",
        "output": """{"names": ["-STEPHEN-", "-SUSAN-"]}"""
    }
]

messages = [{"role": "user", "content": document}]
text = processor.tokenizer.apply_chat_template(
    messages,
    template=template,
    examples=examples, # examples provided here
    tokenize=False,
    add_generation_prompt=True,
)

image_inputs = process_all_vision_info(messages, examples)
inputs = processor(
    text=[text],
    images=image_inputs,
    padding=True,
    return_tensors="pt",
).to("cuda")

# we choose greedy sampling here, which works well for most information extraction tasks
generation_config = {"do_sample": False, "num_beams": 1, "max_new_tokens": 2048}

# Inference: Generation of the output
generated_ids = model.generate(
    **inputs,
    **generation_config
)
generated_ids_trimmed = [
    out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
# ['{"names": ["-JOHN-", "-MARY-", "-JAMES-"]}']
```
</details>

<details>
<summary>Image Inputs</summary>
To give NuExtract an image input instead of text, provide a dictionary specifying the desired image file as the message content rather than a string (e.g. `{"type": "image", "image": "file://image.jpg"}`).

You can also specify an image URL (e.g. `{"type": "image", "image": "http://path/to/your/image.jpg"}`) or base64 encoding (e.g. `{"type": "image", "image": "data:image;base64,/9j/..."}`).
```python
template = """{"store": "verbatim-string"}"""
document = {"type": "image", "image": "file://1.jpg"}

messages = [{"role": "user", "content": [document]}]
text = processor.tokenizer.apply_chat_template(
    messages,
    template=template,
    tokenize=False,
    add_generation_prompt=True,
)

image_inputs = process_all_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    padding=True,
    return_tensors="pt",
).to("cuda")

generation_config = {"do_sample": False, "num_beams": 1, "max_new_tokens": 2048}

# Inference: Generation of the output
generated_ids = model.generate(
    **inputs,
    **generation_config
)
generated_ids_trimmed = [
    out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
# ['{"store": "Trader Joe\'s"}']
```
</details>

<details>
<summary>Batch Inference</summary>

```python
inputs = [
    # image input with no ICL examples
    {
        "document": {"type": "image", "image": "file://0.jpg"},
        "template": """{"store_name": "verbatim-string"}""",
    },
    # image input with 1 ICL example
    {
        "document": {"type": "image", "image": "file://0.jpg"},
        "template": """{"store_name": "verbatim-string"}""",
        "examples": [
            {
                "input": {"type": "image", "image": "file://1.jpg"},
                "output": """{"store_name": "Trader Joe's"}""",
            }
        ],
    },
    # text input with no ICL examples
    {
        "document": {"type": "text", "text": "John went to the restaurant with Mary. James went to the cinema."},
        "template": """{"names": ["string"]}""",
    },
    # text input with ICL example
    {
        "document": {"type": "text", "text": "John went to the restaurant with Mary. James went to the cinema."},
        "template": """{"names": ["string"]}""",
        "examples": [
            {
                "input": "Stephen is the manager at Susan's store.",
                "output": """{"names": ["STEPHEN", "SUSAN"]}"""
            }
        ],
    },
]

# messages should be a list of lists for batch processing
messages = [
    [
        {
            "role": "user",
            "content": [x['document']],
        }
    ]
    for x in inputs
]

# apply chat template to each example individually
texts = [
    processor.tokenizer.apply_chat_template(
        messages[i],  # Now this is a list containing one message
        template=x['template'],
        examples=x.get('examples', None),
        tokenize=False,
        add_generation_prompt=True)
    for i, x in enumerate(inputs)
]

image_inputs = process_all_vision_info(messages, [x.get('examples') for x in inputs])
inputs = processor(
    text=texts,
    images=image_inputs,
    padding=True,
    return_tensors="pt",
).to("cuda")

generation_config = {"do_sample": False, "num_beams": 1, "max_new_tokens": 2048}

# Batch Inference
generated_ids = model.generate(**inputs, **generation_config)
generated_ids_trimmed = [
    out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_texts = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
for y in output_texts:
    print(y)
# {"store_name": "WAL-MART"}
# {"store_name": "Walmart"}
# {"names": ["John", "Mary", "James"]}
# {"names": ["JOHN", "MARY", "JAMES"]}
```
</details>

<details>
<summary>Template Generation</summary>
If you want to convert existing schema files you have in other formats (e.g. XML, YAML, etc.) or start from an example, NuExtract 2.0 models can automatically generate a template for you.

For example, to convert XML into a NuExtract template:
```python
xml_template = """<SportResult>
    <Date></Date>
    <Sport></Sport>
    <Venue></Venue>
    <HomeTeam></HomeTeam>
    <AwayTeam></AwayTeam>
    <HomeScore></HomeScore>
    <AwayScore></AwayScore>
    <TopScorer></TopScorer>
</SportResult>"""

messages = [
    {
        "role": "user",
        "content": [{"type": "text", "text": xml_template}],
    }
]

text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
)

image_inputs = process_all_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    padding=True,
    return_tensors="pt",
).to("cuda")

generated_ids = model.generate(
    **inputs,
    **generation_config
)
generated_ids_trimmed = [
    out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)

print(output_text[0])
# {
#     "Date": "date-time",
#     "Sport": "verbatim-string",
#     "Venue": "verbatim-string",
#     "HomeTeam": "verbatim-string",
#     "AwayTeam": "verbatim-string",
#     "HomeScore": "integer",
#     "AwayScore": "integer",
#     "TopScorer": "verbatim-string"
# }
```

For example, to generate a template from a natural language description:
```python
description = "I would like to extract important details from the contract."

messages = [
    {
        "role": "user",
        "content": [{"type": "text", "text": description}],
    }
]

text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
)

image_inputs = process_all_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    padding=True,
    return_tensors="pt",
).to("cuda")

generated_ids = model.generate(
    **inputs,
    **generation_config
)
generated_ids_trimmed = [
    out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)

print(output_text[0])
# {
#     "Contract": {
#         "Title": "verbatim-string",
#         "Description": "verbatim-string",
#         "Terms": [
#             {
#                 "Term": "verbatim-string",
#                 "Description": "verbatim-string"
#             }
#         ],
#         "Date": "date-time",
#         "Signatory": "verbatim-string"
#     }
# }
```
</details>

## Fine-Tuning
You can find a fine-tuning tutorial notebook in the [cookbooks](https://github.com/numindai/nuextract/tree/main/cookbooks) folder of the [GitHub repo](https://github.com/numindai/nuextract/tree/main).

## vLLM Deployment
Run the command below to serve an OpenAI-compatible API:
```bash
vllm serve numind/NuExtract-2.0-8B --trust_remote_code --limit-mm-per-prompt image=6 --chat-template-content-format openai
```
If you encounter memory issues, set `--max-model-len` accordingly.

Send requests to the model as follows:
```python
import json
from openai import OpenAI

openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

chat_response = client.chat.completions.create(
    model="numind/NuExtract-2.0-8B",
    temperature=0,
    messages=[
        {
            "role": "user",
            "content": [{"type": "text", "text": "Yesterday I went shopping at Bunnings"}],
        },
    ],
    extra_body={
        "chat_template_kwargs": {
            "template": json.dumps(json.loads("""{\"store\": \"verbatim-string\"}"""), indent=4)
        },
    }
)
print("Chat response:", chat_response)
```
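
The `extra_body` payload in the request above can be factored into a small helper so the template and any in-context examples are built in one place. This is a sketch rather than part of the NuExtract or vLLM API: `build_extra_body` is a hypothetical name, and it only reproduces the `chat_template_kwargs` structure shown in the README requests.

```python
import json


def build_extra_body(template, examples=None):
    """Build the `extra_body` dict used in the requests in this section.

    `template` is a dict describing the fields to extract; `examples` is an
    optional list of {"input": ..., "output": ...} dicts (in-context examples).
    """
    kwargs = {"template": json.dumps(template, indent=4)}
    if examples:
        kwargs["examples"] = examples
    return {"chat_template_kwargs": kwargs}


extra_body = build_extra_body({"store": "verbatim-string"})
print(extra_body["chat_template_kwargs"]["template"])
```

The result can be passed as `extra_body=extra_body` to `client.chat.completions.create(...)`; with the OpenAI Python client, the generated text is then available as `chat_response.choices[0].message.content`.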
For image inputs, structure requests as shown below. Make sure to order the images in `"content"` as they appear in the prompt (i.e. any in-context examples before the main input).
```python
import base64

def encode_image(image_path):
    """
    Encode the image file to base64 string
    """
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

base64_image = encode_image("0.jpg")
base64_image2 = encode_image("1.jpg")

chat_response = client.chat.completions.create(
    model="numind/NuExtract-2.0-8B",
    temperature=0,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}, # first ICL example image
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image2}"}}, # real input image
            ],
        },
    ],
    extra_body={
        "chat_template_kwargs": {
            "template": json.dumps(json.loads("""{\"store\": \"verbatim-string\"}"""), indent=4),
            "examples": [
                {
                    "input": "<image>",
                    "output": """{\"store\": \"Walmart\"}"""
                }
            ]
        },
    }
)
print("Chat response:", chat_response)
```
24
added_tokens.json
Normal file
@@ -0,0 +1,24 @@
{
  "</tool_call>": 151658,
  "<tool_call>": 151657,
  "<|box_end|>": 151649,
  "<|box_start|>": 151648,
  "<|endoftext|>": 151643,
  "<|file_sep|>": 151664,
  "<|fim_middle|>": 151660,
  "<|fim_pad|>": 151662,
  "<|fim_prefix|>": 151659,
  "<|fim_suffix|>": 151661,
  "<|im_end|>": 151645,
  "<|im_start|>": 151644,
  "<|image_pad|>": 151655,
  "<|object_ref_end|>": 151647,
  "<|object_ref_start|>": 151646,
  "<|quad_end|>": 151651,
  "<|quad_start|>": 151650,
  "<|repo_name|>": 151663,
  "<|video_pad|>": 151656,
  "<|vision_end|>": 151653,
  "<|vision_pad|>": 151654,
  "<|vision_start|>": 151652
}
3
chat_template.json
Normal file
File diff suppressed because one or more lines are too long
51
config.json
Normal file
@@ -0,0 +1,51 @@
{
  "_name_or_path": "experiments/Qwen2.5_3B-final/checkpoint-117160",
  "architectures": [
    "Qwen2_5_VLForConditionalGeneration"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 2048,
  "image_token_id": 151655,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 128000,
  "max_window_layers": 70,
  "model_type": "qwen2_5_vl",
  "num_attention_heads": 16,
  "num_hidden_layers": 36,
  "num_key_value_heads": 2,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "mrope_section": [
      16,
      24,
      24
    ],
    "rope_type": "default",
    "type": "default"
  },
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.49.0",
  "use_cache": true,
  "use_sliding_window": false,
  "video_token_id": 151656,
  "vision_config": {
    "hidden_size": 1280,
    "in_chans": 3,
    "model_type": "qwen2_5_vl",
    "out_hidden_size": 2048,
    "spatial_patch_size": 14,
    "tokens_per_second": 2,
    "torch_dtype": "bfloat16"
  },
  "vision_end_token_id": 151653,
  "vision_start_token_id": 151652,
  "vision_token_id": 151654,
  "vocab_size": 151936
}
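
A few derived quantities follow from the attention settings in this config. The sketch below assumes the common convention that the per-head dimension is `hidden_size / num_attention_heads`; the variable names are illustrative and not fields of the config file.

```python
# Values copied from the config.json diff above.
config = {
    "hidden_size": 2048,
    "num_attention_heads": 16,
    "num_key_value_heads": 2,
    "rope_scaling": {"mrope_section": [16, 24, 24]},
}

# Assumed convention: the hidden size is split evenly across attention heads.
head_dim = config["hidden_size"] // config["num_attention_heads"]
# Grouped-query attention: how many query heads share each key/value head.
queries_per_kv_head = config["num_attention_heads"] // config["num_key_value_heads"]
# The three mrope sections together cover half of the head dimension.
mrope_total = sum(config["rope_scaling"]["mrope_section"])

print(head_dim)             # 128
print(queries_per_kv_head)  # 8
print(mrope_total)          # 64
```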
13
generation_config.json
Normal file
@@ -0,0 +1,13 @@
{
  "attn_implementation": "flash_attention_2",
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.05,
  "temperature": 1e-06,
  "transformers_version": "4.49.0"
}
90
logo_nuextract.svg
Normal file
@@ -0,0 +1,90 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->

<svg
   version="1.1"
   id="svg1"
   width="1520.4797"
   height="292.58524"
   viewBox="0 0 1520.4798 292.58524"
   xmlns="http://www.w3.org/2000/svg"
   xmlns:svg="http://www.w3.org/2000/svg">
  <defs
     id="defs1">
    <clipPath
       clipPathUnits="userSpaceOnUse"
       id="clipPath13">
      <path
         d="M 0,341.62 H 1100.11 V 0 H 0 Z"
         transform="translate(-296.42451,-172.17601)"
         id="path13" />
    </clipPath>
    <clipPath
       clipPathUnits="userSpaceOnUse"
       id="clipPath15">
      <path
         d="M 0,341.62 H 1100.11 V 0 H 0 Z"
         transform="translate(-232.38961,-267.80771)"
         id="path15" />
    </clipPath>
    <clipPath
       clipPathUnits="userSpaceOnUse"
       id="clipPath17">
      <path
         d="M 0,341.62 H 1100.11 V 0 H 0 Z"
         transform="translate(-122.20671,-236.63431)"
         id="path17" />
    </clipPath>
    <clipPath
       clipPathUnits="userSpaceOnUse"
       id="clipPath19">
      <path
         d="M 0,341.62 H 1100.11 V 0 H 0 Z"
         transform="translate(-117.495,-121.51351)"
         id="path19" />
    </clipPath>
    <clipPath
       clipPathUnits="userSpaceOnUse"
       id="clipPath21">
      <path
         d="M 0,341.62 H 1100.11 V 0 H 0 Z"
         transform="translate(-225.1954,-81.860705)"
         id="path21" />
    </clipPath>
  </defs>
  <path
     id="path12"
     d="m 0,0 v 0 c 1.104,3.005 2.142,6.323 3.014,9.839 v 0 C 15.425,-2.193 20.455,-12.66 13.315,-25.357 5.134,-41.848 -32.768,-69.221 -50.814,-72.11 c 2.78,6.99 39.967,81.895 -12.419,126.995 28.42,-10.17 49.405,-30.511 52.717,-33.265 0.018,-25.369 -7.695,-45.257 -11.521,-56.247 C 0.177,-26.332 9.806,-11.907 0,0"
     style="fill:#e4843a;fill-opacity:1;fill-rule:nonzero;stroke:none"
     transform="matrix(1.3333333,0,0,-1.3333333,273.36248,144.06706)"
     clip-path="url(#clipPath13)" />
  <path
     id="path14"
     d="m 0,0 v 0 c -2.517,1.979 -5.352,3.991 -8.426,5.907 v 0 C 6.852,13.993 18.361,15.542 28.231,4.828 41.386,-8.049 55.707,-52.555 52.878,-70.61 47.09,-65.806 -12.658,-7.292 -71.739,-43.178 -53.284,-19.292 -27.454,-5.619 -23.811,-3.32 0.322,-11.143 16.853,-24.624 26.122,-31.659 25.098,-7.969 14.354,5.646 0,0"
     style="fill:#a37fb8;fill-opacity:1;fill-rule:nonzero;stroke:none"
     transform="matrix(1.3333333,0,0,-1.3333333,187.98261,16.558133)"
     clip-path="url(#clipPath15)" />
  <path
     id="path16"
     d="m 0,0 v 0 c -2.66,-1.782 -5.45,-3.856 -8.221,-6.189 v 0 c -2.97,17.03 -0.886,28.454 12.354,34.53 16.311,8.532 63.064,8.399 79.361,0.129 C 77.137,24.45 3.024,-14.292 18.896,-81.57 1.882,-56.638 -3.14,-27.847 -4.2,-23.672 10.697,-3.138 28.627,8.418 38.181,15.06 15.334,21.407 -0.934,15.396 0,0"
     style="fill:#87b7e0;fill-opacity:1;fill-rule:nonzero;stroke:none"
     transform="matrix(1.3333333,0,0,-1.3333333,41.072076,58.122663)"
     clip-path="url(#clipPath17)" />
<path
|
||||||
|
id="path18"
|
||||||
|
d="m 0,0 v 0 c 0.873,-3.08 1.984,-6.375 3.345,-9.731 v 0 c -17.113,2.438 -27.335,7.95 -29.022,22.419 -3.074,18.15 11.5,62.573 24.401,75.518 C 0.583,80.917 14.527,-1.541 83.417,-7.236 54.447,-15.712 25.513,-11.591 21.215,-11.31 6.29,9.204 0.84,29.827 -2.525,40.967 -15.621,21.199 -14.931,3.869 0,0"
|
||||||
|
style="fill:#99b535;fill-opacity:1;fill-rule:nonzero;stroke:none"
|
||||||
|
transform="matrix(1.3333333,0,0,-1.3333333,34.789806,211.61706)"
|
||||||
|
clip-path="url(#clipPath19)" />
|
||||||
|
<path
|
||||||
|
id="path20"
|
||||||
|
d="m 0,0 v 0 c 3.199,-0.121 6.676,-0.083 10.289,0.174 v 0 C 2.681,-15.348 -5.719,-23.366 -20.002,-20.499 -38.213,-17.814 -75.959,9.774 -84.283,26.044 -76.776,25.56 5.954,13.34 32.659,77.099 31.768,46.927 18.908,20.683 17.312,16.682 -6.81,8.826 -28.108,10.016 -39.742,10.258 -24.988,-8.305 -8.294,-13.005 0,0"
|
||||||
|
style="fill:#f0cf35;fill-opacity:1;fill-rule:nonzero;stroke:none"
|
||||||
|
transform="matrix(1.3333333,0,0,-1.3333333,178.39034,264.48746)"
|
||||||
|
clip-path="url(#clipPath21)" />
|
||||||
|
<path
|
||||||
|
style="font-weight:500;font-size:266.667px;font-family:Avenir;-inkscape-font-specification:'Avenir Medium';white-space:pre;fill:#535353;stroke-width:0;stroke-linejoin:round;stroke-miterlimit:5.4;stroke-opacity:0"
|
||||||
|
d="m 331.94501,49.293372 h 33.60005 L 469.54518,204.49356 h 0.53334 V 49.293372 h 25.60003 V 238.09361 H 463.14517 L 358.07838,82.893417 h -0.53333 V 238.09361 H 331.94501 Z M 650.07859,238.09361 h -24.00001 v -19.46669 h -0.53333 q -4.53334,10.13334 -15.73336,16.53335 -11.20001,6.13334 -25.86669,6.13334 -9.33335,0 -17.60003,-2.93334 -8.26667,-2.66667 -14.66668,-8.53334 -6.13334,-5.86667 -9.86668,-14.93335 -3.73334,-9.33335 -3.73334,-21.8667 v -81.33343 h 24.00003 v 74.66676 q 0,8.80001 2.40001,15.20002 2.4,6.13334 6.4,10.13334 4.00001,3.73334 9.06668,5.60001 5.33334,1.6 10.93335,1.6 7.46667,0 13.86668,-2.4 6.40001,-2.4 11.20002,-7.46668 4.8,-5.33334 7.46667,-13.33335 2.66667,-8.00001 2.66667,-18.93335 v -65.06675 h 24.00001 z m 42.4,-188.800238 h 121.8668 v 24.00003 h -96.2668 v 56.266738 h 89.6001 v 24.00003 h -89.6001 v 60.53341 h 101.0668 v 24.00003 h -126.6668 z m 191.467,121.333488 -44.8001,-58.93341 h 30.9334 l 30.6667,44.80006 30.1333,-44.80006 h 29.0667 l -43.2,58.93341 51.2001,67.46675 h -30.9334 l -37.3334,-53.06674 -37.3334,53.06674 h -29.6 z m 171.46661,-38.13338 h -34.4001 v 57.3334 q 0,5.33334 0.2667,10.66668 0.2667,5.06667 1.8667,9.33334 1.8666,4.00001 5.3333,6.66668 3.7333,2.4 10.6667,2.4 4.2667,0 8.8,-0.8 4.5333,-0.8 8.2667,-2.93334 v 21.8667 q -4.2667,2.4 -11.2,3.2 -6.6667,1.06667 -10.4001,1.06667 -13.8666,0 -21.6,-3.73334 -7.4667,-4 -11.2,-10.13334 -3.46671,-6.13334 -4.26671,-13.60002 -0.5333,-7.73334 -0.5333,-15.46669 v -65.86674 h -27.7334 v -20.80003 h 27.7334 V 76.226737 h 24.00001 v 35.466713 h 34.4001 z m 30.9335,-20.80003 h 24.0001 v 19.46669 h 0.5333 q 2.4,-5.06667 6.4,-9.06668 4,-4.26667 8.8,-7.2 5.0667,-2.93334 10.9334,-4.53334 5.8666,-1.86667 11.7333,-1.86667 5.8667,0 10.6667,1.6 l -1.0667,25.8667 q -2.9333,-0.8 -5.8666,-1.33334 -2.9334,-0.53333 -5.8667,-0.53333 -17.6,0 -26.9334,9.86668 -9.3333,9.86668 -9.3333,30.6667 v 63.46675 h -24.0001 z m 99.2004,15.46669 q 10.1333,-9.33335 23.4667,-13.86669 13.3333,-4.8 26.6667,-4.8 13.8666,0 
23.7333,3.46667 10.1334,3.46667 16.5334,9.33334 6.4,5.86668 9.3333,13.60002 3.2,7.46668 3.2,15.73335 v 64.53341 q 0,6.66668 0.2667,12.26669 0.2667,5.6 0.8,10.66668 h -21.3334 q -0.8,-9.60002 -0.8,-19.20003 h -0.5333 q -8,12.26668 -18.9334,17.33336 -10.9333,5.06667 -25.3333,5.06667 -8.8,0 -16.8,-2.4 -8.0001,-2.40001 -14.1334,-7.20001 -5.8667,-4.80001 -9.3333,-11.73335 -3.4667,-7.20001 -3.4667,-16.53335 0,-12.26668 5.3333,-20.53336 5.6,-8.26668 14.9334,-13.33335 9.6,-5.33334 22.1333,-7.46668 12.8001,-2.4 27.2001,-2.4 h 17.6 v -5.33334 q 0,-4.80001 -1.8667,-9.60001 -1.8666,-4.80001 -5.6,-8.53335 -3.7333,-4 -9.3333,-6.13334 -5.6,-2.4 -13.3334,-2.4 -6.9333,0 -12.2667,1.33334 -5.0666,1.33333 -9.3333,3.46667 -4.2667,1.86667 -7.7333,4.53334 -3.4667,2.66667 -6.6667,5.06667 z m 67.7334,50.13339 q -8.5334,0 -17.6,1.06667 -8.8,0.8 -16.2667,3.46667 -7.2,2.66667 -12,7.46668 -4.5334,4.8 -4.5334,12.26668 0,10.93334 7.2,15.73335 7.4667,4.80001 20.0001,4.80001 9.8666,0 16.8,-3.20001 6.9333,-3.46667 11.2,-8.80001 4.2667,-5.33334 6.1333,-11.73335 1.8667,-6.66667 1.8667,-13.06668 v -8.00001 z m 160.5333,-32.00004 q -6.6667,-6.93334 -14.1333,-10.40001 -7.2001,-3.73334 -17.3334,-3.73334 -9.8667,0 -17.3334,3.73334 -7.2,3.46667 -12.2666,9.86668 -4.8,6.13334 -7.4667,14.40002 -2.4,8.00001 -2.4,16.80002 0,8.80001 2.9333,16.80002 2.9334,7.73334 8.2667,13.60001 5.3333,5.86668 12.8,9.33335 7.4667,3.2 16.8,3.2 10.1334,0 17.3334,-3.46667 7.2,-3.73334 13.3333,-10.66668 l 17.0667,17.06669 q -9.3333,10.40001 -21.8667,14.93335 -12.2666,4.53334 -26.1333,4.53334 -14.6667,0 -26.9334,-4.80001 -12,-4.8 -20.8,-13.33335 -8.8,-8.80001 -13.6,-20.80002 -4.8,-12.26668 -4.8,-26.93337 0,-14.66668 4.8,-26.93336 4.8,-12.26669 13.3333,-21.0667 8.8,-8.80001 20.8,-13.60001 12.2667,-5.06668 27.2001,-5.06668 13.8667,0 26.4,5.06668 12.8,4.8 22.1334,14.93335 z m 105.8669,-12.80001 h -34.4 v 57.3334 q 0,5.33334 0.2667,10.66668 0.2666,5.06667 1.8666,9.33334 1.8667,4.00001 5.3334,6.66668 3.7333,2.4 10.6667,2.4 4.2666,0 
8.8,-0.8 4.5333,-0.8 8.2666,-2.93334 v 21.8667 q -4.2666,2.4 -11.2,3.2 -6.6666,1.06667 -10.4,1.06667 -13.8667,0 -21.6,-3.73334 -7.4667,-4 -11.2,-10.13334 -3.4667,-6.13334 -4.2667,-13.60002 -0.5333,-7.73334 -0.5333,-15.46669 v -65.86674 h -27.7334 v -20.80003 h 27.7334 V 76.226737 h 24 v 35.466713 h 34.4 z"
|
||||||
|
id="text21"
|
||||||
|
aria-label="NuExtract" />
|
||||||
|
</svg>
|
||||||
|
Size: 8.4 KiB
151388
merges.txt
Normal file
File diff suppressed because it is too large
3
model-00001-of-00002.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5d7170bac9af7ae026034f57b7ebe79a165a10e30f8ed375083b0a15035cb5bb
size 4997750760
3
model-00002-of-00002.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:08c4fd700780ec25d8da3ccfb7e9b4fd4adddc0933afbcb3bdcbc46072b49cc2
size 2511587184
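The two safetensors entries above are Git LFS pointer files, not the weights themselves: each records a spec version, a SHA-256 object ID, and the byte size of the real file. A minimal sketch of reading those fields and comparing the summed shard sizes with the `total_size` value recorded in model.safetensors.index.json follows. The helper name `parse_lfs_pointer` is hypothetical, and the reading of the difference as per-file header overhead is an assumption, not something this commit states.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the two shard entries in this commit.
POINTERS = [
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:5d7170bac9af7ae026034f57b7ebe79a165a10e30f8ed375083b0a15035cb5bb\n"
    "size 4997750760\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:08c4fd700780ec25d8da3ccfb7e9b4fd4adddc0933afbcb3bdcbc46072b49cc2\n"
    "size 2511587184\n",
]

# "total_size" from the metadata block of model.safetensors.index.json.
INDEX_TOTAL_SIZE = 7509245952

shards = [parse_lfs_pointer(p) for p in POINTERS]
total_on_disk = sum(s["size"] for s in shards)
overhead = total_on_disk - INDEX_TOTAL_SIZE

print("bytes on disk:", total_on_disk)
print("index total_size:", INDEX_TOTAL_SIZE)
print("difference:", overhead)
```

The shard files add up to slightly more than `total_size`. That small gap is plausibly the metadata header stored at the start of each shard, but treat that as an assumption until checked against the downloaded files.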
831
model.safetensors.index.json
Normal file
@@ -0,0 +1,831 @@
|
|||||||
|
{
|
||||||
|
"metadata": {
|
||||||
|
"total_size": 7509245952
|
||||||
|
},
|
||||||
|
"weight_map": {
|
||||||
|
"model.embed_tokens.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.0.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.0.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.0.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.1.input_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.1.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.1.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.1.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.10.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.10.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.10.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.10.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.10.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.10.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.10.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.11.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.11.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.11.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.11.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.11.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.11.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.11.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.12.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.12.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.12.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.12.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.12.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.12.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.12.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.13.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.13.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.13.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.13.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.13.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.13.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.13.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.13.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.13.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.14.input_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.14.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.14.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.14.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.14.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.14.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.14.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.14.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.14.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.14.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.14.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.14.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.15.input_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.15.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.15.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.15.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.15.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.15.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.15.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.15.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.15.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.15.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.15.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.15.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.16.input_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.16.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.16.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.16.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.16.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.16.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.16.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.16.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.16.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.16.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.16.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.16.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.17.input_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.17.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.17.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.17.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.17.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.17.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.17.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.17.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.17.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.17.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.17.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.17.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.18.input_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.18.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.18.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.18.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.18.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.18.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.18.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.18.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.18.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.18.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.18.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.18.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.19.input_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.19.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.19.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.19.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.19.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.19.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.19.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.19.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.19.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.19.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.19.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.19.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.2.input_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.2.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.2.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.2.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"model.layers.20.input_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.20.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.20.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.20.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.20.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.20.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.20.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.20.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.20.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.20.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.20.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.20.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.21.input_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.21.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.21.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.21.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.21.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.21.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.21.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.21.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.21.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.21.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.21.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.21.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.22.input_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.22.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.22.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.22.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.22.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.22.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.22.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.22.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.22.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.22.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.22.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.22.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.23.input_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.23.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.23.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.23.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.23.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.23.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.23.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.23.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.23.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.23.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.23.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.23.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.24.input_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.24.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.24.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.24.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.24.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.24.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.24.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.24.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.24.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.24.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.24.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.24.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.25.input_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.25.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.25.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.25.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.25.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.25.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.25.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.25.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.25.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.25.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.25.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.25.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.26.input_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.26.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.26.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.26.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.26.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.26.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.26.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.26.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.26.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.26.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.26.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.26.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.27.input_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.27.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.27.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.27.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.27.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.27.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.27.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.27.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.27.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.27.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.27.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.27.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
|
||||||
|
"model.layers.28.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.28.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.28.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.28.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.29.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.29.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.3.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.30.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.30.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.30.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.30.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.31.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.31.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.32.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.32.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.33.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.33.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.34.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.34.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.35.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.35.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.4.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.k_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.q_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.v_proj.bias": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.norm.weight": "model-00002-of-00002.safetensors",
"visual.blocks.0.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.0.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.0.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.0.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.0.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.0.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.0.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.0.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.0.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.0.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.1.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.1.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.1.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.1.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.1.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.1.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.1.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.1.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.1.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.1.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.1.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.1.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.10.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.10.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.10.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.10.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.10.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.10.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.10.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.10.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.10.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.11.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.11.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.11.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.11.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.11.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.11.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.11.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.11.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.11.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.12.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.12.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.12.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.12.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.12.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.12.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.12.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.12.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.12.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.13.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.13.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.13.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.13.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.13.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.13.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.13.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.13.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.13.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.13.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.14.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.14.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.14.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.14.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.14.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.14.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.14.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.14.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.14.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.14.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.14.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.14.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.15.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.15.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.15.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.15.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.15.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.15.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.15.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.15.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.15.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.15.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.15.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.15.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.16.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.16.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.16.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.16.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.16.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.16.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.16.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.16.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.16.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.16.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.16.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.16.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.17.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.17.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.17.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.17.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.17.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.17.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.17.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.17.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.17.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.17.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.17.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.17.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.18.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.18.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.18.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.18.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.18.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.18.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.18.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.18.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.18.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.18.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.18.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.18.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.19.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.19.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.19.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.19.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.19.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.19.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.19.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.19.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.19.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.19.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.19.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.19.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.2.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.2.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.2.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.2.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.2.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.2.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.2.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.2.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.2.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.2.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.2.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.2.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.20.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.20.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.20.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.20.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.20.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.20.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.20.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.20.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.20.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.20.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.20.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.20.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.21.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.21.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.21.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.21.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.21.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.21.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.21.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.21.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.21.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.21.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.21.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.21.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.22.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.22.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.22.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.22.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.22.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.22.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.22.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.22.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.22.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.22.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.22.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.22.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.23.attn.proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.23.attn.proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.23.attn.qkv.bias": "model-00001-of-00002.safetensors",
"visual.blocks.23.attn.qkv.weight": "model-00001-of-00002.safetensors",
"visual.blocks.23.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.23.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.23.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.23.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.23.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
"visual.blocks.23.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"visual.blocks.23.norm1.weight": "model-00001-of-00002.safetensors",
"visual.blocks.23.norm2.weight": "model-00001-of-00002.safetensors",
"visual.blocks.24.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.24.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.24.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.24.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.24.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.24.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.24.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.24.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.24.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.24.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.24.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.24.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.25.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.25.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.25.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.25.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.25.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.25.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.25.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.25.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.25.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.25.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.25.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.25.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.26.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.26.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.26.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.26.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.26.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.26.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.26.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.26.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.26.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.26.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.26.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.26.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.27.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.27.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.27.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.27.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.27.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.27.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.27.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.27.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.27.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.27.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.27.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.27.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.28.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.28.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.28.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.28.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.28.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.28.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.28.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.28.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.28.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.28.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.28.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.28.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.29.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.29.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.29.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.29.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.29.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.29.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.29.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.29.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.29.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.29.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.29.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.29.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.3.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.3.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.3.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.3.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.3.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.3.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.3.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.3.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.3.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.3.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.3.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.3.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.30.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.30.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.30.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.30.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.30.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.30.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.30.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.30.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.30.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.30.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.30.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.30.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.31.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.31.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.31.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.31.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.31.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.31.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.31.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.31.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.31.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.31.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.31.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.31.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.4.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.4.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.4.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.4.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.4.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.4.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.4.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.4.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.4.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.4.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.4.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.4.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.5.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.5.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.5.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.5.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.5.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.5.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.5.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.5.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.5.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.5.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.6.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.6.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.6.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.6.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.6.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.6.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.6.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.6.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.6.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.7.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.7.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.7.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.7.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.7.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.7.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.7.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.7.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.7.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.8.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.8.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.8.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.8.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.8.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.8.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.8.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.8.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.8.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.9.attn.proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.9.attn.proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.9.attn.qkv.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.9.attn.qkv.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.9.mlp.down_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.9.mlp.gate_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.9.mlp.up_proj.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.9.norm1.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.blocks.9.norm2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.merger.ln_q.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.merger.mlp.0.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.merger.mlp.0.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.merger.mlp.2.bias": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.merger.mlp.2.weight": "model-00001-of-00002.safetensors",
|
||||||
|
"visual.patch_embed.proj.weight": "model-00001-of-00002.safetensors"
|
||||||
|
}
|
||||||
|
}
|
||||||
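The listing above maps each parameter name to the shard file that stores it. As a minimal sketch, assuming the file follows the usual sharded-checkpoint index layout with a top-level `weight_map` object, the mapping can be inverted to see which parameters live in each shard. The helper name `group_by_shard` and the two-entry sample are illustrative only and are not part of this repository.

```python
import json
from collections import defaultdict


def group_by_shard(index_text: str) -> dict[str, list[str]]:
    """Return {shard file name: sorted parameter names stored in that shard}."""
    # Assumption: the index JSON has a top-level "weight_map" object.
    weight_map = json.loads(index_text)["weight_map"]
    shards = defaultdict(list)
    for param_name, shard_file in weight_map.items():
        shards[shard_file].append(param_name)
    return {shard: sorted(names) for shard, names in shards.items()}


# Tiny illustrative index built from two entries of the listing above.
sample_index = json.dumps({
    "weight_map": {
        "visual.merger.ln_q.weight": "model-00001-of-00002.safetensors",
        "visual.patch_embed.proj.weight": "model-00001-of-00002.safetensors",
    }
})
grouped = group_by_shard(sample_index)
```

In practice the same function could be given the full text of the index file; only the standard library is needed.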
3
nuextract2_bench.png
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b2cdf1eec686510aaa05e91d098ddda56f4674e7448a3e4b66e50a915240b545
size 106243
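The three lines above are a Git LFS pointer: the repository stores this small text stub while the real file is kept in LFS storage. A minimal sketch of reading such a pointer, assuming the `key value` per-line layout shown here; `parse_lfs_pointer` is a hypothetical helper, not a Git LFS API.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    # Each non-empty line is "<key> <value>"; split on the first space only.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:b2cdf1eec686510aaa05e91d098ddda56f4674e7448a3e4b66e50a915240b545\n"
    "size 106243\n"
)
```

The `size` field gives a quick way to check a downloaded file before comparing its SHA-256 digest against `digest`.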
29
preprocessor_config.json
Normal file
@@ -0,0 +1,29 @@
{
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "Qwen2VLImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "max_pixels": 23000000,
  "merge_size": 2,
  "min_pixels": 200704,
  "patch_size": 14,
  "processor_class": "Qwen2_5_VLProcessor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "longest_edge": 23000000,
    "shortest_edge": 200704
  },
  "temporal_patch_size": 2
}
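The pixel limits in this config are easier to read as visual-token budgets. The sketch below assumes the Qwen2-VL-style convention in which each visual token covers a `merge_size` x `merge_size` group of `patch_size` x `patch_size` patches; it only does the arithmetic and does not reproduce the processor's actual resizing logic. The variable names are illustrative.

```python
# Values copied from preprocessor_config.json above.
PATCH_SIZE = 14
MERGE_SIZE = 2
MIN_PIXELS = 200704
MAX_PIXELS = 23000000
RESCALE_FACTOR = 0.00392156862745098

# Assumption: one merged visual token covers (14 * 2) x (14 * 2) = 28 x 28 pixels.
pixels_per_token = (PATCH_SIZE * MERGE_SIZE) ** 2

# Approximate lower and upper bounds on visual tokens per image.
min_tokens = MIN_PIXELS // pixels_per_token
max_tokens = MAX_PIXELS // pixels_per_token

# The rescale factor maps 8-bit pixel values (0..255) into the range 0..1.
rescale_is_1_over_255 = abs(RESCALE_FACTOR - 1 / 255) < 1e-12
```

Under these assumptions `min_pixels` corresponds to exactly 256 tokens and `max_pixels` to roughly 29 thousand, which is the quantity to watch when budgeting context length for large document images.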
31
special_tokens_map.json
Normal file
@@ -0,0 +1,31 @@
{
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>",
    "<|object_ref_start|>",
    "<|object_ref_end|>",
    "<|box_start|>",
    "<|box_end|>",
    "<|quad_start|>",
    "<|quad_end|>",
    "<|vision_start|>",
    "<|vision_end|>",
    "<|vision_pad|>",
    "<|image_pad|>",
    "<|video_pad|>"
  ],
  "eos_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
3
tokenizer.json
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ba0c439f7be467bf47d12a7e6f9adc6116201056fc60c67f431c679b7c16afc8
size 11422064
216
tokenizer_config.json
Normal file
File diff suppressed because one or more lines are too long
1
vocab.json
Normal file
File diff suppressed because one or more lines are too long