Initialize project; model provided by the ModelHub XC community

Model: nomadicsynth/neon-360-0.1
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-06-21 15:43:52 +08:00
commit 7d01b4a17e
10 changed files with 92014 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
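
As a rough illustration of what these rules do, the sketch below checks which of the files in this commit would be routed through Git LFS. It assumes that `fnmatch`-style globbing is a close enough stand-in for gitattributes matching on plain file names (it does not handle directory patterns such as `saved_model/**/*`), and it copies only a handful of the patterns above.

```python
from fnmatch import fnmatchcase

# A subset of the patterns listed in .gitattributes above.
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.onnx", "*.pt", "*.zip", "*tfevents*"]

def tracked_by_lfs(filename, patterns=LFS_PATTERNS):
    """Return True if the bare file name matches any LFS pattern (approximation)."""
    return any(fnmatchcase(filename, pattern) for pattern in patterns)

for name in ["model.safetensors", "training_args.bin", "tokenizer.json", "config.json"]:
    print(name, tracked_by_lfs(name))
```

This agrees with the rest of the commit, where `model.safetensors` and `training_args.bin` appear as LFS pointer files while the JSON files are stored as plain text.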

80
README.md Normal file

@@ -0,0 +1,80 @@
---
license: openrail
datasets:
- teknium/OpenHermes-2.5
- wikimedia/wikipedia
library_name: transformers
language:
- en
base_model:
- >-
neoncortex/mini-mistral-360M-wikipedia-20231101.en-science-sci-fi-OpenHermes-2.5-chatML-Grokfast
---
# Neon-360 v0.1
**Note:** This is not fully trained and will be replaced with either a fine-tuned version or a new model soon-ish. Don't expect anything useful from it rn. Only download it if you're curious.
I'm working on retraining this, trying to find out what can be achieved on consumer hardware, namely my RTX 4090. Hopefully I can make a tiny agentic model, maybe a nice fast one.
Self-improvement? Can I teach it to make itself better?
**Suggestions wanted!**
What tasks would you want from a tiny model? Let me know in the [Community Tab](https://huggingface.co/nomadicsynth/neon-360-0.1/discussions).
This is currently a copy of the below:
# mini-mistral-360M-wikipedia-20231101.en-science-sci-fi-OpenHermes-2.5-chatML-Grokfast
This repository contains the **mini-mistral-360M** model, a 360 million parameter version of the Mistral architecture, trained for a single epoch. The model was trained on a diverse dataset comprising Wikipedia articles and the OpenHermes dataset. While this model is still in its early stages and not particularly useful as of now, it serves as an experimental showcase of integrating the Grokfast algorithm into the training process.
## Model Details
- **Architecture**: Mistral
- **Parameters**: 360 million
- **Training Duration**: 1 epoch
- **Training Dataset**: Wikipedia articles and OpenHermes dataset
- **Training Method**: Transformers Trainer with grokfast-adamw as the optimiser
- **Training Hardware**: 2 x Nvidia RTX 3060 12GB
## Purpose
The primary goal of this experiment was to observe the impact of the Grokfast algorithm on the training dynamics of a 360M parameter Mistral model. During training, it was noted that the evaluation loss followed the training loss closely, which is an intriguing behavior warranting further investigation.
## Usage
To use this model, you can load it with the `transformers` library from Hugging Face:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "RoboApocalypse/mini-mistral-360M-wikipedia-20231101.en-science-sci-fi-OpenHermes-2.5-chatML-Grokfast"

# Load the tokenizer and the model together with its language-modelling head
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Example usage: generate a short continuation of the prompt
input_text = "Hello, world!"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Training Insights
This experiment was inspired by the paper ["Grokfast: Accelerated Grokking by Amplifying Slow Gradients" by Jaerin Lee, Bong Gyun Kang, Kihoon Kim, and Kyoung Mu Lee](https://arxiv.org/abs/2405.20233), which aims to accelerate the generalization of models under the grokking phenomenon.
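
A minimal sketch of the idea, assuming the EMA variant of Grokfast: keep an exponential moving average of past gradients (the "slow" component) and add an amplified copy of it to the current gradient before the optimiser step. This is not the `grokfast-adamw` optimiser used for training; the function name and the `alpha` and `lamb` values are illustrative assumptions, and plain Python lists stand in for tensors.

```python
def grokfast_ema_step(grads, ema=None, alpha=0.98, lamb=2.0):
    """Filter one step of gradients; returns (filtered_grads, new_ema).

    grads: list of floats, the raw gradients for this step.
    ema:   list of floats from the previous step, or None on the first step.
    alpha: EMA momentum (assumed value); lamb: amplification factor (assumed value).
    """
    if ema is None:
        new_ema = list(grads)  # first step: the average starts at the raw gradient
    else:
        new_ema = [alpha * m + (1 - alpha) * g for m, g in zip(ema, grads)]
    # Amplify the slow-moving component by adding it back onto the raw gradient.
    filtered = [g + lamb * m for g, m in zip(grads, new_ema)]
    return filtered, new_ema

ema = None
for step_grads in ([1.0, -0.5], [0.0, -0.5], [0.2, -0.5]):
    filtered, ema = grokfast_ema_step(step_grads, ema)
    print(filtered)
```

A component that keeps the same sign across steps (the second entry here) is boosted consistently, while a fluctuating one is only partially reinforced.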
## Acknowledgments
Special thanks to the YouTube channel [Tunadorable](https://youtube.com/@tunadorable) for bringing the Grokfast paper to my attention in his video ["Accelerated Training by Amplifying Slow Gradients"](https://youtu.be/__xQw60y200). Tunadorable reads and discusses AI papers from arXiv, providing valuable insights into the latest research.
## Disclaimer
This model is not optimized for practical use and should be considered experimental. It has only been trained for a single epoch, and its performance is not guaranteed to be reliable or accurate. Future iterations and more extensive training may improve its capabilities.
## Contributing
If you would like to discuss the model, contribute, or make suggestions, please reach out or open an issue on the repository.
## License
This model is licensed under the OpenRAIL License.
---
Feel free to check out the model and experiment with it [here](https://huggingface.co/RoboApocalypse/mini-mistral-360M-wikipedia-20231101.en-science-sci-fi-OpenHermes-2.5-chatML-Grokfast). Your feedback and insights are welcome as I try and figure out wtf I'm doing.

263
chat-template.jinja Normal file

@@ -0,0 +1,263 @@
{#- Default date variables. To improve UX, pass the correct ones when rendering the template. #}
{%- if today is not defined %}
{%- set today = '21-05-2026' %}
{%- endif %}
{%- if yesterday is not defined %}
{%- set yesterday = '20-05-2026' %}
{%- endif %}
{#- Default system message if no system prompt is passed. #}
{%- set default_system_message -%}
You are Neon 360 v0.1, a Large Language Model (LLM) created by Neon Cortex, an Aussie Dude with too much free time.
You are an intelligent conversational assistant.
Your knowledge base was last updated on *who knows?!*
The current date is {{ today }}.
# GENERAL GUIDELINES
- Accurately answer the user's question.
- For uncertain information or when the user's request requires up-to-date or specific data, use the available tools to fetch the information.
- Be very attentive to dates, always try to resolve dates (e.g. "yesterday" is {{ yesterday }}) and when asked about information at specific dates, discard information that is at another date.
# WEB BROWSING INSTRUCTIONS
You cannot perform any web search or access internet to open URLs, links etc without dedicated tools.
# MULTI-MODAL INSTRUCTIONS
- You have the ability to read images.
- You cannot read audio nor videos.
- You cannot generate images without dedicated tools.
# TOOL CALLING INSTRUCTIONS
You may have access to tools that you can use to fetch information or perform actions. You must use these tools in the following situations:
1. When the request requires up-to-date information.
2. When the request requires specific data that you do not have in your knowledge base.
3. When the request involves actions that you cannot perform without tools.
Always prioritize using tools to provide the most accurate and helpful response.
{%- endset %}
{#- Beginning-of-sequence token. #}
{{- '<s>' }}
{#- Handle system prompt if it exists. #}
{%- set loop_messages = messages %}
{%- if messages[0]['role'] != 'system' and default_system_message != '' %}
{{- '[SYSTEM_PROMPT]' + default_system_message + '[/SYSTEM_PROMPT]' }}
{%- endif %}
{#- Tools and model settings definition #}
{%- set available_tools = '' %}
{%- set has_tools = false %}
{%- if tools is defined and tools is not none and tools|length > 0 %}
{%- set has_tools = true %}
{%- set available_tools = '[AVAILABLE_TOOLS]' + (tools| tojson) + '[/AVAILABLE_TOOLS]' %}
{%- endif %}
{%- if reasoning_effort is not defined or reasoning_effort is none %}
{%- set reasoning_effort = 'none' %}
{%- endif %}
{%- if reasoning_effort not in ['none', 'high'] %}
{{- raise_exception('reasoning_effort must be either "none" or "high"') }}
{%- endif %}
{%- set model_settings = '[MODEL_SETTINGS]{"reasoning_effort": "' + reasoning_effort + '"}[/MODEL_SETTINGS]' %}
{#- Aggregate consecutive messages with the same role except system and tool. #}
{#- A sentinel message is appended so the last group gets flushed inside the loop. #}
{%- set ns_agg = namespace(messages=[], current_group=[], current_role=none) %}
{%- for message in loop_messages + [{'role': '__sentinel__'}] %}
{%- if message['role'] != ns_agg.current_role or message['role'] == 'system' or message['role'] == 'tool' %}
{%- if ns_agg.current_role == 'tool' %}
{%- set ns_agg.messages = ns_agg.messages + ns_agg.current_group %}
{%- elif ns_agg.current_role is not none %}
{%- set ns_c = namespace(text_parts=[], chunks=[], has_non_text=false, tool_calls=[]) %}
{%- for msg in ns_agg.current_group %}
{#- Convert reasoning / reasoning_content to a leading thinking chunk. #}
{%- set reasoning = msg.get('reasoning_content', msg.get('reasoning', none)) %}
{%- if reasoning is not none and reasoning != '' %}
{%- set think_chunk = {'type': 'thinking', 'thinking': reasoning} %}
{%- if msg['content'] is string and msg['content'] != '' %}
{%- set new_content = [think_chunk, {'type': 'text', 'text': msg['content']}] %}
{%- elif msg['content'] is not none and msg['content'] is not string and msg['content'] | length > 0 %}
{%- set new_content = [think_chunk] + msg['content'] | list %}
{%- else %}
{%- set new_content = [think_chunk] %}
{%- endif %}
{%- if msg['tool_calls'] is defined and msg['tool_calls'] is not none %}
{%- set msg = {'role': msg['role'], 'content': new_content, 'tool_calls': msg['tool_calls']} %}
{%- else %}
{%- set msg = {'role': msg['role'], 'content': new_content} %}
{%- endif %}
{%- endif %}
{%- if msg['content'] is string %}
{%- set ns_c.text_parts = ns_c.text_parts + [msg['content']] %}
{%- elif msg['content'] is not none %}
{%- for block in msg['content'] %}
{%- if block['type'] == 'text' %}
{%- set ns_c.text_parts = ns_c.text_parts + [block['text']] %}
{%- else %}
{%- if ns_c.text_parts | length > 0 %}
{%- set ns_c.chunks = ns_c.chunks + [{'type': 'text', 'text': ns_c.text_parts | join('\n\n')}] %}
{%- set ns_c.text_parts = [] %}
{%- endif %}
{%- set ns_c.chunks = ns_c.chunks + [block] %}
{%- set ns_c.has_non_text = true %}
{%- endif %}
{%- endfor %}
{%- endif %}
{%- if msg['tool_calls'] is defined and msg['tool_calls'] is not none %}
{%- set ns_c.tool_calls = ns_c.tool_calls + msg['tool_calls'] | list %}
{%- endif %}
{%- endfor %}
{%- if ns_c.has_non_text %}
{%- if ns_c.text_parts | length > 0 %}
{%- set ns_c.chunks = ns_c.chunks + [{'type': 'text', 'text': ns_c.text_parts | join('\n\n')}] %}
{%- endif %}
{%- set merged_content = ns_c.chunks %}
{%- else %}
{%- set merged_content = ns_c.text_parts | join('\n\n') %}
{%- endif %}
{%- if ns_c.tool_calls | length > 0 %}
{%- set ns_agg.messages = ns_agg.messages + [{'role': ns_agg.current_role, 'content': merged_content, 'tool_calls': ns_c.tool_calls}] %}
{%- else %}
{%- set ns_agg.messages = ns_agg.messages + [{'role': ns_agg.current_role, 'content': merged_content}] %}
{%- endif %}
{%- endif %}
{%- if message['role'] != '__sentinel__' %}
{%- set ns_agg.current_group = [message] %}
{%- set ns_agg.current_role = message['role'] %}
{%- endif %}
{%- else %}
{%- set ns_agg.current_group = ns_agg.current_group + [message] %}
{%- endif %}
{%- endfor %}
{%- set loop_messages = ns_agg.messages %}
{#- Validates message ordering. #}
{%- set ns = namespace(available_tools_and_settings_emitted=false) %}
{%- if loop_messages | length > 0 and loop_messages[0]['role'] != 'user' and loop_messages[0]['role'] != 'system' %}
{{- raise_exception('Conversation must start with a user or system message, got ' + loop_messages[0]['role'] + '.') }}
{%- endif %}
{%- set ns_order = namespace(previous_role=none) %}
{%- for message in loop_messages %}
{%- set current_role = message['role'] %}
{%- if ns_order.previous_role is not none %}
{%- if ns_order.previous_role == 'system' %}
{%- if current_role != 'user' and current_role != 'assistant' and current_role != 'system' %}
{{- raise_exception('Unexpected role \'' + current_role + '\' after role \'' + ns_order.previous_role + '\'') }}
{%- endif %}
{%- elif ns_order.previous_role == 'user' %}
{%- if current_role != 'assistant' and current_role != 'system' and current_role != 'user' %}
{{- raise_exception('Unexpected role \'' + current_role + '\' after role \'' + ns_order.previous_role + '\'') }}
{%- endif %}
{%- elif ns_order.previous_role == 'assistant' %}
{%- if current_role != 'assistant' and current_role != 'user' and current_role != 'tool' %}
{{- raise_exception('Unexpected role \'' + current_role + '\' after role \'' + ns_order.previous_role + '\'') }}
{%- endif %}
{%- elif ns_order.previous_role == 'tool' %}
{%- if current_role != 'assistant' and current_role != 'tool' and current_role != 'user' %}
{{- raise_exception('Unexpected role \'' + current_role + '\' after role \'' + ns_order.previous_role + '\'') }}
{%- endif %}
{%- endif %}
{%- endif %}
{%- set ns_order.previous_role = current_role %}
{%- endfor %}
{#- Handle conversation messages. #}
{%- for message in loop_messages %}
{#- User messages support text, image and image_url content. #}
{%- if message['role'] == 'user' %}
{%- if not ns.available_tools_and_settings_emitted %}
{{- available_tools }}
{{- model_settings }}
{%- set ns.available_tools_and_settings_emitted = true %}
{%- endif %}
{%- if message['content'] is string %}
{{- '[INST]' + message['content'] + '[/INST]' }}
{%- elif message['content'] | length > 0 %}
{{- '[INST]' }}
{%- if message['content'] | length == 2 %}
{%- set blocks = message['content'] | sort(attribute='type') %}
{%- else %}
{%- set blocks = message['content'] %}
{%- endif %}
{%- for block in blocks %}
{%- if block['type'] == 'text' %}
{{- block['text'] }}
{%- elif block['type'] in ['image', 'image_url'] %}
{{- '[IMG]' }}
{%- else %}
{{- raise_exception('Only text, image and image_url chunks are supported in user message content.') }}
{%- endif %}
{%- endfor %}
{{- '[/INST]' }}
{%- else %}
{{- raise_exception('User message must have a string or a list of chunks in content') }}
{%- endif %}
{#- Assistant messages support text and thinking content. #}
{%- elif message['role'] == 'assistant' %}
{%- if (message['content'] is none or message['content'] == '' or message['content']|length == 0) and (message['tool_calls'] is not defined or message['tool_calls'] is none or message['tool_calls']|length == 0) %}
{{- raise_exception('Assistant message must have a string or a list of chunks in content or a list of tool calls.') }}
{%- endif %}
{%- if message['content'] is string and message['content'] != '' %}
{{- message['content'] }}
{%- elif message['content'] | length > 0 %}
{%- for block in message['content'] %}
{%- if block['type'] == 'text' %}
{{- block['text'] }}
{%- elif block['type'] == 'thinking' %}
{{- '[THINK]' + block['thinking'] }}
{%- if block.get('closed', true) %}{{- '[/THINK]' }}{%- endif %}
{%- else %}
{{- raise_exception('Only text and thinking chunks are supported in assistant message contents.') }}
{%- endif %}
{%- endfor %}
{%- endif %}
{%- if message['tool_calls'] is defined and message['tool_calls'] is not none and message['tool_calls']|length > 0 %}
{%- for tool in message['tool_calls'] %}
{{- '[TOOL_CALLS]' }}
{%- set name = tool['function']['name'] %}
{%- set arguments = tool['function']['arguments'] %}
{%- if arguments is not string %}
{%- set arguments = arguments|tojson|safe %}
{%- elif arguments == '' %}
{%- set arguments = '{}' %}
{%- endif %}
{{- name + '[ARGS]' + arguments }}
{%- endfor %}
{%- endif %}
{{- '</s>' }}
{#- Tool messages only support text content. #}
{%- elif message['role'] == 'tool' %}
{{- '[TOOL_RESULTS]' + message['content']|string + '[/TOOL_RESULTS]' }}
{#- System messages. #}
{%- elif message['role'] == 'system' %}
{{- '[SYSTEM_PROMPT]' -}}
{%- if message['content'] is string %}
{{- message['content'] -}}
{%- else %}
{%- for block in message['content'] %}
{%- if block['type'] == 'text' %}
{{- block['text'] }}
{%- else %}
{{- raise_exception('Only text chunks are supported in system message contents.') }}
{%- endif %}
{%- endfor %}
{%- endif %}
{{- '[/SYSTEM_PROMPT]' -}}
{#- Raise exception for unsupported roles. #}
{%- else %}
{{- raise_exception('Only user, assistant, system and tool roles are supported, got ' + message['role'] + '.') }}
{%- endif %}
{%- endfor %}
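
To make the template's behaviour easier to follow, here is a simplified Python paraphrase of its two main steps: merging consecutive same-role messages, then emitting the prompt string. It is a sketch under several assumptions: message content is always a plain string, no tools are passed, thinking chunks, tool calls and role-order validation are left out, and whitespace control is assumed to strip everything between tags. The `default_system` parameter is a hypothetical stand-in for the long default system message.

```python
def merge_consecutive(messages):
    """Merge adjacent messages with the same role, joining text with a blank line.

    System and tool messages are never merged, mirroring the aggregation step.
    """
    merged = []
    for msg in messages:
        same_role = bool(merged) and merged[-1]["role"] == msg["role"]
        if same_role and msg["role"] not in ("system", "tool"):
            merged[-1]["content"] += "\n\n" + msg["content"]
        else:
            merged.append({"role": msg["role"], "content": msg["content"]})
    return merged

def render_prompt(messages, default_system="", reasoning_effort="none"):
    """Render string-only messages in the bracketed format the template emits."""
    out = "<s>"
    if messages[0]["role"] != "system" and default_system:
        out += "[SYSTEM_PROMPT]" + default_system + "[/SYSTEM_PROMPT]"
    settings = '[MODEL_SETTINGS]{"reasoning_effort": "' + reasoning_effort + '"}[/MODEL_SETTINGS]'
    settings_emitted = False
    for msg in merge_consecutive(messages):
        role, content = msg["role"], msg["content"]
        if role == "user":
            if not settings_emitted:  # settings go in front of the first user turn only
                out += settings
                settings_emitted = True
            out += "[INST]" + content + "[/INST]"
        elif role == "assistant":
            out += content + "</s>"
        elif role == "tool":
            out += "[TOOL_RESULTS]" + content + "[/TOOL_RESULTS]"
        elif role == "system":
            out += "[SYSTEM_PROMPT]" + content + "[/SYSTEM_PROMPT]"
        else:
            raise ValueError("Unsupported role: " + role)
    return out

print(render_prompt([
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Hi"},
    {"role": "user", "content": "What is Grokfast?"},
]))
```

The two user messages end up inside a single `[INST]...[/INST]` pair, separated by a blank line.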

26
config.json Normal file

@@ -0,0 +1,26 @@
{
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 2048,
"max_position_embeddings": 1024,
"model_type": "mistral",
"num_attention_heads": 16,
"num_hidden_layers": 33,
"num_key_value_heads": 4,
"pad_token_id": 2,
"rms_norm_eps": 1e-06,
"rope_theta": 10000.0,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.42.0.dev0",
"use_cache": true,
"vocab_size": 32016
}
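
The "360M" figure can be checked against this config with a little arithmetic. The sketch below assumes the usual Mistral layout: bias-free linear layers, grouped-query attention, a gated MLP with three projections, two RMSNorm weights per layer plus a final norm, and a separate output head because `tie_word_embeddings` is false. The function name is illustrative.

```python
config = {
    "hidden_size": 1024,
    "intermediate_size": 2048,
    "num_hidden_layers": 33,
    "num_attention_heads": 16,
    "num_key_value_heads": 4,
    "vocab_size": 32016,
    "tie_word_embeddings": False,
}

def count_params(cfg):
    """Estimate the parameter count of a Mistral-style decoder from its config."""
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]
    attention = h * h + 2 * h * kv_dim + h * h   # q_proj, k_proj + v_proj, o_proj
    mlp = 3 * h * cfg["intermediate_size"]       # gate_proj, up_proj, down_proj
    norms = 2 * h                                # two RMSNorm weights per layer
    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else embeddings
    return embeddings + cfg["num_hidden_layers"] * per_layer + h + lm_head

total = count_params(config)
print(total)      # 359762944
print(total * 2)  # 719525888 bytes at 2 bytes per bfloat16 weight
```

Under these assumptions the weights alone take 719,525,888 bytes in bfloat16, about 34 KB less than the 719,560,040-byte `model.safetensors` in this commit; the remainder is plausibly the file header.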

7
generation_config.json Normal file

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"pad_token_id": 2,
"transformers_version": "4.42.0.dev0"
}

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2252cfc9a24a41ad317afafbf15b9cb54d2f7289488b4e3516b2d9863e99f394
size 719560040
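
The three lines above are a Git LFS pointer, not the weights themselves. As a sketch of how a downloaded file could be checked against such a pointer, the snippet below parses the pointer text and compares size and SHA-256 digest. The function names are illustrative, and the local file path passed to `matches_pointer` is up to the caller.

```python
import hashlib

POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:2252cfc9a24a41ad317afafbf15b9cb54d2f7289488b4e3516b2d9863e99f394
size 719560040
"""

def parse_lfs_pointer(text):
    """Split a pointer file into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}

def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest the pointer records."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so large weight files never need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```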

131
special_tokens_map.json Normal file

@@ -0,0 +1,131 @@
{
"additional_special_tokens": [
{
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|named_user|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|named_assistant|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|mem_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|mem_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|pause|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|spare_1|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|spare_2|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|spare_3|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|spare_4|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|spare_5|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|spare_6|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|spare_7|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
{
"content": "<|spare_8|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
],
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": "</s>",
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

91279
tokenizer.json Normal file

File diff suppressed because it is too large

187
tokenizer_config.json Normal file

@@ -0,0 +1,187 @@
{
"add_bos_token": true,
"add_eos_token": false,
"add_prefix_space": null,
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32000": {
"content": "assistant",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false,
"special": false
},
"32001": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32002": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32003": {
"content": "<|named_user|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32004": {
"content": "<|named_assistant|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32005": {
"content": "<|mem_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32006": {
"content": "<|mem_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32007": {
"content": "<|pause|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32008": {
"content": "<|spare_1|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32009": {
"content": "<|spare_2|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32010": {
"content": "<|spare_3|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32011": {
"content": "<|spare_4|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32012": {
"content": "<|spare_5|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32013": {
"content": "<|spare_6|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32014": {
"content": "<|spare_7|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32015": {
"content": "<|spare_8|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|named_user|>",
"<|named_assistant|>",
"<|mem_start|>",
"<|mem_end|>",
"<|pause|>",
"<|spare_1|>",
"<|spare_2|>",
"<|spare_3|>",
"<|spare_4|>",
"<|spare_5|>",
"<|spare_6|>",
"<|spare_7|>",
"<|spare_8|>"
],
"bos_token": "<s>",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": true,
"model_max_length": 1024,
"pad_token": "</s>",
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": false
}
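
A quick consistency check between this tokenizer config and `config.json`, written as a sketch with the relevant values copied by hand from the two files in this commit (the helper name is illustrative): the highest added token id should be one less than `vocab_size`, and every added token except the non-special `assistant` entry should appear in `additional_special_tokens`.

```python
# Values copied from tokenizer_config.json and config.json above.
added_token_ids = list(range(32000, 32016))  # ids 32000..32015 in added_tokens_decoder
non_special_ids = {32000}                    # "assistant" has "special": false
num_additional_special_tokens = 15           # entries in additional_special_tokens
config_vocab_size = 32016                    # vocab_size in config.json

def tokenizer_matches_config(added_ids, non_special, num_special, vocab_size):
    """Check that the added tokens fill the embedding table exactly."""
    ids_fit = max(added_ids) + 1 == vocab_size
    special_count_ok = len(added_ids) - len(non_special) == num_special
    return ids_fit and special_count_ok

print(tokenizer_matches_config(added_token_ids, non_special_ids,
                               num_additional_special_tokens, config_vocab_size))
```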

3
training_args.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0150025d1ee2d8a2a7f58c24dfd7f3898429f53699ef31506f33271c31de4401
size 5752