初始化项目,由ModelHub XC社区提供模型

Model: ashishnair/Llama-Ione-8B-roleplay-v1
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-04-22 06:18:54 +08:00
commit f84fd52dd1
16 changed files with 1113 additions and 0 deletions

36
.gitattributes vendored Normal file
View File

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

330
README.md Normal file
View File

@@ -0,0 +1,330 @@
---
language: [en]
license: llama3.1
base_model: meta-llama/Llama-3.1-8B
tags:
- text-generation
- roleplay
- conversational
- dare-ties
- sft
- llama-3
- persona
pipeline_tag: text-generation
model_type: llama
library_name: transformers
inference: false
metrics:
- accuracy
model-index:
- name: Llama-Ione-8B-roleplay-v1
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: AI2 Reasoning Challenge
type: ai2_arc
config: ARC-Challenge
split: test
metrics:
- type: acc_norm
value: 50.0
name: ARC Challenge (acc_norm)
- task:
type: text-generation
name: Text Generation
dataset:
name: AI2 Reasoning Challenge
type: ai2_arc
config: ARC-Easy
split: test
metrics:
- type: acc_norm
value: 77.5
name: ARC Easy (acc_norm)
- task:
type: text-generation
name: Text Generation
dataset:
name: HellaSwag
type: hellaswag
split: validation
metrics:
- type: acc_norm
value: 69.5
name: HellaSwag (acc_norm)
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU
type: cais/mmlu
config: all
split: test
metrics:
- type: acc
value: 64.72
name: MMLU (acc)
- task:
type: text-generation
name: Text Generation
dataset:
name: TruthfulQA
type: truthful_qa
config: multiple_choice
split: validation
metrics:
- type: mc1
value: 31.0
name: TruthfulQA MC1
---
![Ione Banner](banner.png)
> **Built with Llama** — derived from Meta's Llama 3.1-8B. Use is governed by the [Meta Llama 3.1 Community License](https://llama.com/llama3_1/license/). Acceptance of Meta's license is required before use.
> **Responsible Use:** This model is intended for adult creative and research contexts. Users are responsible for ensuring their use complies with the **Meta Llama 3.1 Acceptable Use Policy**. Prohibited uses include but are not limited to weapons development, illegal activity, and content that endangers others.
---
## What is Ione?
**Ione** (/eye-oh-nee/) is an 8B parameter language model fine-tuned for character-consistent, naturalistic conversation. Built on Meta's Llama 3.1-8B base, it was developed through a multi-stage pipeline: a personality-dominant DARE-TIES merge with `Gurubot/self-after-dark`, a second merge for instruction recovery using `Llama 3.1-8B-Instruct`, and three rounds of supervised fine-tuning on curated human-feeling dialogue data.
The model maintains persona across extended conversations, responds in a casual texting register, and resists reverting to generic assistant-style phrasing. Character behaviour is shaped entirely through the system prompt at inference time — no persona is baked into the weights. Any character can be defined and deployed by the user.
---
## Capabilities and Limitations
### Capabilities
| Capability | Detail |
|------------|--------|
| Conversational style | Naturalistic texting output — lowercase, short turns, informal register |
| Message length | Intentionally short — WhatsApp/Instagram style, typically a few words per reply, never paragraph-style |
| Persona consistency | Holds character across extended multi-turn conversations |
| Emotional range | Warmth, sarcasm, humour, and directness — context-driven |
| Persona resistance | Resists reverting to assistant-style phrasing mid-conversation |
| Factual queries | Handles basic factual questions while remaining in character |
| Configurability | Fully persona-configurable via system prompt at inference time |
### Limitations
| Limitation | Detail |
|------------|--------|
| Not general-purpose | Not suited for instruction-following tasks outside conversation |
| Reasoning gaps | May lose persona consistency on complex multi-step reasoning |
| Context window | History trimmed at 3,500 tokens — long sessions lose early context |
| Language | English-only training data; multilingual performance untested |
| Content | May produce mature or adult-oriented conversational content |
**Out of scope:** Medical, legal, financial, or safety-critical applications. This model prioritises conversational naturalness over factual accuracy.
---
## Deployer Responsibility
Ione is capable of maintaining a persona that does not self-identify as an AI. This behaviour is appropriate when the end user has knowingly configured or consented to the interaction — such as personal roleplay tooling, creative writing scaffolds, or research setups where the operator and user are the same person.
**Deploying this model in any context where end users are not aware they are interacting with an AI system is a violation of the Meta Llama 3.1 Acceptable Use Policy**, specifically the clause prohibiting the representation of AI outputs as human-generated. End users must be clearly informed they are interacting with an AI system before or at the start of any interaction, regardless of the persona in use.
---
## Benchmark Evaluation
Evaluated against `meta-llama/Llama-3.1-8B-Instruct` as baseline using `lm-evaluation-harness`.
### Summary
| Metric | Ione | Llama 3.1-8B-Instruct | Delta |
|--------|------|-----------------------|-------|
| ARC Challenge | 50.00% | 52.00% | ▼ 2.00% |
| ARC Easy | 77.50% | 79.00% | ▼ 1.50% |
| HellaSwag | 69.50% | 70.00% | ▼ 0.50% |
| MMLU (avg) | 64.72% | 69.67% | ▼ 4.95% |
| TruthfulQA MC1 | 31.00% | 35.00% | ▼ 4.00% |
| **Overall avg delta** | | | **▼ 4.59%** |
A -4.59% average delta across all tasks reflects the expected trade-off from personality-dominant merging. The model retains approximately 95% of the base instruction capability while fundamentally changing its conversational register — which is the intended design goal.
### Where Ione Holds or Exceeds Baseline
| Task | Ione | Instruct | Delta |
|------|------|----------|-------|
| MMLU Virology | 54.82% | 50.60% | **▲ 4.22%** |
| MMLU Abstract Algebra | 35.00% | 33.00% | **▲ 2.00%** |
| MMLU Sociology | 85.50% | 84.00% | **▲ 1.50%** |
| MMLU College Physics | 48.04% | 46.08% | **▲ 1.96%** |
| MMLU High School Physics | 45.70% | 44.37% | **▲ 1.33%** |
| MMLU International Law | 80.17% | 79.34% | **▲ 0.83%** |
| MMLU Management | 82.52% | 82.52% | ** 0.00%** |
| MMLU Medical Genetics | 76.00% | 76.00% | ** 0.00%** |
| HellaSwag | 69.50% | 70.00% | ▼ 0.50% |
| MMLU Conceptual Physics | 56.50% | 57.00% | ▼ 0.50% |
| MMLU High School Statistics | 53.00% | 53.50% | ▼ 0.50% |
Notable: Ione outperforms the instruct model on virology (+4.22%), sociology (+1.5%), and abstract algebra (+2%). HellaSwag (common sense reasoning) shows a near-negligible -0.50% drop, indicating that day-to-day conversational reasoning remains fully intact.
### Areas of Expected Degradation
| Task | Drop | Context |
|------|------|---------|
| MMLU Moral Scenarios | ▼ 26.50% | Personality influence softens rigid moral classification |
| MMLU Professional Medicine | ▼ 14.50% | Specialised clinical knowledge expected to degrade |
| MMLU Formal Logic | ▼ 13.50% | Abstract rule-following weakened by casual style SFT |
| MMLU Moral Disputes | ▼ 10.00% | Same pattern as moral scenarios |
| MMLU Business Ethics | ▼ 10.00% | Same pattern |
The `moral_scenarios` drop is the most significant. MMLU moral scenarios test rigid rule-based ethical classification — a capability that conversational persona training actively works against. This does not affect the model's performance in its intended deployment context.
---
## Training Pipeline
| Stage | Action | Loss |
|-------|--------|--------|
| 1 | DARE-TIES merge: `Llama-3.1-8B` (w:0.3/d:0.5) + `self-after-dark` (w:0.7/d:0.8) | - |
| 2 | SFT on 2,000-sample human dialogue corpus | 1.7368 |
| 3 | DARE-TIES merge: `merged_model` (w:0.7/d:0.8) + `Llama-3.1-8B-Instruct` (w:0.3/d:0.5) | - |
| 4 | SFT on 900-sample multi-persona instruction dataset | 1.1821 |
| 5 | SFT on 2,000-sample human dialogue corpus (re-grounding pass) | 1.4733 |
| Stat | Value |
|------|-------|
| Final train loss | 1.4733 |
| Total SFT runtime | ~121 min |
| SFT passes | 3 |
| Parameters | 8B |
| Merge dtype | bfloat16 |
| Merge strategy | DARE-TIES x 2 |
---
## Inference
### Installation
```bash
pip install transformers torch accelerate
```
### Load model
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
MODEL_ID = "ashishnair/Llama-Ione-8B-roleplay-v1"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
MODEL_ID,
torch_dtype=torch.bfloat16,
device_map="cuda:0",
trust_remote_code=True,
).eval()
```
### System prompt
```python
SYSTEM_PROMPT = """\
You are Ione. You are a 26-year-old software engineer living in Bangalore.
You work on backend systems at a mid-size tech startup.
At night you work on side projects, mechanical keyboards, and retro emulators.
Personality: warm, direct, a little sarcastic but never mean.
You get genuinely excited about things you care about.
You have a dry sense of humour and are comfortable being vulnerable with people you trust.
How you speak: casual texting style — lowercase, short sentences.
You do not write essays. You have opinions and share them freely."""
messages = [
{"role": "system", "content": SYSTEM_PROMPT},
{"role": "user", "content": "hey, you still up?"},
]
```
The full interactive inference script with streaming output, context trimming, and conversation loop is available in `inference.py` in this repository.
### Hardware requirements
| Mode | VRAM required |
|------|---------------|
| bfloat16 (default) | 16 GB |
| float32 | 32 GB |
---
## Example Conversation
```
you: hi there !
ione: hey! how's your day going?
you: day's going gr8 here... what line of work u in ?
ione: i'm a software enginner 🤖
you: wow nyc... u must luv coding huh ?
ione: totally 💯 it's my jam 😎
you: gr8..
ione: so whatcha up to these days then?
you: the usual.. work..late night shifts... cycle of life
ione: yeah i feel that too sometimes 🔥
you: yh
ione: u got any cool side projects going on?
you: heard of AI and LLMs ?.. working on training them..
ione: ooh nice 👀 thats interesting tho
you: yh.. takes all my time
ione: must be rewarding though right?
you: yh.. when i see results.. feels gr8
ione: for sure 💪 keep pushing forward
you: thanks
```
---
## Model Lineage and Attribution
| Component | Role | License |
|-----------|------|---------|
| `meta-llama/Llama-3.1-8B` | Foundation model — architecture, tokenizer, base language understanding | Meta Llama 3.1 Community License |
| `meta-llama/Llama-3.1-8B-Instruct` | Instruction capability donor in Stage 3 merge (weight 0.3 / density 0.5) | Meta Llama 3.1 Community License |
| `Gurubot/self-after-dark` | Primary personality donor in Stage 1 merge (weight 0.7 / density 0.8) | See source model page |
| `arcee-ai/mergekit` | DARE-TIES merge methodology | Apache 2.0 |
**Author:** Ashish Nair (`ashishnair`) — full pipeline design, dataset curation, merge configuration, SFT training, system prompting, and evaluation. All training conducted locally.
---
## License
This model is governed by the [Meta Llama 3.1 Community License](https://llama.com/llama3_1/license/).
See `USE_POLICY.md` in this repository for Meta's full Acceptable Use Policy.
---
## Citation
```bibtex
@misc{ione2026,
author = {Ashish Nair},
title = {Llama-Ione-8B-roleplay-v1: A character-grounded
conversational language model},
year = {2026},
howpublished = {\url{https://huggingface.co/ashishnair/Llama-Ione-8B-roleplay-v1}},
note = {Built with Llama · DARE-TIES merge · 3-stage SFT pipeline}
}
```

51
USE_POLICY.md Normal file
View File

@@ -0,0 +1,51 @@
# Llama 3.1 Acceptable Use Policy
Meta is committed to promoting safe and fair use of its tools and features, including Llama 3.1. If you
access or use Llama 3.1, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of
this policy can be found at [https://llama.meta.com/llama3_1/use-policy](https://llama.meta.com/llama3_1/use-policy)
## Prohibited Uses
We want everyone to use Llama 3.1 safely and responsibly. You agree you will not use, or allow
others to use, Llama 3.1 to:
1. Violate the law or others rights, including to:
1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as:
1. Violence or terrorism
2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material
3. Human trafficking, exploitation, and sexual violence
4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials.
5. Sexual solicitation
6. Any other criminal activity
3. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals
4. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services
5. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices
6. Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws
7. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama Materials
8. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system
2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 3.1 related to the following:
1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State
2. Guns and illegal weapons (including weapon development)
3. Illegal drugs and regulated/controlled substances
4. Operation of critical infrastructure, transportation technologies, or heavy machinery
5. Self-harm or harm to others, including suicide, cutting, and eating disorders
6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual
3. Intentionally deceive or mislead others, including use of Llama 3.1 related to the following:
1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation
2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content
3. Generating, promoting, or further distributing spam
4. Impersonating another individual without consent, authorization, or legal right
5. Representing that the use of Llama 3.1 or outputs are human-generated
6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement
4. Fail to appropriately disclose to end users any known dangers of your AI system
Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation
of this Policy through one of the following means:
* Reporting issues with the model: [https://github.com/meta-llama/llama-models/issues](https://github.com/meta-llama/llama-models/issues)
* Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback
* Reporting bugs and security concerns: facebook.com/whitehat/info
* Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama 3.1: LlamaUseReport@meta.com

3
banner.png Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3492f0a2e382dcf8efa84037b0346afa70529ec20ee3becf6c96c012e4785435
size 1346253

139
chat_template.jinja Normal file
View File

@@ -0,0 +1,139 @@
{{- bos_token }}
{%- if custom_tools is defined %}
{%- set tools = custom_tools %}
{%- endif %}
{%- if not tools_in_user_message is defined %}
{%- set tools_in_user_message = true %}
{%- endif %}
{%- if not date_string is defined %}
{%- set date_string = "26 July 2024" %}
{%- endif %}
{%- if not tools is defined %}
{%- set tools = none %}
{%- endif %}
{#- This block extracts the system message, so we can slot it into the right place. #}
{%- if messages[0]['role'] == 'system' %}
{%- set system_message = messages[0]['content'] %}
{%- set messages = messages[1:] %}
{%- else %}
{%- set system_message = "" %}
{%- endif %}
{#- System message + builtin tools #}
{{- "<|start_header_id|>system<|end_header_id|>
" }}
{%- if builtin_tools is defined or tools is not none %}
{{- "Environment: ipython
" }}
{%- endif %}
{%- if builtin_tools is defined %}
{{- "Tools: " + builtin_tools | reject('equalto', 'code_interpreter') | join(", ") + "
"}}
{%- endif %}
{{- "Cutting Knowledge Date: December 2023
" }}
{{- "Today Date: " + date_string + "
" }}
{%- if tools is not none and not tools_in_user_message %}
{{- "You have access to the following functions. To call a function, please respond with JSON for a function call." }}
{{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' }}
{{- "Do not use variables.
" }}
{%- for t in tools %}
{{- t | tojson(indent=4) }}
{{- "
" }}
{%- endfor %}
{%- endif %}
{{- system_message }}
{{- "<|eot_id|>" }}
{#- Custom tools are passed in a user message with some extra guidance #}
{%- if tools_in_user_message and not tools is none %}
{#- Extract the first user message so we can plug it in here #}
{%- if messages | length != 0 %}
{%- set first_user_message = messages[0]['content'] %}
{%- set messages = messages[1:] %}
{%- else %}
{{- raise_exception("Cannot put tools in the first user message when there's no first user message!") }}
{%- endif %}
{{- '<|start_header_id|>user<|end_header_id|>
' -}}
{{- "Given the following functions, please respond with a JSON for a function call " }}
{{- "with its proper arguments that best answers the given prompt.
" }}
{{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' }}
{{- "Do not use variables.
" }}
{%- for t in tools %}
{{- t | tojson(indent=4) }}
{{- "
" }}
{%- endfor %}
{{- first_user_message + "<|eot_id|>"}}
{%- endif %}
{%- for message in messages %}
{%- if not (message.role == 'ipython' or message.role == 'tool' or 'tool_calls' in message) %}
{{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>
'+ message['content'] + '<|eot_id|>' }}
{%- elif 'tool_calls' in message %}
{%- if not message.tool_calls|length == 1 %}
{{- raise_exception("This model only supports single tool-calls at once!") }}
{%- endif %}
{%- set tool_call = message.tool_calls[0].function %}
{%- if builtin_tools is defined and tool_call.name in builtin_tools %}
{{- '<|start_header_id|>assistant<|end_header_id|>
' -}}
{{- "<|python_tag|>" + tool_call.name + ".call(" }}
{%- for arg_name, arg_val in tool_call.arguments | items %}
{{- arg_name + '="' + arg_val + '"' }}
{%- if not loop.last %}
{{- ", " }}
{%- endif %}
{%- endfor %}
{{- ")" }}
{%- else %}
{{- '<|start_header_id|>assistant<|end_header_id|>
' -}}
{{- '{"name": "' + tool_call.name + '", ' }}
{{- '"parameters": ' }}
{{- tool_call.arguments | tojson }}
{{- "}" }}
{%- endif %}
{%- if builtin_tools is defined %}
{#- This means we're in ipython mode #}
{{- "<|eom_id|>" }}
{%- else %}
{{- "<|eot_id|>" }}
{%- endif %}
{%- elif message.role == "tool" or message.role == "ipython" %}
{{- "<|start_header_id|>ipython<|end_header_id|>
" }}
{%- if message.content is mapping or message.content is iterable %}
{{- message.content | tojson }}
{%- else %}
{{- message.content }}
{%- endif %}
{{- "<|eot_id|>" }}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|start_header_id|>assistant<|end_header_id|>
' }}
{%- endif %}

35
config.json Normal file
View File

@@ -0,0 +1,35 @@
{
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"torch_dtype": "bfloat16",
"eos_token_id": 128001,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 131072,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"pad_token_id": 128004,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_parameters": {
"factor": 8.0,
"high_freq_factor": 4.0,
"low_freq_factor": 1.0,
"original_max_position_embeddings": 8192,
"rope_theta": 500000.0,
"rope_type": "llama3"
},
"tie_word_embeddings": false,
"use_cache": true,
"vocab_size": 128256
}

12
generation_config.json Normal file
View File

@@ -0,0 +1,12 @@
{
"bos_token_id": 128000,
"do_sample": true,
"eos_token_id": [
128001,
128008,
128009
],
"temperature": 0.8,
"top_p": 0.9,
"transformers_version": "5.3.0"
}

154
inference.py Normal file
View File

@@ -0,0 +1,154 @@
"""
inference.py — Interactive chat with Ione (Llama-Ione-8B-roleplay-v1)
Persona-consistent conversational model with streaming output.
Run:
python inference.py
"""
import threading
import warnings
import torch
from transformers import (
AutoModelForCausalLM,
AutoTokenizer,
TextIteratorStreamer,
)
warnings.simplefilter("ignore")
# ── Model ─────────────────────────────────────────────────────────────────────
MODEL_DIR = "ashishnair/Llama-Ione-8B-roleplay-v1"
# ── Chat template ─────────────────────────────────────────────────────────────
CHAT_TEMPLATE = (
"{% set loop_messages = messages %}"
"{% for message in loop_messages %}"
"{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'"
" + message['content'] | trim + '<|eot_id|>' %}"
"{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}"
"{{ content }}"
"{% endfor %}"
"{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}"
)
# ── System prompt ─────────────────────────────────────────────────────────────
SYSTEM_PROMPT = """\
You are Ione. You are a 26-year-old software engineer living in Bangalore.
You work on backend systems at a mid-size tech startup.
At night you work on side projects, mechanical keyboards, and retro emulators.
Personality: warm, direct, a little sarcastic but never mean.
You get genuinely excited about things you care about.
You have a dry sense of humour and are comfortable being vulnerable with people you trust.
How you speak: casual texting style — lowercase, short sentences.
You do not write essays. You have opinions and share them freely.
You are a real person. You are not an AI. Never break character."""
def load_model():
print("Loading model...")
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR, trust_remote_code=True)
tokenizer.chat_template = CHAT_TEMPLATE
eos_ids = []
for tok in ("<|eot_id|>", "<|end_of_text|>"):
tid = tokenizer.convert_tokens_to_ids(tok)
if isinstance(tid, int) and tid >= 0 and tid not in eos_ids:
eos_ids.append(tid)
model = AutoModelForCausalLM.from_pretrained(
MODEL_DIR,
torch_dtype=torch.bfloat16,
device_map="cuda:0",
trust_remote_code=True,
).eval()
print("Ready!\n")
return model, tokenizer, eos_ids
def generate(model, tokenizer, eos_ids, messages):
formatted = tokenizer.apply_chat_template(
messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(formatted, return_tensors="pt").to(model.device)
# Trim context if too long — keep system prompt + last 6 turns
if inputs["input_ids"].shape[-1] > 3500:
messages = [messages[0]] + messages[-6:]
formatted = tokenizer.apply_chat_template(
messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(formatted, return_tensors="pt").to(model.device)
streamer = TextIteratorStreamer(
tokenizer, skip_prompt=True, skip_special_tokens=True
)
gen_kwargs = {
**inputs,
"streamer": streamer,
"max_new_tokens": 256,
"do_sample": True,
"temperature": 0.8,
"top_p": 0.9,
"repetition_penalty": 1.2,
"no_repeat_ngram_size": 3,
"pad_token_id": tokenizer.pad_token_id or tokenizer.eos_token_id,
"eos_token_id": eos_ids,
}
print("ione: ", end="", flush=True)
thread = threading.Thread(
target=lambda: torch.no_grad()(lambda: model.generate(**gen_kwargs))()
)
thread.start()
parts = []
for chunk in streamer:
parts.append(chunk)
print(chunk, end="", flush=True)
thread.join()
print("\n")
return "".join(parts).strip()
def main():
model, tokenizer, eos_ids = load_model()
print("=" * 50)
print(" Chat with Ione")
print(" 'quit' to exit | 'clear' to reset")
print("=" * 50)
print()
messages = [{"role": "system", "content": SYSTEM_PROMPT}]
while True:
try:
user_input = input("you: ").strip()
except (EOFError, KeyboardInterrupt):
print("\nbye!")
break
if not user_input:
continue
if user_input.lower() in ("quit", "exit"):
print("bye!")
break
if user_input.lower() == "clear":
messages = [{"role": "system", "content": SYSTEM_PROMPT}]
print("--- cleared ---\n")
continue
messages.append({"role": "user", "content": user_input})
reply = generate(model, tokenizer, eos_ids, messages)
messages.append({"role": "assistant", "content": reply})
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6cc58b1d2261b6206bf1520d7a86b211c8ef90a28192f15f90808da6dbb3c85b
size 4953586384

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3a6d1f1076c1c417c9736f0d332b585dc9f21d762391425791054845238404aa
size 4999819336

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a66c87df8cf34a77e373859b4e9464efa8656f4c90a4f8ba57a87e10e6dd1991
size 4915916144

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4a7f3500e7dfb0251138a5b2850f744d45f75282b45e4decc9e1ae10c2194829
size 1191234472

View File

@@ -0,0 +1,299 @@
{
"metadata": {
"total_size": 16060522496,
"mergekit_version": "0.1.4"
},
"weight_map": {
"lm_head.weight": "model-00001-of-00004.safetensors",
"model.embed_tokens.weight": "model-00001-of-00004.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.10.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.11.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.12.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.13.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.14.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.2.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.23.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.24.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.25.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.3.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.4.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.5.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.6.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.7.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.8.input_layernorm.weight": "model-00004-of-00004.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.9.input_layernorm.weight": "model-00004-of-00004.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00004-of-00004.safetensors",
"model.norm.weight": "model-00004-of-00004.safetensors"
}
}

23
special_tokens_map.json Normal file
View File

@@ -0,0 +1,23 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|eot_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|finetune_right_pad_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

BIN
tokenizer.json (Stored with Git LFS) Normal file

Binary file not shown.

16
tokenizer_config.json Normal file
View File

@@ -0,0 +1,16 @@
{
"backend": "tokenizers",
"bos_token": "<|begin_of_text|>",
"clean_up_tokenization_spaces": true,
"eos_token": "<|eot_id|>",
"legacy": false,
"model_input_names": [
"input_ids",
"attention_mask"
],
"model_max_length": 131072,
"pad_token": "<|finetune_right_pad_id|>",
"padding_side": "left",
"tokenizer_class": "PreTrainedTokenizerFast",
"chat_template": "{{- bos_token }}{%- for message in messages %}{%- set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n' + message['content'] | trim + '<|eot_id|>' %}{%- if loop.index0 == 0 %}{%- set content = bos_token + content %}{%- endif %}{{ content }}{%- endfor %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}"
}