Initialize project; model provided by the ModelHub XC community

Model: distil-labs/Distil-PII-Llama-3.2-3B-Instruct
Source: Original Platform
ModelHub XC
2026-05-25 06:13:16 +08:00
commit b685f33c3d
17 changed files with 3055 additions and 0 deletions

36
.gitattributes vendored Normal file

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

31
LICENSE Normal file

@@ -0,0 +1,31 @@
GENERAL TERMS AND CONDITIONS
Note that if you want to use the Commercial licence, please contact us at contact@distillabs.ai
- Model License Terms -
R&D License
1. SERVICES, PRICES AND PAYMENT
1.1 The Customer pays a one-time license fee, as indicated in the check-out process, for running of one (1) training process of the selected Base Model using Customer Data (“License Fee”).
1.2 The License Fee shall be due for payment in advance. The Customer shall only be permitted to set off against payment claims of Distil Labs if the Customer’s claims are undisputed or have become res judicata.
2. MODEL LICENSE: R&D LICENSE
2.1 Subject to Customer’s payment of the license fee, Distil Labs grants to Customer the Model License (as defined below). For clarification, Distil Labs retains any other rights in its software or know-how, in particular in the codebase needed for the fine-tuning of the Trained Model.
2.2 Subject to the requirements of the Base Model License (cf. Section 2.5 below), Distil Labs transfers to the Customer the perpetual, non-exclusive usage right to the Trained Model for non-commercial purposes of prototyping and research & development. The Parties agree, that commercial purposes include deployment in production externally (to be used by Customer’s customers paid or free of charge) or internally (as a tool for Customer’s employees). The territorial scope of the license is limited to the use within the United States of America and the European Economic Area including all member states of the European Union (“Model License”).
2.3 The Model License for non-commercial purposes of prototyping and research & development shall include (i) the non-exclusive right to permanent or temporary reproduction, in whole or in part, by any means and in any form (e.g. permanent and/or volatile storage on electrical, electromagnetic, optical storage media, such as any type of SDD, HDD, DVD, memory cards, USB sticks), (ii) the non-exclusive right to distribution in any form, media and by any means regardless of whether the distribution is in tangible or intangible form, in particular to transmit the Trained Model via wired and wireless networks (e.g. for download from internet or intranet by wire or wireless means including broadband, cable, fiberglass, WIFI, LTE, 5G, satellite internet, other data networks), and (iii) the non-exclusive right of making available to the public in such a way that members of the public can access it from places and at times of their choice (e.g. by web or mobile app, virtual or augmented reality, cloud storage, cloud hosting, decentralized hosting, non-fungible token, application service providing, software as a service, or cloud computing). The license shall also contain, to the extent necessary for prototyping and research & development, the right to adapt and modify the Trained Model subject to the limitation in Section 2.4 and 2.5 below, to further develop the Trained Model including changes to functions or appearance, adapt to other software versions, to exchange parts of the Trained Model or combine the Trained Model with other results of work and to use the results in the same way as the original Trained Model. Any derived models from the Trained Model shall retain this model license.
2.4 The Customer shall not, without the prior written consent of Distil Labs:
2.4.1 train, fine-tune, re-train, or otherwise modify the Trained Model, unless for purpose of research & development;
2.4.2 use the Trained Model or any part thereof to create derivative models or services that compete with those of Distil Labs;
2.4.3 circumvent any technical restrictions embedded in the Trained Model or Base Model that are designed to enforce usage limitations.
2.5 The Parties acknowledge and agree that the Trained Model is developed from Base Models which are supplied by a third party. Therefore, the Model License is subject to the restrictions resulting from the open-source or any other applicable license of the Base Model (“Base Model License”) and the Customer must use the Trained Model in compliance with the Base Model License. In particular, the Customer must oblige their clients to compliance with the Base Model License in any case of transferring or sublicensing the rights to or making available in any way the Trained Model. The applicable Base Model License is defined in the Training Configuration and will be provided for download. The Customer agrees to indemnify Distil Labs for any and all claims brought by the Base Model provider for violations of the Base Model License.

1
Modelfile Normal file

@@ -0,0 +1 @@
FROM .

188
README.md Normal file

@@ -0,0 +1,188 @@
---
license: llama3.2
language: en
base_model: meta-llama/Llama-3.2-3B-Instruct
pipeline_tag: text-generation
tags: [pii-redaction, privacy, slm, distil-labs]
---
<div align="center">
<img src="https://github.com/distil-labs/badges/blob/main/distillabs-logo.svg?raw=true" width="40%" alt="distil labs" />
</div>
---
<div align="center">
<table>
<tr>
<td align="center">
<a href="https://www.distillabs.ai/?utm_source=hugging-face&utm_medium=referral&utm_campaign=distil-PII">
<img src="https://github.com/distil-labs/badges/blob/main/badge-distillabs-home.svg?raw=true" alt="Homepage"/>
</a>
</td>
<td align="center">
<a href="https://github.com/distil-labs">
<img src="https://github.com/distil-labs/badges/blob/main/badge-github.svg?raw=true" alt="GitHub"/>
</a>
</td>
<td align="center">
<a href="https://huggingface.co/distil-labs">
<img src="https://github.com/distil-labs/badges/blob/main/badge-huggingface.svg?raw=true" alt="Hugging Face"/>
</a>
</td>
</tr>
<tr>
<td align="center">
<a href="https://www.linkedin.com/company/distil-labs/">
<img src="https://github.com/distil-labs/badges/blob/main/badge-linkedin.svg?raw=true" alt="LinkedIn"/>
</a>
</td>
<td align="center">
<a href="https://distil-labs-community.slack.com/join/shared_invite/zt-36zqj87le-i3quWUn2bjErRq22xoE58g">
<img src="https://github.com/distil-labs/badges/blob/main/badge-slack.svg?raw=true" alt="Slack"/>
</a>
</td>
<td align="center">
<a href="https://x.com/distil_labs">
<img src="https://github.com/distil-labs/badges/blob/main/badge-twitter.svg?raw=true" alt="Twitter"/>
</a>
</td>
</tr>
</table>
</div>
---
# Distil-PII-Llama-3.2-3B-Instruct
A **small language model** (SLM) fine-tuned by Distil Labs for **policy-aware PII redaction** that outputs a single JSON object with `redacted_text` and `entities`. Optimized to run locally with strong accuracy and strict schema adherence.
## Model Details
* **Developed by:** Distil Labs GmbH
* **License:** Llama 3.2 Community License Agreement
* **Finetuned from:** `meta-llama/Llama-3.2-3B-Instruct`
## Intended Use & Limitations
* **Use cases:** Redacting support chats, logs, tickets, transcripts—removing identity while preserving ops signals (IDs last-4, order numbers, etc.).
* **Out of scope:** Legal or compliance advice; languages beyond English (generalization not guaranteed); domain-specific IDs unseen in training.
## Input & Output
**Input:** A plain-text prompt with task instruction + context.
**Output (JSON only):**
```json
{
"redacted_text": "Text with in-place tokens",
"entities": [
{"value": "<original>", "replacement_token": "[TOKEN]", "reason": "<why>"}
]
}
```
**Tokens:** `[PERSON] [EMAIL] [PHONE] [ADDRESS] [SSN] [ID] [UUID] [CARD_LAST4:####] [IBAN_LAST4:####] [GENDER] [AGE] [RACE] [MARITAL_STATUS]`
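Because downstream code depends on this exact shape, it can help to check each reply before using it. The following is a minimal sketch, not part of the Distil Labs tooling: `validate_output`, `ALLOWED_TOKEN`, and `REQUIRED_ENTITY_KEYS` are hypothetical names, and the token pattern is an assumption derived from the list above. It also accepts `[AGE_YEARS:##]`, since the prompt in How to Use maps ages to that form.

```python
import json
import re

# Assumed token grammar, derived from the token list above; adjust it if your
# deployment emits other forms.
ALLOWED_TOKEN = re.compile(
    r"^\[(?:PERSON|EMAIL|PHONE|ADDRESS|SSN|ID|UUID|GENDER|AGE|RACE|MARITAL_STATUS"
    r"|CARD_LAST4:\d{4}|IBAN_LAST4:[0-9A-Za-z]{4}|AGE_YEARS:\d{1,3})\]$"
)
REQUIRED_ENTITY_KEYS = {"value", "replacement_token", "reason"}


def validate_output(raw: str) -> dict:
    """Parse a model reply and raise ValueError if it breaks the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not JSON-only: {exc}") from exc
    if not isinstance(data, dict) or set(data) != {"redacted_text", "entities"}:
        raise ValueError("expected exactly the keys redacted_text and entities")
    if not isinstance(data["redacted_text"], str) or not isinstance(data["entities"], list):
        raise ValueError("redacted_text must be a string and entities a list")
    for entity in data["entities"]:
        if not isinstance(entity, dict) or set(entity) != REQUIRED_ENTITY_KEYS:
            raise ValueError(f"malformed entity: {entity!r}")
        if not ALLOWED_TOKEN.match(str(entity["replacement_token"])):
            raise ValueError(f"unknown token: {entity['replacement_token']!r}")
    return data
```

Raising on the first violation keeps the check strict; a caller that prefers to retry the request can catch `ValueError` and resend the prompt.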
## Training
Instruction-tuned on a compact policy spec + ~20 curated examples emphasizing **exact JSON schema**, **minimal in-place edits**, and **entity correctness**.
## Evaluation
Judged by a frontier LLM using a deterministic rubric: JSON-only, schema validity, **redacted_text exact match**, and **set-equality** of `(value, replacement_token)` pairs (reason/order ignored). Score: **0.82 ± 0.03**.
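The deterministic part of this rubric can be approximated in code. The sketch below is an illustration under assumptions, not the evaluator that produced the score above: `rubric_pass` and `entity_pairs` are hypothetical helpers, and a reply passes only if it parses as JSON, matches the reference `redacted_text` exactly, and carries the same set of `(value, replacement_token)` pairs.

```python
import json


def entity_pairs(entities: list) -> set:
    """Reduce entities to (value, replacement_token) pairs; reason and order are ignored."""
    return {(e["value"], e["replacement_token"]) for e in entities}


def rubric_pass(prediction_raw: str, reference: dict) -> bool:
    """Return True if a raw model reply satisfies every rubric check against a reference."""
    try:
        prediction = json.loads(prediction_raw)  # JSON-only: no surrounding prose
        return (
            prediction["redacted_text"] == reference["redacted_text"]
            and entity_pairs(prediction["entities"]) == entity_pairs(reference["entities"])
        )
    except (json.JSONDecodeError, KeyError, TypeError):
        return False  # any schema violation counts as a failure
```

Averaging `rubric_pass` over a held-out set gives a pass rate that can be tracked across prompt or model changes.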
## How to Use
Deployment details can be found in the [docs](https://docs.distillabs.ai/how-to/model-deployment). Deploy the model using vLLM or Ollama (a `-gguf` version is available in this collection), then use the following snippet to get results:
```python
SYSTEM_PROMPT = """
You are a problem solving model working on task_description XML block:
<task_description>
Produce a redacted version of texts, removing sensitive personal data while preserving operational signals. The model must return a single json blob with:
* **redacted_text** is the input with minimal, in-place replacements of redacted entities.
* **entities** as an array of objects with exactly three fields {value: original_value, replacement_token: replacement, reason: reasoning}.
## What to redact (→ replacement token)
* **PERSON** — customer/patient/person names (first/last/full; identifying initials) → `[PERSON]`
* **EMAIL** — any email, including obfuscated `name(at)domain(dot)com` → `[EMAIL]`
* **PHONE** — any international/national format (separators/emoji bullets allowed) → `[PHONE]`
* **ADDRESS** — street + number; full postal lines; apartment/unit numbers → `[ADDRESS]`
* **SSN** — US Social Security numbers → `[SSN]`
* **ID** — national IDs (PESEL, NIN, Aadhaar, DNI, etc.) when personal → `[ID]`
* **UUID** — person-scoped system identifiers (e.g., MRN/NHS/patient IDs/customer UUIDs) → `[UUID]`
* **CREDIT_CARD** — 13–19 digits (spaces/hyphens allowed) → `[CARD_LAST4:####]` (keep last-4 only)
* **IBAN** — IBAN/bank account numbers → `[IBAN_LAST4:####]` (keep last-4 only)
* **GENDER** — self-identification (male/female/non-binary/etc.) → `[GENDER]`
* **AGE** — stated ages (“I’m 29”, “age: 47”, “29 y/o”) → `[AGE_YEARS:##]`
* **RACE** — race/ethnicity self-identification → `[RACE]`
* **MARITAL_STATUS** — married/single/divorced/widowed/partnered → `[MARITAL_STATUS]`
## Keep (do not redact)
* Card **last-4** when only last-4 is present (e.g., “ending 9021”, “•••• 9021”).
* Operational IDs: order/ticket/invoice numbers, shipment tracking, device serials, case IDs.
* Non-personal org info: company names, product names, team names.
* Cities/countries alone (redact full street+number, not plain city/country mentions).
## Output schema (exactly these fields)
* **redacted_text** The original text with all the sensitive information replaced with redacted tokens
* **entities** Array with all the replaced elements, each element represented by following fields
* **replacement_token**: one of `[PERSON] | [EMAIL] | [PHONE] | [ADDRESS] | [SSN] | [ID] | [UUID] | [CREDIT_CARD] | [IBAN] | [GENDER] | [AGE] | [RACE] | [MARITAL_STATUS]`
* **value**: original text that was redacted
* **reason**: brief string explaining the rule/rationale
for example
{
"redacted_text": "Hi, I'm [PERSON] and my email is [EMAIL].",
"entities": [
{ "type": "PERSON", "value": "John Smith", "reason": "person name"},
{ "type": "EMAIL", "value": "john.smith@example.com", "reason": "email"},
]
}
</task_description>
You will be given a single task with context in the context XML block and the task in the question XML block
Solve the task in question block based on the context in context block.
Generate only the answer, do not generate anything else
"""
PROMPT_TEMPLATE = """
Now for the real task, solve the task in question block based on the context in context block.
Generate only the solution, do not generate anything else
<context>
{context}
</context>
<question>Redact provided text according to the task description and return redacted elements.</question>
"""
from openai import OpenAI
PORT = "PORT GOES HERE" # 8000 for vllm, 11434 for ollama
MODEL_NAME = "NAME USED FOR SETTING UP THE CLIENT"
TEXT_TO_REDACT = "NI number AB123456C confirmed."
client = OpenAI(base_url=f"http://127.0.0.1:{PORT}/v1", api_key="EMPTY")
chat_response = client.chat.completions.create(
model=MODEL_NAME,
messages=[
{"role": "system", "content": SYSTEM_PROMPT},
{"role": "user", "content": PROMPT_TEMPLATE.format(context=TEXT_TO_REDACT)},
],
temperature=0,
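)
# Print the raw reply; it should be a single JSON object.
print(chat_response.choices[0].message.content)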
```
## Risks & Mitigations
* **False negatives/positives:** May miss novel formats or over-redact generic terms. Mitigate via guardrails + post-validation.
* **Policy drift:** Keep task preamble fixed; monitor with unit tests.
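One possible form of post-validation is a leak check on the parsed reply. This is a sketch under assumptions rather than a complete guardrail: `find_leaks` and `EMAIL_PATTERN` are hypothetical names, and the email pattern is a deliberately simple fallback that will not catch obfuscated addresses.

```python
import re

# Simple fallback pattern; an assumption, not an exhaustive email matcher.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_leaks(result: dict) -> list:
    """Return human-readable problems found in a parsed model reply."""
    problems = []
    text = result["redacted_text"]
    for entity in result["entities"]:
        # A redacted value must no longer appear verbatim in the output text.
        if entity["value"] in text:
            problems.append(f"value still present: {entity['value']!r}")
        # Every reported replacement token should occur in the output text.
        if entity["replacement_token"] not in text:
            problems.append(f"token missing from text: {entity['replacement_token']!r}")
    # Independent regex pass to catch emails the model did not report at all.
    for match in EMAIL_PATTERN.findall(text):
        problems.append(f"unredacted email: {match!r}")
    return problems
```

An empty list means none of these checks fired; it does not prove the text is free of PII, so the check complements rather than replaces review of novel formats.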
## Model Sources
* **Homepage:** [https://distillabs.ai](https://distillabs.ai)
* **Contact:** [contact@distillabs.ai](mailto:contact@distillabs.ai)

111
STUDENT_LICENSE Normal file

@@ -0,0 +1,111 @@
LLAMA 3.2 COMMUNITY LICENSE AGREEMENT
Llama 3.2 Version Release Date: September 25, 2024
“Agreement” means the terms and conditions for use, reproduction, distribution
and modification of the Llama Materials set forth herein.
“Documentation” means the specifications, manuals and documentation accompanying Llama 3.2
distributed by Meta at https://llama.meta.com/doc/overview.
“Licensee” or “you” means you, or your employer or any other person or entity (if you are
entering into this Agreement on such person or entity’s behalf), of the age required under
applicable laws, rules or regulations to provide legal consent and that has legal authority
to bind your employer or such other person or entity if you are entering in this Agreement
on their behalf.
“Llama 3.2” means the foundational large language models and software and algorithms, including
machine-learning model code, trained model weights, inference-enabling code, training-enabling code,
fine-tuning enabling code and other elements of the foregoing distributed by Meta at
https://www.llama.com/llama-downloads.
“Llama Materials” means, collectively, Meta’s proprietary Llama 3.2 and Documentation (and
any portion thereof) made available under this Agreement.
“Meta” or “we” means Meta Platforms Ireland Limited (if you are located in or,
if you are an entity, your principal place of business is in the EEA or Switzerland)
and Meta Platforms, Inc. (if you are located outside of the EEA or Switzerland).
By clicking “I Accept” below or by using or distributing any portion or element of the Llama Materials,
you agree to be bound by this Agreement.
1. License Rights and Redistribution.
a. Grant of Rights. You are granted a non-exclusive, worldwide,
non-transferable and royalty-free limited license under Meta’s intellectual property or other rights
owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works
of, and make modifications to the Llama Materials.
b. Redistribution and Use.
i. If you distribute or make available the Llama Materials (or any derivative works thereof),
or a product or service (including another AI model) that contains any of them, you shall (A) provide
a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama”
on a related website, user interface, blogpost, about page, or product documentation. If you use the
Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or
otherwise improve an AI model, which is distributed or made available, you shall also include “Llama”
at the beginning of any such AI model name.
ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part
of an integrated end user product, then Section 2 of this Agreement will not apply to you.
iii. You must retain in all copies of the Llama Materials that you distribute the
following attribution notice within a “Notice” text file distributed as a part of such copies:
“Llama 3.2 is licensed under the Llama 3.2 Community License, Copyright © Meta Platforms,
Inc. All Rights Reserved.”
iv. Your use of the Llama Materials must comply with applicable laws and regulations
(including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for
the Llama Materials (available at https://www.llama.com/llama3_2/use-policy), which is hereby
incorporated by reference into this Agreement.
2. Additional Commercial Terms. If, on the Llama 3.2 version release date, the monthly active users
of the products or services made available by or for Licensee, or Licensee’s affiliates,
is greater than 700 million monthly active users in the preceding calendar month, you must request
a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to
exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.
3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND
RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS
ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES
OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE
FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED
WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.
4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT,
FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN
IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.
5. Intellectual Property.
a. No trademark licenses are granted under this Agreement, and in connection with the Llama Materials,
neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates,
except as required for reasonable and customary use in describing and redistributing the Llama Materials or as
set forth in this Section 5(a). Meta hereby grants you a license to use “Llama” (the “Mark”) solely as required
to comply with the last sentence of Section 1.b.i. You will comply with Meta’s brand guidelines (currently accessible
at https://about.meta.com/brand/resources/meta/company-brand/). All goodwill arising out of your use of the Mark
will inure to the benefit of Meta.
b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any
derivative works and modifications of the Llama Materials that are made by you, as between you and Meta,
you are and will be the owner of such derivative works and modifications.
c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or
counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.2 outputs or results, or any portion
of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable
by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or
claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third
party arising out of or related to your use or distribution of the Llama Materials.
6. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access
to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms
and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this
Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3,
4 and 7 shall survive the termination of this Agreement.
7. Governing Law and Jurisdiction. This Agreement will be governed and construed under the laws of the State of
California without regard to choice of law principles, and the UN Convention on Contracts for the International
Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of
any dispute arising out of this Agreement.

9
TEACHER_LICENSE Normal file

@@ -0,0 +1,9 @@
MIT License
Copyright (c) 2023 DeepSeek
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

93
chat_template.jinja Normal file

@@ -0,0 +1,93 @@
{{- bos_token }}
{%- if custom_tools is defined %}
{%- set tools = custom_tools %}
{%- endif %}
{%- if not tools_in_user_message is defined %}
{%- set tools_in_user_message = true %}
{%- endif %}
{%- if not date_string is defined %}
{%- if strftime_now is defined %}
{%- set date_string = strftime_now("%d %b %Y") %}
{%- else %}
{%- set date_string = "26 Jul 2024" %}
{%- endif %}
{%- endif %}
{%- if not tools is defined %}
{%- set tools = none %}
{%- endif %}
{#- This block extracts the system message, so we can slot it into the right place. #}
{%- if messages[0]['role'] == 'system' %}
{%- set system_message = messages[0]['content']|trim %}
{%- set messages = messages[1:] %}
{%- else %}
{%- set system_message = "" %}
{%- endif %}
{#- System message #}
{{- "<|start_header_id|>system<|end_header_id|>\n\n" }}
{%- if tools is not none %}
{{- "Environment: ipython\n" }}
{%- endif %}
{{- "Cutting Knowledge Date: December 2023\n" }}
{{- "Today Date: " + date_string + "\n\n" }}
{%- if tools is not none and not tools_in_user_message %}
{{- "You have access to the following functions. To call a function, please respond with JSON for a function call." }}
{{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' }}
{{- "Do not use variables.\n\n" }}
{%- for t in tools %}
{{- t | tojson(indent=4) }}
{{- "\n\n" }}
{%- endfor %}
{%- endif %}
{{- system_message }}
{{- "<|eot_id|>" }}
{#- Custom tools are passed in a user message with some extra guidance #}
{%- if tools_in_user_message and not tools is none %}
{#- Extract the first user message so we can plug it in here #}
{%- if messages | length != 0 %}
{%- set first_user_message = messages[0]['content']|trim %}
{%- set messages = messages[1:] %}
{%- else %}
{{- raise_exception("Cannot put tools in the first user message when there's no first user message!") }}
{%- endif %}
{{- '<|start_header_id|>user<|end_header_id|>\n\n' -}}
{{- "Given the following functions, please respond with a JSON for a function call " }}
{{- "with its proper arguments that best answers the given prompt.\n\n" }}
{{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' }}
{{- "Do not use variables.\n\n" }}
{%- for t in tools %}
{{- t | tojson(indent=4) }}
{{- "\n\n" }}
{%- endfor %}
{{- first_user_message + "<|eot_id|>"}}
{%- endif %}
{%- for message in messages %}
{%- if not (message.role == 'ipython' or message.role == 'tool' or 'tool_calls' in message) %}
{{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' }}
{%- elif 'tool_calls' in message %}
{%- if not message.tool_calls|length == 1 %}
{{- raise_exception("This model only supports single tool-calls at once!") }}
{%- endif %}
{%- set tool_call = message.tool_calls[0].function %}
{{- '<|start_header_id|>assistant<|end_header_id|>\n\n' -}}
{{- '{"name": "' + tool_call.name + '", ' }}
{{- '"parameters": ' }}
{{- tool_call.arguments | tojson }}
{{- "}" }}
{{- "<|eot_id|>" }}
{%- elif message.role == "tool" or message.role == "ipython" %}
{{- "<|start_header_id|>ipython<|end_header_id|>\n\n" }}
{%- if message.content is mapping or message.content is iterable %}
{{- message.content | tojson }}
{%- else %}
{{- message.content }}
{%- endif %}
{{- "<|eot_id|>" }}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }}
{%- endif %}

41
config.json Normal file

@@ -0,0 +1,41 @@
{
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"eos_token_id": [
128001,
128008,
128009
],
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 3072,
"initializer_range": 0.02,
"intermediate_size": 8192,
"max_position_embeddings": 131072,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 24,
"num_hidden_layers": 28,
"num_key_value_heads": 8,
"pad_token": "<|reserved_special_token_247|>",
"pad_token_id": 128255,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": {
"factor": 32.0,
"high_freq_factor": 4.0,
"low_freq_factor": 1.0,
"original_max_position_embeddings": 8192,
"rope_type": "llama3"
},
"rope_theta": 500000.0,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.53.0",
"use_cache": true,
"vocab_size": 128256
}

12
generation_config.json Normal file

@@ -0,0 +1,12 @@
{
"bos_token_id": 128000,
"do_sample": true,
"eos_token_id": [
128001,
128008,
128009
],
"temperature": 0.6,
"top_p": 0.9,
"transformers_version": "4.53.0"
}


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b743aa324dc01646c41a7d277b960cda5d921452cf0b741b1d7f1f6ab6c91691
size 4965799096


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:847eff22c7a463cd298c4034cbbd62f44060df09287b8a9f0859a56126bff94f
size 1459729952


@@ -0,0 +1,262 @@
{
"metadata": {
"total_parameters": 3212749824,
"total_size": 6425499648
},
"weight_map": {
"model.embed_tokens.weight": "model-00001-of-00002.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.20.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.21.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.input_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"model.layers.3.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"model.norm.weight": "model-00002-of-00002.safetensors"
}
}

23
special_tokens_map.json Normal file

@@ -0,0 +1,23 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|eot_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|reserved_special_token_247|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

3
tokenizer.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c82407ee10fa3777e08252308fce60fcca3e2ff2fb980acdca6c8c5bdd8470c0
size 17210206

2064
tokenizer_config.json Normal file

File diff suppressed because it is too large

88
training-logs.csv Normal file

@@ -0,0 +1,88 @@
,eval_loss,eval_binary,eval_rouge,eval_llm_as_a_judge,eval_runtime,eval_samples_per_second,eval_steps_per_second,epoch,step,loss,grad_norm,learning_rate,train_runtime,train_samples_per_second,train_steps_per_second,total_flos,train_loss
0,0.5167275071144104,0.0,0.7697632054127466,0.0,32.4672,0.739,0.37,0.0,0,,,,,,,,
1,,,,,,,,0.04929022082018927,250,0.2184,0.917641818523407,1.2266009852216749e-05,,,,,
2,,,,,,,,0.09858044164037855,500,0.0936,1.5168706178665161,2.458128078817734e-05,,,,,
3,,,,,,,,0.14787066246056782,750,0.0713,0.9926924109458923,3.6896551724137934e-05,,,,,
4,,,,,,,,0.1971608832807571,1000,0.0686,0.7068266868591309,4.9211822660098524e-05,,,,,
5,,,,,,,,0.24645110410094637,1250,0.0639,0.9005385637283325,4.939293311887096e-05,,,,,
6,,,,,,,,0.29574132492113564,1500,0.0583,0.3320270776748657,4.874435739116899e-05,,,,,
7,,,,,,,,0.34503154574132494,1750,0.0523,0.534458577632904,4.8095781663467026e-05,,,,,
8,,,,,,,,0.3943217665615142,2000,0.0528,0.30121704936027527,4.744720593576506e-05,,,,,
9,,,,,,,,0.4436119873817035,2250,0.0503,0.2801056206226349,4.679863020806309e-05,,,,,
10,,,,,,,,0.49290220820189273,2500,0.0506,0.31803637742996216,4.6150054480361124e-05,,,,,
11,,,,,,,,0.542192429022082,2750,0.0504,0.39373141527175903,4.550147875265916e-05,,,,,
12,,,,,,,,0.5914826498422713,3000,0.0468,0.23342998325824738,4.4852903024957196e-05,,,,,
13,,,,,,,,0.6407728706624606,3250,0.0455,0.18071851134300232,4.420432729725523e-05,,,,,
14,,,,,,,,0.6900630914826499,3500,0.0485,0.41326144337654114,4.355575156955327e-05,,,,,
15,,,,,,,,0.7393533123028391,3750,0.0496,0.17202889919281006,4.29071758418513e-05,,,,,
16,,,,,,,,0.7886435331230284,4000,0.0467,0.3105004131793976,4.2258600114149334e-05,,,,,
17,,,,,,,,0.8379337539432177,4250,0.0461,0.2848104238510132,4.1610024386447366e-05,,,,,
18,,,,,,,,0.887223974763407,4500,0.046,0.5095773339271545,4.09614486587454e-05,,,,,
19,,,,,,,,0.9365141955835962,4750,0.0455,0.2530672252178192,4.031287293104343e-05,,,,,
20,,,,,,,,0.9858044164037855,5000,0.045,0.1897924840450287,3.9664297203341464e-05,,,,,
21,0.1263345330953598,0.125,0.9474400463451992,0.20833333333333334,40.2686,0.596,0.298,1.0,5072,,,,,,,,
22,,,,,,,,1.0350946372239747,5250,0.0437,0.2292327880859375,3.90157214756395e-05,,,,,
23,,,,,,,,1.084384858044164,5500,0.0405,0.12882058322429657,3.836714574793753e-05,,,,,
24,,,,,,,,1.1336750788643533,5750,0.0418,0.2277369201183319,3.771857002023557e-05,,,,,
25,,,,,,,,1.1829652996845426,6000,0.0408,0.4681090712547302,3.70699942925336e-05,,,,,
26,,,,,,,,1.2322555205047319,6250,0.0411,0.29907476902008057,3.6421418564831635e-05,,,,,
27,,,,,,,,1.2815457413249212,6500,0.0385,0.27942612767219543,3.577284283712967e-05,,,,,
28,,,,,,,,1.3308359621451105,6750,0.0428,0.2944606840610504,3.51242671094277e-05,,,,,
29,,,,,,,,1.3801261829652998,7000,0.0392,0.19481465220451355,3.447569138172573e-05,,,,,
30,,,,,,,,1.4294164037854888,7250,0.0411,0.31186002492904663,3.3827115654023766e-05,,,,,
31,,,,,,,,1.4787066246056781,7500,0.0421,0.37516000866889954,3.31785399263218e-05,,,,,
32,,,,,,,,1.5279968454258674,7750,0.0384,0.2926328778266907,3.252996419861983e-05,,,,,
33,,,,,,,,1.5772870662460567,8000,0.0405,0.2896762192249298,3.1881388470917864e-05,,,,,
34,,,,,,,,1.626577287066246,8250,0.0394,0.15331445634365082,3.12328127432159e-05,,,,,
35,,,,,,,,1.6758675078864353,8500,0.0412,0.28620100021362305,3.0584237015513936e-05,,,,,
36,,,,,,,,1.7251577287066246,8750,0.0364,0.30576035380363464,2.993566128781197e-05,,,,,
37,,,,,,,,1.774447949526814,9000,0.0407,0.18869711458683014,2.928708556011e-05,,,,,
38,,,,,,,,1.8237381703470033,9250,0.0378,0.22588257491588593,2.8638509832408034e-05,,,,,
39,,,,,,,,1.8730283911671926,9500,0.0406,0.2644851803779602,2.7989934104706067e-05,,,,,
40,,,,,,,,1.9223186119873819,9750,0.0368,0.12382346391677856,2.73413583770041e-05,,,,,
41,,,,,,,,1.971608832807571,10000,0.0375,0.155380517244339,2.6692782649302132e-05,,,,,
42,0.11640363931655884,0.16666666666666666,0.9475961059276585,0.2916666666666667,44.2421,0.542,0.271,2.0,10144,,,,,,,,
43,,,,,,,,2.0208990536277605,10250,0.0371,0.18081019818782806,2.6044206921600168e-05,,,,,
44,,,,,,,,2.0701892744479493,10500,0.0356,0.08594907075166702,2.53956311938982e-05,,,,,
45,,,,,,,,2.1194794952681386,10750,0.0336,0.2650609612464905,2.4747055466196234e-05,,,,,
46,,,,,,,,2.168769716088328,11000,0.0339,0.27648213505744934,2.4098479738494266e-05,,,,,
47,,,,,,,,2.218059936908517,11250,0.0346,0.2537253201007843,2.34499040107923e-05,,,,,
48,,,,,,,,2.2673501577287065,11500,0.0346,0.2574119567871094,2.2801328283090335e-05,,,,,
49,,,,,,,,2.316640378548896,11750,0.0342,0.17437870800495148,2.2152752555388368e-05,,,,,
50,,,,,,,,2.365930599369085,12000,0.0345,0.19289404153823853,2.15041768276864e-05,,,,,
51,,,,,,,,2.4152208201892744,12250,0.0331,0.20487362146377563,2.0855601099984433e-05,,,,,
52,,,,,,,,2.4645110410094637,12500,0.0325,0.1820557862520218,2.0207025372282466e-05,,,,,
53,,,,,,,,2.513801261829653,12750,0.0319,0.2215508073568344,1.9558449644580502e-05,,,,,
54,,,,,,,,2.5630914826498423,13000,0.0349,0.16218796372413635,1.8909873916878535e-05,,,,,
55,,,,,,,,2.6123817034700316,13250,0.0339,0.2562701404094696,1.8261298189176567e-05,,,,,
56,,,,,,,,2.661671924290221,13500,0.036,0.33530670404434204,1.7612722461474604e-05,,,,,
57,,,,,,,,2.7109621451104102,13750,0.0323,0.12399043887853622,1.6964146733772636e-05,,,,,
58,,,,,,,,2.7602523659305995,14000,0.0355,0.28111016750335693,1.631557100607067e-05,,,,,
59,,,,,,,,2.809542586750789,14250,0.0338,0.38726863265037537,1.5666995278368705e-05,,,,,
60,,,,,,,,2.8588328075709777,14500,0.0327,0.2847912609577179,1.5018419550666738e-05,,,,,
61,,,,,,,,2.9081230283911674,14750,0.0342,0.315563827753067,1.436984382296477e-05,,,,,
62,,,,,,,,2.9574132492113563,15000,0.0336,0.2762741148471832,1.3721268095262805e-05,,,,,
63,0.12141290307044983,0.2916666666666667,0.9544938655488249,0.3333333333333333,38.498,0.623,0.312,3.0,15216,,,,,,,,
64,,,,,,,,3.0067034700315456,15250,0.0318,0.3155117928981781,1.3072692367560838e-05,,,,,
65,,,,,,,,3.055993690851735,15500,0.0302,0.11656571924686432,1.242411663985887e-05,,,,,
66,,,,,,,,3.105283911671924,15750,0.0275,0.46191108226776123,1.1775540912156905e-05,,,,,
67,,,,,,,,3.1545741324921135,16000,0.0292,0.3043929934501648,1.1126965184454937e-05,,,,,
68,,,,,,,,3.203864353312303,16250,0.0294,0.1710922122001648,1.0478389456752972e-05,,,,,
69,,,,,,,,3.253154574132492,16500,0.0287,0.1864010989665985,9.829813729051005e-06,,,,,
70,,,,,,,,3.3024447949526814,16750,0.0319,0.2804664969444275,9.181238001349037e-06,,,,,
71,,,,,,,,3.3517350157728707,17000,0.0286,0.20211897790431976,8.532662273647072e-06,,,,,
72,,,,,,,,3.40102523659306,17250,0.0288,0.3104719817638397,7.884086545945104e-06,,,,,
73,,,,,,,,3.4503154574132493,17500,0.0287,0.10014355927705765,7.235510818243138e-06,,,,,
74,,,,,,,,3.4996056782334386,17750,0.0291,0.12555907666683197,6.5869350905411715e-06,,,,,
75,,,,,,,,3.548895899053628,18000,0.0288,0.1633451133966446,5.938359362839206e-06,,,,,
76,,,,,,,,3.5981861198738168,18250,0.0277,0.23577263951301575,5.289783635137239e-06,,,,,
77,,,,,,,,3.6474763406940065,18500,0.0275,0.4976824223995209,4.641207907435272e-06,,,,,
78,,,,,,,,3.6967665615141954,18750,0.0272,0.2571507692337036,3.992632179733306e-06,,,,,
79,,,,,,,,3.746056782334385,19000,0.0287,0.15798607468605042,3.344056452031339e-06,,,,,
80,,,,,,,,3.795347003154574,19250,0.0296,0.1648528277873993,2.695480724329373e-06,,,,,
81,,,,,,,,3.8446372239747633,19500,0.0277,0.13284029066562653,2.0469049966274064e-06,,,,,
82,,,,,,,,3.8939274447949526,19750,0.0289,0.28205162286758423,1.39832926892544e-06,,,,,
83,,,,,,,,3.943217665615142,20000,0.0292,0.12843023240566254,7.497535412234734e-07,,,,,
84,,,,,,,,3.992507886435331,20250,0.0283,0.3274129331111908,1.0117781352150678e-07,,,,,
85,0.12934015691280365,0.375,0.9521755235624768,0.4166666666666667,35.7003,0.672,0.336,4.0,20288,,,,,,,,
86,,,,,,,,4.0,20288,,,,7466.5185,5.434,2.717,8.261775423189135e+17,0.04122413263570999

87
training-logs.json Normal file

@@ -0,0 +1,87 @@
{"eval_loss":0.5167275071,"eval_binary":0.0,"eval_rouge":0.7697632054,"eval_llm_as_a_judge":0.0,"eval_runtime":32.4672,"eval_samples_per_second":0.739,"eval_steps_per_second":0.37,"epoch":0.0,"step":0,"loss":null,"grad_norm":null,"learning_rate":null,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.0492902208,"step":250,"loss":0.2184,"grad_norm":0.9176418185,"learning_rate":0.000012266,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.0985804416,"step":500,"loss":0.0936,"grad_norm":1.5168706179,"learning_rate":0.0000245813,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.1478706625,"step":750,"loss":0.0713,"grad_norm":0.9926924109,"learning_rate":0.0000368966,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.1971608833,"step":1000,"loss":0.0686,"grad_norm":0.7068266869,"learning_rate":0.0000492118,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.2464511041,"step":1250,"loss":0.0639,"grad_norm":0.9005385637,"learning_rate":0.0000493929,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.2957413249,"step":1500,"loss":0.0583,"grad_norm":0.3320270777,"learning_rate":0.0000487444,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.3450315457,"step":1750,"loss":0.0523,"grad_norm":0.5344585776,"learning_rate":0.0000480958,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.3943217666,"step":2000,"loss":0.0528,"grad_norm":0.3012170494,"learning_rate":0.0000474472,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.4436119874,"step":2250,"loss":0.0503,"grad_norm":0.2801056206,"learning_rate":0.0000467986,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.4929022082,"step":2500,"loss":0.0506,"grad_norm":0.3180363774,"learning_rate":0.0000461501,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.542192429,"step":2750,"loss":0.0504,"grad_norm":0.3937314153,"learning_rate":0.0000455015,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.5914826498,"step":3000,"loss":0.0468,"grad_norm":0.2334299833,"learning_rate":0.0000448529,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.6407728707,"step":3250,"loss":0.0455,"grad_norm":0.1807185113,"learning_rate":0.0000442043,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.6900630915,"step":3500,"loss":0.0485,"grad_norm":0.4132614434,"learning_rate":0.0000435558,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.7393533123,"step":3750,"loss":0.0496,"grad_norm":0.1720288992,"learning_rate":0.0000429072,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.7886435331,"step":4000,"loss":0.0467,"grad_norm":0.3105004132,"learning_rate":0.0000422586,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.8379337539,"step":4250,"loss":0.0461,"grad_norm":0.2848104239,"learning_rate":0.00004161,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.8872239748,"step":4500,"loss":0.046,"grad_norm":0.5095773339,"learning_rate":0.0000409614,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.9365141956,"step":4750,"loss":0.0455,"grad_norm":0.2530672252,"learning_rate":0.0000403129,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.9858044164,"step":5000,"loss":0.045,"grad_norm":0.189792484,"learning_rate":0.0000396643,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":0.1263345331,"eval_binary":0.125,"eval_rouge":0.9474400463,"eval_llm_as_a_judge":0.2083333333,"eval_runtime":40.2686,"eval_samples_per_second":0.596,"eval_steps_per_second":0.298,"epoch":1.0,"step":5072,"loss":null,"grad_norm":null,"learning_rate":null,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.0350946372,"step":5250,"loss":0.0437,"grad_norm":0.2292327881,"learning_rate":0.0000390157,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.084384858,"step":5500,"loss":0.0405,"grad_norm":0.1288205832,"learning_rate":0.0000383671,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.1336750789,"step":5750,"loss":0.0418,"grad_norm":0.2277369201,"learning_rate":0.0000377186,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.1829652997,"step":6000,"loss":0.0408,"grad_norm":0.4681090713,"learning_rate":0.00003707,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.2322555205,"step":6250,"loss":0.0411,"grad_norm":0.299074769,"learning_rate":0.0000364214,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.2815457413,"step":6500,"loss":0.0385,"grad_norm":0.2794261277,"learning_rate":0.0000357728,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.3308359621,"step":6750,"loss":0.0428,"grad_norm":0.2944606841,"learning_rate":0.0000351243,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.380126183,"step":7000,"loss":0.0392,"grad_norm":0.1948146522,"learning_rate":0.0000344757,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.4294164038,"step":7250,"loss":0.0411,"grad_norm":0.3118600249,"learning_rate":0.0000338271,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.4787066246,"step":7500,"loss":0.0421,"grad_norm":0.3751600087,"learning_rate":0.0000331785,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.5279968454,"step":7750,"loss":0.0384,"grad_norm":0.2926328778,"learning_rate":0.00003253,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.5772870662,"step":8000,"loss":0.0405,"grad_norm":0.2896762192,"learning_rate":0.0000318814,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.6265772871,"step":8250,"loss":0.0394,"grad_norm":0.1533144563,"learning_rate":0.0000312328,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.6758675079,"step":8500,"loss":0.0412,"grad_norm":0.2862010002,"learning_rate":0.0000305842,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.7251577287,"step":8750,"loss":0.0364,"grad_norm":0.3057603538,"learning_rate":0.0000299357,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.7744479495,"step":9000,"loss":0.0407,"grad_norm":0.1886971146,"learning_rate":0.0000292871,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.8237381703,"step":9250,"loss":0.0378,"grad_norm":0.2258825749,"learning_rate":0.0000286385,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.8730283912,"step":9500,"loss":0.0406,"grad_norm":0.2644851804,"learning_rate":0.0000279899,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.922318612,"step":9750,"loss":0.0368,"grad_norm":0.1238234639,"learning_rate":0.0000273414,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.9716088328,"step":10000,"loss":0.0375,"grad_norm":0.1553805172,"learning_rate":0.0000266928,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":0.1164036393,"eval_binary":0.1666666667,"eval_rouge":0.9475961059,"eval_llm_as_a_judge":0.2916666667,"eval_runtime":44.2421,"eval_samples_per_second":0.542,"eval_steps_per_second":0.271,"epoch":2.0,"step":10144,"loss":null,"grad_norm":null,"learning_rate":null,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.0208990536,"step":10250,"loss":0.0371,"grad_norm":0.1808101982,"learning_rate":0.0000260442,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.0701892744,"step":10500,"loss":0.0356,"grad_norm":0.0859490708,"learning_rate":0.0000253956,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.1194794953,"step":10750,"loss":0.0336,"grad_norm":0.2650609612,"learning_rate":0.0000247471,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.1687697161,"step":11000,"loss":0.0339,"grad_norm":0.2764821351,"learning_rate":0.0000240985,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.2180599369,"step":11250,"loss":0.0346,"grad_norm":0.2537253201,"learning_rate":0.0000234499,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.2673501577,"step":11500,"loss":0.0346,"grad_norm":0.2574119568,"learning_rate":0.0000228013,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.3166403785,"step":11750,"loss":0.0342,"grad_norm":0.174378708,"learning_rate":0.0000221528,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.3659305994,"step":12000,"loss":0.0345,"grad_norm":0.1928940415,"learning_rate":0.0000215042,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.4152208202,"step":12250,"loss":0.0331,"grad_norm":0.2048736215,"learning_rate":0.0000208556,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.464511041,"step":12500,"loss":0.0325,"grad_norm":0.1820557863,"learning_rate":0.000020207,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.5138012618,"step":12750,"loss":0.0319,"grad_norm":0.2215508074,"learning_rate":0.0000195584,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.5630914826,"step":13000,"loss":0.0349,"grad_norm":0.1621879637,"learning_rate":0.0000189099,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.6123817035,"step":13250,"loss":0.0339,"grad_norm":0.2562701404,"learning_rate":0.0000182613,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.6616719243,"step":13500,"loss":0.036,"grad_norm":0.335306704,"learning_rate":0.0000176127,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.7109621451,"step":13750,"loss":0.0323,"grad_norm":0.1239904389,"learning_rate":0.0000169641,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.7602523659,"step":14000,"loss":0.0355,"grad_norm":0.2811101675,"learning_rate":0.0000163156,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.8095425868,"step":14250,"loss":0.0338,"grad_norm":0.3872686327,"learning_rate":0.000015667,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.8588328076,"step":14500,"loss":0.0327,"grad_norm":0.284791261,"learning_rate":0.0000150184,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.9081230284,"step":14750,"loss":0.0342,"grad_norm":0.3155638278,"learning_rate":0.0000143698,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.9574132492,"step":15000,"loss":0.0336,"grad_norm":0.2762741148,"learning_rate":0.0000137213,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":0.1214129031,"eval_binary":0.2916666667,"eval_rouge":0.9544938655,"eval_llm_as_a_judge":0.3333333333,"eval_runtime":38.498,"eval_samples_per_second":0.623,"eval_steps_per_second":0.312,"epoch":3.0,"step":15216,"loss":null,"grad_norm":null,"learning_rate":null,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.00670347,"step":15250,"loss":0.0318,"grad_norm":0.3155117929,"learning_rate":0.0000130727,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.0559936909,"step":15500,"loss":0.0302,"grad_norm":0.1165657192,"learning_rate":0.0000124241,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.1052839117,"step":15750,"loss":0.0275,"grad_norm":0.4619110823,"learning_rate":0.0000117755,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.1545741325,"step":16000,"loss":0.0292,"grad_norm":0.3043929935,"learning_rate":0.000011127,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.2038643533,"step":16250,"loss":0.0294,"grad_norm":0.1710922122,"learning_rate":0.0000104784,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.2531545741,"step":16500,"loss":0.0287,"grad_norm":0.186401099,"learning_rate":0.0000098298,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.302444795,"step":16750,"loss":0.0319,"grad_norm":0.2804664969,"learning_rate":0.0000091812,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.3517350158,"step":17000,"loss":0.0286,"grad_norm":0.2021189779,"learning_rate":0.0000085327,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.4010252366,"step":17250,"loss":0.0288,"grad_norm":0.3104719818,"learning_rate":0.0000078841,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.4503154574,"step":17500,"loss":0.0287,"grad_norm":0.1001435593,"learning_rate":0.0000072355,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.4996056782,"step":17750,"loss":0.0291,"grad_norm":0.1255590767,"learning_rate":0.0000065869,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.5488958991,"step":18000,"loss":0.0288,"grad_norm":0.1633451134,"learning_rate":0.0000059384,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.5981861199,"step":18250,"loss":0.0277,"grad_norm":0.2357726395,"learning_rate":0.0000052898,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.6474763407,"step":18500,"loss":0.0275,"grad_norm":0.4976824224,"learning_rate":0.0000046412,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.6967665615,"step":18750,"loss":0.0272,"grad_norm":0.2571507692,"learning_rate":0.0000039926,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.7460567823,"step":19000,"loss":0.0287,"grad_norm":0.1579860747,"learning_rate":0.0000033441,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.7953470032,"step":19250,"loss":0.0296,"grad_norm":0.1648528278,"learning_rate":0.0000026955,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.844637224,"step":19500,"loss":0.0277,"grad_norm":0.1328402907,"learning_rate":0.0000020469,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.8939274448,"step":19750,"loss":0.0289,"grad_norm":0.2820516229,"learning_rate":0.0000013983,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.9432176656,"step":20000,"loss":0.0292,"grad_norm":0.1284302324,"learning_rate":0.0000007498,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.9925078864,"step":20250,"loss":0.0283,"grad_norm":0.3274129331,"learning_rate":0.0000001012,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":0.1293401569,"eval_binary":0.375,"eval_rouge":0.9521755236,"eval_llm_as_a_judge":0.4166666667,"eval_runtime":35.7003,"eval_samples_per_second":0.672,"eval_steps_per_second":0.336,"epoch":4.0,"step":20288,"loss":null,"grad_norm":null,"learning_rate":null,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":4.0,"step":20288,"loss":null,"grad_norm":null,"learning_rate":null,"train_runtime":7466.5185,"train_samples_per_second":5.434,"train_steps_per_second":2.717,"total_flos":8.261775423e+17,"train_loss":0.0412241326}