Initialize project; model provided by the ModelHub XC community

Model: distil-labs/distil-qwen3-0.6b-text2sql
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-06-03 02:12:49 +08:00
commit 7fcf2966fd
18 changed files with 152545 additions and 0 deletions

36
.gitattributes vendored Normal file

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

31
LICENSE Normal file

@@ -0,0 +1,31 @@
GENERAL TERMS AND CONDITIONS
Note that if you want to use the Commercial licence, please contact us at contact@distillabs.ai
- Model License Terms -
R&D License
1. SERVICES, PRICES AND PAYMENT
1.1 The Customer pays a one-time license fee, as indicated in the check-out process, for running of one (1) training process of the selected Base Model using Customer Data (“License Fee”).
1.2 The License Fee shall be due for payment in advance. The Customer shall only be permitted to set off against payment claims of Distil Labs if the Customer's claims are undisputed or have become res judicata.
2. MODEL LICENSE: R&D LICENSE
2.1 Subject to Customer's payment of the license fee, Distil Labs grants to Customer the Model License (as defined below). For clarification, Distil Labs retains any other rights in its software or know-how, in particular in the codebase needed for the fine-tuning of the Trained Model.
2.2 Subject to the requirements of the Base Model License (cf. Section 2.5 below), Distil Labs transfers to the Customer the perpetual, non-exclusive usage right to the Trained Model for non-commercial purposes of prototyping and research & development. The Parties agree that commercial purposes include deployment in production externally (to be used by Customer's customers, paid or free of charge) or internally (as a tool for Customer's employees). The territorial scope of the license is limited to the use within the United States of America and the European Economic Area including all member states of the European Union (“Model License”).
2.3 The Model License for non-commercial purposes of prototyping and research & development shall include (i) the non-exclusive right to permanent or temporary reproduction, in whole or in part, by any means and in any form (e.g. permanent and/or volatile storage on electrical, electromagnetic, optical storage media, such as any type of SSD, HDD, DVD, memory cards, USB sticks), (ii) the non-exclusive right to distribution in any form, media and by any means regardless of whether the distribution is in tangible or intangible form, in particular to transmit the Trained Model via wired and wireless networks (e.g. for download from internet or intranet by wire or wireless means including broadband, cable, fiberglass, WIFI, LTE, 5G, satellite internet, other data networks), and (iii) the non-exclusive right of making available to the public in such a way that members of the public can access it from places and at times of their choice (e.g. by web or mobile app, virtual or augmented reality, cloud storage, cloud hosting, decentralized hosting, non-fungible token, application service providing, software as a service, or cloud computing). The license shall also contain, to the extent necessary for prototyping and research & development, the right to adapt and modify the Trained Model subject to the limitations in Sections 2.4 and 2.5 below, to further develop the Trained Model including changes to functions or appearance, adapt to other software versions, to exchange parts of the Trained Model or combine the Trained Model with other results of work and to use the results in the same way as the original Trained Model. Any derived models from the Trained Model shall retain this model license.
2.4 The Customer shall not, without the prior written consent of Distil Labs:
2.4.1 train, fine-tune, re-train, or otherwise modify the Trained Model, unless for purpose of research & development;
2.4.2 use the Trained Model or any part thereof to create derivative models or services that compete with those of Distil Labs;
2.4.3 circumvent any technical restrictions embedded in the Trained Model or Base Model that are designed to enforce usage limitations.
2.5 The Parties acknowledge and agree that the Trained Model is developed from Base Models which are supplied by a third party. Therefore, the Model License is subject to the restrictions resulting from the open-source or any other applicable license of the Base Model (“Base Model License”) and the Customer must use the Trained Model in compliance with the Base Model License. In particular, the Customer must oblige their clients to compliance with the Base Model License in any case of transferring or sublicensing the rights to or making available in any way the Trained Model. The applicable Base Model License is defined in the Training Configuration and will be provided for download. The Customer agrees to indemnify Distil Labs for any and all claims brought by the Base Model provider for violations of the Base Model License.

59
Modelfile Normal file

@@ -0,0 +1,59 @@
FROM ./model.gguf
TEMPLATE """{{- $lastUserIdx := -1 -}}
{{- range $idx, $msg := .Messages -}}
{{- if eq $msg.Role "user" }}{{ $lastUserIdx = $idx }}{{ end -}}
{{- end }}
{{- if or .System .Tools }}<|im_start|>system
{{ if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}
# Tools
You may call one or more functions to assist with the user query.
You are provided with function signatures within <tools></tools> XML tags:
<tools>
{{- range .Tools }}
{"type": "function", "function": {{ .Function }}}
{{- end }}
</tools>
For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call>
{{- end -}}
<|im_end|>
{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
{{- if eq .Role "user" }}<|im_start|>user
{{ .Content }}<|im_end|>
{{ else if eq .Role "assistant" }}<|im_start|>assistant
{{ if (and $.IsThinkSet (and .Thinking (or $last (gt $i $lastUserIdx)))) -}}
<think>{{ .Thinking }}</think>
{{ end -}}
{{ if .Content }}{{ .Content }}
{{- else if .ToolCalls }}<tool_call>
{{ range .ToolCalls }}{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
{{ end }}</tool_call>
{{- end }}{{ if not $last }}<|im_end|>
{{ end }}
{{- else if eq .Role "tool" }}<|im_start|>user
<tool_response>
{{ .Content }}
</tool_response><|im_end|>
{{ end }}
{{- if and (ne .Role "assistant") $last }}<|im_start|>assistant
{{ if and $.IsThinkSet (not $.Think) -}}
<think>
</think>
{{ end -}}
{{ end }}
{{- end }}"""

193
README.md Normal file

@@ -0,0 +1,193 @@
---
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen3-0.6B
tags:
- text2sql
- sql
- nlp
- distillation
- qwen3
datasets:
- distil-labs/text2sql-synthetic
language:
- en
pipeline_tag: text-generation
---
# Distil-Qwen3-0.6B-Text2SQL
A fine-tuned Qwen3-0.6B model for converting natural language questions into SQL queries. Trained using knowledge distillation from DeepSeek-V3, this compact 0.6B-parameter model reaches 74% LLM-as-a-Judge accuracy (the teacher scores 76%) while remaining small enough for fast local deployment.
## Results
| Metric | DeepSeek-V3 (Teacher) | Qwen3-0.6B (Base) | **This Model** |
|--------|:---------------------:|:-----------------:|:--------------:|
| LLM-as-a-Judge | 76% | 36% | **74%** |
| Exact Match | 38% | 24% | **40%** |
| ROUGE | 88.6% | 69.3% | **88.5%** |
| METEOR | 90.4% | 65.3% | **88.5%** |
The fine-tuned model achieves **74% LLM-as-a-Judge accuracy** with only 0.6B parameters. That is roughly a **2x improvement** over the base model (36%) and within 2 percentage points of the 685B-parameter teacher (76%), at a fraction of the size.
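The headline ratios follow directly from the LLM-as-a-Judge row of the table. A minimal sketch of the arithmetic, using only the three percentages reported there (the variable names are illustrative):

```python
# LLM-as-a-Judge accuracy in percent, copied from the results table
teacher = 76
base = 36
distilled = 74

# Relative improvement of the fine-tuned model over the base model
improvement = distilled / base
# Absolute gap to the teacher, in percentage points
gap_to_teacher = teacher - distilled

print(f"improvement over base: {improvement:.2f}x")
print(f"gap to teacher: {gap_to_teacher} points")
```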
## Quick Start
### Using Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("distil-labs/distil-qwen3-0.6b-text2sql")
tokenizer = AutoTokenizer.from_pretrained("distil-labs/distil-qwen3-0.6b-text2sql")
schema = """CREATE TABLE employees (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
department TEXT,
salary INTEGER
);"""
question = "How many employees earn more than 50000?"
messages = [
{
"role": "system",
"content": """You are a problem solving model working on task_description XML block:
<task_description>You are given a database schema and a natural language question. Generate the SQL query that answers the question.
Input:
- Schema: One or two table definitions in SQL DDL format
- Question: Natural language question about the data
Output:
- A single SQL query that answers the question
- No explanations, comments, or additional text
Rules:
- Use only tables and columns from the provided schema
- Use uppercase SQL keywords (SELECT, FROM, WHERE, etc.)
- Use SQLite-compatible syntax</task_description>
You will be given a single task in the question XML block
Solve only the task in question block.
Generate only the answer, do not generate anything else"""
},
{
"role": "user",
"content": f"""Now for the real task, solve the task in question block.
Generate only the solution, do not generate anything else
<question>Schema:
{schema}
Question: {question}</question>"""
}
]
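# Render the chat messages into the model's prompt format
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
# Greedy decoding: use do_sample=False, since temperature=0 is not a valid sampling temperature
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, not the echoed prompt
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))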
```
### Using Ollama (GGUF version)
For local inference, use the quantized GGUF version included in this repository:
```bash
# Create the Ollama model from the Modelfile (it expects model.gguf in the current directory)
ollama create distil-qwen3-0.6b-text2sql -f Modelfile
# Run inference
ollama run distil-qwen3-0.6b-text2sql
```
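Once the model has been created, it can also be queried from a script. The sketch below assumes an Ollama server is running on its default local address and uses Ollama's `/api/chat` endpoint; `build_payload` and `ask_ollama` are hypothetical helper names, and the system and user prompt strings are expected to be the same ones used in the Transformers example.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL_NAME = "distil-qwen3-0.6b-text2sql"  # name passed to `ollama create`


def build_payload(system_prompt: str, user_prompt: str) -> dict:
    """Build a non-streaming chat request with temperature 0."""
    return {
        "model": MODEL_NAME,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "stream": False,
        "options": {"temperature": 0},
    }


def ask_ollama(system_prompt: str, user_prompt: str) -> str:
    """Send the request to the local server and return the reply text."""
    data = json.dumps(build_payload(system_prompt, user_prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]
```

Calling `ask_ollama(system_prompt, user_prompt)` should return the generated SQL as a string, provided the server is reachable.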
## Model Details
| Property | Value |
|----------|-------|
| Base Model | [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) |
| Parameters | 0.6 billion |
| Architecture | Qwen3ForCausalLM |
| Context Length | 40,960 tokens |
| Precision | bfloat16 |
| Training Data | ~10,000 synthetic examples |
| Teacher Model | DeepSeek-V3 |
## Training
This model was trained using the [Distil Labs](https://distillabs.ai) platform:
1. **Seed Data**: 50 hand-validated Text2SQL examples covering various SQL complexities
2. **Synthetic Generation**: Expanded to ~10,000 examples using DeepSeek-V3
3. **Fine-tuning**: 4 epochs on the synthetic dataset
4. **Evaluation**: LLM-as-a-Judge with semantic equivalence checking
### Training Hyperparameters
- Epochs: 4
- Learning Rate: 5e-5 (cosine schedule)
- Batch Size: 1 (with gradient accumulation)
- Total Steps: ~40,000
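These figures can be sanity-checked against `training-logs.csv` in this repository. The sketch below uses four `(step, learning_rate)` pairs copied from that log and the step at which the log reports epoch 1.0; the variable names are illustrative. It checks whether the learning rate drops by the same amount over equal step intervals, which is what a linear decay would produce.

```python
import math

# (step, learning_rate) pairs copied from training-logs.csv
logged = [
    (2250, 4.9713728041639565e-05),
    (2500, 4.9388418998048146e-05),
    (2750, 4.9063109954456734e-05),
    (3000, 4.873780091086532e-05),
]

# Drop in learning rate between consecutive logged points (250 steps apart)
drops = [a[1] - b[1] for a, b in zip(logged, logged[1:])]

# Equal drops over equal intervals indicate a linear decay in this range
is_linear = all(math.isclose(d, drops[0], rel_tol=1e-6) for d in drops)

# The log records the end of epoch 1 at step 10112
steps_per_epoch = 10112
total_steps = 4 * steps_per_epoch

print(is_linear, total_steps)
```

The step count of 40,448 agrees with the "~40,000" figure. The equal drops suggest that, at least over this range, the logged schedule decays linearly, so the "cosine schedule" label may be worth double-checking against the training configuration.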
## Task Format
### Input Format
```
Schema:
CREATE TABLE table_name (
column_name DATA_TYPE [CONSTRAINTS],
...
);
Question: Natural language question about the data
```
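A small helper can assemble this input into the user message. The sketch below reuses the user-turn wording from the Quick Start example; `build_user_prompt` is a hypothetical name, and the exact line breaks are an assumption based on that example.

```python
def build_user_prompt(schema: str, question: str) -> str:
    """Wrap a schema and a question in the user-turn wording from the Quick Start."""
    return (
        "Now for the real task, solve the task in question block.\n"
        "Generate only the solution, do not generate anything else\n"
        f"<question>Schema:\n{schema}\nQuestion: {question}</question>"
    )


schema = "CREATE TABLE employees (\n  id INTEGER PRIMARY KEY,\n  salary INTEGER\n);"
prompt = build_user_prompt(schema, "How many employees earn more than 50000?")
print(prompt)
```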
### Output Format
A single SQL query with:
- Uppercase SQL keywords (SELECT, FROM, WHERE, etc.)
- SQLite-compatible syntax
- No explanations or additional text
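Because the output is meant to be SQLite-compatible, a generated query can be checked before it is run on real data. The sketch below is one possible approach, not part of the model's tooling: it builds the schema in an in-memory SQLite database and asks SQLite to plan the query without executing it. `check_sql` is a hypothetical helper name.

```python
import sqlite3


def check_sql(schema: str, query: str) -> tuple[bool, str]:
    """Return (ok, message) after asking SQLite to parse and plan the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)  # create the empty tables from the DDL
        conn.execute(f"EXPLAIN QUERY PLAN {query}")  # plan only, do not run the query
        return True, "ok"
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()


schema = """CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department TEXT,
    salary INTEGER
);"""

print(check_sql(schema, "SELECT COUNT(*) FROM employees WHERE salary > 50000"))
print(check_sql(schema, "SELECT COUNT(*) FROM staff"))
```

The first call should pass and the second should fail, because `staff` is not a table in the schema. A check like this catches syntax errors and unknown tables or columns, but it says nothing about whether the query answers the question.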
### Supported SQL Features
- **Simple**: SELECT, WHERE, COUNT, SUM, AVG, MAX, MIN
- **Medium**: JOIN, GROUP BY, HAVING, ORDER BY, LIMIT
- **Complex**: Subqueries, multiple JOINs, UNION
## Use Cases
- Natural language interfaces to databases
- SQL query assistance and autocompletion
- Database chatbots and conversational BI
- Educational tools for learning SQL
- Edge deployment and mobile applications
## Limitations
- Optimized for SQLite syntax
- Best with 1-2 table schemas
- May struggle with highly complex nested subqueries
- Trained on English questions only
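An application can guard against the schema-size limitation before calling the model. The sketch below counts `CREATE TABLE` statements with a simple regular expression; `count_tables` is a hypothetical helper, and the regex is an approximation rather than a SQL parser.

```python
import re


def count_tables(schema: str) -> int:
    """Count CREATE TABLE statements in a DDL string (regex approximation)."""
    return len(re.findall(r"\bCREATE\s+TABLE\b", schema, flags=re.IGNORECASE))


schema = """CREATE TABLE a (id INTEGER);
CREATE TABLE b (id INTEGER);
CREATE TABLE c (id INTEGER);"""

n = count_tables(schema)
if n > 2:
    print(f"Schema has {n} tables; the model is reported to work best with 1-2.")
```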
## License
This model is released under the Apache 2.0 license.
## Links
- [Distil Labs Website](https://distillabs.ai)
- [GitHub](https://github.com/distil-labs)
- [Hugging Face](https://huggingface.co/distil-labs)
## Citation
```bibtex
@misc{distil-qwen3-0.6b-text2sql,
author = {Distil Labs},
title = {Distil-Qwen3-0.6B-Text2SQL: A Compact Fine-tuned Model for Natural Language to SQL},
year = {2025},
publisher = {Hugging Face},
url = {https://huggingface.co/distil-labs/distil-qwen3-0.6b-text2sql}
}
```

13
STUDENT_LICENSE Normal file

@@ -0,0 +1,13 @@
Copyright 2023 Qwen
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

13
TEACHER_LICENSE Normal file

@@ -0,0 +1,13 @@
Copyright 2025 OpenAI
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

28
added_tokens.json Normal file

@@ -0,0 +1,28 @@
{
"</think>": 151668,
"</tool_call>": 151658,
"</tool_response>": 151666,
"<think>": 151667,
"<tool_call>": 151657,
"<tool_response>": 151665,
"<|box_end|>": 151649,
"<|box_start|>": 151648,
"<|endoftext|>": 151643,
"<|file_sep|>": 151664,
"<|fim_middle|>": 151660,
"<|fim_pad|>": 151662,
"<|fim_prefix|>": 151659,
"<|fim_suffix|>": 151661,
"<|im_end|>": 151645,
"<|im_start|>": 151644,
"<|image_pad|>": 151655,
"<|object_ref_end|>": 151647,
"<|object_ref_start|>": 151646,
"<|quad_end|>": 151651,
"<|quad_start|>": 151650,
"<|repo_name|>": 151663,
"<|video_pad|>": 151656,
"<|vision_end|>": 151653,
"<|vision_pad|>": 151654,
"<|vision_start|>": 151652
}

89
chat_template.jinja Normal file

@@ -0,0 +1,89 @@
{%- if tools %}
{{- '<|im_start|>system\n' }}
{%- if messages[0].role == 'system' %}
{{- messages[0].content + '\n\n' }}
{%- endif %}
{{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
{%- for tool in tools %}
{{- "\n" }}
{{- tool | tojson }}
{%- endfor %}
{{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
{%- if messages[0].role == 'system' %}
{{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
{%- set index = (messages|length - 1) - loop.index0 %}
{%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
{%- set ns.multi_step_tool = false %}
{%- set ns.last_query_index = index %}
{%- endif %}
{%- endfor %}
{%- for message in messages %}
{%- if message.content is string %}
{%- set content = message.content %}
{%- else %}
{%- set content = '' %}
{%- endif %}
{%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
{{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
{%- elif message.role == "assistant" %}
{%- set reasoning_content = '' %}
{%- if message.reasoning_content is string %}
{%- set reasoning_content = message.reasoning_content %}
{%- else %}
{%- if '</think>' in content %}
{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
{%- set content = content.split('</think>')[-1].lstrip('\n') %}
{%- endif %}
{%- endif %}
{%- if loop.index0 > ns.last_query_index %}
{%- if loop.last or (not loop.last and reasoning_content) %}
{{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- if message.tool_calls %}
{%- for tool_call in message.tool_calls %}
{%- if (loop.first and content) or (not loop.first) %}
{{- '\n' }}
{%- endif %}
{%- if tool_call.function %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '<tool_call>\n{"name": "' }}
{{- tool_call.name }}
{{- '", "arguments": ' }}
{%- if tool_call.arguments is string %}
{{- tool_call.arguments }}
{%- else %}
{{- tool_call.arguments | tojson }}
{%- endif %}
{{- '}\n</tool_call>' }}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "tool" %}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|im_start|>user' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{{- content }}
{{- '\n</tool_response>' }}
{%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- if enable_thinking is defined and enable_thinking is false %}
{{- '<think>\n\n</think>\n\n' }}
{%- endif %}
{%- endif %}

62
config.json Normal file

@@ -0,0 +1,62 @@
{
"architectures": [
"Qwen3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 40960,
"max_window_layers": 28,
"model_type": "qwen3",
"num_attention_heads": 16,
"num_hidden_layers": 28,
"num_key_value_heads": 8,
"pad_token": "<|endoftext|>",
"pad_token_id": 151643,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000,
"sliding_window": null,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.53.0",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}

13
generation_config.json Normal file
View File

@@ -0,0 +1,13 @@
{
"bos_token_id": 151643,
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"temperature": 0.6,
"top_k": 20,
"top_p": 0.95,
"transformers_version": "4.53.0"
}

151388
merges.txt Normal file

File diff suppressed because it is too large

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3ada3be37f601e4a85310e812fde7a341551e057cd2c79e512d2d7c1e643c77a
size 1192135096

38
special_tokens_map.json Normal file

@@ -0,0 +1,38 @@
{
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"bos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

3
tokenizer.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fbd5dd30a62db2f0ead71513492e40939dca4240dd5141e0a525212e2a45ff74
size 11422923

240
tokenizer_config.json Normal file

@@ -0,0 +1,240 @@
{
"add_bos_token": false,
"add_prefix_space": false,
"added_tokens_decoder": {
"151643": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151644": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151645": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151646": {
"content": "<|object_ref_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151647": {
"content": "<|object_ref_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151648": {
"content": "<|box_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151649": {
"content": "<|box_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151650": {
"content": "<|quad_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151651": {
"content": "<|quad_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151652": {
"content": "<|vision_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151653": {
"content": "<|vision_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151654": {
"content": "<|vision_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151655": {
"content": "<|image_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151656": {
"content": "<|video_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151657": {
"content": "<tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151658": {
"content": "</tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151659": {
"content": "<|fim_prefix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151660": {
"content": "<|fim_middle|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151661": {
"content": "<|fim_suffix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151662": {
"content": "<|fim_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151663": {
"content": "<|repo_name|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151664": {
"content": "<|file_sep|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151665": {
"content": "<tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151666": {
"content": "</tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151667": {
"content": "<think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151668": {
"content": "</think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"bos_token": "<|endoftext|>",
"clean_up_tokenization_spaces": false,
"eos_token": "<|im_end|>",
"errors": "replace",
"extra_special_tokens": {},
"model_max_length": 131072,
"pad_token": "<|endoftext|>",
"padding_side": "left",
"split_special_tokens": false,
"tokenizer_class": "Qwen2Tokenizer",
"unk_token": null
}

168
training-logs.csv Normal file

@@ -0,0 +1,168 @@
,eval_loss,eval_binary,eval_rouge,eval_llm_as_a_judge,eval_runtime,eval_samples_per_second,eval_steps_per_second,epoch,step,loss,grad_norm,learning_rate,train_runtime,train_samples_per_second,train_steps_per_second,total_flos,train_loss
0,1.4611146450042725,0.0,0.8400047759556624,0.0,38.5471,1.297,1.297,0.0,0,,,,,,,,
1,,,,,,,,0.024723101265822785,250,0.7402,54.79979705810547,6.005931784478498e-06,,,,,
2,,,,,,,,0.04944620253164557,500,0.3334,2.3189218044281006,1.2184873949579832e-05,,,,,
3,,,,,,,,0.07416930379746836,750,0.2563,2.748095989227295,1.836381611468117e-05,,,,,
4,,,,,,,,0.09889240506329114,1000,0.2416,4.4334211349487305,2.45427582797825e-05,,,,,
5,,,,,,,,0.12361550632911393,1250,0.2444,2.845724582672119,3.0721700444883836e-05,,,,,
6,,,,,,,,0.14833860759493672,1500,0.2232,1.9762811660766602,3.690064260998517e-05,,,,,
7,,,,,,,,0.1730617088607595,1750,0.2032,1.2688300609588623,4.307958477508651e-05,,,,,
8,,,,,,,,0.19778481012658228,2000,0.21,0.7073513865470886,4.925852694018784e-05,,,,,
9,,,,,,,,0.22250791139240506,2250,0.219,0.16178998351097107,4.9713728041639565e-05,,,,,
10,,,,,,,,0.24723101265822786,2500,0.1886,1.8036707639694214,4.9388418998048146e-05,,,,,
11,,,,,,,,0.2719541139240506,2750,0.1907,1.909298300743103,4.9063109954456734e-05,,,,,
12,,,,,,,,0.29667721518987344,3000,0.1858,0.011116993613541126,4.873780091086532e-05,,,,,
13,,,,,,,,0.3214003164556962,3250,0.1831,1.3476927280426025,4.841249186727391e-05,,,,,
14,,,,,,,,0.346123417721519,3500,0.1814,1.0721465349197388,4.8087182823682505e-05,,,,,
15,,,,,,,,0.3708465189873418,3750,0.1719,1.1927968263626099,4.776187378009109e-05,,,,,
16,,,,,,,,0.39556962025316456,4000,0.1701,0.3336207866668701,4.7436564736499675e-05,,,,,
17,,,,,,,,0.42029272151898733,4250,0.1657,0.002819074084982276,4.711125569290826e-05,,,,,
18,,,,,,,,0.4450158227848101,4500,0.1542,2.771256446838379,4.678594664931685e-05,,,,,
19,,,,,,,,0.4697389240506329,4750,0.1612,1.4837995767593384,4.6460637605725446e-05,,,,,
20,,,,,,,,0.4944620253164557,5000,0.1543,0.004205027129501104,4.613532856213403e-05,,,,,
21,,,,,,,,0.5191851265822784,5250,0.1559,1.5585635900497437,4.5810019518542615e-05,,,,,
22,,,,,,,,0.5439082278481012,5500,0.1626,0.7253758311271667,4.54847104749512e-05,,,,,
23,,,,,,,,0.568631329113924,5750,0.1566,0.3586200475692749,4.515940143135979e-05,,,,,
24,,,,,,,,0.5933544303797469,6000,0.1529,0.2762477695941925,4.4834092387768386e-05,,,,,
25,,,,,,,,0.6180775316455697,6250,0.1441,0.05483795329928398,4.450878334417697e-05,,,,,
26,,,,,,,,0.6428006329113924,6500,0.1533,1.5881824493408203,4.4183474300585556e-05,,,,,
27,,,,,,,,0.6675237341772152,6750,0.1371,0.6844435334205627,4.385816525699415e-05,,,,,
28,,,,,,,,0.692246835443038,7000,0.1543,1.086692214012146,4.353285621340273e-05,,,,,
29,,,,,,,,0.7169699367088608,7250,0.1434,1.3913636207580566,4.320754716981133e-05,,,,,
30,,,,,,,,0.7416930379746836,7500,0.1319,1.0691365003585815,4.288223812621991e-05,,,,,
31,,,,,,,,0.7664161392405063,7750,0.1416,8.558887481689453,4.2556929082628496e-05,,,,,
32,,,,,,,,0.7911392405063291,8000,0.1373,0.004528762772679329,4.223162003903709e-05,,,,,
33,,,,,,,,0.8158623417721519,8250,0.1662,0.41639500856399536,4.190631099544567e-05,,,,,
34,,,,,,,,0.8405854430379747,8500,0.1272,2.729116439819336,4.158100195185427e-05,,,,,
35,,,,,,,,0.8653085443037974,8750,0.1431,1.398118257522583,4.125569290826285e-05,,,,,
36,,,,,,,,0.8900316455696202,9000,0.1534,0.35060954093933105,4.093038386467144e-05,,,,,
37,,,,,,,,0.914754746835443,9250,0.1387,1.2084834575653076,4.060507482108003e-05,,,,,
38,,,,,,,,0.9394778481012658,9500,0.1349,1.2688090801239014,4.027976577748861e-05,,,,,
39,,,,,,,,0.9642009493670886,9750,0.1288,0.8546839952468872,3.995445673389721e-05,,,,,
40,,,,,,,,0.9889240506329114,10000,0.1232,0.5538005828857422,3.962914769030579e-05,,,,,
41,0.13391633331775665,0.4,0.9484506498956332,0.4,37.5908,1.33,1.33,1.0,10112,,,,,,,,
42,,,,,,,,1.0136471518987342,10250,0.1237,0.9612093567848206,3.9303838646714384e-05,,,,,
43,,,,,,,,1.0383702531645569,10500,0.1067,1.7282613515853882,3.897852960312297e-05,,,,,
44,,,,,,,,1.0630933544303798,10750,0.1115,0.8479132056236267,3.865322055953155e-05,,,,,
45,,,,,,,,1.0878164556962024,11000,0.1068,0.979035496711731,3.832791151594015e-05,,,,,
46,,,,,,,,1.1125395569620253,11250,0.1144,1.2608872652053833,3.800260247234873e-05,,,,,
47,,,,,,,,1.137262658227848,11500,0.1143,1.106691598892212,3.7677293428757324e-05,,,,,
48,,,,,,,,1.1619857594936709,11750,0.1049,1.5078562498092651,3.735198438516591e-05,,,,,
49,,,,,,,,1.1867088607594938,12000,0.1134,0.020030342042446136,3.7026675341574493e-05,,,,,
50,,,,,,,,1.2114319620253164,12250,0.1165,1.020694375038147,3.670136629798309e-05,,,,,
51,,,,,,,,1.2361550632911393,12500,0.1143,0.608747661113739,3.6377358490566037e-05,,,,,
52,,,,,,,,1.260878164556962,12750,0.11,0.2116389125585556,3.605204944697463e-05,,,,,
53,,,,,,,,1.2856012658227849,13000,0.1087,1.4882630109786987,3.572674040338321e-05,,,,,
54,,,,,,,,1.3103243670886076,13250,0.1105,1.8852481842041016,3.54014313597918e-05,,,,,
55,,,,,,,,1.3350474683544304,13500,0.1006,0.611929714679718,3.5076122316200396e-05,,,,,
56,,,,,,,,1.3597705696202531,13750,0.114,0.002776511711999774,3.475081327260898e-05,,,,,
57,,,,,,,,1.384493670886076,14000,0.119,1.504339575767517,3.442550422901757e-05,,,,,
58,,,,,,,,1.4092167721518987,14250,0.1142,0.05355082079768181,3.410019518542616e-05,,,,,
59,,,,,,,,1.4339398734177216,14500,0.1183,0.33664050698280334,3.377488614183474e-05,,,,,
60,,,,,,,,1.4586629746835442,14750,0.1165,0.9554659724235535,3.3449577098243336e-05,,,,,
61,,,,,,,,1.4833860759493671,15000,0.1115,0.47997984290122986,3.312426805465192e-05,,,,,
62,,,,,,,,1.5081091772151898,15250,0.1158,0.47814834117889404,3.279895901106051e-05,,,,,
63,,,,,,,,1.5328322784810127,15500,0.1084,0.4871940612792969,3.24736499674691e-05,,,,,
64,,,,,,,,1.5575553797468356,15750,0.1099,1.3032745122909546,3.214834092387768e-05,,,,,
65,,,,,,,,1.5822784810126582,16000,0.1211,0.6182886362075806,3.1823031880286276e-05,,,,,
66,,,,,,,,1.607001582278481,16250,0.1175,0.5264896750450134,3.149772283669486e-05,,,,,
67,,,,,,,,1.6317246835443038,16500,0.1062,1.750696063041687,3.117241379310345e-05,,,,,
68,,,,,,,,1.6564477848101267,16750,0.1004,1.6210639476776123,3.084710474951204e-05,,,,,
69,,,,,,,,1.6811708860759493,17000,0.0958,0.5644216537475586,3.052179570592062e-05,,,,,
70,,,,,,,,1.705893987341772,17250,0.1147,1.8764464855194092,3.0196486662329217e-05,,,,,
71,,,,,,,,1.7306170886075949,17500,0.1126,3.0712196826934814,2.98711776187378e-05,,,,,
72,,,,,,,,1.7553401898734178,17750,0.1019,0.0009781919652596116,2.954586857514639e-05,,,,,
73,,,,,,,,1.7800632911392404,18000,0.1243,2.594623565673828,2.922186076772934e-05,,,,,
74,,,,,,,,1.8047863924050633,18250,0.1077,2.3117940425872803,2.8896551724137933e-05,,,,,
75,,,,,,,,1.8295094936708862,18500,0.1103,0.3653891086578369,2.8572543916720884e-05,,,,,
76,,,,,,,,1.8542325949367089,18750,0.1004,1.380301833152771,2.8247234873129476e-05,,,,,
77,,,,,,,,1.8789556962025316,19000,0.1112,2.6601779460906982,2.7921925829538064e-05,,,,,
78,,,,,,,,1.9036787974683544,19250,0.109,0.009044609032571316,2.759661678594665e-05,,,,,
79,,,,,,,,1.9284018987341773,19500,0.1026,0.256060391664505,2.727130774235524e-05,,,,,
80,,,,,,,,1.953125,19750,0.1221,1.5156265497207642,2.6945998698763825e-05,,,,,
81,,,,,,,,1.9778481012658227,20000,0.1063,0.05024247244000435,2.6620689655172416e-05,,,,,
82,0.12020297348499298,0.44,0.9528717906739388,0.44,37.6019,1.33,1.33,2.0,20224,,,,,,,,
83,,,,,,,,2.0025712025316458,20250,0.0969,1.1122921705245972,2.6295380611581004e-05,,,,,
84,,,,,,,,2.0272943037974684,20500,0.0884,0.9288895130157471,2.597007156798959e-05,,,,,
85,,,,,,,,2.052017405063291,20750,0.0899,0.8800064921379089,2.564476252439818e-05,,,,,
86,,,,,,,,2.0767405063291138,21000,0.0919,1.1201504468917847,2.5319453480806765e-05,,,,,
87,,,,,,,,2.101463607594937,21250,0.0962,1.5560466051101685,2.4994144437215357e-05,,,,,
88,,,,,,,,2.1261867088607596,21500,0.0907,0.5000078082084656,2.466883539362394e-05,,,,,
89,,,,,,,,2.1509098101265822,21750,0.0879,0.5242018103599548,2.4343526350032533e-05,,,,,
90,,,,,,,,2.175632911392405,22000,0.0866,1.4044090509414673,2.401821730644112e-05,,,,,
91,,,,,,,,2.200356012658228,22250,0.0964,0.88419508934021,2.369290826284971e-05,,,,,
92,,,,,,,,2.2250791139240507,22500,0.0854,0.8067905902862549,2.3367599219258297e-05,,,,,
93,,,,,,,,2.2498022151898733,22750,0.0828,1.1692417860031128,2.3042290175666885e-05,,,,,
94,,,,,,,,2.274525316455696,23000,0.0928,0.18776026368141174,2.2718282368249837e-05,,,,,
95,,,,,,,,2.299248417721519,23250,0.0948,0.0010576567146927118,2.2392973324658425e-05,,,,,
96,,,,,,,,2.3239715189873418,23500,0.0839,0.0008563404553569853,2.2067664281067016e-05,,,,,
97,,,,,,,,2.3486946202531644,23750,0.0949,1.6228750944137573,2.1742355237475604e-05,,,,,
98,,,,,,,,2.3734177215189876,24000,0.0896,1.3844517469406128,2.1417046193884193e-05,,,,,
99,,,,,,,,2.3981408227848102,24250,0.0858,1.2112687826156616,2.1091737150292777e-05,,,,,
100,,,,,,,,2.422863924050633,24500,0.0977,1.0013253688812256,2.0766428106701365e-05,,,,,
101,,,,,,,,2.4475870253164556,24750,0.0766,0.00082951202057302,2.0441119063109957e-05,,,,,
102,,,,,,,,2.4723101265822787,25000,0.0784,1.2415456771850586,2.0115810019518545e-05,,,,,
103,,,,,,,,2.4970332278481013,25250,0.0862,0.11346381902694702,1.9790500975927133e-05,,,,,
104,,,,,,,,2.521756329113924,25500,0.082,2.2702343463897705,1.9465191932335718e-05,,,,,
105,,,,,,,,2.5464794303797467,25750,0.0827,0.019430814310908318,1.913988288874431e-05,,,,,
106,,,,,,,,2.5712025316455698,26000,0.0857,0.419853538274765,1.8814573845152897e-05,,,,,
107,,,,,,,,2.5959256329113924,26250,0.0919,2.4839465618133545,1.8489264801561485e-05,,,,,
108,,,,,,,,2.620648734177215,26500,0.085,0.46295323967933655,1.8163955757970073e-05,,,,,
109,,,,,,,,2.645371835443038,26750,0.0828,0.2432839572429657,1.7838646714378658e-05,,,,,
110,,,,,,,,2.670094936708861,27000,0.0881,1.1563462018966675,1.7514638906961613e-05,,,,,
111,,,,,,,,2.6948180379746836,27250,0.0845,1.5228748321533203,1.71893298633702e-05,,,,,
112,,,,,,,,2.7195411392405062,27500,0.0828,0.5037127137184143,1.6864020819778793e-05,,,,,
113,,,,,,,,2.744264240506329,27750,0.0914,2.4526147842407227,1.653871177618738e-05,,,,,
114,,,,,,,,2.768987341772152,28000,0.0946,0.6520683765411377,1.6214703968770332e-05,,,,,
115,,,,,,,,2.7937104430379747,28250,0.081,0.5551896691322327,1.588939492517892e-05,,,,,
116,,,,,,,,2.8184335443037973,28500,0.089,0.35995572805404663,1.556408588158751e-05,,,,,
117,,,,,,,,2.8431566455696204,28750,0.0762,1.3633737564086914,1.5238776837996097e-05,,,,,
118,,,,,,,,2.867879746835443,29000,0.0848,0.06417599320411682,1.4913467794404685e-05,,,,,
119,,,,,,,,2.8926028481012658,29250,0.089,2.097482681274414,1.4588158750813274e-05,,,,,
120,,,,,,,,2.9173259493670884,29500,0.0736,1.4356244802474976,1.4262849707221863e-05,,,,,
121,,,,,,,,2.9420490506329116,29750,0.083,1.1658858060836792,1.3937540663630449e-05,,,,,
122,,,,,,,,2.9667721518987342,30000,0.0893,0.37668246030807495,1.3613532856213404e-05,,,,,
123,,,,,,,,2.991495253164557,30250,0.089,0.32948413491249084,1.3288223812621992e-05,,,,,
124,0.09559772163629532,0.48,0.9629392864315163,0.48,34.8289,1.436,1.436,3.0,30336,,,,,,,,
125,,,,,,,,3.0162183544303796,30500,0.0705,1.3704227209091187,1.2962914769030578e-05,,,,,
126,,,,,,,,3.0409414556962027,30750,0.0662,0.5225652456283569,1.2637605725439167e-05,,,,,
127,,,,,,,,3.0656645569620253,31000,0.0739,0.049944907426834106,1.2312296681847756e-05,,,,,
128,,,,,,,,3.090387658227848,31250,0.0819,0.2583652138710022,1.1986987638256344e-05,,,,,
129,,,,,,,,3.115110759493671,31500,0.0867,0.0035025831311941147,1.1661678594664932e-05,,,,,
130,,,,,,,,3.1398338607594938,31750,0.0743,0.6238996982574463,1.133636955107352e-05,,,,,
131,,,,,,,,3.1645569620253164,32000,0.0794,0.7921465039253235,1.1012361743656474e-05,,,,,
132,,,,,,,,3.189280063291139,32250,0.0757,0.0061428844928741455,1.0688353936239429e-05,,,,,
133,,,,,,,,3.2140031645569622,32500,0.073,1.9250328540802002,1.0363044892648017e-05,,,,,
134,,,,,,,,3.238726265822785,32750,0.0686,1.1742448806762695,1.0037735849056603e-05,,,,,
135,,,,,,,,3.2634493670886076,33000,0.0724,0.07340391725301743,9.712426805465193e-06,,,,,
136,,,,,,,,3.2881724683544302,33250,0.0625,1.830605149269104,9.38711776187378e-06,,,,,
137,,,,,,,,3.3128955696202533,33500,0.065,0.6452906131744385,9.06180871828237e-06,,,,,
138,,,,,,,,3.337618670886076,33750,0.0672,1.2680094242095947,8.736499674690957e-06,,,,,
139,,,,,,,,3.3623417721518987,34000,0.071,0.044624943286180496,8.411190631099545e-06,,,,,
140,,,,,,,,3.3870648734177213,34250,0.0694,2.3345537185668945,8.085881587508134e-06,,,,,
141,,,,,,,,3.4117879746835444,34500,0.0741,0.31069791316986084,7.76057254391672e-06,,,,,
142,,,,,,,,3.436511075949367,34750,0.0778,3.534733295440674,7.43526350032531e-06,,,,,
143,,,,,,,,3.4612341772151898,35000,0.0611,2.030383348464966,7.109954456733897e-06,,,,,
144,,,,,,,,3.4859572784810124,35250,0.0773,0.04156184941530228,6.784645413142486e-06,,,,,
145,,,,,,,,3.5106803797468356,35500,0.0828,1.2193745374679565,6.459336369551074e-06,,,,,
146,,,,,,,,3.5354034810126582,35750,0.0792,1.88787841796875,6.134027325959662e-06,,,,,
147,,,,,,,,3.560126582278481,36000,0.0681,1.1811633110046387,5.80871828236825e-06,,,,,
148,,,,,,,,3.584849683544304,36250,0.0717,1.1019994020462036,5.483409238776838e-06,,,,,
149,,,,,,,,3.6095727848101267,36500,0.0772,1.6269752979278564,5.158100195185426e-06,,,,,
150,,,,,,,,3.6342958860759493,36750,0.0666,1.4130914211273193,4.832791151594015e-06,,,,,
151,,,,,,,,3.659018987341772,37000,0.0719,1.556328535079956,4.5087833441769685e-06,,,,,
152,,,,,,,,3.6837420886075947,37250,0.0637,0.576520562171936,4.183474300585556e-06,,,,,
153,,,,,,,,3.7084651898734178,37500,0.0693,0.37657344341278076,3.85946649316851e-06,,,,,
154,,,,,,,,3.7331882911392404,37750,0.0637,0.5553440451622009,3.534157449577098e-06,,,,,
155,,,,,,,,3.757911392405063,38000,0.0627,1.6216650009155273,3.2088484059856866e-06,,,,,
156,,,,,,,,3.7826344936708862,38250,0.0656,4.061465263366699,2.8835393623942746e-06,,,,,
157,,,,,,,,3.807357594936709,38500,0.0646,0.9978199601173401,2.558230318802863e-06,,,,,
158,,,,,,,,3.8320806962025316,38750,0.065,1.2803815603256226,2.232921275211451e-06,,,,,
159,,,,,,,,3.8568037974683547,39000,0.0609,1.7353192567825317,1.907612231620039e-06,,,,,
160,,,,,,,,3.8815268987341773,39250,0.073,1.3712269067764282,1.5823031880286274e-06,,,,,
161,,,,,,,,3.90625,39500,0.0771,0.9225087761878967,1.2569941444372155e-06,,,,,
162,,,,,,,,3.9309731012658227,39750,0.0698,0.006028198637068272,9.316851008458037e-07,,,,,
163,,,,,,,,3.9556962025316453,40000,0.0763,1.3945635557174683,6.063760572543916e-07,,,,,
164,,,,,,,,3.9804193037974684,40250,0.0648,2.6979146003723145,2.810670136629798e-07,,,,,
165,0.11073184758424759,0.46,0.9610821865685706,0.46,36.7743,1.36,1.36,4.0,40448,,,,,,,,
166,,,,,,,,4.0,40448,,,,4982.372,8.118,8.118,4.056404093986406e+16,0.11362613520667522

167
training-logs.json Normal file
View File

@@ -0,0 +1,167 @@
{"eval_loss":1.461114645,"eval_binary":0.0,"eval_rouge":0.840004776,"eval_llm_as_a_judge":0.0,"eval_runtime":38.5471,"eval_samples_per_second":1.297,"eval_steps_per_second":1.297,"epoch":0.0,"step":0,"loss":null,"grad_norm":null,"learning_rate":null,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.0247231013,"step":250,"loss":0.7402,"grad_norm":54.7997970581,"learning_rate":0.0000060059,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.0494462025,"step":500,"loss":0.3334,"grad_norm":2.3189218044,"learning_rate":0.0000121849,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.0741693038,"step":750,"loss":0.2563,"grad_norm":2.7480959892,"learning_rate":0.0000183638,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.0988924051,"step":1000,"loss":0.2416,"grad_norm":4.4334211349,"learning_rate":0.0000245428,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.1236155063,"step":1250,"loss":0.2444,"grad_norm":2.8457245827,"learning_rate":0.0000307217,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.1483386076,"step":1500,"loss":0.2232,"grad_norm":1.9762811661,"learning_rate":0.0000369006,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.1730617089,"step":1750,"loss":0.2032,"grad_norm":1.268830061,"learning_rate":0.0000430796,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.1977848101,"step":2000,"loss":0.21,"grad_norm":0.7073513865,"learning_rate":0.0000492585,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.2225079114,"step":2250,"loss":0.219,"grad_norm":0.1617899835,"learning_rate":0.0000497137,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.2472310127,"step":2500,"loss":0.1886,"grad_norm":1.803670764,"learning_rate":0.0000493884,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.2719541139,"step":2750,"loss":0.1907,"grad_norm":1.9092983007,"learning_rate":0.0000490631,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.2966772152,"step":3000,"loss":0.1858,"grad_norm":0.0111169936,"learning_rate":0.0000487378,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.3214003165,"step":3250,"loss":0.1831,"grad_norm":1.347692728,"learning_rate":0.0000484125,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.3461234177,"step":3500,"loss":0.1814,"grad_norm":1.0721465349,"learning_rate":0.0000480872,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.370846519,"step":3750,"loss":0.1719,"grad_norm":1.1927968264,"learning_rate":0.0000477619,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.3955696203,"step":4000,"loss":0.1701,"grad_norm":0.3336207867,"learning_rate":0.0000474366,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.4202927215,"step":4250,"loss":0.1657,"grad_norm":0.0028190741,"learning_rate":0.0000471113,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.4450158228,"step":4500,"loss":0.1542,"grad_norm":2.7712564468,"learning_rate":0.0000467859,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.4697389241,"step":4750,"loss":0.1612,"grad_norm":1.4837995768,"learning_rate":0.0000464606,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.4944620253,"step":5000,"loss":0.1543,"grad_norm":0.0042050271,"learning_rate":0.0000461353,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.5191851266,"step":5250,"loss":0.1559,"grad_norm":1.55856359,"learning_rate":0.00004581,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.5439082278,"step":5500,"loss":0.1626,"grad_norm":0.7253758311,"learning_rate":0.0000454847,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.5686313291,"step":5750,"loss":0.1566,"grad_norm":0.3586200476,"learning_rate":0.0000451594,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.5933544304,"step":6000,"loss":0.1529,"grad_norm":0.2762477696,"learning_rate":0.0000448341,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.6180775316,"step":6250,"loss":0.1441,"grad_norm":0.0548379533,"learning_rate":0.0000445088,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.6428006329,"step":6500,"loss":0.1533,"grad_norm":1.5881824493,"learning_rate":0.0000441835,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.6675237342,"step":6750,"loss":0.1371,"grad_norm":0.6844435334,"learning_rate":0.0000438582,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.6922468354,"step":7000,"loss":0.1543,"grad_norm":1.086692214,"learning_rate":0.0000435329,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.7169699367,"step":7250,"loss":0.1434,"grad_norm":1.3913636208,"learning_rate":0.0000432075,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.741693038,"step":7500,"loss":0.1319,"grad_norm":1.0691365004,"learning_rate":0.0000428822,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.7664161392,"step":7750,"loss":0.1416,"grad_norm":8.5588874817,"learning_rate":0.0000425569,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.7911392405,"step":8000,"loss":0.1373,"grad_norm":0.0045287628,"learning_rate":0.0000422316,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.8158623418,"step":8250,"loss":0.1662,"grad_norm":0.4163950086,"learning_rate":0.0000419063,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.840585443,"step":8500,"loss":0.1272,"grad_norm":2.7291164398,"learning_rate":0.000041581,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.8653085443,"step":8750,"loss":0.1431,"grad_norm":1.3981182575,"learning_rate":0.0000412557,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.8900316456,"step":9000,"loss":0.1534,"grad_norm":0.3506095409,"learning_rate":0.0000409304,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.9147547468,"step":9250,"loss":0.1387,"grad_norm":1.2084834576,"learning_rate":0.0000406051,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.9394778481,"step":9500,"loss":0.1349,"grad_norm":1.2688090801,"learning_rate":0.0000402798,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.9642009494,"step":9750,"loss":0.1288,"grad_norm":0.8546839952,"learning_rate":0.0000399545,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":0.9889240506,"step":10000,"loss":0.1232,"grad_norm":0.5538005829,"learning_rate":0.0000396291,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":0.1339163333,"eval_binary":0.4,"eval_rouge":0.9484506499,"eval_llm_as_a_judge":0.4,"eval_runtime":37.5908,"eval_samples_per_second":1.33,"eval_steps_per_second":1.33,"epoch":1.0,"step":10112,"loss":null,"grad_norm":null,"learning_rate":null,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.0136471519,"step":10250,"loss":0.1237,"grad_norm":0.9612093568,"learning_rate":0.0000393038,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.0383702532,"step":10500,"loss":0.1067,"grad_norm":1.7282613516,"learning_rate":0.0000389785,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.0630933544,"step":10750,"loss":0.1115,"grad_norm":0.8479132056,"learning_rate":0.0000386532,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.0878164557,"step":11000,"loss":0.1068,"grad_norm":0.9790354967,"learning_rate":0.0000383279,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.112539557,"step":11250,"loss":0.1144,"grad_norm":1.2608872652,"learning_rate":0.0000380026,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.1372626582,"step":11500,"loss":0.1143,"grad_norm":1.1066915989,"learning_rate":0.0000376773,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.1619857595,"step":11750,"loss":0.1049,"grad_norm":1.5078562498,"learning_rate":0.000037352,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.1867088608,"step":12000,"loss":0.1134,"grad_norm":0.020030342,"learning_rate":0.0000370267,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.211431962,"step":12250,"loss":0.1165,"grad_norm":1.020694375,"learning_rate":0.0000367014,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.2361550633,"step":12500,"loss":0.1143,"grad_norm":0.6087476611,"learning_rate":0.0000363774,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.2608781646,"step":12750,"loss":0.11,"grad_norm":0.2116389126,"learning_rate":0.000036052,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.2856012658,"step":13000,"loss":0.1087,"grad_norm":1.488263011,"learning_rate":0.0000357267,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.3103243671,"step":13250,"loss":0.1105,"grad_norm":1.8852481842,"learning_rate":0.0000354014,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.3350474684,"step":13500,"loss":0.1006,"grad_norm":0.6119297147,"learning_rate":0.0000350761,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.3597705696,"step":13750,"loss":0.114,"grad_norm":0.0027765117,"learning_rate":0.0000347508,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.3844936709,"step":14000,"loss":0.119,"grad_norm":1.5043395758,"learning_rate":0.0000344255,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.4092167722,"step":14250,"loss":0.1142,"grad_norm":0.0535508208,"learning_rate":0.0000341002,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.4339398734,"step":14500,"loss":0.1183,"grad_norm":0.336640507,"learning_rate":0.0000337749,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.4586629747,"step":14750,"loss":0.1165,"grad_norm":0.9554659724,"learning_rate":0.0000334496,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.4833860759,"step":15000,"loss":0.1115,"grad_norm":0.4799798429,"learning_rate":0.0000331243,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.5081091772,"step":15250,"loss":0.1158,"grad_norm":0.4781483412,"learning_rate":0.000032799,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.5328322785,"step":15500,"loss":0.1084,"grad_norm":0.4871940613,"learning_rate":0.0000324736,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.5575553797,"step":15750,"loss":0.1099,"grad_norm":1.3032745123,"learning_rate":0.0000321483,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.582278481,"step":16000,"loss":0.1211,"grad_norm":0.6182886362,"learning_rate":0.000031823,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.6070015823,"step":16250,"loss":0.1175,"grad_norm":0.526489675,"learning_rate":0.0000314977,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.6317246835,"step":16500,"loss":0.1062,"grad_norm":1.750696063,"learning_rate":0.0000311724,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.6564477848,"step":16750,"loss":0.1004,"grad_norm":1.6210639477,"learning_rate":0.0000308471,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.6811708861,"step":17000,"loss":0.0958,"grad_norm":0.5644216537,"learning_rate":0.0000305218,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.7058939873,"step":17250,"loss":0.1147,"grad_norm":1.8764464855,"learning_rate":0.0000301965,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.7306170886,"step":17500,"loss":0.1126,"grad_norm":3.0712196827,"learning_rate":0.0000298712,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.7553401899,"step":17750,"loss":0.1019,"grad_norm":0.000978192,"learning_rate":0.0000295459,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.7800632911,"step":18000,"loss":0.1243,"grad_norm":2.5946235657,"learning_rate":0.0000292219,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.8047863924,"step":18250,"loss":0.1077,"grad_norm":2.3117940426,"learning_rate":0.0000288966,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.8295094937,"step":18500,"loss":0.1103,"grad_norm":0.3653891087,"learning_rate":0.0000285725,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.8542325949,"step":18750,"loss":0.1004,"grad_norm":1.3803018332,"learning_rate":0.0000282472,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.8789556962,"step":19000,"loss":0.1112,"grad_norm":2.6601779461,"learning_rate":0.0000279219,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.9036787975,"step":19250,"loss":0.109,"grad_norm":0.009044609,"learning_rate":0.0000275966,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.9284018987,"step":19500,"loss":0.1026,"grad_norm":0.2560603917,"learning_rate":0.0000272713,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.953125,"step":19750,"loss":0.1221,"grad_norm":1.5156265497,"learning_rate":0.000026946,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":1.9778481013,"step":20000,"loss":0.1063,"grad_norm":0.0502424724,"learning_rate":0.0000266207,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":0.1202029735,"eval_binary":0.44,"eval_rouge":0.9528717907,"eval_llm_as_a_judge":0.44,"eval_runtime":37.6019,"eval_samples_per_second":1.33,"eval_steps_per_second":1.33,"epoch":2.0,"step":20224,"loss":null,"grad_norm":null,"learning_rate":null,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.0025712025,"step":20250,"loss":0.0969,"grad_norm":1.1122921705,"learning_rate":0.0000262954,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.0272943038,"step":20500,"loss":0.0884,"grad_norm":0.928889513,"learning_rate":0.0000259701,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.0520174051,"step":20750,"loss":0.0899,"grad_norm":0.8800064921,"learning_rate":0.0000256448,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.0767405063,"step":21000,"loss":0.0919,"grad_norm":1.1201504469,"learning_rate":0.0000253195,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.1014636076,"step":21250,"loss":0.0962,"grad_norm":1.5560466051,"learning_rate":0.0000249941,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.1261867089,"step":21500,"loss":0.0907,"grad_norm":0.5000078082,"learning_rate":0.0000246688,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.1509098101,"step":21750,"loss":0.0879,"grad_norm":0.5242018104,"learning_rate":0.0000243435,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.1756329114,"step":22000,"loss":0.0866,"grad_norm":1.4044090509,"learning_rate":0.0000240182,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.2003560127,"step":22250,"loss":0.0964,"grad_norm":0.8841950893,"learning_rate":0.0000236929,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.2250791139,"step":22500,"loss":0.0854,"grad_norm":0.8067905903,"learning_rate":0.0000233676,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.2498022152,"step":22750,"loss":0.0828,"grad_norm":1.169241786,"learning_rate":0.0000230423,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.2745253165,"step":23000,"loss":0.0928,"grad_norm":0.1877602637,"learning_rate":0.0000227183,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.2992484177,"step":23250,"loss":0.0948,"grad_norm":0.0010576567,"learning_rate":0.000022393,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.323971519,"step":23500,"loss":0.0839,"grad_norm":0.0008563405,"learning_rate":0.0000220677,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.3486946203,"step":23750,"loss":0.0949,"grad_norm":1.6228750944,"learning_rate":0.0000217424,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.3734177215,"step":24000,"loss":0.0896,"grad_norm":1.3844517469,"learning_rate":0.000021417,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.3981408228,"step":24250,"loss":0.0858,"grad_norm":1.2112687826,"learning_rate":0.0000210917,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.4228639241,"step":24500,"loss":0.0977,"grad_norm":1.0013253689,"learning_rate":0.0000207664,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.4475870253,"step":24750,"loss":0.0766,"grad_norm":0.000829512,"learning_rate":0.0000204411,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.4723101266,"step":25000,"loss":0.0784,"grad_norm":1.2415456772,"learning_rate":0.0000201158,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.4970332278,"step":25250,"loss":0.0862,"grad_norm":0.113463819,"learning_rate":0.0000197905,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.5217563291,"step":25500,"loss":0.082,"grad_norm":2.2702343464,"learning_rate":0.0000194652,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.5464794304,"step":25750,"loss":0.0827,"grad_norm":0.0194308143,"learning_rate":0.0000191399,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.5712025316,"step":26000,"loss":0.0857,"grad_norm":0.4198535383,"learning_rate":0.0000188146,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.5959256329,"step":26250,"loss":0.0919,"grad_norm":2.4839465618,"learning_rate":0.0000184893,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.6206487342,"step":26500,"loss":0.085,"grad_norm":0.4629532397,"learning_rate":0.000018164,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.6453718354,"step":26750,"loss":0.0828,"grad_norm":0.2432839572,"learning_rate":0.0000178386,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.6700949367,"step":27000,"loss":0.0881,"grad_norm":1.1563462019,"learning_rate":0.0000175146,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.694818038,"step":27250,"loss":0.0845,"grad_norm":1.5228748322,"learning_rate":0.0000171893,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.7195411392,"step":27500,"loss":0.0828,"grad_norm":0.5037127137,"learning_rate":0.000016864,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.7442642405,"step":27750,"loss":0.0914,"grad_norm":2.4526147842,"learning_rate":0.0000165387,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.7689873418,"step":28000,"loss":0.0946,"grad_norm":0.6520683765,"learning_rate":0.0000162147,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.793710443,"step":28250,"loss":0.081,"grad_norm":0.5551896691,"learning_rate":0.0000158894,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.8184335443,"step":28500,"loss":0.089,"grad_norm":0.3599557281,"learning_rate":0.0000155641,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.8431566456,"step":28750,"loss":0.0762,"grad_norm":1.3633737564,"learning_rate":0.0000152388,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.8678797468,"step":29000,"loss":0.0848,"grad_norm":0.0641759932,"learning_rate":0.0000149135,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.8926028481,"step":29250,"loss":0.089,"grad_norm":2.0974826813,"learning_rate":0.0000145882,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.9173259494,"step":29500,"loss":0.0736,"grad_norm":1.4356244802,"learning_rate":0.0000142628,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.9420490506,"step":29750,"loss":0.083,"grad_norm":1.1658858061,"learning_rate":0.0000139375,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.9667721519,"step":30000,"loss":0.0893,"grad_norm":0.3766824603,"learning_rate":0.0000136135,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":2.9914952532,"step":30250,"loss":0.089,"grad_norm":0.3294841349,"learning_rate":0.0000132882,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":0.0955977216,"eval_binary":0.48,"eval_rouge":0.9629392864,"eval_llm_as_a_judge":0.48,"eval_runtime":34.8289,"eval_samples_per_second":1.436,"eval_steps_per_second":1.436,"epoch":3.0,"step":30336,"loss":null,"grad_norm":null,"learning_rate":null,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.0162183544,"step":30500,"loss":0.0705,"grad_norm":1.3704227209,"learning_rate":0.0000129629,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.0409414557,"step":30750,"loss":0.0662,"grad_norm":0.5225652456,"learning_rate":0.0000126376,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.065664557,"step":31000,"loss":0.0739,"grad_norm":0.0499449074,"learning_rate":0.0000123123,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.0903876582,"step":31250,"loss":0.0819,"grad_norm":0.2583652139,"learning_rate":0.000011987,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.1151107595,"step":31500,"loss":0.0867,"grad_norm":0.0035025831,"learning_rate":0.0000116617,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.1398338608,"step":31750,"loss":0.0743,"grad_norm":0.6238996983,"learning_rate":0.0000113364,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.164556962,"step":32000,"loss":0.0794,"grad_norm":0.7921465039,"learning_rate":0.0000110124,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.1892800633,"step":32250,"loss":0.0757,"grad_norm":0.0061428845,"learning_rate":0.0000106884,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.2140031646,"step":32500,"loss":0.073,"grad_norm":1.9250328541,"learning_rate":0.000010363,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.2387262658,"step":32750,"loss":0.0686,"grad_norm":1.1742448807,"learning_rate":0.0000100377,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.2634493671,"step":33000,"loss":0.0724,"grad_norm":0.0734039173,"learning_rate":0.0000097124,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.2881724684,"step":33250,"loss":0.0625,"grad_norm":1.8306051493,"learning_rate":0.0000093871,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.3128955696,"step":33500,"loss":0.065,"grad_norm":0.6452906132,"learning_rate":0.0000090618,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.3376186709,"step":33750,"loss":0.0672,"grad_norm":1.2680094242,"learning_rate":0.0000087365,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.3623417722,"step":34000,"loss":0.071,"grad_norm":0.0446249433,"learning_rate":0.0000084112,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.3870648734,"step":34250,"loss":0.0694,"grad_norm":2.3345537186,"learning_rate":0.0000080859,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.4117879747,"step":34500,"loss":0.0741,"grad_norm":0.3106979132,"learning_rate":0.0000077606,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.4365110759,"step":34750,"loss":0.0778,"grad_norm":3.5347332954,"learning_rate":0.0000074353,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.4612341772,"step":35000,"loss":0.0611,"grad_norm":2.0303833485,"learning_rate":0.00000711,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.4859572785,"step":35250,"loss":0.0773,"grad_norm":0.0415618494,"learning_rate":0.0000067846,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.5106803797,"step":35500,"loss":0.0828,"grad_norm":1.2193745375,"learning_rate":0.0000064593,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.535403481,"step":35750,"loss":0.0792,"grad_norm":1.887878418,"learning_rate":0.000006134,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.5601265823,"step":36000,"loss":0.0681,"grad_norm":1.181163311,"learning_rate":0.0000058087,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.5848496835,"step":36250,"loss":0.0717,"grad_norm":1.101999402,"learning_rate":0.0000054834,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.6095727848,"step":36500,"loss":0.0772,"grad_norm":1.6269752979,"learning_rate":0.0000051581,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.6342958861,"step":36750,"loss":0.0666,"grad_norm":1.4130914211,"learning_rate":0.0000048328,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.6590189873,"step":37000,"loss":0.0719,"grad_norm":1.5563285351,"learning_rate":0.0000045088,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.6837420886,"step":37250,"loss":0.0637,"grad_norm":0.5765205622,"learning_rate":0.0000041835,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.7084651899,"step":37500,"loss":0.0693,"grad_norm":0.3765734434,"learning_rate":0.0000038595,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.7331882911,"step":37750,"loss":0.0637,"grad_norm":0.5553440452,"learning_rate":0.0000035342,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.7579113924,"step":38000,"loss":0.0627,"grad_norm":1.6216650009,"learning_rate":0.0000032088,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.7826344937,"step":38250,"loss":0.0656,"grad_norm":4.0614652634,"learning_rate":0.0000028835,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.8073575949,"step":38500,"loss":0.0646,"grad_norm":0.9978199601,"learning_rate":0.0000025582,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.8320806962,"step":38750,"loss":0.065,"grad_norm":1.2803815603,"learning_rate":0.0000022329,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.8568037975,"step":39000,"loss":0.0609,"grad_norm":1.7353192568,"learning_rate":0.0000019076,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.8815268987,"step":39250,"loss":0.073,"grad_norm":1.3712269068,"learning_rate":0.0000015823,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.90625,"step":39500,"loss":0.0771,"grad_norm":0.9225087762,"learning_rate":0.000001257,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.9309731013,"step":39750,"loss":0.0698,"grad_norm":0.0060281986,"learning_rate":0.0000009317,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.9556962025,"step":40000,"loss":0.0763,"grad_norm":1.3945635557,"learning_rate":0.0000006064,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":3.9804193038,"step":40250,"loss":0.0648,"grad_norm":2.6979146004,"learning_rate":0.0000002811,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":0.1107318476,"eval_binary":0.46,"eval_rouge":0.9610821866,"eval_llm_as_a_judge":0.46,"eval_runtime":36.7743,"eval_samples_per_second":1.36,"eval_steps_per_second":1.36,"epoch":4.0,"step":40448,"loss":null,"grad_norm":null,"learning_rate":null,"train_runtime":null,"train_samples_per_second":null,"train_steps_per_second":null,"total_flos":null,"train_loss":null}
{"eval_loss":null,"eval_binary":null,"eval_rouge":null,"eval_llm_as_a_judge":null,"eval_runtime":null,"eval_samples_per_second":null,"eval_steps_per_second":null,"epoch":4.0,"step":40448,"loss":null,"grad_norm":null,"learning_rate":null,"train_runtime":4982.372,"train_samples_per_second":8.118,"train_steps_per_second":8.118,"total_flos":4.056404094e+16,"train_loss":0.1136261352}

1
vocab.json Normal file

File diff suppressed because one or more lines are too long