Initialize project; model provided by the ModelHub XC community

Model: DCAgent/a1-curriculum_easy
Source: Original Platform
ModelHub XC
2026-05-13 12:40:16 +08:00
commit 79b9c7fc7b
23 changed files with 162346 additions and 0 deletions

.gitattributes (vendored, normal file, 36 lines added)

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
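The patterns above decide which files Git stores as LFS pointers. A minimal sketch of that check follows, assuming `fnmatchcase` is a close enough stand-in for gitattributes matching on simple basename patterns (it is not a full implementation, and it does not handle path patterns such as `saved_model/**/*`). The helper name `is_lfs_tracked` and the shortened pattern list are hypothetical.

```python
from fnmatch import fnmatchcase

# A subset of the patterns listed in .gitattributes above (assumption: basename-only matching).
LFS_PATTERNS = ["*.safetensors", "*.bin", "*.pt", "*.parquet", "*tfevents*", "tokenizer.json"]


def is_lfs_tracked(filename: str) -> bool:
    """Return True if the file name matches any of the LFS patterns."""
    return any(fnmatchcase(filename, pattern) for pattern in LFS_PATTERNS)


for name in ["model-00001-of-00004.safetensors", "tokenizer.json", "config.json", "merges.txt"]:
    print(name, is_lfs_tracked(name))
```

This is consistent with the weight shards in this commit appearing as three-line LFS pointer files rather than as binary content.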

README.md (normal file, 60 lines added)

@@ -0,0 +1,60 @@
---
library_name: transformers
license: other
base_model: Qwen/Qwen3-8B
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: sft_a1_curriculum_easy__Qwen3-8B
results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# sft_a1_curriculum_easy__Qwen3-8B
This model is a fine-tuned version of [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on the /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--exp_rpt_curriculum-easy_10k_glm_4.7_traces_jupiter/snapshots/574bd2a799ddd9e4aad0cd97610d014c430bd681_thinking_preprocessed dataset.
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 4e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 16
- total_train_batch_size: 16
- total_eval_batch_size: 128
- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.98) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 7.0
### Training results
### Framework versions
- Transformers 4.57.6
- Pytorch 2.9.1+cu130
- Datasets 4.7.0
- Tokenizers 0.22.2
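The hyperparameters listed in the README can be read as a cosine learning-rate schedule with a linear warmup over the first 10% of steps. The sketch below shows one common form of that schedule; the exact formula used by the training run is an assumption, `total_steps` is a hypothetical input (the README does not state it), and the gradient accumulation factor is inferred from `total_train_batch_size = train_batch_size * num_devices * accumulation`.

```python
import math

LEARNING_RATE = 4e-05   # learning_rate from the README
WARMUP_RATIO = 0.1      # lr_scheduler_warmup_ratio from the README

# 16 = 1 * 16 * accumulation, so the implied accumulation factor is 1 (inferred, not stated).
implied_grad_accumulation = 16 // (1 * 16)


def lr_at(step: int, total_steps: int) -> float:
    """Linear warmup followed by cosine decay to zero (assumed schedule shape)."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical run length of 1000 steps, used only for illustration.
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at(step, 1000))
```

With this shape the rate peaks at 4e-05 when warmup ends and falls to half of that midway through the decay phase.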

added_tokens.json (normal file, 28 lines added)

@@ -0,0 +1,28 @@
{
"</think>": 151668,
"</tool_call>": 151658,
"</tool_response>": 151666,
"<think>": 151667,
"<tool_call>": 151657,
"<tool_response>": 151665,
"<|box_end|>": 151649,
"<|box_start|>": 151648,
"<|endoftext|>": 151643,
"<|file_sep|>": 151664,
"<|fim_middle|>": 151660,
"<|fim_pad|>": 151662,
"<|fim_prefix|>": 151659,
"<|fim_suffix|>": 151661,
"<|im_end|>": 151645,
"<|im_start|>": 151644,
"<|image_pad|>": 151655,
"<|object_ref_end|>": 151647,
"<|object_ref_start|>": 151646,
"<|quad_end|>": 151651,
"<|quad_start|>": 151650,
"<|repo_name|>": 151663,
"<|video_pad|>": 151656,
"<|vision_end|>": 151653,
"<|vision_pad|>": 151654,
"<|vision_start|>": 151652
}
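The mapping above can be inverted to see which special tokens the IDs in `config.json` and `generation_config.json` refer to. The sketch below embeds only four of the 26 entries to stay short; the variable and function names are hypothetical.

```python
# Subset of added_tokens.json from this commit (token string -> token ID).
ADDED_TOKENS = {
    "<|endoftext|>": 151643,
    "<|im_start|>": 151644,
    "<|im_end|>": 151645,
    "<think>": 151667,
}

# Invert the mapping so that IDs can be looked up directly.
ID_TO_TOKEN = {token_id: token for token, token_id in ADDED_TOKENS.items()}


def describe(token_id: int) -> str:
    """Return the token string for an ID, or a placeholder if it is not in the subset."""
    return ID_TO_TOKEN.get(token_id, "<not in subset>")


print("eos_token_id 151645 ->", describe(151645))
print("pad_token_id 151643 ->", describe(151643))
```

Read this way, the `eos_token_id` in `config.json` is `<|im_end|>` and the `pad_token_id` is `<|endoftext|>`.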

all_results.json (normal file, 16 lines added)

@@ -0,0 +1,16 @@
{
"achieved_tflops_per_gpu": 0.0035739762958156354,
"achieved_tflops_per_gpu_theoretical": 772.4422403236824,
"epoch": 7.0,
"loss_nan_ranks": 0,
"loss_rank_avg": 0.10281726717948914,
"mfu_percent": 0.00025257783009297773,
"mfu_percent_theoretical": 54.58955762004823,
"total_flos": 1032895789924352.0,
"train_loss": 0.16274532378516346,
"train_runtime": 18062.791,
"train_samples_per_second": 3.643,
"train_steps_per_second": 0.228,
"valid_targets_mean": 3705.9,
"valid_targets_min": 1332
}
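Some totals that the file does not list directly can be derived from the reported rates. The sketch below is simple arithmetic on the values above; because the rates are rounded in the file, the derived figures are approximate, and the function name `derive_totals` is hypothetical.

```python
# Values copied from all_results.json in this commit.
results = {
    "train_runtime": 18062.791,            # seconds
    "train_samples_per_second": 3.643,
    "train_steps_per_second": 0.228,
    "epoch": 7.0,
}


def derive_totals(r: dict) -> dict:
    """Derive approximate step and sample counts from runtime and throughput."""
    steps = r["train_runtime"] * r["train_steps_per_second"]
    samples = r["train_runtime"] * r["train_samples_per_second"]
    return {
        "optimizer_steps": steps,
        "samples_seen": samples,
        "samples_per_step": samples / steps,
        "samples_per_epoch": samples / r["epoch"],
        "runtime_hours": r["train_runtime"] / 3600,
    }


for key, value in derive_totals(results).items():
    print(f"{key}: {value:.2f}")
```

The samples-per-step ratio comes out close to 16, which agrees with the `total_train_batch_size` reported in the README.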

chat_template.jinja (normal file, 89 lines added)

@@ -0,0 +1,89 @@
{%- if tools %}
{{- '<|im_start|>system\n' }}
{%- if messages[0].role == 'system' %}
{{- messages[0].content + '\n\n' }}
{%- endif %}
{{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
{%- for tool in tools %}
{{- "\n" }}
{{- tool | tojson }}
{%- endfor %}
{{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
{%- if messages[0].role == 'system' %}
{{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
{%- set index = (messages|length - 1) - loop.index0 %}
{%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
{%- set ns.multi_step_tool = false %}
{%- set ns.last_query_index = index %}
{%- endif %}
{%- endfor %}
{%- for message in messages %}
{%- if message.content is string %}
{%- set content = message.content %}
{%- else %}
{%- set content = '' %}
{%- endif %}
{%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
{{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
{%- elif message.role == "assistant" %}
{%- set reasoning_content = '' %}
{%- if message.reasoning_content is string %}
{%- set reasoning_content = message.reasoning_content %}
{%- else %}
{%- if '</think>' in content %}
{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
{%- set content = content.split('</think>')[-1].lstrip('\n') %}
{%- endif %}
{%- endif %}
{%- if loop.index0 > ns.last_query_index %}
{%- if loop.last or (not loop.last and reasoning_content) %}
{{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- if message.tool_calls %}
{%- for tool_call in message.tool_calls %}
{%- if (loop.first and content) or (not loop.first) %}
{{- '\n' }}
{%- endif %}
{%- if tool_call.function %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '<tool_call>\n{"name": "' }}
{{- tool_call.name }}
{{- '", "arguments": ' }}
{%- if tool_call.arguments is string %}
{{- tool_call.arguments }}
{%- else %}
{{- tool_call.arguments | tojson }}
{%- endif %}
{{- '}\n</tool_call>' }}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "tool" %}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|im_start|>user' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{{- content }}
{{- '\n</tool_response>' }}
{%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- if enable_thinking is defined and enable_thinking is false %}
{{- '<think>\n\n</think>\n\n' }}
{%- endif %}
{%- endif %}
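To make the template's output easier to picture, here is a simplified Python re-implementation of two of its branches: rendering plain `system`/`user` messages with the generation prompt, and splitting `<think>` reasoning out of an assistant message. It is a sketch only, not a replacement for the Jinja template: tools, tool calls, tool responses and assistant history are deliberately left out, and the function names are hypothetical.

```python
def render_prompt(messages, add_generation_prompt=True, enable_thinking=True):
    """Render system/user string messages the way the template's simple branches do."""
    parts = []
    for message in messages:
        if message["role"] not in ("system", "user"):
            raise ValueError("this sketch only covers system and user messages")
        parts.append("<|im_start|>" + message["role"] + "\n" + message["content"] + "<|im_end|>\n")
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
        # The template emits an empty think block only when enable_thinking is explicitly false.
        if enable_thinking is False:
            parts.append("<think>\n\n</think>\n\n")
    return "".join(parts)


def split_reasoning(content):
    """Mirror the template's split of assistant content into (reasoning, answer)."""
    if "</think>" not in content:
        return "", content
    reasoning = content.split("</think>")[0].rstrip("\n").split("<think>")[-1].lstrip("\n")
    answer = content.split("</think>")[-1].lstrip("\n")
    return reasoning, answer


print(render_prompt([{"role": "user", "content": "Hello"}], enable_thinking=False))
print(split_reasoning("<think>\nplan\n</think>\n\nanswer"))
```

In practice the template would normally be applied through the tokenizer that ships with the model rather than re-implemented by hand.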

config.json (normal file, 68 lines added)

@@ -0,0 +1,68 @@
{
"architectures": [
"Qwen3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"dtype": "bfloat16",
"eos_token_id": 151645,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 12288,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 40960,
"max_window_layers": 36,
"model_type": "qwen3",
"num_attention_heads": 32,
"num_hidden_layers": 36,
"num_key_value_heads": 8,
"pad_token_id": 151643,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000,
"sliding_window": null,
"tie_word_embeddings": false,
"transformers_version": "4.57.6",
"use_cache": false,
"use_sliding_window": false,
"vocab_size": 151936
}
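The configuration above is enough to estimate the parameter count. The sketch below assumes the usual Qwen3 layout suggested by the weight names in this commit: bias-free `q/k/v/o` projections, a three-matrix gated MLP, per-layer RMSNorm weights, `q_norm`/`k_norm` weights of size `head_dim`, untied input and output embeddings, and one final norm (the final norm is an assumption, since it does not appear in the part of the index shown here).

```python
# Values copied from config.json in this commit.
cfg = {
    "vocab_size": 151936,
    "hidden_size": 4096,
    "intermediate_size": 12288,
    "num_hidden_layers": 36,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "head_dim": 128,
}


def count_parameters(c: dict) -> int:
    """Estimate the parameter count under the layout assumptions stated above."""
    h, d = c["hidden_size"], c["head_dim"]
    q_out = c["num_attention_heads"] * d        # query/output projection width
    kv_out = c["num_key_value_heads"] * d       # key/value projection width (grouped-query attention)
    attention = h * q_out + 2 * h * kv_out + q_out * h
    mlp = 3 * h * c["intermediate_size"]        # gate_proj, up_proj, down_proj
    norms = 2 * h + 2 * d                       # two layer norms plus q_norm and k_norm
    per_layer = attention + mlp + norms
    embeddings = 2 * c["vocab_size"] * h        # embed_tokens and lm_head (tie_word_embeddings is false)
    return embeddings + c["num_hidden_layers"] * per_layer + h  # + final norm


total = count_parameters(cfg)
print("parameters:", total)
print("bytes at 2 bytes per bfloat16 value:", total * 2)
```

Under these assumptions the estimate is 8,190,735,360 parameters, and at two bytes per bfloat16 value that equals the `total_size` of 16381470720 bytes recorded in the safetensors index in this commit.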

generation_config.json (normal file, 12 lines added)

@@ -0,0 +1,12 @@
{
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"temperature": 0.6,
"top_k": 20,
"top_p": 0.95,
"transformers_version": "4.57.6"
}
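The sampling settings above (`temperature`, `top_k`, `top_p`) narrow the next-token distribution before a token is drawn. The following is a simplified pure-Python sketch of that idea, applied in the order temperature, then top-k, then top-p. It is an illustration under those assumptions, not the Transformers implementation, and the function name is hypothetical.

```python
import math
import random


def filter_distribution(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    scaled = [x / temperature for x in logits]
    # Keep the top_k highest-scoring tokens.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the surviving tokens (subtract the max for numerical stability).
    peak = scaled[ranked[0]]
    weights = {i: math.exp(scaled[i] - peak) for i in ranked}
    total = sum(weights.values())
    probs = {i: w / total for i, w in weights.items()}
    # Keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    norm = sum(probs[i] for i in kept)
    return {i: probs[i] / norm for i in kept}


dist = filter_distribution([2.0, 1.0, 0.0, -1.0])
token = random.choices(list(dist), weights=list(dist.values()))[0]
print(dist, token)
```

Lower temperatures sharpen the distribution, so fewer tokens are needed to reach the `top_p` mass.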

merges.txt (normal file, 151388 lines added)

File diff suppressed because it is too large.


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:33f38e794ad71a86cda029e1e73c0d1a5845aaeea38fd795af69f1ef8756cd92
size 4902257696


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4336ee58d06fd10b7cce61f2ac86cf2c5035dedc83aeab5ed231b585e17d80c3
size 4915960368


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6d1565f50df7c1e5dd07e9fb3087595e9c8f0597f5ba154f3d283d582eb9c9e9
size 4983068496


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a77d7e4c8a34b303de0b0fd47e68fa7d9a8a2078e448d02a0187063eddb1f0ab
size 1580230264


@@ -0,0 +1,407 @@
{
"metadata": {
"total_parameters": 308224,
"total_size": 16381470720
},
"weight_map": {
"lm_head.weight": "model-00004-of-00004.safetensors",
"model.embed_tokens.weight": "model-00001-of-00004.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.10.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.2.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.20.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.22.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.23.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.3.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.30.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.35.input_layernorm.weight": "model-00004-of-00004.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
"model.layers.35.self_attn.k_norm.weight": "model-00004-of-00004.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.35.self_attn.q_norm.weight": "model-00004-of-00004.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.4.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.9.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.norm.weight": "model-00004-of-00004.safetensors"
}
}

12
run_summary.json Normal file

@@ -0,0 +1,12 @@
{
"agent_name": "574bd2a799ddd9e4aad0cd97610d014c430bd681_thinking_preprocessed",
"training_start": null,
"training_end": null,
"created_by": "raoof1",
"base_model_name": "Qwen/Qwen3-8B",
"dataset_name": "/e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--exp_rpt_curriculum-easy_10k_glm_4.7_traces_jupiter/snapshots/574bd2a799ddd9e4aad0cd97610d014c430bd681_thinking_preprocessed",
"training_type": "SFT",
"training_parameters": "https://huggingface.co/DCAgent/a1-curriculum_easy/blob/main/config.json",
"wandb_link": null,
"traces_location_s3": null
}

31
special_tokens_map.json Normal file

@@ -0,0 +1,31 @@
{
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"eos_token": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

BIN
tokenizer.json (Stored with Git LFS) Normal file

Binary file not shown.

240
tokenizer_config.json Normal file

@@ -0,0 +1,240 @@
{
"add_bos_token": false,
"add_prefix_space": false,
"added_tokens_decoder": {
"151643": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151644": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151645": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151646": {
"content": "<|object_ref_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151647": {
"content": "<|object_ref_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151648": {
"content": "<|box_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151649": {
"content": "<|box_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151650": {
"content": "<|quad_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151651": {
"content": "<|quad_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151652": {
"content": "<|vision_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151653": {
"content": "<|vision_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151654": {
"content": "<|vision_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151655": {
"content": "<|image_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151656": {
"content": "<|video_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151657": {
"content": "<tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151658": {
"content": "</tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151659": {
"content": "<|fim_prefix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151660": {
"content": "<|fim_middle|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151661": {
"content": "<|fim_suffix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151662": {
"content": "<|fim_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151663": {
"content": "<|repo_name|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151664": {
"content": "<|file_sep|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151665": {
"content": "<tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151666": {
"content": "</tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151667": {
"content": "<think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151668": {
"content": "</think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"bos_token": null,
"clean_up_tokenization_spaces": false,
"eos_token": "<|im_end|>",
"errors": "replace",
"extra_special_tokens": {},
"model_max_length": 32768,
"pad_token": "<|endoftext|>",
"padding_side": "right",
"split_special_tokens": false,
"tokenizer_class": "Qwen2Tokenizer",
"unk_token": null
}

16
train_results.json Normal file

@@ -0,0 +1,16 @@
{
"achieved_tflops_per_gpu": 0.0035739762958156354,
"achieved_tflops_per_gpu_theoretical": 772.4422403236824,
"epoch": 7.0,
"loss_nan_ranks": 0,
"loss_rank_avg": 0.10281726717948914,
"mfu_percent": 0.00025257783009297773,
"mfu_percent_theoretical": 54.58955762004823,
"total_flos": 1032895789924352.0,
"train_loss": 0.16274532378516346,
"train_runtime": 18062.791,
"train_samples_per_second": 3.643,
"train_steps_per_second": 0.228,
"valid_targets_mean": 3705.9,
"valid_targets_min": 1332
}
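
Reviewer note (not part of the commit): the throughput figures in `train_results.json` can be roughly cross-checked against the `total_steps` value of 4116 logged in `trainer_log.jsonl`. The sketch below copies the reported numbers and derives the approximate step count and samples per optimizer step. Reading samples per step as an effective global batch size is an assumption, since the batch configuration is not included in this commit, and the inputs are rounded, so the results are approximate.

```python
# Values copied from train_results.json in this commit.
train_runtime = 18062.791        # seconds
samples_per_second = 3.643       # train_samples_per_second
steps_per_second = 0.228         # train_steps_per_second
epochs = 7.0

# Derived quantities (approximate, because the inputs are rounded).
approx_steps = steps_per_second * train_runtime
approx_samples = samples_per_second * train_runtime

# Assumption: this ratio approximates the effective global batch size.
samples_per_step = samples_per_second / steps_per_second
samples_per_epoch = approx_samples / epochs

print(f"approx. optimizer steps : {approx_steps:.0f}")
print(f"approx. samples per step: {samples_per_step:.1f}")
print(f"approx. samples / epoch : {samples_per_epoch:.0f}")
```

The derived step count lands within a few steps of the logged 4116, and the per-epoch sample count comes out slightly below the 10k suggested by the dataset name, which may simply reflect filtering during preprocessing.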

824
trainer_log.jsonl Normal file

@@ -0,0 +1,824 @@
{"current_steps": 5, "total_steps": 4116, "loss": 0.7008, "lr": 3.883495145631068e-07, "epoch": 0.008503401360544218, "percentage": 0.12, "elapsed_time": "0:00:32", "remaining_time": "7:27:56"}
{"current_steps": 10, "total_steps": 4116, "loss": 0.6754, "lr": 8.737864077669904e-07, "epoch": 0.017006802721088437, "percentage": 0.24, "elapsed_time": "0:00:52", "remaining_time": "6:00:16"}
{"current_steps": 15, "total_steps": 4116, "loss": 0.6342, "lr": 1.359223300970874e-06, "epoch": 0.025510204081632654, "percentage": 0.36, "elapsed_time": "0:01:12", "remaining_time": "5:29:12"}
{"current_steps": 20, "total_steps": 4116, "loss": 0.6454, "lr": 1.8446601941747574e-06, "epoch": 0.034013605442176874, "percentage": 0.49, "elapsed_time": "0:01:33", "remaining_time": "5:17:42"}
{"current_steps": 25, "total_steps": 4116, "loss": 0.5636, "lr": 2.330097087378641e-06, "epoch": 0.04251700680272109, "percentage": 0.61, "elapsed_time": "0:01:56", "remaining_time": "5:19:03"}
{"current_steps": 30, "total_steps": 4116, "loss": 0.5175, "lr": 2.8155339805825245e-06, "epoch": 0.05102040816326531, "percentage": 0.73, "elapsed_time": "0:02:20", "remaining_time": "5:18:22"}
{"current_steps": 35, "total_steps": 4116, "loss": 0.4638, "lr": 3.300970873786408e-06, "epoch": 0.05952380952380952, "percentage": 0.85, "elapsed_time": "0:02:39", "remaining_time": "5:09:34"}
{"current_steps": 40, "total_steps": 4116, "loss": 0.4355, "lr": 3.7864077669902915e-06, "epoch": 0.06802721088435375, "percentage": 0.97, "elapsed_time": "0:03:03", "remaining_time": "5:10:53"}
{"current_steps": 45, "total_steps": 4116, "loss": 0.3964, "lr": 4.271844660194175e-06, "epoch": 0.07653061224489796, "percentage": 1.09, "elapsed_time": "0:03:24", "remaining_time": "5:07:51"}
{"current_steps": 50, "total_steps": 4116, "loss": 0.3921, "lr": 4.7572815533980585e-06, "epoch": 0.08503401360544217, "percentage": 1.21, "elapsed_time": "0:03:54", "remaining_time": "5:17:18"}
{"current_steps": 55, "total_steps": 4116, "loss": 0.3858, "lr": 5.242718446601942e-06, "epoch": 0.0935374149659864, "percentage": 1.34, "elapsed_time": "0:04:15", "remaining_time": "5:14:17"}
{"current_steps": 60, "total_steps": 4116, "loss": 0.3561, "lr": 5.728155339805825e-06, "epoch": 0.10204081632653061, "percentage": 1.46, "elapsed_time": "0:04:36", "remaining_time": "5:11:43"}
{"current_steps": 65, "total_steps": 4116, "loss": 0.3645, "lr": 6.213592233009709e-06, "epoch": 0.11054421768707483, "percentage": 1.58, "elapsed_time": "0:05:02", "remaining_time": "5:14:28"}
{"current_steps": 70, "total_steps": 4116, "loss": 0.3923, "lr": 6.6990291262135935e-06, "epoch": 0.11904761904761904, "percentage": 1.7, "elapsed_time": "0:05:26", "remaining_time": "5:14:32"}
{"current_steps": 75, "total_steps": 4116, "loss": 0.3502, "lr": 7.184466019417476e-06, "epoch": 0.12755102040816327, "percentage": 1.82, "elapsed_time": "0:05:48", "remaining_time": "5:12:36"}
{"current_steps": 80, "total_steps": 4116, "loss": 0.3143, "lr": 7.66990291262136e-06, "epoch": 0.1360544217687075, "percentage": 1.94, "elapsed_time": "0:06:11", "remaining_time": "5:12:38"}
{"current_steps": 85, "total_steps": 4116, "loss": 0.3412, "lr": 8.155339805825243e-06, "epoch": 0.1445578231292517, "percentage": 2.07, "elapsed_time": "0:06:32", "remaining_time": "5:10:33"}
{"current_steps": 90, "total_steps": 4116, "loss": 0.3213, "lr": 8.640776699029127e-06, "epoch": 0.15306122448979592, "percentage": 2.19, "elapsed_time": "0:06:56", "remaining_time": "5:10:30"}
{"current_steps": 95, "total_steps": 4116, "loss": 0.3177, "lr": 9.12621359223301e-06, "epoch": 0.16156462585034015, "percentage": 2.31, "elapsed_time": "0:07:21", "remaining_time": "5:11:32"}
{"current_steps": 100, "total_steps": 4116, "loss": 0.3373, "lr": 9.611650485436894e-06, "epoch": 0.17006802721088435, "percentage": 2.43, "elapsed_time": "0:07:38", "remaining_time": "5:07:04"}
{"current_steps": 105, "total_steps": 4116, "loss": 0.3134, "lr": 1.0097087378640778e-05, "epoch": 0.17857142857142858, "percentage": 2.55, "elapsed_time": "0:08:02", "remaining_time": "5:07:12"}
{"current_steps": 110, "total_steps": 4116, "loss": 0.2831, "lr": 1.0582524271844662e-05, "epoch": 0.1870748299319728, "percentage": 2.67, "elapsed_time": "0:08:24", "remaining_time": "5:06:12"}
{"current_steps": 115, "total_steps": 4116, "loss": 0.3157, "lr": 1.1067961165048544e-05, "epoch": 0.195578231292517, "percentage": 2.79, "elapsed_time": "0:08:47", "remaining_time": "5:05:50"}
{"current_steps": 120, "total_steps": 4116, "loss": 0.2864, "lr": 1.1553398058252427e-05, "epoch": 0.20408163265306123, "percentage": 2.92, "elapsed_time": "0:09:12", "remaining_time": "5:06:24"}
{"current_steps": 125, "total_steps": 4116, "loss": 0.3084, "lr": 1.2038834951456311e-05, "epoch": 0.21258503401360543, "percentage": 3.04, "elapsed_time": "0:09:36", "remaining_time": "5:06:47"}
{"current_steps": 130, "total_steps": 4116, "loss": 0.3131, "lr": 1.2524271844660197e-05, "epoch": 0.22108843537414966, "percentage": 3.16, "elapsed_time": "0:09:57", "remaining_time": "5:05:26"}
{"current_steps": 135, "total_steps": 4116, "loss": 0.2865, "lr": 1.300970873786408e-05, "epoch": 0.22959183673469388, "percentage": 3.28, "elapsed_time": "0:10:20", "remaining_time": "5:04:47"}
{"current_steps": 140, "total_steps": 4116, "loss": 0.2783, "lr": 1.3495145631067962e-05, "epoch": 0.23809523809523808, "percentage": 3.4, "elapsed_time": "0:10:43", "remaining_time": "5:04:45"}
{"current_steps": 145, "total_steps": 4116, "loss": 0.2872, "lr": 1.3980582524271846e-05, "epoch": 0.2465986394557823, "percentage": 3.52, "elapsed_time": "0:11:06", "remaining_time": "5:04:08"}
{"current_steps": 150, "total_steps": 4116, "loss": 0.2927, "lr": 1.446601941747573e-05, "epoch": 0.25510204081632654, "percentage": 3.64, "elapsed_time": "0:11:31", "remaining_time": "5:04:40"}
{"current_steps": 155, "total_steps": 4116, "loss": 0.2733, "lr": 1.4951456310679614e-05, "epoch": 0.26360544217687076, "percentage": 3.77, "elapsed_time": "0:11:53", "remaining_time": "5:03:47"}
{"current_steps": 160, "total_steps": 4116, "loss": 0.2739, "lr": 1.5436893203883496e-05, "epoch": 0.272108843537415, "percentage": 3.89, "elapsed_time": "0:12:13", "remaining_time": "5:02:24"}
{"current_steps": 165, "total_steps": 4116, "loss": 0.2787, "lr": 1.592233009708738e-05, "epoch": 0.28061224489795916, "percentage": 4.01, "elapsed_time": "0:12:36", "remaining_time": "5:02:05"}
{"current_steps": 170, "total_steps": 4116, "loss": 0.267, "lr": 1.6407766990291263e-05, "epoch": 0.2891156462585034, "percentage": 4.13, "elapsed_time": "0:12:59", "remaining_time": "5:01:23"}
{"current_steps": 175, "total_steps": 4116, "loss": 0.2912, "lr": 1.6893203883495145e-05, "epoch": 0.2976190476190476, "percentage": 4.25, "elapsed_time": "0:13:22", "remaining_time": "5:01:04"}
{"current_steps": 180, "total_steps": 4116, "loss": 0.2688, "lr": 1.737864077669903e-05, "epoch": 0.30612244897959184, "percentage": 4.37, "elapsed_time": "0:13:43", "remaining_time": "5:00:07"}
{"current_steps": 185, "total_steps": 4116, "loss": 0.2591, "lr": 1.7864077669902916e-05, "epoch": 0.31462585034013607, "percentage": 4.49, "elapsed_time": "0:14:07", "remaining_time": "4:59:58"}
{"current_steps": 190, "total_steps": 4116, "loss": 0.2765, "lr": 1.8349514563106798e-05, "epoch": 0.3231292517006803, "percentage": 4.62, "elapsed_time": "0:14:28", "remaining_time": "4:59:05"}
{"current_steps": 195, "total_steps": 4116, "loss": 0.2895, "lr": 1.883495145631068e-05, "epoch": 0.33163265306122447, "percentage": 4.74, "elapsed_time": "0:14:48", "remaining_time": "4:57:42"}
{"current_steps": 200, "total_steps": 4116, "loss": 0.2896, "lr": 1.9320388349514565e-05, "epoch": 0.3401360544217687, "percentage": 4.86, "elapsed_time": "0:15:14", "remaining_time": "4:58:20"}
{"current_steps": 205, "total_steps": 4116, "loss": 0.2575, "lr": 1.9805825242718447e-05, "epoch": 0.3486394557823129, "percentage": 4.98, "elapsed_time": "0:15:37", "remaining_time": "4:58:03"}
{"current_steps": 210, "total_steps": 4116, "loss": 0.2622, "lr": 2.0291262135922333e-05, "epoch": 0.35714285714285715, "percentage": 5.1, "elapsed_time": "0:15:58", "remaining_time": "4:57:05"}
{"current_steps": 215, "total_steps": 4116, "loss": 0.2289, "lr": 2.0776699029126215e-05, "epoch": 0.3656462585034014, "percentage": 5.22, "elapsed_time": "0:16:17", "remaining_time": "4:55:28"}
{"current_steps": 220, "total_steps": 4116, "loss": 0.2622, "lr": 2.1262135922330097e-05, "epoch": 0.3741496598639456, "percentage": 5.34, "elapsed_time": "0:16:42", "remaining_time": "4:55:44"}
{"current_steps": 225, "total_steps": 4116, "loss": 0.2535, "lr": 2.1747572815533982e-05, "epoch": 0.3826530612244898, "percentage": 5.47, "elapsed_time": "0:17:05", "remaining_time": "4:55:35"}
{"current_steps": 230, "total_steps": 4116, "loss": 0.2535, "lr": 2.2233009708737864e-05, "epoch": 0.391156462585034, "percentage": 5.59, "elapsed_time": "0:17:30", "remaining_time": "4:55:44"}
{"current_steps": 235, "total_steps": 4116, "loss": 0.249, "lr": 2.271844660194175e-05, "epoch": 0.39965986394557823, "percentage": 5.71, "elapsed_time": "0:17:51", "remaining_time": "4:54:54"}
{"current_steps": 240, "total_steps": 4116, "loss": 0.2673, "lr": 2.3203883495145632e-05, "epoch": 0.40816326530612246, "percentage": 5.83, "elapsed_time": "0:18:18", "remaining_time": "4:55:45"}
{"current_steps": 245, "total_steps": 4116, "loss": 0.2554, "lr": 2.3689320388349514e-05, "epoch": 0.4166666666666667, "percentage": 5.95, "elapsed_time": "0:18:40", "remaining_time": "4:55:08"}
{"current_steps": 250, "total_steps": 4116, "loss": 0.2563, "lr": 2.41747572815534e-05, "epoch": 0.42517006802721086, "percentage": 6.07, "elapsed_time": "0:19:04", "remaining_time": "4:54:55"}
{"current_steps": 255, "total_steps": 4116, "loss": 0.2738, "lr": 2.466019417475728e-05, "epoch": 0.4336734693877551, "percentage": 6.2, "elapsed_time": "0:19:27", "remaining_time": "4:54:31"}
{"current_steps": 260, "total_steps": 4116, "loss": 0.2401, "lr": 2.514563106796117e-05, "epoch": 0.4421768707482993, "percentage": 6.32, "elapsed_time": "0:19:47", "remaining_time": "4:53:34"}
{"current_steps": 265, "total_steps": 4116, "loss": 0.2419, "lr": 2.5631067961165052e-05, "epoch": 0.45068027210884354, "percentage": 6.44, "elapsed_time": "0:20:09", "remaining_time": "4:52:52"}
{"current_steps": 270, "total_steps": 4116, "loss": 0.2481, "lr": 2.6116504854368934e-05, "epoch": 0.45918367346938777, "percentage": 6.56, "elapsed_time": "0:20:32", "remaining_time": "4:52:35"}
{"current_steps": 275, "total_steps": 4116, "loss": 0.2667, "lr": 2.660194174757282e-05, "epoch": 0.467687074829932, "percentage": 6.68, "elapsed_time": "0:20:50", "remaining_time": "4:51:12"}
{"current_steps": 280, "total_steps": 4116, "loss": 0.2395, "lr": 2.7087378640776702e-05, "epoch": 0.47619047619047616, "percentage": 6.8, "elapsed_time": "0:21:12", "remaining_time": "4:50:36"}
{"current_steps": 285, "total_steps": 4116, "loss": 0.2642, "lr": 2.7572815533980587e-05, "epoch": 0.4846938775510204, "percentage": 6.92, "elapsed_time": "0:21:39", "remaining_time": "4:51:13"}
{"current_steps": 290, "total_steps": 4116, "loss": 0.2415, "lr": 2.805825242718447e-05, "epoch": 0.4931972789115646, "percentage": 7.05, "elapsed_time": "0:21:59", "remaining_time": "4:50:10"}
{"current_steps": 295, "total_steps": 4116, "loss": 0.2556, "lr": 2.854368932038835e-05, "epoch": 0.5017006802721088, "percentage": 7.17, "elapsed_time": "0:22:26", "remaining_time": "4:50:40"}
{"current_steps": 300, "total_steps": 4116, "loss": 0.2277, "lr": 2.9029126213592237e-05, "epoch": 0.5102040816326531, "percentage": 7.29, "elapsed_time": "0:22:51", "remaining_time": "4:50:41"}
{"current_steps": 305, "total_steps": 4116, "loss": 0.2396, "lr": 2.951456310679612e-05, "epoch": 0.5187074829931972, "percentage": 7.41, "elapsed_time": "0:23:15", "remaining_time": "4:50:41"}
{"current_steps": 310, "total_steps": 4116, "loss": 0.2447, "lr": 3.0000000000000004e-05, "epoch": 0.5272108843537415, "percentage": 7.53, "elapsed_time": "0:23:37", "remaining_time": "4:50:01"}
{"current_steps": 315, "total_steps": 4116, "loss": 0.2321, "lr": 3.0485436893203886e-05, "epoch": 0.5357142857142857, "percentage": 7.65, "elapsed_time": "0:24:03", "remaining_time": "4:50:22"}
{"current_steps": 320, "total_steps": 4116, "loss": 0.2471, "lr": 3.097087378640777e-05, "epoch": 0.54421768707483, "percentage": 7.77, "elapsed_time": "0:24:29", "remaining_time": "4:50:35"}
{"current_steps": 325, "total_steps": 4116, "loss": 0.216, "lr": 3.1456310679611654e-05, "epoch": 0.5527210884353742, "percentage": 7.9, "elapsed_time": "0:24:51", "remaining_time": "4:49:53"}
{"current_steps": 330, "total_steps": 4116, "loss": 0.2373, "lr": 3.194174757281554e-05, "epoch": 0.5612244897959183, "percentage": 8.02, "elapsed_time": "0:25:12", "remaining_time": "4:49:07"}
{"current_steps": 335, "total_steps": 4116, "loss": 0.2503, "lr": 3.242718446601942e-05, "epoch": 0.5697278911564626, "percentage": 8.14, "elapsed_time": "0:25:41", "remaining_time": "4:50:00"}
{"current_steps": 340, "total_steps": 4116, "loss": 0.2414, "lr": 3.29126213592233e-05, "epoch": 0.5782312925170068, "percentage": 8.26, "elapsed_time": "0:26:05", "remaining_time": "4:49:40"}
{"current_steps": 345, "total_steps": 4116, "loss": 0.2365, "lr": 3.339805825242719e-05, "epoch": 0.5867346938775511, "percentage": 8.38, "elapsed_time": "0:26:24", "remaining_time": "4:48:43"}
{"current_steps": 350, "total_steps": 4116, "loss": 0.2548, "lr": 3.388349514563107e-05, "epoch": 0.5952380952380952, "percentage": 8.5, "elapsed_time": "0:26:49", "remaining_time": "4:48:41"}
{"current_steps": 355, "total_steps": 4116, "loss": 0.2874, "lr": 3.436893203883495e-05, "epoch": 0.6037414965986394, "percentage": 8.62, "elapsed_time": "0:27:14", "remaining_time": "4:48:36"}
{"current_steps": 360, "total_steps": 4116, "loss": 0.2428, "lr": 3.485436893203884e-05, "epoch": 0.6122448979591837, "percentage": 8.75, "elapsed_time": "0:27:40", "remaining_time": "4:48:48"}
{"current_steps": 365, "total_steps": 4116, "loss": 0.2201, "lr": 3.5339805825242724e-05, "epoch": 0.6207482993197279, "percentage": 8.87, "elapsed_time": "0:28:01", "remaining_time": "4:47:56"}
{"current_steps": 370, "total_steps": 4116, "loss": 0.2356, "lr": 3.582524271844661e-05, "epoch": 0.6292517006802721, "percentage": 8.99, "elapsed_time": "0:28:28", "remaining_time": "4:48:12"}
{"current_steps": 375, "total_steps": 4116, "loss": 0.2216, "lr": 3.631067961165049e-05, "epoch": 0.6377551020408163, "percentage": 9.11, "elapsed_time": "0:28:47", "remaining_time": "4:47:13"}
{"current_steps": 380, "total_steps": 4116, "loss": 0.2219, "lr": 3.679611650485437e-05, "epoch": 0.6462585034013606, "percentage": 9.23, "elapsed_time": "0:29:08", "remaining_time": "4:46:28"}
{"current_steps": 385, "total_steps": 4116, "loss": 0.2512, "lr": 3.728155339805826e-05, "epoch": 0.6547619047619048, "percentage": 9.35, "elapsed_time": "0:29:35", "remaining_time": "4:46:43"}
{"current_steps": 390, "total_steps": 4116, "loss": 0.232, "lr": 3.776699029126214e-05, "epoch": 0.6632653061224489, "percentage": 9.48, "elapsed_time": "0:29:58", "remaining_time": "4:46:19"}
{"current_steps": 395, "total_steps": 4116, "loss": 0.2218, "lr": 3.825242718446602e-05, "epoch": 0.6717687074829932, "percentage": 9.6, "elapsed_time": "0:30:16", "remaining_time": "4:45:10"}
{"current_steps": 400, "total_steps": 4116, "loss": 0.2152, "lr": 3.873786407766991e-05, "epoch": 0.6802721088435374, "percentage": 9.72, "elapsed_time": "0:30:38", "remaining_time": "4:44:39"}
{"current_steps": 405, "total_steps": 4116, "loss": 0.2262, "lr": 3.9223300970873787e-05, "epoch": 0.6887755102040817, "percentage": 9.84, "elapsed_time": "0:30:58", "remaining_time": "4:43:51"}
{"current_steps": 410, "total_steps": 4116, "loss": 0.2262, "lr": 3.970873786407767e-05, "epoch": 0.6972789115646258, "percentage": 9.96, "elapsed_time": "0:31:18", "remaining_time": "4:43:00"}
{"current_steps": 415, "total_steps": 4116, "loss": 0.2245, "lr": 3.9999971224844676e-05, "epoch": 0.70578231292517, "percentage": 10.08, "elapsed_time": "0:31:40", "remaining_time": "4:42:29"}
{"current_steps": 420, "total_steps": 4116, "loss": 0.2134, "lr": 3.9999647505298214e-05, "epoch": 0.7142857142857143, "percentage": 10.2, "elapsed_time": "0:32:01", "remaining_time": "4:41:48"}
{"current_steps": 425, "total_steps": 4116, "loss": 0.2168, "lr": 3.999896410310244e-05, "epoch": 0.7227891156462585, "percentage": 10.33, "elapsed_time": "0:32:25", "remaining_time": "4:41:35"}
{"current_steps": 430, "total_steps": 4116, "loss": 0.2399, "lr": 3.9997921030547975e-05, "epoch": 0.7312925170068028, "percentage": 10.45, "elapsed_time": "0:32:48", "remaining_time": "4:41:11"}
{"current_steps": 435, "total_steps": 4116, "loss": 0.2123, "lr": 3.9996518306393906e-05, "epoch": 0.7397959183673469, "percentage": 10.57, "elapsed_time": "0:33:09", "remaining_time": "4:40:39"}
{"current_steps": 440, "total_steps": 4116, "loss": 0.2106, "lr": 3.999475595586746e-05, "epoch": 0.7482993197278912, "percentage": 10.69, "elapsed_time": "0:33:32", "remaining_time": "4:40:12"}
{"current_steps": 445, "total_steps": 4116, "loss": 0.2243, "lr": 3.9992634010663535e-05, "epoch": 0.7568027210884354, "percentage": 10.81, "elapsed_time": "0:33:58", "remaining_time": "4:40:12"}
{"current_steps": 450, "total_steps": 4116, "loss": 0.2237, "lr": 3.999015250894415e-05, "epoch": 0.7653061224489796, "percentage": 10.93, "elapsed_time": "0:34:20", "remaining_time": "4:39:45"}
{"current_steps": 455, "total_steps": 4116, "loss": 0.2312, "lr": 3.998731149533774e-05, "epoch": 0.7738095238095238, "percentage": 11.05, "elapsed_time": "0:34:41", "remaining_time": "4:39:08"}
{"current_steps": 460, "total_steps": 4116, "loss": 0.2273, "lr": 3.9984111020938376e-05, "epoch": 0.782312925170068, "percentage": 11.18, "elapsed_time": "0:35:01", "remaining_time": "4:38:26"}
{"current_steps": 465, "total_steps": 4116, "loss": 0.2279, "lr": 3.998055114330483e-05, "epoch": 0.7908163265306123, "percentage": 11.3, "elapsed_time": "0:35:25", "remaining_time": "4:38:07"}
{"current_steps": 470, "total_steps": 4116, "loss": 0.2132, "lr": 3.9976631926459536e-05, "epoch": 0.7993197278911565, "percentage": 11.42, "elapsed_time": "0:35:46", "remaining_time": "4:37:28"}
{"current_steps": 475, "total_steps": 4116, "loss": 0.22, "lr": 3.997235344088747e-05, "epoch": 0.8078231292517006, "percentage": 11.54, "elapsed_time": "0:36:05", "remaining_time": "4:36:41"}
{"current_steps": 480, "total_steps": 4116, "loss": 0.2076, "lr": 3.9967715763534805e-05, "epoch": 0.8163265306122449, "percentage": 11.66, "elapsed_time": "0:36:30", "remaining_time": "4:36:31"}
{"current_steps": 485, "total_steps": 4116, "loss": 0.2198, "lr": 3.9962718977807637e-05, "epoch": 0.8248299319727891, "percentage": 11.78, "elapsed_time": "0:36:51", "remaining_time": "4:35:56"}
{"current_steps": 490, "total_steps": 4116, "loss": 0.2113, "lr": 3.995736317357041e-05, "epoch": 0.8333333333333334, "percentage": 11.9, "elapsed_time": "0:37:18", "remaining_time": "4:36:01"}
{"current_steps": 495, "total_steps": 4116, "loss": 0.2445, "lr": 3.9951648447144305e-05, "epoch": 0.8418367346938775, "percentage": 12.03, "elapsed_time": "0:37:41", "remaining_time": "4:35:40"}
{"current_steps": 500, "total_steps": 4116, "loss": 0.2148, "lr": 3.9945574901305534e-05, "epoch": 0.8503401360544217, "percentage": 12.15, "elapsed_time": "0:38:03", "remaining_time": "4:35:11"}
{"current_steps": 505, "total_steps": 4116, "loss": 0.2137, "lr": 3.9939142645283475e-05, "epoch": 0.858843537414966, "percentage": 12.27, "elapsed_time": "0:38:23", "remaining_time": "4:34:28"}
{"current_steps": 510, "total_steps": 4116, "loss": 0.2414, "lr": 3.9932351794758705e-05, "epoch": 0.8673469387755102, "percentage": 12.39, "elapsed_time": "0:38:46", "remaining_time": "4:34:09"}
{"current_steps": 515, "total_steps": 4116, "loss": 0.2206, "lr": 3.992520247186094e-05, "epoch": 0.8758503401360545, "percentage": 12.51, "elapsed_time": "0:39:12", "remaining_time": "4:34:08"}
{"current_steps": 520, "total_steps": 4116, "loss": 0.2099, "lr": 3.991769480516681e-05, "epoch": 0.8843537414965986, "percentage": 12.63, "elapsed_time": "0:39:32", "remaining_time": "4:33:27"}
{"current_steps": 525, "total_steps": 4116, "loss": 0.2189, "lr": 3.9909828929697575e-05, "epoch": 0.8928571428571429, "percentage": 12.76, "elapsed_time": "0:39:57", "remaining_time": "4:33:19"}
{"current_steps": 530, "total_steps": 4116, "loss": 0.2076, "lr": 3.9901604986916666e-05, "epoch": 0.9013605442176871, "percentage": 12.88, "elapsed_time": "0:40:22", "remaining_time": "4:33:10"}
{"current_steps": 535, "total_steps": 4116, "loss": 0.2128, "lr": 3.9893023124727176e-05, "epoch": 0.9098639455782312, "percentage": 13.0, "elapsed_time": "0:40:44", "remaining_time": "4:32:39"}
{"current_steps": 540, "total_steps": 4116, "loss": 0.2261, "lr": 3.988408349746916e-05, "epoch": 0.9183673469387755, "percentage": 13.12, "elapsed_time": "0:41:07", "remaining_time": "4:32:19"}
{"current_steps": 545, "total_steps": 4116, "loss": 0.2256, "lr": 3.9874786265916895e-05, "epoch": 0.9268707482993197, "percentage": 13.24, "elapsed_time": "0:41:30", "remaining_time": "4:31:58"}
{"current_steps": 550, "total_steps": 4116, "loss": 0.2176, "lr": 3.986513159727598e-05, "epoch": 0.935374149659864, "percentage": 13.36, "elapsed_time": "0:41:50", "remaining_time": "4:31:17"}
{"current_steps": 555, "total_steps": 4116, "loss": 0.2187, "lr": 3.985511966518031e-05, "epoch": 0.9438775510204082, "percentage": 13.48, "elapsed_time": "0:42:17", "remaining_time": "4:31:18"}
{"current_steps": 560, "total_steps": 4116, "loss": 0.207, "lr": 3.984475064968897e-05, "epoch": 0.9523809523809523, "percentage": 13.61, "elapsed_time": "0:42:40", "remaining_time": "4:31:00"}
{"current_steps": 565, "total_steps": 4116, "loss": 0.228, "lr": 3.9834024737282994e-05, "epoch": 0.9608843537414966, "percentage": 13.73, "elapsed_time": "0:43:06", "remaining_time": "4:30:56"}
{"current_steps": 570, "total_steps": 4116, "loss": 0.2059, "lr": 3.9822942120862016e-05, "epoch": 0.9693877551020408, "percentage": 13.85, "elapsed_time": "0:43:27", "remaining_time": "4:30:20"}
{"current_steps": 575, "total_steps": 4116, "loss": 0.2035, "lr": 3.981150299974079e-05, "epoch": 0.9778911564625851, "percentage": 13.97, "elapsed_time": "0:43:54", "remaining_time": "4:30:21"}
{"current_steps": 580, "total_steps": 4116, "loss": 0.2042, "lr": 3.979970757964561e-05, "epoch": 0.9863945578231292, "percentage": 14.09, "elapsed_time": "0:44:14", "remaining_time": "4:29:44"}
{"current_steps": 585, "total_steps": 4116, "loss": 0.2054, "lr": 3.978755607271061e-05, "epoch": 0.9948979591836735, "percentage": 14.21, "elapsed_time": "0:44:34", "remaining_time": "4:29:03"}
{"current_steps": 590, "total_steps": 4116, "loss": 0.2216, "lr": 3.977504869747395e-05, "epoch": 1.0034013605442176, "percentage": 14.33, "elapsed_time": "0:44:56", "remaining_time": "4:28:37"}
{"current_steps": 595, "total_steps": 4116, "loss": 0.2189, "lr": 3.976218567887389e-05, "epoch": 1.0119047619047619, "percentage": 14.46, "elapsed_time": "0:45:20", "remaining_time": "4:28:21"}
{"current_steps": 600, "total_steps": 4116, "loss": 0.191, "lr": 3.974896724824475e-05, "epoch": 1.0204081632653061, "percentage": 14.58, "elapsed_time": "0:45:39", "remaining_time": "4:27:35"}
{"current_steps": 605, "total_steps": 4116, "loss": 0.1928, "lr": 3.97353936433127e-05, "epoch": 1.0289115646258504, "percentage": 14.7, "elapsed_time": "0:46:00", "remaining_time": "4:26:57"}
{"current_steps": 610, "total_steps": 4116, "loss": 0.203, "lr": 3.972146510819158e-05, "epoch": 1.0374149659863945, "percentage": 14.82, "elapsed_time": "0:46:19", "remaining_time": "4:26:13"}
{"current_steps": 615, "total_steps": 4116, "loss": 0.1997, "lr": 3.970718189337841e-05, "epoch": 1.0459183673469388, "percentage": 14.94, "elapsed_time": "0:46:40", "remaining_time": "4:25:39"}
{"current_steps": 620, "total_steps": 4116, "loss": 0.2243, "lr": 3.969254425574895e-05, "epoch": 1.054421768707483, "percentage": 15.06, "elapsed_time": "0:47:04", "remaining_time": "4:25:26"}
{"current_steps": 625, "total_steps": 4116, "loss": 0.1932, "lr": 3.9677552458553046e-05, "epoch": 1.0629251700680271, "percentage": 15.18, "elapsed_time": "0:47:31", "remaining_time": "4:25:24"}
{"current_steps": 630, "total_steps": 4116, "loss": 0.2036, "lr": 3.9662206771409925e-05, "epoch": 1.0714285714285714, "percentage": 15.31, "elapsed_time": "0:47:50", "remaining_time": "4:24:44"}
{"current_steps": 635, "total_steps": 4116, "loss": 0.207, "lr": 3.964650747030332e-05, "epoch": 1.0799319727891157, "percentage": 15.43, "elapsed_time": "0:48:10", "remaining_time": "4:24:05"}
{"current_steps": 640, "total_steps": 4116, "loss": 0.2088, "lr": 3.963045483757652e-05, "epoch": 1.08843537414966, "percentage": 15.55, "elapsed_time": "0:48:32", "remaining_time": "4:23:38"}
{"current_steps": 645, "total_steps": 4116, "loss": 0.2013, "lr": 3.961404916192728e-05, "epoch": 1.096938775510204, "percentage": 15.67, "elapsed_time": "0:48:49", "remaining_time": "4:22:47"}
{"current_steps": 650, "total_steps": 4116, "loss": 0.1958, "lr": 3.959729073840265e-05, "epoch": 1.1054421768707483, "percentage": 15.79, "elapsed_time": "0:49:08", "remaining_time": "4:22:03"}
{"current_steps": 655, "total_steps": 4116, "loss": 0.1903, "lr": 3.958017986839364e-05, "epoch": 1.1139455782312926, "percentage": 15.91, "elapsed_time": "0:49:28", "remaining_time": "4:21:25"}
{"current_steps": 660, "total_steps": 4116, "loss": 0.1867, "lr": 3.956271685962985e-05, "epoch": 1.1224489795918366, "percentage": 16.03, "elapsed_time": "0:49:51", "remaining_time": "4:21:03"}
{"current_steps": 665, "total_steps": 4116, "loss": 0.2032, "lr": 3.954490202617385e-05, "epoch": 1.130952380952381, "percentage": 16.16, "elapsed_time": "0:50:14", "remaining_time": "4:20:44"}
{"current_steps": 670, "total_steps": 4116, "loss": 0.1952, "lr": 3.9526735688415624e-05, "epoch": 1.1394557823129252, "percentage": 16.28, "elapsed_time": "0:50:39", "remaining_time": "4:20:34"}
{"current_steps": 675, "total_steps": 4116, "loss": 0.2169, "lr": 3.9508218173066766e-05, "epoch": 1.1479591836734695, "percentage": 16.4, "elapsed_time": "0:51:03", "remaining_time": "4:20:15"}
{"current_steps": 680, "total_steps": 4116, "loss": 0.1962, "lr": 3.948934981315457e-05, "epoch": 1.1564625850340136, "percentage": 16.52, "elapsed_time": "0:51:21", "remaining_time": "4:19:32"}
{"current_steps": 685, "total_steps": 4116, "loss": 0.2118, "lr": 3.947013094801612e-05, "epoch": 1.1649659863945578, "percentage": 16.64, "elapsed_time": "0:51:43", "remaining_time": "4:19:06"}
{"current_steps": 690, "total_steps": 4116, "loss": 0.1898, "lr": 3.945056192329212e-05, "epoch": 1.1734693877551021, "percentage": 16.76, "elapsed_time": "0:52:04", "remaining_time": "4:18:32"}
{"current_steps": 695, "total_steps": 4116, "loss": 0.1904, "lr": 3.9430643090920705e-05, "epoch": 1.1819727891156462, "percentage": 16.89, "elapsed_time": "0:52:23", "remaining_time": "4:17:54"}
{"current_steps": 700, "total_steps": 4116, "loss": 0.2082, "lr": 3.941037480913111e-05, "epoch": 1.1904761904761905, "percentage": 17.01, "elapsed_time": "0:52:45", "remaining_time": "4:17:26"}
{"current_steps": 705, "total_steps": 4116, "loss": 0.1795, "lr": 3.9389757442437214e-05, "epoch": 1.1989795918367347, "percentage": 17.13, "elapsed_time": "0:53:07", "remaining_time": "4:17:01"}
{"current_steps": 710, "total_steps": 4116, "loss": 0.1936, "lr": 3.936879136163102e-05, "epoch": 1.2074829931972788, "percentage": 17.25, "elapsed_time": "0:53:26", "remaining_time": "4:16:24"}
{"current_steps": 715, "total_steps": 4116, "loss": 0.1889, "lr": 3.934747694377594e-05, "epoch": 1.215986394557823, "percentage": 17.37, "elapsed_time": "0:53:52", "remaining_time": "4:16:17"}
{"current_steps": 720, "total_steps": 4116, "loss": 0.1918, "lr": 3.932581457220004e-05, "epoch": 1.2244897959183674, "percentage": 17.49, "elapsed_time": "0:54:16", "remaining_time": "4:16:00"}
{"current_steps": 725, "total_steps": 4116, "loss": 0.2011, "lr": 3.9303804636489165e-05, "epoch": 1.2329931972789117, "percentage": 17.61, "elapsed_time": "0:54:39", "remaining_time": "4:15:38"}
{"current_steps": 730, "total_steps": 4116, "loss": 0.1897, "lr": 3.928144753247987e-05, "epoch": 1.2414965986394557, "percentage": 17.74, "elapsed_time": "0:55:02", "remaining_time": "4:15:19"}
{"current_steps": 735, "total_steps": 4116, "loss": 0.1963, "lr": 3.925874366225237e-05, "epoch": 1.25, "percentage": 17.86, "elapsed_time": "0:55:27", "remaining_time": "4:15:05"}
{"current_steps": 740, "total_steps": 4116, "loss": 0.168, "lr": 3.923569343412327e-05, "epoch": 1.2585034013605443, "percentage": 17.98, "elapsed_time": "0:55:46", "remaining_time": "4:14:26"}
{"current_steps": 745, "total_steps": 4116, "loss": 0.2012, "lr": 3.921229726263824e-05, "epoch": 1.2670068027210886, "percentage": 18.1, "elapsed_time": "0:56:07", "remaining_time": "4:13:55"}
{"current_steps": 750, "total_steps": 4116, "loss": 0.1937, "lr": 3.918855556856453e-05, "epoch": 1.2755102040816326, "percentage": 18.22, "elapsed_time": "0:56:31", "remaining_time": "4:13:40"}
{"current_steps": 755, "total_steps": 4116, "loss": 0.1867, "lr": 3.916446877888345e-05, "epoch": 1.284013605442177, "percentage": 18.34, "elapsed_time": "0:56:56", "remaining_time": "4:13:27"}
{"current_steps": 760, "total_steps": 4116, "loss": 0.1886, "lr": 3.914003732678264e-05, "epoch": 1.2925170068027212, "percentage": 18.46, "elapsed_time": "0:57:15", "remaining_time": "4:12:52"}
{"current_steps": 765, "total_steps": 4116, "loss": 0.1903, "lr": 3.911526165164831e-05, "epoch": 1.3010204081632653, "percentage": 18.59, "elapsed_time": "0:57:37", "remaining_time": "4:12:23"}
{"current_steps": 770, "total_steps": 4116, "loss": 0.1738, "lr": 3.909014219905736e-05, "epoch": 1.3095238095238095, "percentage": 18.71, "elapsed_time": "0:58:01", "remaining_time": "4:12:06"}
{"current_steps": 775, "total_steps": 4116, "loss": 0.1763, "lr": 3.906467942076929e-05, "epoch": 1.3180272108843538, "percentage": 18.83, "elapsed_time": "0:58:21", "remaining_time": "4:11:36"}
{"current_steps": 780, "total_steps": 4116, "loss": 0.1819, "lr": 3.903887377471816e-05, "epoch": 1.3265306122448979, "percentage": 18.95, "elapsed_time": "0:58:41", "remaining_time": "4:11:00"}
{"current_steps": 785, "total_steps": 4116, "loss": 0.2081, "lr": 3.901272572500431e-05, "epoch": 1.3350340136054422, "percentage": 19.07, "elapsed_time": "0:59:04", "remaining_time": "4:10:39"}
{"current_steps": 790, "total_steps": 4116, "loss": 0.1834, "lr": 3.8986235741886016e-05, "epoch": 1.3435374149659864, "percentage": 19.19, "elapsed_time": "0:59:26", "remaining_time": "4:10:15"}
{"current_steps": 795, "total_steps": 4116, "loss": 0.1794, "lr": 3.8959404301771035e-05, "epoch": 1.3520408163265305, "percentage": 19.31, "elapsed_time": "0:59:46", "remaining_time": "4:09:41"}
{"current_steps": 800, "total_steps": 4116, "loss": 0.191, "lr": 3.893223188720804e-05, "epoch": 1.3605442176870748, "percentage": 19.44, "elapsed_time": "1:00:07", "remaining_time": "4:09:14"}
{"current_steps": 805, "total_steps": 4116, "loss": 0.1871, "lr": 3.890471898687797e-05, "epoch": 1.369047619047619, "percentage": 19.56, "elapsed_time": "1:00:30", "remaining_time": "4:08:53"}
{"current_steps": 810, "total_steps": 4116, "loss": 0.1932, "lr": 3.887686609558516e-05, "epoch": 1.3775510204081631, "percentage": 19.68, "elapsed_time": "1:00:54", "remaining_time": "4:08:34"}
{"current_steps": 815, "total_steps": 4116, "loss": 0.2087, "lr": 3.8848673714248545e-05, "epoch": 1.3860544217687074, "percentage": 19.8, "elapsed_time": "1:01:15", "remaining_time": "4:08:06"}
{"current_steps": 820, "total_steps": 4116, "loss": 0.1922, "lr": 3.882014234989258e-05, "epoch": 1.3945578231292517, "percentage": 19.92, "elapsed_time": "1:01:36", "remaining_time": "4:07:37"}
{"current_steps": 825, "total_steps": 4116, "loss": 0.2077, "lr": 3.879127251563814e-05, "epoch": 1.403061224489796, "percentage": 20.04, "elapsed_time": "1:01:59", "remaining_time": "4:07:18"}
{"current_steps": 830, "total_steps": 4116, "loss": 0.1828, "lr": 3.87620647306933e-05, "epoch": 1.4115646258503403, "percentage": 20.17, "elapsed_time": "1:02:22", "remaining_time": "4:06:54"}
{"current_steps": 835, "total_steps": 4116, "loss": 0.2015, "lr": 3.873251952034398e-05, "epoch": 1.4200680272108843, "percentage": 20.29, "elapsed_time": "1:02:44", "remaining_time": "4:06:32"}
{"current_steps": 840, "total_steps": 4116, "loss": 0.1851, "lr": 3.8702637415944554e-05, "epoch": 1.4285714285714286, "percentage": 20.41, "elapsed_time": "1:03:08", "remaining_time": "4:06:16"}
{"current_steps": 845, "total_steps": 4116, "loss": 0.1874, "lr": 3.867241895490818e-05, "epoch": 1.4370748299319729, "percentage": 20.53, "elapsed_time": "1:03:34", "remaining_time": "4:06:04"}
{"current_steps": 850, "total_steps": 4116, "loss": 0.1956, "lr": 3.864186468069727e-05, "epoch": 1.445578231292517, "percentage": 20.65, "elapsed_time": "1:04:01", "remaining_time": "4:06:02"}
{"current_steps": 855, "total_steps": 4116, "loss": 0.1859, "lr": 3.86109751428136e-05, "epoch": 1.4540816326530612, "percentage": 20.77, "elapsed_time": "1:04:21", "remaining_time": "4:05:28"}
{"current_steps": 860, "total_steps": 4116, "loss": 0.194, "lr": 3.8579750896788526e-05, "epoch": 1.4625850340136055, "percentage": 20.89, "elapsed_time": "1:04:41", "remaining_time": "4:04:55"}
{"current_steps": 865, "total_steps": 4116, "loss": 0.1898, "lr": 3.8548192504172905e-05, "epoch": 1.4710884353741496, "percentage": 21.02, "elapsed_time": "1:05:03", "remaining_time": "4:04:30"}
{"current_steps": 870, "total_steps": 4116, "loss": 0.1958, "lr": 3.851630053252706e-05, "epoch": 1.4795918367346939, "percentage": 21.14, "elapsed_time": "1:05:26", "remaining_time": "4:04:10"}
{"current_steps": 875, "total_steps": 4116, "loss": 0.1766, "lr": 3.848407555541054e-05, "epoch": 1.4880952380952381, "percentage": 21.26, "elapsed_time": "1:05:48", "remaining_time": "4:03:44"}
{"current_steps": 880, "total_steps": 4116, "loss": 0.1986, "lr": 3.845151815237183e-05, "epoch": 1.4965986394557822, "percentage": 21.38, "elapsed_time": "1:06:05", "remaining_time": "4:03:03"}
{"current_steps": 885, "total_steps": 4116, "loss": 0.1953, "lr": 3.841862890893787e-05, "epoch": 1.5051020408163265, "percentage": 21.5, "elapsed_time": "1:06:29", "remaining_time": "4:02:46"}
{"current_steps": 890, "total_steps": 4116, "loss": 0.187, "lr": 3.838540841660363e-05, "epoch": 1.5136054421768708, "percentage": 21.62, "elapsed_time": "1:06:48", "remaining_time": "4:02:09"}
{"current_steps": 895, "total_steps": 4116, "loss": 0.1865, "lr": 3.835185727282135e-05, "epoch": 1.5221088435374148, "percentage": 21.74, "elapsed_time": "1:07:09", "remaining_time": "4:01:42"}
{"current_steps": 900, "total_steps": 4116, "loss": 0.2031, "lr": 3.8317976080989884e-05, "epoch": 1.5306122448979593, "percentage": 21.87, "elapsed_time": "1:07:32", "remaining_time": "4:01:20"}
{"current_steps": 905, "total_steps": 4116, "loss": 0.1673, "lr": 3.828376545044381e-05, "epoch": 1.5391156462585034, "percentage": 21.99, "elapsed_time": "1:07:50", "remaining_time": "4:00:42"}
{"current_steps": 910, "total_steps": 4116, "loss": 0.1961, "lr": 3.824922599644249e-05, "epoch": 1.5476190476190477, "percentage": 22.11, "elapsed_time": "1:08:14", "remaining_time": "4:00:23"}
{"current_steps": 915, "total_steps": 4116, "loss": 0.196, "lr": 3.8214358340159e-05, "epoch": 1.556122448979592, "percentage": 22.23, "elapsed_time": "1:08:36", "remaining_time": "3:59:59"}
{"current_steps": 920, "total_steps": 4116, "loss": 0.1902, "lr": 3.817916310866892e-05, "epoch": 1.564625850340136, "percentage": 22.35, "elapsed_time": "1:08:57", "remaining_time": "3:59:33"}
{"current_steps": 925, "total_steps": 4116, "loss": 0.1855, "lr": 3.814364093493912e-05, "epoch": 1.5731292517006803, "percentage": 22.47, "elapsed_time": "1:09:16", "remaining_time": "3:59:00"}
{"current_steps": 930, "total_steps": 4116, "loss": 0.1804, "lr": 3.810779245781633e-05, "epoch": 1.5816326530612246, "percentage": 22.59, "elapsed_time": "1:09:40", "remaining_time": "3:58:40"}
{"current_steps": 935, "total_steps": 4116, "loss": 0.1853, "lr": 3.807161832201569e-05, "epoch": 1.5901360544217686, "percentage": 22.72, "elapsed_time": "1:10:05", "remaining_time": "3:58:26"}
{"current_steps": 940, "total_steps": 4116, "loss": 0.1897, "lr": 3.80351191781091e-05, "epoch": 1.598639455782313, "percentage": 22.84, "elapsed_time": "1:10:31", "remaining_time": "3:58:15"}
{"current_steps": 945, "total_steps": 4116, "loss": 0.1802, "lr": 3.799829568251358e-05, "epoch": 1.6071428571428572, "percentage": 22.96, "elapsed_time": "1:10:54", "remaining_time": "3:57:56"}
{"current_steps": 950, "total_steps": 4116, "loss": 0.1862, "lr": 3.7961148497479414e-05, "epoch": 1.6156462585034013, "percentage": 23.08, "elapsed_time": "1:11:20", "remaining_time": "3:57:45"}
{"current_steps": 955, "total_steps": 4116, "loss": 0.1871, "lr": 3.792367829107828e-05, "epoch": 1.6241496598639455, "percentage": 23.2, "elapsed_time": "1:11:40", "remaining_time": "3:57:16"}
{"current_steps": 960, "total_steps": 4116, "loss": 0.1915, "lr": 3.7885885737191214e-05, "epoch": 1.6326530612244898, "percentage": 23.32, "elapsed_time": "1:12:03", "remaining_time": "3:56:54"}
{"current_steps": 965, "total_steps": 4116, "loss": 0.2057, "lr": 3.784777151549648e-05, "epoch": 1.641156462585034, "percentage": 23.45, "elapsed_time": "1:12:25", "remaining_time": "3:56:30"}
{"current_steps": 970, "total_steps": 4116, "loss": 0.1735, "lr": 3.780933631145738e-05, "epoch": 1.6496598639455784, "percentage": 23.57, "elapsed_time": "1:12:45", "remaining_time": "3:55:57"}
{"current_steps": 975, "total_steps": 4116, "loss": 0.1635, "lr": 3.777058081630989e-05, "epoch": 1.6581632653061225, "percentage": 23.69, "elapsed_time": "1:13:02", "remaining_time": "3:55:19"}
{"current_steps": 980, "total_steps": 4116, "loss": 0.1775, "lr": 3.773150572705026e-05, "epoch": 1.6666666666666665, "percentage": 23.81, "elapsed_time": "1:13:24", "remaining_time": "3:54:54"}
{"current_steps": 985, "total_steps": 4116, "loss": 0.1891, "lr": 3.769211174642245e-05, "epoch": 1.675170068027211, "percentage": 23.93, "elapsed_time": "1:13:48", "remaining_time": "3:54:37"}
{"current_steps": 990, "total_steps": 4116, "loss": 0.1759, "lr": 3.765239958290551e-05, "epoch": 1.683673469387755, "percentage": 24.05, "elapsed_time": "1:14:06", "remaining_time": "3:53:59"}
{"current_steps": 995, "total_steps": 4116, "loss": 0.1897, "lr": 3.761236995070086e-05, "epoch": 1.6921768707482994, "percentage": 24.17, "elapsed_time": "1:14:28", "remaining_time": "3:53:35"}
{"current_steps": 1000, "total_steps": 4116, "loss": 0.1819, "lr": 3.757202356971937e-05, "epoch": 1.7006802721088436, "percentage": 24.3, "elapsed_time": "1:14:53", "remaining_time": "3:53:20"}
{"current_steps": 1005, "total_steps": 4116, "loss": 0.1927, "lr": 3.75313611655685e-05, "epoch": 1.7091836734693877, "percentage": 24.42, "elapsed_time": "1:15:14", "remaining_time": "3:52:54"}
{"current_steps": 1010, "total_steps": 4116, "loss": 0.1794, "lr": 3.74903834695392e-05, "epoch": 1.717687074829932, "percentage": 24.54, "elapsed_time": "1:15:34", "remaining_time": "3:52:26"}
{"current_steps": 1015, "total_steps": 4116, "loss": 0.1713, "lr": 3.7449091218592765e-05, "epoch": 1.7261904761904763, "percentage": 24.66, "elapsed_time": "1:15:59", "remaining_time": "3:52:10"}
{"current_steps": 1020, "total_steps": 4116, "loss": 0.1692, "lr": 3.740748515534761e-05, "epoch": 1.7346938775510203, "percentage": 24.78, "elapsed_time": "1:16:19", "remaining_time": "3:51:39"}
{"current_steps": 1025, "total_steps": 4116, "loss": 0.1892, "lr": 3.736556602806587e-05, "epoch": 1.7431972789115646, "percentage": 24.9, "elapsed_time": "1:16:43", "remaining_time": "3:51:21"}
{"current_steps": 1030, "total_steps": 4116, "loss": 0.1916, "lr": 3.732333459063995e-05, "epoch": 1.751700680272109, "percentage": 25.02, "elapsed_time": "1:17:04", "remaining_time": "3:50:56"}
{"current_steps": 1035, "total_steps": 4116, "loss": 0.1874, "lr": 3.728079160257903e-05, "epoch": 1.760204081632653, "percentage": 25.15, "elapsed_time": "1:17:27", "remaining_time": "3:50:34"}
{"current_steps": 1040, "total_steps": 4116, "loss": 0.1776, "lr": 3.723793782899531e-05, "epoch": 1.7687074829931972, "percentage": 25.27, "elapsed_time": "1:17:51", "remaining_time": "3:50:15"}
{"current_steps": 1045, "total_steps": 4116, "loss": 0.1911, "lr": 3.719477404059032e-05, "epoch": 1.7772108843537415, "percentage": 25.39, "elapsed_time": "1:18:12", "remaining_time": "3:49:50"}
{"current_steps": 1050, "total_steps": 4116, "loss": 0.193, "lr": 3.715130101364103e-05, "epoch": 1.7857142857142856, "percentage": 25.51, "elapsed_time": "1:18:30", "remaining_time": "3:49:14"}
{"current_steps": 1055, "total_steps": 4116, "loss": 0.189, "lr": 3.710751952998591e-05, "epoch": 1.79421768707483, "percentage": 25.63, "elapsed_time": "1:18:52", "remaining_time": "3:48:50"}
{"current_steps": 1060, "total_steps": 4116, "loss": 0.1761, "lr": 3.7063430377010804e-05, "epoch": 1.8027210884353742, "percentage": 25.75, "elapsed_time": "1:19:13", "remaining_time": "3:48:23"}
{"current_steps": 1065, "total_steps": 4116, "loss": 0.1918, "lr": 3.7019034347634884e-05, "epoch": 1.8112244897959182, "percentage": 25.87, "elapsed_time": "1:19:35", "remaining_time": "3:48:01"}
{"current_steps": 1070, "total_steps": 4116, "loss": 0.1652, "lr": 3.6974332240296285e-05, "epoch": 1.8197278911564627, "percentage": 26.0, "elapsed_time": "1:19:53", "remaining_time": "3:47:25"}
{"current_steps": 1075, "total_steps": 4116, "loss": 0.2053, "lr": 3.692932485893779e-05, "epoch": 1.8282312925170068, "percentage": 26.12, "elapsed_time": "1:20:15", "remaining_time": "3:47:02"}
{"current_steps": 1080, "total_steps": 4116, "loss": 0.1708, "lr": 3.6884013012992384e-05, "epoch": 1.836734693877551, "percentage": 26.24, "elapsed_time": "1:20:35", "remaining_time": "3:46:32"}
{"current_steps": 1085, "total_steps": 4116, "loss": 0.2042, "lr": 3.683839751736866e-05, "epoch": 1.8452380952380953, "percentage": 26.36, "elapsed_time": "1:21:01", "remaining_time": "3:46:19"}
{"current_steps": 1090, "total_steps": 4116, "loss": 0.1999, "lr": 3.67924791924362e-05, "epoch": 1.8537414965986394, "percentage": 26.48, "elapsed_time": "1:21:20", "remaining_time": "3:45:49"}
{"current_steps": 1095, "total_steps": 4116, "loss": 0.1913, "lr": 3.6746258864010804e-05, "epoch": 1.8622448979591837, "percentage": 26.6, "elapsed_time": "1:21:41", "remaining_time": "3:45:23"}
{"current_steps": 1100, "total_steps": 4116, "loss": 0.208, "lr": 3.669973736333961e-05, "epoch": 1.870748299319728, "percentage": 26.72, "elapsed_time": "1:22:04", "remaining_time": "3:45:00"}
{"current_steps": 1105, "total_steps": 4116, "loss": 0.1712, "lr": 3.665291552708622e-05, "epoch": 1.879251700680272, "percentage": 26.85, "elapsed_time": "1:22:24", "remaining_time": "3:44:33"}
{"current_steps": 1110, "total_steps": 4116, "loss": 0.1772, "lr": 3.660579419731556e-05, "epoch": 1.8877551020408163, "percentage": 26.97, "elapsed_time": "1:22:44", "remaining_time": "3:44:05"}
{"current_steps": 1115, "total_steps": 4116, "loss": 0.186, "lr": 3.655837422147881e-05, "epoch": 1.8962585034013606, "percentage": 27.09, "elapsed_time": "1:23:05", "remaining_time": "3:43:39"}
{"current_steps": 1120, "total_steps": 4116, "loss": 0.1797, "lr": 3.6510656452398124e-05, "epoch": 1.9047619047619047, "percentage": 27.21, "elapsed_time": "1:23:28", "remaining_time": "3:43:17"}
{"current_steps": 1125, "total_steps": 4116, "loss": 0.1926, "lr": 3.646264174825131e-05, "epoch": 1.913265306122449, "percentage": 27.33, "elapsed_time": "1:23:48", "remaining_time": "3:42:47"}
{"current_steps": 1130, "total_steps": 4116, "loss": 0.1687, "lr": 3.6414330972556396e-05, "epoch": 1.9217687074829932, "percentage": 27.45, "elapsed_time": "1:24:05", "remaining_time": "3:42:13"}
{"current_steps": 1135, "total_steps": 4116, "loss": 0.1837, "lr": 3.6365724994156074e-05, "epoch": 1.9302721088435373, "percentage": 27.58, "elapsed_time": "1:24:33", "remaining_time": "3:42:04"}
{"current_steps": 1140, "total_steps": 4116, "loss": 0.1911, "lr": 3.63168246872021e-05, "epoch": 1.9387755102040818, "percentage": 27.7, "elapsed_time": "1:24:59", "remaining_time": "3:41:51"}
{"current_steps": 1145, "total_steps": 4116, "loss": 0.1788, "lr": 3.626763093113959e-05, "epoch": 1.9472789115646258, "percentage": 27.82, "elapsed_time": "1:25:19", "remaining_time": "3:41:22"}
{"current_steps": 1150, "total_steps": 4116, "loss": 0.1822, "lr": 3.621814461069114e-05, "epoch": 1.95578231292517, "percentage": 27.94, "elapsed_time": "1:25:44", "remaining_time": "3:41:08"}
{"current_steps": 1155, "total_steps": 4116, "loss": 0.1996, "lr": 3.6168366615841e-05, "epoch": 1.9642857142857144, "percentage": 28.06, "elapsed_time": "1:26:05", "remaining_time": "3:40:41"}
{"current_steps": 1160, "total_steps": 4116, "loss": 0.175, "lr": 3.6118297841818986e-05, "epoch": 1.9727891156462585, "percentage": 28.18, "elapsed_time": "1:26:27", "remaining_time": "3:40:18"}
{"current_steps": 1165, "total_steps": 4116, "loss": 0.1895, "lr": 3.6067939189084446e-05, "epoch": 1.9812925170068028, "percentage": 28.3, "elapsed_time": "1:26:48", "remaining_time": "3:39:52"}
{"current_steps": 1170, "total_steps": 4116, "loss": 0.1793, "lr": 3.6017291563310005e-05, "epoch": 1.989795918367347, "percentage": 28.43, "elapsed_time": "1:27:09", "remaining_time": "3:39:28"}
{"current_steps": 1175, "total_steps": 4116, "loss": 0.1748, "lr": 3.5966355875365344e-05, "epoch": 1.998299319727891, "percentage": 28.55, "elapsed_time": "1:27:30", "remaining_time": "3:39:02"}
{"current_steps": 1180, "total_steps": 4116, "loss": 0.166, "lr": 3.591513304130076e-05, "epoch": 2.006802721088435, "percentage": 28.67, "elapsed_time": "1:27:51", "remaining_time": "3:38:36"}
{"current_steps": 1185, "total_steps": 4116, "loss": 0.174, "lr": 3.586362398233072e-05, "epoch": 2.0153061224489797, "percentage": 28.79, "elapsed_time": "1:28:13", "remaining_time": "3:38:13"}
{"current_steps": 1190, "total_steps": 4116, "loss": 0.1638, "lr": 3.58118296248173e-05, "epoch": 2.0238095238095237, "percentage": 28.91, "elapsed_time": "1:28:40", "remaining_time": "3:38:01"}
{"current_steps": 1195, "total_steps": 4116, "loss": 0.1599, "lr": 3.57597509002535e-05, "epoch": 2.0323129251700682, "percentage": 29.03, "elapsed_time": "1:29:00", "remaining_time": "3:37:34"}
{"current_steps": 1200, "total_steps": 4116, "loss": 0.1663, "lr": 3.57073887452465e-05, "epoch": 2.0408163265306123, "percentage": 29.15, "elapsed_time": "1:29:19", "remaining_time": "3:37:02"}
{"current_steps": 1205, "total_steps": 4116, "loss": 0.1622, "lr": 3.565474410150083e-05, "epoch": 2.0493197278911564, "percentage": 29.28, "elapsed_time": "1:29:37", "remaining_time": "3:36:30"}
{"current_steps": 1210, "total_steps": 4116, "loss": 0.1726, "lr": 3.560181791580144e-05, "epoch": 2.057823129251701, "percentage": 29.4, "elapsed_time": "1:29:56", "remaining_time": "3:35:59"}
{"current_steps": 1215, "total_steps": 4116, "loss": 0.1697, "lr": 3.5548611139996634e-05, "epoch": 2.066326530612245, "percentage": 29.52, "elapsed_time": "1:30:15", "remaining_time": "3:35:29"}
{"current_steps": 1220, "total_steps": 4116, "loss": 0.1668, "lr": 3.549512473098097e-05, "epoch": 2.074829931972789, "percentage": 29.64, "elapsed_time": "1:30:41", "remaining_time": "3:35:15"}
{"current_steps": 1225, "total_steps": 4116, "loss": 0.1672, "lr": 3.544135965067807e-05, "epoch": 2.0833333333333335, "percentage": 29.76, "elapsed_time": "1:31:01", "remaining_time": "3:34:49"}
{"current_steps": 1230, "total_steps": 4116, "loss": 0.1697, "lr": 3.5387316866023314e-05, "epoch": 2.0918367346938775, "percentage": 29.88, "elapsed_time": "1:31:25", "remaining_time": "3:34:29"}
{"current_steps": 1235, "total_steps": 4116, "loss": 0.1667, "lr": 3.533299734894642e-05, "epoch": 2.1003401360544216, "percentage": 30.0, "elapsed_time": "1:31:45", "remaining_time": "3:34:03"}
{"current_steps": 1240, "total_steps": 4116, "loss": 0.1526, "lr": 3.527840207635399e-05, "epoch": 2.108843537414966, "percentage": 30.13, "elapsed_time": "1:32:11", "remaining_time": "3:33:49"}
{"current_steps": 1245, "total_steps": 4116, "loss": 0.1683, "lr": 3.522353203011197e-05, "epoch": 2.11734693877551, "percentage": 30.25, "elapsed_time": "1:32:30", "remaining_time": "3:33:20"}
{"current_steps": 1250, "total_steps": 4116, "loss": 0.1608, "lr": 3.5168388197027894e-05, "epoch": 2.1258503401360542, "percentage": 30.37, "elapsed_time": "1:32:57", "remaining_time": "3:33:07"}
{"current_steps": 1255, "total_steps": 4116, "loss": 0.1561, "lr": 3.5112971568833265e-05, "epoch": 2.1343537414965987, "percentage": 30.49, "elapsed_time": "1:33:16", "remaining_time": "3:32:39"}
{"current_steps": 1260, "total_steps": 4116, "loss": 0.1725, "lr": 3.505728314216563e-05, "epoch": 2.142857142857143, "percentage": 30.61, "elapsed_time": "1:33:39", "remaining_time": "3:32:16"}
{"current_steps": 1265, "total_steps": 4116, "loss": 0.182, "lr": 3.500132391855067e-05, "epoch": 2.1513605442176873, "percentage": 30.73, "elapsed_time": "1:34:01", "remaining_time": "3:31:53"}
{"current_steps": 1270, "total_steps": 4116, "loss": 0.1745, "lr": 3.494509490438422e-05, "epoch": 2.1598639455782314, "percentage": 30.86, "elapsed_time": "1:34:24", "remaining_time": "3:31:33"}
{"current_steps": 1275, "total_steps": 4116, "loss": 0.1702, "lr": 3.4888597110914155e-05, "epoch": 2.1683673469387754, "percentage": 30.98, "elapsed_time": "1:34:43", "remaining_time": "3:31:04"}
{"current_steps": 1280, "total_steps": 4116, "loss": 0.1551, "lr": 3.483183155422216e-05, "epoch": 2.17687074829932, "percentage": 31.1, "elapsed_time": "1:35:02", "remaining_time": "3:30:35"}
{"current_steps": 1285, "total_steps": 4116, "loss": 0.1594, "lr": 3.477479925520555e-05, "epoch": 2.185374149659864, "percentage": 31.22, "elapsed_time": "1:35:30", "remaining_time": "3:30:25"}
{"current_steps": 1290, "total_steps": 4116, "loss": 0.1569, "lr": 3.471750123955878e-05, "epoch": 2.193877551020408, "percentage": 31.34, "elapsed_time": "1:35:51", "remaining_time": "3:29:59"}
{"current_steps": 1295, "total_steps": 4116, "loss": 0.1561, "lr": 3.465993853775516e-05, "epoch": 2.2023809523809526, "percentage": 31.46, "elapsed_time": "1:36:13", "remaining_time": "3:29:37"}
{"current_steps": 1300, "total_steps": 4116, "loss": 0.1692, "lr": 3.460211218502815e-05, "epoch": 2.2108843537414966, "percentage": 31.58, "elapsed_time": "1:36:34", "remaining_time": "3:29:12"}
{"current_steps": 1305, "total_steps": 4116, "loss": 0.1773, "lr": 3.45440232213529e-05, "epoch": 2.2193877551020407, "percentage": 31.71, "elapsed_time": "1:37:04", "remaining_time": "3:29:05"}
{"current_steps": 1310, "total_steps": 4116, "loss": 0.1648, "lr": 3.4485672691427416e-05, "epoch": 2.227891156462585, "percentage": 31.83, "elapsed_time": "1:37:22", "remaining_time": "3:28:35"}
{"current_steps": 1315, "total_steps": 4116, "loss": 0.1803, "lr": 3.4427061644653873e-05, "epoch": 2.2363945578231292, "percentage": 31.95, "elapsed_time": "1:37:49", "remaining_time": "3:28:22"}
{"current_steps": 1320, "total_steps": 4116, "loss": 0.1553, "lr": 3.436819113511967e-05, "epoch": 2.2448979591836733, "percentage": 32.07, "elapsed_time": "1:38:16", "remaining_time": "3:28:08"}
{"current_steps": 1325, "total_steps": 4116, "loss": 0.1566, "lr": 3.43090622215785e-05, "epoch": 2.253401360544218, "percentage": 32.19, "elapsed_time": "1:38:38", "remaining_time": "3:27:46"}
{"current_steps": 1330, "total_steps": 4116, "loss": 0.1502, "lr": 3.4249675967431325e-05, "epoch": 2.261904761904762, "percentage": 32.31, "elapsed_time": "1:38:57", "remaining_time": "3:27:17"}
{"current_steps": 1335, "total_steps": 4116, "loss": 0.1571, "lr": 3.419003344070723e-05, "epoch": 2.270408163265306, "percentage": 32.43, "elapsed_time": "1:39:13", "remaining_time": "3:26:42"}
{"current_steps": 1340, "total_steps": 4116, "loss": 0.1729, "lr": 3.413013571404423e-05, "epoch": 2.2789115646258504, "percentage": 32.56, "elapsed_time": "1:39:36", "remaining_time": "3:26:20"}
{"current_steps": 1345, "total_steps": 4116, "loss": 0.1503, "lr": 3.406998386466995e-05, "epoch": 2.2874149659863945, "percentage": 32.68, "elapsed_time": "1:39:59", "remaining_time": "3:25:59"}
{"current_steps": 1350, "total_steps": 4116, "loss": 0.1607, "lr": 3.400957897438229e-05, "epoch": 2.295918367346939, "percentage": 32.8, "elapsed_time": "1:40:17", "remaining_time": "3:25:28"}
{"current_steps": 1355, "total_steps": 4116, "loss": 0.1615, "lr": 3.3948922129529935e-05, "epoch": 2.304421768707483, "percentage": 32.92, "elapsed_time": "1:40:37", "remaining_time": "3:25:01"}
{"current_steps": 1360, "total_steps": 4116, "loss": 0.1578, "lr": 3.388801442099286e-05, "epoch": 2.312925170068027, "percentage": 33.04, "elapsed_time": "1:41:04", "remaining_time": "3:24:48"}
{"current_steps": 1365, "total_steps": 4116, "loss": 0.1717, "lr": 3.3826856944162655e-05, "epoch": 2.3214285714285716, "percentage": 33.16, "elapsed_time": "1:41:27", "remaining_time": "3:24:28"}
{"current_steps": 1370, "total_steps": 4116, "loss": 0.181, "lr": 3.376545079892288e-05, "epoch": 2.3299319727891157, "percentage": 33.28, "elapsed_time": "1:41:48", "remaining_time": "3:24:03"}
{"current_steps": 1375, "total_steps": 4116, "loss": 0.1725, "lr": 3.3703797089629245e-05, "epoch": 2.3384353741496597, "percentage": 33.41, "elapsed_time": "1:42:05", "remaining_time": "3:23:31"}
{"current_steps": 1380, "total_steps": 4116, "loss": 0.1651, "lr": 3.364189692508976e-05, "epoch": 2.3469387755102042, "percentage": 33.53, "elapsed_time": "1:42:32", "remaining_time": "3:23:17"}
{"current_steps": 1385, "total_steps": 4116, "loss": 0.1837, "lr": 3.357975141854482e-05, "epoch": 2.3554421768707483, "percentage": 33.65, "elapsed_time": "1:42:53", "remaining_time": "3:22:52"}
{"current_steps": 1390, "total_steps": 4116, "loss": 0.1622, "lr": 3.3517361687647136e-05, "epoch": 2.3639455782312924, "percentage": 33.77, "elapsed_time": "1:43:14", "remaining_time": "3:22:27"}
{"current_steps": 1395, "total_steps": 4116, "loss": 0.1647, "lr": 3.3454728854441667e-05, "epoch": 2.372448979591837, "percentage": 33.89, "elapsed_time": "1:43:36", "remaining_time": "3:22:04"}
{"current_steps": 1400, "total_steps": 4116, "loss": 0.166, "lr": 3.3391854045345437e-05, "epoch": 2.380952380952381, "percentage": 34.01, "elapsed_time": "1:43:59", "remaining_time": "3:21:45"}
{"current_steps": 1405, "total_steps": 4116, "loss": 0.1432, "lr": 3.332873839112727e-05, "epoch": 2.389455782312925, "percentage": 34.14, "elapsed_time": "1:44:19", "remaining_time": "3:21:18"}
{"current_steps": 1410, "total_steps": 4116, "loss": 0.149, "lr": 3.3265383026887464e-05, "epoch": 2.3979591836734695, "percentage": 34.26, "elapsed_time": "1:44:40", "remaining_time": "3:20:53"}
{"current_steps": 1415, "total_steps": 4116, "loss": 0.1672, "lr": 3.320178909203735e-05, "epoch": 2.4064625850340136, "percentage": 34.38, "elapsed_time": "1:45:06", "remaining_time": "3:20:38"}
{"current_steps": 1420, "total_steps": 4116, "loss": 0.1639, "lr": 3.313795773027885e-05, "epoch": 2.4149659863945576, "percentage": 34.5, "elapsed_time": "1:45:27", "remaining_time": "3:20:13"}
{"current_steps": 1425, "total_steps": 4116, "loss": 0.1554, "lr": 3.307389008958385e-05, "epoch": 2.423469387755102, "percentage": 34.62, "elapsed_time": "1:45:41", "remaining_time": "3:19:34"}
{"current_steps": 1430, "total_steps": 4116, "loss": 0.1547, "lr": 3.30095873221736e-05, "epoch": 2.431972789115646, "percentage": 34.74, "elapsed_time": "1:46:03", "remaining_time": "3:19:13"}
{"current_steps": 1435, "total_steps": 4116, "loss": 0.1695, "lr": 3.294505058449797e-05, "epoch": 2.4404761904761907, "percentage": 34.86, "elapsed_time": "1:46:26", "remaining_time": "3:18:51"}
{"current_steps": 1440, "total_steps": 4116, "loss": 0.1646, "lr": 3.288028103721465e-05, "epoch": 2.4489795918367347, "percentage": 34.99, "elapsed_time": "1:46:50", "remaining_time": "3:18:32"}
{"current_steps": 1445, "total_steps": 4116, "loss": 0.1634, "lr": 3.2815279845168296e-05, "epoch": 2.457482993197279, "percentage": 35.11, "elapsed_time": "1:47:10", "remaining_time": "3:18:07"}
{"current_steps": 1450, "total_steps": 4116, "loss": 0.1527, "lr": 3.2750048177369554e-05, "epoch": 2.4659863945578233, "percentage": 35.23, "elapsed_time": "1:47:31", "remaining_time": "3:17:41"}
{"current_steps": 1455, "total_steps": 4116, "loss": 0.1677, "lr": 3.268458720697406e-05, "epoch": 2.4744897959183674, "percentage": 35.35, "elapsed_time": "1:47:53", "remaining_time": "3:17:19"}
{"current_steps": 1460, "total_steps": 4116, "loss": 0.1659, "lr": 3.2618898111261316e-05, "epoch": 2.4829931972789114, "percentage": 35.47, "elapsed_time": "1:48:14", "remaining_time": "3:16:54"}
{"current_steps": 1465, "total_steps": 4116, "loss": 0.15, "lr": 3.2552982071613555e-05, "epoch": 2.491496598639456, "percentage": 35.59, "elapsed_time": "1:48:34", "remaining_time": "3:16:28"}
{"current_steps": 1470, "total_steps": 4116, "loss": 0.1549, "lr": 3.248684027349445e-05, "epoch": 2.5, "percentage": 35.71, "elapsed_time": "1:48:57", "remaining_time": "3:16:07"}
{"current_steps": 1475, "total_steps": 4116, "loss": 0.1635, "lr": 3.2420473906427824e-05, "epoch": 2.508503401360544, "percentage": 35.84, "elapsed_time": "1:49:18", "remaining_time": "3:15:43"}
{"current_steps": 1480, "total_steps": 4116, "loss": 0.1512, "lr": 3.235388416397626e-05, "epoch": 2.5170068027210886, "percentage": 35.96, "elapsed_time": "1:49:41", "remaining_time": "3:15:21"}
{"current_steps": 1485, "total_steps": 4116, "loss": 0.1599, "lr": 3.228707224371961e-05, "epoch": 2.5255102040816326, "percentage": 36.08, "elapsed_time": "1:50:03", "remaining_time": "3:15:00"}
{"current_steps": 1490, "total_steps": 4116, "loss": 0.1654, "lr": 3.222003934723347e-05, "epoch": 2.534013605442177, "percentage": 36.2, "elapsed_time": "1:50:26", "remaining_time": "3:14:38"}
{"current_steps": 1495, "total_steps": 4116, "loss": 0.1636, "lr": 3.215278668006757e-05, "epoch": 2.542517006802721, "percentage": 36.32, "elapsed_time": "1:50:49", "remaining_time": "3:14:18"}
{"current_steps": 1500, "total_steps": 4116, "loss": 0.1552, "lr": 3.208531545172413e-05, "epoch": 2.5510204081632653, "percentage": 36.44, "elapsed_time": "1:51:12", "remaining_time": "3:13:56"}
{"current_steps": 1505, "total_steps": 4116, "loss": 0.1546, "lr": 3.2017626875636024e-05, "epoch": 2.5595238095238093, "percentage": 36.56, "elapsed_time": "1:51:44", "remaining_time": "3:13:51"}
{"current_steps": 1510, "total_steps": 4116, "loss": 0.1529, "lr": 3.194972216914502e-05, "epoch": 2.568027210884354, "percentage": 36.69, "elapsed_time": "1:52:03", "remaining_time": "3:13:24"}
{"current_steps": 1515, "total_steps": 4116, "loss": 0.1888, "lr": 3.18816025534799e-05, "epoch": 2.576530612244898, "percentage": 36.81, "elapsed_time": "1:52:29", "remaining_time": "3:13:06"}
{"current_steps": 1520, "total_steps": 4116, "loss": 0.177, "lr": 3.181326925373443e-05, "epoch": 2.5850340136054424, "percentage": 36.93, "elapsed_time": "1:52:48", "remaining_time": "3:12:39"}
{"current_steps": 1525, "total_steps": 4116, "loss": 0.1794, "lr": 3.174472349884539e-05, "epoch": 2.5935374149659864, "percentage": 37.05, "elapsed_time": "1:53:11", "remaining_time": "3:12:18"}
{"current_steps": 1530, "total_steps": 4116, "loss": 0.1616, "lr": 3.1675966521570444e-05, "epoch": 2.6020408163265305, "percentage": 37.17, "elapsed_time": "1:53:30", "remaining_time": "3:11:51"}
{"current_steps": 1535, "total_steps": 4116, "loss": 0.1658, "lr": 3.160699955846597e-05, "epoch": 2.6105442176870746, "percentage": 37.29, "elapsed_time": "1:54:00", "remaining_time": "3:11:42"}
{"current_steps": 1540, "total_steps": 4116, "loss": 0.1703, "lr": 3.1537823849864834e-05, "epoch": 2.619047619047619, "percentage": 37.41, "elapsed_time": "1:54:21", "remaining_time": "3:11:17"}
{"current_steps": 1545, "total_steps": 4116, "loss": 0.1535, "lr": 3.1468440639854056e-05, "epoch": 2.627551020408163, "percentage": 37.54, "elapsed_time": "1:54:38", "remaining_time": "3:10:47"}
{"current_steps": 1550, "total_steps": 4116, "loss": 0.1587, "lr": 3.139885117625249e-05, "epoch": 2.6360544217687076, "percentage": 37.66, "elapsed_time": "1:54:59", "remaining_time": "3:10:22"}
{"current_steps": 1555, "total_steps": 4116, "loss": 0.1594, "lr": 3.132905671058831e-05, "epoch": 2.6445578231292517, "percentage": 37.78, "elapsed_time": "1:55:19", "remaining_time": "3:09:56"}
{"current_steps": 1560, "total_steps": 4116, "loss": 0.158, "lr": 3.1259058498076554e-05, "epoch": 2.6530612244897958, "percentage": 37.9, "elapsed_time": "1:55:44", "remaining_time": "3:09:38"}
{"current_steps": 1565, "total_steps": 4116, "loss": 0.1568, "lr": 3.1188857797596545e-05, "epoch": 2.6615646258503403, "percentage": 38.02, "elapsed_time": "1:56:05", "remaining_time": "3:09:13"}
{"current_steps": 1570, "total_steps": 4116, "loss": 0.1654, "lr": 3.111845587166922e-05, "epoch": 2.6700680272108843, "percentage": 38.14, "elapsed_time": "1:56:28", "remaining_time": "3:08:53"}
{"current_steps": 1575, "total_steps": 4116, "loss": 0.1648, "lr": 3.104785398643447e-05, "epoch": 2.678571428571429, "percentage": 38.27, "elapsed_time": "1:56:48", "remaining_time": "3:08:27"}
{"current_steps": 1580, "total_steps": 4116, "loss": 0.1462, "lr": 3.0977053411628315e-05, "epoch": 2.687074829931973, "percentage": 38.39, "elapsed_time": "1:57:13", "remaining_time": "3:08:09"}
{"current_steps": 1585, "total_steps": 4116, "loss": 0.1776, "lr": 3.090605542056013e-05, "epoch": 2.695578231292517, "percentage": 38.51, "elapsed_time": "1:57:37", "remaining_time": "3:07:49"}
{"current_steps": 1590, "total_steps": 4116, "loss": 0.1664, "lr": 3.083486129008967e-05, "epoch": 2.704081632653061, "percentage": 38.63, "elapsed_time": "1:58:01", "remaining_time": "3:07:30"}
{"current_steps": 1595, "total_steps": 4116, "loss": 0.1602, "lr": 3.07634723006042e-05, "epoch": 2.7125850340136055, "percentage": 38.75, "elapsed_time": "1:58:20", "remaining_time": "3:07:03"}
{"current_steps": 1600, "total_steps": 4116, "loss": 0.1513, "lr": 3.069188973599538e-05, "epoch": 2.7210884353741496, "percentage": 38.87, "elapsed_time": "1:58:40", "remaining_time": "3:06:37"}
{"current_steps": 1605, "total_steps": 4116, "loss": 0.1562, "lr": 3.062011488363623e-05, "epoch": 2.729591836734694, "percentage": 38.99, "elapsed_time": "1:59:02", "remaining_time": "3:06:14"}
{"current_steps": 1610, "total_steps": 4116, "loss": 0.1624, "lr": 3.054814903435794e-05, "epoch": 2.738095238095238, "percentage": 39.12, "elapsed_time": "1:59:23", "remaining_time": "3:05:49"}
{"current_steps": 1615, "total_steps": 4116, "loss": 0.1705, "lr": 3.0475993482426695e-05, "epoch": 2.746598639455782, "percentage": 39.24, "elapsed_time": "1:59:45", "remaining_time": "3:05:27"}
{"current_steps": 1620, "total_steps": 4116, "loss": 0.1727, "lr": 3.0403649525520374e-05, "epoch": 2.7551020408163263, "percentage": 39.36, "elapsed_time": "2:00:11", "remaining_time": "3:05:10"}
{"current_steps": 1625, "total_steps": 4116, "loss": 0.156, "lr": 3.033111846470521e-05, "epoch": 2.7636054421768708, "percentage": 39.48, "elapsed_time": "2:00:31", "remaining_time": "3:04:45"}
{"current_steps": 1630, "total_steps": 4116, "loss": 0.1562, "lr": 3.02584016044124e-05, "epoch": 2.772108843537415, "percentage": 39.6, "elapsed_time": "2:00:48", "remaining_time": "3:04:15"}
{"current_steps": 1635, "total_steps": 4116, "loss": 0.1613, "lr": 3.0185500252414633e-05, "epoch": 2.7806122448979593, "percentage": 39.72, "elapsed_time": "2:01:12", "remaining_time": "3:03:55"}
{"current_steps": 1640, "total_steps": 4116, "loss": 0.1635, "lr": 3.01124157198026e-05, "epoch": 2.7891156462585034, "percentage": 39.84, "elapsed_time": "2:01:32", "remaining_time": "3:03:29"}
{"current_steps": 1645, "total_steps": 4116, "loss": 0.1554, "lr": 3.003914932096137e-05, "epoch": 2.7976190476190474, "percentage": 39.97, "elapsed_time": "2:01:50", "remaining_time": "3:03:01"}
{"current_steps": 1650, "total_steps": 4116, "loss": 0.1726, "lr": 2.99657023735468e-05, "epoch": 2.806122448979592, "percentage": 40.09, "elapsed_time": "2:02:12", "remaining_time": "3:02:38"}
{"current_steps": 1655, "total_steps": 4116, "loss": 0.153, "lr": 2.9892076198461794e-05, "epoch": 2.814625850340136, "percentage": 40.21, "elapsed_time": "2:02:32", "remaining_time": "3:02:13"}
{"current_steps": 1660, "total_steps": 4116, "loss": 0.1598, "lr": 2.9818272119832576e-05, "epoch": 2.8231292517006805, "percentage": 40.33, "elapsed_time": "2:02:55", "remaining_time": "3:01:51"}
{"current_steps": 1665, "total_steps": 4116, "loss": 0.1546, "lr": 2.974429146498487e-05, "epoch": 2.8316326530612246, "percentage": 40.45, "elapsed_time": "2:03:21", "remaining_time": "3:01:36"}
{"current_steps": 1670, "total_steps": 4116, "loss": 0.1596, "lr": 2.9670135564420016e-05, "epoch": 2.8401360544217686, "percentage": 40.57, "elapsed_time": "2:03:43", "remaining_time": "3:01:12"}
{"current_steps": 1675, "total_steps": 4116, "loss": 0.1684, "lr": 2.9595805751791068e-05, "epoch": 2.8486394557823127, "percentage": 40.69, "elapsed_time": "2:04:04", "remaining_time": "3:00:49"}
{"current_steps": 1680, "total_steps": 4116, "loss": 0.152, "lr": 2.9521303363878774e-05, "epoch": 2.857142857142857, "percentage": 40.82, "elapsed_time": "2:04:24", "remaining_time": "3:00:23"}
{"current_steps": 1685, "total_steps": 4116, "loss": 0.1741, "lr": 2.9446629740567566e-05, "epoch": 2.8656462585034013, "percentage": 40.94, "elapsed_time": "2:04:50", "remaining_time": "3:00:07"}
{"current_steps": 1690, "total_steps": 4116, "loss": 0.1735, "lr": 2.9371786224821447e-05, "epoch": 2.8741496598639458, "percentage": 41.06, "elapsed_time": "2:05:11", "remaining_time": "2:59:43"}
{"current_steps": 1695, "total_steps": 4116, "loss": 0.1749, "lr": 2.9296774162659836e-05, "epoch": 2.88265306122449, "percentage": 41.18, "elapsed_time": "2:05:36", "remaining_time": "2:59:25"}
{"current_steps": 1700, "total_steps": 4116, "loss": 0.1592, "lr": 2.9221594903133385e-05, "epoch": 2.891156462585034, "percentage": 41.3, "elapsed_time": "2:05:54", "remaining_time": "2:58:56"}
{"current_steps": 1705, "total_steps": 4116, "loss": 0.162, "lr": 2.9146249798299672e-05, "epoch": 2.8996598639455784, "percentage": 41.42, "elapsed_time": "2:06:16", "remaining_time": "2:58:34"}
{"current_steps": 1710, "total_steps": 4116, "loss": 0.1741, "lr": 2.9070740203198927e-05, "epoch": 2.9081632653061225, "percentage": 41.55, "elapsed_time": "2:06:38", "remaining_time": "2:58:11"}
{"current_steps": 1715, "total_steps": 4116, "loss": 0.1631, "lr": 2.899506747582965e-05, "epoch": 2.9166666666666665, "percentage": 41.67, "elapsed_time": "2:07:01", "remaining_time": "2:57:50"}
{"current_steps": 1720, "total_steps": 4116, "loss": 0.1694, "lr": 2.8919232977124177e-05, "epoch": 2.925170068027211, "percentage": 41.79, "elapsed_time": "2:07:24", "remaining_time": "2:57:29"}
{"current_steps": 1725, "total_steps": 4116, "loss": 0.1602, "lr": 2.884323807092423e-05, "epoch": 2.933673469387755, "percentage": 41.91, "elapsed_time": "2:07:47", "remaining_time": "2:57:08"}
{"current_steps": 1730, "total_steps": 4116, "loss": 0.1706, "lr": 2.8767084123956334e-05, "epoch": 2.942176870748299, "percentage": 42.03, "elapsed_time": "2:08:10", "remaining_time": "2:56:46"}
{"current_steps": 1735, "total_steps": 4116, "loss": 0.173, "lr": 2.8690772505807307e-05, "epoch": 2.9506802721088436, "percentage": 42.15, "elapsed_time": "2:08:35", "remaining_time": "2:56:28"}
{"current_steps": 1740, "total_steps": 4116, "loss": 0.1467, "lr": 2.8614304588899588e-05, "epoch": 2.9591836734693877, "percentage": 42.27, "elapsed_time": "2:08:52", "remaining_time": "2:55:59"}
{"current_steps": 1745, "total_steps": 4116, "loss": 0.1634, "lr": 2.8537681748466557e-05, "epoch": 2.967687074829932, "percentage": 42.4, "elapsed_time": "2:09:17", "remaining_time": "2:55:40"}
{"current_steps": 1750, "total_steps": 4116, "loss": 0.149, "lr": 2.8460905362527832e-05, "epoch": 2.9761904761904763, "percentage": 42.52, "elapsed_time": "2:09:37", "remaining_time": "2:55:14"}
{"current_steps": 1755, "total_steps": 4116, "loss": 0.1708, "lr": 2.8383976811864416e-05, "epoch": 2.9846938775510203, "percentage": 42.64, "elapsed_time": "2:09:55", "remaining_time": "2:54:47"}
{"current_steps": 1760, "total_steps": 4116, "loss": 0.1723, "lr": 2.8306897479993954e-05, "epoch": 2.9931972789115644, "percentage": 42.76, "elapsed_time": "2:10:16", "remaining_time": "2:54:23"}
{"current_steps": 1765, "total_steps": 4116, "loss": 0.1735, "lr": 2.8229668753145797e-05, "epoch": 3.001700680272109, "percentage": 42.88, "elapsed_time": "2:10:36", "remaining_time": "2:53:57"}
{"current_steps": 1770, "total_steps": 4116, "loss": 0.1375, "lr": 2.815229202023607e-05, "epoch": 3.010204081632653, "percentage": 43.0, "elapsed_time": "2:10:56", "remaining_time": "2:53:33"}
{"current_steps": 1775, "total_steps": 4116, "loss": 0.1425, "lr": 2.8074768672842716e-05, "epoch": 3.0187074829931975, "percentage": 43.12, "elapsed_time": "2:11:21", "remaining_time": "2:53:14"}
{"current_steps": 1780, "total_steps": 4116, "loss": 0.1358, "lr": 2.7997100105180455e-05, "epoch": 3.0272108843537415, "percentage": 43.25, "elapsed_time": "2:11:46", "remaining_time": "2:52:56"}
{"current_steps": 1785, "total_steps": 4116, "loss": 0.1546, "lr": 2.7919287714075714e-05, "epoch": 3.0357142857142856, "percentage": 43.37, "elapsed_time": "2:12:09", "remaining_time": "2:52:34"}
{"current_steps": 1790, "total_steps": 4116, "loss": 0.1452, "lr": 2.7841332898941513e-05, "epoch": 3.04421768707483, "percentage": 43.49, "elapsed_time": "2:12:32", "remaining_time": "2:52:13"}
{"current_steps": 1795, "total_steps": 4116, "loss": 0.1309, "lr": 2.7763237061752276e-05, "epoch": 3.052721088435374, "percentage": 43.61, "elapsed_time": "2:12:52", "remaining_time": "2:51:48"}
{"current_steps": 1800, "total_steps": 4116, "loss": 0.1377, "lr": 2.768500160701864e-05, "epoch": 3.061224489795918, "percentage": 43.73, "elapsed_time": "2:13:18", "remaining_time": "2:51:31"}
{"current_steps": 1805, "total_steps": 4116, "loss": 0.1333, "lr": 2.7606627941762186e-05, "epoch": 3.0697278911564627, "percentage": 43.85, "elapsed_time": "2:13:38", "remaining_time": "2:51:06"}
{"current_steps": 1810, "total_steps": 4116, "loss": 0.1512, "lr": 2.752811747549013e-05, "epoch": 3.078231292517007, "percentage": 43.97, "elapsed_time": "2:13:59", "remaining_time": "2:50:42"}
{"current_steps": 1815, "total_steps": 4116, "loss": 0.149, "lr": 2.7449471620169988e-05, "epoch": 3.086734693877551, "percentage": 44.1, "elapsed_time": "2:14:20", "remaining_time": "2:50:18"}
{"current_steps": 1820, "total_steps": 4116, "loss": 0.1655, "lr": 2.7370691790204173e-05, "epoch": 3.0952380952380953, "percentage": 44.22, "elapsed_time": "2:14:43", "remaining_time": "2:49:58"}
{"current_steps": 1825, "total_steps": 4116, "loss": 0.1423, "lr": 2.7291779402404538e-05, "epoch": 3.1037414965986394, "percentage": 44.34, "elapsed_time": "2:15:04", "remaining_time": "2:49:34"}
{"current_steps": 1830, "total_steps": 4116, "loss": 0.1374, "lr": 2.7212735875966946e-05, "epoch": 3.1122448979591835, "percentage": 44.46, "elapsed_time": "2:15:25", "remaining_time": "2:49:10"}
{"current_steps": 1835, "total_steps": 4116, "loss": 0.143, "lr": 2.713356263244569e-05, "epoch": 3.120748299319728, "percentage": 44.58, "elapsed_time": "2:15:46", "remaining_time": "2:48:46"}
{"current_steps": 1840, "total_steps": 4116, "loss": 0.1494, "lr": 2.7054261095727977e-05, "epoch": 3.129251700680272, "percentage": 44.7, "elapsed_time": "2:16:12", "remaining_time": "2:48:28"}
{"current_steps": 1845, "total_steps": 4116, "loss": 0.1511, "lr": 2.6974832692008278e-05, "epoch": 3.137755102040816, "percentage": 44.83, "elapsed_time": "2:16:32", "remaining_time": "2:48:04"}
{"current_steps": 1850, "total_steps": 4116, "loss": 0.1409, "lr": 2.6895278849762694e-05, "epoch": 3.1462585034013606, "percentage": 44.95, "elapsed_time": "2:16:54", "remaining_time": "2:47:41"}
{"current_steps": 1855, "total_steps": 4116, "loss": 0.1374, "lr": 2.68156009997233e-05, "epoch": 3.1547619047619047, "percentage": 45.07, "elapsed_time": "2:17:14", "remaining_time": "2:47:16"}
{"current_steps": 1860, "total_steps": 4116, "loss": 0.1376, "lr": 2.673580057485234e-05, "epoch": 3.163265306122449, "percentage": 45.19, "elapsed_time": "2:17:36", "remaining_time": "2:46:54"}
{"current_steps": 1865, "total_steps": 4116, "loss": 0.1616, "lr": 2.6655879010316535e-05, "epoch": 3.171768707482993, "percentage": 45.31, "elapsed_time": "2:17:54", "remaining_time": "2:46:27"}
{"current_steps": 1870, "total_steps": 4116, "loss": 0.1426, "lr": 2.657583774346121e-05, "epoch": 3.1802721088435373, "percentage": 45.43, "elapsed_time": "2:18:15", "remaining_time": "2:46:04"}
{"current_steps": 1875, "total_steps": 4116, "loss": 0.1439, "lr": 2.649567821378449e-05, "epoch": 3.188775510204082, "percentage": 45.55, "elapsed_time": "2:18:37", "remaining_time": "2:45:40"}
{"current_steps": 1880, "total_steps": 4116, "loss": 0.1321, "lr": 2.641540186291138e-05, "epoch": 3.197278911564626, "percentage": 45.68, "elapsed_time": "2:19:02", "remaining_time": "2:45:21"}
{"current_steps": 1885, "total_steps": 4116, "loss": 0.1446, "lr": 2.6335010134567852e-05, "epoch": 3.20578231292517, "percentage": 45.8, "elapsed_time": "2:19:21", "remaining_time": "2:44:56"}
{"current_steps": 1890, "total_steps": 4116, "loss": 0.1567, "lr": 2.6254504474554883e-05, "epoch": 3.2142857142857144, "percentage": 45.92, "elapsed_time": "2:19:41", "remaining_time": "2:44:31"}
{"current_steps": 1895, "total_steps": 4116, "loss": 0.1362, "lr": 2.617388633072244e-05, "epoch": 3.2227891156462585, "percentage": 46.04, "elapsed_time": "2:20:00", "remaining_time": "2:44:06"}
{"current_steps": 1900, "total_steps": 4116, "loss": 0.1451, "lr": 2.609315715294346e-05, "epoch": 3.2312925170068025, "percentage": 46.16, "elapsed_time": "2:20:21", "remaining_time": "2:43:42"}
{"current_steps": 1905, "total_steps": 4116, "loss": 0.143, "lr": 2.6012318393087754e-05, "epoch": 3.239795918367347, "percentage": 46.28, "elapsed_time": "2:20:40", "remaining_time": "2:43:16"}
{"current_steps": 1910, "total_steps": 4116, "loss": 0.1398, "lr": 2.593137150499591e-05, "epoch": 3.248299319727891, "percentage": 46.4, "elapsed_time": "2:21:01", "remaining_time": "2:42:52"}
{"current_steps": 1915, "total_steps": 4116, "loss": 0.1363, "lr": 2.5850317944453153e-05, "epoch": 3.2568027210884356, "percentage": 46.53, "elapsed_time": "2:21:18", "remaining_time": "2:42:25"}
{"current_steps": 1920, "total_steps": 4116, "loss": 0.146, "lr": 2.5769159169163137e-05, "epoch": 3.2653061224489797, "percentage": 46.65, "elapsed_time": "2:21:43", "remaining_time": "2:42:06"}
{"current_steps": 1925, "total_steps": 4116, "loss": 0.1353, "lr": 2.568789663872175e-05, "epoch": 3.2738095238095237, "percentage": 46.77, "elapsed_time": "2:22:05", "remaining_time": "2:41:43"}
{"current_steps": 1930, "total_steps": 4116, "loss": 0.1366, "lr": 2.5606531814590868e-05, "epoch": 3.282312925170068, "percentage": 46.89, "elapsed_time": "2:22:27", "remaining_time": "2:41:20"}
{"current_steps": 1935, "total_steps": 4116, "loss": 0.1629, "lr": 2.552506616007205e-05, "epoch": 3.2908163265306123, "percentage": 47.01, "elapsed_time": "2:22:54", "remaining_time": "2:41:04"}
{"current_steps": 1940, "total_steps": 4116, "loss": 0.1369, "lr": 2.544350114028025e-05, "epoch": 3.2993197278911564, "percentage": 47.13, "elapsed_time": "2:23:14", "remaining_time": "2:40:39"}
{"current_steps": 1945, "total_steps": 4116, "loss": 0.1544, "lr": 2.5361838222117435e-05, "epoch": 3.307823129251701, "percentage": 47.25, "elapsed_time": "2:23:35", "remaining_time": "2:40:16"}
{"current_steps": 1950, "total_steps": 4116, "loss": 0.1413, "lr": 2.5280078874246227e-05, "epoch": 3.316326530612245, "percentage": 47.38, "elapsed_time": "2:23:58", "remaining_time": "2:39:55"}
{"current_steps": 1955, "total_steps": 4116, "loss": 0.1518, "lr": 2.519822456706349e-05, "epoch": 3.324829931972789, "percentage": 47.5, "elapsed_time": "2:24:20", "remaining_time": "2:39:33"}
{"current_steps": 1960, "total_steps": 4116, "loss": 0.1653, "lr": 2.5116276772673863e-05, "epoch": 3.3333333333333335, "percentage": 47.62, "elapsed_time": "2:24:40", "remaining_time": "2:39:08"}
{"current_steps": 1965, "total_steps": 4116, "loss": 0.1388, "lr": 2.503423696486332e-05, "epoch": 3.3418367346938775, "percentage": 47.74, "elapsed_time": "2:24:57", "remaining_time": "2:38:40"}
{"current_steps": 1970, "total_steps": 4116, "loss": 0.1395, "lr": 2.495210661907263e-05, "epoch": 3.3503401360544216, "percentage": 47.86, "elapsed_time": "2:25:15", "remaining_time": "2:38:13"}
{"current_steps": 1975, "total_steps": 4116, "loss": 0.161, "lr": 2.4869887212370847e-05, "epoch": 3.358843537414966, "percentage": 47.98, "elapsed_time": "2:25:38", "remaining_time": "2:37:53"}
{"current_steps": 1980, "total_steps": 4116, "loss": 0.1337, "lr": 2.4787580223428736e-05, "epoch": 3.36734693877551, "percentage": 48.1, "elapsed_time": "2:26:01", "remaining_time": "2:37:32"}
{"current_steps": 1985, "total_steps": 4116, "loss": 0.1445, "lr": 2.470518713249217e-05, "epoch": 3.3758503401360542, "percentage": 48.23, "elapsed_time": "2:26:25", "remaining_time": "2:37:11"}
{"current_steps": 1990, "total_steps": 4116, "loss": 0.1374, "lr": 2.4622709421355535e-05, "epoch": 3.3843537414965987, "percentage": 48.35, "elapsed_time": "2:26:44", "remaining_time": "2:36:45"}
{"current_steps": 1995, "total_steps": 4116, "loss": 0.1305, "lr": 2.4540148573335057e-05, "epoch": 3.392857142857143, "percentage": 48.47, "elapsed_time": "2:27:08", "remaining_time": "2:36:26"}
{"current_steps": 2000, "total_steps": 4116, "loss": 0.1283, "lr": 2.445750607324212e-05, "epoch": 3.4013605442176873, "percentage": 48.59, "elapsed_time": "2:27:30", "remaining_time": "2:36:04"}
{"current_steps": 2005, "total_steps": 4116, "loss": 0.1502, "lr": 2.4374783407356602e-05, "epoch": 3.4098639455782314, "percentage": 48.71, "elapsed_time": "2:27:55", "remaining_time": "2:35:45"}
{"current_steps": 2010, "total_steps": 4116, "loss": 0.1536, "lr": 2.4291982063400096e-05, "epoch": 3.4183673469387754, "percentage": 48.83, "elapsed_time": "2:28:18", "remaining_time": "2:35:23"}
{"current_steps": 2015, "total_steps": 4116, "loss": 0.1403, "lr": 2.4209103530509196e-05, "epoch": 3.4268707482993195, "percentage": 48.96, "elapsed_time": "2:28:37", "remaining_time": "2:34:57"}
{"current_steps": 2020, "total_steps": 4116, "loss": 0.1585, "lr": 2.412614929920868e-05, "epoch": 3.435374149659864, "percentage": 49.08, "elapsed_time": "2:29:00", "remaining_time": "2:34:36"}
{"current_steps": 2025, "total_steps": 4116, "loss": 0.1496, "lr": 2.4043120861384723e-05, "epoch": 3.443877551020408, "percentage": 49.2, "elapsed_time": "2:29:19", "remaining_time": "2:34:11"}
{"current_steps": 2030, "total_steps": 4116, "loss": 0.1489, "lr": 2.3960019710258068e-05, "epoch": 3.4523809523809526, "percentage": 49.32, "elapsed_time": "2:29:43", "remaining_time": "2:33:50"}
{"current_steps": 2035, "total_steps": 4116, "loss": 0.1635, "lr": 2.3876847340357164e-05, "epoch": 3.4608843537414966, "percentage": 49.44, "elapsed_time": "2:30:02", "remaining_time": "2:33:25"}
{"current_steps": 2040, "total_steps": 4116, "loss": 0.1541, "lr": 2.3793605247491303e-05, "epoch": 3.4693877551020407, "percentage": 49.56, "elapsed_time": "2:30:27", "remaining_time": "2:33:07"}
{"current_steps": 2045, "total_steps": 4116, "loss": 0.1422, "lr": 2.371029492872369e-05, "epoch": 3.477891156462585, "percentage": 49.68, "elapsed_time": "2:30:46", "remaining_time": "2:32:41"}
{"current_steps": 2050, "total_steps": 4116, "loss": 0.1497, "lr": 2.362691788234453e-05, "epoch": 3.4863945578231292, "percentage": 49.81, "elapsed_time": "2:31:08", "remaining_time": "2:32:19"}
{"current_steps": 2055, "total_steps": 4116, "loss": 0.1536, "lr": 2.3543475607844105e-05, "epoch": 3.4948979591836733, "percentage": 49.93, "elapsed_time": "2:31:33", "remaining_time": "2:32:00"}
{"current_steps": 2060, "total_steps": 4116, "loss": 0.151, "lr": 2.3459969605885785e-05, "epoch": 3.503401360544218, "percentage": 50.05, "elapsed_time": "2:31:56", "remaining_time": "2:31:38"}
{"current_steps": 2065, "total_steps": 4116, "loss": 0.1324, "lr": 2.3376401378279037e-05, "epoch": 3.511904761904762, "percentage": 50.17, "elapsed_time": "2:32:17", "remaining_time": "2:31:15"}
{"current_steps": 2070, "total_steps": 4116, "loss": 0.1431, "lr": 2.329277242795243e-05, "epoch": 3.520408163265306, "percentage": 50.29, "elapsed_time": "2:32:41", "remaining_time": "2:30:54"}
{"current_steps": 2075, "total_steps": 4116, "loss": 0.1497, "lr": 2.320908425892658e-05, "epoch": 3.5289115646258504, "percentage": 50.41, "elapsed_time": "2:33:06", "remaining_time": "2:30:35"}
{"current_steps": 2080, "total_steps": 4116, "loss": 0.1315, "lr": 2.3125338376287137e-05, "epoch": 3.5374149659863945, "percentage": 50.53, "elapsed_time": "2:33:24", "remaining_time": "2:30:09"}
{"current_steps": 2085, "total_steps": 4116, "loss": 0.1451, "lr": 2.3041536286157706e-05, "epoch": 3.545918367346939, "percentage": 50.66, "elapsed_time": "2:33:46", "remaining_time": "2:29:47"}
{"current_steps": 2090, "total_steps": 4116, "loss": 0.1455, "lr": 2.295767949567272e-05, "epoch": 3.554421768707483, "percentage": 50.78, "elapsed_time": "2:34:08", "remaining_time": "2:29:24"}
{"current_steps": 2095, "total_steps": 4116, "loss": 0.1427, "lr": 2.28737695129504e-05, "epoch": 3.562925170068027, "percentage": 50.9, "elapsed_time": "2:34:30", "remaining_time": "2:29:02"}
{"current_steps": 2100, "total_steps": 4116, "loss": 0.1375, "lr": 2.2789807847065574e-05, "epoch": 3.571428571428571, "percentage": 51.02, "elapsed_time": "2:34:54", "remaining_time": "2:28:43"}
{"current_steps": 2105, "total_steps": 4116, "loss": 0.1467, "lr": 2.2705796008022592e-05, "epoch": 3.5799319727891157, "percentage": 51.14, "elapsed_time": "2:35:17", "remaining_time": "2:28:21"}
{"current_steps": 2110, "total_steps": 4116, "loss": 0.1377, "lr": 2.2621735506728126e-05, "epoch": 3.5884353741496597, "percentage": 51.26, "elapsed_time": "2:35:42", "remaining_time": "2:28:02"}
{"current_steps": 2115, "total_steps": 4116, "loss": 0.1349, "lr": 2.2537627854963994e-05, "epoch": 3.5969387755102042, "percentage": 51.38, "elapsed_time": "2:36:02", "remaining_time": "2:27:37"}
{"current_steps": 2120, "total_steps": 4116, "loss": 0.1416, "lr": 2.2453474565360018e-05, "epoch": 3.6054421768707483, "percentage": 51.51, "elapsed_time": "2:36:21", "remaining_time": "2:27:12"}
{"current_steps": 2125, "total_steps": 4116, "loss": 0.1434, "lr": 2.2369277151366776e-05, "epoch": 3.6139455782312924, "percentage": 51.63, "elapsed_time": "2:36:47", "remaining_time": "2:26:53"}
{"current_steps": 2130, "total_steps": 4116, "loss": 0.1365, "lr": 2.228503712722841e-05, "epoch": 3.622448979591837, "percentage": 51.75, "elapsed_time": "2:37:09", "remaining_time": "2:26:32"}
{"current_steps": 2135, "total_steps": 4116, "loss": 0.1424, "lr": 2.220075600795537e-05, "epoch": 3.630952380952381, "percentage": 51.87, "elapsed_time": "2:37:33", "remaining_time": "2:26:11"}
{"current_steps": 2140, "total_steps": 4116, "loss": 0.1487, "lr": 2.2116435309297177e-05, "epoch": 3.6394557823129254, "percentage": 51.99, "elapsed_time": "2:37:56", "remaining_time": "2:25:50"}
{"current_steps": 2145, "total_steps": 4116, "loss": 0.1558, "lr": 2.203207654771519e-05, "epoch": 3.6479591836734695, "percentage": 52.11, "elapsed_time": "2:38:18", "remaining_time": "2:25:28"}
{"current_steps": 2150, "total_steps": 4116, "loss": 0.1641, "lr": 2.1947681240355266e-05, "epoch": 3.6564625850340136, "percentage": 52.24, "elapsed_time": "2:38:44", "remaining_time": "2:25:09"}
{"current_steps": 2155, "total_steps": 4116, "loss": 0.1526, "lr": 2.1863250905020566e-05, "epoch": 3.6649659863945576, "percentage": 52.36, "elapsed_time": "2:39:07", "remaining_time": "2:24:47"}
{"current_steps": 2160, "total_steps": 4116, "loss": 0.1599, "lr": 2.177878706014418e-05, "epoch": 3.673469387755102, "percentage": 52.48, "elapsed_time": "2:39:32", "remaining_time": "2:24:28"}
{"current_steps": 2165, "total_steps": 4116, "loss": 0.1354, "lr": 2.1694291224761845e-05, "epoch": 3.681972789115646, "percentage": 52.6, "elapsed_time": "2:39:50", "remaining_time": "2:24:02"}
{"current_steps": 2170, "total_steps": 4116, "loss": 0.1348, "lr": 2.160976491848465e-05, "epoch": 3.6904761904761907, "percentage": 52.72, "elapsed_time": "2:40:12", "remaining_time": "2:23:40"}
{"current_steps": 2175, "total_steps": 4116, "loss": 0.1517, "lr": 2.1525209661471677e-05, "epoch": 3.6989795918367347, "percentage": 52.84, "elapsed_time": "2:40:33", "remaining_time": "2:23:16"}
{"current_steps": 2180, "total_steps": 4116, "loss": 0.1396, "lr": 2.1440626974402664e-05, "epoch": 3.707482993197279, "percentage": 52.96, "elapsed_time": "2:40:51", "remaining_time": "2:22:51"}
{"current_steps": 2185, "total_steps": 4116, "loss": 0.1572, "lr": 2.135601837845068e-05, "epoch": 3.715986394557823, "percentage": 53.09, "elapsed_time": "2:41:12", "remaining_time": "2:22:28"}
{"current_steps": 2190, "total_steps": 4116, "loss": 0.1591, "lr": 2.1271385395254728e-05, "epoch": 3.7244897959183674, "percentage": 53.21, "elapsed_time": "2:41:33", "remaining_time": "2:22:04"}
{"current_steps": 2195, "total_steps": 4116, "loss": 0.1394, "lr": 2.118672954689242e-05, "epoch": 3.7329931972789114, "percentage": 53.33, "elapsed_time": "2:41:54", "remaining_time": "2:21:41"}
{"current_steps": 2200, "total_steps": 4116, "loss": 0.1303, "lr": 2.1102052355852586e-05, "epoch": 3.741496598639456, "percentage": 53.45, "elapsed_time": "2:42:15", "remaining_time": "2:21:19"}
{"current_steps": 2205, "total_steps": 4116, "loss": 0.1533, "lr": 2.1017355345007883e-05, "epoch": 3.75, "percentage": 53.57, "elapsed_time": "2:42:36", "remaining_time": "2:20:55"}
{"current_steps": 2210, "total_steps": 4116, "loss": 0.1385, "lr": 2.093264003758742e-05, "epoch": 3.758503401360544, "percentage": 53.69, "elapsed_time": "2:42:54", "remaining_time": "2:20:29"}
{"current_steps": 2215, "total_steps": 4116, "loss": 0.1637, "lr": 2.0847907957149348e-05, "epoch": 3.7670068027210886, "percentage": 53.81, "elapsed_time": "2:43:19", "remaining_time": "2:20:10"}
{"current_steps": 2220, "total_steps": 4116, "loss": 0.1374, "lr": 2.0763160627553502e-05, "epoch": 3.7755102040816326, "percentage": 53.94, "elapsed_time": "2:43:37", "remaining_time": "2:19:44"}
{"current_steps": 2225, "total_steps": 4116, "loss": 0.1413, "lr": 2.067839957293394e-05, "epoch": 3.784013605442177, "percentage": 54.06, "elapsed_time": "2:43:58", "remaining_time": "2:19:21"}
{"current_steps": 2230, "total_steps": 4116, "loss": 0.1406, "lr": 2.059362631767156e-05, "epoch": 3.792517006802721, "percentage": 54.18, "elapsed_time": "2:44:22", "remaining_time": "2:19:01"}
{"current_steps": 2235, "total_steps": 4116, "loss": 0.1441, "lr": 2.0508842386366687e-05, "epoch": 3.8010204081632653, "percentage": 54.3, "elapsed_time": "2:44:45", "remaining_time": "2:18:39"}
{"current_steps": 2240, "total_steps": 4116, "loss": 0.1484, "lr": 2.0424049303811637e-05, "epoch": 3.8095238095238093, "percentage": 54.42, "elapsed_time": "2:45:05", "remaining_time": "2:18:16"}
{"current_steps": 2245, "total_steps": 4116, "loss": 0.1371, "lr": 2.0339248594963326e-05, "epoch": 3.818027210884354, "percentage": 54.54, "elapsed_time": "2:45:25", "remaining_time": "2:17:52"}
{"current_steps": 2250, "total_steps": 4116, "loss": 0.1464, "lr": 2.025444178491582e-05, "epoch": 3.826530612244898, "percentage": 54.66, "elapsed_time": "2:45:49", "remaining_time": "2:17:31"}
{"current_steps": 2255, "total_steps": 4116, "loss": 0.1465, "lr": 2.0169630398872897e-05, "epoch": 3.8350340136054424, "percentage": 54.79, "elapsed_time": "2:46:07", "remaining_time": "2:17:05"}
{"current_steps": 2260, "total_steps": 4116, "loss": 0.1418, "lr": 2.0084815962120644e-05, "epoch": 3.8435374149659864, "percentage": 54.91, "elapsed_time": "2:46:29", "remaining_time": "2:16:43"}
{"current_steps": 2265, "total_steps": 4116, "loss": 0.1431, "lr": 2e-05, "epoch": 3.8520408163265305, "percentage": 55.03, "elapsed_time": "2:46:45", "remaining_time": "2:16:17"}
{"current_steps": 2270, "total_steps": 4116, "loss": 0.1474, "lr": 1.9915184037879362e-05, "epoch": 3.8605442176870746, "percentage": 55.15, "elapsed_time": "2:47:08", "remaining_time": "2:15:55"}
{"current_steps": 2275, "total_steps": 4116, "loss": 0.1525, "lr": 1.983036960112711e-05, "epoch": 3.869047619047619, "percentage": 55.27, "elapsed_time": "2:47:31", "remaining_time": "2:15:33"}
{"current_steps": 2280, "total_steps": 4116, "loss": 0.1477, "lr": 1.9745558215084184e-05, "epoch": 3.877551020408163, "percentage": 55.39, "elapsed_time": "2:47:54", "remaining_time": "2:15:12"}
{"current_steps": 2285, "total_steps": 4116, "loss": 0.139, "lr": 1.9660751405036677e-05, "epoch": 3.8860544217687076, "percentage": 55.52, "elapsed_time": "2:48:15", "remaining_time": "2:14:49"}
{"current_steps": 2290, "total_steps": 4116, "loss": 0.1515, "lr": 1.9575950696188367e-05, "epoch": 3.8945578231292517, "percentage": 55.64, "elapsed_time": "2:48:33", "remaining_time": "2:14:24"}
{"current_steps": 2295, "total_steps": 4116, "loss": 0.1443, "lr": 1.949115761363332e-05, "epoch": 3.9030612244897958, "percentage": 55.76, "elapsed_time": "2:48:51", "remaining_time": "2:13:59"}
{"current_steps": 2300, "total_steps": 4116, "loss": 0.1364, "lr": 1.9406373682328445e-05, "epoch": 3.9115646258503403, "percentage": 55.88, "elapsed_time": "2:49:14", "remaining_time": "2:13:37"}
{"current_steps": 2305, "total_steps": 4116, "loss": 0.1533, "lr": 1.932160042706607e-05, "epoch": 3.9200680272108843, "percentage": 56.0, "elapsed_time": "2:49:37", "remaining_time": "2:13:16"}
{"current_steps": 2310, "total_steps": 4116, "loss": 0.1441, "lr": 1.9236839372446504e-05, "epoch": 3.928571428571429, "percentage": 56.12, "elapsed_time": "2:50:01", "remaining_time": "2:12:55"}
{"current_steps": 2315, "total_steps": 4116, "loss": 0.1545, "lr": 1.9152092042850655e-05, "epoch": 3.937074829931973, "percentage": 56.24, "elapsed_time": "2:50:25", "remaining_time": "2:12:35"}
{"current_steps": 2320, "total_steps": 4116, "loss": 0.1422, "lr": 1.9067359962412588e-05, "epoch": 3.945578231292517, "percentage": 56.37, "elapsed_time": "2:50:43", "remaining_time": "2:12:09"}
{"current_steps": 2325, "total_steps": 4116, "loss": 0.1537, "lr": 1.8982644654992123e-05, "epoch": 3.954081632653061, "percentage": 56.49, "elapsed_time": "2:51:05", "remaining_time": "2:11:47"}
{"current_steps": 2330, "total_steps": 4116, "loss": 0.1635, "lr": 1.8897947644147424e-05, "epoch": 3.9625850340136055, "percentage": 56.61, "elapsed_time": "2:51:32", "remaining_time": "2:11:29"}
{"current_steps": 2335, "total_steps": 4116, "loss": 0.1374, "lr": 1.8813270453107586e-05, "epoch": 3.9710884353741496, "percentage": 56.73, "elapsed_time": "2:51:55", "remaining_time": "2:11:08"}
{"current_steps": 2340, "total_steps": 4116, "loss": 0.1535, "lr": 1.872861460474528e-05, "epoch": 3.979591836734694, "percentage": 56.85, "elapsed_time": "2:52:19", "remaining_time": "2:10:47"}
{"current_steps": 2345, "total_steps": 4116, "loss": 0.1489, "lr": 1.8643981621549325e-05, "epoch": 3.988095238095238, "percentage": 56.97, "elapsed_time": "2:52:44", "remaining_time": "2:10:27"}
{"current_steps": 2350, "total_steps": 4116, "loss": 0.1446, "lr": 1.855937302559734e-05, "epoch": 3.996598639455782, "percentage": 57.09, "elapsed_time": "2:53:10", "remaining_time": "2:10:08"}
{"current_steps": 2355, "total_steps": 4116, "loss": 0.1119, "lr": 1.8474790338528336e-05, "epoch": 4.005102040816326, "percentage": 57.22, "elapsed_time": "2:53:27", "remaining_time": "2:09:42"}
{"current_steps": 2360, "total_steps": 4116, "loss": 0.1308, "lr": 1.839023508151536e-05, "epoch": 4.01360544217687, "percentage": 57.34, "elapsed_time": "2:53:48", "remaining_time": "2:09:19"}
{"current_steps": 2365, "total_steps": 4116, "loss": 0.141, "lr": 1.8305708775238162e-05, "epoch": 4.022108843537415, "percentage": 57.46, "elapsed_time": "2:54:10", "remaining_time": "2:08:57"}
{"current_steps": 2370, "total_steps": 4116, "loss": 0.1317, "lr": 1.822121293985583e-05, "epoch": 4.030612244897959, "percentage": 57.58, "elapsed_time": "2:54:32", "remaining_time": "2:08:35"}
{"current_steps": 2375, "total_steps": 4116, "loss": 0.1207, "lr": 1.813674909497944e-05, "epoch": 4.039115646258503, "percentage": 57.7, "elapsed_time": "2:54:54", "remaining_time": "2:08:13"}
{"current_steps": 2380, "total_steps": 4116, "loss": 0.1263, "lr": 1.8052318759644744e-05, "epoch": 4.0476190476190474, "percentage": 57.82, "elapsed_time": "2:55:16", "remaining_time": "2:07:51"}
{"current_steps": 2385, "total_steps": 4116, "loss": 0.125, "lr": 1.7967923452284824e-05, "epoch": 4.0561224489795915, "percentage": 57.94, "elapsed_time": "2:55:39", "remaining_time": "2:07:29"}
{"current_steps": 2390, "total_steps": 4116, "loss": 0.1343, "lr": 1.7883564690702826e-05, "epoch": 4.0646258503401365, "percentage": 58.07, "elapsed_time": "2:56:05", "remaining_time": "2:07:10"}
{"current_steps": 2395, "total_steps": 4116, "loss": 0.1202, "lr": 1.7799243992044634e-05, "epoch": 4.0731292517006805, "percentage": 58.19, "elapsed_time": "2:56:26", "remaining_time": "2:06:47"}
{"current_steps": 2400, "total_steps": 4116, "loss": 0.1312, "lr": 1.7714962872771593e-05, "epoch": 4.081632653061225, "percentage": 58.31, "elapsed_time": "2:56:47", "remaining_time": "2:06:24"}
{"current_steps": 2405, "total_steps": 4116, "loss": 0.1193, "lr": 1.7630722848633234e-05, "epoch": 4.090136054421769, "percentage": 58.43, "elapsed_time": "2:57:09", "remaining_time": "2:06:01"}
{"current_steps": 2410, "total_steps": 4116, "loss": 0.1503, "lr": 1.7546525434639992e-05, "epoch": 4.098639455782313, "percentage": 58.55, "elapsed_time": "2:57:30", "remaining_time": "2:05:39"}
{"current_steps": 2415, "total_steps": 4116, "loss": 0.1212, "lr": 1.7462372145036012e-05, "epoch": 4.107142857142857, "percentage": 58.67, "elapsed_time": "2:57:50", "remaining_time": "2:05:15"}
{"current_steps": 2420, "total_steps": 4116, "loss": 0.1224, "lr": 1.737826449327188e-05, "epoch": 4.115646258503402, "percentage": 58.79, "elapsed_time": "2:58:11", "remaining_time": "2:04:52"}
{"current_steps": 2425, "total_steps": 4116, "loss": 0.1399, "lr": 1.729420399197741e-05, "epoch": 4.124149659863946, "percentage": 58.92, "elapsed_time": "2:58:35", "remaining_time": "2:04:32"}
{"current_steps": 2430, "total_steps": 4116, "loss": 0.1416, "lr": 1.7210192152934423e-05, "epoch": 4.13265306122449, "percentage": 59.04, "elapsed_time": "2:58:57", "remaining_time": "2:04:09"}
{"current_steps": 2435, "total_steps": 4116, "loss": 0.1297, "lr": 1.712623048704961e-05, "epoch": 4.141156462585034, "percentage": 59.16, "elapsed_time": "2:59:19", "remaining_time": "2:03:47"}
{"current_steps": 2440, "total_steps": 4116, "loss": 0.1357, "lr": 1.7042320504327285e-05, "epoch": 4.149659863945578, "percentage": 59.28, "elapsed_time": "2:59:37", "remaining_time": "2:03:22"}
{"current_steps": 2445, "total_steps": 4116, "loss": 0.1307, "lr": 1.6958463713842298e-05, "epoch": 4.158163265306122, "percentage": 59.4, "elapsed_time": "3:00:07", "remaining_time": "2:03:06"}
{"current_steps": 2450, "total_steps": 4116, "loss": 0.1295, "lr": 1.687466162371286e-05, "epoch": 4.166666666666667, "percentage": 59.52, "elapsed_time": "3:00:28", "remaining_time": "2:02:43"}
{"current_steps": 2455, "total_steps": 4116, "loss": 0.1386, "lr": 1.6790915741073418e-05, "epoch": 4.175170068027211, "percentage": 59.65, "elapsed_time": "3:00:49", "remaining_time": "2:02:20"}
{"current_steps": 2460, "total_steps": 4116, "loss": 0.1264, "lr": 1.670722757204758e-05, "epoch": 4.183673469387755, "percentage": 59.77, "elapsed_time": "3:01:11", "remaining_time": "2:01:58"}
{"current_steps": 2465, "total_steps": 4116, "loss": 0.1157, "lr": 1.6623598621720966e-05, "epoch": 4.192176870748299, "percentage": 59.89, "elapsed_time": "3:01:36", "remaining_time": "2:01:37"}
{"current_steps": 2470, "total_steps": 4116, "loss": 0.1206, "lr": 1.6540030394114218e-05, "epoch": 4.200680272108843, "percentage": 60.01, "elapsed_time": "3:01:57", "remaining_time": "2:01:15"}
{"current_steps": 2475, "total_steps": 4116, "loss": 0.1265, "lr": 1.64565243921559e-05, "epoch": 4.209183673469388, "percentage": 60.13, "elapsed_time": "3:02:22", "remaining_time": "2:00:55"}
{"current_steps": 2480, "total_steps": 4116, "loss": 0.1156, "lr": 1.6373082117655473e-05, "epoch": 4.217687074829932, "percentage": 60.25, "elapsed_time": "3:02:41", "remaining_time": "2:00:31"}
{"current_steps": 2485, "total_steps": 4116, "loss": 0.1315, "lr": 1.6289705071276323e-05, "epoch": 4.226190476190476, "percentage": 60.37, "elapsed_time": "3:02:57", "remaining_time": "2:00:05"}
{"current_steps": 2490, "total_steps": 4116, "loss": 0.1428, "lr": 1.62063947525087e-05, "epoch": 4.23469387755102, "percentage": 60.5, "elapsed_time": "3:03:16", "remaining_time": "1:59:41"}
{"current_steps": 2495, "total_steps": 4116, "loss": 0.1261, "lr": 1.612315265964284e-05, "epoch": 4.243197278911564, "percentage": 60.62, "elapsed_time": "3:03:35", "remaining_time": "1:59:17"}
{"current_steps": 2500, "total_steps": 4116, "loss": 0.1387, "lr": 1.6039980289741935e-05, "epoch": 4.2517006802721085, "percentage": 60.74, "elapsed_time": "3:03:59", "remaining_time": "1:58:56"}
{"current_steps": 2505, "total_steps": 4116, "loss": 0.1217, "lr": 1.595687913861528e-05, "epoch": 4.260204081632653, "percentage": 60.86, "elapsed_time": "3:04:21", "remaining_time": "1:58:33"}
{"current_steps": 2510, "total_steps": 4116, "loss": 0.1496, "lr": 1.587385070079133e-05, "epoch": 4.2687074829931975, "percentage": 60.98, "elapsed_time": "3:04:51", "remaining_time": "1:58:16"}
{"current_steps": 2515, "total_steps": 4116, "loss": 0.1321, "lr": 1.5790896469490807e-05, "epoch": 4.2772108843537415, "percentage": 61.1, "elapsed_time": "3:05:14", "remaining_time": "1:57:55"}
{"current_steps": 2520, "total_steps": 4116, "loss": 0.1353, "lr": 1.5708017936599908e-05, "epoch": 4.285714285714286, "percentage": 61.22, "elapsed_time": "3:05:36", "remaining_time": "1:57:32"}
{"current_steps": 2525, "total_steps": 4116, "loss": 0.1249, "lr": 1.56252165926434e-05, "epoch": 4.29421768707483, "percentage": 61.35, "elapsed_time": "3:05:55", "remaining_time": "1:57:09"}
{"current_steps": 2530, "total_steps": 4116, "loss": 0.123, "lr": 1.554249392675788e-05, "epoch": 4.302721088435375, "percentage": 61.47, "elapsed_time": "3:06:21", "remaining_time": "1:56:49"}
{"current_steps": 2535, "total_steps": 4116, "loss": 0.1318, "lr": 1.5459851426664956e-05, "epoch": 4.311224489795919, "percentage": 61.59, "elapsed_time": "3:06:42", "remaining_time": "1:56:26"}
{"current_steps": 2540, "total_steps": 4116, "loss": 0.1191, "lr": 1.5377290578644468e-05, "epoch": 4.319727891156463, "percentage": 61.71, "elapsed_time": "3:07:04", "remaining_time": "1:56:04"}
{"current_steps": 2545, "total_steps": 4116, "loss": 0.1287, "lr": 1.5294812867507835e-05, "epoch": 4.328231292517007, "percentage": 61.83, "elapsed_time": "3:07:21", "remaining_time": "1:55:39"}
{"current_steps": 2550, "total_steps": 4116, "loss": 0.1393, "lr": 1.5212419776571267e-05, "epoch": 4.336734693877551, "percentage": 61.95, "elapsed_time": "3:07:46", "remaining_time": "1:55:18"}
{"current_steps": 2555, "total_steps": 4116, "loss": 0.1229, "lr": 1.5130112787629152e-05, "epoch": 4.345238095238095, "percentage": 62.07, "elapsed_time": "3:08:07", "remaining_time": "1:54:56"}
{"current_steps": 2560, "total_steps": 4116, "loss": 0.1196, "lr": 1.5047893380927377e-05, "epoch": 4.35374149659864, "percentage": 62.2, "elapsed_time": "3:08:26", "remaining_time": "1:54:32"}
{"current_steps": 2565, "total_steps": 4116, "loss": 0.1259, "lr": 1.4965763035136687e-05, "epoch": 4.362244897959184, "percentage": 62.32, "elapsed_time": "3:08:51", "remaining_time": "1:54:11"}
{"current_steps": 2570, "total_steps": 4116, "loss": 0.1388, "lr": 1.4883723227326144e-05, "epoch": 4.370748299319728, "percentage": 62.44, "elapsed_time": "3:09:12", "remaining_time": "1:53:49"}
{"current_steps": 2575, "total_steps": 4116, "loss": 0.1254, "lr": 1.4801775432936516e-05, "epoch": 4.379251700680272, "percentage": 62.56, "elapsed_time": "3:09:34", "remaining_time": "1:53:27"}
{"current_steps": 2580, "total_steps": 4116, "loss": 0.1213, "lr": 1.4719921125753773e-05, "epoch": 4.387755102040816, "percentage": 62.68, "elapsed_time": "3:09:56", "remaining_time": "1:53:04"}
{"current_steps": 2585, "total_steps": 4116, "loss": 0.1275, "lr": 1.4638161777882573e-05, "epoch": 4.39625850340136, "percentage": 62.8, "elapsed_time": "3:10:18", "remaining_time": "1:52:42"}
{"current_steps": 2590, "total_steps": 4116, "loss": 0.1245, "lr": 1.4556498859719756e-05, "epoch": 4.404761904761905, "percentage": 62.93, "elapsed_time": "3:10:43", "remaining_time": "1:52:22"}
{"current_steps": 2595, "total_steps": 4116, "loss": 0.1301, "lr": 1.4474933839927956e-05, "epoch": 4.413265306122449, "percentage": 63.05, "elapsed_time": "3:11:06", "remaining_time": "1:52:00"}
{"current_steps": 2600, "total_steps": 4116, "loss": 0.1345, "lr": 1.4393468185409139e-05, "epoch": 4.421768707482993, "percentage": 63.17, "elapsed_time": "3:11:31", "remaining_time": "1:51:40"}
{"current_steps": 2605, "total_steps": 4116, "loss": 0.1266, "lr": 1.4312103361278254e-05, "epoch": 4.430272108843537, "percentage": 63.29, "elapsed_time": "3:11:51", "remaining_time": "1:51:16"}
{"current_steps": 2610, "total_steps": 4116, "loss": 0.1215, "lr": 1.4230840830836875e-05, "epoch": 4.438775510204081, "percentage": 63.41, "elapsed_time": "3:12:17", "remaining_time": "1:50:57"}
{"current_steps": 2615, "total_steps": 4116, "loss": 0.133, "lr": 1.4149682055546854e-05, "epoch": 4.447278911564625, "percentage": 63.53, "elapsed_time": "3:12:37", "remaining_time": "1:50:34"}
{"current_steps": 2620, "total_steps": 4116, "loss": 0.1311, "lr": 1.4068628495004095e-05, "epoch": 4.45578231292517, "percentage": 63.65, "elapsed_time": "3:12:59", "remaining_time": "1:50:12"}
{"current_steps": 2625, "total_steps": 4116, "loss": 0.1257, "lr": 1.3987681606912254e-05, "epoch": 4.464285714285714, "percentage": 63.78, "elapsed_time": "3:13:19", "remaining_time": "1:49:48"}
{"current_steps": 2630, "total_steps": 4116, "loss": 0.1454, "lr": 1.3906842847056546e-05, "epoch": 4.4727891156462585, "percentage": 63.9, "elapsed_time": "3:13:48", "remaining_time": "1:49:30"}
{"current_steps": 2635, "total_steps": 4116, "loss": 0.1343, "lr": 1.382611366927757e-05, "epoch": 4.4812925170068025, "percentage": 64.02, "elapsed_time": "3:14:09", "remaining_time": "1:49:07"}
{"current_steps": 2640, "total_steps": 4116, "loss": 0.1309, "lr": 1.3745495525445126e-05, "epoch": 4.489795918367347, "percentage": 64.14, "elapsed_time": "3:14:32", "remaining_time": "1:48:45"}
{"current_steps": 2645, "total_steps": 4116, "loss": 0.1334, "lr": 1.3664989865432157e-05, "epoch": 4.4982993197278915, "percentage": 64.26, "elapsed_time": "3:14:53", "remaining_time": "1:48:23"}
{"current_steps": 2650, "total_steps": 4116, "loss": 0.141, "lr": 1.3584598137088625e-05, "epoch": 4.506802721088436, "percentage": 64.38, "elapsed_time": "3:15:15", "remaining_time": "1:48:00"}
{"current_steps": 2655, "total_steps": 4116, "loss": 0.1192, "lr": 1.3504321786215514e-05, "epoch": 4.51530612244898, "percentage": 64.5, "elapsed_time": "3:15:34", "remaining_time": "1:47:37"}
{"current_steps": 2660, "total_steps": 4116, "loss": 0.1421, "lr": 1.3424162256538797e-05, "epoch": 4.523809523809524, "percentage": 64.63, "elapsed_time": "3:15:56", "remaining_time": "1:47:15"}
{"current_steps": 2665, "total_steps": 4116, "loss": 0.1314, "lr": 1.3344120989683473e-05, "epoch": 4.532312925170068, "percentage": 64.75, "elapsed_time": "3:16:20", "remaining_time": "1:46:53"}
{"current_steps": 2670, "total_steps": 4116, "loss": 0.1361, "lr": 1.3264199425147667e-05, "epoch": 4.540816326530612, "percentage": 64.87, "elapsed_time": "3:16:38", "remaining_time": "1:46:29"}
{"current_steps": 2675, "total_steps": 4116, "loss": 0.16, "lr": 1.3184399000276708e-05, "epoch": 4.549319727891157, "percentage": 64.99, "elapsed_time": "3:16:59", "remaining_time": "1:46:07"}
{"current_steps": 2680, "total_steps": 4116, "loss": 0.1162, "lr": 1.3104721150237305e-05, "epoch": 4.557823129251701, "percentage": 65.11, "elapsed_time": "3:17:17", "remaining_time": "1:45:42"}
{"current_steps": 2685, "total_steps": 4116, "loss": 0.1209, "lr": 1.3025167307991734e-05, "epoch": 4.566326530612245, "percentage": 65.23, "elapsed_time": "3:17:34", "remaining_time": "1:45:18"}
{"current_steps": 2690, "total_steps": 4116, "loss": 0.1438, "lr": 1.294573890427203e-05, "epoch": 4.574829931972789, "percentage": 65.35, "elapsed_time": "3:17:55", "remaining_time": "1:44:55"}
{"current_steps": 2695, "total_steps": 4116, "loss": 0.134, "lr": 1.2866437367554313e-05, "epoch": 4.583333333333333, "percentage": 65.48, "elapsed_time": "3:18:18", "remaining_time": "1:44:33"}
{"current_steps": 2700, "total_steps": 4116, "loss": 0.1198, "lr": 1.278726412403306e-05, "epoch": 4.591836734693878, "percentage": 65.6, "elapsed_time": "3:18:37", "remaining_time": "1:44:10"}
{"current_steps": 2705, "total_steps": 4116, "loss": 0.1357, "lr": 1.2708220597595462e-05, "epoch": 4.600340136054422, "percentage": 65.72, "elapsed_time": "3:19:00", "remaining_time": "1:43:48"}
{"current_steps": 2710, "total_steps": 4116, "loss": 0.1328, "lr": 1.2629308209795834e-05, "epoch": 4.608843537414966, "percentage": 65.84, "elapsed_time": "3:19:19", "remaining_time": "1:43:24"}
{"current_steps": 2715, "total_steps": 4116, "loss": 0.1406, "lr": 1.2550528379830019e-05, "epoch": 4.61734693877551, "percentage": 65.96, "elapsed_time": "3:19:39", "remaining_time": "1:43:01"}
{"current_steps": 2720, "total_steps": 4116, "loss": 0.1412, "lr": 1.2471882524509873e-05, "epoch": 4.625850340136054, "percentage": 66.08, "elapsed_time": "3:19:56", "remaining_time": "1:42:36"}
{"current_steps": 2725, "total_steps": 4116, "loss": 0.1252, "lr": 1.2393372058237819e-05, "epoch": 4.634353741496598, "percentage": 66.21, "elapsed_time": "3:20:15", "remaining_time": "1:42:13"}
{"current_steps": 2730, "total_steps": 4116, "loss": 0.1313, "lr": 1.2314998392981364e-05, "epoch": 4.642857142857143, "percentage": 66.33, "elapsed_time": "3:20:35", "remaining_time": "1:41:50"}
{"current_steps": 2735, "total_steps": 4116, "loss": 0.1342, "lr": 1.2236762938247729e-05, "epoch": 4.651360544217687, "percentage": 66.45, "elapsed_time": "3:20:55", "remaining_time": "1:41:27"}
{"current_steps": 2740, "total_steps": 4116, "loss": 0.134, "lr": 1.2158667101058497e-05, "epoch": 4.659863945578231, "percentage": 66.57, "elapsed_time": "3:21:20", "remaining_time": "1:41:06"}
{"current_steps": 2745, "total_steps": 4116, "loss": 0.13, "lr": 1.208071228592429e-05, "epoch": 4.668367346938775, "percentage": 66.69, "elapsed_time": "3:21:47", "remaining_time": "1:40:47"}
{"current_steps": 2750, "total_steps": 4116, "loss": 0.125, "lr": 1.2002899894819548e-05, "epoch": 4.6768707482993195, "percentage": 66.81, "elapsed_time": "3:22:08", "remaining_time": "1:40:24"}
{"current_steps": 2755, "total_steps": 4116, "loss": 0.1318, "lr": 1.1925231327157288e-05, "epoch": 4.685374149659864, "percentage": 66.93, "elapsed_time": "3:22:32", "remaining_time": "1:40:03"}
{"current_steps": 2760, "total_steps": 4116, "loss": 0.1411, "lr": 1.1847707979763934e-05, "epoch": 4.6938775510204085, "percentage": 67.06, "elapsed_time": "3:22:55", "remaining_time": "1:39:41"}
{"current_steps": 2765, "total_steps": 4116, "loss": 0.1252, "lr": 1.1770331246854211e-05, "epoch": 4.7023809523809526, "percentage": 67.18, "elapsed_time": "3:23:11", "remaining_time": "1:39:16"}
{"current_steps": 2770, "total_steps": 4116, "loss": 0.1222, "lr": 1.169310252000605e-05, "epoch": 4.710884353741497, "percentage": 67.3, "elapsed_time": "3:23:32", "remaining_time": "1:38:54"}
{"current_steps": 2775, "total_steps": 4116, "loss": 0.1259, "lr": 1.1616023188135594e-05, "epoch": 4.719387755102041, "percentage": 67.42, "elapsed_time": "3:23:51", "remaining_time": "1:38:30"}
{"current_steps": 2780, "total_steps": 4116, "loss": 0.1364, "lr": 1.1539094637472181e-05, "epoch": 4.727891156462585, "percentage": 67.54, "elapsed_time": "3:24:10", "remaining_time": "1:38:07"}
{"current_steps": 2785, "total_steps": 4116, "loss": 0.1375, "lr": 1.1462318251533442e-05, "epoch": 4.736394557823129, "percentage": 67.66, "elapsed_time": "3:24:32", "remaining_time": "1:37:45"}
{"current_steps": 2790, "total_steps": 4116, "loss": 0.1179, "lr": 1.1385695411100424e-05, "epoch": 4.744897959183674, "percentage": 67.78, "elapsed_time": "3:24:50", "remaining_time": "1:37:21"}
{"current_steps": 2795, "total_steps": 4116, "loss": 0.1189, "lr": 1.1309227494192703e-05, "epoch": 4.753401360544218, "percentage": 67.91, "elapsed_time": "3:25:09", "remaining_time": "1:36:57"}
{"current_steps": 2800, "total_steps": 4116, "loss": 0.1433, "lr": 1.1232915876043673e-05, "epoch": 4.761904761904762, "percentage": 68.03, "elapsed_time": "3:25:35", "remaining_time": "1:36:37"}
{"current_steps": 2805, "total_steps": 4116, "loss": 0.1469, "lr": 1.1156761929075777e-05, "epoch": 4.770408163265306, "percentage": 68.15, "elapsed_time": "3:25:58", "remaining_time": "1:36:16"}
{"current_steps": 2810, "total_steps": 4116, "loss": 0.1257, "lr": 1.1080767022875816e-05, "epoch": 4.77891156462585, "percentage": 68.27, "elapsed_time": "3:26:18", "remaining_time": "1:35:53"}
{"current_steps": 2815, "total_steps": 4116, "loss": 0.1308, "lr": 1.100493252417035e-05, "epoch": 4.787414965986395, "percentage": 68.39, "elapsed_time": "3:26:42", "remaining_time": "1:35:32"}
{"current_steps": 2820, "total_steps": 4116, "loss": 0.1401, "lr": 1.0929259796801075e-05, "epoch": 4.795918367346939, "percentage": 68.51, "elapsed_time": "3:27:04", "remaining_time": "1:35:10"}
{"current_steps": 2825, "total_steps": 4116, "loss": 0.1308, "lr": 1.085375020170034e-05, "epoch": 4.804421768707483, "percentage": 68.63, "elapsed_time": "3:27:25", "remaining_time": "1:34:47"}
{"current_steps": 2830, "total_steps": 4116, "loss": 0.1198, "lr": 1.0778405096866624e-05, "epoch": 4.812925170068027, "percentage": 68.76, "elapsed_time": "3:27:49", "remaining_time": "1:34:26"}
{"current_steps": 2835, "total_steps": 4116, "loss": 0.1299, "lr": 1.0703225837340166e-05, "epoch": 4.821428571428571, "percentage": 68.88, "elapsed_time": "3:28:12", "remaining_time": "1:34:04"}
{"current_steps": 2840, "total_steps": 4116, "loss": 0.1218, "lr": 1.0628213775178561e-05, "epoch": 4.829931972789115, "percentage": 69.0, "elapsed_time": "3:28:37", "remaining_time": "1:33:44"}
{"current_steps": 2845, "total_steps": 4116, "loss": 0.1408, "lr": 1.055337025943244e-05, "epoch": 4.83843537414966, "percentage": 69.12, "elapsed_time": "3:28:59", "remaining_time": "1:33:21"}
{"current_steps": 2850, "total_steps": 4116, "loss": 0.1236, "lr": 1.0478696636121231e-05, "epoch": 4.846938775510204, "percentage": 69.24, "elapsed_time": "3:29:17", "remaining_time": "1:32:58"}
{"current_steps": 2855, "total_steps": 4116, "loss": 0.1326, "lr": 1.0404194248208934e-05, "epoch": 4.855442176870748, "percentage": 69.36, "elapsed_time": "3:29:38", "remaining_time": "1:32:35"}
{"current_steps": 2860, "total_steps": 4116, "loss": 0.1354, "lr": 1.0329864435579982e-05, "epoch": 4.863945578231292, "percentage": 69.48, "elapsed_time": "3:29:59", "remaining_time": "1:32:13"}
{"current_steps": 2865, "total_steps": 4116, "loss": 0.1287, "lr": 1.0255708535015137e-05, "epoch": 4.872448979591836, "percentage": 69.61, "elapsed_time": "3:30:24", "remaining_time": "1:31:52"}
{"current_steps": 2870, "total_steps": 4116, "loss": 0.1256, "lr": 1.0181727880167428e-05, "epoch": 4.880952380952381, "percentage": 69.73, "elapsed_time": "3:30:45", "remaining_time": "1:31:30"}
{"current_steps": 2875, "total_steps": 4116, "loss": 0.1276, "lr": 1.0107923801538215e-05, "epoch": 4.889455782312925, "percentage": 69.85, "elapsed_time": "3:31:04", "remaining_time": "1:31:06"}
{"current_steps": 2880, "total_steps": 4116, "loss": 0.1304, "lr": 1.003429762645321e-05, "epoch": 4.8979591836734695, "percentage": 69.97, "elapsed_time": "3:31:25", "remaining_time": "1:30:44"}
{"current_steps": 2885, "total_steps": 4116, "loss": 0.1307, "lr": 9.960850679038632e-06, "epoch": 4.906462585034014, "percentage": 70.09, "elapsed_time": "3:31:46", "remaining_time": "1:30:21"}
{"current_steps": 2890, "total_steps": 4116, "loss": 0.1363, "lr": 9.887584280197411e-06, "epoch": 4.914965986394558, "percentage": 70.21, "elapsed_time": "3:32:12", "remaining_time": "1:30:01"}
{"current_steps": 2895, "total_steps": 4116, "loss": 0.113, "lr": 9.814499747585373e-06, "epoch": 4.923469387755102, "percentage": 70.34, "elapsed_time": "3:32:33", "remaining_time": "1:29:39"}
{"current_steps": 2900, "total_steps": 4116, "loss": 0.1315, "lr": 9.741598395587606e-06, "epoch": 4.931972789115647, "percentage": 70.46, "elapsed_time": "3:32:55", "remaining_time": "1:29:16"}
{"current_steps": 2905, "total_steps": 4116, "loss": 0.1255, "lr": 9.668881535294792e-06, "epoch": 4.940476190476191, "percentage": 70.58, "elapsed_time": "3:33:14", "remaining_time": "1:28:53"}
{"current_steps": 2910, "total_steps": 4116, "loss": 0.1296, "lr": 9.596350474479625e-06, "epoch": 4.948979591836735, "percentage": 70.7, "elapsed_time": "3:33:36", "remaining_time": "1:28:31"}
{"current_steps": 2915, "total_steps": 4116, "loss": 0.1268, "lr": 9.52400651757331e-06, "epoch": 4.957482993197279, "percentage": 70.82, "elapsed_time": "3:33:58", "remaining_time": "1:28:09"}
{"current_steps": 2920, "total_steps": 4116, "loss": 0.1251, "lr": 9.451850965642073e-06, "epoch": 4.965986394557823, "percentage": 70.94, "elapsed_time": "3:34:21", "remaining_time": "1:27:47"}
{"current_steps": 2925, "total_steps": 4116, "loss": 0.1331, "lr": 9.379885116363786e-06, "epoch": 4.974489795918368, "percentage": 71.06, "elapsed_time": "3:34:41", "remaining_time": "1:27:25"}
{"current_steps": 2930, "total_steps": 4116, "loss": 0.1287, "lr": 9.308110264004625e-06, "epoch": 4.982993197278912, "percentage": 71.19, "elapsed_time": "3:34:58", "remaining_time": "1:27:01"}
{"current_steps": 2935, "total_steps": 4116, "loss": 0.1286, "lr": 9.236527699395803e-06, "epoch": 4.991496598639456, "percentage": 71.31, "elapsed_time": "3:35:16", "remaining_time": "1:26:37"}
{"current_steps": 2940, "total_steps": 4116, "loss": 0.1223, "lr": 9.165138709910338e-06, "epoch": 5.0, "percentage": 71.43, "elapsed_time": "3:35:39", "remaining_time": "1:26:15"}
{"current_steps": 2945, "total_steps": 4116, "loss": 0.1115, "lr": 9.093944579439882e-06, "epoch": 5.008503401360544, "percentage": 71.55, "elapsed_time": "3:36:00", "remaining_time": "1:25:53"}
{"current_steps": 2950, "total_steps": 4116, "loss": 0.1143, "lr": 9.022946588371688e-06, "epoch": 5.017006802721088, "percentage": 71.67, "elapsed_time": "3:36:19", "remaining_time": "1:25:30"}
{"current_steps": 2955, "total_steps": 4116, "loss": 0.1291, "lr": 8.952146013565532e-06, "epoch": 5.025510204081633, "percentage": 71.79, "elapsed_time": "3:36:37", "remaining_time": "1:25:06"}
{"current_steps": 2960, "total_steps": 4116, "loss": 0.124, "lr": 8.881544128330777e-06, "epoch": 5.034013605442177, "percentage": 71.91, "elapsed_time": "3:36:59", "remaining_time": "1:24:44"}
{"current_steps": 2965, "total_steps": 4116, "loss": 0.1163, "lr": 8.81114220240346e-06, "epoch": 5.042517006802721, "percentage": 72.04, "elapsed_time": "3:37:23", "remaining_time": "1:24:23"}
{"current_steps": 2970, "total_steps": 4116, "loss": 0.1104, "lr": 8.740941501923454e-06, "epoch": 5.051020408163265, "percentage": 72.16, "elapsed_time": "3:37:45", "remaining_time": "1:24:01"}
{"current_steps": 2975, "total_steps": 4116, "loss": 0.1076, "lr": 8.6709432894117e-06, "epoch": 5.059523809523809, "percentage": 72.28, "elapsed_time": "3:38:06", "remaining_time": "1:23:38"}
{"current_steps": 2980, "total_steps": 4116, "loss": 0.1112, "lr": 8.601148823747516e-06, "epoch": 5.068027210884353, "percentage": 72.4, "elapsed_time": "3:38:25", "remaining_time": "1:23:16"}
{"current_steps": 2985, "total_steps": 4116, "loss": 0.1139, "lr": 8.531559360145942e-06, "epoch": 5.076530612244898, "percentage": 72.52, "elapsed_time": "3:38:47", "remaining_time": "1:22:53"}
{"current_steps": 2990, "total_steps": 4116, "loss": 0.1356, "lr": 8.462176150135168e-06, "epoch": 5.085034013605442, "percentage": 72.64, "elapsed_time": "3:39:06", "remaining_time": "1:22:30"}
{"current_steps": 2995, "total_steps": 4116, "loss": 0.1218, "lr": 8.393000441534036e-06, "epoch": 5.093537414965986, "percentage": 72.76, "elapsed_time": "3:39:24", "remaining_time": "1:22:07"}
{"current_steps": 3000, "total_steps": 4116, "loss": 0.1143, "lr": 8.324033478429559e-06, "epoch": 5.1020408163265305, "percentage": 72.89, "elapsed_time": "3:39:42", "remaining_time": "1:21:43"}
{"current_steps": 3005, "total_steps": 4116, "loss": 0.1297, "lr": 8.25527650115461e-06, "epoch": 5.110544217687075, "percentage": 73.01, "elapsed_time": "3:40:13", "remaining_time": "1:21:25"}
{"current_steps": 3010, "total_steps": 4116, "loss": 0.1186, "lr": 8.186730746265577e-06, "epoch": 5.119047619047619, "percentage": 73.13, "elapsed_time": "3:40:36", "remaining_time": "1:21:03"}
{"current_steps": 3015, "total_steps": 4116, "loss": 0.1209, "lr": 8.118397446520103e-06, "epoch": 5.127551020408164, "percentage": 73.25, "elapsed_time": "3:40:59", "remaining_time": "1:20:41"}
{"current_steps": 3020, "total_steps": 4116, "loss": 0.1177, "lr": 8.050277830854984e-06, "epoch": 5.136054421768708, "percentage": 73.37, "elapsed_time": "3:41:23", "remaining_time": "1:20:20"}
{"current_steps": 3025, "total_steps": 4116, "loss": 0.1125, "lr": 7.982373124363984e-06, "epoch": 5.144557823129252, "percentage": 73.49, "elapsed_time": "3:41:47", "remaining_time": "1:19:59"}
{"current_steps": 3030, "total_steps": 4116, "loss": 0.1094, "lr": 7.914684548275875e-06, "epoch": 5.153061224489796, "percentage": 73.62, "elapsed_time": "3:42:10", "remaining_time": "1:19:37"}
{"current_steps": 3035, "total_steps": 4116, "loss": 0.1271, "lr": 7.847213319932425e-06, "epoch": 5.16156462585034, "percentage": 73.74, "elapsed_time": "3:42:31", "remaining_time": "1:19:15"}
{"current_steps": 3040, "total_steps": 4116, "loss": 0.1247, "lr": 7.779960652766534e-06, "epoch": 5.170068027210885, "percentage": 73.86, "elapsed_time": "3:42:55", "remaining_time": "1:18:54"}
{"current_steps": 3045, "total_steps": 4116, "loss": 0.1151, "lr": 7.7129277562804e-06, "epoch": 5.178571428571429, "percentage": 73.98, "elapsed_time": "3:43:13", "remaining_time": "1:18:30"}
{"current_steps": 3050, "total_steps": 4116, "loss": 0.1157, "lr": 7.646115836023744e-06, "epoch": 5.187074829931973, "percentage": 74.1, "elapsed_time": "3:43:35", "remaining_time": "1:18:08"}
{"current_steps": 3055, "total_steps": 4116, "loss": 0.1177, "lr": 7.5795260935721755e-06, "epoch": 5.195578231292517, "percentage": 74.22, "elapsed_time": "3:43:58", "remaining_time": "1:17:47"}
{"current_steps": 3060, "total_steps": 4116, "loss": 0.1146, "lr": 7.5131597265055585e-06, "epoch": 5.204081632653061, "percentage": 74.34, "elapsed_time": "3:44:19", "remaining_time": "1:17:24"}
{"current_steps": 3065, "total_steps": 4116, "loss": 0.1291, "lr": 7.447017928386453e-06, "epoch": 5.212585034013605, "percentage": 74.47, "elapsed_time": "3:44:40", "remaining_time": "1:17:02"}
{"current_steps": 3070, "total_steps": 4116, "loss": 0.1172, "lr": 7.381101888738693e-06, "epoch": 5.22108843537415, "percentage": 74.59, "elapsed_time": "3:44:56", "remaining_time": "1:16:38"}
{"current_steps": 3075, "total_steps": 4116, "loss": 0.1191, "lr": 7.315412793025951e-06, "epoch": 5.229591836734694, "percentage": 74.71, "elapsed_time": "3:45:16", "remaining_time": "1:16:15"}
{"current_steps": 3080, "total_steps": 4116, "loss": 0.1125, "lr": 7.24995182263045e-06, "epoch": 5.238095238095238, "percentage": 74.83, "elapsed_time": "3:45:38", "remaining_time": "1:15:53"}
{"current_steps": 3085, "total_steps": 4116, "loss": 0.1371, "lr": 7.184720154831706e-06, "epoch": 5.246598639455782, "percentage": 74.95, "elapsed_time": "3:46:06", "remaining_time": "1:15:33"}
{"current_steps": 3090, "total_steps": 4116, "loss": 0.11, "lr": 7.119718962785349e-06, "epoch": 5.255102040816326, "percentage": 75.07, "elapsed_time": "3:46:26", "remaining_time": "1:15:11"}
{"current_steps": 3095, "total_steps": 4116, "loss": 0.1168, "lr": 7.054949415502037e-06, "epoch": 5.263605442176871, "percentage": 75.19, "elapsed_time": "3:46:50", "remaining_time": "1:14:49"}
{"current_steps": 3100, "total_steps": 4116, "loss": 0.1269, "lr": 6.990412677826404e-06, "epoch": 5.272108843537415, "percentage": 75.32, "elapsed_time": "3:47:12", "remaining_time": "1:14:27"}
{"current_steps": 3105, "total_steps": 4116, "loss": 0.1059, "lr": 6.926109910416153e-06, "epoch": 5.280612244897959, "percentage": 75.44, "elapsed_time": "3:47:36", "remaining_time": "1:14:06"}
{"current_steps": 3110, "total_steps": 4116, "loss": 0.1084, "lr": 6.862042269721158e-06, "epoch": 5.289115646258503, "percentage": 75.56, "elapsed_time": "3:47:57", "remaining_time": "1:13:44"}
{"current_steps": 3115, "total_steps": 4116, "loss": 0.1177, "lr": 6.798210907962655e-06, "epoch": 5.2976190476190474, "percentage": 75.68, "elapsed_time": "3:48:21", "remaining_time": "1:13:22"}
{"current_steps": 3120, "total_steps": 4116, "loss": 0.1044, "lr": 6.734616973112549e-06, "epoch": 5.3061224489795915, "percentage": 75.8, "elapsed_time": "3:48:42", "remaining_time": "1:13:00"}
{"current_steps": 3125, "total_steps": 4116, "loss": 0.1082, "lr": 6.671261608872737e-06, "epoch": 5.3146258503401365, "percentage": 75.92, "elapsed_time": "3:49:04", "remaining_time": "1:12:38"}
{"current_steps": 3130, "total_steps": 4116, "loss": 0.1196, "lr": 6.60814595465457e-06, "epoch": 5.3231292517006805, "percentage": 76.04, "elapsed_time": "3:49:24", "remaining_time": "1:12:15"}
{"current_steps": 3135, "total_steps": 4116, "loss": 0.1212, "lr": 6.545271145558336e-06, "epoch": 5.331632653061225, "percentage": 76.17, "elapsed_time": "3:49:49", "remaining_time": "1:11:54"}
{"current_steps": 3140, "total_steps": 4116, "loss": 0.1162, "lr": 6.482638312352867e-06, "epoch": 5.340136054421769, "percentage": 76.29, "elapsed_time": "3:50:08", "remaining_time": "1:11:31"}
{"current_steps": 3145, "total_steps": 4116, "loss": 0.112, "lr": 6.420248581455184e-06, "epoch": 5.348639455782313, "percentage": 76.41, "elapsed_time": "3:50:26", "remaining_time": "1:11:08"}
{"current_steps": 3150, "total_steps": 4116, "loss": 0.1331, "lr": 6.358103074910238e-06, "epoch": 5.357142857142857, "percentage": 76.53, "elapsed_time": "3:50:49", "remaining_time": "1:10:47"}
{"current_steps": 3155, "total_steps": 4116, "loss": 0.1215, "lr": 6.296202910370757e-06, "epoch": 5.365646258503402, "percentage": 76.65, "elapsed_time": "3:51:13", "remaining_time": "1:10:25"}
{"current_steps": 3160, "total_steps": 4116, "loss": 0.1104, "lr": 6.234549201077124e-06, "epoch": 5.374149659863946, "percentage": 76.77, "elapsed_time": "3:51:33", "remaining_time": "1:10:03"}
{"current_steps": 3165, "total_steps": 4116, "loss": 0.1092, "lr": 6.173143055837345e-06, "epoch": 5.38265306122449, "percentage": 76.9, "elapsed_time": "3:51:51", "remaining_time": "1:09:40"}
{"current_steps": 3170, "total_steps": 4116, "loss": 0.1249, "lr": 6.111985579007149e-06, "epoch": 5.391156462585034, "percentage": 77.02, "elapsed_time": "3:52:15", "remaining_time": "1:09:18"}
{"current_steps": 3175, "total_steps": 4116, "loss": 0.1168, "lr": 6.051077870470068e-06, "epoch": 5.399659863945578, "percentage": 77.14, "elapsed_time": "3:52:38", "remaining_time": "1:08:57"}
{"current_steps": 3180, "total_steps": 4116, "loss": 0.1102, "lr": 5.990421025617716e-06, "epoch": 5.408163265306122, "percentage": 77.26, "elapsed_time": "3:52:58", "remaining_time": "1:08:34"}
{"current_steps": 3185, "total_steps": 4116, "loss": 0.1184, "lr": 5.930016135330052e-06, "epoch": 5.416666666666667, "percentage": 77.38, "elapsed_time": "3:53:26", "remaining_time": "1:08:14"}
{"current_steps": 3190, "total_steps": 4116, "loss": 0.1304, "lr": 5.869864285955771e-06, "epoch": 5.425170068027211, "percentage": 77.5, "elapsed_time": "3:53:48", "remaining_time": "1:07:52"}
{"current_steps": 3195, "total_steps": 4116, "loss": 0.1189, "lr": 5.80996655929277e-06, "epoch": 5.433673469387755, "percentage": 77.62, "elapsed_time": "3:54:09", "remaining_time": "1:07:30"}
{"current_steps": 3200, "total_steps": 4116, "loss": 0.1206, "lr": 5.750324032568677e-06, "epoch": 5.442176870748299, "percentage": 77.75, "elapsed_time": "3:54:31", "remaining_time": "1:07:08"}
{"current_steps": 3205, "total_steps": 4116, "loss": 0.1202, "lr": 5.6909377784215035e-06, "epoch": 5.450680272108843, "percentage": 77.87, "elapsed_time": "3:54:51", "remaining_time": "1:06:45"}
{"current_steps": 3210, "total_steps": 4116, "loss": 0.1241, "lr": 5.631808864880342e-06, "epoch": 5.459183673469388, "percentage": 77.99, "elapsed_time": "3:55:10", "remaining_time": "1:06:22"}
{"current_steps": 3215, "total_steps": 4116, "loss": 0.1141, "lr": 5.5729383553461315e-06, "epoch": 5.467687074829932, "percentage": 78.11, "elapsed_time": "3:55:29", "remaining_time": "1:05:59"}
{"current_steps": 3220, "total_steps": 4116, "loss": 0.1397, "lr": 5.5143273085725894e-06, "epoch": 5.476190476190476, "percentage": 78.23, "elapsed_time": "3:55:50", "remaining_time": "1:05:37"}
{"current_steps": 3225, "total_steps": 4116, "loss": 0.1118, "lr": 5.45597677864711e-06, "epoch": 5.48469387755102, "percentage": 78.35, "elapsed_time": "3:56:14", "remaining_time": "1:05:16"}
{"current_steps": 3230, "total_steps": 4116, "loss": 0.1219, "lr": 5.397887814971854e-06, "epoch": 5.493197278911564, "percentage": 78.47, "elapsed_time": "3:56:37", "remaining_time": "1:04:54"}
{"current_steps": 3235, "total_steps": 4116, "loss": 0.1252, "lr": 5.34006146224485e-06, "epoch": 5.5017006802721085, "percentage": 78.6, "elapsed_time": "3:56:59", "remaining_time": "1:04:32"}
{"current_steps": 3240, "total_steps": 4116, "loss": 0.1216, "lr": 5.282498760441219e-06, "epoch": 5.510204081632653, "percentage": 78.72, "elapsed_time": "3:57:25", "remaining_time": "1:04:11"}
{"current_steps": 3245, "total_steps": 4116, "loss": 0.1414, "lr": 5.225200744794465e-06, "epoch": 5.5187074829931975, "percentage": 78.84, "elapsed_time": "3:57:48", "remaining_time": "1:03:49"}
{"current_steps": 3250, "total_steps": 4116, "loss": 0.1297, "lr": 5.168168445777839e-06, "epoch": 5.5272108843537415, "percentage": 78.96, "elapsed_time": "3:58:09", "remaining_time": "1:03:27"}
{"current_steps": 3255, "total_steps": 4116, "loss": 0.1178, "lr": 5.111402889085852e-06, "epoch": 5.535714285714286, "percentage": 79.08, "elapsed_time": "3:58:36", "remaining_time": "1:03:07"}
{"current_steps": 3260, "total_steps": 4116, "loss": 0.1156, "lr": 5.054905095615779e-06, "epoch": 5.54421768707483, "percentage": 79.2, "elapsed_time": "3:59:00", "remaining_time": "1:02:45"}
{"current_steps": 3265, "total_steps": 4116, "loss": 0.1151, "lr": 4.9986760814493315e-06, "epoch": 5.552721088435375, "percentage": 79.32, "elapsed_time": "3:59:18", "remaining_time": "1:02:22"}
{"current_steps": 3270, "total_steps": 4116, "loss": 0.1285, "lr": 4.94271685783438e-06, "epoch": 5.561224489795919, "percentage": 79.45, "elapsed_time": "3:59:42", "remaining_time": "1:02:00"}
{"current_steps": 3275, "total_steps": 4116, "loss": 0.1108, "lr": 4.88702843116674e-06, "epoch": 5.569727891156463, "percentage": 79.57, "elapsed_time": "4:00:07", "remaining_time": "1:01:39"}
{"current_steps": 3280, "total_steps": 4116, "loss": 0.1091, "lr": 4.831611802972112e-06, "epoch": 5.578231292517007, "percentage": 79.69, "elapsed_time": "4:00:26", "remaining_time": "1:01:17"}
{"current_steps": 3285, "total_steps": 4116, "loss": 0.1404, "lr": 4.776467969888041e-06, "epoch": 5.586734693877551, "percentage": 79.81, "elapsed_time": "4:00:43", "remaining_time": "1:00:53"}
{"current_steps": 3290, "total_steps": 4116, "loss": 0.1218, "lr": 4.721597923646008e-06, "epoch": 5.595238095238095, "percentage": 79.93, "elapsed_time": "4:01:06", "remaining_time": "1:00:31"}
{"current_steps": 3295, "total_steps": 4116, "loss": 0.1103, "lr": 4.667002651053582e-06, "epoch": 5.603741496598639, "percentage": 80.05, "elapsed_time": "4:01:27", "remaining_time": "1:00:09"}
{"current_steps": 3300, "total_steps": 4116, "loss": 0.128, "lr": 4.612683133976692e-06, "epoch": 5.612244897959184, "percentage": 80.17, "elapsed_time": "4:01:51", "remaining_time": "0:59:48"}
{"current_steps": 3305, "total_steps": 4116, "loss": 0.1159, "lr": 4.5586403493219365e-06, "epoch": 5.620748299319728, "percentage": 80.3, "elapsed_time": "4:02:12", "remaining_time": "0:59:25"}
{"current_steps": 3310, "total_steps": 4116, "loss": 0.118, "lr": 4.504875269019038e-06, "epoch": 5.629251700680272, "percentage": 80.42, "elapsed_time": "4:02:34", "remaining_time": "0:59:04"}
{"current_steps": 3315, "total_steps": 4116, "loss": 0.119, "lr": 4.451388860003374e-06, "epoch": 5.637755102040816, "percentage": 80.54, "elapsed_time": "4:02:54", "remaining_time": "0:58:41"}
{"current_steps": 3320, "total_steps": 4116, "loss": 0.1315, "lr": 4.398182084198561e-06, "epoch": 5.646258503401361, "percentage": 80.66, "elapsed_time": "4:03:17", "remaining_time": "0:58:19"}
{"current_steps": 3325, "total_steps": 4116, "loss": 0.1296, "lr": 4.345255898499172e-06, "epoch": 5.654761904761905, "percentage": 80.78, "elapsed_time": "4:03:38", "remaining_time": "0:57:57"}
{"current_steps": 3330, "total_steps": 4116, "loss": 0.1173, "lr": 4.292611254753509e-06, "epoch": 5.663265306122449, "percentage": 80.9, "elapsed_time": "4:03:59", "remaining_time": "0:57:35"}
{"current_steps": 3335, "total_steps": 4116, "loss": 0.1261, "lr": 4.24024909974651e-06, "epoch": 5.671768707482993, "percentage": 81.03, "elapsed_time": "4:04:20", "remaining_time": "0:57:13"}
{"current_steps": 3340, "total_steps": 4116, "loss": 0.1214, "lr": 4.188170375182705e-06, "epoch": 5.680272108843537, "percentage": 81.15, "elapsed_time": "4:04:42", "remaining_time": "0:56:51"}
{"current_steps": 3345, "total_steps": 4116, "loss": 0.1384, "lr": 4.136376017669281e-06, "epoch": 5.688775510204081, "percentage": 81.27, "elapsed_time": "4:05:03", "remaining_time": "0:56:29"}
{"current_steps": 3350, "total_steps": 4116, "loss": 0.1187, "lr": 4.084866958699247e-06, "epoch": 5.697278911564625, "percentage": 81.39, "elapsed_time": "4:05:23", "remaining_time": "0:56:06"}
{"current_steps": 3355, "total_steps": 4116, "loss": 0.121, "lr": 4.0336441246346684e-06, "epoch": 5.70578231292517, "percentage": 81.51, "elapsed_time": "4:05:41", "remaining_time": "0:55:43"}
{"current_steps": 3360, "total_steps": 4116, "loss": 0.112, "lr": 3.982708436690001e-06, "epoch": 5.714285714285714, "percentage": 81.63, "elapsed_time": "4:06:01", "remaining_time": "0:55:21"}
{"current_steps": 3365, "total_steps": 4116, "loss": 0.1298, "lr": 3.932060810915563e-06, "epoch": 5.7227891156462585, "percentage": 81.75, "elapsed_time": "4:06:29", "remaining_time": "0:55:00"}
{"current_steps": 3370, "total_steps": 4116, "loss": 0.1127, "lr": 3.881702158181015e-06, "epoch": 5.7312925170068025, "percentage": 81.88, "elapsed_time": "4:06:52", "remaining_time": "0:54:39"}
{"current_steps": 3375, "total_steps": 4116, "loss": 0.114, "lr": 3.831633384159006e-06, "epoch": 5.739795918367347, "percentage": 82.0, "elapsed_time": "4:07:12", "remaining_time": "0:54:16"}
{"current_steps": 3380, "total_steps": 4116, "loss": 0.1228, "lr": 3.7818553893088596e-06, "epoch": 5.7482993197278915, "percentage": 82.12, "elapsed_time": "4:07:38", "remaining_time": "0:53:55"}
{"current_steps": 3385, "total_steps": 4116, "loss": 0.1137, "lr": 3.7323690688604153e-06, "epoch": 5.756802721088436, "percentage": 82.24, "elapsed_time": "4:07:57", "remaining_time": "0:53:32"}
{"current_steps": 3390, "total_steps": 4116, "loss": 0.1125, "lr": 3.6831753127978974e-06, "epoch": 5.76530612244898, "percentage": 82.36, "elapsed_time": "4:08:18", "remaining_time": "0:53:10"}
{"current_steps": 3395, "total_steps": 4116, "loss": 0.1109, "lr": 3.634275005843935e-06, "epoch": 5.773809523809524, "percentage": 82.48, "elapsed_time": "4:08:37", "remaining_time": "0:52:48"}
{"current_steps": 3400, "total_steps": 4116, "loss": 0.1164, "lr": 3.585669027443608e-06, "epoch": 5.782312925170068, "percentage": 82.6, "elapsed_time": "4:08:58", "remaining_time": "0:52:25"}
{"current_steps": 3405, "total_steps": 4116, "loss": 0.1298, "lr": 3.537358251748695e-06, "epoch": 5.790816326530612, "percentage": 82.73, "elapsed_time": "4:09:17", "remaining_time": "0:52:03"}
{"current_steps": 3410, "total_steps": 4116, "loss": 0.1193, "lr": 3.489343547601882e-06, "epoch": 5.799319727891157, "percentage": 82.85, "elapsed_time": "4:09:40", "remaining_time": "0:51:41"}
{"current_steps": 3415, "total_steps": 4116, "loss": 0.1133, "lr": 3.4416257785211983e-06, "epoch": 5.807823129251701, "percentage": 82.97, "elapsed_time": "4:10:02", "remaining_time": "0:51:19"}
{"current_steps": 3420, "total_steps": 4116, "loss": 0.126, "lr": 3.3942058026844472e-06, "epoch": 5.816326530612245, "percentage": 83.09, "elapsed_time": "4:10:19", "remaining_time": "0:50:56"}
{"current_steps": 3425, "total_steps": 4116, "loss": 0.1217, "lr": 3.3470844729137886e-06, "epoch": 5.824829931972789, "percentage": 83.21, "elapsed_time": "4:10:43", "remaining_time": "0:50:35"}
{"current_steps": 3430, "total_steps": 4116, "loss": 0.1225, "lr": 3.3002626366603896e-06, "epoch": 5.833333333333333, "percentage": 83.33, "elapsed_time": "4:11:01", "remaining_time": "0:50:12"}
{"current_steps": 3435, "total_steps": 4116, "loss": 0.1101, "lr": 3.253741135989199e-06, "epoch": 5.841836734693878, "percentage": 83.45, "elapsed_time": "4:11:20", "remaining_time": "0:49:49"}
{"current_steps": 3440, "total_steps": 4116, "loss": 0.1518, "lr": 3.2075208075637953e-06, "epoch": 5.850340136054422, "percentage": 83.58, "elapsed_time": "4:11:47", "remaining_time": "0:49:28"}
{"current_steps": 3445, "total_steps": 4116, "loss": 0.1162, "lr": 3.161602482631343e-06, "epoch": 5.858843537414966, "percentage": 83.7, "elapsed_time": "4:12:13", "remaining_time": "0:49:07"}
{"current_steps": 3450, "total_steps": 4116, "loss": 0.1189, "lr": 3.115986987007622e-06, "epoch": 5.86734693877551, "percentage": 83.82, "elapsed_time": "4:12:35", "remaining_time": "0:48:45"}
{"current_steps": 3455, "total_steps": 4116, "loss": 0.116, "lr": 3.0706751410622184e-06, "epoch": 5.875850340136054, "percentage": 83.94, "elapsed_time": "4:12:53", "remaining_time": "0:48:22"}
{"current_steps": 3460, "total_steps": 4116, "loss": 0.1186, "lr": 3.0256677597037253e-06, "epoch": 5.884353741496598, "percentage": 84.06, "elapsed_time": "4:13:12", "remaining_time": "0:48:00"}
{"current_steps": 3465, "total_steps": 4116, "loss": 0.1219, "lr": 2.980965652365122e-06, "epoch": 5.892857142857143, "percentage": 84.18, "elapsed_time": "4:13:31", "remaining_time": "0:47:37"}
{"current_steps": 3470, "total_steps": 4116, "loss": 0.1044, "lr": 2.936569622989198e-06, "epoch": 5.901360544217687, "percentage": 84.31, "elapsed_time": "4:13:53", "remaining_time": "0:47:15"}
{"current_steps": 3475, "total_steps": 4116, "loss": 0.1173, "lr": 2.892480470014101e-06, "epoch": 5.909863945578231, "percentage": 84.43, "elapsed_time": "4:14:17", "remaining_time": "0:46:54"}
{"current_steps": 3480, "total_steps": 4116, "loss": 0.1261, "lr": 2.8486989863589665e-06, "epoch": 5.918367346938775, "percentage": 84.55, "elapsed_time": "4:14:38", "remaining_time": "0:46:32"}
{"current_steps": 3485, "total_steps": 4116, "loss": 0.1098, "lr": 2.8052259594096786e-06, "epoch": 5.9268707482993195, "percentage": 84.67, "elapsed_time": "4:15:00", "remaining_time": "0:46:10"}
{"current_steps": 3490, "total_steps": 4116, "loss": 0.1169, "lr": 2.762062171004689e-06, "epoch": 5.935374149659864, "percentage": 84.79, "elapsed_time": "4:15:21", "remaining_time": "0:45:48"}
{"current_steps": 3495, "total_steps": 4116, "loss": 0.1135, "lr": 2.719208397420976e-06, "epoch": 5.9438775510204085, "percentage": 84.91, "elapsed_time": "4:15:44", "remaining_time": "0:45:26"}
{"current_steps": 3500, "total_steps": 4116, "loss": 0.1138, "lr": 2.6766654093600554e-06, "epoch": 5.9523809523809526, "percentage": 85.03, "elapsed_time": "4:16:05", "remaining_time": "0:45:04"}
{"current_steps": 3505, "total_steps": 4116, "loss": 0.1291, "lr": 2.634433971934145e-06, "epoch": 5.960884353741497, "percentage": 85.16, "elapsed_time": "4:16:27", "remaining_time": "0:44:42"}
{"current_steps": 3510, "total_steps": 4116, "loss": 0.1157, "lr": 2.592514844652396e-06, "epoch": 5.969387755102041, "percentage": 85.28, "elapsed_time": "4:16:47", "remaining_time": "0:44:20"}
{"current_steps": 3515, "total_steps": 4116, "loss": 0.1139, "lr": 2.5509087814072373e-06, "epoch": 5.977891156462585, "percentage": 85.4, "elapsed_time": "4:17:10", "remaining_time": "0:43:58"}
{"current_steps": 3520, "total_steps": 4116, "loss": 0.129, "lr": 2.5096165304608055e-06, "epoch": 5.986394557823129, "percentage": 85.52, "elapsed_time": "4:17:30", "remaining_time": "0:43:36"}
{"current_steps": 3525, "total_steps": 4116, "loss": 0.1137, "lr": 2.468638834431507e-06, "epoch": 5.994897959183674, "percentage": 85.64, "elapsed_time": "4:17:53", "remaining_time": "0:43:14"}
{"current_steps": 3530, "total_steps": 4116, "loss": 0.1096, "lr": 2.427976430280634e-06, "epoch": 6.003401360544218, "percentage": 85.76, "elapsed_time": "4:18:17", "remaining_time": "0:42:52"}
{"current_steps": 3535, "total_steps": 4116, "loss": 0.1012, "lr": 2.3876300492991456e-06, "epoch": 6.011904761904762, "percentage": 85.88, "elapsed_time": "4:18:37", "remaining_time": "0:42:30"}
{"current_steps": 3540, "total_steps": 4116, "loss": 0.1137, "lr": 2.3476004170944843e-06, "epoch": 6.020408163265306, "percentage": 86.01, "elapsed_time": "4:19:01", "remaining_time": "0:42:08"}
{"current_steps": 3545, "total_steps": 4116, "loss": 0.1101, "lr": 2.307888253577557e-06, "epoch": 6.02891156462585, "percentage": 86.13, "elapsed_time": "4:19:24", "remaining_time": "0:41:47"}
{"current_steps": 3550, "total_steps": 4116, "loss": 0.1143, "lr": 2.2684942729497505e-06, "epoch": 6.037414965986395, "percentage": 86.25, "elapsed_time": "4:19:46", "remaining_time": "0:41:25"}
{"current_steps": 3555, "total_steps": 4116, "loss": 0.1118, "lr": 2.229419183690118e-06, "epoch": 6.045918367346939, "percentage": 86.37, "elapsed_time": "4:20:06", "remaining_time": "0:41:02"}
{"current_steps": 3560, "total_steps": 4116, "loss": 0.1058, "lr": 2.1906636885426293e-06, "epoch": 6.054421768707483, "percentage": 86.49, "elapsed_time": "4:20:22", "remaining_time": "0:40:39"}
{"current_steps": 3565, "total_steps": 4116, "loss": 0.1056, "lr": 2.1522284845035246e-06, "epoch": 6.062925170068027, "percentage": 86.61, "elapsed_time": "4:20:44", "remaining_time": "0:40:17"}
{"current_steps": 3570, "total_steps": 4116, "loss": 0.1113, "lr": 2.1141142628087908e-06, "epoch": 6.071428571428571, "percentage": 86.73, "elapsed_time": "4:21:07", "remaining_time": "0:39:56"}
{"current_steps": 3575, "total_steps": 4116, "loss": 0.1226, "lr": 2.0763217089217204e-06, "epoch": 6.079931972789115, "percentage": 86.86, "elapsed_time": "4:21:30", "remaining_time": "0:39:34"}
{"current_steps": 3580, "total_steps": 4116, "loss": 0.1111, "lr": 2.0388515025205912e-06, "epoch": 6.08843537414966, "percentage": 86.98, "elapsed_time": "4:21:50", "remaining_time": "0:39:12"}
{"current_steps": 3585, "total_steps": 4116, "loss": 0.1188, "lr": 2.0017043174864257e-06, "epoch": 6.096938775510204, "percentage": 87.1, "elapsed_time": "4:22:10", "remaining_time": "0:38:50"}
{"current_steps": 3590, "total_steps": 4116, "loss": 0.1239, "lr": 1.9648808218909023e-06, "epoch": 6.105442176870748, "percentage": 87.22, "elapsed_time": "4:22:36", "remaining_time": "0:38:28"}
{"current_steps": 3595, "total_steps": 4116, "loss": 0.1424, "lr": 1.928381677984319e-06, "epoch": 6.113945578231292, "percentage": 87.34, "elapsed_time": "4:23:02", "remaining_time": "0:38:07"}
{"current_steps": 3600, "total_steps": 4116, "loss": 0.1191, "lr": 1.8922075421836729e-06, "epoch": 6.122448979591836, "percentage": 87.46, "elapsed_time": "4:23:25", "remaining_time": "0:37:45"}
{"current_steps": 3605, "total_steps": 4116, "loss": 0.1085, "lr": 1.8563590650608908e-06, "epoch": 6.130952380952381, "percentage": 87.59, "elapsed_time": "4:23:51", "remaining_time": "0:37:24"}
{"current_steps": 3610, "total_steps": 4116, "loss": 0.1208, "lr": 1.8208368913310881e-06, "epoch": 6.139455782312925, "percentage": 87.71, "elapsed_time": "4:24:09", "remaining_time": "0:37:01"}
{"current_steps": 3615, "total_steps": 4116, "loss": 0.1368, "lr": 1.7856416598410064e-06, "epoch": 6.1479591836734695, "percentage": 87.83, "elapsed_time": "4:24:31", "remaining_time": "0:36:39"}
{"current_steps": 3620, "total_steps": 4116, "loss": 0.1143, "lr": 1.7507740035575049e-06, "epoch": 6.156462585034014, "percentage": 87.95, "elapsed_time": "4:24:55", "remaining_time": "0:36:17"}
{"current_steps": 3625, "total_steps": 4116, "loss": 0.1288, "lr": 1.7162345495561905e-06, "epoch": 6.164965986394558, "percentage": 88.07, "elapsed_time": "4:25:20", "remaining_time": "0:35:56"}
{"current_steps": 3630, "total_steps": 4116, "loss": 0.1312, "lr": 1.682023919010125e-06, "epoch": 6.173469387755102, "percentage": 88.19, "elapsed_time": "4:25:42", "remaining_time": "0:35:34"}
{"current_steps": 3635, "total_steps": 4116, "loss": 0.1118, "lr": 1.6481427271786588e-06, "epoch": 6.181972789115647, "percentage": 88.31, "elapsed_time": "4:26:00", "remaining_time": "0:35:11"}
{"current_steps": 3640, "total_steps": 4116, "loss": 0.1153, "lr": 1.6145915833963788e-06, "epoch": 6.190476190476191, "percentage": 88.44, "elapsed_time": "4:26:23", "remaining_time": "0:34:50"}
{"current_steps": 3645, "total_steps": 4116, "loss": 0.1089, "lr": 1.5813710910621293e-06, "epoch": 6.198979591836735, "percentage": 88.56, "elapsed_time": "4:26:37", "remaining_time": "0:34:27"}
{"current_steps": 3650, "total_steps": 4116, "loss": 0.1195, "lr": 1.548481847628176e-06, "epoch": 6.207482993197279, "percentage": 88.68, "elapsed_time": "4:26:58", "remaining_time": "0:34:05"}
{"current_steps": 3655, "total_steps": 4116, "loss": 0.1135, "lr": 1.5159244445894605e-06, "epoch": 6.215986394557823, "percentage": 88.8, "elapsed_time": "4:27:21", "remaining_time": "0:33:43"}
{"current_steps": 3660, "total_steps": 4116, "loss": 0.1075, "lr": 1.4836994674729433e-06, "epoch": 6.224489795918367, "percentage": 88.92, "elapsed_time": "4:27:40", "remaining_time": "0:33:21"}
{"current_steps": 3665, "total_steps": 4116, "loss": 0.1229, "lr": 1.4518074958271e-06, "epoch": 6.232993197278912, "percentage": 89.04, "elapsed_time": "4:28:04", "remaining_time": "0:32:59"}
{"current_steps": 3670, "total_steps": 4116, "loss": 0.1133, "lr": 1.4202491032114797e-06, "epoch": 6.241496598639456, "percentage": 89.16, "elapsed_time": "4:28:24", "remaining_time": "0:32:37"}
{"current_steps": 3675, "total_steps": 4116, "loss": 0.0964, "lr": 1.389024857186403e-06, "epoch": 6.25, "percentage": 89.29, "elapsed_time": "4:28:45", "remaining_time": "0:32:15"}
{"current_steps": 3680, "total_steps": 4116, "loss": 0.1201, "lr": 1.3581353193027424e-06, "epoch": 6.258503401360544, "percentage": 89.41, "elapsed_time": "4:29:05", "remaining_time": "0:31:52"}
{"current_steps": 3685, "total_steps": 4116, "loss": 0.1149, "lr": 1.3275810450918257e-06, "epoch": 6.267006802721088, "percentage": 89.53, "elapsed_time": "4:29:27", "remaining_time": "0:31:30"}
{"current_steps": 3690, "total_steps": 4116, "loss": 0.1125, "lr": 1.2973625840554593e-06, "epoch": 6.275510204081632, "percentage": 89.65, "elapsed_time": "4:29:47", "remaining_time": "0:31:08"}
{"current_steps": 3695, "total_steps": 4116, "loss": 0.1078, "lr": 1.2674804796560202e-06, "epoch": 6.284013605442177, "percentage": 89.77, "elapsed_time": "4:30:07", "remaining_time": "0:30:46"}
{"current_steps": 3700, "total_steps": 4116, "loss": 0.114, "lr": 1.237935269306707e-06, "epoch": 6.292517006802721, "percentage": 89.89, "elapsed_time": "4:30:34", "remaining_time": "0:30:25"}
{"current_steps": 3705, "total_steps": 4116, "loss": 0.1067, "lr": 1.2087274843618668e-06, "epoch": 6.301020408163265, "percentage": 90.01, "elapsed_time": "4:31:01", "remaining_time": "0:30:03"}
{"current_steps": 3710, "total_steps": 4116, "loss": 0.1194, "lr": 1.1798576501074233e-06, "epoch": 6.309523809523809, "percentage": 90.14, "elapsed_time": "4:31:22", "remaining_time": "0:29:41"}
{"current_steps": 3715, "total_steps": 4116, "loss": 0.1027, "lr": 1.151326285751455e-06, "epoch": 6.318027210884353, "percentage": 90.26, "elapsed_time": "4:31:43", "remaining_time": "0:29:19"}
{"current_steps": 3720, "total_steps": 4116, "loss": 0.1114, "lr": 1.1231339044148393e-06, "epoch": 6.326530612244898, "percentage": 90.38, "elapsed_time": "4:32:07", "remaining_time": "0:28:58"}
{"current_steps": 3725, "total_steps": 4116, "loss": 0.1169, "lr": 1.095281013122036e-06, "epoch": 6.335034013605442, "percentage": 90.5, "elapsed_time": "4:32:32", "remaining_time": "0:28:36"}
{"current_steps": 3730, "total_steps": 4116, "loss": 0.1122, "lr": 1.0677681127919581e-06, "epoch": 6.343537414965986, "percentage": 90.62, "elapsed_time": "4:32:55", "remaining_time": "0:28:14"}
{"current_steps": 3735, "total_steps": 4116, "loss": 0.1037, "lr": 1.04059569822897e-06, "epoch": 6.3520408163265305, "percentage": 90.74, "elapsed_time": "4:33:13", "remaining_time": "0:27:52"}
{"current_steps": 3740, "total_steps": 4116, "loss": 0.1165, "lr": 1.0137642581139895e-06, "epoch": 6.360544217687075, "percentage": 90.86, "elapsed_time": "4:33:33", "remaining_time": "0:27:30"}
{"current_steps": 3745, "total_steps": 4116, "loss": 0.1166, "lr": 9.87274274995693e-07, "epoch": 6.369047619047619, "percentage": 90.99, "elapsed_time": "4:33:55", "remaining_time": "0:27:08"}
{"current_steps": 3750, "total_steps": 4116, "loss": 0.1265, "lr": 9.61126225281841e-07, "epoch": 6.377551020408164, "percentage": 91.11, "elapsed_time": "4:34:18", "remaining_time": "0:26:46"}
{"current_steps": 3755, "total_steps": 4116, "loss": 0.1168, "lr": 9.353205792307163e-07, "epoch": 6.386054421768708, "percentage": 91.23, "elapsed_time": "4:34:42", "remaining_time": "0:26:24"}
{"current_steps": 3760, "total_steps": 4116, "loss": 0.1063, "lr": 9.098578009426484e-07, "epoch": 6.394557823129252, "percentage": 91.35, "elapsed_time": "4:35:01", "remaining_time": "0:26:02"}
{"current_steps": 3765, "total_steps": 4116, "loss": 0.1115, "lr": 8.847383483516903e-07, "epoch": 6.403061224489796, "percentage": 91.47, "elapsed_time": "4:35:21", "remaining_time": "0:25:40"}
{"current_steps": 3770, "total_steps": 4116, "loss": 0.1224, "lr": 8.599626732173671e-07, "epoch": 6.41156462585034, "percentage": 91.59, "elapsed_time": "4:35:47", "remaining_time": "0:25:18"}
{"current_steps": 3775, "total_steps": 4116, "loss": 0.1148, "lr": 8.355312211165568e-07, "epoch": 6.420068027210885, "percentage": 91.72, "elapsed_time": "4:36:09", "remaining_time": "0:24:56"}
{"current_steps": 3780, "total_steps": 4116, "loss": 0.1079, "lr": 8.11444431435473e-07, "epoch": 6.428571428571429, "percentage": 91.84, "elapsed_time": "4:36:26", "remaining_time": "0:24:34"}
{"current_steps": 3785, "total_steps": 4116, "loss": 0.1261, "lr": 7.87702737361764e-07, "epoch": 6.437074829931973, "percentage": 91.96, "elapsed_time": "4:36:50", "remaining_time": "0:24:12"}
{"current_steps": 3790, "total_steps": 4116, "loss": 0.1173, "lr": 7.643065658767312e-07, "epoch": 6.445578231292517, "percentage": 92.08, "elapsed_time": "4:37:12", "remaining_time": "0:23:50"}
{"current_steps": 3795, "total_steps": 4116, "loss": 0.1138, "lr": 7.412563377476312e-07, "epoch": 6.454081632653061, "percentage": 92.2, "elapsed_time": "4:37:32", "remaining_time": "0:23:28"}
{"current_steps": 3800, "total_steps": 4116, "loss": 0.1159, "lr": 7.185524675201328e-07, "epoch": 6.462585034013605, "percentage": 92.32, "elapsed_time": "4:37:55", "remaining_time": "0:23:06"}
{"current_steps": 3805, "total_steps": 4116, "loss": 0.1231, "lr": 6.961953635108409e-07, "epoch": 6.47108843537415, "percentage": 92.44, "elapsed_time": "4:38:18", "remaining_time": "0:22:44"}
{"current_steps": 3810, "total_steps": 4116, "loss": 0.1016, "lr": 6.741854277999583e-07, "epoch": 6.479591836734694, "percentage": 92.57, "elapsed_time": "4:38:40", "remaining_time": "0:22:22"}
{"current_steps": 3815, "total_steps": 4116, "loss": 0.1169, "lr": 6.525230562240636e-07, "epoch": 6.488095238095238, "percentage": 92.69, "elapsed_time": "4:39:04", "remaining_time": "0:22:01"}
{"current_steps": 3820, "total_steps": 4116, "loss": 0.1139, "lr": 6.312086383689831e-07, "epoch": 6.496598639455782, "percentage": 92.81, "elapsed_time": "4:39:27", "remaining_time": "0:21:39"}
{"current_steps": 3825, "total_steps": 4116, "loss": 0.1008, "lr": 6.10242557562788e-07, "epoch": 6.505102040816326, "percentage": 92.93, "elapsed_time": "4:39:49", "remaining_time": "0:21:17"}
{"current_steps": 3830, "total_steps": 4116, "loss": 0.0949, "lr": 5.896251908688966e-07, "epoch": 6.513605442176871, "percentage": 93.05, "elapsed_time": "4:40:09", "remaining_time": "0:20:55"}
{"current_steps": 3835, "total_steps": 4116, "loss": 0.1091, "lr": 5.693569090792972e-07, "epoch": 6.522108843537415, "percentage": 93.17, "elapsed_time": "4:40:27", "remaining_time": "0:20:32"}
{"current_steps": 3840, "total_steps": 4116, "loss": 0.1113, "lr": 5.494380767078822e-07, "epoch": 6.530612244897959, "percentage": 93.29, "elapsed_time": "4:40:51", "remaining_time": "0:20:11"}
{"current_steps": 3845, "total_steps": 4116, "loss": 0.105, "lr": 5.298690519838823e-07, "epoch": 6.539115646258503, "percentage": 93.42, "elapsed_time": "4:41:11", "remaining_time": "0:19:49"}
{"current_steps": 3850, "total_steps": 4116, "loss": 0.1032, "lr": 5.106501868454317e-07, "epoch": 6.5476190476190474, "percentage": 93.54, "elapsed_time": "4:41:33", "remaining_time": "0:19:27"}
{"current_steps": 3855, "total_steps": 4116, "loss": 0.1063, "lr": 4.917818269332442e-07, "epoch": 6.5561224489795915, "percentage": 93.66, "elapsed_time": "4:41:54", "remaining_time": "0:19:05"}
{"current_steps": 3860, "total_steps": 4116, "loss": 0.1068, "lr": 4.7326431158437826e-07, "epoch": 6.564625850340136, "percentage": 93.78, "elapsed_time": "4:42:13", "remaining_time": "0:18:43"}
{"current_steps": 3865, "total_steps": 4116, "loss": 0.1015, "lr": 4.5509797382615517e-07, "epoch": 6.5731292517006805, "percentage": 93.9, "elapsed_time": "4:42:34", "remaining_time": "0:18:21"}
{"current_steps": 3870, "total_steps": 4116, "loss": 0.1039, "lr": 4.372831403701572e-07, "epoch": 6.581632653061225, "percentage": 94.02, "elapsed_time": "4:42:53", "remaining_time": "0:17:58"}
{"current_steps": 3875, "total_steps": 4116, "loss": 0.1107, "lr": 4.198201316063566e-07, "epoch": 6.590136054421769, "percentage": 94.14, "elapsed_time": "4:43:16", "remaining_time": "0:17:37"}
{"current_steps": 3880, "total_steps": 4116, "loss": 0.1124, "lr": 4.02709261597356e-07, "epoch": 6.598639455782313, "percentage": 94.27, "elapsed_time": "4:43:35", "remaining_time": "0:17:14"}
{"current_steps": 3885, "total_steps": 4116, "loss": 0.1261, "lr": 3.8595083807272393e-07, "epoch": 6.607142857142857, "percentage": 94.39, "elapsed_time": "4:43:59", "remaining_time": "0:16:53"}
{"current_steps": 3890, "total_steps": 4116, "loss": 0.1048, "lr": 3.695451624234836e-07, "epoch": 6.615646258503402, "percentage": 94.51, "elapsed_time": "4:44:21", "remaining_time": "0:16:31"}
{"current_steps": 3895, "total_steps": 4116, "loss": 0.1181, "lr": 3.5349252969667956e-07, "epoch": 6.624149659863946, "percentage": 94.63, "elapsed_time": "4:44:42", "remaining_time": "0:16:09"}
{"current_steps": 3900, "total_steps": 4116, "loss": 0.1207, "lr": 3.3779322859007536e-07, "epoch": 6.63265306122449, "percentage": 94.75, "elapsed_time": "4:45:02", "remaining_time": "0:15:47"}
{"current_steps": 3905, "total_steps": 4116, "loss": 0.1351, "lr": 3.224475414469552e-07, "epoch": 6.641156462585034, "percentage": 94.87, "elapsed_time": "4:45:24", "remaining_time": "0:15:25"}
{"current_steps": 3910, "total_steps": 4116, "loss": 0.1207, "lr": 3.0745574425105505e-07, "epoch": 6.649659863945578, "percentage": 95.0, "elapsed_time": "4:45:43", "remaining_time": "0:15:03"}
{"current_steps": 3915, "total_steps": 4116, "loss": 0.1101, "lr": 2.928181066215929e-07, "epoch": 6.658163265306122, "percentage": 95.12, "elapsed_time": "4:46:04", "remaining_time": "0:14:41"}
{"current_steps": 3920, "total_steps": 4116, "loss": 0.132, "lr": 2.785348918084241e-07, "epoch": 6.666666666666667, "percentage": 95.24, "elapsed_time": "4:46:29", "remaining_time": "0:14:19"}
{"current_steps": 3925, "total_steps": 4116, "loss": 0.1021, "lr": 2.6460635668730027e-07, "epoch": 6.675170068027211, "percentage": 95.36, "elapsed_time": "4:46:53", "remaining_time": "0:13:57"}
{"current_steps": 3930, "total_steps": 4116, "loss": 0.1155, "lr": 2.5103275175526023e-07, "epoch": 6.683673469387755, "percentage": 95.48, "elapsed_time": "4:47:18", "remaining_time": "0:13:35"}
{"current_steps": 3935, "total_steps": 4116, "loss": 0.1097, "lr": 2.3781432112611303e-07, "epoch": 6.692176870748299, "percentage": 95.6, "elapsed_time": "4:47:41", "remaining_time": "0:13:13"}
{"current_steps": 3940, "total_steps": 4116, "loss": 0.1181, "lr": 2.2495130252605746e-07, "epoch": 6.700680272108843, "percentage": 95.72, "elapsed_time": "4:48:00", "remaining_time": "0:12:51"}
{"current_steps": 3945, "total_steps": 4116, "loss": 0.1187, "lr": 2.1244392728939857e-07, "epoch": 6.709183673469388, "percentage": 95.85, "elapsed_time": "4:48:24", "remaining_time": "0:12:30"}
{"current_steps": 3950, "total_steps": 4116, "loss": 0.1312, "lr": 2.0029242035439768e-07, "epoch": 6.717687074829932, "percentage": 95.97, "elapsed_time": "4:48:44", "remaining_time": "0:12:08"}
{"current_steps": 3955, "total_steps": 4116, "loss": 0.1007, "lr": 1.8849700025921347e-07, "epoch": 6.726190476190476, "percentage": 96.09, "elapsed_time": "4:49:04", "remaining_time": "0:11:46"}
{"current_steps": 3960, "total_steps": 4116, "loss": 0.1117, "lr": 1.7705787913798734e-07, "epoch": 6.73469387755102, "percentage": 96.21, "elapsed_time": "4:49:23", "remaining_time": "0:11:24"}
{"current_steps": 3965, "total_steps": 4116, "loss": 0.1096, "lr": 1.659752627170086e-07, "epoch": 6.743197278911564, "percentage": 96.33, "elapsed_time": "4:49:43", "remaining_time": "0:11:02"}
{"current_steps": 3970, "total_steps": 4116, "loss": 0.116, "lr": 1.5524935031103305e-07, "epoch": 6.7517006802721085, "percentage": 96.45, "elapsed_time": "4:50:05", "remaining_time": "0:10:40"}
{"current_steps": 3975, "total_steps": 4116, "loss": 0.1055, "lr": 1.4488033481969478e-07, "epoch": 6.760204081632653, "percentage": 96.57, "elapsed_time": "4:50:27", "remaining_time": "0:10:18"}
{"current_steps": 3980, "total_steps": 4116, "loss": 0.1212, "lr": 1.3486840272402213e-07, "epoch": 6.7687074829931975, "percentage": 96.7, "elapsed_time": "4:50:53", "remaining_time": "0:09:56"}
{"current_steps": 3985, "total_steps": 4116, "loss": 0.1082, "lr": 1.252137340831072e-07, "epoch": 6.7772108843537415, "percentage": 96.82, "elapsed_time": "4:51:16", "remaining_time": "0:09:34"}
{"current_steps": 3990, "total_steps": 4116, "loss": 0.1044, "lr": 1.1591650253084619e-07, "epoch": 6.785714285714286, "percentage": 96.94, "elapsed_time": "4:51:38", "remaining_time": "0:09:12"}
{"current_steps": 3995, "total_steps": 4116, "loss": 0.1218, "lr": 1.0697687527283062e-07, "epoch": 6.79421768707483, "percentage": 97.06, "elapsed_time": "4:51:59", "remaining_time": "0:08:50"}
{"current_steps": 4000, "total_steps": 4116, "loss": 0.1239, "lr": 9.839501308333666e-08, "epoch": 6.802721088435375, "percentage": 97.18, "elapsed_time": "4:52:22", "remaining_time": "0:08:28"}
{"current_steps": 4005, "total_steps": 4116, "loss": 0.1075, "lr": 9.017107030242944e-08, "epoch": 6.811224489795919, "percentage": 97.3, "elapsed_time": "4:52:43", "remaining_time": "0:08:06"}
{"current_steps": 4010, "total_steps": 4116, "loss": 0.1166, "lr": 8.230519483319211e-08, "epoch": 6.819727891156463, "percentage": 97.42, "elapsed_time": "4:53:06", "remaining_time": "0:07:44"}
{"current_steps": 4015, "total_steps": 4116, "loss": 0.113, "lr": 7.479752813906338e-08, "epoch": 6.828231292517007, "percentage": 97.55, "elapsed_time": "4:53:29", "remaining_time": "0:07:22"}
{"current_steps": 4020, "total_steps": 4116, "loss": 0.1131, "lr": 6.764820524129745e-08, "epoch": 6.836734693877551, "percentage": 97.67, "elapsed_time": "4:53:50", "remaining_time": "0:07:01"}
{"current_steps": 4025, "total_steps": 4116, "loss": 0.0981, "lr": 6.085735471653032e-08, "epoch": 6.845238095238095, "percentage": 97.79, "elapsed_time": "4:54:10", "remaining_time": "0:06:39"}
{"current_steps": 4030, "total_steps": 4116, "loss": 0.101, "lr": 5.442509869446833e-08, "epoch": 6.853741496598639, "percentage": 97.91, "elapsed_time": "4:54:32", "remaining_time": "0:06:17"}
{"current_steps": 4035, "total_steps": 4116, "loss": 0.1164, "lr": 4.835155285569659e-08, "epoch": 6.862244897959184, "percentage": 98.03, "elapsed_time": "4:54:52", "remaining_time": "0:05:55"}
{"current_steps": 4040, "total_steps": 4116, "loss": 0.1133, "lr": 4.263682642959177e-08, "epoch": 6.870748299319728, "percentage": 98.15, "elapsed_time": "4:55:13", "remaining_time": "0:05:33"}
{"current_steps": 4045, "total_steps": 4116, "loss": 0.1056, "lr": 3.72810221923614e-08, "epoch": 6.879251700680272, "percentage": 98.28, "elapsed_time": "4:55:31", "remaining_time": "0:05:11"}
{"current_steps": 4050, "total_steps": 4116, "loss": 0.1055, "lr": 3.228423646519652e-08, "epoch": 6.887755102040816, "percentage": 98.4, "elapsed_time": "4:55:51", "remaining_time": "0:04:49"}
{"current_steps": 4055, "total_steps": 4116, "loss": 0.1055, "lr": 2.764655911253744e-08, "epoch": 6.896258503401361, "percentage": 98.52, "elapsed_time": "4:56:12", "remaining_time": "0:04:27"}
{"current_steps": 4060, "total_steps": 4116, "loss": 0.1093, "lr": 2.3368073540461778e-08, "epoch": 6.904761904761905, "percentage": 98.64, "elapsed_time": "4:56:36", "remaining_time": "0:04:05"}
{"current_steps": 4065, "total_steps": 4116, "loss": 0.1197, "lr": 1.944885669517005e-08, "epoch": 6.913265306122449, "percentage": 98.76, "elapsed_time": "4:56:56", "remaining_time": "0:03:43"}
{"current_steps": 4070, "total_steps": 4116, "loss": 0.1124, "lr": 1.5888979061624566e-08, "epoch": 6.921768707482993, "percentage": 98.88, "elapsed_time": "4:57:16", "remaining_time": "0:03:21"}
{"current_steps": 4075, "total_steps": 4116, "loss": 0.1041, "lr": 1.268850466226379e-08, "epoch": 6.930272108843537, "percentage": 99.0, "elapsed_time": "4:57:36", "remaining_time": "0:02:59"}
{"current_steps": 4080, "total_steps": 4116, "loss": 0.1179, "lr": 9.847491055856583e-09, "epoch": 6.938775510204081, "percentage": 99.13, "elapsed_time": "4:57:59", "remaining_time": "0:02:37"}
{"current_steps": 4085, "total_steps": 4116, "loss": 0.1135, "lr": 7.365989336469704e-09, "epoch": 6.947278911564625, "percentage": 99.25, "elapsed_time": "4:58:21", "remaining_time": "0:02:15"}
{"current_steps": 4090, "total_steps": 4116, "loss": 0.1099, "lr": 5.244044132544091e-09, "epoch": 6.95578231292517, "percentage": 99.37, "elapsed_time": "4:58:44", "remaining_time": "0:01:53"}
{"current_steps": 4095, "total_steps": 4116, "loss": 0.1205, "lr": 3.481693606095515e-09, "epoch": 6.964285714285714, "percentage": 99.49, "elapsed_time": "4:59:09", "remaining_time": "0:01:32"}
{"current_steps": 4100, "total_steps": 4116, "loss": 0.1054, "lr": 2.078969452030677e-09, "epoch": 6.9727891156462585, "percentage": 99.61, "elapsed_time": "4:59:30", "remaining_time": "0:01:10"}
{"current_steps": 4105, "total_steps": 4116, "loss": 0.111, "lr": 1.0358968975676675e-09, "epoch": 6.9812925170068025, "percentage": 99.73, "elapsed_time": "4:59:56", "remaining_time": "0:00:48"}
{"current_steps": 4110, "total_steps": 4116, "loss": 0.1024, "lr": 3.524947017941038e-10, "epoch": 6.989795918367347, "percentage": 99.85, "elapsed_time": "5:00:19", "remaining_time": "0:00:26"}
{"current_steps": 4115, "total_steps": 4116, "loss": 0.1074, "lr": 2.8775155322957604e-11, "epoch": 6.9982993197278915, "percentage": 99.98, "elapsed_time": "5:00:43", "remaining_time": "0:00:04"}
{"current_steps": 4116, "total_steps": 4116, "epoch": 7.0, "percentage": 100.0, "elapsed_time": "5:00:57", "remaining_time": "0:00:00"}

9100
trainer_state.json Normal file

File diff suppressed because it is too large

3
training_args.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b239ac78ab72bf297dc1d78672f8edf5df1571565d8db2748762cd64cc64cee4
size 8593

BIN
training_loss.png Normal file

Binary file not shown.

Size: 42 KiB

1
vocab.json Normal file

File diff suppressed because one or more lines are too long