Initialize project; model provided by the ModelHub XC community

Model: DCAgent/a1-curriculum_hard
Source: Original Platform
ModelHub XC
2026-05-18 16:43:04 +08:00
commit eff7b0c4f4
23 changed files with 162346 additions and 0 deletions

.gitattributes (vendored, normal file, 36 lines added)

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
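
The patterns above decide which files are stored through Git LFS. A rough sketch of checking a file name against them, assuming Python's `fnmatch` is a close enough stand-in for gitattributes glob matching on bare file names (it is not a full implementation, and the `saved_model/**/*` path pattern is left out). `LFS_PATTERNS` and `is_lfs_tracked` are hypothetical names used only for illustration.

```python
from fnmatch import fnmatch

# A subset of the basename patterns listed in the .gitattributes hunk above.
LFS_PATTERNS = [
    "*.bin",
    "*.safetensors",
    "*.pt",
    "*.tar.*",
    "*tfevents*",
    "tokenizer.json",
]


def is_lfs_tracked(filename: str) -> bool:
    """Hypothetical helper: True if the bare file name matches an LFS pattern."""
    return any(fnmatch(filename, pattern) for pattern in LFS_PATTERNS)


candidates = [
    "model-00001-of-00004.safetensors",
    "tokenizer.json",
    "merges.txt",
    "config.json",
]
tracked = [name for name in candidates if is_lfs_tracked(name)]
print(tracked)
```

Under these assumptions the weight shards and `tokenizer.json` are routed through LFS, while `merges.txt` and `config.json` are stored as ordinary text.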

README.md (normal file, 60 lines added)

@@ -0,0 +1,60 @@
---
library_name: transformers
license: other
base_model: Qwen/Qwen3-8B
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: sft_a1_curriculum_hard__Qwen3-8B
results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# sft_a1_curriculum_hard__Qwen3-8B
This model is a fine-tuned version of [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on the /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--exp_rpt_curriculum-hard_10k_glm_4.7_traces_jupiter/snapshots/f1b42fbba3fc2cc7e0bf2b4ad33938849ed47fba_thinking_preprocessed dataset.
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 4e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 16
- total_train_batch_size: 16
- total_eval_batch_size: 128
- optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.98), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 7.0
### Training results
### Framework versions
- Transformers 4.57.6
- PyTorch 2.9.1+cu130
- Datasets 4.7.0
- Tokenizers 0.22.2
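
The batch-size figures in the hyperparameter list can be cross-checked with simple arithmetic. The sketch below assumes the usual relation total = per-device batch size × number of devices × gradient accumulation steps. The card does not list a gradient accumulation value, so the value of 1 derived here is an inference, not a reported setting.

```python
# Values copied from the hyperparameter list in the model card above.
train_batch_size = 1
eval_batch_size = 8
num_devices = 16
total_train_batch_size = 16
total_eval_batch_size = 128

# Inferred, not reported: the accumulation factor that makes the totals agree.
implied_grad_accum = total_train_batch_size // (train_batch_size * num_devices)

# Evaluation does not accumulate gradients, so the total is batch size x devices.
derived_eval_total = eval_batch_size * num_devices

print(implied_grad_accum, derived_eval_total)
```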

added_tokens.json (normal file, 28 lines added)

@@ -0,0 +1,28 @@
{
"</think>": 151668,
"</tool_call>": 151658,
"</tool_response>": 151666,
"<think>": 151667,
"<tool_call>": 151657,
"<tool_response>": 151665,
"<|box_end|>": 151649,
"<|box_start|>": 151648,
"<|endoftext|>": 151643,
"<|file_sep|>": 151664,
"<|fim_middle|>": 151660,
"<|fim_pad|>": 151662,
"<|fim_prefix|>": 151659,
"<|fim_suffix|>": 151661,
"<|im_end|>": 151645,
"<|im_start|>": 151644,
"<|image_pad|>": 151655,
"<|object_ref_end|>": 151647,
"<|object_ref_start|>": 151646,
"<|quad_end|>": 151651,
"<|quad_start|>": 151650,
"<|repo_name|>": 151663,
"<|video_pad|>": 151656,
"<|vision_end|>": 151653,
"<|vision_pad|>": 151654,
"<|vision_start|>": 151652
}
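
As a small illustration, the mapping can be inverted to see which tokens the special ids used elsewhere in this commit refer to. The ids 151645 and 151643 are the `eos_token_id` and `pad_token_id` values from `config.json`; only a few entries of the table are copied here, and `id_to_token` is a hypothetical name.

```python
# A few entries copied from added_tokens.json above.
added_tokens = {
    "<|endoftext|>": 151643,
    "<|im_start|>": 151644,
    "<|im_end|>": 151645,
    "<think>": 151667,
    "</think>": 151668,
}

# Invert the table so that ids can be looked up.
id_to_token = {token_id: token for token, token_id in added_tokens.items()}

eos_token = id_to_token[151645]  # eos_token_id in config.json
pad_token = id_to_token[151643]  # pad_token_id in config.json
print(eos_token, pad_token)
```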

all_results.json (normal file, 16 lines added)

@@ -0,0 +1,16 @@
{
"achieved_tflops_per_gpu": 0.00465697604095815,
"achieved_tflops_per_gpu_theoretical": 728.3994810069811,
"epoch": 7.0,
"loss_nan_ranks": 0,
"loss_rank_avg": 0.15933610498905182,
"mfu_percent": 0.0003291149145553463,
"mfu_percent_theoretical": 51.476995124168276,
"total_flos": 1426059575820288.0,
"train_loss": 0.1865629164902068,
"train_runtime": 19138.755,
"train_samples_per_second": 3.44,
"train_steps_per_second": 0.215,
"valid_targets_mean": 6419.4,
"valid_targets_min": 898
}
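
The throughput fields above can be combined into a few derived quantities. This is only a sanity-check sketch: the per-second rates are rounded in the file, so the step and sample counts below are approximations, not values reported by the trainer.

```python
# Values copied from all_results.json above.
train_runtime = 19138.755          # seconds
train_samples_per_second = 3.44
train_steps_per_second = 0.215
epochs = 7.0

# Samples consumed per optimizer step (compare with the total train batch size).
samples_per_step = train_samples_per_second / train_steps_per_second

# Approximate totals over the whole run, limited by the rounding of the rates.
approx_steps = train_runtime * train_steps_per_second
approx_samples = train_runtime * train_samples_per_second
approx_samples_per_epoch = approx_samples / epochs

print(round(samples_per_step, 3), round(approx_steps), round(approx_samples_per_epoch))
```

The ratio of the two rates comes out at about 16 samples per step, which agrees with the `total_train_batch_size` of 16 listed in the README.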

chat_template.jinja (normal file, 89 lines added)

@@ -0,0 +1,89 @@
{%- if tools %}
{{- '<|im_start|>system\n' }}
{%- if messages[0].role == 'system' %}
{{- messages[0].content + '\n\n' }}
{%- endif %}
{{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
{%- for tool in tools %}
{{- "\n" }}
{{- tool | tojson }}
{%- endfor %}
{{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
{%- if messages[0].role == 'system' %}
{{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
{%- set index = (messages|length - 1) - loop.index0 %}
{%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
{%- set ns.multi_step_tool = false %}
{%- set ns.last_query_index = index %}
{%- endif %}
{%- endfor %}
{%- for message in messages %}
{%- if message.content is string %}
{%- set content = message.content %}
{%- else %}
{%- set content = '' %}
{%- endif %}
{%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
{{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
{%- elif message.role == "assistant" %}
{%- set reasoning_content = '' %}
{%- if message.reasoning_content is string %}
{%- set reasoning_content = message.reasoning_content %}
{%- else %}
{%- if '</think>' in content %}
{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
{%- set content = content.split('</think>')[-1].lstrip('\n') %}
{%- endif %}
{%- endif %}
{%- if loop.index0 > ns.last_query_index %}
{%- if loop.last or (not loop.last and reasoning_content) %}
{{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- if message.tool_calls %}
{%- for tool_call in message.tool_calls %}
{%- if (loop.first and content) or (not loop.first) %}
{{- '\n' }}
{%- endif %}
{%- if tool_call.function %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '<tool_call>\n{"name": "' }}
{{- tool_call.name }}
{{- '", "arguments": ' }}
{%- if tool_call.arguments is string %}
{{- tool_call.arguments }}
{%- else %}
{{- tool_call.arguments | tojson }}
{%- endif %}
{{- '}\n</tool_call>' }}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "tool" %}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|im_start|>user' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{{- content }}
{{- '\n</tool_response>' }}
{%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- if enable_thinking is defined and enable_thinking is false %}
{{- '<think>\n\n</think>\n\n' }}
{%- endif %}
{%- endif %}
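
For assistant messages without a separate `reasoning_content` field, the template splits the text around `</think>` using plain string operations. A minimal Python sketch of that one step, mirroring the `split`, `rstrip`, and `lstrip` calls in the template; `split_reasoning` is a hypothetical helper name, not part of any library.

```python
from typing import Tuple


def split_reasoning(content: str) -> Tuple[str, str]:
    """Return (reasoning_content, visible_content) the way the template derives them."""
    if "</think>" not in content:
        # No closing tag: the template leaves reasoning empty and content untouched.
        return "", content
    reasoning = (
        content.split("</think>")[0].rstrip("\n").split("<think>")[-1].lstrip("\n")
    )
    visible = content.split("</think>")[-1].lstrip("\n")
    return reasoning, visible


reasoning, visible = split_reasoning(
    "<think>\nCheck the units.\n</think>\n\nThe answer is 42."
)
print(repr(reasoning), repr(visible))
```

The template then re-emits the reasoning inside `<think>` tags only for assistant turns after the last real user query, and drops it for earlier turns.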

config.json (normal file, 68 lines added)

@@ -0,0 +1,68 @@
{
"architectures": [
"Qwen3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"dtype": "bfloat16",
"eos_token_id": 151645,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 12288,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 40960,
"max_window_layers": 36,
"model_type": "qwen3",
"num_attention_heads": 32,
"num_hidden_layers": 36,
"num_key_value_heads": 8,
"pad_token_id": 151643,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000,
"sliding_window": null,
"tie_word_embeddings": false,
"transformers_version": "4.57.6",
"use_cache": false,
"use_sliding_window": false,
"vocab_size": 151936
}
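
The dimensions in this config are enough to estimate the parameter count. The sketch below assumes the per-layer tensors named in the safetensors index `weight_map` (q/k/v/o projections without bias, q/k norms of size `head_dim`, three MLP projections, two layer norms) plus one final norm of size `hidden_size`; the final norm is an assumption, since it is not visible in this part of the commit. `count_parameters` is a hypothetical helper.

```python
# Values copied from config.json above.
config = {
    "hidden_size": 4096,
    "intermediate_size": 12288,
    "num_hidden_layers": 36,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "vocab_size": 151936,
    "tie_word_embeddings": False,
}


def count_parameters(cfg: dict) -> int:
    """Estimate the parameter count from the config dimensions (assumed layout)."""
    h, d = cfg["hidden_size"], cfg["head_dim"]
    q_out = cfg["num_attention_heads"] * d
    kv_out = cfg["num_key_value_heads"] * d
    attention = h * q_out + 2 * h * kv_out + q_out * h  # q, k, v, o projections
    qk_norm = 2 * d                                     # q_norm and k_norm
    mlp = 3 * h * cfg["intermediate_size"]              # gate, up, down projections
    layer_norms = 2 * h                                 # input and post-attention
    per_layer = attention + qk_norm + mlp + layer_norms
    embeddings = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else cfg["vocab_size"] * h
    final_norm = h                                      # assumed, see lead-in
    return cfg["num_hidden_layers"] * per_layer + embeddings + lm_head + final_norm


total_params = count_parameters(config)
bf16_bytes = total_params * 2  # "dtype": "bfloat16" stores 2 bytes per parameter
print(total_params, bf16_bytes)
```

Under these assumptions the estimate is 8,190,735,360 parameters, or 16,381,470,720 bytes in bfloat16, which is the same number as the `total_size` field in the safetensors index metadata of this commit.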

generation_config.json (normal file, 12 lines added)

@@ -0,0 +1,12 @@
{
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"temperature": 0.6,
"top_k": 20,
"top_p": 0.95,
"transformers_version": "4.57.6"
}
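
To illustrate what `temperature`, `top_k`, and `top_p` do together, here is a self-contained sketch of the general technique in plain Python. It is not the Transformers implementation, and the order of the filters and the tie-breaking are assumptions; `filtered_probs` is a hypothetical function name.

```python
import math


def filtered_probs(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k, then top-p."""
    scaled = [x / temperature for x in logits]
    # Top-k: keep the indices of the k highest scaled logits, best first.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    peak = scaled[order[0]]
    weights = [math.exp(scaled[i] - peak) for i in order]  # stable softmax
    total = sum(weights)
    probs = [w / total for w in weights]
    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for index, p in zip(order, probs):
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break
    mass = sum(p for _, p in kept)
    # Renormalize so the surviving tokens form a distribution to sample from.
    return {index: p / mass for index, p in kept}


probs = filtered_probs([2.0, 1.0, 0.0, -5.0])
print(sorted(probs))
```

With the defaults from this file, the two highest-scoring tokens in the toy example already cover more than 95% of the probability mass, so the other two are dropped before sampling.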

merges.txt (normal file, 151388 lines added)

File diff suppressed because it is too large.


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:730848fc780fca757979fce57f9b2f2f63aca3c35520f462f50065a478efb0ee
size 4902257696


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f3167c15a70ca9687b8e249b761b217182fd7a0987f27225b8c1d326a7f68e0a
size 4915960368


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6efd0caa92fc0ef6bf5f3976c32076da944623cdaf562f4dc09c83e379af9d79
size 4983068496


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1ae77d525c9474d8f02e51139b90e58a87dd5285b9604f860e68793f0ac9c98e
size 1580230264
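
Each of the four hunks above is a Git LFS pointer: a three-line text file with `version`, `oid`, and `size` fields standing in for the real binary. A minimal sketch of parsing one pointer and summing the four reported sizes; `parse_lfs_pointer` is a hypothetical helper, not part of the Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    fields["size"] = int(fields["size"])  # size of the real object, in bytes
    return fields


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:1ae77d525c9474d8f02e51139b90e58a87dd5285b9604f860e68793f0ac9c98e\n"
    "size 1580230264\n"
)

# Sizes copied from the four pointer hunks above.
shard_sizes = [4902257696, 4915960368, 4983068496, 1580230264]
total_shard_bytes = sum(shard_sizes)
print(pointer["size"], total_shard_bytes)
```

The four files add up to 16,381,516,824 bytes, which is 46,104 bytes more than the `total_size` of 16,381,470,720 recorded in the index metadata. That gap is plausibly the per-file safetensors headers, though this commit does not confirm it.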


@@ -0,0 +1,407 @@
{
"metadata": {
"total_parameters": 308224,
"total_size": 16381470720
},
"weight_map": {
"lm_head.weight": "model-00004-of-00004.safetensors",
"model.embed_tokens.weight": "model-00001-of-00004.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.10.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.2.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.20.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.22.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.23.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.3.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.30.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.35.input_layernorm.weight": "model-00004-of-00004.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
"model.layers.35.self_attn.k_norm.weight": "model-00004-of-00004.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.35.self_attn.q_norm.weight": "model-00004-of-00004.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.4.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.9.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.norm.weight": "model-00004-of-00004.safetensors"
}
}

12
run_summary.json Normal file

@@ -0,0 +1,12 @@
{
"agent_name": "f1b42fbba3fc2cc7e0bf2b4ad33938849ed47fba_thinking_preprocessed",
"training_start": null,
"training_end": null,
"created_by": "raoof1",
"base_model_name": "Qwen/Qwen3-8B",
"dataset_name": "/e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--exp_rpt_curriculum-hard_10k_glm_4.7_traces_jupiter/snapshots/f1b42fbba3fc2cc7e0bf2b4ad33938849ed47fba_thinking_preprocessed",
"training_type": "SFT",
"training_parameters": "https://huggingface.co/DCAgent/a1-curriculum_hard/blob/main/config.json",
"wandb_link": null,
"traces_location_s3": null
}

31
special_tokens_map.json Normal file

@@ -0,0 +1,31 @@
{
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"eos_token": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

BIN
tokenizer.json (Stored with Git LFS) Normal file

Binary file not shown.

240
tokenizer_config.json Normal file

@@ -0,0 +1,240 @@
{
"add_bos_token": false,
"add_prefix_space": false,
"added_tokens_decoder": {
"151643": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151644": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151645": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151646": {
"content": "<|object_ref_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151647": {
"content": "<|object_ref_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151648": {
"content": "<|box_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151649": {
"content": "<|box_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151650": {
"content": "<|quad_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151651": {
"content": "<|quad_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151652": {
"content": "<|vision_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151653": {
"content": "<|vision_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151654": {
"content": "<|vision_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151655": {
"content": "<|image_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151656": {
"content": "<|video_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151657": {
"content": "<tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151658": {
"content": "</tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151659": {
"content": "<|fim_prefix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151660": {
"content": "<|fim_middle|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151661": {
"content": "<|fim_suffix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151662": {
"content": "<|fim_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151663": {
"content": "<|repo_name|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151664": {
"content": "<|file_sep|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151665": {
"content": "<tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151666": {
"content": "</tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151667": {
"content": "<think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151668": {
"content": "</think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"bos_token": null,
"clean_up_tokenization_spaces": false,
"eos_token": "<|im_end|>",
"errors": "replace",
"extra_special_tokens": {},
"model_max_length": 32768,
"pad_token": "<|endoftext|>",
"padding_side": "right",
"split_special_tokens": false,
"tokenizer_class": "Qwen2Tokenizer",
"unk_token": null
}

16
train_results.json Normal file

@@ -0,0 +1,16 @@
{
"achieved_tflops_per_gpu": 0.00465697604095815,
"achieved_tflops_per_gpu_theoretical": 728.3994810069811,
"epoch": 7.0,
"loss_nan_ranks": 0,
"loss_rank_avg": 0.15933610498905182,
"mfu_percent": 0.0003291149145553463,
"mfu_percent_theoretical": 51.476995124168276,
"total_flos": 1426059575820288.0,
"train_loss": 0.1865629164902068,
"train_runtime": 19138.755,
"train_samples_per_second": 3.44,
"train_steps_per_second": 0.215,
"valid_targets_mean": 6419.4,
"valid_targets_min": 898
}

824
trainer_log.jsonl Normal file

@@ -0,0 +1,824 @@
{"current_steps": 5, "total_steps": 4116, "loss": 0.6801, "lr": 3.883495145631068e-07, "epoch": 0.008503401360544218, "percentage": 0.12, "elapsed_time": "0:00:27", "remaining_time": "6:15:44"}
{"current_steps": 10, "total_steps": 4116, "loss": 0.653, "lr": 8.737864077669904e-07, "epoch": 0.017006802721088437, "percentage": 0.24, "elapsed_time": "0:00:50", "remaining_time": "5:42:52"}
{"current_steps": 15, "total_steps": 4116, "loss": 0.6461, "lr": 1.359223300970874e-06, "epoch": 0.025510204081632654, "percentage": 0.36, "elapsed_time": "0:01:14", "remaining_time": "5:37:56"}
{"current_steps": 20, "total_steps": 4116, "loss": 0.6176, "lr": 1.8446601941747574e-06, "epoch": 0.034013605442176874, "percentage": 0.49, "elapsed_time": "0:01:37", "remaining_time": "5:31:29"}
{"current_steps": 25, "total_steps": 4116, "loss": 0.5723, "lr": 2.330097087378641e-06, "epoch": 0.04251700680272109, "percentage": 0.61, "elapsed_time": "0:02:04", "remaining_time": "5:39:08"}
{"current_steps": 30, "total_steps": 4116, "loss": 0.5255, "lr": 2.8155339805825245e-06, "epoch": 0.05102040816326531, "percentage": 0.73, "elapsed_time": "0:02:29", "remaining_time": "5:39:50"}
{"current_steps": 35, "total_steps": 4116, "loss": 0.4911, "lr": 3.300970873786408e-06, "epoch": 0.05952380952380952, "percentage": 0.85, "elapsed_time": "0:02:53", "remaining_time": "5:37:20"}
{"current_steps": 40, "total_steps": 4116, "loss": 0.4598, "lr": 3.7864077669902915e-06, "epoch": 0.06802721088435375, "percentage": 0.97, "elapsed_time": "0:03:20", "remaining_time": "5:40:18"}
{"current_steps": 45, "total_steps": 4116, "loss": 0.4169, "lr": 4.271844660194175e-06, "epoch": 0.07653061224489796, "percentage": 1.09, "elapsed_time": "0:03:42", "remaining_time": "5:36:00"}
{"current_steps": 50, "total_steps": 4116, "loss": 0.4148, "lr": 4.7572815533980585e-06, "epoch": 0.08503401360544217, "percentage": 1.21, "elapsed_time": "0:04:10", "remaining_time": "5:39:51"}
{"current_steps": 55, "total_steps": 4116, "loss": 0.4079, "lr": 5.242718446601942e-06, "epoch": 0.0935374149659864, "percentage": 1.34, "elapsed_time": "0:04:35", "remaining_time": "5:38:39"}
{"current_steps": 60, "total_steps": 4116, "loss": 0.4261, "lr": 5.728155339805825e-06, "epoch": 0.10204081632653061, "percentage": 1.46, "elapsed_time": "0:05:01", "remaining_time": "5:39:26"}
{"current_steps": 65, "total_steps": 4116, "loss": 0.3852, "lr": 6.213592233009709e-06, "epoch": 0.11054421768707483, "percentage": 1.58, "elapsed_time": "0:05:25", "remaining_time": "5:37:49"}
{"current_steps": 70, "total_steps": 4116, "loss": 0.3784, "lr": 6.6990291262135935e-06, "epoch": 0.11904761904761904, "percentage": 1.7, "elapsed_time": "0:05:48", "remaining_time": "5:35:38"}
{"current_steps": 75, "total_steps": 4116, "loss": 0.3729, "lr": 7.184466019417476e-06, "epoch": 0.12755102040816327, "percentage": 1.82, "elapsed_time": "0:06:10", "remaining_time": "5:32:58"}
{"current_steps": 80, "total_steps": 4116, "loss": 0.3755, "lr": 7.66990291262136e-06, "epoch": 0.1360544217687075, "percentage": 1.94, "elapsed_time": "0:06:35", "remaining_time": "5:32:49"}
{"current_steps": 85, "total_steps": 4116, "loss": 0.3572, "lr": 8.155339805825243e-06, "epoch": 0.1445578231292517, "percentage": 2.07, "elapsed_time": "0:07:01", "remaining_time": "5:33:27"}
{"current_steps": 90, "total_steps": 4116, "loss": 0.3369, "lr": 8.640776699029127e-06, "epoch": 0.15306122448979592, "percentage": 2.19, "elapsed_time": "0:07:26", "remaining_time": "5:33:09"}
{"current_steps": 95, "total_steps": 4116, "loss": 0.3422, "lr": 9.12621359223301e-06, "epoch": 0.16156462585034015, "percentage": 2.31, "elapsed_time": "0:07:52", "remaining_time": "5:33:09"}
{"current_steps": 100, "total_steps": 4116, "loss": 0.3538, "lr": 9.611650485436894e-06, "epoch": 0.17006802721088435, "percentage": 2.43, "elapsed_time": "0:08:16", "remaining_time": "5:32:12"}
{"current_steps": 105, "total_steps": 4116, "loss": 0.3292, "lr": 1.0097087378640778e-05, "epoch": 0.17857142857142858, "percentage": 2.55, "elapsed_time": "0:08:40", "remaining_time": "5:31:04"}
{"current_steps": 110, "total_steps": 4116, "loss": 0.3515, "lr": 1.0582524271844662e-05, "epoch": 0.1870748299319728, "percentage": 2.67, "elapsed_time": "0:09:05", "remaining_time": "5:31:16"}
{"current_steps": 115, "total_steps": 4116, "loss": 0.3143, "lr": 1.1067961165048544e-05, "epoch": 0.195578231292517, "percentage": 2.79, "elapsed_time": "0:09:29", "remaining_time": "5:29:56"}
{"current_steps": 120, "total_steps": 4116, "loss": 0.334, "lr": 1.1553398058252427e-05, "epoch": 0.20408163265306123, "percentage": 2.92, "elapsed_time": "0:09:53", "remaining_time": "5:29:40"}
{"current_steps": 125, "total_steps": 4116, "loss": 0.3186, "lr": 1.2038834951456311e-05, "epoch": 0.21258503401360543, "percentage": 3.04, "elapsed_time": "0:10:19", "remaining_time": "5:29:51"}
{"current_steps": 130, "total_steps": 4116, "loss": 0.3143, "lr": 1.2524271844660197e-05, "epoch": 0.22108843537414966, "percentage": 3.16, "elapsed_time": "0:10:44", "remaining_time": "5:29:25"}
{"current_steps": 135, "total_steps": 4116, "loss": 0.3094, "lr": 1.300970873786408e-05, "epoch": 0.22959183673469388, "percentage": 3.28, "elapsed_time": "0:11:11", "remaining_time": "5:30:16"}
{"current_steps": 140, "total_steps": 4116, "loss": 0.3182, "lr": 1.3495145631067962e-05, "epoch": 0.23809523809523808, "percentage": 3.4, "elapsed_time": "0:11:34", "remaining_time": "5:28:51"}
{"current_steps": 145, "total_steps": 4116, "loss": 0.3221, "lr": 1.3980582524271846e-05, "epoch": 0.2465986394557823, "percentage": 3.52, "elapsed_time": "0:12:01", "remaining_time": "5:29:06"}
{"current_steps": 150, "total_steps": 4116, "loss": 0.3119, "lr": 1.446601941747573e-05, "epoch": 0.25510204081632654, "percentage": 3.64, "elapsed_time": "0:12:24", "remaining_time": "5:28:16"}
{"current_steps": 155, "total_steps": 4116, "loss": 0.3086, "lr": 1.4951456310679614e-05, "epoch": 0.26360544217687076, "percentage": 3.77, "elapsed_time": "0:12:52", "remaining_time": "5:28:50"}
{"current_steps": 160, "total_steps": 4116, "loss": 0.3204, "lr": 1.5436893203883496e-05, "epoch": 0.272108843537415, "percentage": 3.89, "elapsed_time": "0:13:19", "remaining_time": "5:29:37"}
{"current_steps": 165, "total_steps": 4116, "loss": 0.2973, "lr": 1.592233009708738e-05, "epoch": 0.28061224489795916, "percentage": 4.01, "elapsed_time": "0:13:46", "remaining_time": "5:29:58"}
{"current_steps": 170, "total_steps": 4116, "loss": 0.3212, "lr": 1.6407766990291263e-05, "epoch": 0.2891156462585034, "percentage": 4.13, "elapsed_time": "0:14:09", "remaining_time": "5:28:34"}
{"current_steps": 175, "total_steps": 4116, "loss": 0.2981, "lr": 1.6893203883495145e-05, "epoch": 0.2976190476190476, "percentage": 4.25, "elapsed_time": "0:14:34", "remaining_time": "5:28:19"}
{"current_steps": 180, "total_steps": 4116, "loss": 0.317, "lr": 1.737864077669903e-05, "epoch": 0.30612244897959184, "percentage": 4.37, "elapsed_time": "0:14:57", "remaining_time": "5:27:07"}
{"current_steps": 185, "total_steps": 4116, "loss": 0.3243, "lr": 1.7864077669902916e-05, "epoch": 0.31462585034013607, "percentage": 4.49, "elapsed_time": "0:15:21", "remaining_time": "5:26:15"}
{"current_steps": 190, "total_steps": 4116, "loss": 0.2964, "lr": 1.8349514563106798e-05, "epoch": 0.3231292517006803, "percentage": 4.62, "elapsed_time": "0:15:43", "remaining_time": "5:24:51"}
{"current_steps": 195, "total_steps": 4116, "loss": 0.3013, "lr": 1.883495145631068e-05, "epoch": 0.33163265306122447, "percentage": 4.74, "elapsed_time": "0:16:07", "remaining_time": "5:24:13"}
{"current_steps": 200, "total_steps": 4116, "loss": 0.3007, "lr": 1.9320388349514565e-05, "epoch": 0.3401360544217687, "percentage": 4.86, "elapsed_time": "0:16:31", "remaining_time": "5:23:32"}
{"current_steps": 205, "total_steps": 4116, "loss": 0.2896, "lr": 1.9805825242718447e-05, "epoch": 0.3486394557823129, "percentage": 4.98, "elapsed_time": "0:16:52", "remaining_time": "5:22:05"}
{"current_steps": 210, "total_steps": 4116, "loss": 0.2792, "lr": 2.0291262135922333e-05, "epoch": 0.35714285714285715, "percentage": 5.1, "elapsed_time": "0:17:16", "remaining_time": "5:21:14"}
{"current_steps": 215, "total_steps": 4116, "loss": 0.2784, "lr": 2.0776699029126215e-05, "epoch": 0.3656462585034014, "percentage": 5.22, "elapsed_time": "0:17:37", "remaining_time": "5:19:48"}
{"current_steps": 220, "total_steps": 4116, "loss": 0.2791, "lr": 2.1262135922330097e-05, "epoch": 0.3741496598639456, "percentage": 5.34, "elapsed_time": "0:18:00", "remaining_time": "5:18:58"}
{"current_steps": 225, "total_steps": 4116, "loss": 0.2877, "lr": 2.1747572815533982e-05, "epoch": 0.3826530612244898, "percentage": 5.47, "elapsed_time": "0:18:25", "remaining_time": "5:18:44"}
{"current_steps": 230, "total_steps": 4116, "loss": 0.2772, "lr": 2.2233009708737864e-05, "epoch": 0.391156462585034, "percentage": 5.59, "elapsed_time": "0:18:50", "remaining_time": "5:18:24"}
{"current_steps": 235, "total_steps": 4116, "loss": 0.2864, "lr": 2.271844660194175e-05, "epoch": 0.39965986394557823, "percentage": 5.71, "elapsed_time": "0:19:18", "remaining_time": "5:18:55"}
{"current_steps": 240, "total_steps": 4116, "loss": 0.2837, "lr": 2.3203883495145632e-05, "epoch": 0.40816326530612246, "percentage": 5.83, "elapsed_time": "0:19:42", "remaining_time": "5:18:17"}
{"current_steps": 245, "total_steps": 4116, "loss": 0.27, "lr": 2.3689320388349514e-05, "epoch": 0.4166666666666667, "percentage": 5.95, "elapsed_time": "0:20:07", "remaining_time": "5:17:57"}
{"current_steps": 250, "total_steps": 4116, "loss": 0.2859, "lr": 2.41747572815534e-05, "epoch": 0.42517006802721086, "percentage": 6.07, "elapsed_time": "0:20:32", "remaining_time": "5:17:38"}
{"current_steps": 255, "total_steps": 4116, "loss": 0.2782, "lr": 2.466019417475728e-05, "epoch": 0.4336734693877551, "percentage": 6.2, "elapsed_time": "0:20:56", "remaining_time": "5:16:58"}
{"current_steps": 260, "total_steps": 4116, "loss": 0.2709, "lr": 2.514563106796117e-05, "epoch": 0.4421768707482993, "percentage": 6.32, "elapsed_time": "0:21:17", "remaining_time": "5:15:50"}
{"current_steps": 265, "total_steps": 4116, "loss": 0.2754, "lr": 2.5631067961165052e-05, "epoch": 0.45068027210884354, "percentage": 6.44, "elapsed_time": "0:21:43", "remaining_time": "5:15:41"}
{"current_steps": 270, "total_steps": 4116, "loss": 0.296, "lr": 2.6116504854368934e-05, "epoch": 0.45918367346938777, "percentage": 6.56, "elapsed_time": "0:22:07", "remaining_time": "5:15:08"}
{"current_steps": 275, "total_steps": 4116, "loss": 0.2862, "lr": 2.660194174757282e-05, "epoch": 0.467687074829932, "percentage": 6.68, "elapsed_time": "0:22:33", "remaining_time": "5:14:59"}
{"current_steps": 280, "total_steps": 4116, "loss": 0.2799, "lr": 2.7087378640776702e-05, "epoch": 0.47619047619047616, "percentage": 6.8, "elapsed_time": "0:22:53", "remaining_time": "5:13:41"}
{"current_steps": 285, "total_steps": 4116, "loss": 0.2717, "lr": 2.7572815533980587e-05, "epoch": 0.4846938775510204, "percentage": 6.92, "elapsed_time": "0:23:16", "remaining_time": "5:12:47"}
{"current_steps": 290, "total_steps": 4116, "loss": 0.2586, "lr": 2.805825242718447e-05, "epoch": 0.4931972789115646, "percentage": 7.05, "elapsed_time": "0:23:38", "remaining_time": "5:11:57"}
{"current_steps": 295, "total_steps": 4116, "loss": 0.28, "lr": 2.854368932038835e-05, "epoch": 0.5017006802721088, "percentage": 7.17, "elapsed_time": "0:23:59", "remaining_time": "5:10:48"}
{"current_steps": 300, "total_steps": 4116, "loss": 0.2884, "lr": 2.9029126213592237e-05, "epoch": 0.5102040816326531, "percentage": 7.29, "elapsed_time": "0:24:21", "remaining_time": "5:09:49"}
{"current_steps": 305, "total_steps": 4116, "loss": 0.2719, "lr": 2.951456310679612e-05, "epoch": 0.5187074829931972, "percentage": 7.41, "elapsed_time": "0:24:48", "remaining_time": "5:09:58"}
{"current_steps": 310, "total_steps": 4116, "loss": 0.2805, "lr": 3.0000000000000004e-05, "epoch": 0.5272108843537415, "percentage": 7.53, "elapsed_time": "0:25:10", "remaining_time": "5:09:08"}
{"current_steps": 315, "total_steps": 4116, "loss": 0.271, "lr": 3.0485436893203886e-05, "epoch": 0.5357142857142857, "percentage": 7.65, "elapsed_time": "0:25:34", "remaining_time": "5:08:38"}
{"current_steps": 320, "total_steps": 4116, "loss": 0.2694, "lr": 3.097087378640777e-05, "epoch": 0.54421768707483, "percentage": 7.77, "elapsed_time": "0:25:57", "remaining_time": "5:07:54"}
{"current_steps": 325, "total_steps": 4116, "loss": 0.2669, "lr": 3.1456310679611654e-05, "epoch": 0.5527210884353742, "percentage": 7.9, "elapsed_time": "0:26:20", "remaining_time": "5:07:17"}
{"current_steps": 330, "total_steps": 4116, "loss": 0.2687, "lr": 3.194174757281554e-05, "epoch": 0.5612244897959183, "percentage": 8.02, "elapsed_time": "0:26:43", "remaining_time": "5:06:33"}
{"current_steps": 335, "total_steps": 4116, "loss": 0.2534, "lr": 3.242718446601942e-05, "epoch": 0.5697278911564626, "percentage": 8.14, "elapsed_time": "0:27:06", "remaining_time": "5:05:59"}
{"current_steps": 340, "total_steps": 4116, "loss": 0.2688, "lr": 3.29126213592233e-05, "epoch": 0.5782312925170068, "percentage": 8.26, "elapsed_time": "0:27:31", "remaining_time": "5:05:36"}
{"current_steps": 345, "total_steps": 4116, "loss": 0.2672, "lr": 3.339805825242719e-05, "epoch": 0.5867346938775511, "percentage": 8.38, "elapsed_time": "0:27:58", "remaining_time": "5:05:44"}
{"current_steps": 350, "total_steps": 4116, "loss": 0.2594, "lr": 3.388349514563107e-05, "epoch": 0.5952380952380952, "percentage": 8.5, "elapsed_time": "0:28:23", "remaining_time": "5:05:27"}
{"current_steps": 355, "total_steps": 4116, "loss": 0.265, "lr": 3.436893203883495e-05, "epoch": 0.6037414965986394, "percentage": 8.62, "elapsed_time": "0:28:44", "remaining_time": "5:04:26"}
{"current_steps": 360, "total_steps": 4116, "loss": 0.2826, "lr": 3.485436893203884e-05, "epoch": 0.6122448979591837, "percentage": 8.75, "elapsed_time": "0:29:09", "remaining_time": "5:04:15"}
{"current_steps": 365, "total_steps": 4116, "loss": 0.2497, "lr": 3.5339805825242724e-05, "epoch": 0.6207482993197279, "percentage": 8.87, "elapsed_time": "0:29:34", "remaining_time": "5:03:56"}
{"current_steps": 370, "total_steps": 4116, "loss": 0.2689, "lr": 3.582524271844661e-05, "epoch": 0.6292517006802721, "percentage": 8.99, "elapsed_time": "0:30:00", "remaining_time": "5:03:51"}
{"current_steps": 375, "total_steps": 4116, "loss": 0.2529, "lr": 3.631067961165049e-05, "epoch": 0.6377551020408163, "percentage": 9.11, "elapsed_time": "0:30:21", "remaining_time": "5:02:55"}
{"current_steps": 380, "total_steps": 4116, "loss": 0.2595, "lr": 3.679611650485437e-05, "epoch": 0.6462585034013606, "percentage": 9.23, "elapsed_time": "0:30:46", "remaining_time": "5:02:32"}
{"current_steps": 385, "total_steps": 4116, "loss": 0.2472, "lr": 3.728155339805826e-05, "epoch": 0.6547619047619048, "percentage": 9.35, "elapsed_time": "0:31:07", "remaining_time": "5:01:42"}
{"current_steps": 390, "total_steps": 4116, "loss": 0.2682, "lr": 3.776699029126214e-05, "epoch": 0.6632653061224489, "percentage": 9.48, "elapsed_time": "0:31:29", "remaining_time": "5:00:52"}
{"current_steps": 395, "total_steps": 4116, "loss": 0.2691, "lr": 3.825242718446602e-05, "epoch": 0.6717687074829932, "percentage": 9.6, "elapsed_time": "0:31:51", "remaining_time": "5:00:06"}
{"current_steps": 400, "total_steps": 4116, "loss": 0.2539, "lr": 3.873786407766991e-05, "epoch": 0.6802721088435374, "percentage": 9.72, "elapsed_time": "0:32:14", "remaining_time": "4:59:27"}
{"current_steps": 405, "total_steps": 4116, "loss": 0.2598, "lr": 3.9223300970873787e-05, "epoch": 0.6887755102040817, "percentage": 9.84, "elapsed_time": "0:32:36", "remaining_time": "4:58:48"}
{"current_steps": 410, "total_steps": 4116, "loss": 0.2424, "lr": 3.970873786407767e-05, "epoch": 0.6972789115646258, "percentage": 9.96, "elapsed_time": "0:32:56", "remaining_time": "4:57:49"}
{"current_steps": 415, "total_steps": 4116, "loss": 0.2615, "lr": 3.9999971224844676e-05, "epoch": 0.70578231292517, "percentage": 10.08, "elapsed_time": "0:33:18", "remaining_time": "4:57:02"}
{"current_steps": 420, "total_steps": 4116, "loss": 0.2535, "lr": 3.9999647505298214e-05, "epoch": 0.7142857142857143, "percentage": 10.2, "elapsed_time": "0:33:42", "remaining_time": "4:56:38"}
{"current_steps": 425, "total_steps": 4116, "loss": 0.2614, "lr": 3.999896410310244e-05, "epoch": 0.7227891156462585, "percentage": 10.33, "elapsed_time": "0:34:06", "remaining_time": "4:56:15"}
{"current_steps": 430, "total_steps": 4116, "loss": 0.2477, "lr": 3.9997921030547975e-05, "epoch": 0.7312925170068028, "percentage": 10.45, "elapsed_time": "0:34:35", "remaining_time": "4:56:35"}
{"current_steps": 435, "total_steps": 4116, "loss": 0.2437, "lr": 3.9996518306393906e-05, "epoch": 0.7397959183673469, "percentage": 10.57, "elapsed_time": "0:35:03", "remaining_time": "4:56:40"}
{"current_steps": 440, "total_steps": 4116, "loss": 0.2681, "lr": 3.999475595586746e-05, "epoch": 0.7482993197278912, "percentage": 10.69, "elapsed_time": "0:35:26", "remaining_time": "4:56:08"}
{"current_steps": 445, "total_steps": 4116, "loss": 0.2667, "lr": 3.9992634010663535e-05, "epoch": 0.7568027210884354, "percentage": 10.81, "elapsed_time": "0:35:50", "remaining_time": "4:55:44"}
{"current_steps": 450, "total_steps": 4116, "loss": 0.2571, "lr": 3.999015250894415e-05, "epoch": 0.7653061224489796, "percentage": 10.93, "elapsed_time": "0:36:14", "remaining_time": "4:55:17"}
{"current_steps": 455, "total_steps": 4116, "loss": 0.2572, "lr": 3.998731149533774e-05, "epoch": 0.7738095238095238, "percentage": 11.05, "elapsed_time": "0:36:36", "remaining_time": "4:54:35"}
{"current_steps": 460, "total_steps": 4116, "loss": 0.2295, "lr": 3.9984111020938376e-05, "epoch": 0.782312925170068, "percentage": 11.18, "elapsed_time": "0:37:00", "remaining_time": "4:54:04"}
{"current_steps": 465, "total_steps": 4116, "loss": 0.2501, "lr": 3.998055114330483e-05, "epoch": 0.7908163265306123, "percentage": 11.3, "elapsed_time": "0:37:26", "remaining_time": "4:53:57"}
{"current_steps": 470, "total_steps": 4116, "loss": 0.2444, "lr": 3.9976631926459536e-05, "epoch": 0.7993197278911565, "percentage": 11.42, "elapsed_time": "0:37:50", "remaining_time": "4:53:33"}
{"current_steps": 475, "total_steps": 4116, "loss": 0.255, "lr": 3.997235344088747e-05, "epoch": 0.8078231292517006, "percentage": 11.54, "elapsed_time": "0:38:14", "remaining_time": "4:53:07"}
{"current_steps": 480, "total_steps": 4116, "loss": 0.2471, "lr": 3.9967715763534805e-05, "epoch": 0.8163265306122449, "percentage": 11.66, "elapsed_time": "0:38:38", "remaining_time": "4:52:41"}
{"current_steps": 485, "total_steps": 4116, "loss": 0.2418, "lr": 3.9962718977807637e-05, "epoch": 0.8248299319727891, "percentage": 11.78, "elapsed_time": "0:39:02", "remaining_time": "4:52:17"}
{"current_steps": 490, "total_steps": 4116, "loss": 0.2394, "lr": 3.995736317357041e-05, "epoch": 0.8333333333333334, "percentage": 11.9, "elapsed_time": "0:39:23", "remaining_time": "4:51:28"}
{"current_steps": 495, "total_steps": 4116, "loss": 0.2535, "lr": 3.9951648447144305e-05, "epoch": 0.8418367346938775, "percentage": 12.03, "elapsed_time": "0:39:44", "remaining_time": "4:50:45"}
{"current_steps": 500, "total_steps": 4116, "loss": 0.2452, "lr": 3.9945574901305534e-05, "epoch": 0.8503401360544217, "percentage": 12.15, "elapsed_time": "0:40:07", "remaining_time": "4:50:09"}
{"current_steps": 505, "total_steps": 4116, "loss": 0.2761, "lr": 3.9939142645283475e-05, "epoch": 0.858843537414966, "percentage": 12.27, "elapsed_time": "0:40:29", "remaining_time": "4:49:30"}
{"current_steps": 510, "total_steps": 4116, "loss": 0.2409, "lr": 3.9932351794758705e-05, "epoch": 0.8673469387755102, "percentage": 12.39, "elapsed_time": "0:40:50", "remaining_time": "4:48:49"}
{"current_steps": 515, "total_steps": 4116, "loss": 0.2383, "lr": 3.992520247186094e-05, "epoch": 0.8758503401360545, "percentage": 12.51, "elapsed_time": "0:41:12", "remaining_time": "4:48:09"}
{"current_steps": 520, "total_steps": 4116, "loss": 0.251, "lr": 3.991769480516681e-05, "epoch": 0.8843537414965986, "percentage": 12.63, "elapsed_time": "0:41:34", "remaining_time": "4:47:28"}
{"current_steps": 525, "total_steps": 4116, "loss": 0.2501, "lr": 3.9909828929697575e-05, "epoch": 0.8928571428571429, "percentage": 12.76, "elapsed_time": "0:41:58", "remaining_time": "4:47:03"}
{"current_steps": 530, "total_steps": 4116, "loss": 0.2543, "lr": 3.9901604986916666e-05, "epoch": 0.9013605442176871, "percentage": 12.88, "elapsed_time": "0:42:19", "remaining_time": "4:46:21"}
{"current_steps": 535, "total_steps": 4116, "loss": 0.2385, "lr": 3.9893023124727176e-05, "epoch": 0.9098639455782312, "percentage": 13.0, "elapsed_time": "0:42:43", "remaining_time": "4:46:00"}
{"current_steps": 540, "total_steps": 4116, "loss": 0.2367, "lr": 3.988408349746916e-05, "epoch": 0.9183673469387755, "percentage": 13.12, "elapsed_time": "0:43:08", "remaining_time": "4:45:41"}
{"current_steps": 545, "total_steps": 4116, "loss": 0.2577, "lr": 3.9874786265916895e-05, "epoch": 0.9268707482993197, "percentage": 13.24, "elapsed_time": "0:43:35", "remaining_time": "4:45:38"}
{"current_steps": 550, "total_steps": 4116, "loss": 0.2431, "lr": 3.986513159727598e-05, "epoch": 0.935374149659864, "percentage": 13.36, "elapsed_time": "0:43:56", "remaining_time": "4:44:57"}
{"current_steps": 555, "total_steps": 4116, "loss": 0.2308, "lr": 3.985511966518031e-05, "epoch": 0.9438775510204082, "percentage": 13.48, "elapsed_time": "0:44:19", "remaining_time": "4:44:25"}
{"current_steps": 560, "total_steps": 4116, "loss": 0.2532, "lr": 3.984475064968897e-05, "epoch": 0.9523809523809523, "percentage": 13.61, "elapsed_time": "0:44:44", "remaining_time": "4:44:04"}
{"current_steps": 565, "total_steps": 4116, "loss": 0.2338, "lr": 3.9834024737282994e-05, "epoch": 0.9608843537414966, "percentage": 13.73, "elapsed_time": "0:45:08", "remaining_time": "4:43:41"}
{"current_steps": 570, "total_steps": 4116, "loss": 0.2443, "lr": 3.9822942120862016e-05, "epoch": 0.9693877551020408, "percentage": 13.85, "elapsed_time": "0:45:33", "remaining_time": "4:43:23"}
{"current_steps": 575, "total_steps": 4116, "loss": 0.2446, "lr": 3.981150299974079e-05, "epoch": 0.9778911564625851, "percentage": 13.97, "elapsed_time": "0:45:55", "remaining_time": "4:42:48"}
{"current_steps": 580, "total_steps": 4116, "loss": 0.236, "lr": 3.979970757964561e-05, "epoch": 0.9863945578231292, "percentage": 14.09, "elapsed_time": "0:46:21", "remaining_time": "4:42:37"}
{"current_steps": 585, "total_steps": 4116, "loss": 0.2339, "lr": 3.978755607271061e-05, "epoch": 0.9948979591836735, "percentage": 14.21, "elapsed_time": "0:46:48", "remaining_time": "4:42:29"}
{"current_steps": 590, "total_steps": 4116, "loss": 0.2437, "lr": 3.977504869747395e-05, "epoch": 1.0034013605442176, "percentage": 14.33, "elapsed_time": "0:47:12", "remaining_time": "4:42:07"}
{"current_steps": 595, "total_steps": 4116, "loss": 0.2302, "lr": 3.976218567887389e-05, "epoch": 1.0119047619047619, "percentage": 14.46, "elapsed_time": "0:47:36", "remaining_time": "4:41:41"}
{"current_steps": 600, "total_steps": 4116, "loss": 0.2186, "lr": 3.974896724824475e-05, "epoch": 1.0204081632653061, "percentage": 14.58, "elapsed_time": "0:48:02", "remaining_time": "4:41:29"}
{"current_steps": 605, "total_steps": 4116, "loss": 0.2217, "lr": 3.97353936433127e-05, "epoch": 1.0289115646258504, "percentage": 14.7, "elapsed_time": "0:48:28", "remaining_time": "4:41:21"}
{"current_steps": 610, "total_steps": 4116, "loss": 0.2344, "lr": 3.972146510819158e-05, "epoch": 1.0374149659863945, "percentage": 14.82, "elapsed_time": "0:48:53", "remaining_time": "4:41:03"}
{"current_steps": 615, "total_steps": 4116, "loss": 0.228, "lr": 3.970718189337841e-05, "epoch": 1.0459183673469388, "percentage": 14.94, "elapsed_time": "0:49:14", "remaining_time": "4:40:19"}
{"current_steps": 620, "total_steps": 4116, "loss": 0.2323, "lr": 3.969254425574895e-05, "epoch": 1.054421768707483, "percentage": 15.06, "elapsed_time": "0:49:39", "remaining_time": "4:40:02"}
{"current_steps": 625, "total_steps": 4116, "loss": 0.2479, "lr": 3.9677552458553046e-05, "epoch": 1.0629251700680271, "percentage": 15.18, "elapsed_time": "0:50:05", "remaining_time": "4:39:48"}
{"current_steps": 630, "total_steps": 4116, "loss": 0.2375, "lr": 3.9662206771409925e-05, "epoch": 1.0714285714285714, "percentage": 15.31, "elapsed_time": "0:50:27", "remaining_time": "4:39:11"}
{"current_steps": 635, "total_steps": 4116, "loss": 0.2341, "lr": 3.964650747030332e-05, "epoch": 1.0799319727891157, "percentage": 15.43, "elapsed_time": "0:50:50", "remaining_time": "4:38:43"}
{"current_steps": 640, "total_steps": 4116, "loss": 0.2394, "lr": 3.963045483757652e-05, "epoch": 1.08843537414966, "percentage": 15.55, "elapsed_time": "0:51:14", "remaining_time": "4:38:19"}
{"current_steps": 645, "total_steps": 4116, "loss": 0.2257, "lr": 3.961404916192728e-05, "epoch": 1.096938775510204, "percentage": 15.67, "elapsed_time": "0:51:37", "remaining_time": "4:37:50"}
{"current_steps": 650, "total_steps": 4116, "loss": 0.2254, "lr": 3.959729073840265e-05, "epoch": 1.1054421768707483, "percentage": 15.79, "elapsed_time": "0:52:03", "remaining_time": "4:37:34"}
{"current_steps": 655, "total_steps": 4116, "loss": 0.2114, "lr": 3.958017986839364e-05, "epoch": 1.1139455782312926, "percentage": 15.91, "elapsed_time": "0:52:29", "remaining_time": "4:37:19"}
{"current_steps": 660, "total_steps": 4116, "loss": 0.2183, "lr": 3.956271685962985e-05, "epoch": 1.1224489795918366, "percentage": 16.03, "elapsed_time": "0:52:52", "remaining_time": "4:36:52"}
{"current_steps": 665, "total_steps": 4116, "loss": 0.2085, "lr": 3.954490202617385e-05, "epoch": 1.130952380952381, "percentage": 16.16, "elapsed_time": "0:53:15", "remaining_time": "4:36:25"}
{"current_steps": 670, "total_steps": 4116, "loss": 0.2177, "lr": 3.9526735688415624e-05, "epoch": 1.1394557823129252, "percentage": 16.28, "elapsed_time": "0:53:35", "remaining_time": "4:35:38"}
{"current_steps": 675, "total_steps": 4116, "loss": 0.2215, "lr": 3.9508218173066766e-05, "epoch": 1.1479591836734695, "percentage": 16.4, "elapsed_time": "0:53:57", "remaining_time": "4:35:05"}
{"current_steps": 680, "total_steps": 4116, "loss": 0.2203, "lr": 3.948934981315457e-05, "epoch": 1.1564625850340136, "percentage": 16.52, "elapsed_time": "0:54:19", "remaining_time": "4:34:32"}
{"current_steps": 685, "total_steps": 4116, "loss": 0.2126, "lr": 3.947013094801612e-05, "epoch": 1.1649659863945578, "percentage": 16.64, "elapsed_time": "0:54:45", "remaining_time": "4:34:16"}
{"current_steps": 690, "total_steps": 4116, "loss": 0.2187, "lr": 3.945056192329212e-05, "epoch": 1.1734693877551021, "percentage": 16.76, "elapsed_time": "0:55:12", "remaining_time": "4:34:05"}
{"current_steps": 695, "total_steps": 4116, "loss": 0.2304, "lr": 3.9430643090920705e-05, "epoch": 1.1819727891156462, "percentage": 16.89, "elapsed_time": "0:55:35", "remaining_time": "4:33:39"}
{"current_steps": 700, "total_steps": 4116, "loss": 0.2262, "lr": 3.941037480913111e-05, "epoch": 1.1904761904761905, "percentage": 17.01, "elapsed_time": "0:55:56", "remaining_time": "4:32:57"}
{"current_steps": 705, "total_steps": 4116, "loss": 0.2217, "lr": 3.9389757442437214e-05, "epoch": 1.1989795918367347, "percentage": 17.13, "elapsed_time": "0:56:16", "remaining_time": "4:32:16"}
{"current_steps": 710, "total_steps": 4116, "loss": 0.2246, "lr": 3.936879136163102e-05, "epoch": 1.2074829931972788, "percentage": 17.25, "elapsed_time": "0:56:37", "remaining_time": "4:31:37"}
{"current_steps": 715, "total_steps": 4116, "loss": 0.218, "lr": 3.934747694377594e-05, "epoch": 1.215986394557823, "percentage": 17.37, "elapsed_time": "0:57:00", "remaining_time": "4:31:11"}
{"current_steps": 720, "total_steps": 4116, "loss": 0.2176, "lr": 3.932581457220004e-05, "epoch": 1.2244897959183674, "percentage": 17.49, "elapsed_time": "0:57:22", "remaining_time": "4:30:36"}
{"current_steps": 725, "total_steps": 4116, "loss": 0.2148, "lr": 3.9303804636489165e-05, "epoch": 1.2329931972789117, "percentage": 17.61, "elapsed_time": "0:57:44", "remaining_time": "4:30:02"}
{"current_steps": 730, "total_steps": 4116, "loss": 0.2163, "lr": 3.928144753247987e-05, "epoch": 1.2414965986394557, "percentage": 17.74, "elapsed_time": "0:58:09", "remaining_time": "4:29:45"}
{"current_steps": 735, "total_steps": 4116, "loss": 0.2301, "lr": 3.925874366225237e-05, "epoch": 1.25, "percentage": 17.86, "elapsed_time": "0:58:37", "remaining_time": "4:29:38"}
{"current_steps": 740, "total_steps": 4116, "loss": 0.2062, "lr": 3.923569343412327e-05, "epoch": 1.2585034013605443, "percentage": 17.98, "elapsed_time": "0:58:57", "remaining_time": "4:29:00"}
{"current_steps": 745, "total_steps": 4116, "loss": 0.2163, "lr": 3.921229726263824e-05, "epoch": 1.2670068027210886, "percentage": 18.1, "elapsed_time": "0:59:21", "remaining_time": "4:28:33"}
{"current_steps": 750, "total_steps": 4116, "loss": 0.2245, "lr": 3.918855556856453e-05, "epoch": 1.2755102040816326, "percentage": 18.22, "elapsed_time": "0:59:44", "remaining_time": "4:28:08"}
{"current_steps": 755, "total_steps": 4116, "loss": 0.2275, "lr": 3.916446877888345e-05, "epoch": 1.284013605442177, "percentage": 18.34, "elapsed_time": "1:00:07", "remaining_time": "4:27:38"}
{"current_steps": 760, "total_steps": 4116, "loss": 0.2338, "lr": 3.914003732678264e-05, "epoch": 1.2925170068027212, "percentage": 18.46, "elapsed_time": "1:00:31", "remaining_time": "4:27:17"}
{"current_steps": 765, "total_steps": 4116, "loss": 0.2213, "lr": 3.911526165164831e-05, "epoch": 1.3010204081632653, "percentage": 18.59, "elapsed_time": "1:00:52", "remaining_time": "4:26:40"}
{"current_steps": 770, "total_steps": 4116, "loss": 0.2334, "lr": 3.909014219905736e-05, "epoch": 1.3095238095238095, "percentage": 18.71, "elapsed_time": "1:01:14", "remaining_time": "4:26:06"}
{"current_steps": 775, "total_steps": 4116, "loss": 0.2268, "lr": 3.906467942076929e-05, "epoch": 1.3180272108843538, "percentage": 18.83, "elapsed_time": "1:01:38", "remaining_time": "4:25:44"}
{"current_steps": 780, "total_steps": 4116, "loss": 0.2137, "lr": 3.903887377471816e-05, "epoch": 1.3265306122448979, "percentage": 18.95, "elapsed_time": "1:02:01", "remaining_time": "4:25:15"}
{"current_steps": 785, "total_steps": 4116, "loss": 0.2167, "lr": 3.901272572500431e-05, "epoch": 1.3350340136054422, "percentage": 19.07, "elapsed_time": "1:02:24", "remaining_time": "4:24:50"}
{"current_steps": 790, "total_steps": 4116, "loss": 0.2161, "lr": 3.8986235741886016e-05, "epoch": 1.3435374149659864, "percentage": 19.19, "elapsed_time": "1:02:47", "remaining_time": "4:24:20"}
{"current_steps": 795, "total_steps": 4116, "loss": 0.2151, "lr": 3.8959404301771035e-05, "epoch": 1.3520408163265305, "percentage": 19.31, "elapsed_time": "1:03:09", "remaining_time": "4:23:51"}
{"current_steps": 800, "total_steps": 4116, "loss": 0.2052, "lr": 3.893223188720804e-05, "epoch": 1.3605442176870748, "percentage": 19.44, "elapsed_time": "1:03:33", "remaining_time": "4:23:26"}
{"current_steps": 805, "total_steps": 4116, "loss": 0.2152, "lr": 3.890471898687797e-05, "epoch": 1.369047619047619, "percentage": 19.56, "elapsed_time": "1:03:58", "remaining_time": "4:23:06"}
{"current_steps": 810, "total_steps": 4116, "loss": 0.2197, "lr": 3.887686609558516e-05, "epoch": 1.3775510204081631, "percentage": 19.68, "elapsed_time": "1:04:20", "remaining_time": "4:22:37"}
{"current_steps": 815, "total_steps": 4116, "loss": 0.206, "lr": 3.8848673714248545e-05, "epoch": 1.3860544217687074, "percentage": 19.8, "elapsed_time": "1:04:47", "remaining_time": "4:22:25"}
{"current_steps": 820, "total_steps": 4116, "loss": 0.2108, "lr": 3.882014234989258e-05, "epoch": 1.3945578231292517, "percentage": 19.92, "elapsed_time": "1:05:12", "remaining_time": "4:22:04"}
{"current_steps": 825, "total_steps": 4116, "loss": 0.2176, "lr": 3.879127251563814e-05, "epoch": 1.403061224489796, "percentage": 20.04, "elapsed_time": "1:05:33", "remaining_time": "4:21:31"}
{"current_steps": 830, "total_steps": 4116, "loss": 0.2034, "lr": 3.87620647306933e-05, "epoch": 1.4115646258503403, "percentage": 20.17, "elapsed_time": "1:05:55", "remaining_time": "4:20:59"}
{"current_steps": 835, "total_steps": 4116, "loss": 0.2153, "lr": 3.873251952034398e-05, "epoch": 1.4200680272108843, "percentage": 20.29, "elapsed_time": "1:06:16", "remaining_time": "4:20:24"}
{"current_steps": 840, "total_steps": 4116, "loss": 0.2094, "lr": 3.8702637415944554e-05, "epoch": 1.4285714285714286, "percentage": 20.41, "elapsed_time": "1:06:38", "remaining_time": "4:19:55"}
{"current_steps": 845, "total_steps": 4116, "loss": 0.1979, "lr": 3.867241895490818e-05, "epoch": 1.4370748299319729, "percentage": 20.53, "elapsed_time": "1:06:59", "remaining_time": "4:19:18"}
{"current_steps": 850, "total_steps": 4116, "loss": 0.221, "lr": 3.864186468069727e-05, "epoch": 1.445578231292517, "percentage": 20.65, "elapsed_time": "1:07:25", "remaining_time": "4:19:04"}
{"current_steps": 855, "total_steps": 4116, "loss": 0.2203, "lr": 3.86109751428136e-05, "epoch": 1.4540816326530612, "percentage": 20.77, "elapsed_time": "1:07:48", "remaining_time": "4:18:38"}
{"current_steps": 860, "total_steps": 4116, "loss": 0.2228, "lr": 3.8579750896788526e-05, "epoch": 1.4625850340136055, "percentage": 20.89, "elapsed_time": "1:08:11", "remaining_time": "4:18:10"}
{"current_steps": 865, "total_steps": 4116, "loss": 0.2133, "lr": 3.8548192504172905e-05, "epoch": 1.4710884353741496, "percentage": 21.02, "elapsed_time": "1:08:35", "remaining_time": "4:17:46"}
{"current_steps": 870, "total_steps": 4116, "loss": 0.2185, "lr": 3.851630053252706e-05, "epoch": 1.4795918367346939, "percentage": 21.14, "elapsed_time": "1:08:57", "remaining_time": "4:17:18"}
{"current_steps": 875, "total_steps": 4116, "loss": 0.2086, "lr": 3.848407555541054e-05, "epoch": 1.4880952380952381, "percentage": 21.26, "elapsed_time": "1:09:18", "remaining_time": "4:16:41"}
{"current_steps": 880, "total_steps": 4116, "loss": 0.2069, "lr": 3.845151815237183e-05, "epoch": 1.4965986394557822, "percentage": 21.38, "elapsed_time": "1:09:45", "remaining_time": "4:16:31"}
{"current_steps": 885, "total_steps": 4116, "loss": 0.2251, "lr": 3.841862890893787e-05, "epoch": 1.5051020408163265, "percentage": 21.5, "elapsed_time": "1:10:10", "remaining_time": "4:16:12"}
{"current_steps": 890, "total_steps": 4116, "loss": 0.2016, "lr": 3.838540841660363e-05, "epoch": 1.5136054421768708, "percentage": 21.62, "elapsed_time": "1:10:36", "remaining_time": "4:15:54"}
{"current_steps": 895, "total_steps": 4116, "loss": 0.2007, "lr": 3.835185727282135e-05, "epoch": 1.5221088435374148, "percentage": 21.74, "elapsed_time": "1:11:00", "remaining_time": "4:15:34"}
{"current_steps": 900, "total_steps": 4116, "loss": 0.214, "lr": 3.8317976080989884e-05, "epoch": 1.5306122448979593, "percentage": 21.87, "elapsed_time": "1:11:23", "remaining_time": "4:15:07"}
{"current_steps": 905, "total_steps": 4116, "loss": 0.2239, "lr": 3.828376545044381e-05, "epoch": 1.5391156462585034, "percentage": 21.99, "elapsed_time": "1:11:45", "remaining_time": "4:14:34"}
{"current_steps": 910, "total_steps": 4116, "loss": 0.216, "lr": 3.824922599644249e-05, "epoch": 1.5476190476190477, "percentage": 22.11, "elapsed_time": "1:12:09", "remaining_time": "4:14:14"}
{"current_steps": 915, "total_steps": 4116, "loss": 0.2208, "lr": 3.8214358340159e-05, "epoch": 1.556122448979592, "percentage": 22.23, "elapsed_time": "1:12:32", "remaining_time": "4:13:45"}
{"current_steps": 920, "total_steps": 4116, "loss": 0.2032, "lr": 3.817916310866892e-05, "epoch": 1.564625850340136, "percentage": 22.35, "elapsed_time": "1:12:54", "remaining_time": "4:13:16"}
{"current_steps": 925, "total_steps": 4116, "loss": 0.2115, "lr": 3.814364093493912e-05, "epoch": 1.5731292517006803, "percentage": 22.47, "elapsed_time": "1:13:18", "remaining_time": "4:12:51"}
{"current_steps": 930, "total_steps": 4116, "loss": 0.2088, "lr": 3.810779245781633e-05, "epoch": 1.5816326530612246, "percentage": 22.59, "elapsed_time": "1:13:39", "remaining_time": "4:12:18"}
{"current_steps": 935, "total_steps": 4116, "loss": 0.2133, "lr": 3.807161832201569e-05, "epoch": 1.5901360544217686, "percentage": 22.72, "elapsed_time": "1:14:03", "remaining_time": "4:11:56"}
{"current_steps": 940, "total_steps": 4116, "loss": 0.1985, "lr": 3.80351191781091e-05, "epoch": 1.598639455782313, "percentage": 22.84, "elapsed_time": "1:14:24", "remaining_time": "4:11:24"}
{"current_steps": 945, "total_steps": 4116, "loss": 0.2173, "lr": 3.799829568251358e-05, "epoch": 1.6071428571428572, "percentage": 22.96, "elapsed_time": "1:14:48", "remaining_time": "4:11:01"}
{"current_steps": 950, "total_steps": 4116, "loss": 0.2194, "lr": 3.7961148497479414e-05, "epoch": 1.6156462585034013, "percentage": 23.08, "elapsed_time": "1:15:10", "remaining_time": "4:10:31"}
{"current_steps": 955, "total_steps": 4116, "loss": 0.2152, "lr": 3.792367829107828e-05, "epoch": 1.6241496598639455, "percentage": 23.2, "elapsed_time": "1:15:36", "remaining_time": "4:10:16"}
{"current_steps": 960, "total_steps": 4116, "loss": 0.2108, "lr": 3.7885885737191214e-05, "epoch": 1.6326530612244898, "percentage": 23.32, "elapsed_time": "1:15:59", "remaining_time": "4:09:48"}
{"current_steps": 965, "total_steps": 4116, "loss": 0.2178, "lr": 3.784777151549648e-05, "epoch": 1.641156462585034, "percentage": 23.45, "elapsed_time": "1:16:22", "remaining_time": "4:09:23"}
{"current_steps": 970, "total_steps": 4116, "loss": 0.2164, "lr": 3.780933631145738e-05, "epoch": 1.6496598639455784, "percentage": 23.57, "elapsed_time": "1:16:44", "remaining_time": "4:08:53"}
{"current_steps": 975, "total_steps": 4116, "loss": 0.2146, "lr": 3.777058081630989e-05, "epoch": 1.6581632653061225, "percentage": 23.69, "elapsed_time": "1:17:06", "remaining_time": "4:08:23"}
{"current_steps": 980, "total_steps": 4116, "loss": 0.2064, "lr": 3.773150572705026e-05, "epoch": 1.6666666666666665, "percentage": 23.81, "elapsed_time": "1:17:29", "remaining_time": "4:07:59"}
{"current_steps": 985, "total_steps": 4116, "loss": 0.2075, "lr": 3.769211174642245e-05, "epoch": 1.675170068027211, "percentage": 23.93, "elapsed_time": "1:17:54", "remaining_time": "4:07:37"}
{"current_steps": 990, "total_steps": 4116, "loss": 0.2183, "lr": 3.765239958290551e-05, "epoch": 1.683673469387755, "percentage": 24.05, "elapsed_time": "1:18:14", "remaining_time": "4:07:04"}
{"current_steps": 995, "total_steps": 4116, "loss": 0.2274, "lr": 3.761236995070086e-05, "epoch": 1.6921768707482994, "percentage": 24.17, "elapsed_time": "1:18:42", "remaining_time": "4:06:54"}
{"current_steps": 1000, "total_steps": 4116, "loss": 0.2083, "lr": 3.757202356971937e-05, "epoch": 1.7006802721088436, "percentage": 24.3, "elapsed_time": "1:19:03", "remaining_time": "4:06:21"}
{"current_steps": 1005, "total_steps": 4116, "loss": 0.2102, "lr": 3.75313611655685e-05, "epoch": 1.7091836734693877, "percentage": 24.42, "elapsed_time": "1:19:24", "remaining_time": "4:05:48"}
{"current_steps": 1010, "total_steps": 4116, "loss": 0.2254, "lr": 3.74903834695392e-05, "epoch": 1.717687074829932, "percentage": 24.54, "elapsed_time": "1:19:48", "remaining_time": "4:05:26"}
{"current_steps": 1015, "total_steps": 4116, "loss": 0.2143, "lr": 3.7449091218592765e-05, "epoch": 1.7261904761904763, "percentage": 24.66, "elapsed_time": "1:20:13", "remaining_time": "4:05:06"}
{"current_steps": 1020, "total_steps": 4116, "loss": 0.2183, "lr": 3.740748515534761e-05, "epoch": 1.7346938775510203, "percentage": 24.78, "elapsed_time": "1:20:38", "remaining_time": "4:04:45"}
{"current_steps": 1025, "total_steps": 4116, "loss": 0.2104, "lr": 3.736556602806587e-05, "epoch": 1.7431972789115646, "percentage": 24.9, "elapsed_time": "1:20:59", "remaining_time": "4:04:13"}
{"current_steps": 1030, "total_steps": 4116, "loss": 0.2163, "lr": 3.732333459063995e-05, "epoch": 1.751700680272109, "percentage": 25.02, "elapsed_time": "1:21:20", "remaining_time": "4:03:43"}
{"current_steps": 1035, "total_steps": 4116, "loss": 0.2042, "lr": 3.728079160257903e-05, "epoch": 1.760204081632653, "percentage": 25.15, "elapsed_time": "1:21:42", "remaining_time": "4:03:15"}
{"current_steps": 1040, "total_steps": 4116, "loss": 0.2082, "lr": 3.723793782899531e-05, "epoch": 1.7687074829931972, "percentage": 25.27, "elapsed_time": "1:22:07", "remaining_time": "4:02:53"}
{"current_steps": 1045, "total_steps": 4116, "loss": 0.2164, "lr": 3.719477404059032e-05, "epoch": 1.7772108843537415, "percentage": 25.39, "elapsed_time": "1:22:32", "remaining_time": "4:02:34"}
{"current_steps": 1050, "total_steps": 4116, "loss": 0.2062, "lr": 3.715130101364103e-05, "epoch": 1.7857142857142856, "percentage": 25.51, "elapsed_time": "1:22:54", "remaining_time": "4:02:05"}
{"current_steps": 1055, "total_steps": 4116, "loss": 0.207, "lr": 3.710751952998591e-05, "epoch": 1.79421768707483, "percentage": 25.63, "elapsed_time": "1:23:18", "remaining_time": "4:01:43"}
{"current_steps": 1060, "total_steps": 4116, "loss": 0.2078, "lr": 3.7063430377010804e-05, "epoch": 1.8027210884353742, "percentage": 25.75, "elapsed_time": "1:23:40", "remaining_time": "4:01:15"}
{"current_steps": 1065, "total_steps": 4116, "loss": 0.206, "lr": 3.7019034347634884e-05, "epoch": 1.8112244897959182, "percentage": 25.87, "elapsed_time": "1:24:04", "remaining_time": "4:00:51"}
{"current_steps": 1070, "total_steps": 4116, "loss": 0.2262, "lr": 3.6974332240296285e-05, "epoch": 1.8197278911564627, "percentage": 26.0, "elapsed_time": "1:24:28", "remaining_time": "4:00:29"}
{"current_steps": 1075, "total_steps": 4116, "loss": 0.211, "lr": 3.692932485893779e-05, "epoch": 1.8282312925170068, "percentage": 26.12, "elapsed_time": "1:24:53", "remaining_time": "4:00:09"}
{"current_steps": 1080, "total_steps": 4116, "loss": 0.2212, "lr": 3.6884013012992384e-05, "epoch": 1.836734693877551, "percentage": 26.24, "elapsed_time": "1:25:14", "remaining_time": "3:59:38"}
{"current_steps": 1085, "total_steps": 4116, "loss": 0.2111, "lr": 3.683839751736866e-05, "epoch": 1.8452380952380953, "percentage": 26.36, "elapsed_time": "1:25:39", "remaining_time": "3:59:17"}
{"current_steps": 1090, "total_steps": 4116, "loss": 0.2008, "lr": 3.67924791924362e-05, "epoch": 1.8537414965986394, "percentage": 26.48, "elapsed_time": "1:26:02", "remaining_time": "3:58:50"}
{"current_steps": 1095, "total_steps": 4116, "loss": 0.2036, "lr": 3.6746258864010804e-05, "epoch": 1.8622448979591837, "percentage": 26.6, "elapsed_time": "1:26:24", "remaining_time": "3:58:23"}
{"current_steps": 1100, "total_steps": 4116, "loss": 0.2085, "lr": 3.669973736333961e-05, "epoch": 1.870748299319728, "percentage": 26.72, "elapsed_time": "1:26:46", "remaining_time": "3:57:55"}
{"current_steps": 1105, "total_steps": 4116, "loss": 0.2069, "lr": 3.665291552708622e-05, "epoch": 1.879251700680272, "percentage": 26.85, "elapsed_time": "1:27:11", "remaining_time": "3:57:34"}
{"current_steps": 1110, "total_steps": 4116, "loss": 0.2098, "lr": 3.660579419731556e-05, "epoch": 1.8877551020408163, "percentage": 26.97, "elapsed_time": "1:27:37", "remaining_time": "3:57:18"}
{"current_steps": 1115, "total_steps": 4116, "loss": 0.2006, "lr": 3.655837422147881e-05, "epoch": 1.8962585034013606, "percentage": 27.09, "elapsed_time": "1:27:57", "remaining_time": "3:56:45"}
{"current_steps": 1120, "total_steps": 4116, "loss": 0.1946, "lr": 3.6510656452398124e-05, "epoch": 1.9047619047619047, "percentage": 27.21, "elapsed_time": "1:28:21", "remaining_time": "3:56:22"}
{"current_steps": 1125, "total_steps": 4116, "loss": 0.2133, "lr": 3.646264174825131e-05, "epoch": 1.913265306122449, "percentage": 27.33, "elapsed_time": "1:28:43", "remaining_time": "3:55:53"}
{"current_steps": 1130, "total_steps": 4116, "loss": 0.1998, "lr": 3.6414330972556396e-05, "epoch": 1.9217687074829932, "percentage": 27.45, "elapsed_time": "1:29:04", "remaining_time": "3:55:23"}
{"current_steps": 1135, "total_steps": 4116, "loss": 0.209, "lr": 3.6365724994156074e-05, "epoch": 1.9302721088435373, "percentage": 27.58, "elapsed_time": "1:29:27", "remaining_time": "3:54:57"}
{"current_steps": 1140, "total_steps": 4116, "loss": 0.2217, "lr": 3.63168246872021e-05, "epoch": 1.9387755102040818, "percentage": 27.7, "elapsed_time": "1:29:51", "remaining_time": "3:54:35"}
{"current_steps": 1145, "total_steps": 4116, "loss": 0.1975, "lr": 3.626763093113959e-05, "epoch": 1.9472789115646258, "percentage": 27.82, "elapsed_time": "1:30:13", "remaining_time": "3:54:06"}
{"current_steps": 1150, "total_steps": 4116, "loss": 0.2029, "lr": 3.621814461069114e-05, "epoch": 1.95578231292517, "percentage": 27.94, "elapsed_time": "1:30:36", "remaining_time": "3:53:42"}
{"current_steps": 1155, "total_steps": 4116, "loss": 0.2102, "lr": 3.6168366615841e-05, "epoch": 1.9642857142857144, "percentage": 28.06, "elapsed_time": "1:31:01", "remaining_time": "3:53:20"}
{"current_steps": 1160, "total_steps": 4116, "loss": 0.2112, "lr": 3.6118297841818986e-05, "epoch": 1.9727891156462585, "percentage": 28.18, "elapsed_time": "1:31:23", "remaining_time": "3:52:53"}
{"current_steps": 1165, "total_steps": 4116, "loss": 0.2159, "lr": 3.6067939189084446e-05, "epoch": 1.9812925170068028, "percentage": 28.3, "elapsed_time": "1:31:50", "remaining_time": "3:52:37"}
{"current_steps": 1170, "total_steps": 4116, "loss": 0.1954, "lr": 3.6017291563310005e-05, "epoch": 1.989795918367347, "percentage": 28.43, "elapsed_time": "1:32:14", "remaining_time": "3:52:16"}
{"current_steps": 1175, "total_steps": 4116, "loss": 0.2054, "lr": 3.5966355875365344e-05, "epoch": 1.998299319727891, "percentage": 28.55, "elapsed_time": "1:32:41", "remaining_time": "3:51:59"}
{"current_steps": 1180, "total_steps": 4116, "loss": 0.1917, "lr": 3.591513304130076e-05, "epoch": 2.006802721088435, "percentage": 28.67, "elapsed_time": "1:33:04", "remaining_time": "3:51:34"}
{"current_steps": 1185, "total_steps": 4116, "loss": 0.1911, "lr": 3.586362398233072e-05, "epoch": 2.0153061224489797, "percentage": 28.79, "elapsed_time": "1:33:25", "remaining_time": "3:51:04"}
{"current_steps": 1190, "total_steps": 4116, "loss": 0.1874, "lr": 3.58118296248173e-05, "epoch": 2.0238095238095237, "percentage": 28.91, "elapsed_time": "1:33:51", "remaining_time": "3:50:47"}
{"current_steps": 1195, "total_steps": 4116, "loss": 0.1874, "lr": 3.57597509002535e-05, "epoch": 2.0323129251700682, "percentage": 29.03, "elapsed_time": "1:34:16", "remaining_time": "3:50:27"}
{"current_steps": 1200, "total_steps": 4116, "loss": 0.1932, "lr": 3.57073887452465e-05, "epoch": 2.0408163265306123, "percentage": 29.15, "elapsed_time": "1:34:37", "remaining_time": "3:49:55"}
{"current_steps": 1205, "total_steps": 4116, "loss": 0.1842, "lr": 3.565474410150083e-05, "epoch": 2.0493197278911564, "percentage": 29.28, "elapsed_time": "1:34:59", "remaining_time": "3:49:27"}
{"current_steps": 1210, "total_steps": 4116, "loss": 0.192, "lr": 3.560181791580144e-05, "epoch": 2.057823129251701, "percentage": 29.4, "elapsed_time": "1:35:20", "remaining_time": "3:48:58"}
{"current_steps": 1215, "total_steps": 4116, "loss": 0.1855, "lr": 3.5548611139996634e-05, "epoch": 2.066326530612245, "percentage": 29.52, "elapsed_time": "1:35:42", "remaining_time": "3:48:31"}
{"current_steps": 1220, "total_steps": 4116, "loss": 0.1754, "lr": 3.549512473098097e-05, "epoch": 2.074829931972789, "percentage": 29.64, "elapsed_time": "1:36:05", "remaining_time": "3:48:05"}
{"current_steps": 1225, "total_steps": 4116, "loss": 0.1884, "lr": 3.544135965067807e-05, "epoch": 2.0833333333333335, "percentage": 29.76, "elapsed_time": "1:36:33", "remaining_time": "3:47:51"}
{"current_steps": 1230, "total_steps": 4116, "loss": 0.182, "lr": 3.5387316866023314e-05, "epoch": 2.0918367346938775, "percentage": 29.88, "elapsed_time": "1:36:52", "remaining_time": "3:47:18"}
{"current_steps": 1235, "total_steps": 4116, "loss": 0.1804, "lr": 3.533299734894642e-05, "epoch": 2.1003401360544216, "percentage": 30.0, "elapsed_time": "1:37:13", "remaining_time": "3:46:49"}
{"current_steps": 1240, "total_steps": 4116, "loss": 0.1947, "lr": 3.527840207635399e-05, "epoch": 2.108843537414966, "percentage": 30.13, "elapsed_time": "1:37:36", "remaining_time": "3:46:22"}
{"current_steps": 1245, "total_steps": 4116, "loss": 0.1819, "lr": 3.522353203011197e-05, "epoch": 2.11734693877551, "percentage": 30.25, "elapsed_time": "1:37:57", "remaining_time": "3:45:54"}
{"current_steps": 1250, "total_steps": 4116, "loss": 0.1941, "lr": 3.5168388197027894e-05, "epoch": 2.1258503401360542, "percentage": 30.37, "elapsed_time": "1:38:21", "remaining_time": "3:45:30"}
{"current_steps": 1255, "total_steps": 4116, "loss": 0.1842, "lr": 3.5112971568833265e-05, "epoch": 2.1343537414965987, "percentage": 30.49, "elapsed_time": "1:38:43", "remaining_time": "3:45:03"}
{"current_steps": 1260, "total_steps": 4116, "loss": 0.1863, "lr": 3.505728314216563e-05, "epoch": 2.142857142857143, "percentage": 30.61, "elapsed_time": "1:39:04", "remaining_time": "3:44:35"}
{"current_steps": 1265, "total_steps": 4116, "loss": 0.1948, "lr": 3.500132391855067e-05, "epoch": 2.1513605442176873, "percentage": 30.73, "elapsed_time": "1:39:27", "remaining_time": "3:44:10"}
{"current_steps": 1270, "total_steps": 4116, "loss": 0.1918, "lr": 3.494509490438422e-05, "epoch": 2.1598639455782314, "percentage": 30.86, "elapsed_time": "1:39:52", "remaining_time": "3:43:48"}
{"current_steps": 1275, "total_steps": 4116, "loss": 0.1814, "lr": 3.4888597110914155e-05, "epoch": 2.1683673469387754, "percentage": 30.98, "elapsed_time": "1:40:19", "remaining_time": "3:43:33"}
{"current_steps": 1280, "total_steps": 4116, "loss": 0.1794, "lr": 3.483183155422216e-05, "epoch": 2.17687074829932, "percentage": 31.1, "elapsed_time": "1:40:41", "remaining_time": "3:43:04"}
{"current_steps": 1285, "total_steps": 4116, "loss": 0.1778, "lr": 3.477479925520555e-05, "epoch": 2.185374149659864, "percentage": 31.22, "elapsed_time": "1:41:06", "remaining_time": "3:42:45"}
{"current_steps": 1290, "total_steps": 4116, "loss": 0.1879, "lr": 3.471750123955878e-05, "epoch": 2.193877551020408, "percentage": 31.34, "elapsed_time": "1:41:29", "remaining_time": "3:42:20"}
{"current_steps": 1295, "total_steps": 4116, "loss": 0.1944, "lr": 3.465993853775516e-05, "epoch": 2.2023809523809526, "percentage": 31.46, "elapsed_time": "1:41:50", "remaining_time": "3:41:51"}
{"current_steps": 1300, "total_steps": 4116, "loss": 0.1916, "lr": 3.460211218502815e-05, "epoch": 2.2108843537414966, "percentage": 31.58, "elapsed_time": "1:42:12", "remaining_time": "3:41:24"}
{"current_steps": 1305, "total_steps": 4116, "loss": 0.1882, "lr": 3.45440232213529e-05, "epoch": 2.2193877551020407, "percentage": 31.71, "elapsed_time": "1:42:35", "remaining_time": "3:40:58"}
{"current_steps": 1310, "total_steps": 4116, "loss": 0.1929, "lr": 3.4485672691427416e-05, "epoch": 2.227891156462585, "percentage": 31.83, "elapsed_time": "1:43:01", "remaining_time": "3:40:41"}
{"current_steps": 1315, "total_steps": 4116, "loss": 0.2048, "lr": 3.4427061644653873e-05, "epoch": 2.2363945578231292, "percentage": 31.95, "elapsed_time": "1:43:25", "remaining_time": "3:40:17"}
{"current_steps": 1320, "total_steps": 4116, "loss": 0.1778, "lr": 3.436819113511967e-05, "epoch": 2.2448979591836733, "percentage": 32.07, "elapsed_time": "1:43:47", "remaining_time": "3:39:50"}
{"current_steps": 1325, "total_steps": 4116, "loss": 0.19, "lr": 3.43090622215785e-05, "epoch": 2.253401360544218, "percentage": 32.19, "elapsed_time": "1:44:08", "remaining_time": "3:39:21"}
{"current_steps": 1330, "total_steps": 4116, "loss": 0.1746, "lr": 3.4249675967431325e-05, "epoch": 2.261904761904762, "percentage": 32.31, "elapsed_time": "1:44:28", "remaining_time": "3:38:51"}
{"current_steps": 1335, "total_steps": 4116, "loss": 0.185, "lr": 3.419003344070723e-05, "epoch": 2.270408163265306, "percentage": 32.43, "elapsed_time": "1:44:49", "remaining_time": "3:38:21"}
{"current_steps": 1340, "total_steps": 4116, "loss": 0.1787, "lr": 3.413013571404423e-05, "epoch": 2.2789115646258504, "percentage": 32.56, "elapsed_time": "1:45:16", "remaining_time": "3:38:05"}
{"current_steps": 1345, "total_steps": 4116, "loss": 0.1896, "lr": 3.406998386466995e-05, "epoch": 2.2874149659863945, "percentage": 32.68, "elapsed_time": "1:45:40", "remaining_time": "3:37:42"}
{"current_steps": 1350, "total_steps": 4116, "loss": 0.181, "lr": 3.400957897438229e-05, "epoch": 2.295918367346939, "percentage": 32.8, "elapsed_time": "1:46:01", "remaining_time": "3:37:14"}
{"current_steps": 1355, "total_steps": 4116, "loss": 0.1984, "lr": 3.3948922129529935e-05, "epoch": 2.304421768707483, "percentage": 32.92, "elapsed_time": "1:46:23", "remaining_time": "3:36:47"}
{"current_steps": 1360, "total_steps": 4116, "loss": 0.1932, "lr": 3.388801442099286e-05, "epoch": 2.312925170068027, "percentage": 33.04, "elapsed_time": "1:46:49", "remaining_time": "3:36:29"}
{"current_steps": 1365, "total_steps": 4116, "loss": 0.2012, "lr": 3.3826856944162655e-05, "epoch": 2.3214285714285716, "percentage": 33.16, "elapsed_time": "1:47:17", "remaining_time": "3:36:13"}
{"current_steps": 1370, "total_steps": 4116, "loss": 0.1766, "lr": 3.376545079892288e-05, "epoch": 2.3299319727891157, "percentage": 33.28, "elapsed_time": "1:47:42", "remaining_time": "3:35:52"}
{"current_steps": 1375, "total_steps": 4116, "loss": 0.1833, "lr": 3.3703797089629245e-05, "epoch": 2.3384353741496597, "percentage": 33.41, "elapsed_time": "1:48:04", "remaining_time": "3:35:26"}
{"current_steps": 1380, "total_steps": 4116, "loss": 0.1878, "lr": 3.364189692508976e-05, "epoch": 2.3469387755102042, "percentage": 33.53, "elapsed_time": "1:48:23", "remaining_time": "3:34:54"}
{"current_steps": 1385, "total_steps": 4116, "loss": 0.1844, "lr": 3.357975141854482e-05, "epoch": 2.3554421768707483, "percentage": 33.65, "elapsed_time": "1:48:46", "remaining_time": "3:34:28"}
{"current_steps": 1390, "total_steps": 4116, "loss": 0.1995, "lr": 3.3517361687647136e-05, "epoch": 2.3639455782312924, "percentage": 33.77, "elapsed_time": "1:49:07", "remaining_time": "3:34:00"}
{"current_steps": 1395, "total_steps": 4116, "loss": 0.1964, "lr": 3.3454728854441667e-05, "epoch": 2.372448979591837, "percentage": 33.89, "elapsed_time": "1:49:29", "remaining_time": "3:33:34"}
{"current_steps": 1400, "total_steps": 4116, "loss": 0.2125, "lr": 3.3391854045345437e-05, "epoch": 2.380952380952381, "percentage": 34.01, "elapsed_time": "1:49:50", "remaining_time": "3:33:05"}
{"current_steps": 1405, "total_steps": 4116, "loss": 0.1815, "lr": 3.332873839112727e-05, "epoch": 2.389455782312925, "percentage": 34.14, "elapsed_time": "1:50:15", "remaining_time": "3:32:45"}
{"current_steps": 1410, "total_steps": 4116, "loss": 0.1749, "lr": 3.3265383026887464e-05, "epoch": 2.3979591836734695, "percentage": 34.26, "elapsed_time": "1:50:38", "remaining_time": "3:32:19"}
{"current_steps": 1415, "total_steps": 4116, "loss": 0.1919, "lr": 3.320178909203735e-05, "epoch": 2.4064625850340136, "percentage": 34.38, "elapsed_time": "1:51:01", "remaining_time": "3:31:55"}
{"current_steps": 1420, "total_steps": 4116, "loss": 0.1998, "lr": 3.313795773027885e-05, "epoch": 2.4149659863945576, "percentage": 34.5, "elapsed_time": "1:51:22", "remaining_time": "3:31:26"}
{"current_steps": 1425, "total_steps": 4116, "loss": 0.2007, "lr": 3.307389008958385e-05, "epoch": 2.423469387755102, "percentage": 34.62, "elapsed_time": "1:51:45", "remaining_time": "3:31:03"}
{"current_steps": 1430, "total_steps": 4116, "loss": 0.1966, "lr": 3.30095873221736e-05, "epoch": 2.431972789115646, "percentage": 34.74, "elapsed_time": "1:52:06", "remaining_time": "3:30:35"}
{"current_steps": 1435, "total_steps": 4116, "loss": 0.1788, "lr": 3.294505058449797e-05, "epoch": 2.4404761904761907, "percentage": 34.86, "elapsed_time": "1:52:29", "remaining_time": "3:30:09"}
{"current_steps": 1440, "total_steps": 4116, "loss": 0.1814, "lr": 3.288028103721465e-05, "epoch": 2.4489795918367347, "percentage": 34.99, "elapsed_time": "1:52:53", "remaining_time": "3:29:47"}
{"current_steps": 1445, "total_steps": 4116, "loss": 0.1984, "lr": 3.2815279845168296e-05, "epoch": 2.457482993197279, "percentage": 35.11, "elapsed_time": "1:53:18", "remaining_time": "3:29:26"}
{"current_steps": 1450, "total_steps": 4116, "loss": 0.1787, "lr": 3.2750048177369554e-05, "epoch": 2.4659863945578233, "percentage": 35.23, "elapsed_time": "1:53:46", "remaining_time": "3:29:11"}
{"current_steps": 1455, "total_steps": 4116, "loss": 0.1764, "lr": 3.268458720697406e-05, "epoch": 2.4744897959183674, "percentage": 35.35, "elapsed_time": "1:54:09", "remaining_time": "3:28:46"}
{"current_steps": 1460, "total_steps": 4116, "loss": 0.1835, "lr": 3.2618898111261316e-05, "epoch": 2.4829931972789114, "percentage": 35.47, "elapsed_time": "1:54:31", "remaining_time": "3:28:20"}
{"current_steps": 1465, "total_steps": 4116, "loss": 0.1848, "lr": 3.2552982071613555e-05, "epoch": 2.491496598639456, "percentage": 35.59, "elapsed_time": "1:54:54", "remaining_time": "3:27:56"}
{"current_steps": 1470, "total_steps": 4116, "loss": 0.1877, "lr": 3.248684027349445e-05, "epoch": 2.5, "percentage": 35.71, "elapsed_time": "1:55:22", "remaining_time": "3:27:39"}
{"current_steps": 1475, "total_steps": 4116, "loss": 0.1807, "lr": 3.2420473906427824e-05, "epoch": 2.508503401360544, "percentage": 35.84, "elapsed_time": "1:55:46", "remaining_time": "3:27:17"}
{"current_steps": 1480, "total_steps": 4116, "loss": 0.1952, "lr": 3.235388416397626e-05, "epoch": 2.5170068027210886, "percentage": 35.96, "elapsed_time": "1:56:09", "remaining_time": "3:26:53"}
{"current_steps": 1485, "total_steps": 4116, "loss": 0.1821, "lr": 3.228707224371961e-05, "epoch": 2.5255102040816326, "percentage": 36.08, "elapsed_time": "1:56:32", "remaining_time": "3:26:29"}
{"current_steps": 1490, "total_steps": 4116, "loss": 0.1892, "lr": 3.222003934723347e-05, "epoch": 2.534013605442177, "percentage": 36.2, "elapsed_time": "1:56:56", "remaining_time": "3:26:05"}
{"current_steps": 1495, "total_steps": 4116, "loss": 0.1884, "lr": 3.215278668006757e-05, "epoch": 2.542517006802721, "percentage": 36.32, "elapsed_time": "1:57:16", "remaining_time": "3:25:36"}
{"current_steps": 1500, "total_steps": 4116, "loss": 0.1909, "lr": 3.208531545172413e-05, "epoch": 2.5510204081632653, "percentage": 36.44, "elapsed_time": "1:57:37", "remaining_time": "3:25:07"}
{"current_steps": 1505, "total_steps": 4116, "loss": 0.1703, "lr": 3.2017626875636024e-05, "epoch": 2.5595238095238093, "percentage": 36.56, "elapsed_time": "1:58:06", "remaining_time": "3:24:53"}
{"current_steps": 1510, "total_steps": 4116, "loss": 0.1695, "lr": 3.194972216914502e-05, "epoch": 2.568027210884354, "percentage": 36.69, "elapsed_time": "1:58:27", "remaining_time": "3:24:25"}
{"current_steps": 1515, "total_steps": 4116, "loss": 0.1913, "lr": 3.18816025534799e-05, "epoch": 2.576530612244898, "percentage": 36.81, "elapsed_time": "1:58:47", "remaining_time": "3:23:57"}
{"current_steps": 1520, "total_steps": 4116, "loss": 0.177, "lr": 3.181326925373443e-05, "epoch": 2.5850340136054424, "percentage": 36.93, "elapsed_time": "1:59:09", "remaining_time": "3:23:31"}
{"current_steps": 1525, "total_steps": 4116, "loss": 0.1887, "lr": 3.174472349884539e-05, "epoch": 2.5935374149659864, "percentage": 37.05, "elapsed_time": "1:59:32", "remaining_time": "3:23:06"}
{"current_steps": 1530, "total_steps": 4116, "loss": 0.1969, "lr": 3.1675966521570444e-05, "epoch": 2.6020408163265305, "percentage": 37.17, "elapsed_time": "1:59:56", "remaining_time": "3:22:42"}
{"current_steps": 1535, "total_steps": 4116, "loss": 0.208, "lr": 3.160699955846597e-05, "epoch": 2.6105442176870746, "percentage": 37.29, "elapsed_time": "2:00:21", "remaining_time": "3:22:23"}
{"current_steps": 1540, "total_steps": 4116, "loss": 0.1925, "lr": 3.1537823849864834e-05, "epoch": 2.619047619047619, "percentage": 37.41, "elapsed_time": "2:00:46", "remaining_time": "3:22:02"}
{"current_steps": 1545, "total_steps": 4116, "loss": 0.1821, "lr": 3.1468440639854056e-05, "epoch": 2.627551020408163, "percentage": 37.54, "elapsed_time": "2:01:09", "remaining_time": "3:21:37"}
{"current_steps": 1550, "total_steps": 4116, "loss": 0.1758, "lr": 3.139885117625249e-05, "epoch": 2.6360544217687076, "percentage": 37.66, "elapsed_time": "2:01:33", "remaining_time": "3:21:14"}
{"current_steps": 1555, "total_steps": 4116, "loss": 0.1705, "lr": 3.132905671058831e-05, "epoch": 2.6445578231292517, "percentage": 37.78, "elapsed_time": "2:01:53", "remaining_time": "3:20:45"}
{"current_steps": 1560, "total_steps": 4116, "loss": 0.1877, "lr": 3.1259058498076554e-05, "epoch": 2.6530612244897958, "percentage": 37.9, "elapsed_time": "2:02:16", "remaining_time": "3:20:20"}
{"current_steps": 1565, "total_steps": 4116, "loss": 0.1819, "lr": 3.1188857797596545e-05, "epoch": 2.6615646258503403, "percentage": 38.02, "elapsed_time": "2:02:39", "remaining_time": "3:19:55"}
{"current_steps": 1570, "total_steps": 4116, "loss": 0.1842, "lr": 3.111845587166922e-05, "epoch": 2.6700680272108843, "percentage": 38.14, "elapsed_time": "2:03:04", "remaining_time": "3:19:34"}
{"current_steps": 1575, "total_steps": 4116, "loss": 0.1926, "lr": 3.104785398643447e-05, "epoch": 2.678571428571429, "percentage": 38.27, "elapsed_time": "2:03:29", "remaining_time": "3:19:13"}
{"current_steps": 1580, "total_steps": 4116, "loss": 0.1912, "lr": 3.0977053411628315e-05, "epoch": 2.687074829931973, "percentage": 38.39, "elapsed_time": "2:03:50", "remaining_time": "3:18:46"}
{"current_steps": 1585, "total_steps": 4116, "loss": 0.2067, "lr": 3.090605542056013e-05, "epoch": 2.695578231292517, "percentage": 38.51, "elapsed_time": "2:04:17", "remaining_time": "3:18:29"}
{"current_steps": 1590, "total_steps": 4116, "loss": 0.186, "lr": 3.083486129008967e-05, "epoch": 2.704081632653061, "percentage": 38.63, "elapsed_time": "2:04:39", "remaining_time": "3:18:02"}
{"current_steps": 1595, "total_steps": 4116, "loss": 0.1837, "lr": 3.07634723006042e-05, "epoch": 2.7125850340136055, "percentage": 38.75, "elapsed_time": "2:05:03", "remaining_time": "3:17:40"}
{"current_steps": 1600, "total_steps": 4116, "loss": 0.1895, "lr": 3.069188973599538e-05, "epoch": 2.7210884353741496, "percentage": 38.87, "elapsed_time": "2:05:28", "remaining_time": "3:17:18"}
{"current_steps": 1605, "total_steps": 4116, "loss": 0.177, "lr": 3.062011488363623e-05, "epoch": 2.729591836734694, "percentage": 38.99, "elapsed_time": "2:05:51", "remaining_time": "3:16:54"}
{"current_steps": 1610, "total_steps": 4116, "loss": 0.2033, "lr": 3.054814903435794e-05, "epoch": 2.738095238095238, "percentage": 39.12, "elapsed_time": "2:06:20", "remaining_time": "3:16:38"}
{"current_steps": 1615, "total_steps": 4116, "loss": 0.1838, "lr": 3.0475993482426695e-05, "epoch": 2.746598639455782, "percentage": 39.24, "elapsed_time": "2:06:43", "remaining_time": "3:16:14"}
{"current_steps": 1620, "total_steps": 4116, "loss": 0.1894, "lr": 3.0403649525520374e-05, "epoch": 2.7551020408163263, "percentage": 39.36, "elapsed_time": "2:07:04", "remaining_time": "3:15:47"}
{"current_steps": 1625, "total_steps": 4116, "loss": 0.2049, "lr": 3.033111846470521e-05, "epoch": 2.7636054421768708, "percentage": 39.48, "elapsed_time": "2:07:26", "remaining_time": "3:15:20"}
{"current_steps": 1630, "total_steps": 4116, "loss": 0.1736, "lr": 3.02584016044124e-05, "epoch": 2.772108843537415, "percentage": 39.6, "elapsed_time": "2:07:48", "remaining_time": "3:14:55"}
{"current_steps": 1635, "total_steps": 4116, "loss": 0.1885, "lr": 3.0185500252414633e-05, "epoch": 2.7806122448979593, "percentage": 39.72, "elapsed_time": "2:08:19", "remaining_time": "3:14:42"}
{"current_steps": 1640, "total_steps": 4116, "loss": 0.1784, "lr": 3.01124157198026e-05, "epoch": 2.7891156462585034, "percentage": 39.84, "elapsed_time": "2:08:40", "remaining_time": "3:14:15"}
{"current_steps": 1645, "total_steps": 4116, "loss": 0.1872, "lr": 3.003914932096137e-05, "epoch": 2.7976190476190474, "percentage": 39.97, "elapsed_time": "2:09:03", "remaining_time": "3:13:52"}
{"current_steps": 1650, "total_steps": 4116, "loss": 0.189, "lr": 2.99657023735468e-05, "epoch": 2.806122448979592, "percentage": 40.09, "elapsed_time": "2:09:29", "remaining_time": "3:13:31"}
{"current_steps": 1655, "total_steps": 4116, "loss": 0.1919, "lr": 2.9892076198461794e-05, "epoch": 2.814625850340136, "percentage": 40.21, "elapsed_time": "2:09:52", "remaining_time": "3:13:06"}
{"current_steps": 1660, "total_steps": 4116, "loss": 0.19, "lr": 2.9818272119832576e-05, "epoch": 2.8231292517006805, "percentage": 40.33, "elapsed_time": "2:10:14", "remaining_time": "3:12:42"}
{"current_steps": 1665, "total_steps": 4116, "loss": 0.1955, "lr": 2.974429146498487e-05, "epoch": 2.8316326530612246, "percentage": 40.45, "elapsed_time": "2:10:38", "remaining_time": "3:12:18"}
{"current_steps": 1670, "total_steps": 4116, "loss": 0.1814, "lr": 2.9670135564420016e-05, "epoch": 2.8401360544217686, "percentage": 40.57, "elapsed_time": "2:10:58", "remaining_time": "3:11:50"}
{"current_steps": 1675, "total_steps": 4116, "loss": 0.1846, "lr": 2.9595805751791068e-05, "epoch": 2.8486394557823127, "percentage": 40.69, "elapsed_time": "2:11:25", "remaining_time": "3:11:31"}
{"current_steps": 1680, "total_steps": 4116, "loss": 0.181, "lr": 2.9521303363878774e-05, "epoch": 2.857142857142857, "percentage": 40.82, "elapsed_time": "2:11:49", "remaining_time": "3:11:09"}
{"current_steps": 1685, "total_steps": 4116, "loss": 0.1751, "lr": 2.9446629740567566e-05, "epoch": 2.8656462585034013, "percentage": 40.94, "elapsed_time": "2:12:11", "remaining_time": "3:10:42"}
{"current_steps": 1690, "total_steps": 4116, "loss": 0.179, "lr": 2.9371786224821447e-05, "epoch": 2.8741496598639458, "percentage": 41.06, "elapsed_time": "2:12:33", "remaining_time": "3:10:17"}
{"current_steps": 1695, "total_steps": 4116, "loss": 0.1817, "lr": 2.9296774162659836e-05, "epoch": 2.88265306122449, "percentage": 41.18, "elapsed_time": "2:12:54", "remaining_time": "3:09:50"}
{"current_steps": 1700, "total_steps": 4116, "loss": 0.1915, "lr": 2.9221594903133385e-05, "epoch": 2.891156462585034, "percentage": 41.3, "elapsed_time": "2:13:20", "remaining_time": "3:09:29"}
{"current_steps": 1705, "total_steps": 4116, "loss": 0.1887, "lr": 2.9146249798299672e-05, "epoch": 2.8996598639455784, "percentage": 41.42, "elapsed_time": "2:13:43", "remaining_time": "3:09:05"}
{"current_steps": 1710, "total_steps": 4116, "loss": 0.1851, "lr": 2.9070740203198927e-05, "epoch": 2.9081632653061225, "percentage": 41.55, "elapsed_time": "2:14:04", "remaining_time": "3:08:38"}
{"current_steps": 1715, "total_steps": 4116, "loss": 0.1894, "lr": 2.899506747582965e-05, "epoch": 2.9166666666666665, "percentage": 41.67, "elapsed_time": "2:14:27", "remaining_time": "3:08:14"}
{"current_steps": 1720, "total_steps": 4116, "loss": 0.1744, "lr": 2.8919232977124177e-05, "epoch": 2.925170068027211, "percentage": 41.79, "elapsed_time": "2:14:49", "remaining_time": "3:07:49"}
{"current_steps": 1725, "total_steps": 4116, "loss": 0.1809, "lr": 2.884323807092423e-05, "epoch": 2.933673469387755, "percentage": 41.91, "elapsed_time": "2:15:10", "remaining_time": "3:07:22"}
{"current_steps": 1730, "total_steps": 4116, "loss": 0.1856, "lr": 2.8767084123956334e-05, "epoch": 2.942176870748299, "percentage": 42.03, "elapsed_time": "2:15:32", "remaining_time": "3:06:56"}
{"current_steps": 1735, "total_steps": 4116, "loss": 0.2012, "lr": 2.8690772505807307e-05, "epoch": 2.9506802721088436, "percentage": 42.15, "elapsed_time": "2:15:58", "remaining_time": "3:06:35"}
{"current_steps": 1740, "total_steps": 4116, "loss": 0.1824, "lr": 2.8614304588899588e-05, "epoch": 2.9591836734693877, "percentage": 42.27, "elapsed_time": "2:16:20", "remaining_time": "3:06:10"}
{"current_steps": 1745, "total_steps": 4116, "loss": 0.1923, "lr": 2.8537681748466557e-05, "epoch": 2.967687074829932, "percentage": 42.4, "elapsed_time": "2:16:43", "remaining_time": "3:05:46"}
{"current_steps": 1750, "total_steps": 4116, "loss": 0.1826, "lr": 2.8460905362527832e-05, "epoch": 2.9761904761904763, "percentage": 42.52, "elapsed_time": "2:17:06", "remaining_time": "3:05:22"}
{"current_steps": 1755, "total_steps": 4116, "loss": 0.1756, "lr": 2.8383976811864416e-05, "epoch": 2.9846938775510203, "percentage": 42.64, "elapsed_time": "2:17:28", "remaining_time": "3:04:56"}
{"current_steps": 1760, "total_steps": 4116, "loss": 0.1812, "lr": 2.8306897479993954e-05, "epoch": 2.9931972789115644, "percentage": 42.76, "elapsed_time": "2:17:52", "remaining_time": "3:04:33"}
{"current_steps": 1765, "total_steps": 4116, "loss": 0.1851, "lr": 2.8229668753145797e-05, "epoch": 3.001700680272109, "percentage": 42.88, "elapsed_time": "2:18:19", "remaining_time": "3:04:14"}
{"current_steps": 1770, "total_steps": 4116, "loss": 0.1669, "lr": 2.815229202023607e-05, "epoch": 3.010204081632653, "percentage": 43.0, "elapsed_time": "2:18:39", "remaining_time": "3:03:47"}
{"current_steps": 1775, "total_steps": 4116, "loss": 0.184, "lr": 2.8074768672842716e-05, "epoch": 3.0187074829931975, "percentage": 43.12, "elapsed_time": "2:19:02", "remaining_time": "3:03:22"}
{"current_steps": 1780, "total_steps": 4116, "loss": 0.1698, "lr": 2.7997100105180455e-05, "epoch": 3.0272108843537415, "percentage": 43.25, "elapsed_time": "2:19:30", "remaining_time": "3:03:04"}
{"current_steps": 1785, "total_steps": 4116, "loss": 0.1577, "lr": 2.7919287714075714e-05, "epoch": 3.0357142857142856, "percentage": 43.37, "elapsed_time": "2:19:52", "remaining_time": "3:02:40"}
{"current_steps": 1790, "total_steps": 4116, "loss": 0.1762, "lr": 2.7841332898941513e-05, "epoch": 3.04421768707483, "percentage": 43.49, "elapsed_time": "2:20:19", "remaining_time": "3:02:20"}
{"current_steps": 1795, "total_steps": 4116, "loss": 0.1587, "lr": 2.7763237061752276e-05, "epoch": 3.052721088435374, "percentage": 43.61, "elapsed_time": "2:20:42", "remaining_time": "3:01:56"}
{"current_steps": 1800, "total_steps": 4116, "loss": 0.1721, "lr": 2.768500160701864e-05, "epoch": 3.061224489795918, "percentage": 43.73, "elapsed_time": "2:21:06", "remaining_time": "3:01:33"}
{"current_steps": 1805, "total_steps": 4116, "loss": 0.1624, "lr": 2.7606627941762186e-05, "epoch": 3.0697278911564627, "percentage": 43.85, "elapsed_time": "2:21:28", "remaining_time": "3:01:07"}
{"current_steps": 1810, "total_steps": 4116, "loss": 0.1843, "lr": 2.752811747549013e-05, "epoch": 3.078231292517007, "percentage": 43.97, "elapsed_time": "2:21:54", "remaining_time": "3:00:48"}
{"current_steps": 1815, "total_steps": 4116, "loss": 0.1624, "lr": 2.7449471620169988e-05, "epoch": 3.086734693877551, "percentage": 44.1, "elapsed_time": "2:22:16", "remaining_time": "3:00:22"}
{"current_steps": 1820, "total_steps": 4116, "loss": 0.1651, "lr": 2.7370691790204173e-05, "epoch": 3.0952380952380953, "percentage": 44.22, "elapsed_time": "2:22:37", "remaining_time": "2:59:56"}
{"current_steps": 1825, "total_steps": 4116, "loss": 0.1633, "lr": 2.7291779402404538e-05, "epoch": 3.1037414965986394, "percentage": 44.34, "elapsed_time": "2:22:58", "remaining_time": "2:59:28"}
{"current_steps": 1830, "total_steps": 4116, "loss": 0.1715, "lr": 2.7212735875966946e-05, "epoch": 3.1122448979591835, "percentage": 44.46, "elapsed_time": "2:23:20", "remaining_time": "2:59:04"}
{"current_steps": 1835, "total_steps": 4116, "loss": 0.181, "lr": 2.713356263244569e-05, "epoch": 3.120748299319728, "percentage": 44.58, "elapsed_time": "2:23:43", "remaining_time": "2:58:40"}
{"current_steps": 1840, "total_steps": 4116, "loss": 0.1679, "lr": 2.7054261095727977e-05, "epoch": 3.129251700680272, "percentage": 44.7, "elapsed_time": "2:24:08", "remaining_time": "2:58:18"}
{"current_steps": 1845, "total_steps": 4116, "loss": 0.1653, "lr": 2.6974832692008278e-05, "epoch": 3.137755102040816, "percentage": 44.83, "elapsed_time": "2:24:31", "remaining_time": "2:57:53"}
{"current_steps": 1850, "total_steps": 4116, "loss": 0.1593, "lr": 2.6895278849762694e-05, "epoch": 3.1462585034013606, "percentage": 44.95, "elapsed_time": "2:24:52", "remaining_time": "2:57:27"}
{"current_steps": 1855, "total_steps": 4116, "loss": 0.1586, "lr": 2.68156009997233e-05, "epoch": 3.1547619047619047, "percentage": 45.07, "elapsed_time": "2:25:16", "remaining_time": "2:57:04"}
{"current_steps": 1860, "total_steps": 4116, "loss": 0.172, "lr": 2.673580057485234e-05, "epoch": 3.163265306122449, "percentage": 45.19, "elapsed_time": "2:25:38", "remaining_time": "2:56:38"}
{"current_steps": 1865, "total_steps": 4116, "loss": 0.1666, "lr": 2.6655879010316535e-05, "epoch": 3.171768707482993, "percentage": 45.31, "elapsed_time": "2:26:02", "remaining_time": "2:56:15"}
{"current_steps": 1870, "total_steps": 4116, "loss": 0.1673, "lr": 2.657583774346121e-05, "epoch": 3.1802721088435373, "percentage": 45.43, "elapsed_time": "2:26:23", "remaining_time": "2:55:49"}
{"current_steps": 1875, "total_steps": 4116, "loss": 0.1578, "lr": 2.649567821378449e-05, "epoch": 3.188775510204082, "percentage": 45.55, "elapsed_time": "2:26:47", "remaining_time": "2:55:26"}
{"current_steps": 1880, "total_steps": 4116, "loss": 0.1619, "lr": 2.641540186291138e-05, "epoch": 3.197278911564626, "percentage": 45.68, "elapsed_time": "2:27:07", "remaining_time": "2:54:59"}
{"current_steps": 1885, "total_steps": 4116, "loss": 0.1649, "lr": 2.6335010134567852e-05, "epoch": 3.20578231292517, "percentage": 45.8, "elapsed_time": "2:27:29", "remaining_time": "2:54:33"}
{"current_steps": 1890, "total_steps": 4116, "loss": 0.1643, "lr": 2.6254504474554883e-05, "epoch": 3.2142857142857144, "percentage": 45.92, "elapsed_time": "2:27:52", "remaining_time": "2:54:09"}
{"current_steps": 1895, "total_steps": 4116, "loss": 0.1695, "lr": 2.617388633072244e-05, "epoch": 3.2227891156462585, "percentage": 46.04, "elapsed_time": "2:28:19", "remaining_time": "2:53:49"}
{"current_steps": 1900, "total_steps": 4116, "loss": 0.1618, "lr": 2.609315715294346e-05, "epoch": 3.2312925170068025, "percentage": 46.16, "elapsed_time": "2:28:40", "remaining_time": "2:53:23"}
{"current_steps": 1905, "total_steps": 4116, "loss": 0.1626, "lr": 2.6012318393087754e-05, "epoch": 3.239795918367347, "percentage": 46.28, "elapsed_time": "2:29:01", "remaining_time": "2:52:57"}
{"current_steps": 1910, "total_steps": 4116, "loss": 0.1797, "lr": 2.593137150499591e-05, "epoch": 3.248299319727891, "percentage": 46.4, "elapsed_time": "2:29:24", "remaining_time": "2:52:33"}
{"current_steps": 1915, "total_steps": 4116, "loss": 0.1624, "lr": 2.5850317944453153e-05, "epoch": 3.2568027210884356, "percentage": 46.53, "elapsed_time": "2:29:45", "remaining_time": "2:52:07"}
{"current_steps": 1920, "total_steps": 4116, "loss": 0.1677, "lr": 2.5769159169163137e-05, "epoch": 3.2653061224489797, "percentage": 46.65, "elapsed_time": "2:30:05", "remaining_time": "2:51:40"}
{"current_steps": 1925, "total_steps": 4116, "loss": 0.1531, "lr": 2.568789663872175e-05, "epoch": 3.2738095238095237, "percentage": 46.77, "elapsed_time": "2:30:32", "remaining_time": "2:51:20"}
{"current_steps": 1930, "total_steps": 4116, "loss": 0.1692, "lr": 2.5606531814590868e-05, "epoch": 3.282312925170068, "percentage": 46.89, "elapsed_time": "2:30:57", "remaining_time": "2:50:59"}
{"current_steps": 1935, "total_steps": 4116, "loss": 0.1665, "lr": 2.552506616007205e-05, "epoch": 3.2908163265306123, "percentage": 47.01, "elapsed_time": "2:31:23", "remaining_time": "2:50:37"}
{"current_steps": 1940, "total_steps": 4116, "loss": 0.1621, "lr": 2.544350114028025e-05, "epoch": 3.2993197278911564, "percentage": 47.13, "elapsed_time": "2:31:46", "remaining_time": "2:50:14"}
{"current_steps": 1945, "total_steps": 4116, "loss": 0.1712, "lr": 2.5361838222117435e-05, "epoch": 3.307823129251701, "percentage": 47.25, "elapsed_time": "2:32:13", "remaining_time": "2:49:54"}
{"current_steps": 1950, "total_steps": 4116, "loss": 0.1763, "lr": 2.5280078874246227e-05, "epoch": 3.316326530612245, "percentage": 47.38, "elapsed_time": "2:32:40", "remaining_time": "2:49:34"}
{"current_steps": 1955, "total_steps": 4116, "loss": 0.1644, "lr": 2.519822456706349e-05, "epoch": 3.324829931972789, "percentage": 47.5, "elapsed_time": "2:33:04", "remaining_time": "2:49:12"}
{"current_steps": 1960, "total_steps": 4116, "loss": 0.172, "lr": 2.5116276772673863e-05, "epoch": 3.3333333333333335, "percentage": 47.62, "elapsed_time": "2:33:30", "remaining_time": "2:48:51"}
{"current_steps": 1965, "total_steps": 4116, "loss": 0.1648, "lr": 2.503423696486332e-05, "epoch": 3.3418367346938775, "percentage": 47.74, "elapsed_time": "2:33:51", "remaining_time": "2:48:25"}
{"current_steps": 1970, "total_steps": 4116, "loss": 0.1529, "lr": 2.495210661907263e-05, "epoch": 3.3503401360544216, "percentage": 47.86, "elapsed_time": "2:34:14", "remaining_time": "2:48:01"}
{"current_steps": 1975, "total_steps": 4116, "loss": 0.1777, "lr": 2.4869887212370847e-05, "epoch": 3.358843537414966, "percentage": 47.98, "elapsed_time": "2:34:33", "remaining_time": "2:47:33"}
{"current_steps": 1980, "total_steps": 4116, "loss": 0.1693, "lr": 2.4787580223428736e-05, "epoch": 3.36734693877551, "percentage": 48.1, "elapsed_time": "2:34:56", "remaining_time": "2:47:08"}
{"current_steps": 1985, "total_steps": 4116, "loss": 0.1616, "lr": 2.470518713249217e-05, "epoch": 3.3758503401360542, "percentage": 48.23, "elapsed_time": "2:35:19", "remaining_time": "2:46:44"}
{"current_steps": 1990, "total_steps": 4116, "loss": 0.1636, "lr": 2.4622709421355535e-05, "epoch": 3.3843537414965987, "percentage": 48.35, "elapsed_time": "2:35:43", "remaining_time": "2:46:22"}
{"current_steps": 1995, "total_steps": 4116, "loss": 0.166, "lr": 2.4540148573335057e-05, "epoch": 3.392857142857143, "percentage": 48.47, "elapsed_time": "2:36:04", "remaining_time": "2:45:55"}
{"current_steps": 2000, "total_steps": 4116, "loss": 0.1797, "lr": 2.445750607324212e-05, "epoch": 3.4013605442176873, "percentage": 48.59, "elapsed_time": "2:36:27", "remaining_time": "2:45:31"}
{"current_steps": 2005, "total_steps": 4116, "loss": 0.1567, "lr": 2.4374783407356602e-05, "epoch": 3.4098639455782314, "percentage": 48.71, "elapsed_time": "2:36:47", "remaining_time": "2:45:05"}
{"current_steps": 2010, "total_steps": 4116, "loss": 0.1678, "lr": 2.4291982063400096e-05, "epoch": 3.4183673469387754, "percentage": 48.83, "elapsed_time": "2:37:10", "remaining_time": "2:44:40"}
{"current_steps": 2015, "total_steps": 4116, "loss": 0.164, "lr": 2.4209103530509196e-05, "epoch": 3.4268707482993195, "percentage": 48.96, "elapsed_time": "2:37:31", "remaining_time": "2:44:14"}
{"current_steps": 2020, "total_steps": 4116, "loss": 0.1767, "lr": 2.412614929920868e-05, "epoch": 3.435374149659864, "percentage": 49.08, "elapsed_time": "2:37:52", "remaining_time": "2:43:49"}
{"current_steps": 2025, "total_steps": 4116, "loss": 0.1587, "lr": 2.4043120861384723e-05, "epoch": 3.443877551020408, "percentage": 49.2, "elapsed_time": "2:38:15", "remaining_time": "2:43:24"}
{"current_steps": 2030, "total_steps": 4116, "loss": 0.1855, "lr": 2.3960019710258068e-05, "epoch": 3.4523809523809526, "percentage": 49.32, "elapsed_time": "2:38:39", "remaining_time": "2:43:02"}
{"current_steps": 2035, "total_steps": 4116, "loss": 0.1662, "lr": 2.3876847340357164e-05, "epoch": 3.4608843537414966, "percentage": 49.44, "elapsed_time": "2:39:04", "remaining_time": "2:42:40"}
{"current_steps": 2040, "total_steps": 4116, "loss": 0.1697, "lr": 2.3793605247491303e-05, "epoch": 3.4693877551020407, "percentage": 49.56, "elapsed_time": "2:39:26", "remaining_time": "2:42:15"}
{"current_steps": 2045, "total_steps": 4116, "loss": 0.1663, "lr": 2.371029492872369e-05, "epoch": 3.477891156462585, "percentage": 49.68, "elapsed_time": "2:39:49", "remaining_time": "2:41:51"}
{"current_steps": 2050, "total_steps": 4116, "loss": 0.1748, "lr": 2.362691788234453e-05, "epoch": 3.4863945578231292, "percentage": 49.81, "elapsed_time": "2:40:16", "remaining_time": "2:41:31"}
{"current_steps": 2055, "total_steps": 4116, "loss": 0.1674, "lr": 2.3543475607844105e-05, "epoch": 3.4948979591836733, "percentage": 49.93, "elapsed_time": "2:40:38", "remaining_time": "2:41:06"}
{"current_steps": 2060, "total_steps": 4116, "loss": 0.1591, "lr": 2.3459969605885785e-05, "epoch": 3.503401360544218, "percentage": 50.05, "elapsed_time": "2:41:01", "remaining_time": "2:40:43"}
{"current_steps": 2065, "total_steps": 4116, "loss": 0.1597, "lr": 2.3376401378279037e-05, "epoch": 3.511904761904762, "percentage": 50.17, "elapsed_time": "2:41:22", "remaining_time": "2:40:16"}
{"current_steps": 2070, "total_steps": 4116, "loss": 0.1694, "lr": 2.329277242795243e-05, "epoch": 3.520408163265306, "percentage": 50.29, "elapsed_time": "2:41:45", "remaining_time": "2:39:53"}
{"current_steps": 2075, "total_steps": 4116, "loss": 0.1743, "lr": 2.320908425892658e-05, "epoch": 3.5289115646258504, "percentage": 50.41, "elapsed_time": "2:42:05", "remaining_time": "2:39:26"}
{"current_steps": 2080, "total_steps": 4116, "loss": 0.1552, "lr": 2.3125338376287137e-05, "epoch": 3.5374149659863945, "percentage": 50.53, "elapsed_time": "2:42:27", "remaining_time": "2:39:01"}
{"current_steps": 2085, "total_steps": 4116, "loss": 0.1771, "lr": 2.3041536286157706e-05, "epoch": 3.545918367346939, "percentage": 50.66, "elapsed_time": "2:42:56", "remaining_time": "2:38:43"}
{"current_steps": 2090, "total_steps": 4116, "loss": 0.1611, "lr": 2.295767949567272e-05, "epoch": 3.554421768707483, "percentage": 50.78, "elapsed_time": "2:43:19", "remaining_time": "2:38:19"}
{"current_steps": 2095, "total_steps": 4116, "loss": 0.1692, "lr": 2.28737695129504e-05, "epoch": 3.562925170068027, "percentage": 50.9, "elapsed_time": "2:43:44", "remaining_time": "2:37:57"}
{"current_steps": 2100, "total_steps": 4116, "loss": 0.1619, "lr": 2.2789807847065574e-05, "epoch": 3.571428571428571, "percentage": 51.02, "elapsed_time": "2:44:06", "remaining_time": "2:37:32"}
{"current_steps": 2105, "total_steps": 4116, "loss": 0.1701, "lr": 2.2705796008022592e-05, "epoch": 3.5799319727891157, "percentage": 51.14, "elapsed_time": "2:44:28", "remaining_time": "2:37:08"}
{"current_steps": 2110, "total_steps": 4116, "loss": 0.1677, "lr": 2.2621735506728126e-05, "epoch": 3.5884353741496597, "percentage": 51.26, "elapsed_time": "2:44:54", "remaining_time": "2:36:47"}
{"current_steps": 2115, "total_steps": 4116, "loss": 0.1704, "lr": 2.2537627854963994e-05, "epoch": 3.5969387755102042, "percentage": 51.38, "elapsed_time": "2:45:19", "remaining_time": "2:36:25"}
{"current_steps": 2120, "total_steps": 4116, "loss": 0.1732, "lr": 2.2453474565360018e-05, "epoch": 3.6054421768707483, "percentage": 51.51, "elapsed_time": "2:45:41", "remaining_time": "2:36:00"}
{"current_steps": 2125, "total_steps": 4116, "loss": 0.1647, "lr": 2.2369277151366776e-05, "epoch": 3.6139455782312924, "percentage": 51.63, "elapsed_time": "2:46:04", "remaining_time": "2:35:36"}
{"current_steps": 2130, "total_steps": 4116, "loss": 0.1711, "lr": 2.228503712722841e-05, "epoch": 3.622448979591837, "percentage": 51.75, "elapsed_time": "2:46:26", "remaining_time": "2:35:11"}
{"current_steps": 2135, "total_steps": 4116, "loss": 0.1609, "lr": 2.220075600795537e-05, "epoch": 3.630952380952381, "percentage": 51.87, "elapsed_time": "2:46:52", "remaining_time": "2:34:50"}
{"current_steps": 2140, "total_steps": 4116, "loss": 0.1556, "lr": 2.2116435309297177e-05, "epoch": 3.6394557823129254, "percentage": 51.99, "elapsed_time": "2:47:14", "remaining_time": "2:34:25"}
{"current_steps": 2145, "total_steps": 4116, "loss": 0.1603, "lr": 2.203207654771519e-05, "epoch": 3.6479591836734695, "percentage": 52.11, "elapsed_time": "2:47:39", "remaining_time": "2:34:03"}
{"current_steps": 2150, "total_steps": 4116, "loss": 0.1735, "lr": 2.1947681240355266e-05, "epoch": 3.6564625850340136, "percentage": 52.24, "elapsed_time": "2:48:00", "remaining_time": "2:33:37"}
{"current_steps": 2155, "total_steps": 4116, "loss": 0.1615, "lr": 2.1863250905020566e-05, "epoch": 3.6649659863945576, "percentage": 52.36, "elapsed_time": "2:48:22", "remaining_time": "2:33:13"}
{"current_steps": 2160, "total_steps": 4116, "loss": 0.1571, "lr": 2.177878706014418e-05, "epoch": 3.673469387755102, "percentage": 52.48, "elapsed_time": "2:48:42", "remaining_time": "2:32:46"}
{"current_steps": 2165, "total_steps": 4116, "loss": 0.1675, "lr": 2.1694291224761845e-05, "epoch": 3.681972789115646, "percentage": 52.6, "elapsed_time": "2:49:07", "remaining_time": "2:32:24"}
{"current_steps": 2170, "total_steps": 4116, "loss": 0.1722, "lr": 2.160976491848465e-05, "epoch": 3.6904761904761907, "percentage": 52.72, "elapsed_time": "2:49:30", "remaining_time": "2:32:00"}
{"current_steps": 2175, "total_steps": 4116, "loss": 0.1525, "lr": 2.1525209661471677e-05, "epoch": 3.6989795918367347, "percentage": 52.84, "elapsed_time": "2:49:55", "remaining_time": "2:31:38"}
{"current_steps": 2180, "total_steps": 4116, "loss": 0.1722, "lr": 2.1440626974402664e-05, "epoch": 3.707482993197279, "percentage": 52.96, "elapsed_time": "2:50:19", "remaining_time": "2:31:15"}
{"current_steps": 2185, "total_steps": 4116, "loss": 0.1648, "lr": 2.135601837845068e-05, "epoch": 3.715986394557823, "percentage": 53.09, "elapsed_time": "2:50:41", "remaining_time": "2:30:51"}
{"current_steps": 2190, "total_steps": 4116, "loss": 0.1654, "lr": 2.1271385395254728e-05, "epoch": 3.7244897959183674, "percentage": 53.21, "elapsed_time": "2:51:06", "remaining_time": "2:30:29"}
{"current_steps": 2195, "total_steps": 4116, "loss": 0.1617, "lr": 2.118672954689242e-05, "epoch": 3.7329931972789114, "percentage": 53.33, "elapsed_time": "2:51:33", "remaining_time": "2:30:08"}
{"current_steps": 2200, "total_steps": 4116, "loss": 0.1665, "lr": 2.1102052355852586e-05, "epoch": 3.741496598639456, "percentage": 53.45, "elapsed_time": "2:51:55", "remaining_time": "2:29:43"}
{"current_steps": 2205, "total_steps": 4116, "loss": 0.1687, "lr": 2.1017355345007883e-05, "epoch": 3.75, "percentage": 53.57, "elapsed_time": "2:52:20", "remaining_time": "2:29:21"}
{"current_steps": 2210, "total_steps": 4116, "loss": 0.1695, "lr": 2.093264003758742e-05, "epoch": 3.758503401360544, "percentage": 53.69, "elapsed_time": "2:52:39", "remaining_time": "2:28:54"}
{"current_steps": 2215, "total_steps": 4116, "loss": 0.1725, "lr": 2.0847907957149348e-05, "epoch": 3.7670068027210886, "percentage": 53.81, "elapsed_time": "2:53:00", "remaining_time": "2:28:28"}
{"current_steps": 2220, "total_steps": 4116, "loss": 0.1675, "lr": 2.0763160627553502e-05, "epoch": 3.7755102040816326, "percentage": 53.94, "elapsed_time": "2:53:21", "remaining_time": "2:28:03"}
{"current_steps": 2225, "total_steps": 4116, "loss": 0.1574, "lr": 2.067839957293394e-05, "epoch": 3.784013605442177, "percentage": 54.06, "elapsed_time": "2:53:44", "remaining_time": "2:27:39"}
{"current_steps": 2230, "total_steps": 4116, "loss": 0.1719, "lr": 2.059362631767156e-05, "epoch": 3.792517006802721, "percentage": 54.18, "elapsed_time": "2:54:08", "remaining_time": "2:27:16"}
{"current_steps": 2235, "total_steps": 4116, "loss": 0.1643, "lr": 2.0508842386366687e-05, "epoch": 3.8010204081632653, "percentage": 54.3, "elapsed_time": "2:54:31", "remaining_time": "2:26:53"}
{"current_steps": 2240, "total_steps": 4116, "loss": 0.1726, "lr": 2.0424049303811637e-05, "epoch": 3.8095238095238093, "percentage": 54.42, "elapsed_time": "2:54:52", "remaining_time": "2:26:27"}
{"current_steps": 2245, "total_steps": 4116, "loss": 0.1616, "lr": 2.0339248594963326e-05, "epoch": 3.818027210884354, "percentage": 54.54, "elapsed_time": "2:55:13", "remaining_time": "2:26:02"}
{"current_steps": 2250, "total_steps": 4116, "loss": 0.1725, "lr": 2.025444178491582e-05, "epoch": 3.826530612244898, "percentage": 54.66, "elapsed_time": "2:55:36", "remaining_time": "2:25:38"}
{"current_steps": 2255, "total_steps": 4116, "loss": 0.1747, "lr": 2.0169630398872897e-05, "epoch": 3.8350340136054424, "percentage": 54.79, "elapsed_time": "2:56:05", "remaining_time": "2:25:19"}
{"current_steps": 2260, "total_steps": 4116, "loss": 0.1649, "lr": 2.0084815962120644e-05, "epoch": 3.8435374149659864, "percentage": 54.91, "elapsed_time": "2:56:25", "remaining_time": "2:24:53"}
{"current_steps": 2265, "total_steps": 4116, "loss": 0.1703, "lr": 2e-05, "epoch": 3.8520408163265305, "percentage": 55.03, "elapsed_time": "2:56:46", "remaining_time": "2:24:27"}
{"current_steps": 2270, "total_steps": 4116, "loss": 0.1696, "lr": 1.9915184037879362e-05, "epoch": 3.8605442176870746, "percentage": 55.15, "elapsed_time": "2:57:10", "remaining_time": "2:24:04"}
{"current_steps": 2275, "total_steps": 4116, "loss": 0.1728, "lr": 1.983036960112711e-05, "epoch": 3.869047619047619, "percentage": 55.27, "elapsed_time": "2:57:31", "remaining_time": "2:23:39"}
{"current_steps": 2280, "total_steps": 4116, "loss": 0.1709, "lr": 1.9745558215084184e-05, "epoch": 3.877551020408163, "percentage": 55.39, "elapsed_time": "2:57:56", "remaining_time": "2:23:17"}
{"current_steps": 2285, "total_steps": 4116, "loss": 0.1714, "lr": 1.9660751405036677e-05, "epoch": 3.8860544217687076, "percentage": 55.52, "elapsed_time": "2:58:20", "remaining_time": "2:22:54"}
{"current_steps": 2290, "total_steps": 4116, "loss": 0.1697, "lr": 1.9575950696188367e-05, "epoch": 3.8945578231292517, "percentage": 55.64, "elapsed_time": "2:58:50", "remaining_time": "2:22:36"}
{"current_steps": 2295, "total_steps": 4116, "loss": 0.1676, "lr": 1.949115761363332e-05, "epoch": 3.9030612244897958, "percentage": 55.76, "elapsed_time": "2:59:11", "remaining_time": "2:22:10"}
{"current_steps": 2300, "total_steps": 4116, "loss": 0.1695, "lr": 1.9406373682328445e-05, "epoch": 3.9115646258503403, "percentage": 55.88, "elapsed_time": "2:59:35", "remaining_time": "2:21:47"}
{"current_steps": 2305, "total_steps": 4116, "loss": 0.1762, "lr": 1.932160042706607e-05, "epoch": 3.9200680272108843, "percentage": 56.0, "elapsed_time": "3:00:02", "remaining_time": "2:21:27"}
{"current_steps": 2310, "total_steps": 4116, "loss": 0.1731, "lr": 1.9236839372446504e-05, "epoch": 3.928571428571429, "percentage": 56.12, "elapsed_time": "3:00:25", "remaining_time": "2:21:03"}
{"current_steps": 2315, "total_steps": 4116, "loss": 0.1658, "lr": 1.9152092042850655e-05, "epoch": 3.937074829931973, "percentage": 56.24, "elapsed_time": "3:00:46", "remaining_time": "2:20:38"}
{"current_steps": 2320, "total_steps": 4116, "loss": 0.157, "lr": 1.9067359962412588e-05, "epoch": 3.945578231292517, "percentage": 56.37, "elapsed_time": "3:01:08", "remaining_time": "2:20:13"}
{"current_steps": 2325, "total_steps": 4116, "loss": 0.1711, "lr": 1.8982644654992123e-05, "epoch": 3.954081632653061, "percentage": 56.49, "elapsed_time": "3:01:28", "remaining_time": "2:19:47"}
{"current_steps": 2330, "total_steps": 4116, "loss": 0.1844, "lr": 1.8897947644147424e-05, "epoch": 3.9625850340136055, "percentage": 56.61, "elapsed_time": "3:01:50", "remaining_time": "2:19:22"}
{"current_steps": 2335, "total_steps": 4116, "loss": 0.1936, "lr": 1.8813270453107586e-05, "epoch": 3.9710884353741496, "percentage": 56.73, "elapsed_time": "3:02:11", "remaining_time": "2:18:57"}
{"current_steps": 2340, "total_steps": 4116, "loss": 0.1686, "lr": 1.872861460474528e-05, "epoch": 3.979591836734694, "percentage": 56.85, "elapsed_time": "3:02:37", "remaining_time": "2:18:36"}
{"current_steps": 2345, "total_steps": 4116, "loss": 0.1588, "lr": 1.8643981621549325e-05, "epoch": 3.988095238095238, "percentage": 56.97, "elapsed_time": "3:02:59", "remaining_time": "2:18:11"}
{"current_steps": 2350, "total_steps": 4116, "loss": 0.1662, "lr": 1.855937302559734e-05, "epoch": 3.996598639455782, "percentage": 57.09, "elapsed_time": "3:03:23", "remaining_time": "2:17:48"}
{"current_steps": 2355, "total_steps": 4116, "loss": 0.1617, "lr": 1.8474790338528336e-05, "epoch": 4.005102040816326, "percentage": 57.22, "elapsed_time": "3:03:50", "remaining_time": "2:17:28"}
{"current_steps": 2360, "total_steps": 4116, "loss": 0.1645, "lr": 1.839023508151536e-05, "epoch": 4.01360544217687, "percentage": 57.34, "elapsed_time": "3:04:17", "remaining_time": "2:17:07"}
{"current_steps": 2365, "total_steps": 4116, "loss": 0.1587, "lr": 1.8305708775238162e-05, "epoch": 4.022108843537415, "percentage": 57.46, "elapsed_time": "3:04:38", "remaining_time": "2:16:42"}
{"current_steps": 2370, "total_steps": 4116, "loss": 0.152, "lr": 1.822121293985583e-05, "epoch": 4.030612244897959, "percentage": 57.58, "elapsed_time": "3:04:58", "remaining_time": "2:16:16"}
{"current_steps": 2375, "total_steps": 4116, "loss": 0.1376, "lr": 1.813674909497944e-05, "epoch": 4.039115646258503, "percentage": 57.7, "elapsed_time": "3:05:24", "remaining_time": "2:15:55"}
{"current_steps": 2380, "total_steps": 4116, "loss": 0.1397, "lr": 1.8052318759644744e-05, "epoch": 4.0476190476190474, "percentage": 57.82, "elapsed_time": "3:05:45", "remaining_time": "2:15:29"}
{"current_steps": 2385, "total_steps": 4116, "loss": 0.1515, "lr": 1.7967923452284824e-05, "epoch": 4.0561224489795915, "percentage": 57.94, "elapsed_time": "3:06:06", "remaining_time": "2:15:04"}
{"current_steps": 2390, "total_steps": 4116, "loss": 0.1515, "lr": 1.7883564690702826e-05, "epoch": 4.0646258503401365, "percentage": 58.07, "elapsed_time": "3:06:32", "remaining_time": "2:14:42"}
{"current_steps": 2395, "total_steps": 4116, "loss": 0.1592, "lr": 1.7799243992044634e-05, "epoch": 4.0731292517006805, "percentage": 58.19, "elapsed_time": "3:06:52", "remaining_time": "2:14:17"}
{"current_steps": 2400, "total_steps": 4116, "loss": 0.1505, "lr": 1.7714962872771593e-05, "epoch": 4.081632653061225, "percentage": 58.31, "elapsed_time": "3:07:16", "remaining_time": "2:13:53"}
{"current_steps": 2405, "total_steps": 4116, "loss": 0.1419, "lr": 1.7630722848633234e-05, "epoch": 4.090136054421769, "percentage": 58.43, "elapsed_time": "3:07:38", "remaining_time": "2:13:29"}
{"current_steps": 2410, "total_steps": 4116, "loss": 0.1584, "lr": 1.7546525434639992e-05, "epoch": 4.098639455782313, "percentage": 58.55, "elapsed_time": "3:08:01", "remaining_time": "2:13:06"}
{"current_steps": 2415, "total_steps": 4116, "loss": 0.153, "lr": 1.7462372145036012e-05, "epoch": 4.107142857142857, "percentage": 58.67, "elapsed_time": "3:08:21", "remaining_time": "2:12:40"}
{"current_steps": 2420, "total_steps": 4116, "loss": 0.1595, "lr": 1.737826449327188e-05, "epoch": 4.115646258503402, "percentage": 58.79, "elapsed_time": "3:08:44", "remaining_time": "2:12:16"}
{"current_steps": 2425, "total_steps": 4116, "loss": 0.1649, "lr": 1.729420399197741e-05, "epoch": 4.124149659863946, "percentage": 58.92, "elapsed_time": "3:09:06", "remaining_time": "2:11:51"}
{"current_steps": 2430, "total_steps": 4116, "loss": 0.1364, "lr": 1.7210192152934423e-05, "epoch": 4.13265306122449, "percentage": 59.04, "elapsed_time": "3:09:27", "remaining_time": "2:11:27"}
{"current_steps": 2435, "total_steps": 4116, "loss": 0.1544, "lr": 1.712623048704961e-05, "epoch": 4.141156462585034, "percentage": 59.16, "elapsed_time": "3:09:49", "remaining_time": "2:11:02"}
{"current_steps": 2440, "total_steps": 4116, "loss": 0.1566, "lr": 1.7042320504327285e-05, "epoch": 4.149659863945578, "percentage": 59.28, "elapsed_time": "3:10:09", "remaining_time": "2:10:37"}
{"current_steps": 2445, "total_steps": 4116, "loss": 0.1564, "lr": 1.6958463713842298e-05, "epoch": 4.158163265306122, "percentage": 59.4, "elapsed_time": "3:10:31", "remaining_time": "2:10:12"}
{"current_steps": 2450, "total_steps": 4116, "loss": 0.1617, "lr": 1.687466162371286e-05, "epoch": 4.166666666666667, "percentage": 59.52, "elapsed_time": "3:10:54", "remaining_time": "2:09:48"}
{"current_steps": 2455, "total_steps": 4116, "loss": 0.1518, "lr": 1.6790915741073418e-05, "epoch": 4.175170068027211, "percentage": 59.65, "elapsed_time": "3:11:15", "remaining_time": "2:09:24"}
{"current_steps": 2460, "total_steps": 4116, "loss": 0.1582, "lr": 1.670722757204758e-05, "epoch": 4.183673469387755, "percentage": 59.77, "elapsed_time": "3:11:39", "remaining_time": "2:09:01"}
{"current_steps": 2465, "total_steps": 4116, "loss": 0.1526, "lr": 1.6623598621720966e-05, "epoch": 4.192176870748299, "percentage": 59.89, "elapsed_time": "3:12:01", "remaining_time": "2:08:36"}
{"current_steps": 2470, "total_steps": 4116, "loss": 0.1514, "lr": 1.6540030394114218e-05, "epoch": 4.200680272108843, "percentage": 60.01, "elapsed_time": "3:12:21", "remaining_time": "2:08:11"}
{"current_steps": 2475, "total_steps": 4116, "loss": 0.147, "lr": 1.64565243921559e-05, "epoch": 4.209183673469388, "percentage": 60.13, "elapsed_time": "3:12:44", "remaining_time": "2:07:47"}
{"current_steps": 2480, "total_steps": 4116, "loss": 0.1539, "lr": 1.6373082117655473e-05, "epoch": 4.217687074829932, "percentage": 60.25, "elapsed_time": "3:13:07", "remaining_time": "2:07:24"}
{"current_steps": 2485, "total_steps": 4116, "loss": 0.1487, "lr": 1.6289705071276323e-05, "epoch": 4.226190476190476, "percentage": 60.37, "elapsed_time": "3:13:32", "remaining_time": "2:07:01"}
{"current_steps": 2490, "total_steps": 4116, "loss": 0.1612, "lr": 1.62063947525087e-05, "epoch": 4.23469387755102, "percentage": 60.5, "elapsed_time": "3:13:59", "remaining_time": "2:06:40"}
{"current_steps": 2495, "total_steps": 4116, "loss": 0.1522, "lr": 1.612315265964284e-05, "epoch": 4.243197278911564, "percentage": 60.62, "elapsed_time": "3:14:27", "remaining_time": "2:06:20"}
{"current_steps": 2500, "total_steps": 4116, "loss": 0.1577, "lr": 1.6039980289741935e-05, "epoch": 4.2517006802721085, "percentage": 60.74, "elapsed_time": "3:14:51", "remaining_time": "2:05:57"}
{"current_steps": 2505, "total_steps": 4116, "loss": 0.1547, "lr": 1.595687913861528e-05, "epoch": 4.260204081632653, "percentage": 60.86, "elapsed_time": "3:15:16", "remaining_time": "2:05:35"}
{"current_steps": 2510, "total_steps": 4116, "loss": 0.149, "lr": 1.587385070079133e-05, "epoch": 4.2687074829931975, "percentage": 60.98, "elapsed_time": "3:15:40", "remaining_time": "2:05:11"}
{"current_steps": 2515, "total_steps": 4116, "loss": 0.1495, "lr": 1.5790896469490807e-05, "epoch": 4.2772108843537415, "percentage": 61.1, "elapsed_time": "3:16:06", "remaining_time": "2:04:50"}
{"current_steps": 2520, "total_steps": 4116, "loss": 0.1538, "lr": 1.5708017936599908e-05, "epoch": 4.285714285714286, "percentage": 61.22, "elapsed_time": "3:16:26", "remaining_time": "2:04:24"}
{"current_steps": 2525, "total_steps": 4116, "loss": 0.1511, "lr": 1.56252165926434e-05, "epoch": 4.29421768707483, "percentage": 61.35, "elapsed_time": "3:16:48", "remaining_time": "2:04:00"}
{"current_steps": 2530, "total_steps": 4116, "loss": 0.1557, "lr": 1.554249392675788e-05, "epoch": 4.302721088435375, "percentage": 61.47, "elapsed_time": "3:17:10", "remaining_time": "2:03:36"}
{"current_steps": 2535, "total_steps": 4116, "loss": 0.149, "lr": 1.5459851426664956e-05, "epoch": 4.311224489795919, "percentage": 61.59, "elapsed_time": "3:17:36", "remaining_time": "2:03:14"}
{"current_steps": 2540, "total_steps": 4116, "loss": 0.1625, "lr": 1.5377290578644468e-05, "epoch": 4.319727891156463, "percentage": 61.71, "elapsed_time": "3:17:59", "remaining_time": "2:02:50"}
{"current_steps": 2545, "total_steps": 4116, "loss": 0.1479, "lr": 1.5294812867507835e-05, "epoch": 4.328231292517007, "percentage": 61.83, "elapsed_time": "3:18:24", "remaining_time": "2:02:28"}
{"current_steps": 2550, "total_steps": 4116, "loss": 0.1501, "lr": 1.5212419776571267e-05, "epoch": 4.336734693877551, "percentage": 61.95, "elapsed_time": "3:18:47", "remaining_time": "2:02:04"}
{"current_steps": 2555, "total_steps": 4116, "loss": 0.16, "lr": 1.5130112787629152e-05, "epoch": 4.345238095238095, "percentage": 62.07, "elapsed_time": "3:19:08", "remaining_time": "2:01:39"}
{"current_steps": 2560, "total_steps": 4116, "loss": 0.1424, "lr": 1.5047893380927377e-05, "epoch": 4.35374149659864, "percentage": 62.2, "elapsed_time": "3:19:32", "remaining_time": "2:01:17"}
{"current_steps": 2565, "total_steps": 4116, "loss": 0.1474, "lr": 1.4965763035136687e-05, "epoch": 4.362244897959184, "percentage": 62.32, "elapsed_time": "3:19:53", "remaining_time": "2:00:52"}
{"current_steps": 2570, "total_steps": 4116, "loss": 0.1553, "lr": 1.4883723227326144e-05, "epoch": 4.370748299319728, "percentage": 62.44, "elapsed_time": "3:20:14", "remaining_time": "2:00:27"}
{"current_steps": 2575, "total_steps": 4116, "loss": 0.1502, "lr": 1.4801775432936516e-05, "epoch": 4.379251700680272, "percentage": 62.56, "elapsed_time": "3:20:36", "remaining_time": "2:00:03"}
{"current_steps": 2580, "total_steps": 4116, "loss": 0.1547, "lr": 1.4719921125753773e-05, "epoch": 4.387755102040816, "percentage": 62.68, "elapsed_time": "3:21:01", "remaining_time": "1:59:40"}
{"current_steps": 2585, "total_steps": 4116, "loss": 0.1419, "lr": 1.4638161777882573e-05, "epoch": 4.39625850340136, "percentage": 62.8, "elapsed_time": "3:21:22", "remaining_time": "1:59:15"}
{"current_steps": 2590, "total_steps": 4116, "loss": 0.1592, "lr": 1.4556498859719756e-05, "epoch": 4.404761904761905, "percentage": 62.93, "elapsed_time": "3:21:46", "remaining_time": "1:58:52"}
{"current_steps": 2595, "total_steps": 4116, "loss": 0.1578, "lr": 1.4474933839927956e-05, "epoch": 4.413265306122449, "percentage": 63.05, "elapsed_time": "3:22:08", "remaining_time": "1:58:28"}
{"current_steps": 2600, "total_steps": 4116, "loss": 0.15, "lr": 1.4393468185409139e-05, "epoch": 4.421768707482993, "percentage": 63.17, "elapsed_time": "3:22:28", "remaining_time": "1:58:03"}
{"current_steps": 2605, "total_steps": 4116, "loss": 0.162, "lr": 1.4312103361278254e-05, "epoch": 4.430272108843537, "percentage": 63.29, "elapsed_time": "3:22:52", "remaining_time": "1:57:40"}
{"current_steps": 2610, "total_steps": 4116, "loss": 0.1458, "lr": 1.4230840830836875e-05, "epoch": 4.438775510204081, "percentage": 63.41, "elapsed_time": "3:23:15", "remaining_time": "1:57:17"}
{"current_steps": 2615, "total_steps": 4116, "loss": 0.1457, "lr": 1.4149682055546854e-05, "epoch": 4.447278911564625, "percentage": 63.53, "elapsed_time": "3:23:37", "remaining_time": "1:56:52"}
{"current_steps": 2620, "total_steps": 4116, "loss": 0.1512, "lr": 1.4068628495004095e-05, "epoch": 4.45578231292517, "percentage": 63.65, "elapsed_time": "3:24:00", "remaining_time": "1:56:29"}
{"current_steps": 2625, "total_steps": 4116, "loss": 0.1473, "lr": 1.3987681606912254e-05, "epoch": 4.464285714285714, "percentage": 63.78, "elapsed_time": "3:24:22", "remaining_time": "1:56:05"}
{"current_steps": 2630, "total_steps": 4116, "loss": 0.1589, "lr": 1.3906842847056546e-05, "epoch": 4.4727891156462585, "percentage": 63.9, "elapsed_time": "3:24:42", "remaining_time": "1:55:39"}
{"current_steps": 2635, "total_steps": 4116, "loss": 0.1416, "lr": 1.382611366927757e-05, "epoch": 4.4812925170068025, "percentage": 64.02, "elapsed_time": "3:25:05", "remaining_time": "1:55:16"}
{"current_steps": 2640, "total_steps": 4116, "loss": 0.1588, "lr": 1.3745495525445126e-05, "epoch": 4.489795918367347, "percentage": 64.14, "elapsed_time": "3:25:29", "remaining_time": "1:54:53"}
{"current_steps": 2645, "total_steps": 4116, "loss": 0.1613, "lr": 1.3664989865432157e-05, "epoch": 4.4982993197278915, "percentage": 64.26, "elapsed_time": "3:25:50", "remaining_time": "1:54:28"}
{"current_steps": 2650, "total_steps": 4116, "loss": 0.141, "lr": 1.3584598137088625e-05, "epoch": 4.506802721088436, "percentage": 64.38, "elapsed_time": "3:26:12", "remaining_time": "1:54:04"}
{"current_steps": 2655, "total_steps": 4116, "loss": 0.1415, "lr": 1.3504321786215514e-05, "epoch": 4.51530612244898, "percentage": 64.5, "elapsed_time": "3:26:34", "remaining_time": "1:53:40"}
{"current_steps": 2660, "total_steps": 4116, "loss": 0.1517, "lr": 1.3424162256538797e-05, "epoch": 4.523809523809524, "percentage": 64.63, "elapsed_time": "3:26:59", "remaining_time": "1:53:17"}
{"current_steps": 2665, "total_steps": 4116, "loss": 0.16, "lr": 1.3344120989683473e-05, "epoch": 4.532312925170068, "percentage": 64.75, "elapsed_time": "3:27:22", "remaining_time": "1:52:54"}
{"current_steps": 2670, "total_steps": 4116, "loss": 0.1701, "lr": 1.3264199425147667e-05, "epoch": 4.540816326530612, "percentage": 64.87, "elapsed_time": "3:27:49", "remaining_time": "1:52:32"}
{"current_steps": 2675, "total_steps": 4116, "loss": 0.1551, "lr": 1.3184399000276708e-05, "epoch": 4.549319727891157, "percentage": 64.99, "elapsed_time": "3:28:10", "remaining_time": "1:52:08"}
{"current_steps": 2680, "total_steps": 4116, "loss": 0.1518, "lr": 1.3104721150237305e-05, "epoch": 4.557823129251701, "percentage": 65.11, "elapsed_time": "3:28:34", "remaining_time": "1:51:45"}
{"current_steps": 2685, "total_steps": 4116, "loss": 0.1485, "lr": 1.3025167307991734e-05, "epoch": 4.566326530612245, "percentage": 65.23, "elapsed_time": "3:28:55", "remaining_time": "1:51:20"}
{"current_steps": 2690, "total_steps": 4116, "loss": 0.1498, "lr": 1.294573890427203e-05, "epoch": 4.574829931972789, "percentage": 65.35, "elapsed_time": "3:29:20", "remaining_time": "1:50:58"}
{"current_steps": 2695, "total_steps": 4116, "loss": 0.1526, "lr": 1.2866437367554313e-05, "epoch": 4.583333333333333, "percentage": 65.48, "elapsed_time": "3:29:42", "remaining_time": "1:50:34"}
{"current_steps": 2700, "total_steps": 4116, "loss": 0.1487, "lr": 1.278726412403306e-05, "epoch": 4.591836734693878, "percentage": 65.6, "elapsed_time": "3:30:04", "remaining_time": "1:50:10"}
{"current_steps": 2705, "total_steps": 4116, "loss": 0.1496, "lr": 1.2708220597595462e-05, "epoch": 4.600340136054422, "percentage": 65.72, "elapsed_time": "3:30:26", "remaining_time": "1:49:46"}
{"current_steps": 2710, "total_steps": 4116, "loss": 0.1508, "lr": 1.2629308209795834e-05, "epoch": 4.608843537414966, "percentage": 65.84, "elapsed_time": "3:30:49", "remaining_time": "1:49:22"}
{"current_steps": 2715, "total_steps": 4116, "loss": 0.1498, "lr": 1.2550528379830019e-05, "epoch": 4.61734693877551, "percentage": 65.96, "elapsed_time": "3:31:11", "remaining_time": "1:48:58"}
{"current_steps": 2720, "total_steps": 4116, "loss": 0.1573, "lr": 1.2471882524509873e-05, "epoch": 4.625850340136054, "percentage": 66.08, "elapsed_time": "3:31:34", "remaining_time": "1:48:35"}
{"current_steps": 2725, "total_steps": 4116, "loss": 0.1497, "lr": 1.2393372058237819e-05, "epoch": 4.634353741496598, "percentage": 66.21, "elapsed_time": "3:31:58", "remaining_time": "1:48:12"}
{"current_steps": 2730, "total_steps": 4116, "loss": 0.143, "lr": 1.2314998392981364e-05, "epoch": 4.642857142857143, "percentage": 66.33, "elapsed_time": "3:32:21", "remaining_time": "1:47:48"}
{"current_steps": 2735, "total_steps": 4116, "loss": 0.1471, "lr": 1.2236762938247729e-05, "epoch": 4.651360544217687, "percentage": 66.45, "elapsed_time": "3:32:44", "remaining_time": "1:47:25"}
{"current_steps": 2740, "total_steps": 4116, "loss": 0.1556, "lr": 1.2158667101058497e-05, "epoch": 4.659863945578231, "percentage": 66.57, "elapsed_time": "3:33:07", "remaining_time": "1:47:01"}
{"current_steps": 2745, "total_steps": 4116, "loss": 0.153, "lr": 1.208071228592429e-05, "epoch": 4.668367346938775, "percentage": 66.69, "elapsed_time": "3:33:33", "remaining_time": "1:46:39"}
{"current_steps": 2750, "total_steps": 4116, "loss": 0.1449, "lr": 1.2002899894819548e-05, "epoch": 4.6768707482993195, "percentage": 66.81, "elapsed_time": "3:33:55", "remaining_time": "1:46:15"}
{"current_steps": 2755, "total_steps": 4116, "loss": 0.1445, "lr": 1.1925231327157288e-05, "epoch": 4.685374149659864, "percentage": 66.93, "elapsed_time": "3:34:17", "remaining_time": "1:45:51"}
{"current_steps": 2760, "total_steps": 4116, "loss": 0.1506, "lr": 1.1847707979763934e-05, "epoch": 4.6938775510204085, "percentage": 67.06, "elapsed_time": "3:34:38", "remaining_time": "1:45:27"}
{"current_steps": 2765, "total_steps": 4116, "loss": 0.1539, "lr": 1.1770331246854211e-05, "epoch": 4.7023809523809526, "percentage": 67.18, "elapsed_time": "3:35:01", "remaining_time": "1:45:03"}
{"current_steps": 2770, "total_steps": 4116, "loss": 0.1555, "lr": 1.169310252000605e-05, "epoch": 4.710884353741497, "percentage": 67.3, "elapsed_time": "3:35:23", "remaining_time": "1:44:40"}
{"current_steps": 2775, "total_steps": 4116, "loss": 0.1587, "lr": 1.1616023188135594e-05, "epoch": 4.719387755102041, "percentage": 67.42, "elapsed_time": "3:35:48", "remaining_time": "1:44:17"}
{"current_steps": 2780, "total_steps": 4116, "loss": 0.1435, "lr": 1.1539094637472181e-05, "epoch": 4.727891156462585, "percentage": 67.54, "elapsed_time": "3:36:12", "remaining_time": "1:43:54"}
{"current_steps": 2785, "total_steps": 4116, "loss": 0.1491, "lr": 1.1462318251533442e-05, "epoch": 4.736394557823129, "percentage": 67.66, "elapsed_time": "3:36:36", "remaining_time": "1:43:31"}
{"current_steps": 2790, "total_steps": 4116, "loss": 0.1621, "lr": 1.1385695411100424e-05, "epoch": 4.744897959183674, "percentage": 67.78, "elapsed_time": "3:36:57", "remaining_time": "1:43:06"}
{"current_steps": 2795, "total_steps": 4116, "loss": 0.1568, "lr": 1.1309227494192703e-05, "epoch": 4.753401360544218, "percentage": 67.91, "elapsed_time": "3:37:24", "remaining_time": "1:42:45"}
{"current_steps": 2800, "total_steps": 4116, "loss": 0.1608, "lr": 1.1232915876043673e-05, "epoch": 4.761904761904762, "percentage": 68.03, "elapsed_time": "3:37:45", "remaining_time": "1:42:20"}
{"current_steps": 2805, "total_steps": 4116, "loss": 0.1529, "lr": 1.1156761929075777e-05, "epoch": 4.770408163265306, "percentage": 68.15, "elapsed_time": "3:38:07", "remaining_time": "1:41:57"}
{"current_steps": 2810, "total_steps": 4116, "loss": 0.1606, "lr": 1.1080767022875816e-05, "epoch": 4.77891156462585, "percentage": 68.27, "elapsed_time": "3:38:34", "remaining_time": "1:41:35"}
{"current_steps": 2815, "total_steps": 4116, "loss": 0.1566, "lr": 1.100493252417035e-05, "epoch": 4.787414965986395, "percentage": 68.39, "elapsed_time": "3:38:57", "remaining_time": "1:41:11"}
{"current_steps": 2820, "total_steps": 4116, "loss": 0.154, "lr": 1.0929259796801075e-05, "epoch": 4.795918367346939, "percentage": 68.51, "elapsed_time": "3:39:24", "remaining_time": "1:40:49"}
{"current_steps": 2825, "total_steps": 4116, "loss": 0.1672, "lr": 1.085375020170034e-05, "epoch": 4.804421768707483, "percentage": 68.63, "elapsed_time": "3:39:45", "remaining_time": "1:40:25"}
{"current_steps": 2830, "total_steps": 4116, "loss": 0.1633, "lr": 1.0778405096866624e-05, "epoch": 4.812925170068027, "percentage": 68.76, "elapsed_time": "3:40:05", "remaining_time": "1:40:00"}
{"current_steps": 2835, "total_steps": 4116, "loss": 0.1447, "lr": 1.0703225837340166e-05, "epoch": 4.821428571428571, "percentage": 68.88, "elapsed_time": "3:40:29", "remaining_time": "1:39:37"}
{"current_steps": 2840, "total_steps": 4116, "loss": 0.162, "lr": 1.0628213775178561e-05, "epoch": 4.829931972789115, "percentage": 69.0, "elapsed_time": "3:40:52", "remaining_time": "1:39:14"}
{"current_steps": 2845, "total_steps": 4116, "loss": 0.1453, "lr": 1.055337025943244e-05, "epoch": 4.83843537414966, "percentage": 69.12, "elapsed_time": "3:41:14", "remaining_time": "1:38:50"}
{"current_steps": 2850, "total_steps": 4116, "loss": 0.1559, "lr": 1.0478696636121231e-05, "epoch": 4.846938775510204, "percentage": 69.24, "elapsed_time": "3:41:38", "remaining_time": "1:38:27"}
{"current_steps": 2855, "total_steps": 4116, "loss": 0.1501, "lr": 1.0404194248208934e-05, "epoch": 4.855442176870748, "percentage": 69.36, "elapsed_time": "3:42:01", "remaining_time": "1:38:03"}
{"current_steps": 2860, "total_steps": 4116, "loss": 0.1508, "lr": 1.0329864435579982e-05, "epoch": 4.863945578231292, "percentage": 69.48, "elapsed_time": "3:42:24", "remaining_time": "1:37:40"}
{"current_steps": 2865, "total_steps": 4116, "loss": 0.1579, "lr": 1.0255708535015137e-05, "epoch": 4.872448979591836, "percentage": 69.61, "elapsed_time": "3:42:50", "remaining_time": "1:37:18"}
{"current_steps": 2870, "total_steps": 4116, "loss": 0.1584, "lr": 1.0181727880167428e-05, "epoch": 4.880952380952381, "percentage": 69.73, "elapsed_time": "3:43:11", "remaining_time": "1:36:54"}
{"current_steps": 2875, "total_steps": 4116, "loss": 0.1645, "lr": 1.0107923801538215e-05, "epoch": 4.889455782312925, "percentage": 69.85, "elapsed_time": "3:43:33", "remaining_time": "1:36:29"}
{"current_steps": 2880, "total_steps": 4116, "loss": 0.1467, "lr": 1.003429762645321e-05, "epoch": 4.8979591836734695, "percentage": 69.97, "elapsed_time": "3:43:57", "remaining_time": "1:36:07"}
{"current_steps": 2885, "total_steps": 4116, "loss": 0.1502, "lr": 9.960850679038632e-06, "epoch": 4.906462585034014, "percentage": 70.09, "elapsed_time": "3:44:22", "remaining_time": "1:35:44"}
{"current_steps": 2890, "total_steps": 4116, "loss": 0.1417, "lr": 9.887584280197411e-06, "epoch": 4.914965986394558, "percentage": 70.21, "elapsed_time": "3:44:47", "remaining_time": "1:35:21"}
{"current_steps": 2895, "total_steps": 4116, "loss": 0.1519, "lr": 9.814499747585373e-06, "epoch": 4.923469387755102, "percentage": 70.34, "elapsed_time": "3:45:08", "remaining_time": "1:34:57"}
{"current_steps": 2900, "total_steps": 4116, "loss": 0.1482, "lr": 9.741598395587606e-06, "epoch": 4.931972789115647, "percentage": 70.46, "elapsed_time": "3:45:36", "remaining_time": "1:34:35"}
{"current_steps": 2905, "total_steps": 4116, "loss": 0.1367, "lr": 9.668881535294792e-06, "epoch": 4.940476190476191, "percentage": 70.58, "elapsed_time": "3:45:58", "remaining_time": "1:34:12"}
{"current_steps": 2910, "total_steps": 4116, "loss": 0.1457, "lr": 9.596350474479625e-06, "epoch": 4.948979591836735, "percentage": 70.7, "elapsed_time": "3:46:22", "remaining_time": "1:33:49"}
{"current_steps": 2915, "total_steps": 4116, "loss": 0.1657, "lr": 9.52400651757331e-06, "epoch": 4.957482993197279, "percentage": 70.82, "elapsed_time": "3:46:42", "remaining_time": "1:33:24"}
{"current_steps": 2920, "total_steps": 4116, "loss": 0.1552, "lr": 9.451850965642073e-06, "epoch": 4.965986394557823, "percentage": 70.94, "elapsed_time": "3:47:04", "remaining_time": "1:33:00"}
{"current_steps": 2925, "total_steps": 4116, "loss": 0.1596, "lr": 9.379885116363786e-06, "epoch": 4.974489795918368, "percentage": 71.06, "elapsed_time": "3:47:26", "remaining_time": "1:32:36"}
{"current_steps": 2930, "total_steps": 4116, "loss": 0.1629, "lr": 9.308110264004625e-06, "epoch": 4.982993197278912, "percentage": 71.19, "elapsed_time": "3:47:49", "remaining_time": "1:32:13"}
{"current_steps": 2935, "total_steps": 4116, "loss": 0.1596, "lr": 9.236527699395803e-06, "epoch": 4.991496598639456, "percentage": 71.31, "elapsed_time": "3:48:13", "remaining_time": "1:31:50"}
{"current_steps": 2940, "total_steps": 4116, "loss": 0.1441, "lr": 9.165138709910338e-06, "epoch": 5.0, "percentage": 71.43, "elapsed_time": "3:48:37", "remaining_time": "1:31:26"}
{"current_steps": 2945, "total_steps": 4116, "loss": 0.1369, "lr": 9.093944579439882e-06, "epoch": 5.008503401360544, "percentage": 71.55, "elapsed_time": "3:48:58", "remaining_time": "1:31:02"}
{"current_steps": 2950, "total_steps": 4116, "loss": 0.1489, "lr": 9.022946588371688e-06, "epoch": 5.017006802721088, "percentage": 71.67, "elapsed_time": "3:49:21", "remaining_time": "1:30:39"}
{"current_steps": 2955, "total_steps": 4116, "loss": 0.1398, "lr": 8.952146013565532e-06, "epoch": 5.025510204081633, "percentage": 71.79, "elapsed_time": "3:49:47", "remaining_time": "1:30:16"}
{"current_steps": 2960, "total_steps": 4116, "loss": 0.143, "lr": 8.881544128330777e-06, "epoch": 5.034013605442177, "percentage": 71.91, "elapsed_time": "3:50:08", "remaining_time": "1:29:52"}
{"current_steps": 2965, "total_steps": 4116, "loss": 0.1506, "lr": 8.81114220240346e-06, "epoch": 5.042517006802721, "percentage": 72.04, "elapsed_time": "3:50:31", "remaining_time": "1:29:29"}
{"current_steps": 2970, "total_steps": 4116, "loss": 0.1354, "lr": 8.740941501923454e-06, "epoch": 5.051020408163265, "percentage": 72.16, "elapsed_time": "3:50:59", "remaining_time": "1:29:07"}
{"current_steps": 2975, "total_steps": 4116, "loss": 0.1436, "lr": 8.6709432894117e-06, "epoch": 5.059523809523809, "percentage": 72.28, "elapsed_time": "3:51:22", "remaining_time": "1:28:44"}
{"current_steps": 2980, "total_steps": 4116, "loss": 0.1516, "lr": 8.601148823747516e-06, "epoch": 5.068027210884353, "percentage": 72.4, "elapsed_time": "3:51:45", "remaining_time": "1:28:20"}
{"current_steps": 2985, "total_steps": 4116, "loss": 0.1571, "lr": 8.531559360145942e-06, "epoch": 5.076530612244898, "percentage": 72.52, "elapsed_time": "3:52:06", "remaining_time": "1:27:56"}
{"current_steps": 2990, "total_steps": 4116, "loss": 0.1409, "lr": 8.462176150135168e-06, "epoch": 5.085034013605442, "percentage": 72.64, "elapsed_time": "3:52:29", "remaining_time": "1:27:33"}
{"current_steps": 2995, "total_steps": 4116, "loss": 0.1321, "lr": 8.393000441534036e-06, "epoch": 5.093537414965986, "percentage": 72.76, "elapsed_time": "3:52:50", "remaining_time": "1:27:09"}
{"current_steps": 3000, "total_steps": 4116, "loss": 0.1443, "lr": 8.324033478429559e-06, "epoch": 5.1020408163265305, "percentage": 72.89, "elapsed_time": "3:53:13", "remaining_time": "1:26:45"}
{"current_steps": 3005, "total_steps": 4116, "loss": 0.1493, "lr": 8.25527650115461e-06, "epoch": 5.110544217687075, "percentage": 73.01, "elapsed_time": "3:53:43", "remaining_time": "1:26:24"}
{"current_steps": 3010, "total_steps": 4116, "loss": 0.1614, "lr": 8.186730746265577e-06, "epoch": 5.119047619047619, "percentage": 73.13, "elapsed_time": "3:54:04", "remaining_time": "1:26:00"}
{"current_steps": 3015, "total_steps": 4116, "loss": 0.1422, "lr": 8.118397446520103e-06, "epoch": 5.127551020408164, "percentage": 73.25, "elapsed_time": "3:54:27", "remaining_time": "1:25:36"}
{"current_steps": 3020, "total_steps": 4116, "loss": 0.149, "lr": 8.050277830854984e-06, "epoch": 5.136054421768708, "percentage": 73.37, "elapsed_time": "3:54:55", "remaining_time": "1:25:15"}
{"current_steps": 3025, "total_steps": 4116, "loss": 0.1421, "lr": 7.982373124363984e-06, "epoch": 5.144557823129252, "percentage": 73.49, "elapsed_time": "3:55:22", "remaining_time": "1:24:53"}
{"current_steps": 3030, "total_steps": 4116, "loss": 0.1446, "lr": 7.914684548275875e-06, "epoch": 5.153061224489796, "percentage": 73.62, "elapsed_time": "3:55:46", "remaining_time": "1:24:30"}
{"current_steps": 3035, "total_steps": 4116, "loss": 0.1385, "lr": 7.847213319932425e-06, "epoch": 5.16156462585034, "percentage": 73.74, "elapsed_time": "3:56:09", "remaining_time": "1:24:06"}
{"current_steps": 3040, "total_steps": 4116, "loss": 0.1395, "lr": 7.779960652766534e-06, "epoch": 5.170068027210885, "percentage": 73.86, "elapsed_time": "3:56:32", "remaining_time": "1:23:43"}
{"current_steps": 3045, "total_steps": 4116, "loss": 0.1458, "lr": 7.7129277562804e-06, "epoch": 5.178571428571429, "percentage": 73.98, "elapsed_time": "3:56:53", "remaining_time": "1:23:19"}
{"current_steps": 3050, "total_steps": 4116, "loss": 0.1484, "lr": 7.646115836023744e-06, "epoch": 5.187074829931973, "percentage": 74.1, "elapsed_time": "3:57:13", "remaining_time": "1:22:54"}
{"current_steps": 3055, "total_steps": 4116, "loss": 0.1457, "lr": 7.5795260935721755e-06, "epoch": 5.195578231292517, "percentage": 74.22, "elapsed_time": "3:57:38", "remaining_time": "1:22:31"}
{"current_steps": 3060, "total_steps": 4116, "loss": 0.1405, "lr": 7.5131597265055585e-06, "epoch": 5.204081632653061, "percentage": 74.34, "elapsed_time": "3:58:03", "remaining_time": "1:22:09"}
{"current_steps": 3065, "total_steps": 4116, "loss": 0.1397, "lr": 7.447017928386453e-06, "epoch": 5.212585034013605, "percentage": 74.47, "elapsed_time": "3:58:26", "remaining_time": "1:21:45"}
{"current_steps": 3070, "total_steps": 4116, "loss": 0.1452, "lr": 7.381101888738693e-06, "epoch": 5.22108843537415, "percentage": 74.59, "elapsed_time": "3:58:49", "remaining_time": "1:21:22"}
{"current_steps": 3075, "total_steps": 4116, "loss": 0.1378, "lr": 7.315412793025951e-06, "epoch": 5.229591836734694, "percentage": 74.71, "elapsed_time": "3:59:09", "remaining_time": "1:20:57"}
{"current_steps": 3080, "total_steps": 4116, "loss": 0.1348, "lr": 7.24995182263045e-06, "epoch": 5.238095238095238, "percentage": 74.83, "elapsed_time": "3:59:30", "remaining_time": "1:20:33"}
{"current_steps": 3085, "total_steps": 4116, "loss": 0.1529, "lr": 7.184720154831706e-06, "epoch": 5.246598639455782, "percentage": 74.95, "elapsed_time": "3:59:54", "remaining_time": "1:20:10"}
{"current_steps": 3090, "total_steps": 4116, "loss": 0.1378, "lr": 7.119718962785349e-06, "epoch": 5.255102040816326, "percentage": 75.07, "elapsed_time": "4:00:18", "remaining_time": "1:19:47"}
{"current_steps": 3095, "total_steps": 4116, "loss": 0.1378, "lr": 7.054949415502037e-06, "epoch": 5.263605442176871, "percentage": 75.19, "elapsed_time": "4:00:39", "remaining_time": "1:19:23"}
{"current_steps": 3100, "total_steps": 4116, "loss": 0.1491, "lr": 6.990412677826404e-06, "epoch": 5.272108843537415, "percentage": 75.32, "elapsed_time": "4:01:01", "remaining_time": "1:18:59"}
{"current_steps": 3105, "total_steps": 4116, "loss": 0.1433, "lr": 6.926109910416153e-06, "epoch": 5.280612244897959, "percentage": 75.44, "elapsed_time": "4:01:28", "remaining_time": "1:18:37"}
{"current_steps": 3110, "total_steps": 4116, "loss": 0.1419, "lr": 6.862042269721158e-06, "epoch": 5.289115646258503, "percentage": 75.56, "elapsed_time": "4:01:50", "remaining_time": "1:18:13"}
{"current_steps": 3115, "total_steps": 4116, "loss": 0.1455, "lr": 6.798210907962655e-06, "epoch": 5.2976190476190474, "percentage": 75.68, "elapsed_time": "4:02:13", "remaining_time": "1:17:50"}
{"current_steps": 3120, "total_steps": 4116, "loss": 0.1347, "lr": 6.734616973112549e-06, "epoch": 5.3061224489795915, "percentage": 75.8, "elapsed_time": "4:02:34", "remaining_time": "1:17:26"}
{"current_steps": 3125, "total_steps": 4116, "loss": 0.1413, "lr": 6.671261608872737e-06, "epoch": 5.3146258503401365, "percentage": 75.92, "elapsed_time": "4:02:58", "remaining_time": "1:17:03"}
{"current_steps": 3130, "total_steps": 4116, "loss": 0.1449, "lr": 6.60814595465457e-06, "epoch": 5.3231292517006805, "percentage": 76.04, "elapsed_time": "4:03:20", "remaining_time": "1:16:39"}
{"current_steps": 3135, "total_steps": 4116, "loss": 0.1484, "lr": 6.545271145558336e-06, "epoch": 5.331632653061225, "percentage": 76.17, "elapsed_time": "4:03:43", "remaining_time": "1:16:15"}
{"current_steps": 3140, "total_steps": 4116, "loss": 0.1394, "lr": 6.482638312352867e-06, "epoch": 5.340136054421769, "percentage": 76.29, "elapsed_time": "4:04:04", "remaining_time": "1:15:52"}
{"current_steps": 3145, "total_steps": 4116, "loss": 0.1441, "lr": 6.420248581455184e-06, "epoch": 5.348639455782313, "percentage": 76.41, "elapsed_time": "4:04:26", "remaining_time": "1:15:28"}
{"current_steps": 3150, "total_steps": 4116, "loss": 0.1351, "lr": 6.358103074910238e-06, "epoch": 5.357142857142857, "percentage": 76.53, "elapsed_time": "4:04:45", "remaining_time": "1:15:03"}
{"current_steps": 3155, "total_steps": 4116, "loss": 0.1345, "lr": 6.296202910370757e-06, "epoch": 5.365646258503402, "percentage": 76.65, "elapsed_time": "4:05:08", "remaining_time": "1:14:40"}
{"current_steps": 3160, "total_steps": 4116, "loss": 0.1449, "lr": 6.234549201077124e-06, "epoch": 5.374149659863946, "percentage": 76.77, "elapsed_time": "4:05:31", "remaining_time": "1:14:16"}
{"current_steps": 3165, "total_steps": 4116, "loss": 0.1427, "lr": 6.173143055837345e-06, "epoch": 5.38265306122449, "percentage": 76.9, "elapsed_time": "4:05:57", "remaining_time": "1:13:54"}
{"current_steps": 3170, "total_steps": 4116, "loss": 0.1426, "lr": 6.111985579007149e-06, "epoch": 5.391156462585034, "percentage": 77.02, "elapsed_time": "4:06:19", "remaining_time": "1:13:30"}
{"current_steps": 3175, "total_steps": 4116, "loss": 0.1391, "lr": 6.051077870470068e-06, "epoch": 5.399659863945578, "percentage": 77.14, "elapsed_time": "4:06:44", "remaining_time": "1:13:07"}
{"current_steps": 3180, "total_steps": 4116, "loss": 0.1428, "lr": 5.990421025617716e-06, "epoch": 5.408163265306122, "percentage": 77.26, "elapsed_time": "4:07:05", "remaining_time": "1:12:43"}
{"current_steps": 3185, "total_steps": 4116, "loss": 0.1424, "lr": 5.930016135330052e-06, "epoch": 5.416666666666667, "percentage": 77.38, "elapsed_time": "4:07:28", "remaining_time": "1:12:20"}
{"current_steps": 3190, "total_steps": 4116, "loss": 0.1456, "lr": 5.869864285955771e-06, "epoch": 5.425170068027211, "percentage": 77.5, "elapsed_time": "4:07:52", "remaining_time": "1:11:57"}
{"current_steps": 3195, "total_steps": 4116, "loss": 0.1381, "lr": 5.80996655929277e-06, "epoch": 5.433673469387755, "percentage": 77.62, "elapsed_time": "4:08:14", "remaining_time": "1:11:33"}
{"current_steps": 3200, "total_steps": 4116, "loss": 0.141, "lr": 5.750324032568677e-06, "epoch": 5.442176870748299, "percentage": 77.75, "elapsed_time": "4:08:35", "remaining_time": "1:11:09"}
{"current_steps": 3205, "total_steps": 4116, "loss": 0.1469, "lr": 5.6909377784215035e-06, "epoch": 5.450680272108843, "percentage": 77.87, "elapsed_time": "4:08:59", "remaining_time": "1:10:46"}
{"current_steps": 3210, "total_steps": 4116, "loss": 0.1399, "lr": 5.631808864880342e-06, "epoch": 5.459183673469388, "percentage": 77.99, "elapsed_time": "4:09:20", "remaining_time": "1:10:22"}
{"current_steps": 3215, "total_steps": 4116, "loss": 0.1374, "lr": 5.5729383553461315e-06, "epoch": 5.467687074829932, "percentage": 78.11, "elapsed_time": "4:09:42", "remaining_time": "1:09:58"}
{"current_steps": 3220, "total_steps": 4116, "loss": 0.1279, "lr": 5.5143273085725894e-06, "epoch": 5.476190476190476, "percentage": 78.23, "elapsed_time": "4:10:08", "remaining_time": "1:09:36"}
{"current_steps": 3225, "total_steps": 4116, "loss": 0.14, "lr": 5.45597677864711e-06, "epoch": 5.48469387755102, "percentage": 78.35, "elapsed_time": "4:10:32", "remaining_time": "1:09:13"}
{"current_steps": 3230, "total_steps": 4116, "loss": 0.1362, "lr": 5.397887814971854e-06, "epoch": 5.493197278911564, "percentage": 78.47, "elapsed_time": "4:10:58", "remaining_time": "1:08:50"}
{"current_steps": 3235, "total_steps": 4116, "loss": 0.1531, "lr": 5.34006146224485e-06, "epoch": 5.5017006802721085, "percentage": 78.6, "elapsed_time": "4:11:19", "remaining_time": "1:08:26"}
{"current_steps": 3240, "total_steps": 4116, "loss": 0.1445, "lr": 5.282498760441219e-06, "epoch": 5.510204081632653, "percentage": 78.72, "elapsed_time": "4:11:42", "remaining_time": "1:08:03"}
{"current_steps": 3245, "total_steps": 4116, "loss": 0.1404, "lr": 5.225200744794465e-06, "epoch": 5.5187074829931975, "percentage": 78.84, "elapsed_time": "4:12:03", "remaining_time": "1:07:39"}
{"current_steps": 3250, "total_steps": 4116, "loss": 0.1405, "lr": 5.168168445777839e-06, "epoch": 5.5272108843537415, "percentage": 78.96, "elapsed_time": "4:12:29", "remaining_time": "1:07:16"}
{"current_steps": 3255, "total_steps": 4116, "loss": 0.1379, "lr": 5.111402889085852e-06, "epoch": 5.535714285714286, "percentage": 79.08, "elapsed_time": "4:12:52", "remaining_time": "1:06:53"}
{"current_steps": 3260, "total_steps": 4116, "loss": 0.1375, "lr": 5.054905095615779e-06, "epoch": 5.54421768707483, "percentage": 79.2, "elapsed_time": "4:13:14", "remaining_time": "1:06:29"}
{"current_steps": 3265, "total_steps": 4116, "loss": 0.126, "lr": 4.9986760814493315e-06, "epoch": 5.552721088435375, "percentage": 79.32, "elapsed_time": "4:13:35", "remaining_time": "1:06:05"}
{"current_steps": 3270, "total_steps": 4116, "loss": 0.1397, "lr": 4.94271685783438e-06, "epoch": 5.561224489795919, "percentage": 79.45, "elapsed_time": "4:13:59", "remaining_time": "1:05:42"}
{"current_steps": 3275, "total_steps": 4116, "loss": 0.1365, "lr": 4.88702843116674e-06, "epoch": 5.569727891156463, "percentage": 79.57, "elapsed_time": "4:14:23", "remaining_time": "1:05:19"}
{"current_steps": 3280, "total_steps": 4116, "loss": 0.1379, "lr": 4.831611802972112e-06, "epoch": 5.578231292517007, "percentage": 79.69, "elapsed_time": "4:14:48", "remaining_time": "1:04:56"}
{"current_steps": 3285, "total_steps": 4116, "loss": 0.1338, "lr": 4.776467969888041e-06, "epoch": 5.586734693877551, "percentage": 79.81, "elapsed_time": "4:15:11", "remaining_time": "1:04:33"}
{"current_steps": 3290, "total_steps": 4116, "loss": 0.1537, "lr": 4.721597923646008e-06, "epoch": 5.595238095238095, "percentage": 79.93, "elapsed_time": "4:15:40", "remaining_time": "1:04:11"}
{"current_steps": 3295, "total_steps": 4116, "loss": 0.1434, "lr": 4.667002651053582e-06, "epoch": 5.603741496598639, "percentage": 80.05, "elapsed_time": "4:16:02", "remaining_time": "1:03:47"}
{"current_steps": 3300, "total_steps": 4116, "loss": 0.1393, "lr": 4.612683133976692e-06, "epoch": 5.612244897959184, "percentage": 80.17, "elapsed_time": "4:16:22", "remaining_time": "1:03:23"}
{"current_steps": 3305, "total_steps": 4116, "loss": 0.144, "lr": 4.5586403493219365e-06, "epoch": 5.620748299319728, "percentage": 80.3, "elapsed_time": "4:16:45", "remaining_time": "1:03:00"}
{"current_steps": 3310, "total_steps": 4116, "loss": 0.1426, "lr": 4.504875269019038e-06, "epoch": 5.629251700680272, "percentage": 80.42, "elapsed_time": "4:17:08", "remaining_time": "1:02:36"}
{"current_steps": 3315, "total_steps": 4116, "loss": 0.1626, "lr": 4.451388860003374e-06, "epoch": 5.637755102040816, "percentage": 80.54, "elapsed_time": "4:17:34", "remaining_time": "1:02:14"}
{"current_steps": 3320, "total_steps": 4116, "loss": 0.1313, "lr": 4.398182084198561e-06, "epoch": 5.646258503401361, "percentage": 80.66, "elapsed_time": "4:17:53", "remaining_time": "1:01:49"}
{"current_steps": 3325, "total_steps": 4116, "loss": 0.1441, "lr": 4.345255898499172e-06, "epoch": 5.654761904761905, "percentage": 80.78, "elapsed_time": "4:18:18", "remaining_time": "1:01:26"}
{"current_steps": 3330, "total_steps": 4116, "loss": 0.1364, "lr": 4.292611254753509e-06, "epoch": 5.663265306122449, "percentage": 80.9, "elapsed_time": "4:18:40", "remaining_time": "1:01:03"}
{"current_steps": 3335, "total_steps": 4116, "loss": 0.147, "lr": 4.24024909974651e-06, "epoch": 5.671768707482993, "percentage": 81.03, "elapsed_time": "4:19:04", "remaining_time": "1:00:40"}
{"current_steps": 3340, "total_steps": 4116, "loss": 0.1492, "lr": 4.188170375182705e-06, "epoch": 5.680272108843537, "percentage": 81.15, "elapsed_time": "4:19:28", "remaining_time": "1:00:17"}
{"current_steps": 3345, "total_steps": 4116, "loss": 0.1451, "lr": 4.136376017669281e-06, "epoch": 5.688775510204081, "percentage": 81.27, "elapsed_time": "4:19:50", "remaining_time": "0:59:53"}
{"current_steps": 3350, "total_steps": 4116, "loss": 0.1502, "lr": 4.084866958699247e-06, "epoch": 5.697278911564625, "percentage": 81.39, "elapsed_time": "4:20:14", "remaining_time": "0:59:30"}
{"current_steps": 3355, "total_steps": 4116, "loss": 0.1406, "lr": 4.0336441246346684e-06, "epoch": 5.70578231292517, "percentage": 81.51, "elapsed_time": "4:20:41", "remaining_time": "0:59:07"}
{"current_steps": 3360, "total_steps": 4116, "loss": 0.1414, "lr": 3.982708436690001e-06, "epoch": 5.714285714285714, "percentage": 81.63, "elapsed_time": "4:21:06", "remaining_time": "0:58:44"}
{"current_steps": 3365, "total_steps": 4116, "loss": 0.1365, "lr": 3.932060810915563e-06, "epoch": 5.7227891156462585, "percentage": 81.75, "elapsed_time": "4:21:29", "remaining_time": "0:58:21"}
{"current_steps": 3370, "total_steps": 4116, "loss": 0.1431, "lr": 3.881702158181015e-06, "epoch": 5.7312925170068025, "percentage": 81.88, "elapsed_time": "4:21:52", "remaining_time": "0:57:58"}
{"current_steps": 3375, "total_steps": 4116, "loss": 0.1386, "lr": 3.831633384159006e-06, "epoch": 5.739795918367347, "percentage": 82.0, "elapsed_time": "4:22:16", "remaining_time": "0:57:35"}
{"current_steps": 3380, "total_steps": 4116, "loss": 0.1432, "lr": 3.7818553893088596e-06, "epoch": 5.7482993197278915, "percentage": 82.12, "elapsed_time": "4:22:42", "remaining_time": "0:57:12"}
{"current_steps": 3385, "total_steps": 4116, "loss": 0.1444, "lr": 3.7323690688604153e-06, "epoch": 5.756802721088436, "percentage": 82.24, "elapsed_time": "4:23:03", "remaining_time": "0:56:48"}
{"current_steps": 3390, "total_steps": 4116, "loss": 0.1453, "lr": 3.6831753127978974e-06, "epoch": 5.76530612244898, "percentage": 82.36, "elapsed_time": "4:23:25", "remaining_time": "0:56:24"}
{"current_steps": 3395, "total_steps": 4116, "loss": 0.1457, "lr": 3.634275005843935e-06, "epoch": 5.773809523809524, "percentage": 82.48, "elapsed_time": "4:23:52", "remaining_time": "0:56:02"}
{"current_steps": 3400, "total_steps": 4116, "loss": 0.1482, "lr": 3.585669027443608e-06, "epoch": 5.782312925170068, "percentage": 82.6, "elapsed_time": "4:24:14", "remaining_time": "0:55:38"}
{"current_steps": 3405, "total_steps": 4116, "loss": 0.1507, "lr": 3.537358251748695e-06, "epoch": 5.790816326530612, "percentage": 82.73, "elapsed_time": "4:24:38", "remaining_time": "0:55:15"}
{"current_steps": 3410, "total_steps": 4116, "loss": 0.1463, "lr": 3.489343547601882e-06, "epoch": 5.799319727891157, "percentage": 82.85, "elapsed_time": "4:24:59", "remaining_time": "0:54:51"}
{"current_steps": 3415, "total_steps": 4116, "loss": 0.1396, "lr": 3.4416257785211983e-06, "epoch": 5.807823129251701, "percentage": 82.97, "elapsed_time": "4:25:23", "remaining_time": "0:54:28"}
{"current_steps": 3420, "total_steps": 4116, "loss": 0.1394, "lr": 3.3942058026844472e-06, "epoch": 5.816326530612245, "percentage": 83.09, "elapsed_time": "4:25:45", "remaining_time": "0:54:05"}
{"current_steps": 3425, "total_steps": 4116, "loss": 0.1376, "lr": 3.3470844729137886e-06, "epoch": 5.824829931972789, "percentage": 83.21, "elapsed_time": "4:26:07", "remaining_time": "0:53:41"}
{"current_steps": 3430, "total_steps": 4116, "loss": 0.1523, "lr": 3.3002626366603896e-06, "epoch": 5.833333333333333, "percentage": 83.33, "elapsed_time": "4:26:30", "remaining_time": "0:53:18"}
{"current_steps": 3435, "total_steps": 4116, "loss": 0.1332, "lr": 3.253741135989199e-06, "epoch": 5.841836734693878, "percentage": 83.45, "elapsed_time": "4:26:56", "remaining_time": "0:52:55"}
{"current_steps": 3440, "total_steps": 4116, "loss": 0.1477, "lr": 3.2075208075637953e-06, "epoch": 5.850340136054422, "percentage": 83.58, "elapsed_time": "4:27:18", "remaining_time": "0:52:31"}
{"current_steps": 3445, "total_steps": 4116, "loss": 0.1413, "lr": 3.161602482631343e-06, "epoch": 5.858843537414966, "percentage": 83.7, "elapsed_time": "4:27:40", "remaining_time": "0:52:08"}
{"current_steps": 3450, "total_steps": 4116, "loss": 0.1409, "lr": 3.115986987007622e-06, "epoch": 5.86734693877551, "percentage": 83.82, "elapsed_time": "4:28:02", "remaining_time": "0:51:44"}
{"current_steps": 3455, "total_steps": 4116, "loss": 0.141, "lr": 3.0706751410622184e-06, "epoch": 5.875850340136054, "percentage": 83.94, "elapsed_time": "4:28:24", "remaining_time": "0:51:21"}
{"current_steps": 3460, "total_steps": 4116, "loss": 0.1392, "lr": 3.0256677597037253e-06, "epoch": 5.884353741496598, "percentage": 84.06, "elapsed_time": "4:28:45", "remaining_time": "0:50:57"}
{"current_steps": 3465, "total_steps": 4116, "loss": 0.1415, "lr": 2.980965652365122e-06, "epoch": 5.892857142857143, "percentage": 84.18, "elapsed_time": "4:29:07", "remaining_time": "0:50:33"}
{"current_steps": 3470, "total_steps": 4116, "loss": 0.1344, "lr": 2.936569622989198e-06, "epoch": 5.901360544217687, "percentage": 84.31, "elapsed_time": "4:29:29", "remaining_time": "0:50:10"}
{"current_steps": 3475, "total_steps": 4116, "loss": 0.1406, "lr": 2.892480470014101e-06, "epoch": 5.909863945578231, "percentage": 84.43, "elapsed_time": "4:29:53", "remaining_time": "0:49:47"}
{"current_steps": 3480, "total_steps": 4116, "loss": 0.133, "lr": 2.8486989863589665e-06, "epoch": 5.918367346938775, "percentage": 84.55, "elapsed_time": "4:30:14", "remaining_time": "0:49:23"}
{"current_steps": 3485, "total_steps": 4116, "loss": 0.1411, "lr": 2.8052259594096786e-06, "epoch": 5.9268707482993195, "percentage": 84.67, "elapsed_time": "4:30:36", "remaining_time": "0:48:59"}
{"current_steps": 3490, "total_steps": 4116, "loss": 0.1413, "lr": 2.762062171004689e-06, "epoch": 5.935374149659864, "percentage": 84.79, "elapsed_time": "4:31:00", "remaining_time": "0:48:36"}
{"current_steps": 3495, "total_steps": 4116, "loss": 0.1307, "lr": 2.719208397420976e-06, "epoch": 5.9438775510204085, "percentage": 84.91, "elapsed_time": "4:31:22", "remaining_time": "0:48:13"}
{"current_steps": 3500, "total_steps": 4116, "loss": 0.1326, "lr": 2.6766654093600554e-06, "epoch": 5.9523809523809526, "percentage": 85.03, "elapsed_time": "4:31:45", "remaining_time": "0:47:49"}
{"current_steps": 3505, "total_steps": 4116, "loss": 0.1482, "lr": 2.634433971934145e-06, "epoch": 5.960884353741497, "percentage": 85.16, "elapsed_time": "4:32:09", "remaining_time": "0:47:26"}
{"current_steps": 3510, "total_steps": 4116, "loss": 0.1503, "lr": 2.592514844652396e-06, "epoch": 5.969387755102041, "percentage": 85.28, "elapsed_time": "4:32:29", "remaining_time": "0:47:02"}
{"current_steps": 3515, "total_steps": 4116, "loss": 0.1425, "lr": 2.5509087814072373e-06, "epoch": 5.977891156462585, "percentage": 85.4, "elapsed_time": "4:32:51", "remaining_time": "0:46:39"}
{"current_steps": 3520, "total_steps": 4116, "loss": 0.1416, "lr": 2.5096165304608055e-06, "epoch": 5.986394557823129, "percentage": 85.52, "elapsed_time": "4:33:14", "remaining_time": "0:46:15"}
{"current_steps": 3525, "total_steps": 4116, "loss": 0.1485, "lr": 2.468638834431507e-06, "epoch": 5.994897959183674, "percentage": 85.64, "elapsed_time": "4:33:37", "remaining_time": "0:45:52"}
{"current_steps": 3530, "total_steps": 4116, "loss": 0.1304, "lr": 2.427976430280634e-06, "epoch": 6.003401360544218, "percentage": 85.76, "elapsed_time": "4:33:59", "remaining_time": "0:45:29"}
{"current_steps": 3535, "total_steps": 4116, "loss": 0.1311, "lr": 2.3876300492991456e-06, "epoch": 6.011904761904762, "percentage": 85.88, "elapsed_time": "4:34:21", "remaining_time": "0:45:05"}
{"current_steps": 3540, "total_steps": 4116, "loss": 0.1412, "lr": 2.3476004170944843e-06, "epoch": 6.020408163265306, "percentage": 86.01, "elapsed_time": "4:34:43", "remaining_time": "0:44:42"}
{"current_steps": 3545, "total_steps": 4116, "loss": 0.1332, "lr": 2.307888253577557e-06, "epoch": 6.02891156462585, "percentage": 86.13, "elapsed_time": "4:35:04", "remaining_time": "0:44:18"}
{"current_steps": 3550, "total_steps": 4116, "loss": 0.1363, "lr": 2.2684942729497505e-06, "epoch": 6.037414965986395, "percentage": 86.25, "elapsed_time": "4:35:30", "remaining_time": "0:43:55"}
{"current_steps": 3555, "total_steps": 4116, "loss": 0.1412, "lr": 2.229419183690118e-06, "epoch": 6.045918367346939, "percentage": 86.37, "elapsed_time": "4:35:52", "remaining_time": "0:43:32"}
{"current_steps": 3560, "total_steps": 4116, "loss": 0.1345, "lr": 2.1906636885426293e-06, "epoch": 6.054421768707483, "percentage": 86.49, "elapsed_time": "4:36:16", "remaining_time": "0:43:08"}
{"current_steps": 3565, "total_steps": 4116, "loss": 0.1361, "lr": 2.1522284845035246e-06, "epoch": 6.062925170068027, "percentage": 86.61, "elapsed_time": "4:36:37", "remaining_time": "0:42:45"}
{"current_steps": 3570, "total_steps": 4116, "loss": 0.1335, "lr": 2.1141142628087908e-06, "epoch": 6.071428571428571, "percentage": 86.73, "elapsed_time": "4:37:01", "remaining_time": "0:42:22"}
{"current_steps": 3575, "total_steps": 4116, "loss": 0.1409, "lr": 2.0763217089217204e-06, "epoch": 6.079931972789115, "percentage": 86.86, "elapsed_time": "4:37:23", "remaining_time": "0:41:58"}
{"current_steps": 3580, "total_steps": 4116, "loss": 0.1429, "lr": 2.0388515025205912e-06, "epoch": 6.08843537414966, "percentage": 86.98, "elapsed_time": "4:37:44", "remaining_time": "0:41:35"}
{"current_steps": 3585, "total_steps": 4116, "loss": 0.1321, "lr": 2.0017043174864257e-06, "epoch": 6.096938775510204, "percentage": 87.1, "elapsed_time": "4:38:05", "remaining_time": "0:41:11"}
{"current_steps": 3590, "total_steps": 4116, "loss": 0.1469, "lr": 1.9648808218909023e-06, "epoch": 6.105442176870748, "percentage": 87.22, "elapsed_time": "4:38:28", "remaining_time": "0:40:48"}
{"current_steps": 3595, "total_steps": 4116, "loss": 0.1355, "lr": 1.928381677984319e-06, "epoch": 6.113945578231292, "percentage": 87.34, "elapsed_time": "4:38:48", "remaining_time": "0:40:24"}
{"current_steps": 3600, "total_steps": 4116, "loss": 0.1313, "lr": 1.8922075421836729e-06, "epoch": 6.122448979591836, "percentage": 87.46, "elapsed_time": "4:39:08", "remaining_time": "0:40:00"}
{"current_steps": 3605, "total_steps": 4116, "loss": 0.1313, "lr": 1.8563590650608908e-06, "epoch": 6.130952380952381, "percentage": 87.59, "elapsed_time": "4:39:29", "remaining_time": "0:39:37"}
{"current_steps": 3610, "total_steps": 4116, "loss": 0.1447, "lr": 1.8208368913310881e-06, "epoch": 6.139455782312925, "percentage": 87.71, "elapsed_time": "4:39:50", "remaining_time": "0:39:13"}
{"current_steps": 3615, "total_steps": 4116, "loss": 0.1443, "lr": 1.7856416598410064e-06, "epoch": 6.1479591836734695, "percentage": 87.83, "elapsed_time": "4:40:13", "remaining_time": "0:38:50"}
{"current_steps": 3620, "total_steps": 4116, "loss": 0.1376, "lr": 1.7507740035575049e-06, "epoch": 6.156462585034014, "percentage": 87.95, "elapsed_time": "4:40:36", "remaining_time": "0:38:26"}
{"current_steps": 3625, "total_steps": 4116, "loss": 0.1372, "lr": 1.7162345495561905e-06, "epoch": 6.164965986394558, "percentage": 88.07, "elapsed_time": "4:40:59", "remaining_time": "0:38:03"}
{"current_steps": 3630, "total_steps": 4116, "loss": 0.1268, "lr": 1.682023919010125e-06, "epoch": 6.173469387755102, "percentage": 88.19, "elapsed_time": "4:41:23", "remaining_time": "0:37:40"}
{"current_steps": 3635, "total_steps": 4116, "loss": 0.1325, "lr": 1.6481427271786588e-06, "epoch": 6.181972789115647, "percentage": 88.31, "elapsed_time": "4:41:48", "remaining_time": "0:37:17"}
{"current_steps": 3640, "total_steps": 4116, "loss": 0.1286, "lr": 1.6145915833963788e-06, "epoch": 6.190476190476191, "percentage": 88.44, "elapsed_time": "4:42:10", "remaining_time": "0:36:53"}
{"current_steps": 3645, "total_steps": 4116, "loss": 0.1484, "lr": 1.5813710910621293e-06, "epoch": 6.198979591836735, "percentage": 88.56, "elapsed_time": "4:42:31", "remaining_time": "0:36:30"}
{"current_steps": 3650, "total_steps": 4116, "loss": 0.1429, "lr": 1.548481847628176e-06, "epoch": 6.207482993197279, "percentage": 88.68, "elapsed_time": "4:42:52", "remaining_time": "0:36:06"}
{"current_steps": 3655, "total_steps": 4116, "loss": 0.1383, "lr": 1.5159244445894605e-06, "epoch": 6.215986394557823, "percentage": 88.8, "elapsed_time": "4:43:15", "remaining_time": "0:35:43"}
{"current_steps": 3660, "total_steps": 4116, "loss": 0.1427, "lr": 1.4836994674729433e-06, "epoch": 6.224489795918367, "percentage": 88.92, "elapsed_time": "4:43:37", "remaining_time": "0:35:20"}
{"current_steps": 3665, "total_steps": 4116, "loss": 0.1261, "lr": 1.4518074958271e-06, "epoch": 6.232993197278912, "percentage": 89.04, "elapsed_time": "4:44:00", "remaining_time": "0:34:56"}
{"current_steps": 3670, "total_steps": 4116, "loss": 0.1409, "lr": 1.4202491032114797e-06, "epoch": 6.241496598639456, "percentage": 89.16, "elapsed_time": "4:44:26", "remaining_time": "0:34:33"}
{"current_steps": 3675, "total_steps": 4116, "loss": 0.1379, "lr": 1.389024857186403e-06, "epoch": 6.25, "percentage": 89.29, "elapsed_time": "4:44:47", "remaining_time": "0:34:10"}
{"current_steps": 3680, "total_steps": 4116, "loss": 0.145, "lr": 1.3581353193027424e-06, "epoch": 6.258503401360544, "percentage": 89.41, "elapsed_time": "4:45:10", "remaining_time": "0:33:47"}
{"current_steps": 3685, "total_steps": 4116, "loss": 0.1323, "lr": 1.3275810450918257e-06, "epoch": 6.267006802721088, "percentage": 89.53, "elapsed_time": "4:45:33", "remaining_time": "0:33:23"}
{"current_steps": 3690, "total_steps": 4116, "loss": 0.1301, "lr": 1.2973625840554593e-06, "epoch": 6.275510204081632, "percentage": 89.65, "elapsed_time": "4:45:55", "remaining_time": "0:33:00"}
{"current_steps": 3695, "total_steps": 4116, "loss": 0.1306, "lr": 1.2674804796560202e-06, "epoch": 6.284013605442177, "percentage": 89.77, "elapsed_time": "4:46:16", "remaining_time": "0:32:37"}
{"current_steps": 3700, "total_steps": 4116, "loss": 0.1369, "lr": 1.237935269306707e-06, "epoch": 6.292517006802721, "percentage": 89.89, "elapsed_time": "4:46:35", "remaining_time": "0:32:13"}
{"current_steps": 3705, "total_steps": 4116, "loss": 0.1481, "lr": 1.2087274843618668e-06, "epoch": 6.301020408163265, "percentage": 90.01, "elapsed_time": "4:46:59", "remaining_time": "0:31:50"}
{"current_steps": 3710, "total_steps": 4116, "loss": 0.1317, "lr": 1.1798576501074233e-06, "epoch": 6.309523809523809, "percentage": 90.14, "elapsed_time": "4:47:23", "remaining_time": "0:31:27"}
{"current_steps": 3715, "total_steps": 4116, "loss": 0.1333, "lr": 1.151326285751455e-06, "epoch": 6.318027210884353, "percentage": 90.26, "elapsed_time": "4:47:44", "remaining_time": "0:31:03"}
{"current_steps": 3720, "total_steps": 4116, "loss": 0.1317, "lr": 1.1231339044148393e-06, "epoch": 6.326530612244898, "percentage": 90.38, "elapsed_time": "4:48:06", "remaining_time": "0:30:40"}
{"current_steps": 3725, "total_steps": 4116, "loss": 0.1375, "lr": 1.095281013122036e-06, "epoch": 6.335034013605442, "percentage": 90.5, "elapsed_time": "4:48:27", "remaining_time": "0:30:16"}
{"current_steps": 3730, "total_steps": 4116, "loss": 0.1397, "lr": 1.0677681127919581e-06, "epoch": 6.343537414965986, "percentage": 90.62, "elapsed_time": "4:48:50", "remaining_time": "0:29:53"}
{"current_steps": 3735, "total_steps": 4116, "loss": 0.1368, "lr": 1.04059569822897e-06, "epoch": 6.3520408163265305, "percentage": 90.74, "elapsed_time": "4:49:12", "remaining_time": "0:29:30"}
{"current_steps": 3740, "total_steps": 4116, "loss": 0.1396, "lr": 1.0137642581139895e-06, "epoch": 6.360544217687075, "percentage": 90.86, "elapsed_time": "4:49:35", "remaining_time": "0:29:06"}
{"current_steps": 3745, "total_steps": 4116, "loss": 0.1358, "lr": 9.87274274995693e-07, "epoch": 6.369047619047619, "percentage": 90.99, "elapsed_time": "4:49:56", "remaining_time": "0:28:43"}
{"current_steps": 3750, "total_steps": 4116, "loss": 0.1302, "lr": 9.61126225281841e-07, "epoch": 6.377551020408164, "percentage": 91.11, "elapsed_time": "4:50:18", "remaining_time": "0:28:20"}
{"current_steps": 3755, "total_steps": 4116, "loss": 0.1364, "lr": 9.353205792307163e-07, "epoch": 6.386054421768708, "percentage": 91.23, "elapsed_time": "4:50:42", "remaining_time": "0:27:56"}
{"current_steps": 3760, "total_steps": 4116, "loss": 0.1295, "lr": 9.098578009426484e-07, "epoch": 6.394557823129252, "percentage": 91.35, "elapsed_time": "4:51:04", "remaining_time": "0:27:33"}
{"current_steps": 3765, "total_steps": 4116, "loss": 0.1284, "lr": 8.847383483516903e-07, "epoch": 6.403061224489796, "percentage": 91.47, "elapsed_time": "4:51:28", "remaining_time": "0:27:10"}
{"current_steps": 3770, "total_steps": 4116, "loss": 0.1238, "lr": 8.599626732173671e-07, "epoch": 6.41156462585034, "percentage": 91.59, "elapsed_time": "4:51:51", "remaining_time": "0:26:47"}
{"current_steps": 3775, "total_steps": 4116, "loss": 0.1319, "lr": 8.355312211165568e-07, "epoch": 6.420068027210885, "percentage": 91.72, "elapsed_time": "4:52:21", "remaining_time": "0:26:24"}
{"current_steps": 3780, "total_steps": 4116, "loss": 0.1302, "lr": 8.11444431435473e-07, "epoch": 6.428571428571429, "percentage": 91.84, "elapsed_time": "4:52:45", "remaining_time": "0:26:01"}
{"current_steps": 3785, "total_steps": 4116, "loss": 0.139, "lr": 7.87702737361764e-07, "epoch": 6.437074829931973, "percentage": 91.96, "elapsed_time": "4:53:10", "remaining_time": "0:25:38"}
{"current_steps": 3790, "total_steps": 4116, "loss": 0.1391, "lr": 7.643065658767312e-07, "epoch": 6.445578231292517, "percentage": 92.08, "elapsed_time": "4:53:35", "remaining_time": "0:25:15"}
{"current_steps": 3795, "total_steps": 4116, "loss": 0.1413, "lr": 7.412563377476312e-07, "epoch": 6.454081632653061, "percentage": 92.2, "elapsed_time": "4:54:00", "remaining_time": "0:24:52"}
{"current_steps": 3800, "total_steps": 4116, "loss": 0.1417, "lr": 7.185524675201328e-07, "epoch": 6.462585034013605, "percentage": 92.32, "elapsed_time": "4:54:21", "remaining_time": "0:24:28"}
{"current_steps": 3805, "total_steps": 4116, "loss": 0.1381, "lr": 6.961953635108409e-07, "epoch": 6.47108843537415, "percentage": 92.44, "elapsed_time": "4:54:44", "remaining_time": "0:24:05"}
{"current_steps": 3810, "total_steps": 4116, "loss": 0.1338, "lr": 6.741854277999583e-07, "epoch": 6.479591836734694, "percentage": 92.57, "elapsed_time": "4:55:06", "remaining_time": "0:23:42"}
{"current_steps": 3815, "total_steps": 4116, "loss": 0.1353, "lr": 6.525230562240636e-07, "epoch": 6.488095238095238, "percentage": 92.69, "elapsed_time": "4:55:35", "remaining_time": "0:23:19"}
{"current_steps": 3820, "total_steps": 4116, "loss": 0.1284, "lr": 6.312086383689831e-07, "epoch": 6.496598639455782, "percentage": 92.81, "elapsed_time": "4:55:59", "remaining_time": "0:22:56"}
{"current_steps": 3825, "total_steps": 4116, "loss": 0.1373, "lr": 6.10242557562788e-07, "epoch": 6.505102040816326, "percentage": 92.93, "elapsed_time": "4:56:23", "remaining_time": "0:22:32"}
{"current_steps": 3830, "total_steps": 4116, "loss": 0.1336, "lr": 5.896251908688966e-07, "epoch": 6.513605442176871, "percentage": 93.05, "elapsed_time": "4:56:45", "remaining_time": "0:22:09"}
{"current_steps": 3835, "total_steps": 4116, "loss": 0.1371, "lr": 5.693569090792972e-07, "epoch": 6.522108843537415, "percentage": 93.17, "elapsed_time": "4:57:08", "remaining_time": "0:21:46"}
{"current_steps": 3840, "total_steps": 4116, "loss": 0.1392, "lr": 5.494380767078822e-07, "epoch": 6.530612244897959, "percentage": 93.29, "elapsed_time": "4:57:32", "remaining_time": "0:21:23"}
{"current_steps": 3845, "total_steps": 4116, "loss": 0.1351, "lr": 5.298690519838823e-07, "epoch": 6.539115646258503, "percentage": 93.42, "elapsed_time": "4:57:54", "remaining_time": "0:20:59"}
{"current_steps": 3850, "total_steps": 4116, "loss": 0.1408, "lr": 5.106501868454317e-07, "epoch": 6.5476190476190474, "percentage": 93.54, "elapsed_time": "4:58:14", "remaining_time": "0:20:36"}
{"current_steps": 3855, "total_steps": 4116, "loss": 0.1314, "lr": 4.917818269332442e-07, "epoch": 6.5561224489795915, "percentage": 93.66, "elapsed_time": "4:58:36", "remaining_time": "0:20:13"}
{"current_steps": 3860, "total_steps": 4116, "loss": 0.1277, "lr": 4.7326431158437826e-07, "epoch": 6.564625850340136, "percentage": 93.78, "elapsed_time": "4:59:00", "remaining_time": "0:19:49"}
{"current_steps": 3865, "total_steps": 4116, "loss": 0.138, "lr": 4.5509797382615517e-07, "epoch": 6.5731292517006805, "percentage": 93.9, "elapsed_time": "4:59:22", "remaining_time": "0:19:26"}
{"current_steps": 3870, "total_steps": 4116, "loss": 0.1434, "lr": 4.372831403701572e-07, "epoch": 6.581632653061225, "percentage": 94.02, "elapsed_time": "4:59:45", "remaining_time": "0:19:03"}
{"current_steps": 3875, "total_steps": 4116, "loss": 0.1273, "lr": 4.198201316063566e-07, "epoch": 6.590136054421769, "percentage": 94.14, "elapsed_time": "5:00:07", "remaining_time": "0:18:39"}
{"current_steps": 3880, "total_steps": 4116, "loss": 0.1377, "lr": 4.02709261597356e-07, "epoch": 6.598639455782313, "percentage": 94.27, "elapsed_time": "5:00:32", "remaining_time": "0:18:16"}
{"current_steps": 3885, "total_steps": 4116, "loss": 0.1412, "lr": 3.8595083807272393e-07, "epoch": 6.607142857142857, "percentage": 94.39, "elapsed_time": "5:00:54", "remaining_time": "0:17:53"}
{"current_steps": 3890, "total_steps": 4116, "loss": 0.137, "lr": 3.695451624234836e-07, "epoch": 6.615646258503402, "percentage": 94.51, "elapsed_time": "5:01:20", "remaining_time": "0:17:30"}
{"current_steps": 3895, "total_steps": 4116, "loss": 0.1328, "lr": 3.5349252969667956e-07, "epoch": 6.624149659863946, "percentage": 94.63, "elapsed_time": "5:01:46", "remaining_time": "0:17:07"}
{"current_steps": 3900, "total_steps": 4116, "loss": 0.1412, "lr": 3.3779322859007536e-07, "epoch": 6.63265306122449, "percentage": 94.75, "elapsed_time": "5:02:10", "remaining_time": "0:16:44"}
{"current_steps": 3905, "total_steps": 4116, "loss": 0.124, "lr": 3.224475414469552e-07, "epoch": 6.641156462585034, "percentage": 94.87, "elapsed_time": "5:02:34", "remaining_time": "0:16:20"}
{"current_steps": 3910, "total_steps": 4116, "loss": 0.1301, "lr": 3.0745574425105505e-07, "epoch": 6.649659863945578, "percentage": 95.0, "elapsed_time": "5:02:56", "remaining_time": "0:15:57"}
{"current_steps": 3915, "total_steps": 4116, "loss": 0.1348, "lr": 2.928181066215929e-07, "epoch": 6.658163265306122, "percentage": 95.12, "elapsed_time": "5:03:19", "remaining_time": "0:15:34"}
{"current_steps": 3920, "total_steps": 4116, "loss": 0.1308, "lr": 2.785348918084241e-07, "epoch": 6.666666666666667, "percentage": 95.24, "elapsed_time": "5:03:41", "remaining_time": "0:15:11"}
{"current_steps": 3925, "total_steps": 4116, "loss": 0.1311, "lr": 2.6460635668730027e-07, "epoch": 6.675170068027211, "percentage": 95.36, "elapsed_time": "5:04:04", "remaining_time": "0:14:47"}
{"current_steps": 3930, "total_steps": 4116, "loss": 0.1398, "lr": 2.5103275175526023e-07, "epoch": 6.683673469387755, "percentage": 95.48, "elapsed_time": "5:04:28", "remaining_time": "0:14:24"}
{"current_steps": 3935, "total_steps": 4116, "loss": 0.142, "lr": 2.3781432112611303e-07, "epoch": 6.692176870748299, "percentage": 95.6, "elapsed_time": "5:04:50", "remaining_time": "0:14:01"}
{"current_steps": 3940, "total_steps": 4116, "loss": 0.1361, "lr": 2.2495130252605746e-07, "epoch": 6.700680272108843, "percentage": 95.72, "elapsed_time": "5:05:12", "remaining_time": "0:13:38"}
{"current_steps": 3945, "total_steps": 4116, "loss": 0.1506, "lr": 2.1244392728939857e-07, "epoch": 6.709183673469388, "percentage": 95.85, "elapsed_time": "5:05:42", "remaining_time": "0:13:15"}
{"current_steps": 3950, "total_steps": 4116, "loss": 0.1332, "lr": 2.0029242035439768e-07, "epoch": 6.717687074829932, "percentage": 95.97, "elapsed_time": "5:06:05", "remaining_time": "0:12:51"}
{"current_steps": 3955, "total_steps": 4116, "loss": 0.1475, "lr": 1.8849700025921347e-07, "epoch": 6.726190476190476, "percentage": 96.09, "elapsed_time": "5:06:25", "remaining_time": "0:12:28"}
{"current_steps": 3960, "total_steps": 4116, "loss": 0.1359, "lr": 1.7705787913798734e-07, "epoch": 6.73469387755102, "percentage": 96.21, "elapsed_time": "5:06:48", "remaining_time": "0:12:05"}
{"current_steps": 3965, "total_steps": 4116, "loss": 0.1259, "lr": 1.659752627170086e-07, "epoch": 6.743197278911564, "percentage": 96.33, "elapsed_time": "5:07:08", "remaining_time": "0:11:41"}
{"current_steps": 3970, "total_steps": 4116, "loss": 0.1494, "lr": 1.5524935031103305e-07, "epoch": 6.7517006802721085, "percentage": 96.45, "elapsed_time": "5:07:32", "remaining_time": "0:11:18"}
{"current_steps": 3975, "total_steps": 4116, "loss": 0.1376, "lr": 1.4488033481969478e-07, "epoch": 6.760204081632653, "percentage": 96.57, "elapsed_time": "5:07:54", "remaining_time": "0:10:55"}
{"current_steps": 3980, "total_steps": 4116, "loss": 0.1298, "lr": 1.3486840272402213e-07, "epoch": 6.7687074829931975, "percentage": 96.7, "elapsed_time": "5:08:19", "remaining_time": "0:10:32"}
{"current_steps": 3985, "total_steps": 4116, "loss": 0.1379, "lr": 1.252137340831072e-07, "epoch": 6.7772108843537415, "percentage": 96.82, "elapsed_time": "5:08:41", "remaining_time": "0:10:08"}
{"current_steps": 3990, "total_steps": 4116, "loss": 0.1498, "lr": 1.1591650253084619e-07, "epoch": 6.785714285714286, "percentage": 96.94, "elapsed_time": "5:09:04", "remaining_time": "0:09:45"}
{"current_steps": 3995, "total_steps": 4116, "loss": 0.1383, "lr": 1.0697687527283062e-07, "epoch": 6.79421768707483, "percentage": 97.06, "elapsed_time": "5:09:30", "remaining_time": "0:09:22"}
{"current_steps": 4000, "total_steps": 4116, "loss": 0.1349, "lr": 9.839501308333666e-08, "epoch": 6.802721088435375, "percentage": 97.18, "elapsed_time": "5:09:51", "remaining_time": "0:08:59"}
{"current_steps": 4005, "total_steps": 4116, "loss": 0.1407, "lr": 9.017107030242944e-08, "epoch": 6.811224489795919, "percentage": 97.3, "elapsed_time": "5:10:14", "remaining_time": "0:08:35"}
{"current_steps": 4010, "total_steps": 4116, "loss": 0.1591, "lr": 8.230519483319211e-08, "epoch": 6.819727891156463, "percentage": 97.42, "elapsed_time": "5:10:36", "remaining_time": "0:08:12"}
{"current_steps": 4015, "total_steps": 4116, "loss": 0.1396, "lr": 7.479752813906338e-08, "epoch": 6.828231292517007, "percentage": 97.55, "elapsed_time": "5:11:00", "remaining_time": "0:07:49"}
{"current_steps": 4020, "total_steps": 4116, "loss": 0.1427, "lr": 6.764820524129745e-08, "epoch": 6.836734693877551, "percentage": 97.67, "elapsed_time": "5:11:21", "remaining_time": "0:07:26"}
{"current_steps": 4025, "total_steps": 4116, "loss": 0.1395, "lr": 6.085735471653032e-08, "epoch": 6.845238095238095, "percentage": 97.79, "elapsed_time": "5:11:43", "remaining_time": "0:07:02"}
{"current_steps": 4030, "total_steps": 4116, "loss": 0.1444, "lr": 5.442509869446833e-08, "epoch": 6.853741496598639, "percentage": 97.91, "elapsed_time": "5:12:08", "remaining_time": "0:06:39"}
{"current_steps": 4035, "total_steps": 4116, "loss": 0.1378, "lr": 4.835155285569659e-08, "epoch": 6.862244897959184, "percentage": 98.03, "elapsed_time": "5:12:30", "remaining_time": "0:06:16"}
{"current_steps": 4040, "total_steps": 4116, "loss": 0.1309, "lr": 4.263682642959177e-08, "epoch": 6.870748299319728, "percentage": 98.15, "elapsed_time": "5:12:52", "remaining_time": "0:05:53"}
{"current_steps": 4045, "total_steps": 4116, "loss": 0.136, "lr": 3.72810221923614e-08, "epoch": 6.879251700680272, "percentage": 98.28, "elapsed_time": "5:13:16", "remaining_time": "0:05:29"}
{"current_steps": 4050, "total_steps": 4116, "loss": 0.153, "lr": 3.228423646519652e-08, "epoch": 6.887755102040816, "percentage": 98.4, "elapsed_time": "5:13:41", "remaining_time": "0:05:06"}
{"current_steps": 4055, "total_steps": 4116, "loss": 0.1216, "lr": 2.764655911253744e-08, "epoch": 6.896258503401361, "percentage": 98.52, "elapsed_time": "5:14:05", "remaining_time": "0:04:43"}
{"current_steps": 4060, "total_steps": 4116, "loss": 0.1319, "lr": 2.3368073540461778e-08, "epoch": 6.904761904761905, "percentage": 98.64, "elapsed_time": "5:14:26", "remaining_time": "0:04:20"}
{"current_steps": 4065, "total_steps": 4116, "loss": 0.1432, "lr": 1.944885669517005e-08, "epoch": 6.913265306122449, "percentage": 98.76, "elapsed_time": "5:14:52", "remaining_time": "0:03:57"}
{"current_steps": 4070, "total_steps": 4116, "loss": 0.1332, "lr": 1.5888979061624566e-08, "epoch": 6.921768707482993, "percentage": 98.88, "elapsed_time": "5:15:16", "remaining_time": "0:03:33"}
{"current_steps": 4075, "total_steps": 4116, "loss": 0.1425, "lr": 1.268850466226379e-08, "epoch": 6.930272108843537, "percentage": 99.0, "elapsed_time": "5:15:41", "remaining_time": "0:03:10"}
{"current_steps": 4080, "total_steps": 4116, "loss": 0.1344, "lr": 9.847491055856583e-09, "epoch": 6.938775510204081, "percentage": 99.13, "elapsed_time": "5:16:04", "remaining_time": "0:02:47"}
{"current_steps": 4085, "total_steps": 4116, "loss": 0.1356, "lr": 7.365989336469704e-09, "epoch": 6.947278911564625, "percentage": 99.25, "elapsed_time": "5:16:25", "remaining_time": "0:02:24"}
{"current_steps": 4090, "total_steps": 4116, "loss": 0.1333, "lr": 5.244044132544091e-09, "epoch": 6.95578231292517, "percentage": 99.37, "elapsed_time": "5:16:46", "remaining_time": "0:02:00"}
{"current_steps": 4095, "total_steps": 4116, "loss": 0.1362, "lr": 3.481693606095515e-09, "epoch": 6.964285714285714, "percentage": 99.49, "elapsed_time": "5:17:09", "remaining_time": "0:01:37"}
{"current_steps": 4100, "total_steps": 4116, "loss": 0.1457, "lr": 2.078969452030677e-09, "epoch": 6.9727891156462585, "percentage": 99.61, "elapsed_time": "5:17:29", "remaining_time": "0:01:14"}
{"current_steps": 4105, "total_steps": 4116, "loss": 0.145, "lr": 1.0358968975676675e-09, "epoch": 6.9812925170068025, "percentage": 99.73, "elapsed_time": "5:17:54", "remaining_time": "0:00:51"}
{"current_steps": 4110, "total_steps": 4116, "loss": 0.1376, "lr": 3.524947017941038e-10, "epoch": 6.989795918367347, "percentage": 99.85, "elapsed_time": "5:18:22", "remaining_time": "0:00:27"}
{"current_steps": 4115, "total_steps": 4116, "loss": 0.1316, "lr": 2.8775155322957604e-11, "epoch": 6.9982993197278915, "percentage": 99.98, "elapsed_time": "5:18:44", "remaining_time": "0:00:04"}
{"current_steps": 4116, "total_steps": 4116, "epoch": 7.0, "percentage": 100.0, "elapsed_time": "5:18:57", "remaining_time": "0:00:00"}

9100
trainer_state.json Normal file

File diff suppressed because it is too large

3
training_args.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:614cd10465ab2b6b7fe2abab9e95e43d0698c7d02b19b9998782227285f050e1
size 8593

BIN
training_loss.png Normal file

Binary file not shown.

Size: 41 KiB

1
vocab.json Normal file

File diff suppressed because one or more lines are too long