Initialize project; model provided by the ModelHub XC community

Model: laion/r2egym-31600__Qwen3-8B
Source: Original Platform
ModelHub XC
2026-05-27 09:58:19 +08:00
commit 48e84f8ca1
23 changed files with 158012 additions and 0 deletions

36
.gitattributes vendored Normal file

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

61
README.md Normal file

@@ -0,0 +1,61 @@
---
library_name: transformers
license: other
base_model: Qwen/Qwen3-8B
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: r2egym-31600__Qwen3-8B
results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# r2egym-31600__Qwen3-8B
This model is a fine-tuned version of [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--r2egym-unified-31600/snapshots/68e1b38fd891a5a7c593dfcf25d1109f2dec75a5_thinking_preprocessed dataset.
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 4e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 32
- gradient_accumulation_steps: 3
- total_train_batch_size: 96
- total_eval_batch_size: 256
- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.98) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 7.0
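
The two `total_*` values in the list above follow from the per-device settings. A minimal sketch of that arithmetic, assuming the usual data-parallel convention (per-device batch size times device count, times gradient accumulation steps for training); the function names are illustrative and not part of any library:

```python
def total_train_batch_size(per_device: int, num_devices: int, grad_accum_steps: int) -> int:
    """Samples contributing to one optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


def total_eval_batch_size(per_device: int, num_devices: int) -> int:
    """Samples evaluated per step across all devices (no accumulation)."""
    return per_device * num_devices


# Values copied from the hyperparameter list.
train_total = total_train_batch_size(per_device=1, num_devices=32, grad_accum_steps=3)
eval_total = total_eval_batch_size(per_device=8, num_devices=32)
print(train_total, eval_total)
```

Both results agree with the reported `total_train_batch_size` and `total_eval_batch_size`.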
### Training results
### Framework versions
- Transformers 4.57.6
- Pytorch 2.9.1+cu130
- Datasets 4.7.0
- Tokenizers 0.22.2

28
added_tokens.json Normal file

@@ -0,0 +1,28 @@
{
"</think>": 151668,
"</tool_call>": 151658,
"</tool_response>": 151666,
"<think>": 151667,
"<tool_call>": 151657,
"<tool_response>": 151665,
"<|box_end|>": 151649,
"<|box_start|>": 151648,
"<|endoftext|>": 151643,
"<|file_sep|>": 151664,
"<|fim_middle|>": 151660,
"<|fim_pad|>": 151662,
"<|fim_prefix|>": 151659,
"<|fim_suffix|>": 151661,
"<|im_end|>": 151645,
"<|im_start|>": 151644,
"<|image_pad|>": 151655,
"<|object_ref_end|>": 151647,
"<|object_ref_start|>": 151646,
"<|quad_end|>": 151651,
"<|quad_start|>": 151650,
"<|repo_name|>": 151663,
"<|video_pad|>": 151656,
"<|vision_end|>": 151653,
"<|vision_pad|>": 151654,
"<|vision_start|>": 151652
}

16
all_results.json Normal file

@@ -0,0 +1,16 @@
{
"achieved_tflops_per_gpu": 26225.137110939006,
"achieved_tflops_per_gpu_theoretical": 2205322.596124113,
"epoch": 7.0,
"loss_nan_ranks": 0,
"loss_rank_avg": 0.00018654789892025292,
"mfu_percent": 1853.366580278375,
"mfu_percent_theoretical": 155853.18700523768,
"total_flos": 2.978084610098856e+18,
"train_loss": 0.0,
"train_runtime": 3.5487,
"train_samples_per_second": 62332.474,
"train_steps_per_second": 650.94,
"valid_targets_mean": 3597.5,
"valid_targets_min": 1775
}

89
chat_template.jinja Normal file

@@ -0,0 +1,89 @@
{%- if tools %}
{{- '<|im_start|>system\n' }}
{%- if messages[0].role == 'system' %}
{{- messages[0].content + '\n\n' }}
{%- endif %}
{{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
{%- for tool in tools %}
{{- "\n" }}
{{- tool | tojson }}
{%- endfor %}
{{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
{%- if messages[0].role == 'system' %}
{{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
{%- set index = (messages|length - 1) - loop.index0 %}
{%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
{%- set ns.multi_step_tool = false %}
{%- set ns.last_query_index = index %}
{%- endif %}
{%- endfor %}
{%- for message in messages %}
{%- if message.content is string %}
{%- set content = message.content %}
{%- else %}
{%- set content = '' %}
{%- endif %}
{%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
{{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
{%- elif message.role == "assistant" %}
{%- set reasoning_content = '' %}
{%- if message.reasoning_content is string %}
{%- set reasoning_content = message.reasoning_content %}
{%- else %}
{%- if '</think>' in content %}
{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
{%- set content = content.split('</think>')[-1].lstrip('\n') %}
{%- endif %}
{%- endif %}
{%- if loop.index0 > ns.last_query_index %}
{%- if loop.last or (not loop.last and reasoning_content) %}
{{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- if message.tool_calls %}
{%- for tool_call in message.tool_calls %}
{%- if (loop.first and content) or (not loop.first) %}
{{- '\n' }}
{%- endif %}
{%- if tool_call.function %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '<tool_call>\n{"name": "' }}
{{- tool_call.name }}
{{- '", "arguments": ' }}
{%- if tool_call.arguments is string %}
{{- tool_call.arguments }}
{%- else %}
{{- tool_call.arguments | tojson }}
{%- endif %}
{{- '}\n</tool_call>' }}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "tool" %}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|im_start|>user' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{{- content }}
{{- '\n</tool_response>' }}
{%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- if enable_thinking is defined and enable_thinking is false %}
{{- '<think>\n\n</think>\n\n' }}
{%- endif %}
{%- endif %}
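
To make the template's simplest path concrete, here is a rough pure-Python sketch of what it renders when no tools are passed and the conversation contains only `system` and `user` messages. The function `render_chatml` is hypothetical and deliberately ignores assistant turns, reasoning content, tool calls, and tool responses, all of which the real template handles.

```python
def render_chatml(messages, add_generation_prompt=True, enable_thinking=True):
    """Simplified rendering of the no-tools, system/user-only path."""
    parts = []
    # A leading system message is emitted once, before the main loop.
    if messages and messages[0]["role"] == "system":
        parts.append("<|im_start|>system\n" + messages[0]["content"] + "<|im_end|>\n")
    for index, message in enumerate(messages):
        role, content = message["role"], message["content"]
        if role == "system" and index == 0:
            continue  # already emitted above
        if role not in ("user", "system"):
            raise ValueError(f"role {role!r} is outside this simplified sketch")
        parts.append("<|im_start|>" + role + "\n" + content + "<|im_end|>\n")
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
        # The template adds an empty think block only when thinking is explicitly disabled.
        if enable_thinking is False:
            parts.append("<think>\n\n</think>\n\n")
    return "".join(parts)


example = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
]
print(render_chatml(example, enable_thinking=False))
```

In practice the template would be applied through the tokenizer rather than reimplemented; the sketch only illustrates the `<|im_start|>` / `<|im_end|>` framing and the effect of `enable_thinking`.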

68
config.json Normal file

@@ -0,0 +1,68 @@
{
"architectures": [
"Qwen3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"dtype": "bfloat16",
"eos_token_id": 151645,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 12288,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 40960,
"max_window_layers": 36,
"model_type": "qwen3",
"num_attention_heads": 32,
"num_hidden_layers": 36,
"num_key_value_heads": 8,
"pad_token_id": 151643,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000,
"sliding_window": null,
"tie_word_embeddings": false,
"transformers_version": "4.57.6",
"use_cache": false,
"use_sliding_window": false,
"vocab_size": 151936
}
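
A few quantities can be derived from the fields above. The sketch below is a back-of-the-envelope calculation, assuming a standard grouped-query attention layout and a bfloat16 key/value cache (2 bytes per element); the variable names are illustrative. Note that `use_cache` is `false` in this config, so the cache estimate only applies if caching is enabled at inference time.

```python
# Values copied from config.json.
config = {
    "head_dim": 128,
    "hidden_size": 4096,
    "num_attention_heads": 32,
    "num_hidden_layers": 36,
    "num_key_value_heads": 8,
    "max_position_embeddings": 40960,
}
BYTES_PER_BF16 = 2  # assumption: cache stored in the same dtype as the weights

# Total width of the query projection output.
query_width = config["num_attention_heads"] * config["head_dim"]

# Number of query heads that share each key/value head.
queries_per_kv_head = config["num_attention_heads"] // config["num_key_value_heads"]

# Keys and values (factor of 2) for every layer, per token.
kv_bytes_per_token = (
    2
    * config["num_hidden_layers"]
    * config["num_key_value_heads"]
    * config["head_dim"]
    * BYTES_PER_BF16
)

# Cache size for one sequence at the maximum position count, in GiB.
kv_gib_at_max_length = kv_bytes_per_token * config["max_position_embeddings"] / 2**30

print(query_width, queries_per_kv_head, kv_bytes_per_token, kv_gib_at_max_length)
```

Under these assumptions the query width equals `hidden_size`, four query heads share each key/value head, and a single full-length sequence needs roughly 5.6 GiB of cache.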

12
generation_config.json Normal file

@@ -0,0 +1,12 @@
{
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"temperature": 0.6,
"top_k": 20,
"top_p": 0.95,
"transformers_version": "4.57.6"
}
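
The sampling fields above combine temperature scaling with top-k and top-p (nucleus) filtering. The following is a simplified pure-Python sketch of that idea, not the `transformers` implementation; `filter_distribution` is a hypothetical helper, and it assumes the order temperature, then top-k, then top-p.

```python
import math


def filter_distribution(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep the k most likely tokens and renormalize.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    mass = sum(probs[i] for i in ranked)
    top = [(i, probs[i] / mass) for i in ranked]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for index, p in top:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize the survivors so they form a distribution to sample from.
    mass = sum(p for _, p in kept)
    return {index: p / mass for index, p in kept}


candidates = filter_distribution([2.0, 1.0, 0.0, -5.0])
print(sorted(candidates))
```

A token would then be drawn from the returned distribution, for example with `random.choices`.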

151388
merges.txt Normal file

File diff suppressed because it is too large


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f0afdbb8305e74789cc274730e28490f8b0938122222ceb812308e867ebb2dc2
size 4902257696


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f4582e3a0956bcfb120d283b390b5f0b22fdd6781116d0641f719d4bd6e6245e
size 4915960368


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4b077a4dcce4bc127c99fb880b5df9e57c1884520b1c7ebb55c74a1d2f035964
size 4983068496


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:83a20000fe0a49a5cdd27577fab3638b585b97076bed14e97464f8a6202d03df
size 1580230264
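
The four entries above are Git LFS pointer files: three `key value` lines giving the spec version, the SHA-256 object ID, and the object size in bytes. A minimal sketch of parsing them and summing the sizes follows; `parse_lfs_pointer` is a hypothetical helper. The comparison against `total_size` from the weight index is illustrative, and attributing the small difference to per-file header overhead is an assumption.

```python
POINTERS = [
    """version https://git-lfs.github.com/spec/v1
oid sha256:f0afdbb8305e74789cc274730e28490f8b0938122222ceb812308e867ebb2dc2
size 4902257696
""",
    """version https://git-lfs.github.com/spec/v1
oid sha256:f4582e3a0956bcfb120d283b390b5f0b22fdd6781116d0641f719d4bd6e6245e
size 4915960368
""",
    """version https://git-lfs.github.com/spec/v1
oid sha256:4b077a4dcce4bc127c99fb880b5df9e57c1884520b1c7ebb55c74a1d2f035964
size 4983068496
""",
    """version https://git-lfs.github.com/spec/v1
oid sha256:83a20000fe0a49a5cdd27577fab3638b585b97076bed14e97464f8a6202d03df
size 1580230264
""",
]


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "oid": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),
    }


parsed = [parse_lfs_pointer(p) for p in POINTERS]
total_bytes = sum(entry["size"] for entry in parsed)

# total_size as recorded in the weight index metadata.
INDEX_TOTAL_SIZE = 16381470720
overhead_bytes = total_bytes - INDEX_TOTAL_SIZE  # assumed to be file headers

print(total_bytes, overhead_bytes)
```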


@@ -0,0 +1,407 @@
{
"metadata": {
"total_parameters": 308224,
"total_size": 16381470720
},
"weight_map": {
"lm_head.weight": "model-00004-of-00004.safetensors",
"model.embed_tokens.weight": "model-00001-of-00004.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.10.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.2.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.20.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.22.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.23.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.3.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.30.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.input_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.35.input_layernorm.weight": "model-00004-of-00004.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
"model.layers.35.self_attn.k_norm.weight": "model-00004-of-00004.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
"model.layers.35.self_attn.q_norm.weight": "model-00004-of-00004.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
"model.layers.4.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.input_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.input_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
"model.layers.9.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.norm.weight": "model-00004-of-00004.safetensors"
}
}
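
The mapping above assigns each tensor to one of four shard files, and a few layers (for example layers 9 and 35) have tensors in two different shards. The following is a minimal sketch, not part of the repository, of how such split layers could be listed. It assumes the mapping has already been loaded into a plain dict; `shards_per_layer`, `split_layers`, and `sample` are hypothetical names, and `sample` copies only a handful of entries from the listing.

```python
import re
from collections import defaultdict

# Matches tensor names such as "model.layers.35.self_attn.k_proj.weight".
LAYER_RE = re.compile(r"^model\.layers\.(\d+)\.")


def shards_per_layer(weight_map):
    """Map each layer number to the set of shard files holding its tensors."""
    layers = defaultdict(set)
    for tensor_name, shard_file in weight_map.items():
        match = LAYER_RE.match(tensor_name)
        if match:  # skip non-layer tensors such as "model.norm.weight"
            layers[int(match.group(1))].add(shard_file)
    return dict(layers)


def split_layers(weight_map):
    """Return the sorted layer numbers whose tensors span more than one shard."""
    return sorted(
        layer for layer, shards in shards_per_layer(weight_map).items()
        if len(shards) > 1
    )


# Hypothetical excerpt: a few entries copied from the listing above.
sample = {
    "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.9.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.9.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.35.input_layernorm.weight": "model-00004-of-00004.safetensors",
    "model.layers.35.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.norm.weight": "model-00004-of-00004.safetensors",
}

print(split_layers(sample))  # [9, 35]
```

A layer that spans two shards is expected when a checkpoint is cut by size rather than by layer, so the output is informational and does not by itself indicate a broken index.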

12
run_summary.json Normal file

@@ -0,0 +1,12 @@
{
"agent_name": "68e1b38fd891a5a7c593dfcf25d1109f2dec75a5_thinking_preprocessed",
"training_start": null,
"training_end": null,
"created_by": "DCAgent",
"base_model_name": "Qwen/Qwen3-8B",
"dataset_name": "/e/data1/datasets/playground/ot/hf_hub/datasets--laion--r2egym-unified-31600/snapshots/68e1b38fd891a5a7c593dfcf25d1109f2dec75a5_thinking_preprocessed",
"training_type": "SFT",
"training_parameters": "https://huggingface.co/laion/r2egym-unified-31600__Qwen3-8B/blob/main/config.json",
"wandb_link": null,
"traces_location_s3": null
}

31
special_tokens_map.json Normal file

@@ -0,0 +1,31 @@
{
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"eos_token": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

BIN
tokenizer.json (Stored with Git LFS) Normal file

Binary file not shown.

240
tokenizer_config.json Normal file

@@ -0,0 +1,240 @@
{
"add_bos_token": false,
"add_prefix_space": false,
"added_tokens_decoder": {
"151643": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151644": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151645": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151646": {
"content": "<|object_ref_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151647": {
"content": "<|object_ref_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151648": {
"content": "<|box_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151649": {
"content": "<|box_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151650": {
"content": "<|quad_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151651": {
"content": "<|quad_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151652": {
"content": "<|vision_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151653": {
"content": "<|vision_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151654": {
"content": "<|vision_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151655": {
"content": "<|image_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151656": {
"content": "<|video_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151657": {
"content": "<tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151658": {
"content": "</tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151659": {
"content": "<|fim_prefix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151660": {
"content": "<|fim_middle|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151661": {
"content": "<|fim_suffix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151662": {
"content": "<|fim_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151663": {
"content": "<|repo_name|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151664": {
"content": "<|file_sep|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151665": {
"content": "<tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151666": {
"content": "</tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151667": {
"content": "<think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151668": {
"content": "</think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"bos_token": null,
"clean_up_tokenization_spaces": false,
"eos_token": "<|im_end|>",
"errors": "replace",
"extra_special_tokens": {},
"model_max_length": 32768,
"pad_token": "<|endoftext|>",
"padding_side": "right",
"split_special_tokens": false,
"tokenizer_class": "Qwen2Tokenizer",
"unk_token": null
}
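
The config above names `eos_token` and `pad_token` by their string content, while `added_tokens_decoder` is keyed by token ID. The sketch below shows one way to resolve the two names back to IDs using only the standard library. It is an illustration under stated assumptions: `token_ids_by_content` and `excerpt` are hypothetical names, and `excerpt` reproduces only three of the entries listed above.

```python
def token_ids_by_content(config):
    """Invert added_tokens_decoder: token content string -> integer token ID."""
    return {
        entry["content"]: int(token_id)
        for token_id, entry in config["added_tokens_decoder"].items()
    }


# Hypothetical excerpt: three entries copied from the config above.
excerpt = {
    "added_tokens_decoder": {
        "151643": {"content": "<|endoftext|>", "special": True},
        "151645": {"content": "<|im_end|>", "special": True},
        "151667": {"content": "<think>", "special": False},
    },
    "eos_token": "<|im_end|>",
    "pad_token": "<|endoftext|>",
}

ids = token_ids_by_content(excerpt)
eos_id = ids[excerpt["eos_token"]]
pad_id = ids[excerpt["pad_token"]]

print(eos_id, pad_id)  # 151645 151643
```

The same lookup also makes it easy to see that `<think>` and `</think>` are registered with `"special": false`, unlike the `<|im_start|>` and `<|im_end|>` markers.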

12
train_results.json Normal file

@@ -0,0 +1,12 @@
{
"achieved_tflops_per_gpu": 26225.137110939006,
"achieved_tflops_per_gpu_theoretical": 2205322.596124113,
"epoch": 7.0,
"mfu_percent": 1853.366580278375,
"mfu_percent_theoretical": 155853.18700523768,
"total_flos": 2.978084610098856e+18,
"train_loss": 0.0,
"train_runtime": 3.5487,
"train_samples_per_second": 62332.474,
"train_steps_per_second": 650.94
}
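
The figures above deserve a sanity check before they are quoted: an MFU above 100% is not physically meaningful, and a `train_runtime` of about 3.5 seconds is far shorter than the hour-plus elapsed times recorded in `trainer_log.jsonl`. The sketch below recomputes the step count implied by the reported throughput. It is a rough check, not the tool that produced the file; `implied_steps` and `suspicious_fields` are hypothetical helpers, and the values are copied from the JSON above.

```python
# Values copied from train_results.json above.
train_results = {
    "mfu_percent": 1853.366580278375,
    "train_loss": 0.0,
    "train_runtime": 3.5487,
    "train_steps_per_second": 650.94,
}


def implied_steps(results):
    """Steps implied by multiplying reported throughput by reported runtime."""
    return results["train_steps_per_second"] * results["train_runtime"]


def suspicious_fields(results):
    """Names of fields whose values cannot be taken at face value."""
    flags = []
    if results["mfu_percent"] > 100.0:  # utilization cannot exceed 100%
        flags.append("mfu_percent")
    if results["train_loss"] == 0.0:  # logged losses in the trainer log are nonzero
        flags.append("train_loss")
    return flags


print(round(implied_steps(train_results)))  # 2310
print(suspicious_fields(train_results))  # ['mfu_percent', 'train_loss']
```

The implied count matches the `total_steps` value of 2310 in the trainer log, which suggests the throughput was computed by dividing the full step count by a runtime that covers only a short final invocation. One possible explanation, which the source does not confirm, is that the run was resumed from a checkpoint; either way the per-second and MFU numbers are best treated as unreliable.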

468
trainer_log.jsonl Normal file

@@ -0,0 +1,468 @@
{"current_steps": 5, "total_steps": 2310, "loss": 0.3843, "lr": 6.926406926406927e-07, "epoch": 0.015182186234817813, "percentage": 0.22, "elapsed_time": "0:01:11", "remaining_time": "9:07:28"}
{"current_steps": 10, "total_steps": 2310, "loss": 0.3659, "lr": 1.5584415584415584e-06, "epoch": 0.030364372469635626, "percentage": 0.43, "elapsed_time": "0:02:13", "remaining_time": "8:33:20"}
{"current_steps": 15, "total_steps": 2310, "loss": 0.3588, "lr": 2.4242424242424244e-06, "epoch": 0.04554655870445344, "percentage": 0.65, "elapsed_time": "0:03:14", "remaining_time": "8:17:03"}
{"current_steps": 20, "total_steps": 2310, "loss": 0.3206, "lr": 3.2900432900432905e-06, "epoch": 0.06072874493927125, "percentage": 0.87, "elapsed_time": "0:04:16", "remaining_time": "8:09:06"}
{"current_steps": 25, "total_steps": 2310, "loss": 0.319, "lr": 4.155844155844157e-06, "epoch": 0.07591093117408906, "percentage": 1.08, "elapsed_time": "0:05:13", "remaining_time": "7:57:41"}
{"current_steps": 30, "total_steps": 2310, "loss": 0.2854, "lr": 5.021645021645022e-06, "epoch": 0.09109311740890688, "percentage": 1.3, "elapsed_time": "0:06:13", "remaining_time": "7:52:44"}
{"current_steps": 35, "total_steps": 2310, "loss": 0.2563, "lr": 5.887445887445888e-06, "epoch": 0.1062753036437247, "percentage": 1.52, "elapsed_time": "0:07:13", "remaining_time": "7:49:24"}
{"current_steps": 40, "total_steps": 2310, "loss": 0.2445, "lr": 6.753246753246754e-06, "epoch": 0.1214574898785425, "percentage": 1.73, "elapsed_time": "0:08:13", "remaining_time": "7:46:41"}
{"current_steps": 45, "total_steps": 2310, "loss": 0.2336, "lr": 7.61904761904762e-06, "epoch": 0.13663967611336034, "percentage": 1.95, "elapsed_time": "0:09:10", "remaining_time": "7:42:05"}
{"current_steps": 50, "total_steps": 2310, "loss": 0.2085, "lr": 8.484848484848486e-06, "epoch": 0.15182186234817813, "percentage": 2.16, "elapsed_time": "0:10:08", "remaining_time": "7:38:22"}
{"current_steps": 55, "total_steps": 2310, "loss": 0.1907, "lr": 9.350649350649352e-06, "epoch": 0.16700404858299595, "percentage": 2.38, "elapsed_time": "0:11:13", "remaining_time": "7:40:01"}
{"current_steps": 60, "total_steps": 2310, "loss": 0.1762, "lr": 1.0216450216450218e-05, "epoch": 0.18218623481781376, "percentage": 2.6, "elapsed_time": "0:12:16", "remaining_time": "7:40:01"}
{"current_steps": 65, "total_steps": 2310, "loss": 0.1681, "lr": 1.1082251082251083e-05, "epoch": 0.19736842105263158, "percentage": 2.81, "elapsed_time": "0:13:15", "remaining_time": "7:37:46"}
{"current_steps": 70, "total_steps": 2310, "loss": 0.1635, "lr": 1.1948051948051949e-05, "epoch": 0.2125506072874494, "percentage": 3.03, "elapsed_time": "0:14:17", "remaining_time": "7:37:04"}
{"current_steps": 75, "total_steps": 2310, "loss": 0.1582, "lr": 1.2813852813852813e-05, "epoch": 0.22773279352226722, "percentage": 3.25, "elapsed_time": "0:15:19", "remaining_time": "7:36:48"}
{"current_steps": 80, "total_steps": 2310, "loss": 0.1562, "lr": 1.3679653679653682e-05, "epoch": 0.242914979757085, "percentage": 3.46, "elapsed_time": "0:16:19", "remaining_time": "7:35:16"}
{"current_steps": 85, "total_steps": 2310, "loss": 0.1534, "lr": 1.4545454545454546e-05, "epoch": 0.25809716599190285, "percentage": 3.68, "elapsed_time": "0:17:16", "remaining_time": "7:32:01"}
{"current_steps": 90, "total_steps": 2310, "loss": 0.1553, "lr": 1.5411255411255414e-05, "epoch": 0.2732793522267207, "percentage": 3.9, "elapsed_time": "0:18:14", "remaining_time": "7:30:02"}
{"current_steps": 95, "total_steps": 2310, "loss": 0.1357, "lr": 1.6277056277056278e-05, "epoch": 0.28846153846153844, "percentage": 4.11, "elapsed_time": "0:19:20", "remaining_time": "7:30:52"}
{"current_steps": 100, "total_steps": 2310, "loss": 0.1429, "lr": 1.7142857142857142e-05, "epoch": 0.30364372469635625, "percentage": 4.33, "elapsed_time": "0:20:13", "remaining_time": "7:27:00"}
{"current_steps": 105, "total_steps": 2310, "loss": 0.1355, "lr": 1.800865800865801e-05, "epoch": 0.3188259109311741, "percentage": 4.55, "elapsed_time": "0:21:11", "remaining_time": "7:25:04"}
{"current_steps": 110, "total_steps": 2310, "loss": 0.1294, "lr": 1.8874458874458877e-05, "epoch": 0.3340080971659919, "percentage": 4.76, "elapsed_time": "0:22:11", "remaining_time": "7:23:53"}
{"current_steps": 115, "total_steps": 2310, "loss": 0.1295, "lr": 1.974025974025974e-05, "epoch": 0.3491902834008097, "percentage": 4.98, "elapsed_time": "0:23:13", "remaining_time": "7:23:15"}
{"current_steps": 120, "total_steps": 2310, "loss": 0.1303, "lr": 2.0606060606060608e-05, "epoch": 0.3643724696356275, "percentage": 5.19, "elapsed_time": "0:24:12", "remaining_time": "7:21:39"}
{"current_steps": 125, "total_steps": 2310, "loss": 0.1233, "lr": 2.1471861471861476e-05, "epoch": 0.37955465587044535, "percentage": 5.41, "elapsed_time": "0:25:11", "remaining_time": "7:20:28"}
{"current_steps": 130, "total_steps": 2310, "loss": 0.1235, "lr": 2.233766233766234e-05, "epoch": 0.39473684210526316, "percentage": 5.63, "elapsed_time": "0:26:14", "remaining_time": "7:19:57"}
{"current_steps": 135, "total_steps": 2310, "loss": 0.1209, "lr": 2.3203463203463207e-05, "epoch": 0.409919028340081, "percentage": 5.84, "elapsed_time": "0:27:17", "remaining_time": "7:19:42"}
{"current_steps": 140, "total_steps": 2310, "loss": 0.123, "lr": 2.4069264069264074e-05, "epoch": 0.4251012145748988, "percentage": 6.06, "elapsed_time": "0:28:19", "remaining_time": "7:18:56"}
{"current_steps": 145, "total_steps": 2310, "loss": 0.1163, "lr": 2.4935064935064935e-05, "epoch": 0.4402834008097166, "percentage": 6.28, "elapsed_time": "0:29:17", "remaining_time": "7:17:20"}
{"current_steps": 150, "total_steps": 2310, "loss": 0.1192, "lr": 2.5800865800865803e-05, "epoch": 0.45546558704453444, "percentage": 6.49, "elapsed_time": "0:30:11", "remaining_time": "7:14:48"}
{"current_steps": 155, "total_steps": 2310, "loss": 0.1194, "lr": 2.6666666666666667e-05, "epoch": 0.4706477732793522, "percentage": 6.71, "elapsed_time": "0:31:12", "remaining_time": "7:13:56"}
{"current_steps": 160, "total_steps": 2310, "loss": 0.1139, "lr": 2.7532467532467534e-05, "epoch": 0.48582995951417, "percentage": 6.93, "elapsed_time": "0:32:20", "remaining_time": "7:14:32"}
{"current_steps": 165, "total_steps": 2310, "loss": 0.1135, "lr": 2.83982683982684e-05, "epoch": 0.5010121457489879, "percentage": 7.14, "elapsed_time": "0:33:20", "remaining_time": "7:13:26"}
{"current_steps": 170, "total_steps": 2310, "loss": 0.114, "lr": 2.9264069264069265e-05, "epoch": 0.5161943319838057, "percentage": 7.36, "elapsed_time": "0:34:22", "remaining_time": "7:12:44"}
{"current_steps": 175, "total_steps": 2310, "loss": 0.1078, "lr": 3.0129870129870133e-05, "epoch": 0.5313765182186235, "percentage": 7.58, "elapsed_time": "0:35:25", "remaining_time": "7:12:12"}
{"current_steps": 180, "total_steps": 2310, "loss": 0.1071, "lr": 3.0995670995671e-05, "epoch": 0.5465587044534413, "percentage": 7.79, "elapsed_time": "0:36:23", "remaining_time": "7:10:43"}
{"current_steps": 185, "total_steps": 2310, "loss": 0.1118, "lr": 3.1861471861471864e-05, "epoch": 0.5617408906882592, "percentage": 8.01, "elapsed_time": "0:37:16", "remaining_time": "7:08:13"}
{"current_steps": 190, "total_steps": 2310, "loss": 0.105, "lr": 3.272727272727273e-05, "epoch": 0.5769230769230769, "percentage": 8.23, "elapsed_time": "0:38:15", "remaining_time": "7:06:51"}
{"current_steps": 195, "total_steps": 2310, "loss": 0.1036, "lr": 3.359307359307359e-05, "epoch": 0.5921052631578947, "percentage": 8.44, "elapsed_time": "0:39:15", "remaining_time": "7:05:47"}
{"current_steps": 200, "total_steps": 2310, "loss": 0.104, "lr": 3.445887445887446e-05, "epoch": 0.6072874493927125, "percentage": 8.66, "elapsed_time": "0:40:13", "remaining_time": "7:04:17"}
{"current_steps": 205, "total_steps": 2310, "loss": 0.1069, "lr": 3.532467532467533e-05, "epoch": 0.6224696356275303, "percentage": 8.87, "elapsed_time": "0:41:11", "remaining_time": "7:02:58"}
{"current_steps": 210, "total_steps": 2310, "loss": 0.101, "lr": 3.6190476190476195e-05, "epoch": 0.6376518218623481, "percentage": 9.09, "elapsed_time": "0:42:14", "remaining_time": "7:02:29"}
{"current_steps": 215, "total_steps": 2310, "loss": 0.0997, "lr": 3.705627705627706e-05, "epoch": 0.652834008097166, "percentage": 9.31, "elapsed_time": "0:43:14", "remaining_time": "7:01:16"}
{"current_steps": 220, "total_steps": 2310, "loss": 0.0971, "lr": 3.792207792207792e-05, "epoch": 0.6680161943319838, "percentage": 9.52, "elapsed_time": "0:44:17", "remaining_time": "7:00:46"}
{"current_steps": 225, "total_steps": 2310, "loss": 0.0964, "lr": 3.878787878787879e-05, "epoch": 0.6831983805668016, "percentage": 9.74, "elapsed_time": "0:45:12", "remaining_time": "6:58:55"}
{"current_steps": 230, "total_steps": 2310, "loss": 0.0926, "lr": 3.965367965367966e-05, "epoch": 0.6983805668016194, "percentage": 9.96, "elapsed_time": "0:46:11", "remaining_time": "6:57:40"}
{"current_steps": 235, "total_steps": 2310, "loss": 0.0949, "lr": 3.999979449020199e-05, "epoch": 0.7135627530364372, "percentage": 10.17, "elapsed_time": "0:47:07", "remaining_time": "6:56:08"}
{"current_steps": 240, "total_steps": 2310, "loss": 0.0918, "lr": 3.99985386122866e-05, "epoch": 0.728744939271255, "percentage": 10.39, "elapsed_time": "0:48:06", "remaining_time": "6:54:57"}
{"current_steps": 245, "total_steps": 2310, "loss": 0.0911, "lr": 3.999614110017182e-05, "epoch": 0.7439271255060729, "percentage": 10.61, "elapsed_time": "0:49:04", "remaining_time": "6:53:35"}
{"current_steps": 250, "total_steps": 2310, "loss": 0.0902, "lr": 3.999260209072175e-05, "epoch": 0.7591093117408907, "percentage": 10.82, "elapsed_time": "0:49:59", "remaining_time": "6:51:56"}
{"current_steps": 255, "total_steps": 2310, "loss": 0.0884, "lr": 3.998792178596384e-05, "epoch": 0.7742914979757085, "percentage": 11.04, "elapsed_time": "0:50:57", "remaining_time": "6:50:42"}
{"current_steps": 260, "total_steps": 2310, "loss": 0.0831, "lr": 3.998210045307744e-05, "epoch": 0.7894736842105263, "percentage": 11.26, "elapsed_time": "0:51:57", "remaining_time": "6:49:42"}
{"current_steps": 265, "total_steps": 2310, "loss": 0.0866, "lr": 3.997513842437845e-05, "epoch": 0.8046558704453441, "percentage": 11.47, "elapsed_time": "0:52:51", "remaining_time": "6:47:55"}
{"current_steps": 270, "total_steps": 2310, "loss": 0.0808, "lr": 3.996703609730042e-05, "epoch": 0.819838056680162, "percentage": 11.69, "elapsed_time": "0:53:50", "remaining_time": "6:46:49"}
{"current_steps": 275, "total_steps": 2310, "loss": 0.0813, "lr": 3.995779393437183e-05, "epoch": 0.8350202429149798, "percentage": 11.9, "elapsed_time": "0:54:49", "remaining_time": "6:45:43"}
{"current_steps": 280, "total_steps": 2310, "loss": 0.0758, "lr": 3.99474124631897e-05, "epoch": 0.8502024291497976, "percentage": 12.12, "elapsed_time": "0:55:50", "remaining_time": "6:44:54"}
{"current_steps": 285, "total_steps": 2310, "loss": 0.0791, "lr": 3.993589227638944e-05, "epoch": 0.8653846153846154, "percentage": 12.34, "elapsed_time": "0:56:48", "remaining_time": "6:43:38"}
{"current_steps": 290, "total_steps": 2310, "loss": 0.0763, "lr": 3.992323403161109e-05, "epoch": 0.8805668016194332, "percentage": 12.55, "elapsed_time": "0:57:51", "remaining_time": "6:42:59"}
{"current_steps": 295, "total_steps": 2310, "loss": 0.0745, "lr": 3.9909438451461695e-05, "epoch": 0.895748987854251, "percentage": 12.77, "elapsed_time": "0:58:46", "remaining_time": "6:41:27"}
{"current_steps": 300, "total_steps": 2310, "loss": 0.0734, "lr": 3.989450632347411e-05, "epoch": 0.9109311740890689, "percentage": 12.99, "elapsed_time": "0:59:44", "remaining_time": "6:40:18"}
{"current_steps": 305, "total_steps": 2310, "loss": 0.0712, "lr": 3.987843850006201e-05, "epoch": 0.9261133603238867, "percentage": 13.2, "elapsed_time": "1:00:51", "remaining_time": "6:40:07"}
{"current_steps": 310, "total_steps": 2310, "loss": 0.0729, "lr": 3.9861235898471265e-05, "epoch": 0.9412955465587044, "percentage": 13.42, "elapsed_time": "1:01:53", "remaining_time": "6:39:15"}
{"current_steps": 315, "total_steps": 2310, "loss": 0.0713, "lr": 3.984289950072754e-05, "epoch": 0.9564777327935222, "percentage": 13.64, "elapsed_time": "1:02:52", "remaining_time": "6:38:11"}
{"current_steps": 320, "total_steps": 2310, "loss": 0.0651, "lr": 3.982343035358026e-05, "epoch": 0.97165991902834, "percentage": 13.85, "elapsed_time": "1:03:52", "remaining_time": "6:37:12"}
{"current_steps": 325, "total_steps": 2310, "loss": 0.0698, "lr": 3.980282956844284e-05, "epoch": 0.9868421052631579, "percentage": 14.07, "elapsed_time": "1:04:46", "remaining_time": "6:35:34"}
{"current_steps": 330, "total_steps": 2310, "loss": 0.0643, "lr": 3.9781098321329266e-05, "epoch": 1.0, "percentage": 14.29, "elapsed_time": "1:05:36", "remaining_time": "6:33:38"}
{"current_steps": 335, "total_steps": 2310, "loss": 0.0621, "lr": 3.975823785278691e-05, "epoch": 1.0151821862348178, "percentage": 14.5, "elapsed_time": "1:06:33", "remaining_time": "6:32:26"}
{"current_steps": 340, "total_steps": 2310, "loss": 0.0598, "lr": 3.973424946782578e-05, "epoch": 1.0303643724696356, "percentage": 14.72, "elapsed_time": "1:07:27", "remaining_time": "6:30:53"}
{"current_steps": 345, "total_steps": 2310, "loss": 0.059, "lr": 3.970913453584397e-05, "epoch": 1.0455465587044535, "percentage": 14.94, "elapsed_time": "1:08:25", "remaining_time": "6:29:41"}
{"current_steps": 350, "total_steps": 2310, "loss": 0.0571, "lr": 3.9682894490549485e-05, "epoch": 1.0607287449392713, "percentage": 15.15, "elapsed_time": "1:09:24", "remaining_time": "6:28:42"}
{"current_steps": 355, "total_steps": 2310, "loss": 0.0577, "lr": 3.965553082987846e-05, "epoch": 1.075910931174089, "percentage": 15.37, "elapsed_time": "1:10:19", "remaining_time": "6:27:18"}
{"current_steps": 360, "total_steps": 2310, "loss": 0.0529, "lr": 3.9627045115909556e-05, "epoch": 1.091093117408907, "percentage": 15.58, "elapsed_time": "1:11:17", "remaining_time": "6:26:08"}
{"current_steps": 365, "total_steps": 2310, "loss": 0.0555, "lr": 3.959743897477486e-05, "epoch": 1.1062753036437247, "percentage": 15.8, "elapsed_time": "1:12:12", "remaining_time": "6:24:46"}
{"current_steps": 370, "total_steps": 2310, "loss": 0.0507, "lr": 3.9566714096567035e-05, "epoch": 1.1214574898785425, "percentage": 16.02, "elapsed_time": "1:13:08", "remaining_time": "6:23:29"}
{"current_steps": 375, "total_steps": 2310, "loss": 0.0497, "lr": 3.953487223524283e-05, "epoch": 1.1366396761133604, "percentage": 16.23, "elapsed_time": "1:14:06", "remaining_time": "6:22:23"}
{"current_steps": 380, "total_steps": 2310, "loss": 0.0472, "lr": 3.950191520852294e-05, "epoch": 1.1518218623481782, "percentage": 16.45, "elapsed_time": "1:15:03", "remaining_time": "6:21:14"}
{"current_steps": 385, "total_steps": 2310, "loss": 0.048, "lr": 3.94678448977883e-05, "epoch": 1.167004048582996, "percentage": 16.67, "elapsed_time": "1:16:03", "remaining_time": "6:20:16"}
{"current_steps": 390, "total_steps": 2310, "loss": 0.0446, "lr": 3.9432663247972614e-05, "epoch": 1.1821862348178138, "percentage": 16.88, "elapsed_time": "1:16:54", "remaining_time": "6:18:38"}
{"current_steps": 395, "total_steps": 2310, "loss": 0.0415, "lr": 3.9396372267451356e-05, "epoch": 1.1973684210526316, "percentage": 17.1, "elapsed_time": "1:17:56", "remaining_time": "6:17:53"}
{"current_steps": 400, "total_steps": 2310, "loss": 0.0426, "lr": 3.935897402792713e-05, "epoch": 1.2125506072874495, "percentage": 17.32, "elapsed_time": "1:18:51", "remaining_time": "6:16:33"}
{"current_steps": 405, "total_steps": 2310, "loss": 0.0397, "lr": 3.93204706643114e-05, "epoch": 1.2277327935222673, "percentage": 17.53, "elapsed_time": "1:19:51", "remaining_time": "6:15:39"}
{"current_steps": 410, "total_steps": 2310, "loss": 0.0408, "lr": 3.92808643746026e-05, "epoch": 1.242914979757085, "percentage": 17.75, "elapsed_time": "1:20:47", "remaining_time": "6:14:24"}
{"current_steps": 415, "total_steps": 2310, "loss": 0.0419, "lr": 3.924015741976069e-05, "epoch": 1.258097165991903, "percentage": 17.97, "elapsed_time": "1:21:46", "remaining_time": "6:13:24"}
{"current_steps": 420, "total_steps": 2310, "loss": 0.0379, "lr": 3.919835212357805e-05, "epoch": 1.2732793522267207, "percentage": 18.18, "elapsed_time": "1:22:42", "remaining_time": "6:12:09"}
{"current_steps": 425, "total_steps": 2310, "loss": 0.0363, "lr": 3.915545087254685e-05, "epoch": 1.2884615384615383, "percentage": 18.4, "elapsed_time": "1:23:38", "remaining_time": "6:10:56"}
{"current_steps": 430, "total_steps": 2310, "loss": 0.0365, "lr": 3.911145611572282e-05, "epoch": 1.3036437246963564, "percentage": 18.61, "elapsed_time": "1:24:36", "remaining_time": "6:09:52"}
{"current_steps": 435, "total_steps": 2310, "loss": 0.0364, "lr": 3.906637036458541e-05, "epoch": 1.318825910931174, "percentage": 18.83, "elapsed_time": "1:25:33", "remaining_time": "6:08:45"}
{"current_steps": 440, "total_steps": 2310, "loss": 0.0332, "lr": 3.902019619289446e-05, "epoch": 1.334008097165992, "percentage": 19.05, "elapsed_time": "1:26:35", "remaining_time": "6:07:59"}
{"current_steps": 445, "total_steps": 2310, "loss": 0.032, "lr": 3.897293623654325e-05, "epoch": 1.3491902834008096, "percentage": 19.26, "elapsed_time": "1:27:35", "remaining_time": "6:07:03"}
{"current_steps": 450, "total_steps": 2310, "loss": 0.0315, "lr": 3.892459319340802e-05, "epoch": 1.3643724696356276, "percentage": 19.48, "elapsed_time": "1:28:32", "remaining_time": "6:05:57"}
{"current_steps": 455, "total_steps": 2310, "loss": 0.0328, "lr": 3.8875169823194e-05, "epoch": 1.3795546558704452, "percentage": 19.7, "elapsed_time": "1:29:27", "remaining_time": "6:04:42"}
{"current_steps": 460, "total_steps": 2310, "loss": 0.029, "lr": 3.882466894727784e-05, "epoch": 1.3947368421052633, "percentage": 19.91, "elapsed_time": "1:30:29", "remaining_time": "6:03:54"}
{"current_steps": 465, "total_steps": 2310, "loss": 0.0305, "lr": 3.8773093448546525e-05, "epoch": 1.4099190283400809, "percentage": 20.13, "elapsed_time": "1:31:26", "remaining_time": "6:02:50"}
{"current_steps": 470, "total_steps": 2310, "loss": 0.0277, "lr": 3.872044627123285e-05, "epoch": 1.425101214574899, "percentage": 20.35, "elapsed_time": "1:32:24", "remaining_time": "6:01:47"}
{"current_steps": 475, "total_steps": 2310, "loss": 0.0252, "lr": 3.8666730420747336e-05, "epoch": 1.4402834008097165, "percentage": 20.56, "elapsed_time": "1:33:27", "remaining_time": "6:01:02"}
{"current_steps": 480, "total_steps": 2310, "loss": 0.025, "lr": 3.861194896350664e-05, "epoch": 1.4554655870445345, "percentage": 20.78, "elapsed_time": "1:34:25", "remaining_time": "5:59:59"}
{"current_steps": 485, "total_steps": 2310, "loss": 0.0258, "lr": 3.855610502675851e-05, "epoch": 1.4706477732793521, "percentage": 21.0, "elapsed_time": "1:35:18", "remaining_time": "5:58:37"}
{"current_steps": 490, "total_steps": 2310, "loss": 0.0237, "lr": 3.8499201798403303e-05, "epoch": 1.48582995951417, "percentage": 21.21, "elapsed_time": "1:36:11", "remaining_time": "5:57:15"}
{"current_steps": 495, "total_steps": 2310, "loss": 0.0242, "lr": 3.8441242526811934e-05, "epoch": 1.5010121457489878, "percentage": 21.43, "elapsed_time": "1:37:11", "remaining_time": "5:56:22"}
{"current_steps": 500, "total_steps": 2310, "loss": 0.0227, "lr": 3.8382230520640506e-05, "epoch": 1.5161943319838058, "percentage": 21.65, "elapsed_time": "1:38:11", "remaining_time": "5:55:28"}
{"current_steps": 505, "total_steps": 2310, "loss": 0.0224, "lr": 3.832216914864139e-05, "epoch": 1.5313765182186234, "percentage": 21.86, "elapsed_time": "1:39:07", "remaining_time": "5:54:18"}
{"current_steps": 510, "total_steps": 2310, "loss": 0.0224, "lr": 3.826106183947095e-05, "epoch": 1.5465587044534415, "percentage": 22.08, "elapsed_time": "1:40:06", "remaining_time": "5:53:18"}
{"current_steps": 515, "total_steps": 2310, "loss": 0.0199, "lr": 3.8198912081493735e-05, "epoch": 1.561740890688259, "percentage": 22.29, "elapsed_time": "1:41:02", "remaining_time": "5:52:08"}
{"current_steps": 520, "total_steps": 2310, "loss": 0.0216, "lr": 3.813572342258349e-05, "epoch": 1.5769230769230769, "percentage": 22.51, "elapsed_time": "1:41:53", "remaining_time": "5:50:45"}
{"current_steps": 525, "total_steps": 2310, "loss": 0.0185, "lr": 3.807149946992047e-05, "epoch": 1.5921052631578947, "percentage": 22.73, "elapsed_time": "1:42:49", "remaining_time": "5:49:36"}
{"current_steps": 530, "total_steps": 2310, "loss": 0.0198, "lr": 3.800624388978561e-05, "epoch": 1.6072874493927125, "percentage": 22.94, "elapsed_time": "1:43:44", "remaining_time": "5:48:25"}
{"current_steps": 535, "total_steps": 2310, "loss": 0.0198, "lr": 3.793996040735119e-05, "epoch": 1.6224696356275303, "percentage": 23.16, "elapsed_time": "1:44:37", "remaining_time": "5:47:08"}
{"current_steps": 540, "total_steps": 2310, "loss": 0.0181, "lr": 3.787265280646825e-05, "epoch": 1.6376518218623481, "percentage": 23.38, "elapsed_time": "1:45:35", "remaining_time": "5:46:06"}
{"current_steps": 545, "total_steps": 2310, "loss": 0.0177, "lr": 3.7804324929450465e-05, "epoch": 1.652834008097166, "percentage": 23.59, "elapsed_time": "1:46:30", "remaining_time": "5:44:54"}
{"current_steps": 550, "total_steps": 2310, "loss": 0.0144, "lr": 3.7734980676854916e-05, "epoch": 1.6680161943319838, "percentage": 23.81, "elapsed_time": "1:47:27", "remaining_time": "5:43:52"}
{"current_steps": 555, "total_steps": 2310, "loss": 0.0161, "lr": 3.7664624007259375e-05, "epoch": 1.6831983805668016, "percentage": 24.03, "elapsed_time": "1:48:20", "remaining_time": "5:42:35"}
{"current_steps": 560, "total_steps": 2310, "loss": 0.0153, "lr": 3.759325893703631e-05, "epoch": 1.6983805668016194, "percentage": 24.24, "elapsed_time": "1:49:15", "remaining_time": "5:41:26"}
{"current_steps": 565, "total_steps": 2310, "loss": 0.0154, "lr": 3.752088954012366e-05, "epoch": 1.7135627530364372, "percentage": 24.46, "elapsed_time": "1:50:12", "remaining_time": "5:40:21"}
{"current_steps": 570, "total_steps": 2310, "loss": 0.0142, "lr": 3.74475199477922e-05, "epoch": 1.728744939271255, "percentage": 24.68, "elapsed_time": "1:51:06", "remaining_time": "5:39:10"}
{"current_steps": 575, "total_steps": 2310, "loss": 0.0138, "lr": 3.7373154348409775e-05, "epoch": 1.7439271255060729, "percentage": 24.89, "elapsed_time": "1:51:58", "remaining_time": "5:37:53"}
{"current_steps": 580, "total_steps": 2310, "loss": 0.0137, "lr": 3.729779698720215e-05, "epoch": 1.7591093117408907, "percentage": 25.11, "elapsed_time": "1:52:57", "remaining_time": "5:36:54"}
{"current_steps": 585, "total_steps": 2310, "loss": 0.0137, "lr": 3.7221452166010704e-05, "epoch": 1.7742914979757085, "percentage": 25.32, "elapsed_time": "1:53:52", "remaining_time": "5:35:47"}
{"current_steps": 590, "total_steps": 2310, "loss": 0.013, "lr": 3.7144124243046815e-05, "epoch": 1.7894736842105263, "percentage": 25.54, "elapsed_time": "1:54:50", "remaining_time": "5:34:46"}
{"current_steps": 595, "total_steps": 2310, "loss": 0.0123, "lr": 3.7065817632643115e-05, "epoch": 1.8046558704453441, "percentage": 25.76, "elapsed_time": "1:55:46", "remaining_time": "5:33:41"}
{"current_steps": 600, "total_steps": 2310, "loss": 0.0115, "lr": 3.6986536805001466e-05, "epoch": 1.819838056680162, "percentage": 25.97, "elapsed_time": "1:56:43", "remaining_time": "5:32:39"}
{"current_steps": 605, "total_steps": 2310, "loss": 0.0115, "lr": 3.690628628593777e-05, "epoch": 1.8350202429149798, "percentage": 26.19, "elapsed_time": "1:57:47", "remaining_time": "5:31:58"}
{"current_steps": 610, "total_steps": 2310, "loss": 0.0109, "lr": 3.6825070656623626e-05, "epoch": 1.8502024291497976, "percentage": 26.41, "elapsed_time": "1:58:42", "remaining_time": "5:30:49"}
{"current_steps": 615, "total_steps": 2310, "loss": 0.0107, "lr": 3.67428945533248e-05, "epoch": 1.8653846153846154, "percentage": 26.62, "elapsed_time": "1:59:43", "remaining_time": "5:29:57"}
{"current_steps": 620, "total_steps": 2310, "loss": 0.0104, "lr": 3.6659762667136567e-05, "epoch": 1.8805668016194332, "percentage": 26.84, "elapsed_time": "2:00:38", "remaining_time": "5:28:50"}
{"current_steps": 625, "total_steps": 2310, "loss": 0.0107, "lr": 3.65756797437159e-05, "epoch": 1.895748987854251, "percentage": 27.06, "elapsed_time": "2:01:30", "remaining_time": "5:27:36"}
{"current_steps": 630, "total_steps": 2310, "loss": 0.0105, "lr": 3.6490650583010585e-05, "epoch": 1.9109311740890689, "percentage": 27.27, "elapsed_time": "2:02:25", "remaining_time": "5:26:29"}
{"current_steps": 635, "total_steps": 2310, "loss": 0.0092, "lr": 3.640468003898518e-05, "epoch": 1.9261133603238867, "percentage": 27.49, "elapsed_time": "2:03:23", "remaining_time": "5:25:28"}
{"current_steps": 640, "total_steps": 2310, "loss": 0.0102, "lr": 3.631777301934395e-05, "epoch": 1.9412955465587043, "percentage": 27.71, "elapsed_time": "2:04:13", "remaining_time": "5:24:09"}
{"current_steps": 645, "total_steps": 2310, "loss": 0.0088, "lr": 3.6229934485250684e-05, "epoch": 1.9564777327935223, "percentage": 27.92, "elapsed_time": "2:05:08", "remaining_time": "5:23:02"}
{"current_steps": 650, "total_steps": 2310, "loss": 0.009, "lr": 3.6141169451045526e-05, "epoch": 1.97165991902834, "percentage": 28.14, "elapsed_time": "2:06:09", "remaining_time": "5:22:11"}
{"current_steps": 655, "total_steps": 2310, "loss": 0.0084, "lr": 3.605148298395865e-05, "epoch": 1.986842105263158, "percentage": 28.35, "elapsed_time": "2:07:09", "remaining_time": "5:21:18"}
{"current_steps": 660, "total_steps": 2310, "loss": 0.0074, "lr": 3.5960880203821086e-05, "epoch": 2.0, "percentage": 28.57, "elapsed_time": "2:08:05", "remaining_time": "5:20:12"}
{"current_steps": 665, "total_steps": 2310, "loss": 0.0068, "lr": 3.5869366282772354e-05, "epoch": 2.0151821862348176, "percentage": 28.79, "elapsed_time": "2:09:00", "remaining_time": "5:19:06"}
{"current_steps": 670, "total_steps": 2310, "loss": 0.0068, "lr": 3.577694644496529e-05, "epoch": 2.0303643724696356, "percentage": 29.0, "elapsed_time": "2:09:54", "remaining_time": "5:17:58"}
{"current_steps": 675, "total_steps": 2310, "loss": 0.0073, "lr": 3.568362596626778e-05, "epoch": 2.0455465587044532, "percentage": 29.22, "elapsed_time": "2:10:48", "remaining_time": "5:16:50"}
{"current_steps": 680, "total_steps": 2310, "loss": 0.0067, "lr": 3.558941017396158e-05, "epoch": 2.0607287449392713, "percentage": 29.44, "elapsed_time": "2:11:46", "remaining_time": "5:15:52"}
{"current_steps": 685, "total_steps": 2310, "loss": 0.0069, "lr": 3.549430444643821e-05, "epoch": 2.075910931174089, "percentage": 29.65, "elapsed_time": "2:12:46", "remaining_time": "5:14:59"}
{"current_steps": 690, "total_steps": 2310, "loss": 0.0063, "lr": 3.539831421289195e-05, "epoch": 2.091093117408907, "percentage": 29.87, "elapsed_time": "2:13:44", "remaining_time": "5:13:59"}
{"current_steps": 695, "total_steps": 2310, "loss": 0.0065, "lr": 3.530144495300984e-05, "epoch": 2.1062753036437245, "percentage": 30.09, "elapsed_time": "2:14:41", "remaining_time": "5:12:58"}
{"current_steps": 700, "total_steps": 2310, "loss": 0.0062, "lr": 3.520370219665897e-05, "epoch": 2.1214574898785425, "percentage": 30.3, "elapsed_time": "2:15:36", "remaining_time": "5:11:52"}
{"current_steps": 705, "total_steps": 2310, "loss": 0.0062, "lr": 3.51050915235707e-05, "epoch": 2.13663967611336, "percentage": 30.52, "elapsed_time": "2:16:29", "remaining_time": "5:10:44"}
{"current_steps": 710, "total_steps": 2310, "loss": 0.0051, "lr": 3.500561856302221e-05, "epoch": 2.151821862348178, "percentage": 30.74, "elapsed_time": "2:17:26", "remaining_time": "5:09:43"}
{"current_steps": 715, "total_steps": 2310, "loss": 0.0061, "lr": 3.49052889935151e-05, "epoch": 2.167004048582996, "percentage": 30.95, "elapsed_time": "2:18:20", "remaining_time": "5:08:36"}
{"current_steps": 720, "total_steps": 2310, "loss": 0.0057, "lr": 3.480410854245125e-05, "epoch": 2.182186234817814, "percentage": 31.17, "elapsed_time": "2:19:19", "remaining_time": "5:07:39"}
{"current_steps": 725, "total_steps": 2310, "loss": 0.005, "lr": 3.470208298580589e-05, "epoch": 2.1973684210526314, "percentage": 31.39, "elapsed_time": "2:20:15", "remaining_time": "5:06:38"}
{"current_steps": 730, "total_steps": 2310, "loss": 0.0058, "lr": 3.459921814779781e-05, "epoch": 2.2125506072874495, "percentage": 31.6, "elapsed_time": "2:21:07", "remaining_time": "5:05:26"}
{"current_steps": 735, "total_steps": 2310, "loss": 0.0059, "lr": 3.449551990055694e-05, "epoch": 2.227732793522267, "percentage": 31.82, "elapsed_time": "2:22:00", "remaining_time": "5:04:17"}
{"current_steps": 740, "total_steps": 2310, "loss": 0.0047, "lr": 3.43909941637891e-05, "epoch": 2.242914979757085, "percentage": 32.03, "elapsed_time": "2:22:55", "remaining_time": "5:03:14"}
{"current_steps": 745, "total_steps": 2310, "loss": 0.005, "lr": 3.428564690443807e-05, "epoch": 2.2580971659919027, "percentage": 32.25, "elapsed_time": "2:23:54", "remaining_time": "5:02:18"}
{"current_steps": 750, "total_steps": 2310, "loss": 0.0051, "lr": 3.4179484136345e-05, "epoch": 2.2732793522267207, "percentage": 32.47, "elapsed_time": "2:24:51", "remaining_time": "5:01:19"}
{"current_steps": 755, "total_steps": 2310, "loss": 0.0048, "lr": 3.4072511919905055e-05, "epoch": 2.2884615384615383, "percentage": 32.68, "elapsed_time": "2:25:46", "remaining_time": "5:00:13"}
{"current_steps": 760, "total_steps": 2310, "loss": 0.0053, "lr": 3.396473636172146e-05, "epoch": 2.3036437246963564, "percentage": 32.9, "elapsed_time": "2:26:36", "remaining_time": "4:59:00"}
{"current_steps": 765, "total_steps": 2310, "loss": 0.0042, "lr": 3.385616361425694e-05, "epoch": 2.318825910931174, "percentage": 33.12, "elapsed_time": "2:27:28", "remaining_time": "4:57:50"}
{"current_steps": 770, "total_steps": 2310, "loss": 0.0047, "lr": 3.374679987548245e-05, "epoch": 2.334008097165992, "percentage": 33.33, "elapsed_time": "2:28:21", "remaining_time": "4:56:43"}
{"current_steps": 775, "total_steps": 2310, "loss": 0.0047, "lr": 3.363665138852339e-05, "epoch": 2.3491902834008096, "percentage": 33.55, "elapsed_time": "2:29:16", "remaining_time": "4:55:40"}
{"current_steps": 780, "total_steps": 2310, "loss": 0.0045, "lr": 3.3525724441303206e-05, "epoch": 2.3643724696356276, "percentage": 33.77, "elapsed_time": "2:30:13", "remaining_time": "4:54:40"}
{"current_steps": 785, "total_steps": 2310, "loss": 0.0047, "lr": 3.341402536618442e-05, "epoch": 2.3795546558704452, "percentage": 33.98, "elapsed_time": "2:31:07", "remaining_time": "4:53:35"}
{"current_steps": 790, "total_steps": 2310, "loss": 0.0043, "lr": 3.3301560539607174e-05, "epoch": 2.3947368421052633, "percentage": 34.2, "elapsed_time": "2:32:05", "remaining_time": "4:52:37"}
{"current_steps": 795, "total_steps": 2310, "loss": 0.0042, "lr": 3.318833638172519e-05, "epoch": 2.409919028340081, "percentage": 34.42, "elapsed_time": "2:33:06", "remaining_time": "4:51:46"}
{"current_steps": 800, "total_steps": 2310, "loss": 0.0041, "lr": 3.3074359356039306e-05, "epoch": 2.425101214574899, "percentage": 34.63, "elapsed_time": "2:34:04", "remaining_time": "4:50:48"}
{"current_steps": 805, "total_steps": 2310, "loss": 0.0038, "lr": 3.295963596902846e-05, "epoch": 2.4402834008097165, "percentage": 34.85, "elapsed_time": "2:35:03", "remaining_time": "4:49:54"}
{"current_steps": 810, "total_steps": 2310, "loss": 0.0039, "lr": 3.284417276977829e-05, "epoch": 2.4554655870445345, "percentage": 35.06, "elapsed_time": "2:35:55", "remaining_time": "4:48:45"}
{"current_steps": 815, "total_steps": 2310, "loss": 0.0036, "lr": 3.2727976349607276e-05, "epoch": 2.470647773279352, "percentage": 35.28, "elapsed_time": "2:36:51", "remaining_time": "4:47:44"}
{"current_steps": 820, "total_steps": 2310, "loss": 0.004, "lr": 3.261105334169045e-05, "epoch": 2.48582995951417, "percentage": 35.5, "elapsed_time": "2:37:47", "remaining_time": "4:46:42"}
{"current_steps": 825, "total_steps": 2310, "loss": 0.0034, "lr": 3.249341042068077e-05, "epoch": 2.501012145748988, "percentage": 35.71, "elapsed_time": "2:38:46", "remaining_time": "4:45:46"}
{"current_steps": 830, "total_steps": 2310, "loss": 0.0034, "lr": 3.237505430232803e-05, "epoch": 2.516194331983806, "percentage": 35.93, "elapsed_time": "2:39:41", "remaining_time": "4:44:45"}
{"current_steps": 835, "total_steps": 2310, "loss": 0.0038, "lr": 3.225599174309554e-05, "epoch": 2.5313765182186234, "percentage": 36.15, "elapsed_time": "2:40:39", "remaining_time": "4:43:47"}
{"current_steps": 840, "total_steps": 2310, "loss": 0.0039, "lr": 3.213622953977443e-05, "epoch": 2.5465587044534415, "percentage": 36.36, "elapsed_time": "2:41:39", "remaining_time": "4:42:54"}
{"current_steps": 845, "total_steps": 2310, "loss": 0.0033, "lr": 3.2015774529095595e-05, "epoch": 2.561740890688259, "percentage": 36.58, "elapsed_time": "2:42:37", "remaining_time": "4:41:57"}
{"current_steps": 850, "total_steps": 2310, "loss": 0.0035, "lr": 3.189463358733947e-05, "epoch": 2.5769230769230766, "percentage": 36.8, "elapsed_time": "2:43:33", "remaining_time": "4:40:55"}
{"current_steps": 855, "total_steps": 2310, "loss": 0.0032, "lr": 3.1772813629943455e-05, "epoch": 2.5921052631578947, "percentage": 37.01, "elapsed_time": "2:44:27", "remaining_time": "4:39:51"}
{"current_steps": 860, "total_steps": 2310, "loss": 0.0032, "lr": 3.1650321611107195e-05, "epoch": 2.6072874493927127, "percentage": 37.23, "elapsed_time": "2:45:16", "remaining_time": "4:38:40"}
{"current_steps": 865, "total_steps": 2310, "loss": 0.0029, "lr": 3.152716452339549e-05, "epoch": 2.6224696356275303, "percentage": 37.45, "elapsed_time": "2:46:18", "remaining_time": "4:37:48"}
{"current_steps": 870, "total_steps": 2310, "loss": 0.003, "lr": 3.140334939733924e-05, "epoch": 2.637651821862348, "percentage": 37.66, "elapsed_time": "2:47:14", "remaining_time": "4:36:49"}
{"current_steps": 875, "total_steps": 2310, "loss": 0.0033, "lr": 3.127888330103401e-05, "epoch": 2.652834008097166, "percentage": 37.88, "elapsed_time": "2:48:09", "remaining_time": "4:35:47"}
{"current_steps": 880, "total_steps": 2310, "loss": 0.0032, "lr": 3.115377333973659e-05, "epoch": 2.668016194331984, "percentage": 38.1, "elapsed_time": "2:49:04", "remaining_time": "4:34:45"}
{"current_steps": 885, "total_steps": 2310, "loss": 0.0031, "lr": 3.1028026655459376e-05, "epoch": 2.6831983805668016, "percentage": 38.31, "elapsed_time": "2:50:02", "remaining_time": "4:33:47"}
{"current_steps": 890, "total_steps": 2310, "loss": 0.0029, "lr": 3.0901650426562634e-05, "epoch": 2.698380566801619, "percentage": 38.53, "elapsed_time": "2:50:59", "remaining_time": "4:32:49"}
{"current_steps": 895, "total_steps": 2310, "loss": 0.0027, "lr": 3.0774651867344765e-05, "epoch": 2.7135627530364372, "percentage": 38.74, "elapsed_time": "2:51:52", "remaining_time": "4:31:44"}
{"current_steps": 900, "total_steps": 2310, "loss": 0.0025, "lr": 3.064703822763043e-05, "epoch": 2.7287449392712553, "percentage": 38.96, "elapsed_time": "2:52:51", "remaining_time": "4:30:49"}
{"current_steps": 905, "total_steps": 2310, "loss": 0.0025, "lr": 3.05188167923567e-05, "epoch": 2.743927125506073, "percentage": 39.18, "elapsed_time": "2:54:00", "remaining_time": "4:30:08"}
{"current_steps": 910, "total_steps": 2310, "loss": 0.0027, "lr": 3.0389994881157207e-05, "epoch": 2.7591093117408905, "percentage": 39.39, "elapsed_time": "2:54:56", "remaining_time": "4:29:08"}
{"current_steps": 915, "total_steps": 2310, "loss": 0.0027, "lr": 3.0260579847944255e-05, "epoch": 2.7742914979757085, "percentage": 39.61, "elapsed_time": "2:55:53", "remaining_time": "4:28:09"}
{"current_steps": 920, "total_steps": 2310, "loss": 0.0027, "lr": 3.0130579080489064e-05, "epoch": 2.7894736842105265, "percentage": 39.83, "elapsed_time": "2:56:44", "remaining_time": "4:27:01"}
{"current_steps": 925, "total_steps": 2310, "loss": 0.0025, "lr": 3.0000000000000004e-05, "epoch": 2.804655870445344, "percentage": 40.04, "elapsed_time": "2:57:41", "remaining_time": "4:26:03"}
{"current_steps": 930, "total_steps": 2310, "loss": 0.0024, "lr": 2.986885006069894e-05, "epoch": 2.8198380566801617, "percentage": 40.26, "elapsed_time": "2:58:41", "remaining_time": "4:25:08"}
{"current_steps": 935, "total_steps": 2310, "loss": 0.0024, "lr": 2.9737136749395743e-05, "epoch": 2.83502024291498, "percentage": 40.48, "elapsed_time": "2:59:37", "remaining_time": "4:24:08"}
{"current_steps": 940, "total_steps": 2310, "loss": 0.0023, "lr": 2.9604867585060834e-05, "epoch": 2.850202429149798, "percentage": 40.69, "elapsed_time": "3:00:33", "remaining_time": "4:23:08"}
{"current_steps": 945, "total_steps": 2310, "loss": 0.0022, "lr": 2.9472050118396034e-05, "epoch": 2.8653846153846154, "percentage": 40.91, "elapsed_time": "3:01:33", "remaining_time": "4:22:14"}
{"current_steps": 950, "total_steps": 2310, "loss": 0.0027, "lr": 2.933869193140346e-05, "epoch": 2.880566801619433, "percentage": 41.13, "elapsed_time": "3:02:25", "remaining_time": "4:21:09"}
{"current_steps": 955, "total_steps": 2310, "loss": 0.0025, "lr": 2.920480063695272e-05, "epoch": 2.895748987854251, "percentage": 41.34, "elapsed_time": "3:03:19", "remaining_time": "4:20:07"}
{"current_steps": 960, "total_steps": 2310, "loss": 0.0026, "lr": 2.9070383878346358e-05, "epoch": 2.910931174089069, "percentage": 41.56, "elapsed_time": "3:04:12", "remaining_time": "4:19:02"}
{"current_steps": 965, "total_steps": 2310, "loss": 0.0023, "lr": 2.8935449328883478e-05, "epoch": 2.9261133603238867, "percentage": 41.77, "elapsed_time": "3:05:07", "remaining_time": "4:18:01"}
{"current_steps": 970, "total_steps": 2310, "loss": 0.0019, "lr": 2.8800004691421743e-05, "epoch": 2.9412955465587043, "percentage": 41.99, "elapsed_time": "3:06:10", "remaining_time": "4:17:11"}
{"current_steps": 975, "total_steps": 2310, "loss": 0.002, "lr": 2.866405769793764e-05, "epoch": 2.9564777327935223, "percentage": 42.21, "elapsed_time": "3:07:05", "remaining_time": "4:16:10"}
{"current_steps": 980, "total_steps": 2310, "loss": 0.0021, "lr": 2.8527616109085082e-05, "epoch": 2.97165991902834, "percentage": 42.42, "elapsed_time": "3:08:03", "remaining_time": "4:15:13"}
{"current_steps": 985, "total_steps": 2310, "loss": 0.0023, "lr": 2.8390687713752405e-05, "epoch": 2.986842105263158, "percentage": 42.64, "elapsed_time": "3:09:02", "remaining_time": "4:14:17"}
{"current_steps": 990, "total_steps": 2310, "loss": 0.0022, "lr": 2.8253280328617712e-05, "epoch": 3.0, "percentage": 42.86, "elapsed_time": "3:09:49", "remaining_time": "4:13:06"}
{"current_steps": 995, "total_steps": 2310, "loss": 0.0019, "lr": 2.811540179770268e-05, "epoch": 3.0151821862348176, "percentage": 43.07, "elapsed_time": "3:10:50", "remaining_time": "4:12:12"}
{"current_steps": 1000, "total_steps": 2310, "loss": 0.002, "lr": 2.7977059991924734e-05, "epoch": 3.0303643724696356, "percentage": 43.29, "elapsed_time": "3:11:44", "remaining_time": "4:11:10"}
{"current_steps": 1005, "total_steps": 2310, "loss": 0.0019, "lr": 2.7838262808647766e-05, "epoch": 3.0455465587044532, "percentage": 43.51, "elapsed_time": "3:12:42", "remaining_time": "4:10:13"}
{"current_steps": 1010, "total_steps": 2310, "loss": 0.0019, "lr": 2.7699018171231288e-05, "epoch": 3.0607287449392713, "percentage": 43.72, "elapsed_time": "3:13:36", "remaining_time": "4:09:12"}
{"current_steps": 1015, "total_steps": 2310, "loss": 0.0019, "lr": 2.7559334028578135e-05, "epoch": 3.075910931174089, "percentage": 43.94, "elapsed_time": "3:14:32", "remaining_time": "4:08:11"}
{"current_steps": 1020, "total_steps": 2310, "loss": 0.0021, "lr": 2.741921835468068e-05, "epoch": 3.091093117408907, "percentage": 44.16, "elapsed_time": "3:15:30", "remaining_time": "4:07:15"}
{"current_steps": 1025, "total_steps": 2310, "loss": 0.0021, "lr": 2.727867914816563e-05, "epoch": 3.1062753036437245, "percentage": 44.37, "elapsed_time": "3:16:24", "remaining_time": "4:06:14"}
{"current_steps": 1030, "total_steps": 2310, "loss": 0.0017, "lr": 2.713772443183744e-05, "epoch": 3.1214574898785425, "percentage": 44.59, "elapsed_time": "3:17:16", "remaining_time": "4:05:09"}
{"current_steps": 1035, "total_steps": 2310, "loss": 0.0017, "lr": 2.69963622522203e-05, "epoch": 3.13663967611336, "percentage": 44.81, "elapsed_time": "3:18:10", "remaining_time": "4:04:07"}
{"current_steps": 1040, "total_steps": 2310, "loss": 0.002, "lr": 2.6854600679098798e-05, "epoch": 3.151821862348178, "percentage": 45.02, "elapsed_time": "3:19:05", "remaining_time": "4:03:07"}
{"current_steps": 1045, "total_steps": 2310, "loss": 0.0017, "lr": 2.6712447805057254e-05, "epoch": 3.167004048582996, "percentage": 45.24, "elapsed_time": "3:20:02", "remaining_time": "4:02:09"}
{"current_steps": 1050, "total_steps": 2310, "loss": 0.0019, "lr": 2.656991174501775e-05, "epoch": 3.182186234817814, "percentage": 45.45, "elapsed_time": "3:21:00", "remaining_time": "4:01:12"}
{"current_steps": 1055, "total_steps": 2310, "loss": 0.0018, "lr": 2.642700063577689e-05, "epoch": 3.1973684210526314, "percentage": 45.67, "elapsed_time": "3:21:55", "remaining_time": "4:00:12"}
{"current_steps": 1060, "total_steps": 2310, "loss": 0.0018, "lr": 2.6283722635541255e-05, "epoch": 3.2125506072874495, "percentage": 45.89, "elapsed_time": "3:22:46", "remaining_time": "3:59:07"}
{"current_steps": 1065, "total_steps": 2310, "loss": 0.0018, "lr": 2.6140085923461756e-05, "epoch": 3.227732793522267, "percentage": 46.1, "elapsed_time": "3:23:42", "remaining_time": "3:58:08"}
{"current_steps": 1070, "total_steps": 2310, "loss": 0.0015, "lr": 2.5996098699166678e-05, "epoch": 3.242914979757085, "percentage": 46.32, "elapsed_time": "3:24:37", "remaining_time": "3:57:08"}
{"current_steps": 1075, "total_steps": 2310, "loss": 0.0016, "lr": 2.585176918229359e-05, "epoch": 3.2580971659919027, "percentage": 46.54, "elapsed_time": "3:25:35", "remaining_time": "3:56:12"}
{"current_steps": 1080, "total_steps": 2310, "loss": 0.0017, "lr": 2.5707105612020163e-05, "epoch": 3.2732793522267207, "percentage": 46.75, "elapsed_time": "3:26:30", "remaining_time": "3:55:10"}
{"current_steps": 1085, "total_steps": 2310, "loss": 0.0014, "lr": 2.5562116246593765e-05, "epoch": 3.2884615384615383, "percentage": 46.97, "elapsed_time": "3:27:28", "remaining_time": "3:54:14"}
{"current_steps": 1090, "total_steps": 2310, "loss": 0.0018, "lr": 2.5416809362860107e-05, "epoch": 3.3036437246963564, "percentage": 47.19, "elapsed_time": "3:28:18", "remaining_time": "3:53:08"}
{"current_steps": 1095, "total_steps": 2310, "loss": 0.0016, "lr": 2.5271193255790672e-05, "epoch": 3.318825910931174, "percentage": 47.4, "elapsed_time": "3:29:08", "remaining_time": "3:52:03"}
{"current_steps": 1100, "total_steps": 2310, "loss": 0.0016, "lr": 2.5125276238009254e-05, "epoch": 3.334008097165992, "percentage": 47.62, "elapsed_time": "3:30:06", "remaining_time": "3:51:07"}
{"current_steps": 1105, "total_steps": 2310, "loss": 0.0016, "lr": 2.4979066639317405e-05, "epoch": 3.3491902834008096, "percentage": 47.84, "elapsed_time": "3:31:01", "remaining_time": "3:50:07"}
{"current_steps": 1110, "total_steps": 2310, "loss": 0.0016, "lr": 2.4832572806218892e-05, "epoch": 3.3643724696356276, "percentage": 48.05, "elapsed_time": "3:32:00", "remaining_time": "3:49:12"}
{"current_steps": 1115, "total_steps": 2310, "loss": 0.0015, "lr": 2.4685803101443276e-05, "epoch": 3.3795546558704452, "percentage": 48.27, "elapsed_time": "3:32:58", "remaining_time": "3:48:15"}
{"current_steps": 1120, "total_steps": 2310, "loss": 0.0013, "lr": 2.4538765903468486e-05, "epoch": 3.3947368421052633, "percentage": 48.48, "elapsed_time": "3:33:57", "remaining_time": "3:47:19"}
{"current_steps": 1125, "total_steps": 2310, "loss": 0.0014, "lr": 2.4391469606042533e-05, "epoch": 3.409919028340081, "percentage": 48.7, "elapsed_time": "3:34:57", "remaining_time": "3:46:25"}
{"current_steps": 1130, "total_steps": 2310, "loss": 0.0015, "lr": 2.4243922617704364e-05, "epoch": 3.425101214574899, "percentage": 48.92, "elapsed_time": "3:35:56", "remaining_time": "3:45:29"}
{"current_steps": 1135, "total_steps": 2310, "loss": 0.0013, "lr": 2.4096133361303816e-05, "epoch": 3.4402834008097165, "percentage": 49.13, "elapsed_time": "3:36:54", "remaining_time": "3:44:33"}
{"current_steps": 1140, "total_steps": 2310, "loss": 0.0014, "lr": 2.394811027352083e-05, "epoch": 3.4554655870445345, "percentage": 49.35, "elapsed_time": "3:37:52", "remaining_time": "3:43:36"}
{"current_steps": 1145, "total_steps": 2310, "loss": 0.0015, "lr": 2.3799861804383807e-05, "epoch": 3.470647773279352, "percentage": 49.57, "elapsed_time": "3:38:49", "remaining_time": "3:42:38"}
{"current_steps": 1150, "total_steps": 2310, "loss": 0.0014, "lr": 2.365139641678724e-05, "epoch": 3.48582995951417, "percentage": 49.78, "elapsed_time": "3:39:46", "remaining_time": "3:41:40"}
{"current_steps": 1155, "total_steps": 2310, "loss": 0.0014, "lr": 2.350272258600861e-05, "epoch": 3.501012145748988, "percentage": 50.0, "elapsed_time": "3:40:40", "remaining_time": "3:40:40"}
{"current_steps": 1160, "total_steps": 2310, "loss": 0.0014, "lr": 2.335384879922456e-05, "epoch": 3.516194331983806, "percentage": 50.22, "elapsed_time": "3:41:37", "remaining_time": "3:39:43"}
{"current_steps": 1165, "total_steps": 2310, "loss": 0.0013, "lr": 2.3204783555026394e-05, "epoch": 3.5313765182186234, "percentage": 50.43, "elapsed_time": "3:42:39", "remaining_time": "3:38:50"}
{"current_steps": 1170, "total_steps": 2310, "loss": 0.0015, "lr": 2.3055535362934945e-05, "epoch": 3.5465587044534415, "percentage": 50.65, "elapsed_time": "3:43:28", "remaining_time": "3:37:44"}
{"current_steps": 1175, "total_steps": 2310, "loss": 0.0013, "lr": 2.2906112742914756e-05, "epoch": 3.561740890688259, "percentage": 50.87, "elapsed_time": "3:44:31", "remaining_time": "3:36:52"}
{"current_steps": 1180, "total_steps": 2310, "loss": 0.0014, "lr": 2.2756524224887776e-05, "epoch": 3.5769230769230766, "percentage": 51.08, "elapsed_time": "3:45:25", "remaining_time": "3:35:52"}
{"current_steps": 1185, "total_steps": 2310, "loss": 0.0013, "lr": 2.2606778348246366e-05, "epoch": 3.5921052631578947, "percentage": 51.3, "elapsed_time": "3:46:23", "remaining_time": "3:34:55"}
{"current_steps": 1190, "total_steps": 2310, "loss": 0.0012, "lr": 2.245688366136586e-05, "epoch": 3.6072874493927127, "percentage": 51.52, "elapsed_time": "3:47:15", "remaining_time": "3:33:52"}
{"current_steps": 1195, "total_steps": 2310, "loss": 0.0012, "lr": 2.2306848721116535e-05, "epoch": 3.6224696356275303, "percentage": 51.73, "elapsed_time": "3:48:14", "remaining_time": "3:32:57"}
{"current_steps": 1200, "total_steps": 2310, "loss": 0.0012, "lr": 2.2156682092375175e-05, "epoch": 3.637651821862348, "percentage": 51.95, "elapsed_time": "3:49:12", "remaining_time": "3:32:01"}
{"current_steps": 1205, "total_steps": 2310, "loss": 0.0011, "lr": 2.2006392347536102e-05, "epoch": 3.652834008097166, "percentage": 52.16, "elapsed_time": "3:50:18", "remaining_time": "3:31:12"}
{"current_steps": 1210, "total_steps": 2310, "loss": 0.0011, "lr": 2.1855988066021837e-05, "epoch": 3.668016194331984, "percentage": 52.38, "elapsed_time": "3:51:17", "remaining_time": "3:30:15"}
{"current_steps": 1215, "total_steps": 2310, "loss": 0.0012, "lr": 2.1705477833793332e-05, "epoch": 3.6831983805668016, "percentage": 52.6, "elapsed_time": "3:52:12", "remaining_time": "3:29:16"}
{"current_steps": 1220, "total_steps": 2310, "loss": 0.0012, "lr": 2.1554870242859813e-05, "epoch": 3.698380566801619, "percentage": 52.81, "elapsed_time": "3:53:10", "remaining_time": "3:28:19"}
{"current_steps": 1225, "total_steps": 2310, "loss": 0.0013, "lr": 2.140417389078833e-05, "epoch": 3.7135627530364372, "percentage": 53.03, "elapsed_time": "3:54:06", "remaining_time": "3:27:21"}
{"current_steps": 1230, "total_steps": 2310, "loss": 0.0012, "lr": 2.1253397380212935e-05, "epoch": 3.7287449392712553, "percentage": 53.25, "elapsed_time": "3:55:02", "remaining_time": "3:26:22"}
{"current_steps": 1235, "total_steps": 2310, "loss": 0.0011, "lr": 2.110254931834359e-05, "epoch": 3.743927125506073, "percentage": 53.46, "elapsed_time": "3:55:56", "remaining_time": "3:25:22"}
{"current_steps": 1240, "total_steps": 2310, "loss": 0.0011, "lr": 2.095163831647485e-05, "epoch": 3.7591093117408905, "percentage": 53.68, "elapsed_time": "3:56:53", "remaining_time": "3:24:24"}
{"current_steps": 1245, "total_steps": 2310, "loss": 0.0011, "lr": 2.0800672989494225e-05, "epoch": 3.7742914979757085, "percentage": 53.9, "elapsed_time": "3:57:50", "remaining_time": "3:23:27"}
{"current_steps": 1250, "total_steps": 2310, "loss": 0.0012, "lr": 2.0649661955390447e-05, "epoch": 3.7894736842105265, "percentage": 54.11, "elapsed_time": "3:58:43", "remaining_time": "3:22:26"}
{"current_steps": 1255, "total_steps": 2310, "loss": 0.0012, "lr": 2.0498613834761462e-05, "epoch": 3.804655870445344, "percentage": 54.33, "elapsed_time": "3:59:36", "remaining_time": "3:21:25"}
{"current_steps": 1260, "total_steps": 2310, "loss": 0.001, "lr": 2.0347537250322338e-05, "epoch": 3.8198380566801617, "percentage": 54.55, "elapsed_time": "4:00:36", "remaining_time": "3:20:30"}
{"current_steps": 1265, "total_steps": 2310, "loss": 0.0011, "lr": 2.0196440826413033e-05, "epoch": 3.83502024291498, "percentage": 54.76, "elapsed_time": "4:01:30", "remaining_time": "3:19:30"}
{"current_steps": 1270, "total_steps": 2310, "loss": 0.0011, "lr": 2.004533318850605e-05, "epoch": 3.850202429149798, "percentage": 54.98, "elapsed_time": "4:02:23", "remaining_time": "3:18:29"}
{"current_steps": 1275, "total_steps": 2310, "loss": 0.0011, "lr": 1.9894222962714046e-05, "epoch": 3.8653846153846154, "percentage": 55.19, "elapsed_time": "4:03:15", "remaining_time": "3:17:28"}
{"current_steps": 1280, "total_steps": 2310, "loss": 0.0009, "lr": 1.9743118775297432e-05, "epoch": 3.880566801619433, "percentage": 55.41, "elapsed_time": "4:04:15", "remaining_time": "3:16:33"}
{"current_steps": 1285, "total_steps": 2310, "loss": 0.001, "lr": 1.95920292521719e-05, "epoch": 3.895748987854251, "percentage": 55.63, "elapsed_time": "4:05:18", "remaining_time": "3:15:40"}
{"current_steps": 1290, "total_steps": 2310, "loss": 0.0013, "lr": 1.9440963018416002e-05, "epoch": 3.910931174089069, "percentage": 55.84, "elapsed_time": "4:06:07", "remaining_time": "3:14:36"}
{"current_steps": 1295, "total_steps": 2310, "loss": 0.001, "lr": 1.9289928697778826e-05, "epoch": 3.9261133603238867, "percentage": 56.06, "elapsed_time": "4:07:02", "remaining_time": "3:13:37"}
{"current_steps": 1300, "total_steps": 2310, "loss": 0.0009, "lr": 1.9138934912187647e-05, "epoch": 3.9412955465587043, "percentage": 56.28, "elapsed_time": "4:08:00", "remaining_time": "3:12:40"}
{"current_steps": 1305, "total_steps": 2310, "loss": 0.0008, "lr": 1.898799028125578e-05, "epoch": 3.9564777327935223, "percentage": 56.49, "elapsed_time": "4:09:01", "remaining_time": "3:11:46"}
{"current_steps": 1310, "total_steps": 2310, "loss": 0.0011, "lr": 1.8837103421790486e-05, "epoch": 3.97165991902834, "percentage": 56.71, "elapsed_time": "4:09:53", "remaining_time": "3:10:45"}
{"current_steps": 1315, "total_steps": 2310, "loss": 0.001, "lr": 1.8686282947301108e-05, "epoch": 3.986842105263158, "percentage": 56.93, "elapsed_time": "4:10:48", "remaining_time": "3:09:46"}
{"current_steps": 1320, "total_steps": 2310, "loss": 0.001, "lr": 1.8535537467507326e-05, "epoch": 4.0, "percentage": 57.14, "elapsed_time": "4:11:35", "remaining_time": "3:08:41"}
{"current_steps": 1325, "total_steps": 2310, "loss": 0.0009, "lr": 1.8384875587847698e-05, "epoch": 4.015182186234818, "percentage": 57.36, "elapsed_time": "4:12:32", "remaining_time": "3:07:44"}
{"current_steps": 1330, "total_steps": 2310, "loss": 0.0008, "lr": 1.823430590898839e-05, "epoch": 4.030364372469635, "percentage": 57.58, "elapsed_time": "4:13:34", "remaining_time": "3:06:50"}
{"current_steps": 1335, "total_steps": 2310, "loss": 0.0009, "lr": 1.8083837026332187e-05, "epoch": 4.045546558704453, "percentage": 57.79, "elapsed_time": "4:14:31", "remaining_time": "3:05:53"}
{"current_steps": 1340, "total_steps": 2310, "loss": 0.0007, "lr": 1.7933477529527862e-05, "epoch": 4.060728744939271, "percentage": 58.01, "elapsed_time": "4:15:32", "remaining_time": "3:04:59"}
{"current_steps": 1345, "total_steps": 2310, "loss": 0.0008, "lr": 1.778323600197978e-05, "epoch": 4.075910931174089, "percentage": 58.23, "elapsed_time": "4:16:28", "remaining_time": "3:04:00"}
{"current_steps": 1350, "total_steps": 2310, "loss": 0.0009, "lr": 1.7633121020357945e-05, "epoch": 4.0910931174089065, "percentage": 58.44, "elapsed_time": "4:17:30", "remaining_time": "3:03:06"}
{"current_steps": 1355, "total_steps": 2310, "loss": 0.0008, "lr": 1.748314115410835e-05, "epoch": 4.1062753036437245, "percentage": 58.66, "elapsed_time": "4:18:23", "remaining_time": "3:02:06"}
{"current_steps": 1360, "total_steps": 2310, "loss": 0.0008, "lr": 1.733330496496383e-05, "epoch": 4.1214574898785425, "percentage": 58.87, "elapsed_time": "4:19:16", "remaining_time": "3:01:06"}
{"current_steps": 1365, "total_steps": 2310, "loss": 0.0009, "lr": 1.718362100645527e-05, "epoch": 4.136639676113361, "percentage": 59.09, "elapsed_time": "4:20:11", "remaining_time": "3:00:07"}
{"current_steps": 1370, "total_steps": 2310, "loss": 0.0008, "lr": 1.7034097823423356e-05, "epoch": 4.151821862348178, "percentage": 59.31, "elapsed_time": "4:21:05", "remaining_time": "2:59:08"}
{"current_steps": 1375, "total_steps": 2310, "loss": 0.0007, "lr": 1.6884743951530737e-05, "epoch": 4.167004048582996, "percentage": 59.52, "elapsed_time": "4:22:02", "remaining_time": "2:58:10"}
{"current_steps": 1380, "total_steps": 2310, "loss": 0.0007, "lr": 1.6735567916774814e-05, "epoch": 4.182186234817814, "percentage": 59.74, "elapsed_time": "4:22:59", "remaining_time": "2:57:13"}
{"current_steps": 1385, "total_steps": 2310, "loss": 0.0008, "lr": 1.6586578235000975e-05, "epoch": 4.197368421052632, "percentage": 59.96, "elapsed_time": "4:23:54", "remaining_time": "2:56:15"}
{"current_steps": 1390, "total_steps": 2310, "loss": 0.0008, "lr": 1.64377834114165e-05, "epoch": 4.212550607287449, "percentage": 60.17, "elapsed_time": "4:24:48", "remaining_time": "2:55:16"}
{"current_steps": 1395, "total_steps": 2310, "loss": 0.0007, "lr": 1.6289191940105004e-05, "epoch": 4.227732793522267, "percentage": 60.39, "elapsed_time": "4:25:45", "remaining_time": "2:54:18"}
{"current_steps": 1400, "total_steps": 2310, "loss": 0.0007, "lr": 1.614081230354158e-05, "epoch": 4.242914979757085, "percentage": 60.61, "elapsed_time": "4:26:44", "remaining_time": "2:53:23"}
{"current_steps": 1405, "total_steps": 2310, "loss": 0.0007, "lr": 1.599265297210852e-05, "epoch": 4.258097165991903, "percentage": 60.82, "elapsed_time": "4:27:45", "remaining_time": "2:52:28"}
{"current_steps": 1410, "total_steps": 2310, "loss": 0.0007, "lr": 1.5844722403611814e-05, "epoch": 4.27327935222672, "percentage": 61.04, "elapsed_time": "4:28:46", "remaining_time": "2:51:33"}
{"current_steps": 1415, "total_steps": 2310, "loss": 0.0007, "lr": 1.5697029042798334e-05, "epoch": 4.288461538461538, "percentage": 61.26, "elapsed_time": "4:29:41", "remaining_time": "2:50:34"}
{"current_steps": 1420, "total_steps": 2310, "loss": 0.0007, "lr": 1.5549581320873715e-05, "epoch": 4.303643724696356, "percentage": 61.47, "elapsed_time": "4:30:38", "remaining_time": "2:49:37"}
{"current_steps": 1425, "total_steps": 2310, "loss": 0.0007, "lr": 1.5402387655021106e-05, "epoch": 4.318825910931174, "percentage": 61.69, "elapsed_time": "4:31:34", "remaining_time": "2:48:39"}
{"current_steps": 1430, "total_steps": 2310, "loss": 0.0006, "lr": 1.5255456447920622e-05, "epoch": 4.334008097165992, "percentage": 61.9, "elapsed_time": "4:32:30", "remaining_time": "2:47:41"}
{"current_steps": 1435, "total_steps": 2310, "loss": 0.0006, "lr": 1.5108796087269704e-05, "epoch": 4.34919028340081, "percentage": 62.12, "elapsed_time": "4:33:20", "remaining_time": "2:46:40"}
{"current_steps": 1440, "total_steps": 2310, "loss": 0.0006, "lr": 1.4962414945304269e-05, "epoch": 4.364372469635628, "percentage": 62.34, "elapsed_time": "4:34:16", "remaining_time": "2:45:42"}
{"current_steps": 1445, "total_steps": 2310, "loss": 0.0006, "lr": 1.48163213783208e-05, "epoch": 4.379554655870446, "percentage": 62.55, "elapsed_time": "4:35:13", "remaining_time": "2:44:45"}
{"current_steps": 1450, "total_steps": 2310, "loss": 0.0007, "lr": 1.4670523726199304e-05, "epoch": 4.394736842105263, "percentage": 62.77, "elapsed_time": "4:36:03", "remaining_time": "2:43:44"}
{"current_steps": 1455, "total_steps": 2310, "loss": 0.0006, "lr": 1.4525030311927228e-05, "epoch": 4.409919028340081, "percentage": 62.99, "elapsed_time": "4:36:57", "remaining_time": "2:42:44"}
{"current_steps": 1460, "total_steps": 2310, "loss": 0.0006, "lr": 1.4379849441124345e-05, "epoch": 4.425101214574899, "percentage": 63.2, "elapsed_time": "4:37:46", "remaining_time": "2:41:43"}
{"current_steps": 1465, "total_steps": 2310, "loss": 0.0006, "lr": 1.4234989401568584e-05, "epoch": 4.440283400809717, "percentage": 63.42, "elapsed_time": "4:38:45", "remaining_time": "2:40:47"}
{"current_steps": 1470, "total_steps": 2310, "loss": 0.0007, "lr": 1.4090458462722964e-05, "epoch": 4.455465587044534, "percentage": 63.64, "elapsed_time": "4:39:42", "remaining_time": "2:39:49"}
{"current_steps": 1475, "total_steps": 2310, "loss": 0.0006, "lr": 1.3946264875263485e-05, "epoch": 4.470647773279352, "percentage": 63.85, "elapsed_time": "4:40:39", "remaining_time": "2:38:53"}
{"current_steps": 1480, "total_steps": 2310, "loss": 0.0006, "lr": 1.380241687060815e-05, "epoch": 4.48582995951417, "percentage": 64.07, "elapsed_time": "4:41:39", "remaining_time": "2:37:57"}
{"current_steps": 1485, "total_steps": 2310, "loss": 0.0007, "lr": 1.3658922660447065e-05, "epoch": 4.501012145748988, "percentage": 64.29, "elapsed_time": "4:42:32", "remaining_time": "2:36:57"}
{"current_steps": 1490, "total_steps": 2310, "loss": 0.0007, "lr": 1.3515790436273664e-05, "epoch": 4.516194331983805, "percentage": 64.5, "elapsed_time": "4:43:27", "remaining_time": "2:35:59"}
{"current_steps": 1495, "total_steps": 2310, "loss": 0.0006, "lr": 1.3373028368917091e-05, "epoch": 4.531376518218623, "percentage": 64.72, "elapsed_time": "4:44:21", "remaining_time": "2:35:01"}
{"current_steps": 1500, "total_steps": 2310, "loss": 0.0006, "lr": 1.3230644608075766e-05, "epoch": 4.5465587044534415, "percentage": 64.94, "elapsed_time": "4:45:13", "remaining_time": "2:34:01"}
{"current_steps": 1505, "total_steps": 2310, "loss": 0.0005, "lr": 1.3088647281852138e-05, "epoch": 4.5617408906882595, "percentage": 65.15, "elapsed_time": "4:46:19", "remaining_time": "2:33:08"}
{"current_steps": 1510, "total_steps": 2310, "loss": 0.0005, "lr": 1.2947044496288718e-05, "epoch": 4.576923076923077, "percentage": 65.37, "elapsed_time": "4:47:11", "remaining_time": "2:32:09"}
{"current_steps": 1515, "total_steps": 2310, "loss": 0.0005, "lr": 1.2805844334905286e-05, "epoch": 4.592105263157895, "percentage": 65.58, "elapsed_time": "4:48:10", "remaining_time": "2:31:13"}
{"current_steps": 1520, "total_steps": 2310, "loss": 0.0005, "lr": 1.266505485823748e-05, "epoch": 4.607287449392713, "percentage": 65.8, "elapsed_time": "4:49:07", "remaining_time": "2:30:15"}
{"current_steps": 1525, "total_steps": 2310, "loss": 0.0005, "lr": 1.2524684103376643e-05, "epoch": 4.62246963562753, "percentage": 66.02, "elapsed_time": "4:50:03", "remaining_time": "2:29:18"}
{"current_steps": 1530, "total_steps": 2310, "loss": 0.0005, "lr": 1.238474008351101e-05, "epoch": 4.637651821862348, "percentage": 66.23, "elapsed_time": "4:51:03", "remaining_time": "2:28:22"}
{"current_steps": 1535, "total_steps": 2310, "loss": 0.0006, "lr": 1.2245230787468279e-05, "epoch": 4.652834008097166, "percentage": 66.45, "elapsed_time": "4:52:00", "remaining_time": "2:27:25"}
{"current_steps": 1540, "total_steps": 2310, "loss": 0.0006, "lr": 1.210616417925954e-05, "epoch": 4.668016194331984, "percentage": 66.67, "elapsed_time": "4:52:52", "remaining_time": "2:26:26"}
{"current_steps": 1545, "total_steps": 2310, "loss": 0.0005, "lr": 1.1967548197624679e-05, "epoch": 4.683198380566802, "percentage": 66.88, "elapsed_time": "4:53:43", "remaining_time": "2:25:26"}
{"current_steps": 1550, "total_steps": 2310, "loss": 0.0005, "lr": 1.182939075557915e-05, "epoch": 4.698380566801619, "percentage": 67.1, "elapsed_time": "4:54:41", "remaining_time": "2:24:29"}
{"current_steps": 1555, "total_steps": 2310, "loss": 0.0005, "lr": 1.1691699739962275e-05, "epoch": 4.713562753036437, "percentage": 67.32, "elapsed_time": "4:55:36", "remaining_time": "2:23:31"}
{"current_steps": 1560, "total_steps": 2310, "loss": 0.0004, "lr": 1.1554483010987015e-05, "epoch": 4.728744939271255, "percentage": 67.53, "elapsed_time": "4:56:36", "remaining_time": "2:22:36"}
{"current_steps": 1565, "total_steps": 2310, "loss": 0.0005, "lr": 1.1417748401791261e-05, "epoch": 4.743927125506072, "percentage": 67.75, "elapsed_time": "4:57:27", "remaining_time": "2:21:36"}
{"current_steps": 1570, "total_steps": 2310, "loss": 0.0005, "lr": 1.1281503717990675e-05, "epoch": 4.7591093117408905, "percentage": 67.97, "elapsed_time": "4:58:23", "remaining_time": "2:20:38"}
{"current_steps": 1575, "total_steps": 2310, "loss": 0.0005, "lr": 1.1145756737233078e-05, "epoch": 4.7742914979757085, "percentage": 68.18, "elapsed_time": "4:59:22", "remaining_time": "2:19:42"}
{"current_steps": 1580, "total_steps": 2310, "loss": 0.0005, "lr": 1.1010515208754487e-05, "epoch": 4.7894736842105265, "percentage": 68.4, "elapsed_time": "5:00:14", "remaining_time": "2:18:43"}
{"current_steps": 1585, "total_steps": 2310, "loss": 0.0005, "lr": 1.087578685293674e-05, "epoch": 4.804655870445345, "percentage": 68.61, "elapsed_time": "5:01:12", "remaining_time": "2:17:46"}
{"current_steps": 1590, "total_steps": 2310, "loss": 0.0004, "lr": 1.0741579360866752e-05, "epoch": 4.819838056680162, "percentage": 68.83, "elapsed_time": "5:02:09", "remaining_time": "2:16:49"}
{"current_steps": 1595, "total_steps": 2310, "loss": 0.0004, "lr": 1.0607900393897463e-05, "epoch": 4.83502024291498, "percentage": 69.05, "elapsed_time": "5:03:03", "remaining_time": "2:15:51"}
{"current_steps": 1600, "total_steps": 2310, "loss": 0.0004, "lr": 1.0474757583210498e-05, "epoch": 4.850202429149798, "percentage": 69.26, "elapsed_time": "5:04:01", "remaining_time": "2:14:54"}
{"current_steps": 1605, "total_steps": 2310, "loss": 0.0004, "lr": 1.0342158529380544e-05, "epoch": 4.865384615384615, "percentage": 69.48, "elapsed_time": "5:04:54", "remaining_time": "2:13:55"}
{"current_steps": 1610, "total_steps": 2310, "loss": 0.0004, "lr": 1.0210110801941426e-05, "epoch": 4.880566801619433, "percentage": 69.7, "elapsed_time": "5:05:54", "remaining_time": "2:13:00"}
{"current_steps": 1615, "total_steps": 2310, "loss": 0.0005, "lr": 1.0078621938954028e-05, "epoch": 4.895748987854251, "percentage": 69.91, "elapsed_time": "5:06:46", "remaining_time": "2:12:01"}
{"current_steps": 1620, "total_steps": 2310, "loss": 0.0004, "lr": 9.947699446575951e-06, "epoch": 4.910931174089069, "percentage": 70.13, "elapsed_time": "5:07:41", "remaining_time": "2:11:03"}
{"current_steps": 1625, "total_steps": 2310, "loss": 0.0004, "lr": 9.817350798633068e-06, "epoch": 4.926113360323887, "percentage": 70.35, "elapsed_time": "5:08:35", "remaining_time": "2:10:04"}
{"current_steps": 1630, "total_steps": 2310, "loss": 0.0004, "lr": 9.687583436192785e-06, "epoch": 4.941295546558704, "percentage": 70.56, "elapsed_time": "5:09:36", "remaining_time": "2:09:09"}
{"current_steps": 1635, "total_steps": 2310, "loss": 0.0004, "lr": 9.558404767139335e-06, "epoch": 4.956477732793522, "percentage": 70.78, "elapsed_time": "5:10:39", "remaining_time": "2:08:15"}
{"current_steps": 1640, "total_steps": 2310, "loss": 0.0004, "lr": 9.429822165750893e-06, "epoch": 4.97165991902834, "percentage": 71.0, "elapsed_time": "5:11:40", "remaining_time": "2:07:19"}
{"current_steps": 1645, "total_steps": 2310, "loss": 0.0004, "lr": 9.301842972278557e-06, "epoch": 4.9868421052631575, "percentage": 71.21, "elapsed_time": "5:12:37", "remaining_time": "2:06:22"}
{"current_steps": 1650, "total_steps": 2310, "loss": 0.0004, "lr": 9.174474492527359e-06, "epoch": 5.0, "percentage": 71.43, "elapsed_time": "5:13:26", "remaining_time": "2:05:22"}
{"current_steps": 1655, "total_steps": 2310, "loss": 0.0003, "lr": 9.047723997439206e-06, "epoch": 5.015182186234818, "percentage": 71.65, "elapsed_time": "5:14:24", "remaining_time": "2:04:25"}
{"current_steps": 1660, "total_steps": 2310, "loss": 0.0004, "lr": 8.921598722677796e-06, "epoch": 5.030364372469635, "percentage": 71.86, "elapsed_time": "5:15:20", "remaining_time": "2:03:28"}
{"current_steps": 1665, "total_steps": 2310, "loss": 0.0004, "lr": 8.796105868215592e-06, "epoch": 5.045546558704453, "percentage": 72.08, "elapsed_time": "5:16:11", "remaining_time": "2:02:29"}
{"current_steps": 1670, "total_steps": 2310, "loss": 0.0003, "lr": 8.671252597922768e-06, "epoch": 5.060728744939271, "percentage": 72.29, "elapsed_time": "5:17:06", "remaining_time": "2:01:31"}
{"current_steps": 1675, "total_steps": 2310, "loss": 0.0004, "lr": 8.547046039158283e-06, "epoch": 5.075910931174089, "percentage": 72.51, "elapsed_time": "5:18:03", "remaining_time": "2:00:34"}
{"current_steps": 1680, "total_steps": 2310, "loss": 0.0004, "lr": 8.423493282362982e-06, "epoch": 5.0910931174089065, "percentage": 72.73, "elapsed_time": "5:18:59", "remaining_time": "1:59:37"}
{"current_steps": 1685, "total_steps": 2310, "loss": 0.0004, "lr": 8.300601380654883e-06, "epoch": 5.1062753036437245, "percentage": 72.94, "elapsed_time": "5:19:49", "remaining_time": "1:58:37"}
{"current_steps": 1690, "total_steps": 2310, "loss": 0.0003, "lr": 8.178377349426471e-06, "epoch": 5.1214574898785425, "percentage": 73.16, "elapsed_time": "5:20:45", "remaining_time": "1:57:40"}
{"current_steps": 1695, "total_steps": 2310, "loss": 0.0003, "lr": 8.056828165944282e-06, "epoch": 5.136639676113361, "percentage": 73.38, "elapsed_time": "5:21:42", "remaining_time": "1:56:43"}
{"current_steps": 1700, "total_steps": 2310, "loss": 0.0003, "lr": 7.93596076895055e-06, "epoch": 5.151821862348178, "percentage": 73.59, "elapsed_time": "5:22:40", "remaining_time": "1:55:47"}
{"current_steps": 1705, "total_steps": 2310, "loss": 0.0003, "lr": 7.815782058267156e-06, "epoch": 5.167004048582996, "percentage": 73.81, "elapsed_time": "5:23:33", "remaining_time": "1:54:48"}
{"current_steps": 1710, "total_steps": 2310, "loss": 0.0003, "lr": 7.696298894401697e-06, "epoch": 5.182186234817814, "percentage": 74.03, "elapsed_time": "5:24:28", "remaining_time": "1:53:50"}
{"current_steps": 1715, "total_steps": 2310, "loss": 0.0003, "lr": 7.577518098155869e-06, "epoch": 5.197368421052632, "percentage": 74.24, "elapsed_time": "5:25:20", "remaining_time": "1:52:52"}
{"current_steps": 1720, "total_steps": 2310, "loss": 0.0004, "lr": 7.459446450236087e-06, "epoch": 5.212550607287449, "percentage": 74.46, "elapsed_time": "5:26:19", "remaining_time": "1:51:56"}
{"current_steps": 1725, "total_steps": 2310, "loss": 0.0004, "lr": 7.3420906908664345e-06, "epoch": 5.227732793522267, "percentage": 74.68, "elapsed_time": "5:27:14", "remaining_time": "1:50:58"}
{"current_steps": 1730, "total_steps": 2310, "loss": 0.0004, "lr": 7.225457519403838e-06, "epoch": 5.242914979757085, "percentage": 74.89, "elapsed_time": "5:28:09", "remaining_time": "1:50:00"}
{"current_steps": 1735, "total_steps": 2310, "loss": 0.0003, "lr": 7.109553593955671e-06, "epoch": 5.258097165991903, "percentage": 75.11, "elapsed_time": "5:29:03", "remaining_time": "1:49:03"}
{"current_steps": 1740, "total_steps": 2310, "loss": 0.0003, "lr": 6.99438553099965e-06, "epoch": 5.27327935222672, "percentage": 75.32, "elapsed_time": "5:29:58", "remaining_time": "1:48:05"}
{"current_steps": 1745, "total_steps": 2310, "loss": 0.0003, "lr": 6.879959905006135e-06, "epoch": 5.288461538461538, "percentage": 75.54, "elapsed_time": "5:30:57", "remaining_time": "1:47:09"}
{"current_steps": 1750, "total_steps": 2310, "loss": 0.0003, "lr": 6.766283248062817e-06, "epoch": 5.303643724696356, "percentage": 75.76, "elapsed_time": "5:31:54", "remaining_time": "1:46:12"}
{"current_steps": 1755, "total_steps": 2310, "loss": 0.0003, "lr": 6.653362049501826e-06, "epoch": 5.318825910931174, "percentage": 75.97, "elapsed_time": "5:32:46", "remaining_time": "1:45:14"}
{"current_steps": 1760, "total_steps": 2310, "loss": 0.0004, "lr": 6.541202755529299e-06, "epoch": 5.334008097165992, "percentage": 76.19, "elapsed_time": "5:33:43", "remaining_time": "1:44:17"}
{"current_steps": 1765, "total_steps": 2310, "loss": 0.0003, "lr": 6.429811768857359e-06, "epoch": 5.34919028340081, "percentage": 76.41, "elapsed_time": "5:34:35", "remaining_time": "1:43:19"}
{"current_steps": 1770, "total_steps": 2310, "loss": 0.0003, "lr": 6.319195448338642e-06, "epoch": 5.364372469635628, "percentage": 76.62, "elapsed_time": "5:35:29", "remaining_time": "1:42:21"}
{"current_steps": 1775, "total_steps": 2310, "loss": 0.0003, "lr": 6.2093601086032754e-06, "epoch": 5.379554655870446, "percentage": 76.84, "elapsed_time": "5:36:23", "remaining_time": "1:41:23"}
{"current_steps": 1780, "total_steps": 2310, "loss": 0.0003, "lr": 6.100312019698411e-06, "epoch": 5.394736842105263, "percentage": 77.06, "elapsed_time": "5:37:21", "remaining_time": "1:40:27"}
{"current_steps": 1785, "total_steps": 2310, "loss": 0.0003, "lr": 5.9920574067303114e-06, "epoch": 5.409919028340081, "percentage": 77.27, "elapsed_time": "5:38:22", "remaining_time": "1:39:31"}
{"current_steps": 1790, "total_steps": 2310, "loss": 0.0003, "lr": 5.8846024495089425e-06, "epoch": 5.425101214574899, "percentage": 77.49, "elapsed_time": "5:39:15", "remaining_time": "1:38:33"}
{"current_steps": 1795, "total_steps": 2310, "loss": 0.0003, "lr": 5.777953282195228e-06, "epoch": 5.440283400809717, "percentage": 77.71, "elapsed_time": "5:40:17", "remaining_time": "1:37:37"}
{"current_steps": 1800, "total_steps": 2310, "loss": 0.0003, "lr": 5.672115992950855e-06, "epoch": 5.455465587044534, "percentage": 77.92, "elapsed_time": "5:41:17", "remaining_time": "1:36:41"}
{"current_steps": 1805, "total_steps": 2310, "loss": 0.0003, "lr": 5.567096623590758e-06, "epoch": 5.470647773279352, "percentage": 78.14, "elapsed_time": "5:42:22", "remaining_time": "1:35:47"}
{"current_steps": 1810, "total_steps": 2310, "loss": 0.0003, "lr": 5.462901169238175e-06, "epoch": 5.48582995951417, "percentage": 78.35, "elapsed_time": "5:43:17", "remaining_time": "1:34:49"}
{"current_steps": 1815, "total_steps": 2310, "loss": 0.0002, "lr": 5.359535577982431e-06, "epoch": 5.501012145748988, "percentage": 78.57, "elapsed_time": "5:44:21", "remaining_time": "1:33:54"}
{"current_steps": 1820, "total_steps": 2310, "loss": 0.0003, "lr": 5.257005750539388e-06, "epoch": 5.516194331983805, "percentage": 78.79, "elapsed_time": "5:45:17", "remaining_time": "1:32:57"}
{"current_steps": 1825, "total_steps": 2310, "loss": 0.0003, "lr": 5.155317539914601e-06, "epoch": 5.531376518218623, "percentage": 79.0, "elapsed_time": "5:46:13", "remaining_time": "1:32:00"}
{"current_steps": 1830, "total_steps": 2310, "loss": 0.0003, "lr": 5.054476751069179e-06, "epoch": 5.5465587044534415, "percentage": 79.22, "elapsed_time": "5:47:07", "remaining_time": "1:31:02"}
{"current_steps": 1835, "total_steps": 2310, "loss": 0.0003, "lr": 4.954489140588412e-06, "epoch": 5.5617408906882595, "percentage": 79.44, "elapsed_time": "5:48:07", "remaining_time": "1:30:06"}
{"current_steps": 1840, "total_steps": 2310, "loss": 0.0003, "lr": 4.85536041635315e-06, "epoch": 5.576923076923077, "percentage": 79.65, "elapsed_time": "5:49:05", "remaining_time": "1:29:10"}
{"current_steps": 1845, "total_steps": 2310, "loss": 0.0003, "lr": 4.757096237213976e-06, "epoch": 5.592105263157895, "percentage": 79.87, "elapsed_time": "5:50:00", "remaining_time": "1:28:12"}
{"current_steps": 1850, "total_steps": 2310, "loss": 0.0003, "lr": 4.659702212668151e-06, "epoch": 5.607287449392713, "percentage": 80.09, "elapsed_time": "5:50:52", "remaining_time": "1:27:14"}
{"current_steps": 1855, "total_steps": 2310, "loss": 0.0002, "lr": 4.5631839025393694e-06, "epoch": 5.62246963562753, "percentage": 80.3, "elapsed_time": "5:51:47", "remaining_time": "1:26:17"}
{"current_steps": 1860, "total_steps": 2310, "loss": 0.0003, "lr": 4.467546816660433e-06, "epoch": 5.637651821862348, "percentage": 80.52, "elapsed_time": "5:52:45", "remaining_time": "1:25:20"}
{"current_steps": 1865, "total_steps": 2310, "loss": 0.0002, "lr": 4.372796414558665e-06, "epoch": 5.652834008097166, "percentage": 80.74, "elapsed_time": "5:53:42", "remaining_time": "1:24:23"}
{"current_steps": 1870, "total_steps": 2310, "loss": 0.0002, "lr": 4.278938105144255e-06, "epoch": 5.668016194331984, "percentage": 80.95, "elapsed_time": "5:54:38", "remaining_time": "1:23:26"}
{"current_steps": 1875, "total_steps": 2310, "loss": 0.0002, "lr": 4.185977246401509e-06, "epoch": 5.683198380566802, "percentage": 81.17, "elapsed_time": "5:55:37", "remaining_time": "1:22:30"}
{"current_steps": 1880, "total_steps": 2310, "loss": 0.0002, "lr": 4.093919145082959e-06, "epoch": 5.698380566801619, "percentage": 81.39, "elapsed_time": "5:56:34", "remaining_time": "1:21:33"}
{"current_steps": 1885, "total_steps": 2310, "loss": 0.0002, "lr": 4.002769056406453e-06, "epoch": 5.713562753036437, "percentage": 81.6, "elapsed_time": "5:57:29", "remaining_time": "1:20:36"}
{"current_steps": 1890, "total_steps": 2310, "loss": 0.0003, "lr": 3.912532183755115e-06, "epoch": 5.728744939271255, "percentage": 81.82, "elapsed_time": "5:58:28", "remaining_time": "1:19:39"}
{"current_steps": 1895, "total_steps": 2310, "loss": 0.0002, "lr": 3.82321367838034e-06, "epoch": 5.743927125506072, "percentage": 82.03, "elapsed_time": "5:59:30", "remaining_time": "1:18:43"}
{"current_steps": 1900, "total_steps": 2310, "loss": 0.0002, "lr": 3.734818639107709e-06, "epoch": 5.7591093117408905, "percentage": 82.25, "elapsed_time": "6:00:27", "remaining_time": "1:17:47"}
{"current_steps": 1905, "total_steps": 2310, "loss": 0.0002, "lr": 3.647352112045943e-06, "epoch": 5.7742914979757085, "percentage": 82.47, "elapsed_time": "6:01:21", "remaining_time": "1:16:49"}
{"current_steps": 1910, "total_steps": 2310, "loss": 0.0003, "lr": 3.560819090298808e-06, "epoch": 5.7894736842105265, "percentage": 82.68, "elapsed_time": "6:02:11", "remaining_time": "1:15:51"}
{"current_steps": 1915, "total_steps": 2310, "loss": 0.0003, "lr": 3.4752245136801065e-06, "epoch": 5.804655870445345, "percentage": 82.9, "elapsed_time": "6:03:02", "remaining_time": "1:14:53"}
{"current_steps": 1920, "total_steps": 2310, "loss": 0.0002, "lr": 3.3905732684316626e-06, "epoch": 5.819838056680162, "percentage": 83.12, "elapsed_time": "6:03:59", "remaining_time": "1:13:56"}
{"current_steps": 1925, "total_steps": 2310, "loss": 0.0002, "lr": 3.3068701869444177e-06, "epoch": 5.83502024291498, "percentage": 83.33, "elapsed_time": "6:04:56", "remaining_time": "1:12:59"}
{"current_steps": 1930, "total_steps": 2310, "loss": 0.0002, "lr": 3.2241200474825307e-06, "epoch": 5.850202429149798, "percentage": 83.55, "elapsed_time": "6:05:55", "remaining_time": "1:12:02"}
{"current_steps": 1935, "total_steps": 2310, "loss": 0.0002, "lr": 3.1423275739106353e-06, "epoch": 5.865384615384615, "percentage": 83.77, "elapsed_time": "6:06:50", "remaining_time": "1:11:05"}
{"current_steps": 1940, "total_steps": 2310, "loss": 0.0002, "lr": 3.0614974354241547e-06, "epoch": 5.880566801619433, "percentage": 83.98, "elapsed_time": "6:07:51", "remaining_time": "1:10:09"}
{"current_steps": 1945, "total_steps": 2310, "loss": 0.0002, "lr": 2.9816342462827806e-06, "epoch": 5.895748987854251, "percentage": 84.2, "elapsed_time": "6:08:48", "remaining_time": "1:09:12"}
{"current_steps": 1950, "total_steps": 2310, "loss": 0.0002, "lr": 2.9027425655470366e-06, "epoch": 5.910931174089069, "percentage": 84.42, "elapsed_time": "6:09:42", "remaining_time": "1:08:15"}
{"current_steps": 1955, "total_steps": 2310, "loss": 0.0002, "lr": 2.824826896818036e-06, "epoch": 5.926113360323887, "percentage": 84.63, "elapsed_time": "6:10:37", "remaining_time": "1:07:17"}
{"current_steps": 1960, "total_steps": 2310, "loss": 0.0002, "lr": 2.747891687980384e-06, "epoch": 5.941295546558704, "percentage": 84.85, "elapsed_time": "6:11:29", "remaining_time": "1:06:20"}
{"current_steps": 1965, "total_steps": 2310, "loss": 0.0002, "lr": 2.6719413309482843e-06, "epoch": 5.956477732793522, "percentage": 85.06, "elapsed_time": "6:12:27", "remaining_time": "1:05:23"}
{"current_steps": 1970, "total_steps": 2310, "loss": 0.0002, "lr": 2.5969801614147838e-06, "epoch": 5.97165991902834, "percentage": 85.28, "elapsed_time": "6:13:21", "remaining_time": "1:04:26"}
{"current_steps": 1975, "total_steps": 2310, "loss": 0.0002, "lr": 2.5230124586043016e-06, "epoch": 5.9868421052631575, "percentage": 85.5, "elapsed_time": "6:14:21", "remaining_time": "1:03:30"}
{"current_steps": 1980, "total_steps": 2310, "loss": 0.0002, "lr": 2.45004244502834e-06, "epoch": 6.0, "percentage": 85.71, "elapsed_time": "6:15:11", "remaining_time": "1:02:31"}
{"current_steps": 1985, "total_steps": 2310, "loss": 0.0002, "lr": 2.3780742862444205e-06, "epoch": 6.015182186234818, "percentage": 85.93, "elapsed_time": "6:16:07", "remaining_time": "1:01:34"}
{"current_steps": 1990, "total_steps": 2310, "loss": 0.0002, "lr": 2.3071120906183064e-06, "epoch": 6.030364372469635, "percentage": 86.15, "elapsed_time": "6:17:03", "remaining_time": "1:00:37"}
{"current_steps": 1995, "total_steps": 2310, "loss": 0.0002, "lr": 2.237159909089468e-06, "epoch": 6.045546558704453, "percentage": 86.36, "elapsed_time": "6:17:57", "remaining_time": "0:59:40"}
{"current_steps": 2000, "total_steps": 2310, "loss": 0.0002, "lr": 2.168221734939824e-06, "epoch": 6.060728744939271, "percentage": 86.58, "elapsed_time": "6:18:54", "remaining_time": "0:58:43"}
{"current_steps": 2005, "total_steps": 2310, "loss": 0.0002, "lr": 2.1003015035658024e-06, "epoch": 6.075910931174089, "percentage": 86.8, "elapsed_time": "6:19:52", "remaining_time": "0:57:47"}
{"current_steps": 2010, "total_steps": 2310, "loss": 0.0002, "lr": 2.0334030922536606e-06, "epoch": 6.0910931174089065, "percentage": 87.01, "elapsed_time": "6:20:49", "remaining_time": "0:56:50"}
{"current_steps": 2015, "total_steps": 2310, "loss": 0.0002, "lr": 1.9675303199581554e-06, "epoch": 6.1062753036437245, "percentage": 87.23, "elapsed_time": "6:21:44", "remaining_time": "0:55:53"}
{"current_steps": 2020, "total_steps": 2310, "loss": 0.0002, "lr": 1.90268694708454e-06, "epoch": 6.1214574898785425, "percentage": 87.45, "elapsed_time": "6:22:40", "remaining_time": "0:54:56"}
{"current_steps": 2025, "total_steps": 2310, "loss": 0.0002, "lr": 1.8388766752739017e-06, "epoch": 6.136639676113361, "percentage": 87.66, "elapsed_time": "6:23:37", "remaining_time": "0:53:59"}
{"current_steps": 2030, "total_steps": 2310, "loss": 0.0002, "lr": 1.77610314719183e-06, "epoch": 6.151821862348178, "percentage": 87.88, "elapsed_time": "6:24:35", "remaining_time": "0:53:02"}
{"current_steps": 2035, "total_steps": 2310, "loss": 0.0002, "lr": 1.7143699463204932e-06, "epoch": 6.167004048582996, "percentage": 88.1, "elapsed_time": "6:25:34", "remaining_time": "0:52:06"}
{"current_steps": 2040, "total_steps": 2310, "loss": 0.0002, "lr": 1.6536805967540614e-06, "epoch": 6.182186234817814, "percentage": 88.31, "elapsed_time": "6:26:30", "remaining_time": "0:51:09"}
{"current_steps": 2045, "total_steps": 2310, "loss": 0.0002, "lr": 1.5940385629975353e-06, "epoch": 6.197368421052632, "percentage": 88.53, "elapsed_time": "6:27:26", "remaining_time": "0:50:12"}
{"current_steps": 2050, "total_steps": 2310, "loss": 0.0002, "lr": 1.535447249768971e-06, "epoch": 6.212550607287449, "percentage": 88.74, "elapsed_time": "6:28:24", "remaining_time": "0:49:15"}
{"current_steps": 2055, "total_steps": 2310, "loss": 0.0002, "lr": 1.4779100018051118e-06, "epoch": 6.227732793522267, "percentage": 88.96, "elapsed_time": "6:29:19", "remaining_time": "0:48:18"}
{"current_steps": 2060, "total_steps": 2310, "loss": 0.0002, "lr": 1.421430103670456e-06, "epoch": 6.242914979757085, "percentage": 89.18, "elapsed_time": "6:30:09", "remaining_time": "0:47:20"}
{"current_steps": 2065, "total_steps": 2310, "loss": 0.0002, "lr": 1.366010779569764e-06, "epoch": 6.258097165991903, "percentage": 89.39, "elapsed_time": "6:31:10", "remaining_time": "0:46:24"}
{"current_steps": 2070, "total_steps": 2310, "loss": 0.0002, "lr": 1.3116551931639899e-06, "epoch": 6.27327935222672, "percentage": 89.61, "elapsed_time": "6:32:04", "remaining_time": "0:45:27"}
{"current_steps": 2075, "total_steps": 2310, "loss": 0.0002, "lr": 1.258366447389674e-06, "epoch": 6.288461538461538, "percentage": 89.83, "elapsed_time": "6:33:02", "remaining_time": "0:44:30"}
{"current_steps": 2080, "total_steps": 2310, "loss": 0.0002, "lr": 1.2061475842818337e-06, "epoch": 6.303643724696356, "percentage": 90.04, "elapsed_time": "6:33:56", "remaining_time": "0:43:33"}
{"current_steps": 2085, "total_steps": 2310, "loss": 0.0002, "lr": 1.1550015848002816e-06, "epoch": 6.318825910931174, "percentage": 90.26, "elapsed_time": "6:34:54", "remaining_time": "0:42:36"}
{"current_steps": 2090, "total_steps": 2310, "loss": 0.0002, "lr": 1.1049313686594675e-06, "epoch": 6.334008097165992, "percentage": 90.48, "elapsed_time": "6:35:42", "remaining_time": "0:41:39"}
{"current_steps": 2095, "total_steps": 2310, "loss": 0.0002, "lr": 1.0559397941618022e-06, "epoch": 6.34919028340081, "percentage": 90.69, "elapsed_time": "6:36:40", "remaining_time": "0:40:42"}
{"current_steps": 2100, "total_steps": 2310, "loss": 0.0002, "lr": 1.008029658034484e-06, "epoch": 6.364372469635628, "percentage": 90.91, "elapsed_time": "6:37:36", "remaining_time": "0:39:45"}
{"current_steps": 2105, "total_steps": 2310, "loss": 0.0002, "lr": 9.61203695269859e-07, "epoch": 6.379554655870446, "percentage": 91.13, "elapsed_time": "6:38:43", "remaining_time": "0:38:49"}
{"current_steps": 2110, "total_steps": 2310, "loss": 0.0002, "lr": 9.154645789692718e-07, "epoch": 6.394736842105263, "percentage": 91.34, "elapsed_time": "6:39:35", "remaining_time": "0:37:52"}
{"current_steps": 2115, "total_steps": 2310, "loss": 0.0002, "lr": 8.708149201904814e-07, "epoch": 6.409919028340081, "percentage": 91.56, "elapsed_time": "6:40:35", "remaining_time": "0:36:56"}
{"current_steps": 2120, "total_steps": 2310, "loss": 0.0002, "lr": 8.272572677986001e-07, "epoch": 6.425101214574899, "percentage": 91.77, "elapsed_time": "6:41:33", "remaining_time": "0:35:59"}
{"current_steps": 2125, "total_steps": 2310, "loss": 0.0001, "lr": 7.847941083206057e-07, "epoch": 6.440283400809717, "percentage": 91.99, "elapsed_time": "6:42:32", "remaining_time": "0:35:02"}
{"current_steps": 2130, "total_steps": 2310, "loss": 0.0002, "lr": 7.4342786580337e-07, "epoch": 6.455465587044534, "percentage": 92.21, "elapsed_time": "6:43:30", "remaining_time": "0:34:05"}
{"current_steps": 2135, "total_steps": 2310, "loss": 0.0002, "lr": 7.031609016753016e-07, "epoch": 6.470647773279352, "percentage": 92.42, "elapsed_time": "6:44:25", "remaining_time": "0:33:08"}
{"current_steps": 2140, "total_steps": 2310, "loss": 0.0001, "lr": 6.639955146115284e-07, "epoch": 6.48582995951417, "percentage": 92.64, "elapsed_time": "6:45:22", "remaining_time": "0:32:12"}
{"current_steps": 2145, "total_steps": 2310, "loss": 0.0002, "lr": 6.259339404026876e-07, "epoch": 6.501012145748988, "percentage": 92.86, "elapsed_time": "6:46:18", "remaining_time": "0:31:15"}
{"current_steps": 2150, "total_steps": 2310, "loss": 0.0002, "lr": 5.889783518272785e-07, "epoch": 6.516194331983805, "percentage": 93.07, "elapsed_time": "6:47:09", "remaining_time": "0:30:18"}
{"current_steps": 2155, "total_steps": 2310, "loss": 0.0002, "lr": 5.53130858527644e-07, "epoch": 6.531376518218623, "percentage": 93.29, "elapsed_time": "6:48:00", "remaining_time": "0:29:20"}
{"current_steps": 2160, "total_steps": 2310, "loss": 0.0002, "lr": 5.183935068895207e-07, "epoch": 6.5465587044534415, "percentage": 93.51, "elapsed_time": "6:48:59", "remaining_time": "0:28:24"}
{"current_steps": 2165, "total_steps": 2310, "loss": 0.0002, "lr": 4.847682799252474e-07, "epoch": 6.5617408906882595, "percentage": 93.72, "elapsed_time": "6:49:54", "remaining_time": "0:27:27"}
{"current_steps": 2170, "total_steps": 2310, "loss": 0.0002, "lr": 4.522570971605289e-07, "epoch": 6.576923076923077, "percentage": 93.94, "elapsed_time": "6:50:46", "remaining_time": "0:26:30"}
{"current_steps": 2175, "total_steps": 2310, "loss": 0.0001, "lr": 4.208618145248866e-07, "epoch": 6.592105263157895, "percentage": 94.16, "elapsed_time": "6:51:43", "remaining_time": "0:25:33"}
{"current_steps": 2180, "total_steps": 2310, "loss": 0.0002, "lr": 3.9058422424568923e-07, "epoch": 6.607287449392713, "percentage": 94.37, "elapsed_time": "6:52:41", "remaining_time": "0:24:36"}
{"current_steps": 2185, "total_steps": 2310, "loss": 0.0001, "lr": 3.614260547458659e-07, "epoch": 6.62246963562753, "percentage": 94.59, "elapsed_time": "6:53:31", "remaining_time": "0:23:39"}
{"current_steps": 2190, "total_steps": 2310, "loss": 0.0002, "lr": 3.3338897054521205e-07, "epoch": 6.637651821862348, "percentage": 94.81, "elapsed_time": "6:54:24", "remaining_time": "0:22:42"}
{"current_steps": 2195, "total_steps": 2310, "loss": 0.0001, "lr": 3.0647457216538724e-07, "epoch": 6.652834008097166, "percentage": 95.02, "elapsed_time": "6:55:28", "remaining_time": "0:21:46"}
{"current_steps": 2200, "total_steps": 2310, "loss": 0.0002, "lr": 2.8068439603853747e-07, "epoch": 6.668016194331984, "percentage": 95.24, "elapsed_time": "6:56:20", "remaining_time": "0:20:49"}
{"current_steps": 2205, "total_steps": 2310, "loss": 0.0002, "lr": 2.5601991441959407e-07, "epoch": 6.683198380566802, "percentage": 95.45, "elapsed_time": "6:57:22", "remaining_time": "0:19:52"}
{"current_steps": 2210, "total_steps": 2310, "loss": 0.0002, "lr": 2.3248253530222753e-07, "epoch": 6.698380566801619, "percentage": 95.67, "elapsed_time": "6:58:22", "remaining_time": "0:18:55"}
{"current_steps": 2215, "total_steps": 2310, "loss": 0.0002, "lr": 2.1007360233846308e-07, "epoch": 6.713562753036437, "percentage": 95.89, "elapsed_time": "6:59:16", "remaining_time": "0:17:58"}
{"current_steps": 2220, "total_steps": 2310, "loss": 0.0001, "lr": 1.8879439476198636e-07, "epoch": 6.728744939271255, "percentage": 96.1, "elapsed_time": "7:00:11", "remaining_time": "0:17:02"}
{"current_steps": 2225, "total_steps": 2310, "loss": 0.0002, "lr": 1.6864612731511298e-07, "epoch": 6.743927125506072, "percentage": 96.32, "elapsed_time": "7:01:04", "remaining_time": "0:16:05"}
{"current_steps": 2230, "total_steps": 2310, "loss": 0.0001, "lr": 1.4962995017944626e-07, "epoch": 6.7591093117408905, "percentage": 96.54, "elapsed_time": "7:02:00", "remaining_time": "0:15:08"}
{"current_steps": 2235, "total_steps": 2310, "loss": 0.0002, "lr": 1.3174694891021188e-07, "epoch": 6.7742914979757085, "percentage": 96.75, "elapsed_time": "7:02:52", "remaining_time": "0:14:11"}
{"current_steps": 2240, "total_steps": 2310, "loss": 0.0001, "lr": 1.1499814437429869e-07, "epoch": 6.7894736842105265, "percentage": 96.97, "elapsed_time": "7:03:48", "remaining_time": "0:13:14"}
{"current_steps": 2245, "total_steps": 2310, "loss": 0.0002, "lr": 9.938449269197181e-08, "epoch": 6.804655870445345, "percentage": 97.19, "elapsed_time": "7:04:45", "remaining_time": "0:12:17"}
{"current_steps": 2250, "total_steps": 2310, "loss": 0.0001, "lr": 8.490688518229651e-08, "epoch": 6.819838056680162, "percentage": 97.4, "elapsed_time": "7:05:43", "remaining_time": "0:11:21"}
{"current_steps": 2255, "total_steps": 2310, "loss": 0.0001, "lr": 7.156614831225428e-08, "epoch": 6.83502024291498, "percentage": 97.62, "elapsed_time": "7:06:40", "remaining_time": "0:10:24"}
{"current_steps": 2260, "total_steps": 2310, "loss": 0.0002, "lr": 5.936304364956513e-08, "epoch": 6.850202429149798, "percentage": 97.84, "elapsed_time": "7:07:38", "remaining_time": "0:09:27"}
{"current_steps": 2265, "total_steps": 2310, "loss": 0.0001, "lr": 4.829826781921343e-08, "epoch": 6.865384615384615, "percentage": 98.05, "elapsed_time": "7:08:35", "remaining_time": "0:08:30"}
{"current_steps": 2270, "total_steps": 2310, "loss": 0.0002, "lr": 3.837245246367749e-08, "epoch": 6.880566801619433, "percentage": 98.27, "elapsed_time": "7:09:33", "remaining_time": "0:07:34"}
{"current_steps": 2275, "total_steps": 2310, "loss": 0.0001, "lr": 2.958616420687177e-08, "epoch": 6.895748987854251, "percentage": 98.48, "elapsed_time": "7:10:32", "remaining_time": "0:06:37"}
{"current_steps": 2280, "total_steps": 2310, "loss": 0.0001, "lr": 2.1939904621806062e-08, "epoch": 6.910931174089069, "percentage": 98.7, "elapsed_time": "7:11:35", "remaining_time": "0:05:40"}
{"current_steps": 2285, "total_steps": 2310, "loss": 0.0002, "lr": 1.5434110201946184e-08, "epoch": 6.926113360323887, "percentage": 98.92, "elapsed_time": "7:12:29", "remaining_time": "0:04:43"}
{"current_steps": 2290, "total_steps": 2310, "loss": 0.0001, "lr": 1.006915233629835e-08, "epoch": 6.941295546558704, "percentage": 99.13, "elapsed_time": "7:13:23", "remaining_time": "0:03:47"}
{"current_steps": 2295, "total_steps": 2310, "loss": 0.0002, "lr": 5.845337288210573e-09, "epoch": 6.956477732793522, "percentage": 99.35, "elapsed_time": "7:14:22", "remaining_time": "0:02:50"}
{"current_steps": 2300, "total_steps": 2310, "loss": 0.0002, "lr": 2.7629061778866597e-09, "epoch": 6.97165991902834, "percentage": 99.57, "elapsed_time": "7:15:14", "remaining_time": "0:01:53"}
{"current_steps": 2305, "total_steps": 2310, "loss": 0.0002, "lr": 8.220349686216545e-10, "epoch": 6.9868421052631575, "percentage": 99.78, "elapsed_time": "7:16:15", "remaining_time": "0:00:56"}
{"current_steps": 2310, "total_steps": 2310, "loss": 0.0001, "lr": 2.2834456763209944e-11, "epoch": 7.0, "percentage": 100.0, "elapsed_time": "7:17:08", "remaining_time": "0:00:00"}
{"current_steps": 2310, "total_steps": 2310, "epoch": 7.0, "percentage": 100.0, "elapsed_time": "7:17:17", "remaining_time": "0:00:00"}
{"current_steps": 2310, "total_steps": 2310, "epoch": 7.0, "percentage": 100.0, "elapsed_time": "0:00:00", "remaining_time": "0:00:00"}
{"current_steps": 2310, "total_steps": 2310, "epoch": 7.0, "percentage": 100.0, "elapsed_time": "0:00:00", "remaining_time": "0:00:00"}
{"current_steps": 2310, "total_steps": 2310, "epoch": 7.0, "percentage": 100.0, "elapsed_time": "0:00:00", "remaining_time": "0:00:00"}
{"current_steps": 2310, "total_steps": 2310, "epoch": 7.0, "percentage": 100.0, "elapsed_time": "0:00:00", "remaining_time": "0:00:00"}
{"current_steps": 2310, "total_steps": 2310, "epoch": 7.0, "percentage": 100.0, "elapsed_time": "0:00:00", "remaining_time": "0:00:00"}

5125
trainer_state.json Normal file

File diff suppressed because it is too large

3
training_args.bin Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:431054a13ed7c1cbe055d1e9e2193544c9c44ce357f2813cb343276edfd962c1
size 8657

BIN
training_loss.png Normal file

Binary file not shown.

Size: 36 KiB

1
vocab.json Normal file

File diff suppressed because one or more lines are too long