Initialize project; model provided by the ModelHub XC community
Model: laion/nemosci-tasrep-a1mfc-dev1-maxeps-swes-r2eg__Qwen3-8B Source: Original Platform
36
.gitattributes
vendored
Normal file
@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
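The patterns in .gitattributes route matching files through Git LFS. As a rough illustration, the sketch below checks a few file names from this commit against a subset of those patterns using Python's `fnmatch`. It only approximates gitattributes matching (it compares base names and skips directory patterns such as `saved_model/**/*`), and the helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Subset of the patterns listed in the .gitattributes of this commit.
LFS_PATTERNS = ["*.safetensors", "*.bin", "*.pt", "*.model", "*tfevents*", "tokenizer.json"]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: match the file's base name against each pattern."""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in LFS_PATTERNS)


print(is_lfs_tracked("model-00001-of-00004.safetensors"))
print(is_lfs_tracked("tokenizer.json"))
print(is_lfs_tracked("config.json"))
```

Under these assumptions the weight shards and `tokenizer.json` are stored as LFS pointers, while small JSON files such as `config.json` are committed as plain text.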
61
README.md
Normal file
@@ -0,0 +1,61 @@
---
library_name: transformers
license: other
base_model: Qwen/Qwen3-8B
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: nemosci-tasrep-a1mfc-dev1-maxeps-swes-r2eg__Qwen3-8B
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# nemosci-tasrep-a1mfc-dev1-maxeps-swes-r2eg__Qwen3-8B

This model is a fine-tuned version of [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on the /e/data1/datasets/playground/ot-baf/hf_hub/datasets--laion--nemotron-terminal-scientific_computing/snapshots/610c7db0b8510b87e3c99b3bd49660bc56821866_thinking_preprocessed, the /e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--exp_tas_repetition_penalty_1.05_traces/snapshots/b4f5500e00651d5ffc7f8701f8a055d9b2b68a0a_thinking_preprocessed, the /e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--a1_multifile_composition/snapshots/a19e5e467f3e83605b4de72bb5b7923e5e55efa9_thinking_preprocessed, the /e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--exp_tas_max_episodes_512_traces/snapshots/236c1dc9aa6d24cf77ce281b5342d93bae685832_thinking_preprocessed, the /e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--dev_set_part1_10k_glm_4.7_traces_jupiter/snapshots/f1871d1c1446b3b43cbfe2737d0df56cecf3f420_thinking_preprocessed, the /e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--swesmith-sandboxes-with_tests-gpt-5-mini-passed_glm_4.7_traces/snapshots/b9b0e0d113e9c37dd035f03644315478acc04487_thinking_preprocessed and the /e/data1/datasets/playground/ot-baf/hf_hub/datasets--penfever--Kimi-2.5-r2egym_sandboxes-maxeps-32k/snapshots/4d777f61eacde52705d17f4ec7388bc01c0d95b6_thinking_preprocessed datasets.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 4e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 32
- gradient_accumulation_steps: 3
- total_train_batch_size: 96
- total_eval_batch_size: 256
- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.98) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5.0

### Training results



### Framework versions

- Transformers 4.57.6
- Pytorch 2.9.1+cu130
- Datasets 4.7.0
- Tokenizers 0.22.2
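The batch-size entries in the README hyperparameter list are related by simple arithmetic. The sketch below recomputes the two totals from the per-device values, assuming the usual convention that the effective batch size is the per-device size times the device count times the gradient accumulation steps.

```python
# Values copied from the README hyperparameter list.
train_batch_size = 1              # per device
eval_batch_size = 8               # per device
num_devices = 32
gradient_accumulation_steps = 3

# Samples contributing to one optimizer step across all devices.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps

# Evaluation does not accumulate gradients, so only the device count multiplies.
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size)  # 96
print(total_eval_batch_size)   # 256
```

Both results agree with the `total_train_batch_size` and `total_eval_batch_size` values reported in the model card.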
28
added_tokens.json
Normal file
@@ -0,0 +1,28 @@
{
  "</think>": 151668,
  "</tool_call>": 151658,
  "</tool_response>": 151666,
  "<think>": 151667,
  "<tool_call>": 151657,
  "<tool_response>": 151665,
  "<|box_end|>": 151649,
  "<|box_start|>": 151648,
  "<|endoftext|>": 151643,
  "<|file_sep|>": 151664,
  "<|fim_middle|>": 151660,
  "<|fim_pad|>": 151662,
  "<|fim_prefix|>": 151659,
  "<|fim_suffix|>": 151661,
  "<|im_end|>": 151645,
  "<|im_start|>": 151644,
  "<|image_pad|>": 151655,
  "<|object_ref_end|>": 151647,
  "<|object_ref_start|>": 151646,
  "<|quad_end|>": 151651,
  "<|quad_start|>": 151650,
  "<|repo_name|>": 151663,
  "<|video_pad|>": 151656,
  "<|vision_end|>": 151653,
  "<|vision_pad|>": 151654,
  "<|vision_start|>": 151652
}
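A quick sanity check on added_tokens.json is to confirm that the special-token IDs form one contiguous block and fall below the `vocab_size` given in config.json. The following sketch does that with the values from this commit; the variable names are illustrative only.

```python
import json

# Contents of added_tokens.json from this commit, condensed onto fewer lines.
ADDED_TOKENS = json.loads("""
{"</think>": 151668, "</tool_call>": 151658, "</tool_response>": 151666,
 "<think>": 151667, "<tool_call>": 151657, "<tool_response>": 151665,
 "<|box_end|>": 151649, "<|box_start|>": 151648, "<|endoftext|>": 151643,
 "<|file_sep|>": 151664, "<|fim_middle|>": 151660, "<|fim_pad|>": 151662,
 "<|fim_prefix|>": 151659, "<|fim_suffix|>": 151661, "<|im_end|>": 151645,
 "<|im_start|>": 151644, "<|image_pad|>": 151655, "<|object_ref_end|>": 151647,
 "<|object_ref_start|>": 151646, "<|quad_end|>": 151651, "<|quad_start|>": 151650,
 "<|repo_name|>": 151663, "<|video_pad|>": 151656, "<|vision_end|>": 151653,
 "<|vision_pad|>": 151654, "<|vision_start|>": 151652}
""")

VOCAB_SIZE = 151936  # "vocab_size" in config.json

ids = sorted(ADDED_TOKENS.values())
is_contiguous = ids == list(range(ids[0], ids[-1] + 1))
fits_vocab = ids[-1] < VOCAB_SIZE

print(len(ids), ids[0], ids[-1], is_contiguous, fits_vocab)
```

The same mapping also shows that `eos_token_id` 151645 in config.json corresponds to `<|im_end|>` and `pad_token_id` 151643 to `<|endoftext|>`.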
16
all_results.json
Normal file
@@ -0,0 +1,16 @@
{
  "achieved_tflops_per_gpu": 9.145547146520583,
  "achieved_tflops_per_gpu_theoretical": 315.0751243425546,
  "epoch": 5.0,
  "loss_nan_ranks": 0,
  "loss_rank_avg": 0.3323420286178589,
  "mfu_percent": 0.6463284202488044,
  "mfu_percent_theoretical": 22.26679323975651,
  "total_flos": 1.1450092629186839e+19,
  "train_loss": 0.14989610923533592,
  "train_runtime": 39124.5476,
  "train_samples_per_second": 8.909,
  "train_steps_per_second": 0.093,
  "valid_targets_mean": 6423.8,
  "valid_targets_min": 192
}
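Several fields in all_results.json can be cross-checked against each other. The sketch below assumes the common definition of model FLOPs utilization (achieved throughput divided by peak throughput) and derives the per-GPU peak implied by each TFLOPS/MFU pair, along with the number of samples per optimizer step. How the trainer itself computes these fields is not stated in the source, so treat this as a consistency check rather than a description of its method.

```python
# Values copied from all_results.json.
achieved_tflops_per_gpu = 9.145547146520583
mfu_percent = 0.6463284202488044
achieved_tflops_per_gpu_theoretical = 315.0751243425546
mfu_percent_theoretical = 22.26679323975651
train_samples_per_second = 8.909
train_steps_per_second = 0.093
train_runtime = 39124.5476  # seconds

# MFU = achieved / peak, so peak = achieved / (MFU / 100).
implied_peak = achieved_tflops_per_gpu / (mfu_percent / 100)
implied_peak_theoretical = (
    achieved_tflops_per_gpu_theoretical / (mfu_percent_theoretical / 100)
)

# Samples per optimizer step; expected to sit near total_train_batch_size (96).
samples_per_step = train_samples_per_second / train_steps_per_second

print(round(implied_peak, 1), round(implied_peak_theoretical, 1))
print(round(samples_per_step, 1))
print(round(train_runtime / 3600, 2), "hours")
```

Both pairs imply the same peak figure, which suggests the two MFU values share one denominator and differ only in how the achieved FLOPs were counted. The samples-per-step ratio lands slightly below 96 because the published rates are rounded.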
89
chat_template.jinja
Normal file
@@ -0,0 +1,89 @@
{%- if tools %}
{{- '<|im_start|>system\n' }}
{%- if messages[0].role == 'system' %}
{{- messages[0].content + '\n\n' }}
{%- endif %}
{{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
{%- for tool in tools %}
{{- "\n" }}
{{- tool | tojson }}
{%- endfor %}
{{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
{%- if messages[0].role == 'system' %}
{{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
{%- set index = (messages|length - 1) - loop.index0 %}
{%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
{%- set ns.multi_step_tool = false %}
{%- set ns.last_query_index = index %}
{%- endif %}
{%- endfor %}
{%- for message in messages %}
{%- if message.content is string %}
{%- set content = message.content %}
{%- else %}
{%- set content = '' %}
{%- endif %}
{%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
{{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
{%- elif message.role == "assistant" %}
{%- set reasoning_content = '' %}
{%- if message.reasoning_content is string %}
{%- set reasoning_content = message.reasoning_content %}
{%- else %}
{%- if '</think>' in content %}
{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
{%- set content = content.split('</think>')[-1].lstrip('\n') %}
{%- endif %}
{%- endif %}
{%- if loop.index0 > ns.last_query_index %}
{%- if loop.last or (not loop.last and reasoning_content) %}
{{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- if message.tool_calls %}
{%- for tool_call in message.tool_calls %}
{%- if (loop.first and content) or (not loop.first) %}
{{- '\n' }}
{%- endif %}
{%- if tool_call.function %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '<tool_call>\n{"name": "' }}
{{- tool_call.name }}
{{- '", "arguments": ' }}
{%- if tool_call.arguments is string %}
{{- tool_call.arguments }}
{%- else %}
{{- tool_call.arguments | tojson }}
{%- endif %}
{{- '}\n</tool_call>' }}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "tool" %}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|im_start|>user' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{{- content }}
{{- '\n</tool_response>' }}
{%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- if enable_thinking is defined and enable_thinking is false %}
{{- '<think>\n\n</think>\n\n' }}
{%- endif %}
{%- endif %}
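When an assistant message carries no separate `reasoning_content`, the template in chat_template.jinja recovers the reasoning by splitting the message text on the `</think>` and `<think>` markers. Because Jinja delegates these calls to Python string methods, the same steps can be reproduced in plain Python. The helper name `split_reasoning` and the sample message below are illustrative, not part of the repository.

```python
def split_reasoning(content: str) -> tuple[str, str]:
    """Mirror the template's string operations for one assistant message."""
    reasoning = ""
    if "</think>" in content:
        # Text before the first closing tag, minus the opening tag and newlines.
        reasoning = (
            content.split("</think>")[0].rstrip("\n").split("<think>")[-1].lstrip("\n")
        )
        # Text after the last closing tag becomes the visible answer.
        content = content.split("</think>")[-1].lstrip("\n")
    return reasoning, content


message = "<think>\nCheck the file first.\n</think>\n\nRunning ls now."
reasoning, answer = split_reasoning(message)
print(repr(reasoning))  # 'Check the file first.'
print(repr(answer))     # 'Running ls now.'
```

Messages without a `</think>` marker pass through unchanged with empty reasoning, which matches the template's default of `reasoning_content = ''`.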
68
config.json
Normal file
@@ -0,0 +1,68 @@
{
  "architectures": [
    "Qwen3ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "dtype": "bfloat16",
  "eos_token_id": 151645,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 12288,
  "layer_types": [
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention"
  ],
  "max_position_embeddings": 40960,
  "max_window_layers": 36,
  "model_type": "qwen3",
  "num_attention_heads": 32,
  "num_hidden_layers": 36,
  "num_key_value_heads": 8,
  "pad_token_id": 151643,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "transformers_version": "4.57.6",
  "use_cache": false,
  "use_sliding_window": false,
  "vocab_size": 151936
}
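The dimensions in config.json are enough to estimate the parameter count and the checkpoint size. The sketch below is a back-of-the-envelope calculation using the tensor names that appear in model.safetensors.index.json (bias-free projections, per-head `q_norm`/`k_norm` weights of length `head_dim`). It also assumes a single final norm weight of length `hidden_size`, which is not visible in the excerpt of the index shown in this commit view.

```python
# Values copied from config.json.
hidden_size = 4096
intermediate_size = 12288
num_hidden_layers = 36
num_attention_heads = 32
num_key_value_heads = 8
head_dim = 128
vocab_size = 151936

q_out = num_attention_heads * head_dim     # q_proj output width
kv_out = num_key_value_heads * head_dim    # k_proj / v_proj output width

# q_proj and o_proj, plus k_proj and v_proj ("attention_bias": false).
attention = 2 * hidden_size * q_out + 2 * hidden_size * kv_out
mlp = 3 * hidden_size * intermediate_size  # gate_proj, up_proj, down_proj
# input/post-attention layer norms plus q_norm/k_norm.
norms = 2 * hidden_size + 2 * head_dim
per_layer = attention + mlp + norms

# embed_tokens and lm_head are separate ("tie_word_embeddings": false).
embeddings = 2 * vocab_size * hidden_size
final_norm = hidden_size  # assumption: one final norm weight

total_params = num_hidden_layers * per_layer + embeddings + final_norm
norm_params = num_hidden_layers * norms + final_norm

print(total_params)      # 8190735360
print(total_params * 2)  # 16381470720 bytes at 2 bytes per bfloat16 weight
print(norm_params)       # 308224
```

The byte count equals the `total_size` recorded in model.safetensors.index.json. The norm weights alone add up to the `total_parameters` value in that file, so that field appears not to count the projection and embedding matrices; the source does not explain why.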
12
generation_config.json
Normal file
@@ -0,0 +1,12 @@
{
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "temperature": 0.6,
  "top_k": 20,
  "top_p": 0.95,
  "transformers_version": "4.57.6"
}
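generation_config.json enables sampling with `temperature`, `top_k`, and `top_p`. To make the interaction of these three settings concrete, here is a simplified pure-Python sketch of temperature scaling followed by top-k and nucleus (top-p) filtering. It is not the Transformers implementation; the function names and the toy logits are hypothetical.

```python
import math
import random


def filtered_distribution(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # numerically stable softmax
    total = sum(weights)
    ranked = sorted(((w / total, i) for i, w in enumerate(weights)), reverse=True)

    ranked = ranked[:top_k]                         # keep the k most likely tokens
    k_total = sum(p for p, _ in ranked)

    kept, cumulative = [], 0.0
    for p, i in ranked:                             # smallest prefix reaching top_p
        kept.append((p, i))
        cumulative += p / k_total
        if cumulative >= top_p:
            break

    norm = sum(p for p, _ in kept)
    return {i: p / norm for p, i in kept}


def sample(logits, rng=random):
    """Draw one token index from the filtered distribution."""
    dist = filtered_distribution(logits)
    tokens, weights = zip(*dist.items())
    return rng.choices(tokens, weights=weights, k=1)[0]


toy_logits = [4.0, 3.0, 1.0, -2.0]
dist = filtered_distribution(toy_logits)
print(dist)
print(sample(toy_logits, random.Random(42)))
```

With the toy logits, a temperature below 1 sharpens the distribution enough that the two most likely tokens already cover more than 95% of the probability mass, so the remaining tokens are never sampled.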
151388
merges.txt
Normal file
File diff suppressed because it is too large
3
model-00001-of-00004.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5c0ca50342d7028d1251c224643225cde61e208022ae4d5b1a70b2a90890595b
size 4902257696
3
model-00002-of-00004.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:57cfb8312d61459784c97eaf43bccd2ac9ba80d3ad8fc8ed06cb4bbb6f06db87
size 4915960368
3
model-00003-of-00004.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0b23e40b13e8f41476f47a3a58859be00ca0250b786a5e8218da50e50ae6623f
size 4983068496
3
model-00004-of-00004.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5f136dd038151b93c2061e281920ed031f68c72643b75285a7e204f02e2fe06e
size 1580230264
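Each shard is committed as a three-line Git LFS pointer (`version`, `oid`, `size`) rather than as the binary itself. The sketch below parses one of the pointers from this commit and sums the four shard sizes; `parse_lfs_pointer` is a hypothetical helper written for this illustration, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer for model-00004-of-00004.safetensors, copied from this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5f136dd038151b93c2061e281920ed031f68c72643b75285a7e204f02e2fe06e
size 1580230264
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algorithm"], info["size"])

# Sizes of the four shards listed in this commit, in bytes.
shard_sizes = [4902257696, 4915960368, 4983068496, 1580230264]
print(sum(shard_sizes))  # 16381516824
```

The shard files total 16381516824 bytes, which is 46104 bytes more than the `total_size` of 16381470720 in model.safetensors.index.json. A plausible reading is that the difference is per-file header overhead, since `total_size` typically counts tensor data only, but the source does not state this.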
407
model.safetensors.index.json
Normal file
@@ -0,0 +1,407 @@
{
  "metadata": {
    "total_parameters": 308224,
    "total_size": 16381470720
  },
  "weight_map": {
    "lm_head.weight": "model-00004-of-00004.safetensors",
    "model.embed_tokens.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
"model.layers.10.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
    "model.layers.2.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.20.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.22.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.22.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.22.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.23.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.3.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.30.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.35.input_layernorm.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.k_norm.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.q_norm.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.4.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.9.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.9.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.9.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.norm.weight": "model-00004-of-00004.safetensors"
|
||||
}
|
||||
}
|
||||
12
run_summary.json
Normal file
@@ -0,0 +1,12 @@
{
  "agent_name": "4d777f61eacde52705d17f4ec7388bc01c0d95b6_thinking_preprocessed",
  "training_start": null,
  "training_end": null,
  "created_by": "DCAgent",
  "base_model_name": "Qwen/Qwen3-8B",
  "dataset_name": "/e/data1/datasets/playground/ot-baf/hf_hub/datasets--laion--nemotron-terminal-scientific_computing/snapshots/610c7db0b8510b87e3c99b3bd49660bc56821866_thinking_preprocessed,/e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--exp_tas_repetition_penalty_1.05_traces/snapshots/b4f5500e00651d5ffc7f8701f8a055d9b2b68a0a_thinking_preprocessed,/e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--a1_multifile_composition/snapshots/a19e5e467f3e83605b4de72bb5b7923e5e55efa9_thinking_preprocessed,/e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--exp_tas_max_episodes_512_traces/snapshots/236c1dc9aa6d24cf77ce281b5342d93bae685832_thinking_preprocessed,/e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--dev_set_part1_10k_glm_4.7_traces_jupiter/snapshots/f1871d1c1446b3b43cbfe2737d0df56cecf3f420_thinking_preprocessed,/e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--swesmith-sandboxes-with_tests-gpt-5-mini-passed_glm_4.7_traces/snapshots/b9b0e0d113e9c37dd035f03644315478acc04487_thinking_preprocessed,/e/data1/datasets/playground/ot-baf/hf_hub/datasets--penfever--Kimi-2.5-r2egym_sandboxes-maxeps-32k/snapshots/4d777f61eacde52705d17f4ec7388bc01c0d95b6_thinking_preprocessed",
  "training_type": "SFT",
  "training_parameters": "https://huggingface.co/mlfoundations-dev/nemosci-tasrep-a1mfc-dev1-maxeps-swes-r2eg__Qwen3-8B/blob/main/config.json",
  "wandb_link": null,
  "traces_location_s3": null
}
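The `dataset_name` field in `run_summary.json` packs seven local Hugging Face cache paths into one comma-separated string. As a rough sketch, assuming the usual `datasets--<org>--<name>` cache directory naming, the repository ids could be recovered as follows. The helper name `repo_ids` is hypothetical and not part of this repository, and the example string is shortened to two of the seven entries.

```python
import re


def repo_ids(dataset_name: str) -> list[str]:
    """Extract '<org>/<name>' ids from comma-separated HF cache paths."""
    ids = []
    for path in dataset_name.split(","):
        # Assumed convention: cache folders are named 'datasets--<org>--<name>'.
        match = re.search(r"datasets--([^/]+?)--([^/]+)/snapshots/", path)
        if match:
            ids.append(f"{match.group(1)}/{match.group(2)}")
    return ids


# Two of the seven paths from run_summary.json, copied verbatim.
example = (
    "/e/data1/datasets/playground/ot-baf/hf_hub/"
    "datasets--laion--nemotron-terminal-scientific_computing/snapshots/"
    "610c7db0b8510b87e3c99b3bd49660bc56821866_thinking_preprocessed,"
    "/e/data1/datasets/playground/ot-baf/hf_hub/"
    "datasets--DCAgent--a1_multifile_composition/snapshots/"
    "a19e5e467f3e83605b4de72bb5b7923e5e55efa9_thinking_preprocessed"
)

print(repo_ids(example))
```

An organisation name that itself contains `--` would be split at the wrong place; none of the paths listed here has that shape.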
31
special_tokens_map.json
Normal file
@@ -0,0 +1,31 @@
{
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>",
    "<|object_ref_start|>",
    "<|object_ref_end|>",
    "<|box_start|>",
    "<|box_end|>",
    "<|quad_start|>",
    "<|quad_end|>",
    "<|vision_start|>",
    "<|vision_end|>",
    "<|vision_pad|>",
    "<|image_pad|>",
    "<|video_pad|>"
  ],
  "eos_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
BIN
tokenizer.json
(Stored with Git LFS)
Normal file
Binary file not shown.
240
tokenizer_config.json
Normal file
@@ -0,0 +1,240 @@
{
  "add_bos_token": false,
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "151643": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151644": {
      "content": "<|im_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151645": {
      "content": "<|im_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151646": {
      "content": "<|object_ref_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151647": {
      "content": "<|object_ref_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151648": {
      "content": "<|box_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151649": {
      "content": "<|box_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151650": {
      "content": "<|quad_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151651": {
      "content": "<|quad_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151652": {
      "content": "<|vision_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151653": {
      "content": "<|vision_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151654": {
      "content": "<|vision_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151655": {
      "content": "<|image_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151656": {
      "content": "<|video_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151657": {
      "content": "<tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151658": {
      "content": "</tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151659": {
      "content": "<|fim_prefix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151660": {
      "content": "<|fim_middle|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151661": {
      "content": "<|fim_suffix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151662": {
      "content": "<|fim_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151663": {
      "content": "<|repo_name|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151664": {
      "content": "<|file_sep|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151665": {
      "content": "<tool_response>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151666": {
      "content": "</tool_response>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151667": {
      "content": "<think>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151668": {
      "content": "</think>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    }
  },
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>",
    "<|object_ref_start|>",
    "<|object_ref_end|>",
    "<|box_start|>",
    "<|box_end|>",
    "<|quad_start|>",
    "<|quad_end|>",
    "<|vision_start|>",
    "<|vision_end|>",
    "<|vision_pad|>",
    "<|image_pad|>",
    "<|video_pad|>"
  ],
  "bos_token": null,
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "errors": "replace",
  "extra_special_tokens": {},
  "model_max_length": 32768,
  "pad_token": "<|endoftext|>",
  "padding_side": "right",
  "split_special_tokens": false,
  "tokenizer_class": "Qwen2Tokenizer",
  "unk_token": null
}
16
train_results.json
Normal file
@@ -0,0 +1,16 @@
{
  "achieved_tflops_per_gpu": 9.145547146520583,
  "achieved_tflops_per_gpu_theoretical": 315.0751243425546,
  "epoch": 5.0,
  "loss_nan_ranks": 0,
  "loss_rank_avg": 0.3323420286178589,
  "mfu_percent": 0.6463284202488044,
  "mfu_percent_theoretical": 22.26679323975651,
  "total_flos": 1.1450092629186839e+19,
  "train_loss": 0.14989610923533592,
  "train_runtime": 39124.5476,
  "train_samples_per_second": 8.909,
  "train_steps_per_second": 0.093,
  "valid_targets_mean": 6423.8,
  "valid_targets_min": 192
}
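The throughput figures in `train_results.json` can be cross-checked against the `total_steps` value reported in `trainer_log.jsonl`. The sketch below is only a consistency check on the logged numbers. It assumes `mfu_percent` is the achieved TFLOPS divided by a per-GPU peak, which the files do not state, so the implied peak and the samples-per-step figure are inferences rather than logged values.

```python
# Values copied from train_results.json; total_steps comes from trainer_log.jsonl.
train_runtime_s = 39124.5476
samples_per_s = 8.909
total_steps = 3635
achieved_tflops = 9.145547146520583
mfu_percent = 0.6463284202488044

# Steps per second implied by runtime and step count
# (compare with the logged train_steps_per_second of 0.093).
steps_per_s = total_steps / train_runtime_s

# Average samples consumed per optimizer step, inferred from throughput.
samples_per_step = samples_per_s * train_runtime_s / total_steps

# Per-GPU peak implied by the MFU figure (assumes MFU = achieved / peak).
implied_peak_tflops = achieved_tflops / (mfu_percent / 100)

print(f"steps/s          ~ {steps_per_s:.3f}")
print(f"samples per step ~ {samples_per_step:.1f}")
print(f"implied peak     ~ {implied_peak_tflops:.0f} TFLOPS per GPU")
```

The "theoretical" pair in the same file (315.0751243425546 TFLOPS at 22.26679323975651 percent) gives the same implied peak, so the two MFU figures appear to share one denominator.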
771
trainer_log.jsonl
Normal file
@@ -0,0 +1,771 @@
{"current_steps": 5, "total_steps": 3635, "loss": 0.9672, "lr": 4.395604395604396e-07, "epoch": 0.006883891693437357, "percentage": 0.14, "elapsed_time": "0:01:42", "remaining_time": "20:42:12"}
{"current_steps": 10, "total_steps": 3635, "loss": 0.9668, "lr": 9.890109890109891e-07, "epoch": 0.013767783386874713, "percentage": 0.28, "elapsed_time": "0:03:17", "remaining_time": "19:51:37"}
{"current_steps": 15, "total_steps": 3635, "loss": 0.9211, "lr": 1.5384615384615387e-06, "epoch": 0.02065167508031207, "percentage": 0.41, "elapsed_time": "0:04:48", "remaining_time": "19:21:58"}
{"current_steps": 20, "total_steps": 3635, "loss": 0.8559, "lr": 2.0879120879120883e-06, "epoch": 0.027535566773749427, "percentage": 0.55, "elapsed_time": "0:06:17", "remaining_time": "18:58:02"}
{"current_steps": 25, "total_steps": 3635, "loss": 0.7873, "lr": 2.6373626373626375e-06, "epoch": 0.03441945846718678, "percentage": 0.69, "elapsed_time": "0:07:48", "remaining_time": "18:47:28"}
{"current_steps": 30, "total_steps": 3635, "loss": 0.7615, "lr": 3.1868131868131867e-06, "epoch": 0.04130335016062414, "percentage": 0.83, "elapsed_time": "0:09:20", "remaining_time": "18:42:51"}
{"current_steps": 35, "total_steps": 3635, "loss": 0.7274, "lr": 3.7362637362637367e-06, "epoch": 0.048187241854061494, "percentage": 0.96, "elapsed_time": "0:10:48", "remaining_time": "18:30:53"}
{"current_steps": 40, "total_steps": 3635, "loss": 0.6803, "lr": 4.2857142857142855e-06, "epoch": 0.05507113354749885, "percentage": 1.1, "elapsed_time": "0:12:24", "remaining_time": "18:35:21"}
{"current_steps": 45, "total_steps": 3635, "loss": 0.6548, "lr": 4.8351648351648355e-06, "epoch": 0.06195502524093621, "percentage": 1.24, "elapsed_time": "0:13:56", "remaining_time": "18:31:58"}
{"current_steps": 50, "total_steps": 3635, "loss": 0.6199, "lr": 5.384615384615385e-06, "epoch": 0.06883891693437356, "percentage": 1.38, "elapsed_time": "0:15:25", "remaining_time": "18:25:32"}
{"current_steps": 55, "total_steps": 3635, "loss": 0.5721, "lr": 5.934065934065935e-06, "epoch": 0.07572280862781092, "percentage": 1.51, "elapsed_time": "0:17:11", "remaining_time": "18:38:45"}
{"current_steps": 60, "total_steps": 3635, "loss": 0.5352, "lr": 6.483516483516485e-06, "epoch": 0.08260670032124828, "percentage": 1.65, "elapsed_time": "0:18:51", "remaining_time": "18:43:32"}
{"current_steps": 65, "total_steps": 3635, "loss": 0.509, "lr": 7.032967032967034e-06, "epoch": 0.08949059201468564, "percentage": 1.79, "elapsed_time": "0:20:46", "remaining_time": "19:00:55"}
{"current_steps": 70, "total_steps": 3635, "loss": 0.4905, "lr": 7.582417582417583e-06, "epoch": 0.09637448370812299, "percentage": 1.93, "elapsed_time": "0:22:25", "remaining_time": "19:01:45"}
{"current_steps": 75, "total_steps": 3635, "loss": 0.4843, "lr": 8.131868131868132e-06, "epoch": 0.10325837540156035, "percentage": 2.06, "elapsed_time": "0:24:06", "remaining_time": "19:03:58"}
{"current_steps": 80, "total_steps": 3635, "loss": 0.4647, "lr": 8.681318681318681e-06, "epoch": 0.1101422670949977, "percentage": 2.2, "elapsed_time": "0:25:50", "remaining_time": "19:08:08"}
{"current_steps": 85, "total_steps": 3635, "loss": 0.4668, "lr": 9.230769230769232e-06, "epoch": 0.11702615878843506, "percentage": 2.34, "elapsed_time": "0:27:36", "remaining_time": "19:13:00"}
{"current_steps": 90, "total_steps": 3635, "loss": 0.4513, "lr": 9.780219780219781e-06, "epoch": 0.12391005048187242, "percentage": 2.48, "elapsed_time": "0:29:19", "remaining_time": "19:15:07"}
{"current_steps": 95, "total_steps": 3635, "loss": 0.444, "lr": 1.0329670329670332e-05, "epoch": 0.13079394217530976, "percentage": 2.61, "elapsed_time": "0:31:06", "remaining_time": "19:19:10"}
{"current_steps": 100, "total_steps": 3635, "loss": 0.4416, "lr": 1.087912087912088e-05, "epoch": 0.13767783386874713, "percentage": 2.75, "elapsed_time": "0:32:53", "remaining_time": "19:22:29"}
{"current_steps": 105, "total_steps": 3635, "loss": 0.4334, "lr": 1.1428571428571429e-05, "epoch": 0.1445617255621845, "percentage": 2.89, "elapsed_time": "0:34:39", "remaining_time": "19:25:14"}
{"current_steps": 110, "total_steps": 3635, "loss": 0.424, "lr": 1.1978021978021978e-05, "epoch": 0.15144561725562183, "percentage": 3.03, "elapsed_time": "0:36:56", "remaining_time": "19:43:53"}
{"current_steps": 115, "total_steps": 3635, "loss": 0.4219, "lr": 1.2527472527472529e-05, "epoch": 0.1583295089490592, "percentage": 3.16, "elapsed_time": "0:39:14", "remaining_time": "20:01:19"}
{"current_steps": 120, "total_steps": 3635, "loss": 0.422, "lr": 1.3076923076923078e-05, "epoch": 0.16521340064249657, "percentage": 3.3, "elapsed_time": "0:41:25", "remaining_time": "20:13:27"}
{"current_steps": 125, "total_steps": 3635, "loss": 0.4272, "lr": 1.3626373626373627e-05, "epoch": 0.1720972923359339, "percentage": 3.44, "elapsed_time": "0:43:36", "remaining_time": "20:24:21"}
{"current_steps": 130, "total_steps": 3635, "loss": 0.4122, "lr": 1.4175824175824178e-05, "epoch": 0.17898118402937127, "percentage": 3.58, "elapsed_time": "0:45:50", "remaining_time": "20:35:54"}
{"current_steps": 135, "total_steps": 3635, "loss": 0.4021, "lr": 1.4725274725274727e-05, "epoch": 0.18586507572280864, "percentage": 3.71, "elapsed_time": "0:48:02", "remaining_time": "20:45:38"}
{"current_steps": 140, "total_steps": 3635, "loss": 0.4044, "lr": 1.5274725274725277e-05, "epoch": 0.19274896741624598, "percentage": 3.85, "elapsed_time": "0:50:19", "remaining_time": "20:56:18"}
{"current_steps": 145, "total_steps": 3635, "loss": 0.4077, "lr": 1.5824175824175826e-05, "epoch": 0.19963285910968334, "percentage": 3.99, "elapsed_time": "0:52:35", "remaining_time": "21:05:51"}
{"current_steps": 150, "total_steps": 3635, "loss": 0.4011, "lr": 1.6373626373626375e-05, "epoch": 0.2065167508031207, "percentage": 4.13, "elapsed_time": "0:54:52", "remaining_time": "21:14:44"}
{"current_steps": 155, "total_steps": 3635, "loss": 0.4017, "lr": 1.6923076923076924e-05, "epoch": 0.21340064249655805, "percentage": 4.26, "elapsed_time": "0:57:10", "remaining_time": "21:23:45"}
{"current_steps": 160, "total_steps": 3635, "loss": 0.3994, "lr": 1.7472527472527473e-05, "epoch": 0.2202845341899954, "percentage": 4.4, "elapsed_time": "0:59:22", "remaining_time": "21:29:38"}
{"current_steps": 165, "total_steps": 3635, "loss": 0.3972, "lr": 1.8021978021978023e-05, "epoch": 0.22716842588343278, "percentage": 4.54, "elapsed_time": "1:01:36", "remaining_time": "21:35:46"}
{"current_steps": 170, "total_steps": 3635, "loss": 0.3939, "lr": 1.8571428571428575e-05, "epoch": 0.23405231757687012, "percentage": 4.68, "elapsed_time": "1:03:50", "remaining_time": "21:41:13"}
{"current_steps": 175, "total_steps": 3635, "loss": 0.3896, "lr": 1.9120879120879124e-05, "epoch": 0.24093620927030748, "percentage": 4.81, "elapsed_time": "1:06:00", "remaining_time": "21:44:58"}
{"current_steps": 180, "total_steps": 3635, "loss": 0.3882, "lr": 1.967032967032967e-05, "epoch": 0.24782010096374485, "percentage": 4.95, "elapsed_time": "1:08:14", "remaining_time": "21:50:01"}
{"current_steps": 185, "total_steps": 3635, "loss": 0.3827, "lr": 2.021978021978022e-05, "epoch": 0.2547039926571822, "percentage": 5.09, "elapsed_time": "1:10:26", "remaining_time": "21:53:45"}
{"current_steps": 190, "total_steps": 3635, "loss": 0.3852, "lr": 2.0769230769230772e-05, "epoch": 0.2615878843506195, "percentage": 5.23, "elapsed_time": "1:12:40", "remaining_time": "21:57:49"}
{"current_steps": 195, "total_steps": 3635, "loss": 0.4062, "lr": 2.131868131868132e-05, "epoch": 0.2684717760440569, "percentage": 5.36, "elapsed_time": "1:14:23", "remaining_time": "21:52:26"}
|
||||
{"current_steps": 200, "total_steps": 3635, "loss": 0.4219, "lr": 2.1868131868131867e-05, "epoch": 0.27535566773749426, "percentage": 5.5, "elapsed_time": "1:15:30", "remaining_time": "21:36:56"}
|
||||
{"current_steps": 205, "total_steps": 3635, "loss": 0.6701, "lr": 2.241758241758242e-05, "epoch": 0.2822395594309316, "percentage": 5.64, "elapsed_time": "1:16:54", "remaining_time": "21:26:53"}
|
||||
{"current_steps": 210, "total_steps": 3635, "loss": 0.9022, "lr": 2.296703296703297e-05, "epoch": 0.289123451124369, "percentage": 5.78, "elapsed_time": "1:18:48", "remaining_time": "21:25:17"}
|
||||
{"current_steps": 215, "total_steps": 3635, "loss": 0.7856, "lr": 2.3516483516483518e-05, "epoch": 0.29600734281780633, "percentage": 5.91, "elapsed_time": "1:20:39", "remaining_time": "21:22:59"}
|
||||
{"current_steps": 220, "total_steps": 3635, "loss": 0.8366, "lr": 2.4065934065934067e-05, "epoch": 0.30289123451124367, "percentage": 6.05, "elapsed_time": "1:22:20", "remaining_time": "21:18:15"}
|
||||
{"current_steps": 225, "total_steps": 3635, "loss": 0.7601, "lr": 2.461538461538462e-05, "epoch": 0.30977512620468106, "percentage": 6.19, "elapsed_time": "1:24:13", "remaining_time": "21:16:22"}
|
||||
{"current_steps": 230, "total_steps": 3635, "loss": 0.7176, "lr": 2.5164835164835165e-05, "epoch": 0.3166590178981184, "percentage": 6.33, "elapsed_time": "1:26:02", "remaining_time": "21:13:48"}
|
||||
{"current_steps": 235, "total_steps": 3635, "loss": 0.7248, "lr": 2.5714285714285718e-05, "epoch": 0.32354290959155574, "percentage": 6.46, "elapsed_time": "1:27:50", "remaining_time": "21:10:48"}
|
||||
{"current_steps": 240, "total_steps": 3635, "loss": 0.6923, "lr": 2.6263736263736267e-05, "epoch": 0.33042680128499313, "percentage": 6.6, "elapsed_time": "1:29:35", "remaining_time": "21:07:20"}
|
||||
{"current_steps": 245, "total_steps": 3635, "loss": 0.7099, "lr": 2.6813186813186813e-05, "epoch": 0.33731069297843047, "percentage": 6.74, "elapsed_time": "1:31:14", "remaining_time": "21:02:32"}
|
||||
{"current_steps": 250, "total_steps": 3635, "loss": 0.6842, "lr": 2.7362637362637365e-05, "epoch": 0.3441945846718678, "percentage": 6.88, "elapsed_time": "1:33:05", "remaining_time": "21:00:33"}
|
||||
{"current_steps": 255, "total_steps": 3635, "loss": 0.6671, "lr": 2.7912087912087915e-05, "epoch": 0.3510784763653052, "percentage": 7.02, "elapsed_time": "1:34:45", "remaining_time": "20:56:07"}
|
||||
{"current_steps": 260, "total_steps": 3635, "loss": 0.6768, "lr": 2.8461538461538464e-05, "epoch": 0.35796236805874254, "percentage": 7.15, "elapsed_time": "1:36:35", "remaining_time": "20:53:46"}
|
||||
{"current_steps": 265, "total_steps": 3635, "loss": 0.639, "lr": 2.9010989010989013e-05, "epoch": 0.3648462597521799, "percentage": 7.29, "elapsed_time": "1:38:19", "remaining_time": "20:50:20"}
|
||||
{"current_steps": 270, "total_steps": 3635, "loss": 0.6749, "lr": 2.9560439560439565e-05, "epoch": 0.3717301514456173, "percentage": 7.43, "elapsed_time": "1:40:07", "remaining_time": "20:47:54"}
|
||||
{"current_steps": 275, "total_steps": 3635, "loss": 0.6522, "lr": 3.010989010989011e-05, "epoch": 0.3786140431390546, "percentage": 7.57, "elapsed_time": "1:41:57", "remaining_time": "20:45:45"}
|
||||
{"current_steps": 280, "total_steps": 3635, "loss": 0.6879, "lr": 3.065934065934067e-05, "epoch": 0.38549793483249195, "percentage": 7.7, "elapsed_time": "1:43:39", "remaining_time": "20:42:05"}
|
||||
{"current_steps": 285, "total_steps": 3635, "loss": 0.6619, "lr": 3.120879120879121e-05, "epoch": 0.39238182652592934, "percentage": 7.84, "elapsed_time": "1:45:32", "remaining_time": "20:40:28"}
|
||||
{"current_steps": 290, "total_steps": 3635, "loss": 0.6366, "lr": 3.175824175824176e-05, "epoch": 0.3992657182193667, "percentage": 7.98, "elapsed_time": "1:47:15", "remaining_time": "20:37:10"}
|
||||
{"current_steps": 295, "total_steps": 3635, "loss": 0.6609, "lr": 3.230769230769231e-05, "epoch": 0.406149609912804, "percentage": 8.12, "elapsed_time": "1:49:03", "remaining_time": "20:34:44"}
|
||||
{"current_steps": 300, "total_steps": 3635, "loss": 0.6472, "lr": 3.285714285714286e-05, "epoch": 0.4130335016062414, "percentage": 8.25, "elapsed_time": "1:50:48", "remaining_time": "20:31:54"}
|
||||
{"current_steps": 305, "total_steps": 3635, "loss": 0.4525, "lr": 3.340659340659341e-05, "epoch": 0.41991739329967875, "percentage": 8.39, "elapsed_time": "1:52:21", "remaining_time": "20:26:42"}
|
||||
{"current_steps": 310, "total_steps": 3635, "loss": 0.2494, "lr": 3.3956043956043956e-05, "epoch": 0.4268012849931161, "percentage": 8.53, "elapsed_time": "1:53:30", "remaining_time": "20:17:31"}
|
||||
{"current_steps": 315, "total_steps": 3635, "loss": 0.2298, "lr": 3.450549450549451e-05, "epoch": 0.4336851766865535, "percentage": 8.67, "elapsed_time": "1:54:39", "remaining_time": "20:08:30"}
|
||||
{"current_steps": 320, "total_steps": 3635, "loss": 0.222, "lr": 3.505494505494506e-05, "epoch": 0.4405690683799908, "percentage": 8.8, "elapsed_time": "1:55:51", "remaining_time": "20:00:15"}
|
||||
{"current_steps": 325, "total_steps": 3635, "loss": 0.2158, "lr": 3.56043956043956e-05, "epoch": 0.44745296007342816, "percentage": 8.94, "elapsed_time": "1:57:04", "remaining_time": "19:52:22"}
|
||||
{"current_steps": 330, "total_steps": 3635, "loss": 0.2122, "lr": 3.615384615384616e-05, "epoch": 0.45433685176686556, "percentage": 9.08, "elapsed_time": "1:58:16", "remaining_time": "19:44:35"}
|
||||
{"current_steps": 335, "total_steps": 3635, "loss": 0.2063, "lr": 3.67032967032967e-05, "epoch": 0.4612207434603029, "percentage": 9.22, "elapsed_time": "1:59:28", "remaining_time": "19:36:58"}
|
||||
{"current_steps": 340, "total_steps": 3635, "loss": 0.2062, "lr": 3.725274725274726e-05, "epoch": 0.46810463515374023, "percentage": 9.35, "elapsed_time": "2:00:47", "remaining_time": "19:30:36"}
|
||||
{"current_steps": 345, "total_steps": 3635, "loss": 0.2075, "lr": 3.7802197802197807e-05, "epoch": 0.47498852684717763, "percentage": 9.49, "elapsed_time": "2:02:09", "remaining_time": "19:24:52"}
|
||||
{"current_steps": 350, "total_steps": 3635, "loss": 0.1988, "lr": 3.8351648351648356e-05, "epoch": 0.48187241854061497, "percentage": 9.63, "elapsed_time": "2:03:20", "remaining_time": "19:17:37"}
|
||||
{"current_steps": 355, "total_steps": 3635, "loss": 0.1979, "lr": 3.8901098901098905e-05, "epoch": 0.4887563102340523, "percentage": 9.77, "elapsed_time": "2:04:45", "remaining_time": "19:12:38"}
|
||||
{"current_steps": 360, "total_steps": 3635, "loss": 0.1993, "lr": 3.9450549450549454e-05, "epoch": 0.4956402019274897, "percentage": 9.9, "elapsed_time": "2:06:06", "remaining_time": "19:07:14"}
|
||||
{"current_steps": 365, "total_steps": 3635, "loss": 0.194, "lr": 4e-05, "epoch": 0.502524093620927, "percentage": 10.04, "elapsed_time": "2:07:20", "remaining_time": "19:00:50"}
|
||||
{"current_steps": 370, "total_steps": 3635, "loss": 0.1944, "lr": 3.999976939016174e-05, "epoch": 0.5094079853143644, "percentage": 10.18, "elapsed_time": "2:08:40", "remaining_time": "18:55:28"}
|
||||
{"current_steps": 375, "total_steps": 3635, "loss": 0.1946, "lr": 3.999907756596503e-05, "epoch": 0.5162918770078018, "percentage": 10.32, "elapsed_time": "2:10:02", "remaining_time": "18:50:30"}
|
||||
{"current_steps": 380, "total_steps": 3635, "loss": 0.195, "lr": 3.999792454336403e-05, "epoch": 0.523175768701239, "percentage": 10.45, "elapsed_time": "2:11:13", "remaining_time": "18:44:03"}
|
||||
{"current_steps": 385, "total_steps": 3635, "loss": 0.1933, "lr": 3.999631034894857e-05, "epoch": 0.5300596603946764, "percentage": 10.59, "elapsed_time": "2:12:24", "remaining_time": "18:37:46"}
|
||||
{"current_steps": 390, "total_steps": 3635, "loss": 0.1952, "lr": 3.999423501994356e-05, "epoch": 0.5369435520881138, "percentage": 10.73, "elapsed_time": "2:13:39", "remaining_time": "18:32:07"}
|
||||
{"current_steps": 395, "total_steps": 3635, "loss": 0.1936, "lr": 3.999169860420813e-05, "epoch": 0.5438274437815511, "percentage": 10.87, "elapsed_time": "2:14:51", "remaining_time": "18:26:07"}
|
||||
{"current_steps": 400, "total_steps": 3635, "loss": 0.1885, "lr": 3.9988701160234525e-05, "epoch": 0.5507113354749885, "percentage": 11.0, "elapsed_time": "2:15:58", "remaining_time": "18:19:41"}
|
||||
{"current_steps": 405, "total_steps": 3635, "loss": 0.1864, "lr": 3.998524275714675e-05, "epoch": 0.5575952271684259, "percentage": 11.14, "elapsed_time": "2:17:07", "remaining_time": "18:13:36"}
|
||||
{"current_steps": 410, "total_steps": 3635, "loss": 0.5104, "lr": 3.998132347469898e-05, "epoch": 0.5644791188618632, "percentage": 11.28, "elapsed_time": "2:18:49", "remaining_time": "18:11:56"}
|
||||
{"current_steps": 415, "total_steps": 3635, "loss": 0.5536, "lr": 3.997694340327373e-05, "epoch": 0.5713630105553006, "percentage": 11.42, "elapsed_time": "2:20:37", "remaining_time": "18:11:03"}
|
||||
{"current_steps": 420, "total_steps": 3635, "loss": 0.5528, "lr": 3.9972102643879754e-05, "epoch": 0.578246902248738, "percentage": 11.55, "elapsed_time": "2:22:17", "remaining_time": "18:09:10"}
|
||||
{"current_steps": 425, "total_steps": 3635, "loss": 0.5426, "lr": 3.996680130814972e-05, "epoch": 0.5851307939421753, "percentage": 11.69, "elapsed_time": "2:24:05", "remaining_time": "18:08:22"}
|
||||
{"current_steps": 430, "total_steps": 3635, "loss": 0.5358, "lr": 3.996103951833766e-05, "epoch": 0.5920146856356127, "percentage": 11.83, "elapsed_time": "2:25:56", "remaining_time": "18:07:49"}
|
||||
{"current_steps": 435, "total_steps": 3635, "loss": 0.5545, "lr": 3.995481740731609e-05, "epoch": 0.59889857732905, "percentage": 11.97, "elapsed_time": "2:27:36", "remaining_time": "18:05:49"}
|
||||
{"current_steps": 440, "total_steps": 3635, "loss": 0.528, "lr": 3.994813511857304e-05, "epoch": 0.6057824690224873, "percentage": 12.1, "elapsed_time": "2:29:23", "remaining_time": "18:04:48"}
|
||||
{"current_steps": 445, "total_steps": 3635, "loss": 0.4993, "lr": 3.994099280620865e-05, "epoch": 0.6126663607159247, "percentage": 12.24, "elapsed_time": "2:31:17", "remaining_time": "18:04:31"}
|
||||
{"current_steps": 450, "total_steps": 3635, "loss": 0.5597, "lr": 3.9933390634931674e-05, "epoch": 0.6195502524093621, "percentage": 12.38, "elapsed_time": "2:33:00", "remaining_time": "18:02:59"}
|
||||
{"current_steps": 455, "total_steps": 3635, "loss": 0.5185, "lr": 3.992532878005565e-05, "epoch": 0.6264341441027994, "percentage": 12.52, "elapsed_time": "2:34:46", "remaining_time": "18:01:45"}
|
||||
{"current_steps": 460, "total_steps": 3635, "loss": 0.541, "lr": 3.9916807427494895e-05, "epoch": 0.6333180357962368, "percentage": 12.65, "elapsed_time": "2:36:31", "remaining_time": "18:00:20"}
|
||||
{"current_steps": 465, "total_steps": 3635, "loss": 0.5126, "lr": 3.9907826773760175e-05, "epoch": 0.6402019274896742, "percentage": 12.79, "elapsed_time": "2:38:23", "remaining_time": "17:59:44"}
|
||||
{"current_steps": 470, "total_steps": 3635, "loss": 0.5091, "lr": 3.98983870259542e-05, "epoch": 0.6470858191831115, "percentage": 12.93, "elapsed_time": "2:40:15", "remaining_time": "17:59:10"}
|
||||
{"current_steps": 475, "total_steps": 3635, "loss": 0.5552, "lr": 3.988848840176685e-05, "epoch": 0.6539697108765489, "percentage": 13.07, "elapsed_time": "2:41:54", "remaining_time": "17:57:09"}
|
||||
{"current_steps": 480, "total_steps": 3635, "loss": 0.5305, "lr": 3.987813112947013e-05, "epoch": 0.6608536025699863, "percentage": 13.2, "elapsed_time": "2:43:46", "remaining_time": "17:56:29"}
|
||||
{"current_steps": 485, "total_steps": 3635, "loss": 0.5285, "lr": 3.986731544791293e-05, "epoch": 0.6677374942634235, "percentage": 13.34, "elapsed_time": "2:45:30", "remaining_time": "17:54:53"}
|
||||
{"current_steps": 490, "total_steps": 3635, "loss": 0.5146, "lr": 3.9856041606515514e-05, "epoch": 0.6746213859568609, "percentage": 13.48, "elapsed_time": "2:47:12", "remaining_time": "17:53:14"}
|
||||
{"current_steps": 495, "total_steps": 3635, "loss": 0.5226, "lr": 3.984430986526375e-05, "epoch": 0.6815052776502983, "percentage": 13.62, "elapsed_time": "2:48:53", "remaining_time": "17:51:21"}
|
||||
{"current_steps": 500, "total_steps": 3635, "loss": 0.5065, "lr": 3.983212049470313e-05, "epoch": 0.6883891693437356, "percentage": 13.76, "elapsed_time": "2:50:38", "remaining_time": "17:49:53"}
|
||||
{"current_steps": 505, "total_steps": 3635, "loss": 0.5326, "lr": 3.981947377593254e-05, "epoch": 0.695273061037173, "percentage": 13.89, "elapsed_time": "2:52:27", "remaining_time": "17:48:54"}
|
||||
{"current_steps": 510, "total_steps": 3635, "loss": 0.5083, "lr": 3.9806370000597745e-05, "epoch": 0.7021569527306104, "percentage": 14.03, "elapsed_time": "2:54:12", "remaining_time": "17:47:27"}
|
||||
{"current_steps": 515, "total_steps": 3635, "loss": 0.4415, "lr": 3.9792809470884705e-05, "epoch": 0.7090408444240477, "percentage": 14.17, "elapsed_time": "2:56:28", "remaining_time": "17:49:06"}
|
||||
{"current_steps": 520, "total_steps": 3635, "loss": 0.3555, "lr": 3.977879249951258e-05, "epoch": 0.7159247361174851, "percentage": 14.31, "elapsed_time": "2:58:56", "remaining_time": "17:51:57"}
|
||||
{"current_steps": 525, "total_steps": 3635, "loss": 0.3654, "lr": 3.976431940972651e-05, "epoch": 0.7228086278109225, "percentage": 14.44, "elapsed_time": "3:01:24", "remaining_time": "17:54:38"}
|
||||
{"current_steps": 530, "total_steps": 3635, "loss": 0.352, "lr": 3.974939053529019e-05, "epoch": 0.7296925195043598, "percentage": 14.58, "elapsed_time": "3:03:53", "remaining_time": "17:57:16"}
|
||||
{"current_steps": 535, "total_steps": 3635, "loss": 0.3474, "lr": 3.9734006220478156e-05, "epoch": 0.7365764111977972, "percentage": 14.72, "elapsed_time": "3:06:19", "remaining_time": "17:59:38"}
|
||||
{"current_steps": 540, "total_steps": 3635, "loss": 0.3444, "lr": 3.971816682006783e-05, "epoch": 0.7434603028912345, "percentage": 14.86, "elapsed_time": "3:08:40", "remaining_time": "18:01:24"}
|
||||
{"current_steps": 545, "total_steps": 3635, "loss": 0.3257, "lr": 3.9701872699331384e-05, "epoch": 0.7503441945846718, "percentage": 14.99, "elapsed_time": "3:11:10", "remaining_time": "18:03:54"}
|
||||
{"current_steps": 550, "total_steps": 3635, "loss": 0.3309, "lr": 3.9685124234027264e-05, "epoch": 0.7572280862781092, "percentage": 15.13, "elapsed_time": "3:13:35", "remaining_time": "18:05:51"}
|
||||
{"current_steps": 555, "total_steps": 3635, "loss": 0.3208, "lr": 3.966792181039156e-05, "epoch": 0.7641119779715466, "percentage": 15.27, "elapsed_time": "3:16:04", "remaining_time": "18:08:06"}
|
||||
{"current_steps": 560, "total_steps": 3635, "loss": 0.3374, "lr": 3.965026582512908e-05, "epoch": 0.7709958696649839, "percentage": 15.41, "elapsed_time": "3:18:25", "remaining_time": "18:09:33"}
|
||||
{"current_steps": 565, "total_steps": 3635, "loss": 0.3186, "lr": 3.963215668540423e-05, "epoch": 0.7778797613584213, "percentage": 15.54, "elapsed_time": "3:20:55", "remaining_time": "18:11:46"}
|
||||
{"current_steps": 570, "total_steps": 3635, "loss": 0.324, "lr": 3.9613594808831565e-05, "epoch": 0.7847636530518587, "percentage": 15.68, "elapsed_time": "3:23:24", "remaining_time": "18:13:46"}
|
||||
{"current_steps": 575, "total_steps": 3635, "loss": 0.3403, "lr": 3.959458062346624e-05, "epoch": 0.791647544745296, "percentage": 15.82, "elapsed_time": "3:25:44", "remaining_time": "18:14:56"}
|
||||
{"current_steps": 580, "total_steps": 3635, "loss": 0.2958, "lr": 3.957511456779407e-05, "epoch": 0.7985314364387334, "percentage": 15.96, "elapsed_time": "3:28:12", "remaining_time": "18:16:41"}
|
||||
{"current_steps": 585, "total_steps": 3635, "loss": 0.3282, "lr": 3.9555197090721446e-05, "epoch": 0.8054153281321708, "percentage": 16.09, "elapsed_time": "3:30:40", "remaining_time": "18:18:25"}
|
||||
{"current_steps": 590, "total_steps": 3635, "loss": 0.3035, "lr": 3.9534828651564987e-05, "epoch": 0.812299219825608, "percentage": 16.23, "elapsed_time": "3:33:09", "remaining_time": "18:20:06"}
|
||||
{"current_steps": 595, "total_steps": 3635, "loss": 0.2981, "lr": 3.951400972004094e-05, "epoch": 0.8191831115190454, "percentage": 16.37, "elapsed_time": "3:35:28", "remaining_time": "18:20:56"}
|
||||
{"current_steps": 600, "total_steps": 3635, "loss": 0.3133, "lr": 3.949274077625435e-05, "epoch": 0.8260670032124828, "percentage": 16.51, "elapsed_time": "3:37:51", "remaining_time": "18:22:02"}
|
||||
{"current_steps": 605, "total_steps": 3635, "loss": 0.3015, "lr": 3.947102231068798e-05, "epoch": 0.8329508949059201, "percentage": 16.64, "elapsed_time": "3:40:28", "remaining_time": "18:24:13"}
|
||||
{"current_steps": 610, "total_steps": 3635, "loss": 0.3186, "lr": 3.944885482419102e-05, "epoch": 0.8398347865993575, "percentage": 16.78, "elapsed_time": "3:42:14", "remaining_time": "18:22:05"}
|
||||
{"current_steps": 615, "total_steps": 3635, "loss": 0.2921, "lr": 3.942623882796751e-05, "epoch": 0.8467186782927949, "percentage": 16.92, "elapsed_time": "3:43:33", "remaining_time": "18:17:47"}
|
||||
{"current_steps": 620, "total_steps": 3635, "loss": 0.2817, "lr": 3.940317484356459e-05, "epoch": 0.8536025699862322, "percentage": 17.06, "elapsed_time": "3:44:51", "remaining_time": "18:13:25"}
|
||||
{"current_steps": 625, "total_steps": 3635, "loss": 0.2724, "lr": 3.93796634028604e-05, "epoch": 0.8604864616796696, "percentage": 17.19, "elapsed_time": "3:46:12", "remaining_time": "18:09:26"}
|
||||
{"current_steps": 630, "total_steps": 3635, "loss": 0.2718, "lr": 3.935570504805192e-05, "epoch": 0.867370353373107, "percentage": 17.33, "elapsed_time": "3:47:30", "remaining_time": "18:05:12"}
|
||||
{"current_steps": 635, "total_steps": 3635, "loss": 0.2657, "lr": 3.933130033164238e-05, "epoch": 0.8742542450665443, "percentage": 17.47, "elapsed_time": "3:48:49", "remaining_time": "18:01:02"}
|
||||
{"current_steps": 640, "total_steps": 3635, "loss": 0.2587, "lr": 3.930644981642854e-05, "epoch": 0.8811381367599816, "percentage": 17.61, "elapsed_time": "3:50:04", "remaining_time": "17:56:42"}
|
||||
{"current_steps": 645, "total_steps": 3635, "loss": 0.2603, "lr": 3.928115407548774e-05, "epoch": 0.888022028453419, "percentage": 17.74, "elapsed_time": "3:51:25", "remaining_time": "17:52:46"}
|
||||
{"current_steps": 650, "total_steps": 3635, "loss": 0.2554, "lr": 3.925541369216465e-05, "epoch": 0.8949059201468563, "percentage": 17.88, "elapsed_time": "3:52:47", "remaining_time": "17:49:02"}
|
||||
{"current_steps": 655, "total_steps": 3635, "loss": 0.2538, "lr": 3.9229229260057824e-05, "epoch": 0.9017898118402937, "percentage": 18.02, "elapsed_time": "3:54:06", "remaining_time": "17:45:03"}
|
||||
{"current_steps": 660, "total_steps": 3635, "loss": 0.2513, "lr": 3.920260138300604e-05, "epoch": 0.9086737035337311, "percentage": 18.16, "elapsed_time": "3:55:28", "remaining_time": "17:41:23"}
|
||||
{"current_steps": 665, "total_steps": 3635, "loss": 0.2482, "lr": 3.917553067507433e-05, "epoch": 0.9155575952271684, "percentage": 18.29, "elapsed_time": "3:56:49", "remaining_time": "17:37:39"}
|
||||
{"current_steps": 670, "total_steps": 3635, "loss": 0.2474, "lr": 3.914801776053986e-05, "epoch": 0.9224414869206058, "percentage": 18.43, "elapsed_time": "3:58:10", "remaining_time": "17:33:58"}
|
||||
{"current_steps": 675, "total_steps": 3635, "loss": 0.2484, "lr": 3.91200632738775e-05, "epoch": 0.9293253786140432, "percentage": 18.57, "elapsed_time": "3:59:32", "remaining_time": "17:30:24"}
|
||||
{"current_steps": 680, "total_steps": 3635, "loss": 0.2495, "lr": 3.9091667859745225e-05, "epoch": 0.9362092703074805, "percentage": 18.71, "elapsed_time": "4:00:54", "remaining_time": "17:26:51"}
|
||||
{"current_steps": 685, "total_steps": 3635, "loss": 0.3283, "lr": 3.9062832172969206e-05, "epoch": 0.9430931620009179, "percentage": 18.84, "elapsed_time": "4:02:48", "remaining_time": "17:25:42"}
|
||||
{"current_steps": 690, "total_steps": 3635, "loss": 0.3196, "lr": 3.903355687852876e-05, "epoch": 0.9499770536943553, "percentage": 18.98, "elapsed_time": "4:04:55", "remaining_time": "17:25:21"}
|
||||
{"current_steps": 695, "total_steps": 3635, "loss": 0.2933, "lr": 3.900384265154098e-05, "epoch": 0.9568609453877925, "percentage": 19.12, "elapsed_time": "4:07:05", "remaining_time": "17:25:13"}
|
||||
{"current_steps": 700, "total_steps": 3635, "loss": 0.3011, "lr": 3.897369017724517e-05, "epoch": 0.9637448370812299, "percentage": 19.26, "elapsed_time": "4:09:11", "remaining_time": "17:24:50"}
|
||||
{"current_steps": 705, "total_steps": 3635, "loss": 0.2922, "lr": 3.894310015098705e-05, "epoch": 0.9706287287746673, "percentage": 19.39, "elapsed_time": "4:11:14", "remaining_time": "17:24:09"}
|
||||
{"current_steps": 710, "total_steps": 3635, "loss": 0.2917, "lr": 3.8912073278202716e-05, "epoch": 0.9775126204681046, "percentage": 19.53, "elapsed_time": "4:13:18", "remaining_time": "17:23:33"}
|
||||
{"current_steps": 715, "total_steps": 3635, "loss": 0.2855, "lr": 3.88806102744024e-05, "epoch": 0.984396512161542, "percentage": 19.67, "elapsed_time": "4:15:26", "remaining_time": "17:23:13"}
|
||||
{"current_steps": 720, "total_steps": 3635, "loss": 0.2667, "lr": 3.8848711865153906e-05, "epoch": 0.9912804038549794, "percentage": 19.81, "elapsed_time": "4:17:30", "remaining_time": "17:22:32"}
|
||||
{"current_steps": 725, "total_steps": 3635, "loss": 0.2714, "lr": 3.881637878606594e-05, "epoch": 0.9981642955484167, "percentage": 19.94, "elapsed_time": "4:19:34", "remaining_time": "17:21:54"}
|
||||
{"current_steps": 730, "total_steps": 3635, "loss": 0.5635, "lr": 3.878361178277111e-05, "epoch": 1.0041303350160624, "percentage": 20.08, "elapsed_time": "4:21:05", "remaining_time": "17:18:58"}
|
||||
{"current_steps": 735, "total_steps": 3635, "loss": 0.564, "lr": 3.875041161090875e-05, "epoch": 1.0110142267094997, "percentage": 20.22, "elapsed_time": "4:22:36", "remaining_time": "17:16:08"}
|
||||
{"current_steps": 740, "total_steps": 3635, "loss": 0.5041, "lr": 3.8716779036107494e-05, "epoch": 1.0178981184029372, "percentage": 20.36, "elapsed_time": "4:24:04", "remaining_time": "17:13:07"}
|
||||
{"current_steps": 745, "total_steps": 3635, "loss": 0.4711, "lr": 3.8682714833967606e-05, "epoch": 1.0247820100963745, "percentage": 20.5, "elapsed_time": "4:25:30", "remaining_time": "17:09:59"}
|
||||
{"current_steps": 750, "total_steps": 3635, "loss": 0.4479, "lr": 3.8648219790043086e-05, "epoch": 1.0316659017898118, "percentage": 20.63, "elapsed_time": "4:26:56", "remaining_time": "17:06:48"}
|
||||
{"current_steps": 755, "total_steps": 3635, "loss": 0.4401, "lr": 3.86132946998236e-05, "epoch": 1.0385497934832493, "percentage": 20.77, "elapsed_time": "4:28:19", "remaining_time": "17:03:32"}
|
||||
{"current_steps": 760, "total_steps": 3635, "loss": 0.433, "lr": 3.8577940368716076e-05, "epoch": 1.0454336851766866, "percentage": 20.91, "elapsed_time": "4:29:46", "remaining_time": "17:00:31"}
|
||||
{"current_steps": 765, "total_steps": 3635, "loss": 0.4184, "lr": 3.8542157612026184e-05, "epoch": 1.0523175768701238, "percentage": 21.05, "elapsed_time": "4:31:13", "remaining_time": "16:57:30"}
|
||||
{"current_steps": 770, "total_steps": 3635, "loss": 0.4161, "lr": 3.850594725493949e-05, "epoch": 1.0592014685635613, "percentage": 21.18, "elapsed_time": "4:32:44", "remaining_time": "16:54:47"}
|
||||
{"current_steps": 775, "total_steps": 3635, "loss": 0.4132, "lr": 3.846931013250246e-05, "epoch": 1.0660853602569986, "percentage": 21.32, "elapsed_time": "4:34:10", "remaining_time": "16:51:49"}
|
||||
{"current_steps": 780, "total_steps": 3635, "loss": 0.4042, "lr": 3.8432247089603167e-05, "epoch": 1.072969251950436, "percentage": 21.46, "elapsed_time": "4:35:46", "remaining_time": "16:49:23"}
|
||||
{"current_steps": 785, "total_steps": 3635, "loss": 0.3895, "lr": 3.8394758980951864e-05, "epoch": 1.0798531436438734, "percentage": 21.6, "elapsed_time": "4:37:22", "remaining_time": "16:47:01"}
|
||||
{"current_steps": 790, "total_steps": 3635, "loss": 0.3801, "lr": 3.83568466710612e-05, "epoch": 1.0867370353373107, "percentage": 21.73, "elapsed_time": "4:39:02", "remaining_time": "16:44:54"}
|
||||
{"current_steps": 795, "total_steps": 3635, "loss": 0.372, "lr": 3.8318511034226344e-05, "epoch": 1.093620927030748, "percentage": 21.87, "elapsed_time": "4:40:41", "remaining_time": "16:42:43"}
|
||||
{"current_steps": 800, "total_steps": 3635, "loss": 0.3748, "lr": 3.8279752954504814e-05, "epoch": 1.1005048187241855, "percentage": 22.01, "elapsed_time": "4:42:19", "remaining_time": "16:40:28"}
|
||||
{"current_steps": 805, "total_steps": 3635, "loss": 0.3691, "lr": 3.824057332569604e-05, "epoch": 1.1073887104176228, "percentage": 22.15, "elapsed_time": "4:43:56", "remaining_time": "16:38:11"}
|
||||
{"current_steps": 810, "total_steps": 3635, "loss": 0.3668, "lr": 3.820097305132081e-05, "epoch": 1.11427260211106, "percentage": 22.28, "elapsed_time": "4:45:41", "remaining_time": "16:36:23"}
|
||||
{"current_steps": 815, "total_steps": 3635, "loss": 0.3703, "lr": 3.8160953044600426e-05, "epoch": 1.1211564938044976, "percentage": 22.42, "elapsed_time": "4:47:20", "remaining_time": "16:34:12"}
|
||||
{"current_steps": 820, "total_steps": 3635, "loss": 0.3606, "lr": 3.81205142284356e-05, "epoch": 1.1280403854979348, "percentage": 22.56, "elapsed_time": "4:49:02", "remaining_time": "16:32:15"}
|
||||
{"current_steps": 825, "total_steps": 3635, "loss": 0.3648, "lr": 3.807965753538522e-05, "epoch": 1.1349242771913721, "percentage": 22.7, "elapsed_time": "4:50:47", "remaining_time": "16:30:27"}
|
||||
{"current_steps": 830, "total_steps": 3635, "loss": 0.363, "lr": 3.8038383907644834e-05, "epoch": 1.1418081688848096, "percentage": 22.83, "elapsed_time": "4:52:23", "remaining_time": "16:28:10"}
|
||||
{"current_steps": 835, "total_steps": 3635, "loss": 0.3578, "lr": 3.7996694297024895e-05, "epoch": 1.148692060578247, "percentage": 22.97, "elapsed_time": "4:54:28", "remaining_time": "16:27:27"}
|
||||
{"current_steps": 840, "total_steps": 3635, "loss": 0.3581, "lr": 3.795458966492884e-05, "epoch": 1.1555759522716842, "percentage": 23.11, "elapsed_time": "4:56:44", "remaining_time": "16:27:21"}
|
||||
{"current_steps": 845, "total_steps": 3635, "loss": 0.3575, "lr": 3.7912070982330906e-05, "epoch": 1.1624598439651217, "percentage": 23.25, "elapsed_time": "4:58:54", "remaining_time": "16:26:54"}
|
||||
{"current_steps": 850, "total_steps": 3635, "loss": 0.3634, "lr": 3.786913922975375e-05, "epoch": 1.169343735658559, "percentage": 23.38, "elapsed_time": "5:01:03", "remaining_time": "16:26:26"}
|
||||
{"current_steps": 855, "total_steps": 3635, "loss": 0.3621, "lr": 3.782579539724582e-05, "epoch": 1.1762276273519963, "percentage": 23.52, "elapsed_time": "5:03:09", "remaining_time": "16:25:42"}
|
||||
{"current_steps": 860, "total_steps": 3635, "loss": 0.3465, "lr": 3.7782040484358547e-05, "epoch": 1.1831115190454338, "percentage": 23.66, "elapsed_time": "5:05:23", "remaining_time": "16:25:25"}
|
||||
{"current_steps": 865, "total_steps": 3635, "loss": 0.3506, "lr": 3.773787550012325e-05, "epoch": 1.189995410738871, "percentage": 23.8, "elapsed_time": "5:07:36", "remaining_time": "16:25:04"}
|
||||
{"current_steps": 870, "total_steps": 3635, "loss": 0.3507, "lr": 3.7693301463027935e-05, "epoch": 1.1968793024323083, "percentage": 23.93, "elapsed_time": "5:09:44", "remaining_time": "16:24:24"}
|
||||
{"current_steps": 875, "total_steps": 3635, "loss": 0.3563, "lr": 3.764831940099374e-05, "epoch": 1.2037631941257458, "percentage": 24.07, "elapsed_time": "5:11:59", "remaining_time": "16:24:06"}
|
||||
{"current_steps": 880, "total_steps": 3635, "loss": 0.3447, "lr": 3.760293035135127e-05, "epoch": 1.2106470858191831, "percentage": 24.21, "elapsed_time": "5:14:12", "remaining_time": "16:23:41"}
|
||||
{"current_steps": 885, "total_steps": 3635, "loss": 0.3545, "lr": 3.755713536081667e-05, "epoch": 1.2175309775126204, "percentage": 24.35, "elapsed_time": "5:16:24", "remaining_time": "16:23:09"}
|
||||
{"current_steps": 890, "total_steps": 3635, "loss": 0.3461, "lr": 3.751093548546748e-05, "epoch": 1.224414869206058, "percentage": 24.48, "elapsed_time": "5:18:34", "remaining_time": "16:22:33"}
|
||||
{"current_steps": 895, "total_steps": 3635, "loss": 0.3492, "lr": 3.7464331790718265e-05, "epoch": 1.2312987608994952, "percentage": 24.62, "elapsed_time": "5:20:46", "remaining_time": "16:22:03"}
|
||||
{"current_steps": 900, "total_steps": 3635, "loss": 0.3434, "lr": 3.7417325351296086e-05, "epoch": 1.2381826525929325, "percentage": 24.76, "elapsed_time": "5:22:53", "remaining_time": "16:21:14"}
|
||||
{"current_steps": 905, "total_steps": 3635, "loss": 0.3454, "lr": 3.736991725121567e-05, "epoch": 1.24506654428637, "percentage": 24.9, "elapsed_time": "5:25:17", "remaining_time": "16:21:16"}
|
||||
{"current_steps": 910, "total_steps": 3635, "loss": 0.3399, "lr": 3.732210858375447e-05, "epoch": 1.2519504359798073, "percentage": 25.03, "elapsed_time": "5:27:29", "remaining_time": "16:20:41"}
|
||||
{"current_steps": 915, "total_steps": 3635, "loss": 0.3421, "lr": 3.727390045142737e-05, "epoch": 1.2588343276732445, "percentage": 25.17, "elapsed_time": "5:29:33", "remaining_time": "16:19:40"}
|
||||
{"current_steps": 920, "total_steps": 3635, "loss": 0.3473, "lr": 3.722529396596133e-05, "epoch": 1.2657182193666818, "percentage": 25.31, "elapsed_time": "5:31:45", "remaining_time": "16:19:01"}
|
||||
{"current_steps": 925, "total_steps": 3635, "loss": 0.3733, "lr": 3.717629024826974e-05, "epoch": 1.2726021110601193, "percentage": 25.45, "elapsed_time": "5:32:46", "remaining_time": "16:14:57"}
|
||||
{"current_steps": 930, "total_steps": 3635, "loss": 0.374, "lr": 3.712689042842654e-05, "epoch": 1.2794860027535566, "percentage": 25.58, "elapsed_time": "5:33:56", "remaining_time": "16:11:18"}
|
||||
{"current_steps": 935, "total_steps": 3635, "loss": 0.785, "lr": 3.707709564564017e-05, "epoch": 1.286369894446994, "percentage": 25.72, "elapsed_time": "5:35:43", "remaining_time": "16:09:27"}
|
||||
{"current_steps": 940, "total_steps": 3635, "loss": 0.6665, "lr": 3.70269070482273e-05, "epoch": 1.2932537861404314, "percentage": 25.86, "elapsed_time": "5:37:29", "remaining_time": "16:07:36"}
|
||||
{"current_steps": 945, "total_steps": 3635, "loss": 0.6768, "lr": 3.697632579358638e-05, "epoch": 1.3001376778338687, "percentage": 26.0, "elapsed_time": "5:39:11", "remaining_time": "16:05:31"}
|
||||
{"current_steps": 950, "total_steps": 3635, "loss": 0.629, "lr": 3.69253530481709e-05, "epoch": 1.307021569527306, "percentage": 26.13, "elapsed_time": "5:40:57", "remaining_time": "16:03:40"}
|
||||
{"current_steps": 955, "total_steps": 3635, "loss": 0.6108, "lr": 3.687398998746252e-05, "epoch": 1.3139054612207435, "percentage": 26.27, "elapsed_time": "5:42:48", "remaining_time": "16:02:00"}
|
||||
{"current_steps": 960, "total_steps": 3635, "loss": 0.606, "lr": 3.6822237795943954e-05, "epoch": 1.3207893529141808, "percentage": 26.41, "elapsed_time": "5:44:31", "remaining_time": "15:59:59"}
|
||||
{"current_steps": 965, "total_steps": 3635, "loss": 0.596, "lr": 3.6770097667071644e-05, "epoch": 1.327673244607618, "percentage": 26.55, "elapsed_time": "5:46:16", "remaining_time": "15:58:04"}
|
||||
{"current_steps": 970, "total_steps": 3635, "loss": 0.6175, "lr": 3.671757080324826e-05, "epoch": 1.3345571363010555, "percentage": 26.69, "elapsed_time": "5:47:50", "remaining_time": "15:55:38"}
|
||||
{"current_steps": 975, "total_steps": 3635, "loss": 0.5951, "lr": 3.666465841579497e-05, "epoch": 1.3414410279944928, "percentage": 26.82, "elapsed_time": "5:49:39", "remaining_time": "15:53:57"}
|
||||
{"current_steps": 980, "total_steps": 3635, "loss": 0.5772, "lr": 3.661136172492347e-05, "epoch": 1.3483249196879301, "percentage": 26.96, "elapsed_time": "5:51:18", "remaining_time": "15:51:44"}
|
||||
{"current_steps": 985, "total_steps": 3635, "loss": 0.5882, "lr": 3.6557681959707905e-05, "epoch": 1.3552088113813676, "percentage": 27.1, "elapsed_time": "5:53:01", "remaining_time": "15:49:44"}
|
||||
{"current_steps": 990, "total_steps": 3635, "loss": 0.5854, "lr": 3.6503620358056455e-05, "epoch": 1.362092703074805, "percentage": 27.24, "elapsed_time": "5:54:49", "remaining_time": "15:47:59"}
|
||||
{"current_steps": 995, "total_steps": 3635, "loss": 0.571, "lr": 3.6449178166682846e-05, "epoch": 1.3689765947682422, "percentage": 27.37, "elapsed_time": "5:56:32", "remaining_time": "15:46:00"}
|
||||
{"current_steps": 1000, "total_steps": 3635, "loss": 0.598, "lr": 3.639435664107758e-05, "epoch": 1.3758604864616797, "percentage": 27.51, "elapsed_time": "5:58:23", "remaining_time": "15:44:21"}
|
||||
{"current_steps": 1005, "total_steps": 3635, "loss": 0.594, "lr": 3.633915704547897e-05, "epoch": 1.382744378155117, "percentage": 27.65, "elapsed_time": "6:00:01", "remaining_time": "15:42:10"}
|
||||
{"current_steps": 1010, "total_steps": 3635, "loss": 0.5874, "lr": 3.628358065284399e-05, "epoch": 1.3896282698485543, "percentage": 27.79, "elapsed_time": "6:01:52", "remaining_time": "15:40:32"}
|
||||
{"current_steps": 1015, "total_steps": 3635, "loss": 0.5892, "lr": 3.6227628744818935e-05, "epoch": 1.3965121615419918, "percentage": 27.92, "elapsed_time": "6:03:38", "remaining_time": "15:38:38"}
|
||||
{"current_steps": 1020, "total_steps": 3635, "loss": 0.5805, "lr": 3.617130261170985e-05, "epoch": 1.403396053235429, "percentage": 28.06, "elapsed_time": "6:05:18", "remaining_time": "15:36:34"}
|
||||
{"current_steps": 1025, "total_steps": 3635, "loss": 0.6015, "lr": 3.611460355245279e-05, "epoch": 1.4102799449288663, "percentage": 28.2, "elapsed_time": "6:07:04", "remaining_time": "15:34:43"}
|
||||
{"current_steps": 1030, "total_steps": 3635, "loss": 0.5255, "lr": 3.6057532874583825e-05, "epoch": 1.4171638366223038, "percentage": 28.34, "elapsed_time": "6:08:44", "remaining_time": "15:32:34"}
|
||||
{"current_steps": 1035, "total_steps": 3635, "loss": 0.195, "lr": 3.600009189420895e-05, "epoch": 1.4240477283157411, "percentage": 28.47, "elapsed_time": "6:09:55", "remaining_time": "15:29:15"}
|
||||
{"current_steps": 1040, "total_steps": 3635, "loss": 0.1805, "lr": 3.594228193597369e-05, "epoch": 1.4309316200091784, "percentage": 28.61, "elapsed_time": "6:11:02", "remaining_time": "15:25:48"}
|
||||
{"current_steps": 1045, "total_steps": 3635, "loss": 0.1786, "lr": 3.588410433303253e-05, "epoch": 1.437815511702616, "percentage": 28.75, "elapsed_time": "6:12:15", "remaining_time": "15:22:38"}
|
||||
{"current_steps": 1050, "total_steps": 3635, "loss": 0.1734, "lr": 3.582556042701825e-05, "epoch": 1.4446994033960532, "percentage": 28.89, "elapsed_time": "6:13:27", "remaining_time": "15:19:25"}
|
||||
{"current_steps": 1055, "total_steps": 3635, "loss": 0.1756, "lr": 3.5766651568010926e-05, "epoch": 1.4515832950894905, "percentage": 29.02, "elapsed_time": "6:14:38", "remaining_time": "15:16:10"}
|
||||
{"current_steps": 1060, "total_steps": 3635, "loss": 0.1717, "lr": 3.5707379114506785e-05, "epoch": 1.458467186782928, "percentage": 29.16, "elapsed_time": "6:15:50", "remaining_time": "15:13:01"}
|
||||
{"current_steps": 1065, "total_steps": 3635, "loss": 0.1729, "lr": 3.564774443338692e-05, "epoch": 1.4653510784763653, "percentage": 29.3, "elapsed_time": "6:17:07", "remaining_time": "15:10:04"}
|
||||
{"current_steps": 1070, "total_steps": 3635, "loss": 0.1731, "lr": 3.558774889988577e-05, "epoch": 1.4722349701698025, "percentage": 29.44, "elapsed_time": "6:18:26", "remaining_time": "15:07:11"}
|
||||
{"current_steps": 1075, "total_steps": 3635, "loss": 0.1673, "lr": 3.552739389755934e-05, "epoch": 1.47911886186324, "percentage": 29.57, "elapsed_time": "6:19:41", "remaining_time": "15:04:10"}
|
||||
{"current_steps": 1080, "total_steps": 3635, "loss": 0.166, "lr": 3.546668081825337e-05, "epoch": 1.4860027535566773, "percentage": 29.71, "elapsed_time": "6:21:03", "remaining_time": "15:01:28"}
|
||||
{"current_steps": 1085, "total_steps": 3635, "loss": 0.1704, "lr": 3.54056110620712e-05, "epoch": 1.4928866452501146, "percentage": 29.85, "elapsed_time": "6:22:24", "remaining_time": "14:58:45"}
|
||||
{"current_steps": 1090, "total_steps": 3635, "loss": 0.1652, "lr": 3.534418603734149e-05, "epoch": 1.4997705369435521, "percentage": 29.99, "elapsed_time": "6:23:38", "remaining_time": "14:55:44"}
|
||||
{"current_steps": 1095, "total_steps": 3635, "loss": 0.1659, "lr": 3.5282407160585736e-05, "epoch": 1.5066544286369894, "percentage": 30.12, "elapsed_time": "6:24:58", "remaining_time": "14:52:58"}
|
||||
{"current_steps": 1100, "total_steps": 3635, "loss": 0.1652, "lr": 3.522027585648562e-05, "epoch": 1.5135383203304267, "percentage": 30.26, "elapsed_time": "6:26:17", "remaining_time": "14:50:13"}
|
||||
{"current_steps": 1105, "total_steps": 3635, "loss": 0.1632, "lr": 3.515779355785015e-05, "epoch": 1.5204222120238642, "percentage": 30.4, "elapsed_time": "6:27:33", "remaining_time": "14:47:20"}
|
||||
{"current_steps": 1110, "total_steps": 3635, "loss": 0.167, "lr": 3.50949617055826e-05, "epoch": 1.5273061037173015, "percentage": 30.54, "elapsed_time": "6:28:40", "remaining_time": "14:44:09"}
|
||||
{"current_steps": 1115, "total_steps": 3635, "loss": 0.166, "lr": 3.5031781748647286e-05, "epoch": 1.5341899954107387, "percentage": 30.67, "elapsed_time": "6:29:57", "remaining_time": "14:41:20"}
|
||||
{"current_steps": 1120, "total_steps": 3635, "loss": 0.1664, "lr": 3.496825514403618e-05, "epoch": 1.5410738871041763, "percentage": 30.81, "elapsed_time": "6:31:06", "remaining_time": "14:38:15"}
|
||||
{"current_steps": 1125, "total_steps": 3635, "loss": 0.1652, "lr": 3.490438335673529e-05, "epoch": 1.5479577787976135, "percentage": 30.95, "elapsed_time": "6:32:19", "remaining_time": "14:35:19"}
|
||||
{"current_steps": 1130, "total_steps": 3635, "loss": 0.1618, "lr": 3.484016785969087e-05, "epoch": 1.5548416704910508, "percentage": 31.09, "elapsed_time": "6:33:29", "remaining_time": "14:32:17"}
|
||||
{"current_steps": 1135, "total_steps": 3635, "loss": 0.3155, "lr": 3.4775610133775444e-05, "epoch": 1.5617255621844883, "percentage": 31.22, "elapsed_time": "6:34:51", "remaining_time": "14:29:43"}
|
||||
{"current_steps": 1140, "total_steps": 3635, "loss": 0.5021, "lr": 3.47107116677537e-05, "epoch": 1.5686094538779256, "percentage": 31.36, "elapsed_time": "6:36:41", "remaining_time": "14:28:11"}
|
||||
{"current_steps": 1145, "total_steps": 3635, "loss": 0.4634, "lr": 3.464547395824811e-05, "epoch": 1.575493345571363, "percentage": 31.5, "elapsed_time": "6:38:26", "remaining_time": "14:26:29"}
|
||||
{"current_steps": 1150, "total_steps": 3635, "loss": 0.513, "lr": 3.457989850970444e-05, "epoch": 1.5823772372648004, "percentage": 31.64, "elapsed_time": "6:40:07", "remaining_time": "14:24:37"}
|
||||
{"current_steps": 1155, "total_steps": 3635, "loss": 0.4809, "lr": 3.451398683435704e-05, "epoch": 1.5892611289582377, "percentage": 31.77, "elapsed_time": "6:41:54", "remaining_time": "14:22:57"}
|
||||
{"current_steps": 1160, "total_steps": 3635, "loss": 0.4868, "lr": 3.4447740452193995e-05, "epoch": 1.596145020651675, "percentage": 31.91, "elapsed_time": "6:43:34", "remaining_time": "14:21:04"}
|
||||
{"current_steps": 1165, "total_steps": 3635, "loss": 0.4682, "lr": 3.438116089092205e-05, "epoch": 1.6030289123451125, "percentage": 32.05, "elapsed_time": "6:45:20", "remaining_time": "14:19:23"}
|
||||
{"current_steps": 1170, "total_steps": 3635, "loss": 0.4608, "lr": 3.43142496859314e-05, "epoch": 1.6099128040385497, "percentage": 32.19, "elapsed_time": "6:47:08", "remaining_time": "14:17:47"}
|
||||
{"current_steps": 1175, "total_steps": 3635, "loss": 0.488, "lr": 3.4247008380260244e-05, "epoch": 1.616796695731987, "percentage": 32.32, "elapsed_time": "6:48:53", "remaining_time": "14:16:04"}
|
||||
{"current_steps": 1180, "total_steps": 3635, "loss": 0.4728, "lr": 3.4179438524559255e-05, "epoch": 1.6236805874254245, "percentage": 32.46, "elapsed_time": "6:50:38", "remaining_time": "14:14:21"}
|
||||
{"current_steps": 1185, "total_steps": 3635, "loss": 0.4511, "lr": 3.411154167705578e-05, "epoch": 1.6305644791188618, "percentage": 32.6, "elapsed_time": "6:52:30", "remaining_time": "14:12:51"}
|
||||
{"current_steps": 1190, "total_steps": 3635, "loss": 0.4907, "lr": 3.404331940351793e-05, "epoch": 1.637448370812299, "percentage": 32.74, "elapsed_time": "6:54:13", "remaining_time": "14:11:03"}
|
||||
{"current_steps": 1195, "total_steps": 3635, "loss": 0.4596, "lr": 3.397477327721844e-05, "epoch": 1.6443322625057366, "percentage": 32.87, "elapsed_time": "6:56:01", "remaining_time": "14:09:27"}
|
||||
{"current_steps": 1200, "total_steps": 3635, "loss": 0.4909, "lr": 3.390590487889842e-05, "epoch": 1.651216154199174, "percentage": 33.01, "elapsed_time": "6:57:42", "remaining_time": "14:07:36"}
|
||||
{"current_steps": 1205, "total_steps": 3635, "loss": 0.47, "lr": 3.38367157967309e-05, "epoch": 1.6581000458926112, "percentage": 33.15, "elapsed_time": "6:59:40", "remaining_time": "14:06:18"}
|
||||
{"current_steps": 1210, "total_steps": 3635, "loss": 0.4744, "lr": 3.3767207626284184e-05, "epoch": 1.6649839375860487, "percentage": 33.29, "elapsed_time": "7:01:28", "remaining_time": "14:04:41"}
|
||||
{"current_steps": 1215, "total_steps": 3635, "loss": 0.4813, "lr": 3.369738197048505e-05, "epoch": 1.671867829279486, "percentage": 33.43, "elapsed_time": "7:03:06", "remaining_time": "14:02:43"}
|
||||
{"current_steps": 1220, "total_steps": 3635, "loss": 0.4755, "lr": 3.362724043958184e-05, "epoch": 1.6787517209729232, "percentage": 33.56, "elapsed_time": "7:04:47", "remaining_time": "14:00:52"}
|
||||
{"current_steps": 1225, "total_steps": 3635, "loss": 0.4506, "lr": 3.355678465110725e-05, "epoch": 1.6856356126663607, "percentage": 33.7, "elapsed_time": "7:06:30", "remaining_time": "13:59:05"}
|
||||
{"current_steps": 1230, "total_steps": 3635, "loss": 0.4816, "lr": 3.348601622984107e-05, "epoch": 1.692519504359798, "percentage": 33.84, "elapsed_time": "7:08:16", "remaining_time": "13:57:23"}
|
||||
{"current_steps": 1235, "total_steps": 3635, "loss": 0.4764, "lr": 3.3414936807772736e-05, "epoch": 1.6994033960532353, "percentage": 33.98, "elapsed_time": "7:10:02", "remaining_time": "13:55:42"}
|
||||
{"current_steps": 1240, "total_steps": 3635, "loss": 0.4142, "lr": 3.334354802406364e-05, "epoch": 1.7062872877466728, "percentage": 34.11, "elapsed_time": "7:11:59", "remaining_time": "13:54:21"}
|
||||
{"current_steps": 1245, "total_steps": 3635, "loss": 0.302, "lr": 3.327185152500937e-05, "epoch": 1.71317117944011, "percentage": 34.25, "elapsed_time": "7:14:26", "remaining_time": "13:53:59"}
|
||||
{"current_steps": 1250, "total_steps": 3635, "loss": 0.2803, "lr": 3.3199848964001744e-05, "epoch": 1.7200550711335474, "percentage": 34.39, "elapsed_time": "7:16:53", "remaining_time": "13:53:34"}
|
||||
{"current_steps": 1255, "total_steps": 3635, "loss": 0.2977, "lr": 3.312754200149065e-05, "epoch": 1.726938962826985, "percentage": 34.53, "elapsed_time": "7:19:19", "remaining_time": "13:53:08"}
|
||||
{"current_steps": 1260, "total_steps": 3635, "loss": 0.2856, "lr": 3.305493230494576e-05, "epoch": 1.7338228545204222, "percentage": 34.66, "elapsed_time": "7:21:45", "remaining_time": "13:52:40"}
|
||||
{"current_steps": 1265, "total_steps": 3635, "loss": 0.3108, "lr": 3.298202154881814e-05, "epoch": 1.7407067462138595, "percentage": 34.8, "elapsed_time": "7:24:03", "remaining_time": "13:51:57"}
|
||||
{"current_steps": 1270, "total_steps": 3635, "loss": 0.2734, "lr": 3.2908811414501545e-05, "epoch": 1.747590637907297, "percentage": 34.94, "elapsed_time": "7:26:31", "remaining_time": "13:51:30"}
|
||||
{"current_steps": 1275, "total_steps": 3635, "loss": 0.29, "lr": 3.283530359029369e-05, "epoch": 1.7544745296007342, "percentage": 35.08, "elapsed_time": "7:28:55", "remaining_time": "13:50:57"}
|
||||
{"current_steps": 1280, "total_steps": 3635, "loss": 0.2718, "lr": 3.276149977135735e-05, "epoch": 1.7613584212941715, "percentage": 35.21, "elapsed_time": "7:31:22", "remaining_time": "13:50:26"}
|
||||
{"current_steps": 1285, "total_steps": 3635, "loss": 0.2876, "lr": 3.268740165968116e-05, "epoch": 1.768242312987609, "percentage": 35.35, "elapsed_time": "7:33:50", "remaining_time": "13:49:58"}
|
||||
{"current_steps": 1290, "total_steps": 3635, "loss": 0.2911, "lr": 3.261301096404051e-05, "epoch": 1.7751262046810463, "percentage": 35.49, "elapsed_time": "7:36:10", "remaining_time": "13:49:15"}
|
||||
{"current_steps": 1295, "total_steps": 3635, "loss": 0.2909, "lr": 3.2538329399958006e-05, "epoch": 1.7820100963744836, "percentage": 35.63, "elapsed_time": "7:38:38", "remaining_time": "13:48:45"}
|
||||
{"current_steps": 1300, "total_steps": 3635, "loss": 0.3069, "lr": 3.2463358689664e-05, "epoch": 1.788893988067921, "percentage": 35.76, "elapsed_time": "7:40:58", "remaining_time": "13:47:59"}
|
||||
{"current_steps": 1305, "total_steps": 3635, "loss": 0.265, "lr": 3.238810056205682e-05, "epoch": 1.7957778797613584, "percentage": 35.9, "elapsed_time": "7:43:25", "remaining_time": "13:47:25"}
|
||||
{"current_steps": 1310, "total_steps": 3635, "loss": 0.2753, "lr": 3.2312556752662946e-05, "epoch": 1.8026617714547957, "percentage": 36.04, "elapsed_time": "7:45:52", "remaining_time": "13:46:50"}
|
||||
{"current_steps": 1315, "total_steps": 3635, "loss": 0.2801, "lr": 3.223672900359693e-05, "epoch": 1.8095456631482332, "percentage": 36.18, "elapsed_time": "7:48:20", "remaining_time": "13:46:16"}
|
||||
{"current_steps": 1320, "total_steps": 3635, "loss": 0.264, "lr": 3.2160619063521274e-05, "epoch": 1.8164295548416705, "percentage": 36.31, "elapsed_time": "7:50:47", "remaining_time": "13:45:40"}
|
||||
{"current_steps": 1325, "total_steps": 3635, "loss": 0.2889, "lr": 3.2084228687606076e-05, "epoch": 1.8233134465351077, "percentage": 36.45, "elapsed_time": "7:53:01", "remaining_time": "13:44:40"}
|
||||
{"current_steps": 1330, "total_steps": 3635, "loss": 0.2693, "lr": 3.200755963748856e-05, "epoch": 1.8301973382285452, "percentage": 36.59, "elapsed_time": "7:55:30", "remaining_time": "13:44:05"}
|
||||
{"current_steps": 1335, "total_steps": 3635, "loss": 0.2806, "lr": 3.193061368123244e-05, "epoch": 1.8370812299219825, "percentage": 36.73, "elapsed_time": "7:57:43", "remaining_time": "13:43:02"}
|
||||
{"current_steps": 1340, "total_steps": 3635, "loss": 0.2375, "lr": 3.185339259328718e-05, "epoch": 1.8439651216154198, "percentage": 36.86, "elapsed_time": "7:59:03", "remaining_time": "13:40:28"}
|
||||
{"current_steps": 1345, "total_steps": 3635, "loss": 0.2296, "lr": 3.1775898154447035e-05, "epoch": 1.8508490133088573, "percentage": 37.0, "elapsed_time": "8:00:21", "remaining_time": "13:37:50"}
|
||||
{"current_steps": 1350, "total_steps": 3635, "loss": 0.2249, "lr": 3.169813215181e-05, "epoch": 1.8577329050022946, "percentage": 37.14, "elapsed_time": "8:01:40", "remaining_time": "13:35:16"}
|
||||
{"current_steps": 1355, "total_steps": 3635, "loss": 0.2269, "lr": 3.162009637873662e-05, "epoch": 1.8646167966957319, "percentage": 37.28, "elapsed_time": "8:03:02", "remaining_time": "13:32:48"}
|
||||
{"current_steps": 1360, "total_steps": 3635, "loss": 0.2237, "lr": 3.1541792634808586e-05, "epoch": 1.8715006883891694, "percentage": 37.41, "elapsed_time": "8:04:17", "remaining_time": "13:30:08"}
|
||||
{"current_steps": 1365, "total_steps": 3635, "loss": 0.2209, "lr": 3.146322272578726e-05, "epoch": 1.8783845800826067, "percentage": 37.55, "elapsed_time": "8:05:36", "remaining_time": "13:27:33"}
|
||||
{"current_steps": 1370, "total_steps": 3635, "loss": 0.2223, "lr": 3.138438846357208e-05, "epoch": 1.885268471776044, "percentage": 37.69, "elapsed_time": "8:06:53", "remaining_time": "13:24:57"}
|
||||
{"current_steps": 1375, "total_steps": 3635, "loss": 0.2204, "lr": 3.1305291666158645e-05, "epoch": 1.8921523634694815, "percentage": 37.83, "elapsed_time": "8:08:17", "remaining_time": "13:22:34"}
|
||||
{"current_steps": 1380, "total_steps": 3635, "loss": 0.2174, "lr": 3.1225934157596946e-05, "epoch": 1.8990362551629187, "percentage": 37.96, "elapsed_time": "8:09:38", "remaining_time": "13:20:05"}
|
||||
{"current_steps": 1385, "total_steps": 3635, "loss": 0.2163, "lr": 3.1146317767949205e-05, "epoch": 1.905920146856356, "percentage": 38.1, "elapsed_time": "8:10:57", "remaining_time": "13:17:34"}
|
||||
{"current_steps": 1390, "total_steps": 3635, "loss": 0.2147, "lr": 3.106644433324768e-05, "epoch": 1.9128040385497935, "percentage": 38.24, "elapsed_time": "8:12:16", "remaining_time": "13:15:05"}
|
||||
{"current_steps": 1395, "total_steps": 3635, "loss": 0.2167, "lr": 3.098631569545238e-05, "epoch": 1.9196879302432308, "percentage": 38.38, "elapsed_time": "8:13:42", "remaining_time": "13:12:46"}
|
||||
{"current_steps": 1400, "total_steps": 3635, "loss": 0.2135, "lr": 3.090593370240852e-05, "epoch": 1.926571821936668, "percentage": 38.51, "elapsed_time": "8:15:03", "remaining_time": "13:10:19"}
|
||||
{"current_steps": 1405, "total_steps": 3635, "loss": 0.2169, "lr": 3.082530020780392e-05, "epoch": 1.9334557136301056, "percentage": 38.65, "elapsed_time": "8:16:22", "remaining_time": "13:07:51"}
|
||||
{"current_steps": 1410, "total_steps": 3635, "loss": 0.2511, "lr": 3.074441707112632e-05, "epoch": 1.9403396053235429, "percentage": 38.79, "elapsed_time": "8:17:59", "remaining_time": "13:05:49"}
|
||||
{"current_steps": 1415, "total_steps": 3635, "loss": 0.2854, "lr": 3.0663286157620414e-05, "epoch": 1.9472234970169802, "percentage": 38.93, "elapsed_time": "8:20:04", "remaining_time": "13:04:34"}
|
||||
{"current_steps": 1420, "total_steps": 3635, "loss": 0.2593, "lr": 3.058190933824489e-05, "epoch": 1.9541073887104177, "percentage": 39.06, "elapsed_time": "8:22:11", "remaining_time": "13:03:21"}
|
||||
{"current_steps": 1425, "total_steps": 3635, "loss": 0.2617, "lr": 3.0500288489629263e-05, "epoch": 1.960991280403855, "percentage": 39.2, "elapsed_time": "8:24:17", "remaining_time": "13:02:06"}
|
||||
{"current_steps": 1430, "total_steps": 3635, "loss": 0.2623, "lr": 3.0418425494030596e-05, "epoch": 1.9678751720972922, "percentage": 39.34, "elapsed_time": "8:26:21", "remaining_time": "13:00:47"}
|
||||
{"current_steps": 1435, "total_steps": 3635, "loss": 0.259, "lr": 3.0336322239290118e-05, "epoch": 1.9747590637907297, "percentage": 39.48, "elapsed_time": "8:28:22", "remaining_time": "12:59:23"}
|
||||
{"current_steps": 1440, "total_steps": 3635, "loss": 0.2513, "lr": 3.0253980618789654e-05, "epoch": 1.981642955484167, "percentage": 39.61, "elapsed_time": "8:30:28", "remaining_time": "12:58:07"}
|
||||
{"current_steps": 1445, "total_steps": 3635, "loss": 0.2431, "lr": 3.0171402531407982e-05, "epoch": 1.9885268471776043, "percentage": 39.75, "elapsed_time": "8:32:28", "remaining_time": "12:56:42"}
|
||||
{"current_steps": 1450, "total_steps": 3635, "loss": 0.2389, "lr": 3.008858988147704e-05, "epoch": 1.9954107388710418, "percentage": 39.89, "elapsed_time": "8:34:35", "remaining_time": "12:55:26"}
|
||||
{"current_steps": 1455, "total_steps": 3635, "loss": 0.3707, "lr": 3.0005544578738005e-05, "epoch": 2.0013767783386873, "percentage": 40.03, "elapsed_time": "8:36:16", "remaining_time": "12:53:30"}
|
||||
{"current_steps": 1460, "total_steps": 3635, "loss": 0.5503, "lr": 2.9922268538297267e-05, "epoch": 2.008260670032125, "percentage": 40.17, "elapsed_time": "8:37:47", "remaining_time": "12:51:21"}
|
||||
{"current_steps": 1465, "total_steps": 3635, "loss": 0.4629, "lr": 2.9838763680582247e-05, "epoch": 2.0151445617255623, "percentage": 40.3, "elapsed_time": "8:39:19", "remaining_time": "12:49:14"}
|
||||
{"current_steps": 1470, "total_steps": 3635, "loss": 0.4265, "lr": 2.9755031931297113e-05, "epoch": 2.0220284534189994, "percentage": 40.44, "elapsed_time": "8:40:44", "remaining_time": "12:46:56"}
|
||||
{"current_steps": 1475, "total_steps": 3635, "loss": 0.3967, "lr": 2.9671075221378386e-05, "epoch": 2.028912345112437, "percentage": 40.58, "elapsed_time": "8:42:06", "remaining_time": "12:44:35"}
|
||||
{"current_steps": 1480, "total_steps": 3635, "loss": 0.3908, "lr": 2.9586895486950387e-05, "epoch": 2.0357962368058744, "percentage": 40.72, "elapsed_time": "8:43:33", "remaining_time": "12:42:20"}
|
||||
{"current_steps": 1485, "total_steps": 3635, "loss": 0.3858, "lr": 2.950249466928062e-05, "epoch": 2.0426801284993115, "percentage": 40.85, "elapsed_time": "8:45:03", "remaining_time": "12:40:10"}
|
||||
{"current_steps": 1490, "total_steps": 3635, "loss": 0.3794, "lr": 2.9417874714734977e-05, "epoch": 2.049564020192749, "percentage": 40.99, "elapsed_time": "8:46:24", "remaining_time": "12:37:48"}
|
||||
{"current_steps": 1495, "total_steps": 3635, "loss": 0.3706, "lr": 2.9333037574732847e-05, "epoch": 2.0564479118861865, "percentage": 41.13, "elapsed_time": "8:47:59", "remaining_time": "12:35:46"}
|
||||
{"current_steps": 1500, "total_steps": 3635, "loss": 0.3735, "lr": 2.9247985205702166e-05, "epoch": 2.0633318035796235, "percentage": 41.27, "elapsed_time": "8:49:23", "remaining_time": "12:33:30"}
|
||||
{"current_steps": 1505, "total_steps": 3635, "loss": 0.369, "lr": 2.9162719569034216e-05, "epoch": 2.070215695273061, "percentage": 41.4, "elapsed_time": "8:50:59", "remaining_time": "12:31:30"}
|
||||
{"current_steps": 1510, "total_steps": 3635, "loss": 0.3496, "lr": 2.9077242631038487e-05, "epoch": 2.0770995869664985, "percentage": 41.54, "elapsed_time": "8:52:42", "remaining_time": "12:29:40"}
|
||||
{"current_steps": 1515, "total_steps": 3635, "loss": 0.3384, "lr": 2.8991556362897248e-05, "epoch": 2.0839834786599356, "percentage": 41.68, "elapsed_time": "8:54:16", "remaining_time": "12:27:37"}
|
||||
{"current_steps": 1520, "total_steps": 3635, "loss": 0.3336, "lr": 2.890566274062015e-05, "epoch": 2.090867370353373, "percentage": 41.82, "elapsed_time": "8:56:02", "remaining_time": "12:25:52"}
|
||||
{"current_steps": 1525, "total_steps": 3635, "loss": 0.3326, "lr": 2.8819563744998626e-05, "epoch": 2.0977512620468106, "percentage": 41.95, "elapsed_time": "8:57:35", "remaining_time": "12:23:48"}
|
||||
{"current_steps": 1530, "total_steps": 3635, "loss": 0.3293, "lr": 2.8733261361560223e-05, "epoch": 2.1046351537402477, "percentage": 42.09, "elapsed_time": "8:59:13", "remaining_time": "12:21:52"}
|
||||
{"current_steps": 1535, "total_steps": 3635, "loss": 0.3285, "lr": 2.864675758052281e-05, "epoch": 2.111519045433685, "percentage": 42.23, "elapsed_time": "9:00:56", "remaining_time": "12:20:02"}
|
||||
{"current_steps": 1540, "total_steps": 3635, "loss": 0.3371, "lr": 2.8560054396748673e-05, "epoch": 2.1184029371271227, "percentage": 42.37, "elapsed_time": "9:02:37", "remaining_time": "12:18:11"}
|
||||
{"current_steps": 1545, "total_steps": 3635, "loss": 0.3277, "lr": 2.8473153809698546e-05, "epoch": 2.1252868288205597, "percentage": 42.5, "elapsed_time": "9:04:17", "remaining_time": "12:16:17"}
|
||||
{"current_steps": 1550, "total_steps": 3635, "loss": 0.3296, "lr": 2.8386057823385446e-05, "epoch": 2.1321707205139973, "percentage": 42.64, "elapsed_time": "9:05:59", "remaining_time": "12:14:26"}
|
||||
{"current_steps": 1555, "total_steps": 3635, "loss": 0.3261, "lr": 2.829876844632852e-05, "epoch": 2.1390546122074348, "percentage": 42.78, "elapsed_time": "9:07:40", "remaining_time": "12:12:34"}
|
||||
{"current_steps": 1560, "total_steps": 3635, "loss": 0.3273, "lr": 2.821128769150667e-05, "epoch": 2.145938503900872, "percentage": 42.92, "elapsed_time": "9:09:32", "remaining_time": "12:10:57"}
|
||||
{"current_steps": 1565, "total_steps": 3635, "loss": 0.3259, "lr": 2.8123617576312167e-05, "epoch": 2.1528223955943093, "percentage": 43.05, "elapsed_time": "9:11:47", "remaining_time": "12:09:50"}
|
||||
{"current_steps": 1570, "total_steps": 3635, "loss": 0.3258, "lr": 2.8035760122504126e-05, "epoch": 2.159706287287747, "percentage": 43.19, "elapsed_time": "9:14:01", "remaining_time": "12:08:42"}
|
||||
{"current_steps": 1575, "total_steps": 3635, "loss": 0.3265, "lr": 2.7947717356161867e-05, "epoch": 2.166590178981184, "percentage": 43.33, "elapsed_time": "9:16:09", "remaining_time": "12:07:25"}
|
||||
{"current_steps": 1580, "total_steps": 3635, "loss": 0.3337, "lr": 2.78594913076382e-05, "epoch": 2.1734740706746214, "percentage": 43.47, "elapsed_time": "9:18:13", "remaining_time": "12:06:03"}
|
||||
{"current_steps": 1585, "total_steps": 3635, "loss": 0.3227, "lr": 2.7771084011512603e-05, "epoch": 2.180357962368059, "percentage": 43.6, "elapsed_time": "9:20:25", "remaining_time": "12:04:50"}
|
||||
{"current_steps": 1590, "total_steps": 3635, "loss": 0.3184, "lr": 2.76824975065443e-05, "epoch": 2.187241854061496, "percentage": 43.74, "elapsed_time": "9:22:36", "remaining_time": "12:03:36"}
|
||||
{"current_steps": 1595, "total_steps": 3635, "loss": 0.3215, "lr": 2.7593733835625246e-05, "epoch": 2.1941257457549335, "percentage": 43.88, "elapsed_time": "9:24:48", "remaining_time": "12:02:23"}
|
||||
{"current_steps": 1600, "total_steps": 3635, "loss": 0.3241, "lr": 2.750479504573303e-05, "epoch": 2.201009637448371, "percentage": 44.02, "elapsed_time": "9:27:03", "remaining_time": "12:01:13"}
|
||||
{"current_steps": 1605, "total_steps": 3635, "loss": 0.3221, "lr": 2.7415683187883647e-05, "epoch": 2.207893529141808, "percentage": 44.15, "elapsed_time": "9:29:14", "remaining_time": "11:59:58"}
|
||||
{"current_steps": 1610, "total_steps": 3635, "loss": 0.3232, "lr": 2.7326400317084202e-05, "epoch": 2.2147774208352455, "percentage": 44.29, "elapsed_time": "9:31:27", "remaining_time": "11:58:45"}
|
||||
{"current_steps": 1615, "total_steps": 3635, "loss": 0.3214, "lr": 2.7236948492285535e-05, "epoch": 2.221661312528683, "percentage": 44.43, "elapsed_time": "9:33:38", "remaining_time": "11:57:29"}
|
||||
{"current_steps": 1620, "total_steps": 3635, "loss": 0.3232, "lr": 2.7147329776334742e-05, "epoch": 2.22854520422212, "percentage": 44.57, "elapsed_time": "9:35:46", "remaining_time": "11:56:10"}
|
||||
{"current_steps": 1625, "total_steps": 3635, "loss": 0.316, "lr": 2.7057546235927565e-05, "epoch": 2.2354290959155576, "percentage": 44.7, "elapsed_time": "9:37:57", "remaining_time": "11:54:53"}
|
||||
{"current_steps": 1630, "total_steps": 3635, "loss": 0.3173, "lr": 2.696759994156079e-05, "epoch": 2.242312987608995, "percentage": 44.84, "elapsed_time": "9:40:08", "remaining_time": "11:53:36"}
|
||||
{"current_steps": 1635, "total_steps": 3635, "loss": 0.3158, "lr": 2.6877492967484447e-05, "epoch": 2.249196879302432, "percentage": 44.98, "elapsed_time": "9:42:19", "remaining_time": "11:52:18"}
|
||||
{"current_steps": 1640, "total_steps": 3635, "loss": 0.3112, "lr": 2.6787227391654025e-05, "epoch": 2.2560807709958697, "percentage": 45.12, "elapsed_time": "9:44:28", "remaining_time": "11:50:59"}
|
||||
{"current_steps": 1645, "total_steps": 3635, "loss": 0.3215, "lr": 2.6696805295682487e-05, "epoch": 2.262964662689307, "percentage": 45.25, "elapsed_time": "9:46:38", "remaining_time": "11:49:40"}
|
||||
{"current_steps": 1650, "total_steps": 3635, "loss": 0.3332, "lr": 2.660622876479234e-05, "epoch": 2.2698485543827442, "percentage": 45.39, "elapsed_time": "9:48:04", "remaining_time": "11:47:28"}
|
||||
{"current_steps": 1655, "total_steps": 3635, "loss": 0.3377, "lr": 2.6515499887767495e-05, "epoch": 2.2767324460761817, "percentage": 45.53, "elapsed_time": "9:49:10", "remaining_time": "11:44:52"}
|
||||
{"current_steps": 1660, "total_steps": 3635, "loss": 0.6025, "lr": 2.642462075690511e-05, "epoch": 2.2836163377696193, "percentage": 45.67, "elapsed_time": "9:50:42", "remaining_time": "11:42:47"}
|
||||
{"current_steps": 1665, "total_steps": 3635, "loss": 0.6721, "lr": 2.633359346796736e-05, "epoch": 2.2905002294630563, "percentage": 45.8, "elapsed_time": "9:52:28", "remaining_time": "11:40:59"}
|
||||
{"current_steps": 1670, "total_steps": 3635, "loss": 0.6071, "lr": 2.6242420120133086e-05, "epoch": 2.297384121156494, "percentage": 45.94, "elapsed_time": "9:54:13", "remaining_time": "11:39:11"}
|
||||
{"current_steps": 1675, "total_steps": 3635, "loss": 0.59, "lr": 2.615110281594938e-05, "epoch": 2.3042680128499313, "percentage": 46.08, "elapsed_time": "9:55:57", "remaining_time": "11:37:21"}
|
||||
{"current_steps": 1680, "total_steps": 3635, "loss": 0.5761, "lr": 2.6059643661283116e-05, "epoch": 2.3111519045433684, "percentage": 46.22, "elapsed_time": "9:57:44", "remaining_time": "11:35:35"}
|
||||
{"current_steps": 1685, "total_steps": 3635, "loss": 0.5542, "lr": 2.5968044765272394e-05, "epoch": 2.318035796236806, "percentage": 46.35, "elapsed_time": "9:59:31", "remaining_time": "11:33:48"}
|
||||
{"current_steps": 1690, "total_steps": 3635, "loss": 0.561, "lr": 2.5876308240277857e-05, "epoch": 2.3249196879302434, "percentage": 46.49, "elapsed_time": "10:01:15", "remaining_time": "11:31:59"}
|
||||
{"current_steps": 1695, "total_steps": 3635, "loss": 0.5461, "lr": 2.578443620183403e-05, "epoch": 2.3318035796236805, "percentage": 46.63, "elapsed_time": "10:02:53", "remaining_time": "11:30:01"}
|
||||
{"current_steps": 1700, "total_steps": 3635, "loss": 0.54, "lr": 2.5692430768600512e-05, "epoch": 2.338687471317118, "percentage": 46.77, "elapsed_time": "10:04:36", "remaining_time": "11:28:11"}
|
||||
{"current_steps": 1705, "total_steps": 3635, "loss": 0.5428, "lr": 2.5600294062313103e-05, "epoch": 2.3455713630105555, "percentage": 46.91, "elapsed_time": "10:06:25", "remaining_time": "11:26:26"}
|
||||
{"current_steps": 1710, "total_steps": 3635, "loss": 0.5336, "lr": 2.55080282077349e-05, "epoch": 2.3524552547039925, "percentage": 47.04, "elapsed_time": "10:08:02", "remaining_time": "11:24:29"}
|
||||
{"current_steps": 1715, "total_steps": 3635, "loss": 0.5449, "lr": 2.5415635332607284e-05, "epoch": 2.35933914639743, "percentage": 47.18, "elapsed_time": "10:09:50", "remaining_time": "11:22:44"}
|
||||
{"current_steps": 1720, "total_steps": 3635, "loss": 0.5188, "lr": 2.5323117567600845e-05, "epoch": 2.3662230380908675, "percentage": 47.32, "elapsed_time": "10:11:34", "remaining_time": "11:20:55"}
|
||||
{"current_steps": 1725, "total_steps": 3635, "loss": 0.5298, "lr": 2.523047704626628e-05, "epoch": 2.3731069297843046, "percentage": 47.46, "elapsed_time": "10:13:21", "remaining_time": "11:19:08"}
|
||||
{"current_steps": 1730, "total_steps": 3635, "loss": 0.5431, "lr": 2.513771590498514e-05, "epoch": 2.379990821477742, "percentage": 47.59, "elapsed_time": "10:15:08", "remaining_time": "11:17:22"}
|
||||
{"current_steps": 1735, "total_steps": 3635, "loss": 0.5578, "lr": 2.504483628292061e-05, "epoch": 2.3868747131711796, "percentage": 47.73, "elapsed_time": "10:16:51", "remaining_time": "11:15:31"}
|
||||
{"current_steps": 1740, "total_steps": 3635, "loss": 0.5505, "lr": 2.4951840321968157e-05, "epoch": 2.3937586048646167, "percentage": 47.87, "elapsed_time": "10:18:37", "remaining_time": "11:13:44"}
|
||||
{"current_steps": 1745, "total_steps": 3635, "loss": 0.5337, "lr": 2.4858730166706125e-05, "epoch": 2.400642496558054, "percentage": 48.01, "elapsed_time": "10:20:22", "remaining_time": "11:11:55"}
|
||||
{"current_steps": 1750, "total_steps": 3635, "loss": 0.5376, "lr": 2.47655079643463e-05, "epoch": 2.4075263882514917, "percentage": 48.14, "elapsed_time": "10:22:09", "remaining_time": "11:10:09"}
|
||||
{"current_steps": 1755, "total_steps": 3635, "loss": 0.5434, "lr": 2.467217586468438e-05, "epoch": 2.4144102799449287, "percentage": 48.28, "elapsed_time": "10:23:54", "remaining_time": "11:08:21"}
|
||||
{"current_steps": 1760, "total_steps": 3635, "loss": 0.2811, "lr": 2.4578736020050423e-05, "epoch": 2.4212941716383662, "percentage": 48.42, "elapsed_time": "10:25:11", "remaining_time": "11:06:02"}
|
||||
{"current_steps": 1765, "total_steps": 3635, "loss": 0.1657, "lr": 2.448519058525915e-05, "epoch": 2.4281780633318037, "percentage": 48.56, "elapsed_time": "10:26:19", "remaining_time": "11:03:35"}
|
||||
{"current_steps": 1770, "total_steps": 3635, "loss": 0.1539, "lr": 2.4391541717560333e-05, "epoch": 2.435061955025241, "percentage": 48.69, "elapsed_time": "10:27:29", "remaining_time": "11:01:10"}
|
||||
{"current_steps": 1775, "total_steps": 3635, "loss": 0.1542, "lr": 2.4297791576588993e-05, "epoch": 2.4419458467186783, "percentage": 48.83, "elapsed_time": "10:28:41", "remaining_time": "10:58:47"}
|
||||
{"current_steps": 1780, "total_steps": 3635, "loss": 0.1489, "lr": 2.4203942324315623e-05, "epoch": 2.448829738412116, "percentage": 48.97, "elapsed_time": "10:29:57", "remaining_time": "10:56:29"}
|
||||
{"current_steps": 1785, "total_steps": 3635, "loss": 0.1514, "lr": 2.4109996124996297e-05, "epoch": 2.455713630105553, "percentage": 49.11, "elapsed_time": "10:31:07", "remaining_time": "10:54:06"}
|
||||
{"current_steps": 1790, "total_steps": 3635, "loss": 0.1481, "lr": 2.4015955145122807e-05, "epoch": 2.4625975217989904, "percentage": 49.24, "elapsed_time": "10:32:20", "remaining_time": "10:51:46"}
|
||||
{"current_steps": 1795, "total_steps": 3635, "loss": 0.1503, "lr": 2.3921821553372668e-05, "epoch": 2.469481413492428, "percentage": 49.38, "elapsed_time": "10:33:39", "remaining_time": "10:49:32"}
|
||||
{"current_steps": 1800, "total_steps": 3635, "loss": 0.1524, "lr": 2.3827597520559114e-05, "epoch": 2.476365305185865, "percentage": 49.52, "elapsed_time": "10:35:00", "remaining_time": "10:47:21"}
|
||||
{"current_steps": 1805, "total_steps": 3635, "loss": 0.1423, "lr": 2.3733285219581044e-05, "epoch": 2.4832491968793025, "percentage": 49.66, "elapsed_time": "10:36:22", "remaining_time": "10:45:11"}
|
||||
{"current_steps": 1810, "total_steps": 3635, "loss": 0.1457, "lr": 2.3638886825372905e-05, "epoch": 2.49013308857274, "percentage": 49.79, "elapsed_time": "10:37:45", "remaining_time": "10:43:02"}
|
||||
{"current_steps": 1815, "total_steps": 3635, "loss": 0.1463, "lr": 2.3544404514854546e-05, "epoch": 2.497016980266177, "percentage": 49.93, "elapsed_time": "10:39:05", "remaining_time": "10:40:51"}
|
||||
{"current_steps": 1820, "total_steps": 3635, "loss": 0.1453, "lr": 2.3449840466880982e-05, "epoch": 2.5039008719596145, "percentage": 50.07, "elapsed_time": "10:40:20", "remaining_time": "10:38:35"}
|
||||
{"current_steps": 1825, "total_steps": 3635, "loss": 0.1402, "lr": 2.3355196862192217e-05, "epoch": 2.510784763653052, "percentage": 50.21, "elapsed_time": "10:41:41", "remaining_time": "10:36:24"}
|
||||
{"current_steps": 1830, "total_steps": 3635, "loss": 0.1437, "lr": 2.3260475883362875e-05, "epoch": 2.517668655346489, "percentage": 50.34, "elapsed_time": "10:43:00", "remaining_time": "10:34:13"}
|
||||
{"current_steps": 1835, "total_steps": 3635, "loss": 0.1459, "lr": 2.316567971475192e-05, "epoch": 2.5245525470399266, "percentage": 50.48, "elapsed_time": "10:44:11", "remaining_time": "10:31:54"}
|
||||
{"current_steps": 1840, "total_steps": 3635, "loss": 0.143, "lr": 2.307081054245226e-05, "epoch": 2.5314364387333637, "percentage": 50.62, "elapsed_time": "10:45:23", "remaining_time": "10:29:36"}
|
||||
{"current_steps": 1845, "total_steps": 3635, "loss": 0.1444, "lr": 2.2975870554240355e-05, "epoch": 2.538320330426801, "percentage": 50.76, "elapsed_time": "10:46:35", "remaining_time": "10:27:18"}
|
||||
{"current_steps": 1850, "total_steps": 3635, "loss": 0.1463, "lr": 2.2880861939525723e-05, "epoch": 2.5452042221202387, "percentage": 50.89, "elapsed_time": "10:47:47", "remaining_time": "10:25:01"}
|
||||
{"current_steps": 1855, "total_steps": 3635, "loss": 0.1411, "lr": 2.2785786889300497e-05, "epoch": 2.552088113813676, "percentage": 51.03, "elapsed_time": "10:48:55", "remaining_time": "10:22:41"}
|
||||
{"current_steps": 1860, "total_steps": 3635, "loss": 0.1515, "lr": 2.2690647596088874e-05, "epoch": 2.5589720055071132, "percentage": 51.17, "elapsed_time": "10:50:06", "remaining_time": "10:20:23"}
|
||||
{"current_steps": 1865, "total_steps": 3635, "loss": 0.474, "lr": 2.2595446253896554e-05, "epoch": 2.5658558972005507, "percentage": 51.31, "elapsed_time": "10:51:53", "remaining_time": "10:18:41"}
|
||||
{"current_steps": 1870, "total_steps": 3635, "loss": 0.4628, "lr": 2.250018505816015e-05, "epoch": 2.572739788893988, "percentage": 51.44, "elapsed_time": "10:53:36", "remaining_time": "10:16:54"}
|
||||
{"current_steps": 1875, "total_steps": 3635, "loss": 0.4521, "lr": 2.2404866205696557e-05, "epoch": 2.5796236805874253, "percentage": 51.58, "elapsed_time": "10:55:17", "remaining_time": "10:15:06"}
|
||||
{"current_steps": 1880, "total_steps": 3635, "loss": 0.4504, "lr": 2.2309491894652285e-05, "epoch": 2.586507572280863, "percentage": 51.72, "elapsed_time": "10:57:07", "remaining_time": "10:13:25"}
|
||||
{"current_steps": 1885, "total_steps": 3635, "loss": 0.4171, "lr": 2.2214064324452785e-05, "epoch": 2.5933914639743003, "percentage": 51.86, "elapsed_time": "10:58:57", "remaining_time": "10:11:45"}
|
||||
{"current_steps": 1890, "total_steps": 3635, "loss": 0.4562, "lr": 2.2118585695751712e-05, "epoch": 2.6002753556677374, "percentage": 51.99, "elapsed_time": "11:00:33", "remaining_time": "10:09:52"}
|
||||
{"current_steps": 1895, "total_steps": 3635, "loss": 0.423, "lr": 2.202305821038017e-05, "epoch": 2.607159247361175, "percentage": 52.13, "elapsed_time": "11:02:19", "remaining_time": "10:08:08"}
|
||||
{"current_steps": 1900, "total_steps": 3635, "loss": 0.4207, "lr": 2.1927484071295965e-05, "epoch": 2.614043139054612, "percentage": 52.27, "elapsed_time": "11:04:06", "remaining_time": "10:06:26"}
|
||||
{"current_steps": 1905, "total_steps": 3635, "loss": 0.4435, "lr": 2.1831865482532753e-05, "epoch": 2.6209270307480494, "percentage": 52.41, "elapsed_time": "11:05:54", "remaining_time": "10:04:44"}
|
||||
{"current_steps": 1910, "total_steps": 3635, "loss": 0.4286, "lr": 2.173620464914929e-05, "epoch": 2.627810922441487, "percentage": 52.54, "elapsed_time": "11:07:40", "remaining_time": "10:03:00"}
|
||||
{"current_steps": 1915, "total_steps": 3635, "loss": 0.4432, "lr": 2.1640503777178484e-05, "epoch": 2.6346948141349245, "percentage": 52.68, "elapsed_time": "11:09:21", "remaining_time": "10:01:11"}
|
||||
{"current_steps": 1920, "total_steps": 3635, "loss": 0.4217, "lr": 2.154476507357661e-05, "epoch": 2.6415787058283615, "percentage": 52.82, "elapsed_time": "11:11:13", "remaining_time": "9:59:33"}
|
||||
{"current_steps": 1925, "total_steps": 3635, "loss": 0.4189, "lr": 2.1448990746172353e-05, "epoch": 2.648462597521799, "percentage": 52.96, "elapsed_time": "11:13:07", "remaining_time": "9:57:56"}
|
||||
{"current_steps": 1930, "total_steps": 3635, "loss": 0.4675, "lr": 2.1353183003615944e-05, "epoch": 2.655346489215236, "percentage": 53.09, "elapsed_time": "11:14:43", "remaining_time": "9:56:03"}
|
||||
{"current_steps": 1935, "total_steps": 3635, "loss": 0.4232, "lr": 2.1257344055328163e-05, "epoch": 2.6622303809086736, "percentage": 53.23, "elapsed_time": "11:16:34", "remaining_time": "9:54:24"}
|
||||
{"current_steps": 1940, "total_steps": 3635, "loss": 0.4524, "lr": 2.1161476111449466e-05, "epoch": 2.669114272602111, "percentage": 53.37, "elapsed_time": "11:18:14", "remaining_time": "9:52:34"}
|
||||
{"current_steps": 1945, "total_steps": 3635, "loss": 0.4234, "lr": 2.106558138278894e-05, "epoch": 2.6759981642955486, "percentage": 53.51, "elapsed_time": "11:19:58", "remaining_time": "9:50:49"}
|
||||
{"current_steps": 1950, "total_steps": 3635, "loss": 0.4354, "lr": 2.0969662080773387e-05, "epoch": 2.6828820559889857, "percentage": 53.65, "elapsed_time": "11:21:38", "remaining_time": "9:49:00"}
|
||||
{"current_steps": 1955, "total_steps": 3635, "loss": 0.4244, "lr": 2.0873720417396265e-05, "epoch": 2.689765947682423, "percentage": 53.78, "elapsed_time": "11:23:22", "remaining_time": "9:47:15"}
|
||||
{"current_steps": 1960, "total_steps": 3635, "loss": 0.4463, "lr": 2.0777758605166733e-05, "epoch": 2.6966498393758602, "percentage": 53.92, "elapsed_time": "11:25:08", "remaining_time": "9:45:30"}
|
||||
{"current_steps": 1965, "total_steps": 3635, "loss": 0.4208, "lr": 2.0681778857058584e-05, "epoch": 2.7035337310692977, "percentage": 54.06, "elapsed_time": "11:26:55", "remaining_time": "9:43:47"}
|
||||
{"current_steps": 1970, "total_steps": 3635, "loss": 0.3063, "lr": 2.0585783386459242e-05, "epoch": 2.7104176227627352, "percentage": 54.2, "elapsed_time": "11:29:16", "remaining_time": "9:42:33"}
|
||||
{"current_steps": 1975, "total_steps": 3635, "loss": 0.2554, "lr": 2.0489774407118695e-05, "epoch": 2.7173015144561727, "percentage": 54.33, "elapsed_time": "11:31:43", "remaining_time": "9:41:23"}
|
||||
{"current_steps": 1980, "total_steps": 3635, "loss": 0.2648, "lr": 2.039375413309847e-05, "epoch": 2.72418540614961, "percentage": 54.47, "elapsed_time": "11:34:08", "remaining_time": "9:40:11"}
|
||||
{"current_steps": 1985, "total_steps": 3635, "loss": 0.2704, "lr": 2.0297724778720553e-05, "epoch": 2.7310692978430473, "percentage": 54.61, "elapsed_time": "11:36:34", "remaining_time": "9:39:01"}
|
||||
{"current_steps": 1990, "total_steps": 3635, "loss": 0.2767, "lr": 2.0201688558516324e-05, "epoch": 2.7379531895364844, "percentage": 54.75, "elapsed_time": "11:38:52", "remaining_time": "9:37:42"}
|
||||
{"current_steps": 1995, "total_steps": 3635, "loss": 0.2601, "lr": 2.0105647687175507e-05, "epoch": 2.744837081229922, "percentage": 54.88, "elapsed_time": "11:41:20", "remaining_time": "9:36:32"}
|
||||
{"current_steps": 2000, "total_steps": 3635, "loss": 0.2541, "lr": 2.000960437949509e-05, "epoch": 2.7517209729233594, "percentage": 55.02, "elapsed_time": "11:43:48", "remaining_time": "9:35:21"}
|
||||
{"current_steps": 2005, "total_steps": 3635, "loss": 0.255, "lr": 1.991356085032823e-05, "epoch": 2.758604864616797, "percentage": 55.16, "elapsed_time": "11:46:11", "remaining_time": "9:34:06"}
|
||||
{"current_steps": 2010, "total_steps": 3635, "loss": 0.2556, "lr": 1.9817519314533203e-05, "epoch": 2.765488756310234, "percentage": 55.3, "elapsed_time": "11:48:38", "remaining_time": "9:32:54"}
|
||||
{"current_steps": 2015, "total_steps": 3635, "loss": 0.2695, "lr": 1.9721481986922314e-05, "epoch": 2.7723726480036714, "percentage": 55.43, "elapsed_time": "11:50:58", "remaining_time": "9:31:36"}
|
||||
{"current_steps": 1805, "total_steps": 3635, "loss": 0.1423, "lr": 2.3733285219581044e-05, "epoch": 2.4832491968793025, "percentage": 49.66, "elapsed_time": "0:01:26", "remaining_time": "0:01:27"}
|
||||
{"current_steps": 1810, "total_steps": 3635, "loss": 0.1457, "lr": 2.3638886825372905e-05, "epoch": 2.49013308857274, "percentage": 49.79, "elapsed_time": "0:02:53", "remaining_time": "0:02:55"}
|
||||
{"current_steps": 1815, "total_steps": 3635, "loss": 0.1462, "lr": 2.3544404514854546e-05, "epoch": 2.497016980266177, "percentage": 49.93, "elapsed_time": "0:04:19", "remaining_time": "0:04:19"}
|
||||
{"current_steps": 1820, "total_steps": 3635, "loss": 0.1453, "lr": 2.3449840466880982e-05, "epoch": 2.5039008719596145, "percentage": 50.07, "elapsed_time": "0:05:38", "remaining_time": "0:05:37"}
|
||||
{"current_steps": 1825, "total_steps": 3635, "loss": 0.1401, "lr": 2.3355196862192217e-05, "epoch": 2.510784763653052, "percentage": 50.21, "elapsed_time": "0:07:03", "remaining_time": "0:07:00"}
|
||||
{"current_steps": 1830, "total_steps": 3635, "loss": 0.1437, "lr": 2.3260475883362875e-05, "epoch": 2.517668655346489, "percentage": 50.34, "elapsed_time": "0:08:26", "remaining_time": "0:08:19"}
|
||||
{"current_steps": 1835, "total_steps": 3635, "loss": 0.1457, "lr": 2.316567971475192e-05, "epoch": 2.5245525470399266, "percentage": 50.48, "elapsed_time": "0:09:40", "remaining_time": "0:09:29"}
|
||||
{"current_steps": 1840, "total_steps": 3635, "loss": 0.1431, "lr": 2.307081054245226e-05, "epoch": 2.5314364387333637, "percentage": 50.62, "elapsed_time": "0:10:56", "remaining_time": "0:10:40"}
|
||||
{"current_steps": 1845, "total_steps": 3635, "loss": 0.1445, "lr": 2.2975870554240355e-05, "epoch": 2.538320330426801, "percentage": 50.76, "elapsed_time": "0:12:12", "remaining_time": "0:11:50"}
|
||||
{"current_steps": 1850, "total_steps": 3635, "loss": 0.1463, "lr": 2.2880861939525723e-05, "epoch": 2.5452042221202387, "percentage": 50.89, "elapsed_time": "0:13:27", "remaining_time": "0:12:59"}
|
||||
{"current_steps": 1855, "total_steps": 3635, "loss": 0.1411, "lr": 2.2785786889300497e-05, "epoch": 2.552088113813676, "percentage": 51.03, "elapsed_time": "0:14:40", "remaining_time": "0:14:04"}
|
||||
{"current_steps": 1860, "total_steps": 3635, "loss": 0.1515, "lr": 2.2690647596088874e-05, "epoch": 2.5589720055071132, "percentage": 51.17, "elapsed_time": "0:15:55", "remaining_time": "0:15:11"}
|
||||
{"current_steps": 1865, "total_steps": 3635, "loss": 0.474, "lr": 2.2595446253896554e-05, "epoch": 2.5658558972005507, "percentage": 51.31, "elapsed_time": "0:17:47", "remaining_time": "0:16:53"}
|
||||
{"current_steps": 1870, "total_steps": 3635, "loss": 0.4627, "lr": 2.250018505816015e-05, "epoch": 2.572739788893988, "percentage": 51.44, "elapsed_time": "0:19:38", "remaining_time": "0:18:31"}
|
||||
{"current_steps": 1875, "total_steps": 3635, "loss": 0.452, "lr": 2.2404866205696557e-05, "epoch": 2.5796236805874253, "percentage": 51.58, "elapsed_time": "0:21:25", "remaining_time": "0:20:06"}
|
||||
{"current_steps": 1880, "total_steps": 3635, "loss": 0.4504, "lr": 2.2309491894652285e-05, "epoch": 2.586507572280863, "percentage": 51.72, "elapsed_time": "0:23:18", "remaining_time": "0:21:45"}
|
||||
{"current_steps": 1885, "total_steps": 3635, "loss": 0.4171, "lr": 2.2214064324452785e-05, "epoch": 2.5933914639743003, "percentage": 51.86, "elapsed_time": "0:25:13", "remaining_time": "0:23:25"}
|
||||
{"current_steps": 1890, "total_steps": 3635, "loss": 0.4561, "lr": 2.2118585695751712e-05, "epoch": 2.6002753556677374, "percentage": 51.99, "elapsed_time": "0:26:52", "remaining_time": "0:24:48"}
|
||||
{"current_steps": 1895, "total_steps": 3635, "loss": 0.423, "lr": 2.202305821038017e-05, "epoch": 2.607159247361175, "percentage": 52.13, "elapsed_time": "0:28:40", "remaining_time": "0:26:19"}
|
||||
{"current_steps": 1900, "total_steps": 3635, "loss": 0.4208, "lr": 2.1927484071295965e-05, "epoch": 2.614043139054612, "percentage": 52.27, "elapsed_time": "0:30:30", "remaining_time": "0:27:51"}
|
||||
{"current_steps": 1905, "total_steps": 3635, "loss": 0.4435, "lr": 2.1831865482532753e-05, "epoch": 2.6209270307480494, "percentage": 52.41, "elapsed_time": "0:32:20", "remaining_time": "0:29:22"}
|
||||
{"current_steps": 1910, "total_steps": 3635, "loss": 0.4286, "lr": 2.173620464914929e-05, "epoch": 2.627810922441487, "percentage": 52.54, "elapsed_time": "0:34:09", "remaining_time": "0:30:51"}
|
||||
{"current_steps": 1915, "total_steps": 3635, "loss": 0.4433, "lr": 2.1640503777178484e-05, "epoch": 2.6346948141349245, "percentage": 52.68, "elapsed_time": "0:35:54", "remaining_time": "0:32:14"}
|
||||
{"current_steps": 1920, "total_steps": 3635, "loss": 0.4216, "lr": 2.154476507357661e-05, "epoch": 2.6415787058283615, "percentage": 52.82, "elapsed_time": "0:37:48", "remaining_time": "0:33:46"}
|
||||
{"current_steps": 1925, "total_steps": 3635, "loss": 0.4188, "lr": 2.1448990746172353e-05, "epoch": 2.648462597521799, "percentage": 52.96, "elapsed_time": "0:39:43", "remaining_time": "0:35:17"}
|
||||
{"current_steps": 1930, "total_steps": 3635, "loss": 0.4675, "lr": 2.1353183003615944e-05, "epoch": 2.655346489215236, "percentage": 53.09, "elapsed_time": "0:41:23", "remaining_time": "0:36:33"}
|
||||
{"current_steps": 1935, "total_steps": 3635, "loss": 0.4233, "lr": 2.1257344055328163e-05, "epoch": 2.6622303809086736, "percentage": 53.23, "elapsed_time": "0:43:18", "remaining_time": "0:38:03"}
|
||||
{"current_steps": 1940, "total_steps": 3635, "loss": 0.4524, "lr": 2.1161476111449466e-05, "epoch": 2.669114272602111, "percentage": 53.37, "elapsed_time": "0:45:01", "remaining_time": "0:39:20"}
|
||||
{"current_steps": 1945, "total_steps": 3635, "loss": 0.4234, "lr": 2.106558138278894e-05, "epoch": 2.6759981642955486, "percentage": 53.51, "elapsed_time": "0:46:48", "remaining_time": "0:40:40"}
|
||||
{"current_steps": 1950, "total_steps": 3635, "loss": 0.4354, "lr": 2.0969662080773387e-05, "epoch": 2.6828820559889857, "percentage": 53.65, "elapsed_time": "0:48:31", "remaining_time": "0:41:56"}
|
||||
{"current_steps": 1955, "total_steps": 3635, "loss": 0.4246, "lr": 2.0873720417396265e-05, "epoch": 2.689765947682423, "percentage": 53.78, "elapsed_time": "0:50:18", "remaining_time": "0:43:14"}
|
||||
{"current_steps": 1960, "total_steps": 3635, "loss": 0.4463, "lr": 2.0777758605166733e-05, "epoch": 2.6966498393758602, "percentage": 53.92, "elapsed_time": "0:52:07", "remaining_time": "0:44:32"}
|
||||
{"current_steps": 1965, "total_steps": 3635, "loss": 0.4209, "lr": 2.0681778857058584e-05, "epoch": 2.7035337310692977, "percentage": 54.06, "elapsed_time": "0:53:56", "remaining_time": "0:45:50"}
|
||||
{"current_steps": 1970, "total_steps": 3635, "loss": 0.3063, "lr": 2.0585783386459242e-05, "epoch": 2.7104176227627352, "percentage": 54.2, "elapsed_time": "0:56:19", "remaining_time": "0:47:35"}
|
||||
{"current_steps": 1975, "total_steps": 3635, "loss": 0.2554, "lr": 2.0489774407118695e-05, "epoch": 2.7173015144561727, "percentage": 54.33, "elapsed_time": "0:58:47", "remaining_time": "0:49:25"}
|
||||
{"current_steps": 1980, "total_steps": 3635, "loss": 0.2647, "lr": 2.039375413309847e-05, "epoch": 2.72418540614961, "percentage": 54.47, "elapsed_time": "1:01:15", "remaining_time": "0:51:12"}
|
||||
{"current_steps": 1985, "total_steps": 3635, "loss": 0.2704, "lr": 2.0297724778720553e-05, "epoch": 2.7310692978430473, "percentage": 54.61, "elapsed_time": "1:03:44", "remaining_time": "0:52:58"}
|
||||
{"current_steps": 1990, "total_steps": 3635, "loss": 0.2768, "lr": 2.0201688558516324e-05, "epoch": 2.7379531895364844, "percentage": 54.75, "elapsed_time": "1:06:03", "remaining_time": "0:54:36"}
|
||||
{"current_steps": 1995, "total_steps": 3635, "loss": 0.2601, "lr": 2.0105647687175507e-05, "epoch": 2.744837081229922, "percentage": 54.88, "elapsed_time": "1:08:33", "remaining_time": "0:56:21"}
|
||||
{"current_steps": 2000, "total_steps": 3635, "loss": 0.2542, "lr": 2.000960437949509e-05, "epoch": 2.7517209729233594, "percentage": 55.02, "elapsed_time": "1:11:02", "remaining_time": "0:58:04"}
|
||||
{"current_steps": 2005, "total_steps": 3635, "loss": 0.255, "lr": 1.991356085032823e-05, "epoch": 2.758604864616797, "percentage": 55.16, "elapsed_time": "1:13:27", "remaining_time": "0:59:42"}
|
||||
{"current_steps": 2010, "total_steps": 3635, "loss": 0.2556, "lr": 1.9817519314533203e-05, "epoch": 2.765488756310234, "percentage": 55.3, "elapsed_time": "1:15:55", "remaining_time": "1:01:22"}
|
||||
{"current_steps": 2015, "total_steps": 3635, "loss": 0.2694, "lr": 1.9721481986922314e-05, "epoch": 2.7723726480036714, "percentage": 55.43, "elapsed_time": "1:18:15", "remaining_time": "1:02:55"}
|
||||
{"current_steps": 2020, "total_steps": 3635, "loss": 0.2621, "lr": 1.9625451082210815e-05, "epoch": 2.7792565396971085, "percentage": 55.57, "elapsed_time": "1:20:44", "remaining_time": "1:04:32"}
|
||||
{"current_steps": 2025, "total_steps": 3635, "loss": 0.2696, "lr": 1.9529428814965855e-05, "epoch": 2.786140431390546, "percentage": 55.71, "elapsed_time": "1:23:05", "remaining_time": "1:06:03"}
|
||||
{"current_steps": 2030, "total_steps": 3635, "loss": 0.2625, "lr": 1.9433417399555385e-05, "epoch": 2.7930243230839835, "percentage": 55.85, "elapsed_time": "1:25:32", "remaining_time": "1:07:38"}
|
||||
{"current_steps": 2035, "total_steps": 3635, "loss": 0.2402, "lr": 1.93374190500971e-05, "epoch": 2.799908214777421, "percentage": 55.98, "elapsed_time": "1:28:02", "remaining_time": "1:09:13"}
|
||||
{"current_steps": 2040, "total_steps": 3635, "loss": 0.2668, "lr": 1.924143598040738e-05, "epoch": 2.806792106470858, "percentage": 56.12, "elapsed_time": "1:30:33", "remaining_time": "1:10:48"}
|
||||
{"current_steps": 2045, "total_steps": 3635, "loss": 0.2515, "lr": 1.9145470403950246e-05, "epoch": 2.8136759981642956, "percentage": 56.26, "elapsed_time": "1:33:02", "remaining_time": "1:12:20"}
|
||||
{"current_steps": 2050, "total_steps": 3635, "loss": 0.2461, "lr": 1.9049524533786306e-05, "epoch": 2.8205598898577326, "percentage": 56.4, "elapsed_time": "1:35:17", "remaining_time": "1:13:40"}
|
||||
{"current_steps": 2055, "total_steps": 3635, "loss": 0.2535, "lr": 1.8953600582521733e-05, "epoch": 2.82744378155117, "percentage": 56.53, "elapsed_time": "1:37:46", "remaining_time": "1:15:10"}
|
||||
{"current_steps": 2060, "total_steps": 3635, "loss": 0.2547, "lr": 1.8857700762257188e-05, "epoch": 2.8343276732446077, "percentage": 56.67, "elapsed_time": "1:40:13", "remaining_time": "1:16:37"}
|
||||
{"current_steps": 2065, "total_steps": 3635, "loss": 0.2277, "lr": 1.8761827284536894e-05, "epoch": 2.841211564938045, "percentage": 56.81, "elapsed_time": "1:41:45", "remaining_time": "1:17:22"}
|
||||
{"current_steps": 2070, "total_steps": 3635, "loss": 0.2047, "lr": 1.8665982360297564e-05, "epoch": 2.8480954566314822, "percentage": 56.95, "elapsed_time": "1:43:05", "remaining_time": "1:17:56"}
|
||||
{"current_steps": 2075, "total_steps": 3635, "loss": 0.1989, "lr": 1.8570168199817454e-05, "epoch": 2.8549793483249197, "percentage": 57.08, "elapsed_time": "1:44:26", "remaining_time": "1:18:31"}
|
||||
{"current_steps": 2080, "total_steps": 3635, "loss": 0.1992, "lr": 1.8474387012665357e-05, "epoch": 2.861863240018357, "percentage": 57.22, "elapsed_time": "1:45:50", "remaining_time": "1:19:07"}
|
||||
{"current_steps": 2085, "total_steps": 3635, "loss": 0.1959, "lr": 1.8378641007649686e-05, "epoch": 2.8687471317117943, "percentage": 57.36, "elapsed_time": "1:47:11", "remaining_time": "1:19:41"}
|
||||
{"current_steps": 2090, "total_steps": 3635, "loss": 0.1959, "lr": 1.8282932392767516e-05, "epoch": 2.875631023405232, "percentage": 57.5, "elapsed_time": "1:48:28", "remaining_time": "1:20:11"}
|
||||
{"current_steps": 2095, "total_steps": 3635, "loss": 0.1927, "lr": 1.8187263375153664e-05, "epoch": 2.8825149150986693, "percentage": 57.63, "elapsed_time": "1:49:44", "remaining_time": "1:20:40"}
|
||||
{"current_steps": 2100, "total_steps": 3635, "loss": 0.1923, "lr": 1.809163616102981e-05, "epoch": 2.8893988067921064, "percentage": 57.77, "elapsed_time": "1:51:06", "remaining_time": "1:21:12"}
|
||||
{"current_steps": 2105, "total_steps": 3635, "loss": 0.1927, "lr": 1.799605295565357e-05, "epoch": 2.896282698485544, "percentage": 57.91, "elapsed_time": "1:52:38", "remaining_time": "1:21:52"}
|
||||
{"current_steps": 2110, "total_steps": 3635, "loss": 0.1909, "lr": 1.7900515963267708e-05, "epoch": 2.903166590178981, "percentage": 58.05, "elapsed_time": "1:53:59", "remaining_time": "1:22:23"}
|
||||
{"current_steps": 2115, "total_steps": 3635, "loss": 0.1885, "lr": 1.7805027387049266e-05, "epoch": 2.9100504818724184, "percentage": 58.18, "elapsed_time": "1:55:18", "remaining_time": "1:22:52"}
|
||||
{"current_steps": 2120, "total_steps": 3635, "loss": 0.1895, "lr": 1.7709589429058756e-05, "epoch": 2.916934373565856, "percentage": 58.32, "elapsed_time": "1:56:45", "remaining_time": "1:23:26"}
|
||||
{"current_steps": 2125, "total_steps": 3635, "loss": 0.1902, "lr": 1.761420429018937e-05, "epoch": 2.9238182652592934, "percentage": 58.46, "elapsed_time": "1:58:07", "remaining_time": "1:23:56"}
|
||||
{"current_steps": 2130, "total_steps": 3635, "loss": 0.1878, "lr": 1.7518874170116262e-05, "epoch": 2.9307021569527305, "percentage": 58.6, "elapsed_time": "1:59:31", "remaining_time": "1:24:26"}
|
||||
{"current_steps": 2135, "total_steps": 3635, "loss": 0.1925, "lr": 1.742360126724579e-05, "epoch": 2.937586048646168, "percentage": 58.73, "elapsed_time": "2:00:50", "remaining_time": "1:24:53"}
|
||||
{"current_steps": 2140, "total_steps": 3635, "loss": 0.273, "lr": 1.732838777866483e-05, "epoch": 2.944469940339605, "percentage": 58.87, "elapsed_time": "2:02:56", "remaining_time": "1:25:53"}
|
||||
{"current_steps": 2145, "total_steps": 3635, "loss": 0.2511, "lr": 1.7233235900090094e-05, "epoch": 2.9513538320330426, "percentage": 59.01, "elapsed_time": "2:05:05", "remaining_time": "1:26:53"}
|
||||
{"current_steps": 2150, "total_steps": 3635, "loss": 0.2402, "lr": 1.7138147825817516e-05, "epoch": 2.95823772372648, "percentage": 59.15, "elapsed_time": "2:07:17", "remaining_time": "1:27:55"}
|
||||
{"current_steps": 2155, "total_steps": 3635, "loss": 0.2449, "lr": 1.7043125748671637e-05, "epoch": 2.9651216154199176, "percentage": 59.28, "elapsed_time": "2:09:24", "remaining_time": "1:28:52"}
|
||||
{"current_steps": 2160, "total_steps": 3635, "loss": 0.234, "lr": 1.6948171859955054e-05, "epoch": 2.9720055071133546, "percentage": 59.42, "elapsed_time": "2:11:29", "remaining_time": "1:29:47"}
|
||||
{"current_steps": 2165, "total_steps": 3635, "loss": 0.2363, "lr": 1.6853288349397842e-05, "epoch": 2.978889398806792, "percentage": 59.56, "elapsed_time": "2:13:37", "remaining_time": "1:30:43"}
|
||||
{"current_steps": 2170, "total_steps": 3635, "loss": 0.2275, "lr": 1.675847740510712e-05, "epoch": 2.985773290500229, "percentage": 59.7, "elapsed_time": "2:15:42", "remaining_time": "1:31:37"}
|
||||
{"current_steps": 2175, "total_steps": 3635, "loss": 0.2142, "lr": 1.666374121351652e-05, "epoch": 2.9926571821936667, "percentage": 59.83, "elapsed_time": "2:17:48", "remaining_time": "1:32:30"}
|
||||
{"current_steps": 2180, "total_steps": 3635, "loss": 0.2189, "lr": 1.6569081959335843e-05, "epoch": 2.9995410738871042, "percentage": 59.97, "elapsed_time": "2:19:54", "remaining_time": "1:33:22"}
|
||||
{"current_steps": 2185, "total_steps": 3635, "loss": 0.5678, "lr": 1.6474501825500617e-05, "epoch": 3.0055071133547497, "percentage": 60.11, "elapsed_time": "2:21:17", "remaining_time": "1:33:45"}
|
||||
{"current_steps": 2190, "total_steps": 3635, "loss": 0.4782, "lr": 1.6380002993121768e-05, "epoch": 3.0123910050481872, "percentage": 60.25, "elapsed_time": "2:22:47", "remaining_time": "1:34:12"}
|
||||
{"current_steps": 2195, "total_steps": 3635, "loss": 0.4252, "lr": 1.6285587641435347e-05, "epoch": 3.0192748967416247, "percentage": 60.39, "elapsed_time": "2:24:18", "remaining_time": "1:34:40"}
|
||||
{"current_steps": 2200, "total_steps": 3635, "loss": 0.3891, "lr": 1.6191257947752254e-05, "epoch": 3.026158788435062, "percentage": 60.52, "elapsed_time": "2:25:41", "remaining_time": "1:35:01"}
|
||||
{"current_steps": 2205, "total_steps": 3635, "loss": 0.3665, "lr": 1.609701608740803e-05, "epoch": 3.0330426801284993, "percentage": 60.66, "elapsed_time": "2:27:07", "remaining_time": "1:35:25"}
|
||||
{"current_steps": 2210, "total_steps": 3635, "loss": 0.3682, "lr": 1.6002864233712684e-05, "epoch": 3.039926571821937, "percentage": 60.8, "elapsed_time": "2:28:33", "remaining_time": "1:35:47"}
|
||||
{"current_steps": 2215, "total_steps": 3635, "loss": 0.3565, "lr": 1.5908804557900595e-05, "epoch": 3.046810463515374, "percentage": 60.94, "elapsed_time": "2:30:00", "remaining_time": "1:36:09"}
|
||||
{"current_steps": 2220, "total_steps": 3635, "loss": 0.3451, "lr": 1.581483922908043e-05, "epoch": 3.0536943552088114, "percentage": 61.07, "elapsed_time": "2:31:30", "remaining_time": "1:36:34"}
|
||||
{"current_steps": 2225, "total_steps": 3635, "loss": 0.3502, "lr": 1.572097041418512e-05, "epoch": 3.060578246902249, "percentage": 61.21, "elapsed_time": "2:32:58", "remaining_time": "1:36:56"}
|
||||
{"current_steps": 2230, "total_steps": 3635, "loss": 0.3478, "lr": 1.562720027792188e-05, "epoch": 3.067462138595686, "percentage": 61.35, "elapsed_time": "2:34:23", "remaining_time": "1:37:16"}
|
||||
{"current_steps": 2235, "total_steps": 3635, "loss": 0.3357, "lr": 1.55335309827223e-05, "epoch": 3.0743460302891235, "percentage": 61.49, "elapsed_time": "2:36:03", "remaining_time": "1:37:45"}
|
||||
{"current_steps": 2240, "total_steps": 3635, "loss": 0.3134, "lr": 1.5439964688692497e-05, "epoch": 3.081229921982561, "percentage": 61.62, "elapsed_time": "2:37:41", "remaining_time": "1:38:12"}
|
||||
{"current_steps": 2245, "total_steps": 3635, "loss": 0.3151, "lr": 1.534650355356325e-05, "epoch": 3.088113813675998, "percentage": 61.76, "elapsed_time": "2:39:22", "remaining_time": "1:38:40"}
|
||||
{"current_steps": 2250, "total_steps": 3635, "loss": 0.3037, "lr": 1.5253149732640305e-05, "epoch": 3.0949977053694355, "percentage": 61.9, "elapsed_time": "2:41:01", "remaining_time": "1:39:07"}
|
||||
{"current_steps": 2255, "total_steps": 3635, "loss": 0.3112, "lr": 1.515990537875459e-05, "epoch": 3.101881597062873, "percentage": 62.04, "elapsed_time": "2:42:37", "remaining_time": "1:39:31"}
|
||||
{"current_steps": 2260, "total_steps": 3635, "loss": 0.3041, "lr": 1.5066772642212657e-05, "epoch": 3.10876548875631, "percentage": 62.17, "elapsed_time": "2:44:17", "remaining_time": "1:39:57"}
|
||||
{"current_steps": 2265, "total_steps": 3635, "loss": 0.3109, "lr": 1.4973753670747033e-05, "epoch": 3.1156493804497476, "percentage": 62.31, "elapsed_time": "2:46:01", "remaining_time": "1:40:25"}
|
||||
{"current_steps": 2270, "total_steps": 3635, "loss": 0.3082, "lr": 1.4880850609466726e-05, "epoch": 3.122533272143185, "percentage": 62.45, "elapsed_time": "2:47:40", "remaining_time": "1:40:49"}
|
||||
{"current_steps": 2275, "total_steps": 3635, "loss": 0.3054, "lr": 1.478806560080771e-05, "epoch": 3.129417163836622, "percentage": 62.59, "elapsed_time": "2:49:26", "remaining_time": "1:41:17"}
|
||||
{"current_steps": 2280, "total_steps": 3635, "loss": 0.3074, "lr": 1.469540078448358e-05, "epoch": 3.1363010555300597, "percentage": 62.72, "elapsed_time": "2:51:20", "remaining_time": "1:41:49"}
|
||||
{"current_steps": 2285, "total_steps": 3635, "loss": 0.3058, "lr": 1.4602858297436164e-05, "epoch": 3.143184947223497, "percentage": 62.86, "elapsed_time": "2:53:55", "remaining_time": "1:42:45"}
|
||||
{"current_steps": 2290, "total_steps": 3635, "loss": 0.3035, "lr": 1.451044027378627e-05, "epoch": 3.1500688389169342, "percentage": 63.0, "elapsed_time": "2:56:10", "remaining_time": "1:43:28"}
|
||||
{"current_steps": 2295, "total_steps": 3635, "loss": 0.3045, "lr": 1.441814884478443e-05, "epoch": 3.1569527306103717, "percentage": 63.14, "elapsed_time": "2:58:28", "remaining_time": "1:44:12"}
|
||||
{"current_steps": 2300, "total_steps": 3635, "loss": 0.3061, "lr": 1.4325986138761803e-05, "epoch": 3.1638366223038092, "percentage": 63.27, "elapsed_time": "3:00:38", "remaining_time": "1:44:50"}
|
||||
{"current_steps": 2305, "total_steps": 3635, "loss": 0.3138, "lr": 1.4233954281081057e-05, "epoch": 3.1707205139972463, "percentage": 63.41, "elapsed_time": "3:02:48", "remaining_time": "1:45:28"}
|
||||
{"current_steps": 2310, "total_steps": 3635, "loss": 0.3072, "lr": 1.4142055394087386e-05, "epoch": 3.177604405690684, "percentage": 63.55, "elapsed_time": "3:05:01", "remaining_time": "1:46:07"}
|
||||
{"current_steps": 2315, "total_steps": 3635, "loss": 0.2963, "lr": 1.4050291597059529e-05, "epoch": 3.1844882973841213, "percentage": 63.69, "elapsed_time": "3:07:17", "remaining_time": "1:46:47"}
|
||||
{"current_steps": 2320, "total_steps": 3635, "loss": 0.2998, "lr": 1.3958665006160921e-05, "epoch": 3.1913721890775584, "percentage": 63.82, "elapsed_time": "3:09:32", "remaining_time": "1:47:26"}
|
||||
{"current_steps": 2325, "total_steps": 3635, "loss": 0.3044, "lr": 1.3867177734390895e-05, "epoch": 3.198256080770996, "percentage": 63.96, "elapsed_time": "3:11:42", "remaining_time": "1:48:01"}
|
||||
{"current_steps": 2330, "total_steps": 3635, "loss": 0.3025, "lr": 1.3775831891535947e-05, "epoch": 3.2051399724644334, "percentage": 64.1, "elapsed_time": "3:13:58", "remaining_time": "1:48:38"}
|
||||
{"current_steps": 2335, "total_steps": 3635, "loss": 0.299, "lr": 1.368462958412109e-05, "epoch": 3.2120238641578704, "percentage": 64.24, "elapsed_time": "3:16:13", "remaining_time": "1:49:14"}
|
||||
{"current_steps": 2340, "total_steps": 3635, "loss": 0.3049, "lr": 1.3593572915361243e-05, "epoch": 3.218907755851308, "percentage": 64.37, "elapsed_time": "3:18:25", "remaining_time": "1:49:48"}
|
||||
{"current_steps": 2345, "total_steps": 3635, "loss": 0.2999, "lr": 1.3502663985112788e-05, "epoch": 3.2257916475447455, "percentage": 64.51, "elapsed_time": "3:20:36", "remaining_time": "1:50:21"}
|
||||
{"current_steps": 2350, "total_steps": 3635, "loss": 0.3003, "lr": 1.3411904889825084e-05, "epoch": 3.2326755392381825, "percentage": 64.65, "elapsed_time": "3:22:51", "remaining_time": "1:50:55"}
|
||||
{"current_steps": 2355, "total_steps": 3635, "loss": 0.2958, "lr": 1.3321297722492167e-05, "epoch": 3.23955943093162, "percentage": 64.79, "elapsed_time": "3:24:55", "remaining_time": "1:51:23"}
|
||||
{"current_steps": 2360, "total_steps": 3635, "loss": 0.2972, "lr": 1.323084457260445e-05, "epoch": 3.2464433226250575, "percentage": 64.92, "elapsed_time": "3:27:13", "remaining_time": "1:51:57"}
|
||||
{"current_steps": 2365, "total_steps": 3635, "loss": 0.293, "lr": 1.3140547526100562e-05, "epoch": 3.2533272143184946, "percentage": 65.06, "elapsed_time": "3:29:23", "remaining_time": "1:52:26"}
|
||||
{"current_steps": 2370, "total_steps": 3635, "loss": 0.2972, "lr": 1.3050408665319237e-05, "epoch": 3.260211106011932, "percentage": 65.2, "elapsed_time": "3:31:31", "remaining_time": "1:52:54"}
|
||||
{"current_steps": 2375, "total_steps": 3635, "loss": 0.3037, "lr": 1.2960430068951288e-05, "epoch": 3.2670949977053696, "percentage": 65.34, "elapsed_time": "3:33:27", "remaining_time": "1:53:14"}
|
||||
{"current_steps": 2380, "total_steps": 3635, "loss": 0.311, "lr": 1.2870613811991657e-05, "epoch": 3.2739788893988067, "percentage": 65.47, "elapsed_time": "3:34:30", "remaining_time": "1:53:06"}
|
||||
{"current_steps": 2385, "total_steps": 3635, "loss": 0.4073, "lr": 1.2780961965691603e-05, "epoch": 3.280862781092244, "percentage": 65.61, "elapsed_time": "3:35:45", "remaining_time": "1:53:04"}
|
||||
{"current_steps": 2390, "total_steps": 3635, "loss": 0.6873, "lr": 1.2691476597510898e-05, "epoch": 3.2877466727856817, "percentage": 65.75, "elapsed_time": "3:37:37", "remaining_time": "1:53:21"}
|
||||
{"current_steps": 2395, "total_steps": 3635, "loss": 0.6139, "lr": 1.2602159771070178e-05, "epoch": 3.2946305644791187, "percentage": 65.89, "elapsed_time": "3:39:18", "remaining_time": "1:53:32"}
|
||||
{"current_steps": 2400, "total_steps": 3635, "loss": 0.5911, "lr": 1.2513013546103335e-05, "epoch": 3.3015144561725562, "percentage": 66.02, "elapsed_time": "3:40:59", "remaining_time": "1:53:43"}
|
||||
{"current_steps": 2405, "total_steps": 3635, "loss": 0.5589, "lr": 1.2424039978410005e-05, "epoch": 3.3083983478659937, "percentage": 66.16, "elapsed_time": "3:42:58", "remaining_time": "1:54:02"}
|
||||
{"current_steps": 2410, "total_steps": 3635, "loss": 0.5219, "lr": 1.2335241119808203e-05, "epoch": 3.315282239559431, "percentage": 66.3, "elapsed_time": "3:44:46", "remaining_time": "1:54:15"}
|
||||
{"current_steps": 2415, "total_steps": 3635, "loss": 0.511, "lr": 1.2246619018086973e-05, "epoch": 3.3221661312528683, "percentage": 66.44, "elapsed_time": "3:46:29", "remaining_time": "1:54:25"}
|
||||
{"current_steps": 2420, "total_steps": 3635, "loss": 0.5239, "lr": 1.2158175716959178e-05, "epoch": 3.329050022946306, "percentage": 66.57, "elapsed_time": "3:48:12", "remaining_time": "1:54:34"}
|
||||
{"current_steps": 2425, "total_steps": 3635, "loss": 0.5106, "lr": 1.2069913256014336e-05, "epoch": 3.335933914639743, "percentage": 66.71, "elapsed_time": "3:49:43", "remaining_time": "1:54:37"}
|
||||
{"current_steps": 2430, "total_steps": 3635, "loss": 0.5095, "lr": 1.1981833670671641e-05, "epoch": 3.3428178063331804, "percentage": 66.85, "elapsed_time": "3:51:35", "remaining_time": "1:54:50"}
|
||||
{"current_steps": 2435, "total_steps": 3635, "loss": 0.4842, "lr": 1.1893938992132983e-05, "epoch": 3.349701698026618, "percentage": 66.99, "elapsed_time": "3:53:13", "remaining_time": "1:54:56"}
|
||||
{"current_steps": 2440, "total_steps": 3635, "loss": 0.4978, "lr": 1.1806231247336135e-05, "epoch": 3.356585589720055, "percentage": 67.13, "elapsed_time": "3:54:58", "remaining_time": "1:55:04"}
|
||||
{"current_steps": 2445, "total_steps": 3635, "loss": 0.4965, "lr": 1.1718712458907961e-05, "epoch": 3.3634694814134924, "percentage": 67.26, "elapsed_time": "3:56:47", "remaining_time": "1:55:15"}
|
||||
{"current_steps": 2450, "total_steps": 3635, "loss": 0.496, "lr": 1.1631384645117831e-05, "epoch": 3.37035337310693, "percentage": 67.4, "elapsed_time": "3:58:29", "remaining_time": "1:55:21"}
|
||||
{"current_steps": 2455, "total_steps": 3635, "loss": 0.4972, "lr": 1.154424981983106e-05, "epoch": 3.377237264800367, "percentage": 67.54, "elapsed_time": "4:00:17", "remaining_time": "1:55:29"}
|
||||
{"current_steps": 2460, "total_steps": 3635, "loss": 0.5127, "lr": 1.1457309992462434e-05, "epoch": 3.3841211564938045, "percentage": 67.68, "elapsed_time": "4:01:57", "remaining_time": "1:55:34"}
|
||||
{"current_steps": 2465, "total_steps": 3635, "loss": 0.5128, "lr": 1.1370567167929907e-05, "epoch": 3.391005048187242, "percentage": 67.81, "elapsed_time": "4:03:49", "remaining_time": "1:55:43"}
|
||||
{"current_steps": 2470, "total_steps": 3635, "loss": 0.499, "lr": 1.1284023346608345e-05, "epoch": 3.397888939880679, "percentage": 67.95, "elapsed_time": "4:05:34", "remaining_time": "1:55:49"}
|
||||
{"current_steps": 2475, "total_steps": 3635, "loss": 0.5082, "lr": 1.1197680524283428e-05, "epoch": 3.4047728315741166, "percentage": 68.09, "elapsed_time": "4:07:15", "remaining_time": "1:55:53"}
|
||||
{"current_steps": 2480, "total_steps": 3635, "loss": 0.5111, "lr": 1.111154069210557e-05, "epoch": 3.411656723267554, "percentage": 68.23, "elapsed_time": "4:09:01", "remaining_time": "1:55:58"}
|
||||
{"current_steps": 2485, "total_steps": 3635, "loss": 0.3907, "lr": 1.1025605836544067e-05, "epoch": 3.418540614960991, "percentage": 68.36, "elapsed_time": "4:10:32", "remaining_time": "1:55:56"}
|
||||
{"current_steps": 2490, "total_steps": 3635, "loss": 0.1557, "lr": 1.0939877939341206e-05, "epoch": 3.4254245066544287, "percentage": 68.5, "elapsed_time": "4:11:42", "remaining_time": "1:55:44"}
|
||||
{"current_steps": 2495, "total_steps": 3635, "loss": 0.1416, "lr": 1.0854358977466664e-05, "epoch": 3.432308398347866, "percentage": 68.64, "elapsed_time": "4:12:51", "remaining_time": "1:55:32"}
|
||||
{"current_steps": 2500, "total_steps": 3635, "loss": 0.1404, "lr": 1.0769050923071813e-05, "epoch": 3.4391922900413032, "percentage": 68.78, "elapsed_time": "4:14:02", "remaining_time": "1:55:20"}
|
||||
{"current_steps": 2505, "total_steps": 3635, "loss": 0.1308, "lr": 1.0683955743444348e-05, "epoch": 3.4460761817347407, "percentage": 68.91, "elapsed_time": "4:15:14", "remaining_time": "1:55:08"}
|
||||
{"current_steps": 2510, "total_steps": 3635, "loss": 0.1343, "lr": 1.0599075400962793e-05, "epoch": 3.4529600734281782, "percentage": 69.05, "elapsed_time": "4:16:24", "remaining_time": "1:54:55"}
|
||||
{"current_steps": 2515, "total_steps": 3635, "loss": 0.1294, "lr": 1.0514411853051381e-05, "epoch": 3.4598439651216153, "percentage": 69.19, "elapsed_time": "4:17:35", "remaining_time": "1:54:42"}
|
||||
{"current_steps": 2520, "total_steps": 3635, "loss": 0.1345, "lr": 1.0429967052134801e-05, "epoch": 3.466727856815053, "percentage": 69.33, "elapsed_time": "4:18:55", "remaining_time": "1:54:33"}
|
||||
{"current_steps": 2525, "total_steps": 3635, "loss": 0.133, "lr": 1.0345742945593269e-05, "epoch": 3.4736117485084903, "percentage": 69.46, "elapsed_time": "4:20:14", "remaining_time": "1:54:24"}
|
||||
{"current_steps": 2530, "total_steps": 3635, "loss": 0.1274, "lr": 1.026174147571751e-05, "epoch": 3.4804956402019274, "percentage": 69.6, "elapsed_time": "4:21:27", "remaining_time": "1:54:11"}
|
||||
{"current_steps": 2535, "total_steps": 3635, "loss": 0.125, "lr": 1.0177964579664085e-05, "epoch": 3.487379531895365, "percentage": 69.74, "elapsed_time": "4:22:50", "remaining_time": "1:54:03"}
|
||||
{"current_steps": 2540, "total_steps": 3635, "loss": 0.1306, "lr": 1.0094414189410625e-05, "epoch": 3.4942634235888024, "percentage": 69.88, "elapsed_time": "4:24:14", "remaining_time": "1:53:54"}
|
||||
{"current_steps": 2545, "total_steps": 3635, "loss": 0.1257, "lr": 1.0011092231711346e-05, "epoch": 3.5011473152822394, "percentage": 70.01, "elapsed_time": "4:25:27", "remaining_time": "1:53:41"}
|
||||
{"current_steps": 2550, "total_steps": 3635, "loss": 0.1248, "lr": 9.928000628052552e-06, "epoch": 3.508031206975677, "percentage": 70.15, "elapsed_time": "4:26:45", "remaining_time": "1:53:30"}
|
||||
{"current_steps": 2555, "total_steps": 3635, "loss": 0.1254, "lr": 9.845141294608378e-06, "epoch": 3.514915098669114, "percentage": 70.29, "elapsed_time": "4:28:08", "remaining_time": "1:53:20"}
|
||||
{"current_steps": 2560, "total_steps": 3635, "loss": 0.1251, "lr": 9.762516142196563e-06, "epoch": 3.5217989903625515, "percentage": 70.43, "elapsed_time": "4:29:20", "remaining_time": "1:53:06"}
|
||||
{"current_steps": 2565, "total_steps": 3635, "loss": 0.1245, "lr": 9.680127076234425e-06, "epoch": 3.528682882055989, "percentage": 70.56, "elapsed_time": "4:30:29", "remaining_time": "1:52:50"}
|
||||
{"current_steps": 2570, "total_steps": 3635, "loss": 0.1276, "lr": 9.597975996694872e-06, "epoch": 3.5355667737494265, "percentage": 70.7, "elapsed_time": "4:31:44", "remaining_time": "1:52:36"}
|
||||
{"current_steps": 2575, "total_steps": 3635, "loss": 0.1275, "lr": 9.516064798062625e-06, "epoch": 3.5424506654428636, "percentage": 70.84, "elapsed_time": "4:32:53", "remaining_time": "1:52:20"}
|
||||
{"current_steps": 2580, "total_steps": 3635, "loss": 0.124, "lr": 9.434395369290499e-06, "epoch": 3.549334557136301, "percentage": 70.98, "elapsed_time": "4:34:04", "remaining_time": "1:52:04"}
|
||||
{"current_steps": 2585, "total_steps": 3635, "loss": 0.1229, "lr": 9.35296959375589e-06, "epoch": 3.556218448829738, "percentage": 71.11, "elapsed_time": "4:35:13", "remaining_time": "1:51:47"}
|
||||
{"current_steps": 2590, "total_steps": 3635, "loss": 0.3343, "lr": 9.271789349217278e-06, "epoch": 3.5631023405231756, "percentage": 71.25, "elapsed_time": "4:36:44", "remaining_time": "1:51:39"}
|
||||
{"current_steps": 2595, "total_steps": 3635, "loss": 0.4481, "lr": 9.190856507770965e-06, "epoch": 3.569986232216613, "percentage": 71.39, "elapsed_time": "4:38:31", "remaining_time": "1:51:37"}
|
||||
{"current_steps": 2600, "total_steps": 3635, "loss": 0.4122, "lr": 9.11017293580791e-06, "epoch": 3.5768701239100507, "percentage": 71.53, "elapsed_time": "4:40:14", "remaining_time": "1:51:33"}
|
||||
{"current_steps": 2605, "total_steps": 3635, "loss": 0.4207, "lr": 9.02974049397066e-06, "epoch": 3.5837540156034877, "percentage": 71.66, "elapsed_time": "4:41:59", "remaining_time": "1:51:29"}
|
||||
{"current_steps": 2610, "total_steps": 3635, "loss": 0.4125, "lr": 8.949561037110441e-06, "epoch": 3.5906379072969252, "percentage": 71.8, "elapsed_time": "4:43:45", "remaining_time": "1:51:26"}
|
||||
{"current_steps": 2615, "total_steps": 3635, "loss": 0.4018, "lr": 8.869636414244432e-06, "epoch": 3.5975217989903623, "percentage": 71.94, "elapsed_time": "4:45:26", "remaining_time": "1:51:20"}
|
||||
{"current_steps": 2620, "total_steps": 3635, "loss": 0.4022, "lr": 8.789968468513057e-06, "epoch": 3.6044056906838, "percentage": 72.08, "elapsed_time": "4:47:09", "remaining_time": "1:51:14"}
|
||||
{"current_steps": 2625, "total_steps": 3635, "loss": 0.3805, "lr": 8.710559037137516e-06, "epoch": 3.6112895823772373, "percentage": 72.21, "elapsed_time": "4:48:59", "remaining_time": "1:51:11"}
|
||||
{"current_steps": 2630, "total_steps": 3635, "loss": 0.4201, "lr": 8.631409951377442e-06, "epoch": 3.618173474070675, "percentage": 72.35, "elapsed_time": "4:50:39", "remaining_time": "1:51:04"}
|
||||
{"current_steps": 2635, "total_steps": 3635, "loss": 0.3955, "lr": 8.552523036488605e-06, "epoch": 3.625057365764112, "percentage": 72.49, "elapsed_time": "4:52:25", "remaining_time": "1:50:58"}
|
||||
{"current_steps": 2640, "total_steps": 3635, "loss": 0.4035, "lr": 8.47390011168088e-06, "epoch": 3.6319412574575494, "percentage": 72.63, "elapsed_time": "4:54:10", "remaining_time": "1:50:52"}
|
||||
{"current_steps": 2645, "total_steps": 3635, "loss": 0.3975, "lr": 8.395542990076256e-06, "epoch": 3.6388251491509864, "percentage": 72.76, "elapsed_time": "4:56:02", "remaining_time": "1:50:48"}
|
||||
{"current_steps": 2650, "total_steps": 3635, "loss": 0.3917, "lr": 8.317453478667061e-06, "epoch": 3.645709040844424, "percentage": 72.9, "elapsed_time": "4:57:46", "remaining_time": "1:50:40"}
|
||||
{"current_steps": 2655, "total_steps": 3635, "loss": 0.4106, "lr": 8.239633378274249e-06, "epoch": 3.6525929325378614, "percentage": 73.04, "elapsed_time": "4:59:28", "remaining_time": "1:50:32"}
|
||||
{"current_steps": 2660, "total_steps": 3635, "loss": 0.4167, "lr": 8.162084483505892e-06, "epoch": 3.659476824231299, "percentage": 73.18, "elapsed_time": "5:01:18", "remaining_time": "1:50:26"}
|
||||
{"current_steps": 2665, "total_steps": 3635, "loss": 0.3892, "lr": 8.084808582715795e-06, "epoch": 3.666360715924736, "percentage": 73.31, "elapsed_time": "5:03:08", "remaining_time": "1:50:20"}
|
||||
{"current_steps": 2670, "total_steps": 3635, "loss": 0.414, "lr": 8.007807457962272e-06, "epoch": 3.6732446076181735, "percentage": 73.45, "elapsed_time": "5:04:42", "remaining_time": "1:50:07"}
|
||||
{"current_steps": 2675, "total_steps": 3635, "loss": 0.402, "lr": 7.931082884966996e-06, "epoch": 3.6801284993116106, "percentage": 73.59, "elapsed_time": "5:06:22", "remaining_time": "1:49:56"}
|
||||
{"current_steps": 2680, "total_steps": 3635, "loss": 0.3874, "lr": 7.854636633074128e-06, "epoch": 3.687012391005048, "percentage": 73.73, "elapsed_time": "5:08:06", "remaining_time": "1:49:47"}
|
||||
{"current_steps": 2685, "total_steps": 3635, "loss": 0.408, "lr": 7.778470465209417e-06, "epoch": 3.6938962826984856, "percentage": 73.87, "elapsed_time": "5:09:55", "remaining_time": "1:49:39"}
|
||||
{"current_steps": 2690, "total_steps": 3635, "loss": 0.4076, "lr": 7.702586137839653e-06, "epoch": 3.700780174391923, "percentage": 74.0, "elapsed_time": "5:11:39", "remaining_time": "1:49:29"}
|
||||
{"current_steps": 2695, "total_steps": 3635, "loss": 0.3428, "lr": 7.626985400932068e-06, "epoch": 3.70766406608536, "percentage": 74.14, "elapsed_time": "5:13:42", "remaining_time": "1:49:25"}
|
||||
{"current_steps": 2700, "total_steps": 3635, "loss": 0.2464, "lr": 7.551669997914048e-06, "epoch": 3.7145479577787976, "percentage": 74.28, "elapsed_time": "5:16:09", "remaining_time": "1:49:29"}
|
||||
{"current_steps": 2705, "total_steps": 3635, "loss": 0.2478, "lr": 7.47664166563286e-06, "epoch": 3.7214318494722347, "percentage": 74.42, "elapsed_time": "5:18:43", "remaining_time": "1:49:34"}
|
||||
{"current_steps": 2710, "total_steps": 3635, "loss": 0.248, "lr": 7.401902134315684e-06, "epoch": 3.728315741165672, "percentage": 74.55, "elapsed_time": "5:21:08", "remaining_time": "1:49:36"}
|
||||
{"current_steps": 2715, "total_steps": 3635, "loss": 0.2513, "lr": 7.3274531275296265e-06, "epoch": 3.7351996328591097, "percentage": 74.69, "elapsed_time": "5:23:34", "remaining_time": "1:49:38"}
|
||||
{"current_steps": 2720, "total_steps": 3635, "loss": 0.2539, "lr": 7.253296362142048e-06, "epoch": 3.7420835245525472, "percentage": 74.83, "elapsed_time": "5:25:53", "remaining_time": "1:49:37"}
|
||||
{"current_steps": 2725, "total_steps": 3635, "loss": 0.2372, "lr": 7.179433548280892e-06, "epoch": 3.7489674162459843, "percentage": 74.97, "elapsed_time": "5:28:20", "remaining_time": "1:49:38"}
|
||||
{"current_steps": 2730, "total_steps": 3635, "loss": 0.2462, "lr": 7.105866389295335e-06, "epoch": 3.755851307939422, "percentage": 75.1, "elapsed_time": "5:30:43", "remaining_time": "1:49:38"}
|
||||
{"current_steps": 2735, "total_steps": 3635, "loss": 0.2372, "lr": 7.032596581716422e-06, "epoch": 3.762735199632859, "percentage": 75.24, "elapsed_time": "5:33:09", "remaining_time": "1:49:37"}
|
||||
{"current_steps": 2740, "total_steps": 3635, "loss": 0.2441, "lr": 6.959625815218018e-06, "epoch": 3.7696190913262964, "percentage": 75.38, "elapsed_time": "5:35:26", "remaining_time": "1:49:34"}
|
||||
{"current_steps": 2745, "total_steps": 3635, "loss": 0.242, "lr": 6.8869557725777836e-06, "epoch": 3.776502983019734, "percentage": 75.52, "elapsed_time": "5:37:51", "remaining_time": "1:49:32"}
|
||||
{"current_steps": 2750, "total_steps": 3635, "loss": 0.2496, "lr": 6.814588129638393e-06, "epoch": 3.7833868747131714, "percentage": 75.65, "elapsed_time": "5:40:18", "remaining_time": "1:49:30"}
|
||||
{"current_steps": 2755, "total_steps": 3635, "loss": 0.26, "lr": 6.742524555268886e-06, "epoch": 3.7902707664066084, "percentage": 75.79, "elapsed_time": "5:42:35", "remaining_time": "1:49:25"}
|
||||
{"current_steps": 2760, "total_steps": 3635, "loss": 0.226, "lr": 6.6707667113261956e-06, "epoch": 3.797154658100046, "percentage": 75.93, "elapsed_time": "5:45:01", "remaining_time": "1:49:23"}
|
||||
{"current_steps": 2765, "total_steps": 3635, "loss": 0.2431, "lr": 6.599316252616796e-06, "epoch": 3.804038549793483, "percentage": 76.07, "elapsed_time": "5:47:27", "remaining_time": "1:49:19"}
|
||||
{"current_steps": 2770, "total_steps": 3635, "loss": 0.2363, "lr": 6.528174826858553e-06, "epoch": 3.8109224414869205, "percentage": 76.2, "elapsed_time": "5:49:52", "remaining_time": "1:49:15"}
|
||||
{"current_steps": 2775, "total_steps": 3635, "loss": 0.2199, "lr": 6.457344074642751e-06, "epoch": 3.817806333180358, "percentage": 76.34, "elapsed_time": "5:52:18", "remaining_time": "1:49:10"}
|
||||
{"current_steps": 2780, "total_steps": 3635, "loss": 0.2384, "lr": 6.386825629396209e-06, "epoch": 3.8246902248737955, "percentage": 76.48, "elapsed_time": "5:54:31", "remaining_time": "1:49:01"}
|
||||
{"current_steps": 2785, "total_steps": 3635, "loss": 0.2349, "lr": 6.316621117343657e-06, "epoch": 3.8315741165672326, "percentage": 76.62, "elapsed_time": "5:56:56", "remaining_time": "1:48:56"}
|
||||
{"current_steps": 2790, "total_steps": 3635, "loss": 0.2297, "lr": 6.246732157470205e-06, "epoch": 3.83845800826067, "percentage": 76.75, "elapsed_time": "5:58:54", "remaining_time": "1:48:42"}
|
||||
{"current_steps": 2795, "total_steps": 3635, "loss": 0.1874, "lr": 6.177160361484038e-06, "epoch": 3.845341899954107, "percentage": 76.89, "elapsed_time": "6:00:12", "remaining_time": "1:48:15"}
|
||||
{"current_steps": 2800, "total_steps": 3635, "loss": 0.1852, "lr": 6.107907333779204e-06, "epoch": 3.8522257916475446, "percentage": 77.03, "elapsed_time": "6:01:31", "remaining_time": "1:47:48"}
|
||||
{"current_steps": 2805, "total_steps": 3635, "loss": 0.1787, "lr": 6.038974671398678e-06, "epoch": 3.859109683340982, "percentage": 77.17, "elapsed_time": "6:02:50", "remaining_time": "1:47:21"}
|
||||
{"current_steps": 2810, "total_steps": 3635, "loss": 0.1787, "lr": 5.970363963997443e-06, "epoch": 3.8659935750344196, "percentage": 77.3, "elapsed_time": "6:04:10", "remaining_time": "1:46:55"}
|
||||
{"current_steps": 2815, "total_steps": 3635, "loss": 0.1752, "lr": 5.902076793805933e-06, "epoch": 3.8728774667278567, "percentage": 77.44, "elapsed_time": "6:05:28", "remaining_time": "1:46:27"}
|
||||
{"current_steps": 2820, "total_steps": 3635, "loss": 0.1743, "lr": 5.834114735593464e-06, "epoch": 3.879761358421294, "percentage": 77.58, "elapsed_time": "6:06:45", "remaining_time": "1:45:59"}
|
||||
{"current_steps": 2825, "total_steps": 3635, "loss": 0.1739, "lr": 5.766479356631971e-06, "epoch": 3.8866452501147313, "percentage": 77.72, "elapsed_time": "6:08:02", "remaining_time": "1:45:31"}
|
||||
{"current_steps": 2830, "total_steps": 3635, "loss": 0.1703, "lr": 5.69917221665983e-06, "epoch": 3.893529141808169, "percentage": 77.85, "elapsed_time": "6:09:23", "remaining_time": "1:45:04"}
|
||||
{"current_steps": 2835, "total_steps": 3635, "loss": 0.172, "lr": 5.632194867845906e-06, "epoch": 3.9004130335016063, "percentage": 77.99, "elapsed_time": "6:10:45", "remaining_time": "1:44:37"}
|
||||
{"current_steps": 2840, "total_steps": 3635, "loss": 0.169, "lr": 5.5655488547537525e-06, "epoch": 3.907296925195044, "percentage": 78.13, "elapsed_time": "6:12:03", "remaining_time": "1:44:09"}
|
||||
{"current_steps": 2845, "total_steps": 3635, "loss": 0.1677, "lr": 5.499235714306013e-06, "epoch": 3.914180816888481, "percentage": 78.27, "elapsed_time": "6:13:23", "remaining_time": "1:43:41"}
|
||||
{"current_steps": 2850, "total_steps": 3635, "loss": 0.1708, "lr": 5.433256975748935e-06, "epoch": 3.9210647085819184, "percentage": 78.4, "elapsed_time": "6:14:47", "remaining_time": "1:43:13"}
|
||||
{"current_steps": 2855, "total_steps": 3635, "loss": 0.1686, "lr": 5.367614160617147e-06, "epoch": 3.9279486002753554, "percentage": 78.54, "elapsed_time": "6:16:09", "remaining_time": "1:42:46"}
|
||||
{"current_steps": 2860, "total_steps": 3635, "loss": 0.1702, "lr": 5.3023087826985395e-06, "epoch": 3.934832491968793, "percentage": 78.68, "elapsed_time": "6:17:30", "remaining_time": "1:42:17"}
|
||||
{"current_steps": 2865, "total_steps": 3635, "loss": 0.2319, "lr": 5.237342347999386e-06, "epoch": 3.9417163836622304, "percentage": 78.82, "elapsed_time": "6:19:15", "remaining_time": "1:41:55"}
|
||||
{"current_steps": 2870, "total_steps": 3635, "loss": 0.2529, "lr": 5.1727163547095835e-06, "epoch": 3.948600275355668, "percentage": 78.95, "elapsed_time": "6:21:19", "remaining_time": "1:41:38"}
|
||||
{"current_steps": 2875, "total_steps": 3635, "loss": 0.2295, "lr": 5.108432293168118e-06, "epoch": 3.955484167049105, "percentage": 79.09, "elapsed_time": "6:23:29", "remaining_time": "1:41:22"}
|
||||
{"current_steps": 2880, "total_steps": 3635, "loss": 0.2318, "lr": 5.044491645828684e-06, "epoch": 3.9623680587425425, "percentage": 79.23, "elapsed_time": "6:25:32", "remaining_time": "1:41:04"}
|
||||
{"current_steps": 2885, "total_steps": 3635, "loss": 0.2254, "lr": 4.980895887225529e-06, "epoch": 3.9692519504359796, "percentage": 79.37, "elapsed_time": "6:27:38", "remaining_time": "1:40:46"}
|
||||
{"current_steps": 2890, "total_steps": 3635, "loss": 0.2268, "lr": 4.917646483939405e-06, "epoch": 3.976135842129417, "percentage": 79.5, "elapsed_time": "6:29:40", "remaining_time": "1:40:27"}
|
||||
{"current_steps": 2895, "total_steps": 3635, "loss": 0.2158, "lr": 4.85474489456379e-06, "epoch": 3.9830197338228546, "percentage": 79.64, "elapsed_time": "6:31:46", "remaining_time": "1:40:08"}
|
||||
{"current_steps": 2900, "total_steps": 3635, "loss": 0.2051, "lr": 4.792192569671202e-06, "epoch": 3.989903625516292, "percentage": 79.78, "elapsed_time": "6:33:48", "remaining_time": "1:39:48"}
|
||||
{"current_steps": 2905, "total_steps": 3635, "loss": 0.2019, "lr": 4.7299909517798034e-06, "epoch": 3.996787517209729, "percentage": 79.92, "elapsed_time": "6:35:52", "remaining_time": "1:39:28"}
|
||||
{"current_steps": 2910, "total_steps": 3635, "loss": 0.3927, "lr": 4.668141475320096e-06, "epoch": 4.002753556677375, "percentage": 80.06, "elapsed_time": "6:37:25", "remaining_time": "1:39:00"}
|
||||
{"current_steps": 2915, "total_steps": 3635, "loss": 0.5545, "lr": 4.6066455666018664e-06, "epoch": 4.009637448370812, "percentage": 80.19, "elapsed_time": "6:38:57", "remaining_time": "1:38:32"}
|
||||
{"current_steps": 2920, "total_steps": 3635, "loss": 0.4776, "lr": 4.5455046437812515e-06, "epoch": 4.01652134006425, "percentage": 80.33, "elapsed_time": "6:40:27", "remaining_time": "1:38:03"}
|
||||
{"current_steps": 2925, "total_steps": 3635, "loss": 0.4246, "lr": 4.4847201168280985e-06, "epoch": 4.023405231757687, "percentage": 80.47, "elapsed_time": "6:41:53", "remaining_time": "1:37:33"}
|
||||
{"current_steps": 2930, "total_steps": 3635, "loss": 0.3812, "lr": 4.42429338749339e-06, "epoch": 4.030289123451125, "percentage": 80.61, "elapsed_time": "6:43:19", "remaining_time": "1:37:02"}
|
||||
{"current_steps": 2935, "total_steps": 3635, "loss": 0.3676, "lr": 4.3642258492769685e-06, "epoch": 4.037173015144562, "percentage": 80.74, "elapsed_time": "6:44:40", "remaining_time": "1:36:30"}
|
||||
{"current_steps": 2940, "total_steps": 3635, "loss": 0.3527, "lr": 4.3045188873953506e-06, "epoch": 4.044056906837999, "percentage": 80.88, "elapsed_time": "6:46:07", "remaining_time": "1:36:00"}
|
||||
{"current_steps": 2945, "total_steps": 3635, "loss": 0.3421, "lr": 4.24517387874982e-06, "epoch": 4.050940798531436, "percentage": 81.02, "elapsed_time": "6:47:33", "remaining_time": "1:35:29"}
|
||||
{"current_steps": 2950, "total_steps": 3635, "loss": 0.3363, "lr": 4.186192191894656e-06, "epoch": 4.057824690224874, "percentage": 81.16, "elapsed_time": "6:49:02", "remaining_time": "1:34:58"}
|
||||
{"current_steps": 2955, "total_steps": 3635, "loss": 0.3382, "lr": 4.127575187005597e-06, "epoch": 4.064708581918311, "percentage": 81.29, "elapsed_time": "6:50:30", "remaining_time": "1:34:27"}
|
||||
{"current_steps": 2960, "total_steps": 3635, "loss": 0.3356, "lr": 4.069324215848438e-06, "epoch": 4.071592473611749, "percentage": 81.43, "elapsed_time": "6:51:59", "remaining_time": "1:33:57"}
|
||||
{"current_steps": 2965, "total_steps": 3635, "loss": 0.304, "lr": 4.011440621747879e-06, "epoch": 4.078476365305185, "percentage": 81.57, "elapsed_time": "6:53:40", "remaining_time": "1:33:28"}
|
||||
{"current_steps": 2970, "total_steps": 3635, "loss": 0.2969, "lr": 3.953925739556556e-06, "epoch": 4.085360256998623, "percentage": 81.71, "elapsed_time": "6:55:20", "remaining_time": "1:32:59"}
|
||||
{"current_steps": 2975, "total_steps": 3635, "loss": 0.2967, "lr": 3.896780895624235e-06, "epoch": 4.09224414869206, "percentage": 81.84, "elapsed_time": "6:57:01", "remaining_time": "1:32:30"}
|
||||
{"current_steps": 2980, "total_steps": 3635, "loss": 0.2938, "lr": 3.840007407767235e-06, "epoch": 4.099128040385498, "percentage": 81.98, "elapsed_time": "6:58:36", "remaining_time": "1:32:00"}
|
||||
{"current_steps": 2985, "total_steps": 3635, "loss": 0.2931, "lr": 3.7836065852380355e-06, "epoch": 4.1060119320789354, "percentage": 82.12, "elapsed_time": "7:00:13", "remaining_time": "1:31:30"}
|
||||
{"current_steps": 2990, "total_steps": 3635, "loss": 0.2926, "lr": 3.727579728695103e-06, "epoch": 4.112895823772373, "percentage": 82.26, "elapsed_time": "7:01:56", "remaining_time": "1:31:01"}
|
||||
{"current_steps": 2995, "total_steps": 3635, "loss": 0.3, "lr": 3.671928130172864e-06, "epoch": 4.1197797154658105, "percentage": 82.39, "elapsed_time": "7:03:37", "remaining_time": "1:30:31"}
|
||||
{"current_steps": 3000, "total_steps": 3635, "loss": 0.291, "lr": 3.6166530730519413e-06, "epoch": 4.126663607159247, "percentage": 82.53, "elapsed_time": "7:05:16", "remaining_time": "1:30:01"}
|
||||
{"current_steps": 3005, "total_steps": 3635, "loss": 0.2949, "lr": 3.561755832029519e-06, "epoch": 4.133547498852685, "percentage": 82.67, "elapsed_time": "7:07:09", "remaining_time": "1:29:33"}
|
||||
{"current_steps": 3010, "total_steps": 3635, "loss": 0.2919, "lr": 3.5072376730899937e-06, "epoch": 4.140431390546122, "percentage": 82.81, "elapsed_time": "7:08:47", "remaining_time": "1:29:02"}
|
||||
{"current_steps": 3015, "total_steps": 3635, "loss": 0.2932, "lr": 3.453099853475739e-06, "epoch": 4.14731528223956, "percentage": 82.94, "elapsed_time": "7:10:48", "remaining_time": "1:28:35"}
|
||||
{"current_steps": 3020, "total_steps": 3635, "loss": 0.2931, "lr": 3.3993436216581556e-06, "epoch": 4.154199173932997, "percentage": 83.08, "elapsed_time": "7:13:03", "remaining_time": "1:28:11"}
|
||||
{"current_steps": 3025, "total_steps": 3635, "loss": 0.2901, "lr": 3.3459702173088114e-06, "epoch": 4.161083065626434, "percentage": 83.22, "elapsed_time": "7:15:14", "remaining_time": "1:27:46"}
|
||||
{"current_steps": 3030, "total_steps": 3635, "loss": 0.2955, "lr": 3.2929808712709364e-06, "epoch": 4.167966957319871, "percentage": 83.36, "elapsed_time": "7:17:26", "remaining_time": "1:27:20"}
|
||||
{"current_steps": 3035, "total_steps": 3635, "loss": 0.2985, "lr": 3.2403768055309782e-06, "epoch": 4.174850849013309, "percentage": 83.49, "elapsed_time": "7:19:29", "remaining_time": "1:26:53"}
|
||||
{"current_steps": 3040, "total_steps": 3635, "loss": 0.2872, "lr": 3.1881592331904486e-06, "epoch": 4.181734740706746, "percentage": 83.63, "elapsed_time": "7:21:44", "remaining_time": "1:26:27"}
|
||||
{"current_steps": 3045, "total_steps": 3635, "loss": 0.287, "lr": 3.136329358437937e-06, "epoch": 4.188618632400184, "percentage": 83.77, "elapsed_time": "7:23:56", "remaining_time": "1:26:01"}
|
||||
{"current_steps": 3050, "total_steps": 3635, "loss": 0.2897, "lr": 3.084888376521349e-06, "epoch": 4.195502524093621, "percentage": 83.91, "elapsed_time": "7:26:03", "remaining_time": "1:25:33"}
|
||||
{"current_steps": 3055, "total_steps": 3635, "loss": 0.2935, "lr": 3.033837473720329e-06, "epoch": 4.202386415787059, "percentage": 84.04, "elapsed_time": "7:28:21", "remaining_time": "1:25:07"}
|
||||
{"current_steps": 3060, "total_steps": 3635, "loss": 0.2843, "lr": 2.9831778273189306e-06, "epoch": 4.209270307480495, "percentage": 84.18, "elapsed_time": "7:30:30", "remaining_time": "1:24:39"}
|
||||
{"current_steps": 3065, "total_steps": 3635, "loss": 0.2929, "lr": 2.9329106055784364e-06, "epoch": 4.216154199173933, "percentage": 84.32, "elapsed_time": "7:32:42", "remaining_time": "1:24:11"}
|
||||
{"current_steps": 3070, "total_steps": 3635, "loss": 0.2871, "lr": 2.883036967710431e-06, "epoch": 4.22303809086737, "percentage": 84.46, "elapsed_time": "7:34:54", "remaining_time": "1:23:43"}
|
||||
{"current_steps": 3075, "total_steps": 3635, "loss": 0.2907, "lr": 2.8335580638500703e-06, "epoch": 4.229921982560808, "percentage": 84.59, "elapsed_time": "7:37:04", "remaining_time": "1:23:14"}
|
||||
{"current_steps": 3080, "total_steps": 3635, "loss": 0.2839, "lr": 2.7844750350295635e-06, "epoch": 4.236805874254245, "percentage": 84.73, "elapsed_time": "7:39:13", "remaining_time": "1:22:45"}
|
||||
{"current_steps": 3085, "total_steps": 3635, "loss": 0.2828, "lr": 2.7357890131518395e-06, "epoch": 4.243689765947682, "percentage": 84.87, "elapsed_time": "7:41:26", "remaining_time": "1:22:15"}
|
||||
{"current_steps": 3090, "total_steps": 3635, "loss": 0.2846, "lr": 2.6875011209644617e-06, "epoch": 4.2505736576411195, "percentage": 85.01, "elapsed_time": "7:43:38", "remaining_time": "1:21:46"}
|
||||
{"current_steps": 3095, "total_steps": 3635, "loss": 0.2819, "lr": 2.639612472033726e-06, "epoch": 4.257457549334557, "percentage": 85.14, "elapsed_time": "7:45:44", "remaining_time": "1:21:15"}
{"current_steps": 3100, "total_steps": 3635, "loss": 0.2871, "lr": 2.592124170718999e-06, "epoch": 4.2643414410279945, "percentage": 85.28, "elapsed_time": "7:47:56", "remaining_time": "1:20:45"}
{"current_steps": 3105, "total_steps": 3635, "loss": 0.2925, "lr": 2.545037312147223e-06, "epoch": 4.271225332721432, "percentage": 85.42, "elapsed_time": "7:49:09", "remaining_time": "1:20:04"}
{"current_steps": 3110, "total_steps": 3635, "loss": 0.2915, "lr": 2.4983529821876885e-06, "epoch": 4.2781092244148695, "percentage": 85.56, "elapsed_time": "7:50:14", "remaining_time": "1:19:22"}
{"current_steps": 3115, "total_steps": 3635, "loss": 0.6151, "lr": 2.4520722574269697e-06, "epoch": 4.284993116108307, "percentage": 85.69, "elapsed_time": "7:51:55", "remaining_time": "1:18:46"}
{"current_steps": 3120, "total_steps": 3635, "loss": 0.6474, "lr": 2.4061962051441113e-06, "epoch": 4.291877007801744, "percentage": 85.83, "elapsed_time": "7:53:39", "remaining_time": "1:18:10"}
{"current_steps": 3125, "total_steps": 3635, "loss": 0.6191, "lr": 2.36072588328601e-06, "epoch": 4.298760899495181, "percentage": 85.97, "elapsed_time": "7:55:22", "remaining_time": "1:17:34"}
{"current_steps": 3130, "total_steps": 3635, "loss": 0.5633, "lr": 2.31566234044303e-06, "epoch": 4.305644791188619, "percentage": 86.11, "elapsed_time": "7:57:08", "remaining_time": "1:16:59"}
{"current_steps": 3135, "total_steps": 3635, "loss": 0.5424, "lr": 2.2710066158248025e-06, "epoch": 4.312528682882056, "percentage": 86.24, "elapsed_time": "7:58:58", "remaining_time": "1:16:23"}
{"current_steps": 3140, "total_steps": 3635, "loss": 0.517, "lr": 2.226759739236262e-06, "epoch": 4.319412574575494, "percentage": 86.38, "elapsed_time": "8:00:40", "remaining_time": "1:15:46"}
{"current_steps": 3145, "total_steps": 3635, "loss": 0.5062, "lr": 2.1829227310539245e-06, "epoch": 4.32629646626893, "percentage": 86.52, "elapsed_time": "8:02:25", "remaining_time": "1:15:09"}
{"current_steps": 3150, "total_steps": 3635, "loss": 0.506, "lr": 2.1394966022023246e-06, "epoch": 4.333180357962368, "percentage": 86.66, "elapsed_time": "8:03:57", "remaining_time": "1:14:30"}
{"current_steps": 3155, "total_steps": 3635, "loss": 0.4878, "lr": 2.0964823541307157e-06, "epoch": 4.340064249655805, "percentage": 86.8, "elapsed_time": "8:05:47", "remaining_time": "1:13:54"}
{"current_steps": 3160, "total_steps": 3635, "loss": 0.4748, "lr": 2.0538809787899728e-06, "epoch": 4.346948141349243, "percentage": 86.93, "elapsed_time": "8:07:30", "remaining_time": "1:13:16"}
{"current_steps": 3165, "total_steps": 3635, "loss": 0.4778, "lr": 2.0116934586097294e-06, "epoch": 4.35383203304268, "percentage": 87.07, "elapsed_time": "8:09:08", "remaining_time": "1:12:38"}
{"current_steps": 3170, "total_steps": 3635, "loss": 0.4836, "lr": 1.9699207664757057e-06, "epoch": 4.360715924736118, "percentage": 87.21, "elapsed_time": "8:10:59", "remaining_time": "1:12:01"}
{"current_steps": 3175, "total_steps": 3635, "loss": 0.458, "lr": 1.928563865707278e-06, "epoch": 4.367599816429555, "percentage": 87.35, "elapsed_time": "8:12:41", "remaining_time": "1:11:22"}
{"current_steps": 3180, "total_steps": 3635, "loss": 0.4951, "lr": 1.88762371003526e-06, "epoch": 4.374483708122992, "percentage": 87.48, "elapsed_time": "8:14:29", "remaining_time": "1:10:45"}
{"current_steps": 3185, "total_steps": 3635, "loss": 0.4678, "lr": 1.8471012435799317e-06, "epoch": 4.381367599816429, "percentage": 87.62, "elapsed_time": "8:16:12", "remaining_time": "1:10:06"}
{"current_steps": 3190, "total_steps": 3635, "loss": 0.5026, "lr": 1.8069974008292246e-06, "epoch": 4.388251491509867, "percentage": 87.76, "elapsed_time": "8:17:56", "remaining_time": "1:09:27"}
{"current_steps": 3195, "total_steps": 3635, "loss": 0.4813, "lr": 1.7673131066172233e-06, "epoch": 4.395135383203304, "percentage": 87.9, "elapsed_time": "8:19:42", "remaining_time": "1:08:49"}
{"current_steps": 3200, "total_steps": 3635, "loss": 0.4806, "lr": 1.7280492761027767e-06, "epoch": 4.402019274896742, "percentage": 88.03, "elapsed_time": "8:21:23", "remaining_time": "1:08:09"}
{"current_steps": 3205, "total_steps": 3635, "loss": 0.4986, "lr": 1.6892068147484563e-06, "epoch": 4.408903166590179, "percentage": 88.17, "elapsed_time": "8:23:13", "remaining_time": "1:07:30"}
{"current_steps": 3210, "total_steps": 3635, "loss": 0.4708, "lr": 1.650786618299629e-06, "epoch": 4.415787058283616, "percentage": 88.31, "elapsed_time": "8:24:57", "remaining_time": "1:06:51"}
{"current_steps": 3215, "total_steps": 3635, "loss": 0.1883, "lr": 1.6127895727638354e-06, "epoch": 4.422670949977054, "percentage": 88.45, "elapsed_time": "8:26:05", "remaining_time": "1:06:06"}
{"current_steps": 3220, "total_steps": 3635, "loss": 0.1447, "lr": 1.5752165543903108e-06, "epoch": 4.429554841670491, "percentage": 88.58, "elapsed_time": "8:27:12", "remaining_time": "1:05:22"}
{"current_steps": 3225, "total_steps": 3635, "loss": 0.1348, "lr": 1.5380684296498372e-06, "epoch": 4.436438733363929, "percentage": 88.72, "elapsed_time": "8:28:24", "remaining_time": "1:04:38"}
{"current_steps": 3230, "total_steps": 3635, "loss": 0.1295, "lr": 1.5013460552147119e-06, "epoch": 4.443322625057366, "percentage": 88.86, "elapsed_time": "8:29:35", "remaining_time": "1:03:53"}
{"current_steps": 3235, "total_steps": 3635, "loss": 0.1247, "lr": 1.4650502779390285e-06, "epoch": 4.450206516750804, "percentage": 89.0, "elapsed_time": "8:30:48", "remaining_time": "1:03:09"}
{"current_steps": 3240, "total_steps": 3635, "loss": 0.1256, "lr": 1.4291819348391034e-06, "epoch": 4.45709040844424, "percentage": 89.13, "elapsed_time": "8:32:00", "remaining_time": "1:02:25"}
{"current_steps": 3245, "total_steps": 3635, "loss": 0.1235, "lr": 1.3937418530742287e-06, "epoch": 4.463974300137678, "percentage": 89.27, "elapsed_time": "8:33:13", "remaining_time": "1:01:40"}
{"current_steps": 3250, "total_steps": 3635, "loss": 0.1269, "lr": 1.3587308499275475e-06, "epoch": 4.470858191831115, "percentage": 89.41, "elapsed_time": "8:34:32", "remaining_time": "1:00:57"}
{"current_steps": 3255, "total_steps": 3635, "loss": 0.1206, "lr": 1.3241497327872476e-06, "epoch": 4.477742083524553, "percentage": 89.55, "elapsed_time": "8:35:50", "remaining_time": "1:00:13"}
{"current_steps": 3260, "total_steps": 3635, "loss": 0.1169, "lr": 1.2899992991279086e-06, "epoch": 4.48462597521799, "percentage": 89.68, "elapsed_time": "8:37:08", "remaining_time": "0:59:29"}
{"current_steps": 3265, "total_steps": 3635, "loss": 0.1193, "lr": 1.2562803364921261e-06, "epoch": 4.491509866911427, "percentage": 89.82, "elapsed_time": "8:38:32", "remaining_time": "0:58:45"}
{"current_steps": 3270, "total_steps": 3635, "loss": 0.1191, "lr": 1.2229936224723483e-06, "epoch": 4.498393758604864, "percentage": 89.96, "elapsed_time": "8:39:50", "remaining_time": "0:58:01"}
{"current_steps": 3275, "total_steps": 3635, "loss": 0.1185, "lr": 1.190139924692959e-06, "epoch": 4.505277650298302, "percentage": 90.1, "elapsed_time": "8:41:07", "remaining_time": "0:57:17"}
{"current_steps": 3280, "total_steps": 3635, "loss": 0.1141, "lr": 1.157720000792546e-06, "epoch": 4.512161541991739, "percentage": 90.23, "elapsed_time": "8:42:27", "remaining_time": "0:56:32"}
{"current_steps": 3285, "total_steps": 3635, "loss": 0.1148, "lr": 1.125734598406447e-06, "epoch": 4.519045433685177, "percentage": 90.37, "elapsed_time": "8:43:43", "remaining_time": "0:55:48"}
{"current_steps": 3290, "total_steps": 3635, "loss": 0.1159, "lr": 1.0941844551495162e-06, "epoch": 4.525929325378614, "percentage": 90.51, "elapsed_time": "8:44:51", "remaining_time": "0:55:02"}
{"current_steps": 3295, "total_steps": 3635, "loss": 0.1169, "lr": 1.0630702985990914e-06, "epoch": 4.532813217072052, "percentage": 90.65, "elapsed_time": "8:46:04", "remaining_time": "0:54:17"}
{"current_steps": 3300, "total_steps": 3635, "loss": 0.116, "lr": 1.0323928462782363e-06, "epoch": 4.5396971087654885, "percentage": 90.78, "elapsed_time": "8:47:16", "remaining_time": "0:53:31"}
{"current_steps": 3305, "total_steps": 3635, "loss": 0.117, "lr": 1.0021528056391782e-06, "epoch": 4.546581000458926, "percentage": 90.92, "elapsed_time": "8:48:35", "remaining_time": "0:52:46"}
{"current_steps": 3310, "total_steps": 3635, "loss": 0.1115, "lr": 9.723508740470123e-07, "epoch": 4.5534648921523635, "percentage": 91.06, "elapsed_time": "8:49:44", "remaining_time": "0:52:00"}
{"current_steps": 3315, "total_steps": 3635, "loss": 0.1947, "lr": 9.429877387635944e-07, "epoch": 4.560348783845801, "percentage": 91.2, "elapsed_time": "8:51:03", "remaining_time": "0:51:15"}
{"current_steps": 3320, "total_steps": 3635, "loss": 0.4262, "lr": 9.140640769317177e-07, "epoch": 4.5672326755392385, "percentage": 91.33, "elapsed_time": "8:52:51", "remaining_time": "0:50:33"}
{"current_steps": 3325, "total_steps": 3635, "loss": 0.4177, "lr": 8.855805555594777e-07, "epoch": 4.574116567232675, "percentage": 91.47, "elapsed_time": "8:54:33", "remaining_time": "0:49:50"}
{"current_steps": 3330, "total_steps": 3635, "loss": 0.4132, "lr": 8.575378315048999e-07, "epoch": 4.581000458926113, "percentage": 91.61, "elapsed_time": "8:56:14", "remaining_time": "0:49:06"}
{"current_steps": 3335, "total_steps": 3635, "loss": 0.4073, "lr": 8.299365514607883e-07, "epoch": 4.58788435061955, "percentage": 91.75, "elapsed_time": "8:58:02", "remaining_time": "0:48:24"}
{"current_steps": 3340, "total_steps": 3635, "loss": 0.3835, "lr": 8.027773519398207e-07, "epoch": 4.594768242312988, "percentage": 91.88, "elapsed_time": "8:59:43", "remaining_time": "0:47:40"}
{"current_steps": 3345, "total_steps": 3635, "loss": 0.3943, "lr": 7.760608592598528e-07, "epoch": 4.601652134006425, "percentage": 92.02, "elapsed_time": "9:01:28", "remaining_time": "0:46:56"}
{"current_steps": 3350, "total_steps": 3635, "loss": 0.3707, "lr": 7.497876895294931e-07, "epoch": 4.608536025699863, "percentage": 92.16, "elapsed_time": "9:03:12", "remaining_time": "0:46:12"}
{"current_steps": 3355, "total_steps": 3635, "loss": 0.392, "lr": 7.239584486338814e-07, "epoch": 4.6154199173933, "percentage": 92.3, "elapsed_time": "9:04:58", "remaining_time": "0:45:28"}
{"current_steps": 3360, "total_steps": 3635, "loss": 0.3859, "lr": 6.985737322207287e-07, "epoch": 4.622303809086737, "percentage": 92.43, "elapsed_time": "9:06:39", "remaining_time": "0:44:44"}
{"current_steps": 3365, "total_steps": 3635, "loss": 0.3797, "lr": 6.736341256865642e-07, "epoch": 4.629187700780174, "percentage": 92.57, "elapsed_time": "9:08:30", "remaining_time": "0:44:00"}
{"current_steps": 3370, "total_steps": 3635, "loss": 0.3888, "lr": 6.491402041632544e-07, "epoch": 4.636071592473612, "percentage": 92.71, "elapsed_time": "9:10:13", "remaining_time": "0:43:16"}
{"current_steps": 3375, "total_steps": 3635, "loss": 0.3744, "lr": 6.250925325047252e-07, "epoch": 4.642955484167049, "percentage": 92.85, "elapsed_time": "9:12:02", "remaining_time": "0:42:31"}
{"current_steps": 3380, "total_steps": 3635, "loss": 0.3916, "lr": 6.014916652739455e-07, "epoch": 4.649839375860487, "percentage": 92.98, "elapsed_time": "9:13:43", "remaining_time": "0:41:46"}
{"current_steps": 3385, "total_steps": 3635, "loss": 0.3922, "lr": 5.783381467301352e-07, "epoch": 4.656723267553923, "percentage": 93.12, "elapsed_time": "9:15:30", "remaining_time": "0:41:01"}
{"current_steps": 3390, "total_steps": 3635, "loss": 0.3825, "lr": 5.556325108162064e-07, "epoch": 4.663607159247361, "percentage": 93.26, "elapsed_time": "9:17:22", "remaining_time": "0:40:16"}
{"current_steps": 3395, "total_steps": 3635, "loss": 0.3898, "lr": 5.333752811464599e-07, "epoch": 4.670491050940798, "percentage": 93.4, "elapsed_time": "9:19:00", "remaining_time": "0:39:31"}
{"current_steps": 3400, "total_steps": 3635, "loss": 0.3865, "lr": 5.115669709945148e-07, "epoch": 4.677374942634236, "percentage": 93.54, "elapsed_time": "9:20:45", "remaining_time": "0:38:45"}
{"current_steps": 3405, "total_steps": 3635, "loss": 0.3763, "lr": 4.90208083281456e-07, "epoch": 4.684258834327673, "percentage": 93.67, "elapsed_time": "9:22:25", "remaining_time": "0:37:59"}
{"current_steps": 3410, "total_steps": 3635, "loss": 0.3825, "lr": 4.692991105642519e-07, "epoch": 4.691142726021111, "percentage": 93.81, "elapsed_time": "9:24:07", "remaining_time": "0:37:13"}
{"current_steps": 3415, "total_steps": 3635, "loss": 0.3955, "lr": 4.488405350243752e-07, "epoch": 4.698026617714548, "percentage": 93.95, "elapsed_time": "9:25:54", "remaining_time": "0:36:27"}
{"current_steps": 3420, "total_steps": 3635, "loss": 0.359, "lr": 4.288328284567156e-07, "epoch": 4.704910509407985, "percentage": 94.09, "elapsed_time": "9:27:41", "remaining_time": "0:35:41"}
{"current_steps": 3425, "total_steps": 3635, "loss": 0.2684, "lr": 4.092764522586601e-07, "epoch": 4.711794401101423, "percentage": 94.22, "elapsed_time": "9:30:08", "remaining_time": "0:34:57"}
{"current_steps": 3430, "total_steps": 3635, "loss": 0.2404, "lr": 3.901718574194968e-07, "epoch": 4.71867829279486, "percentage": 94.36, "elapsed_time": "9:32:33", "remaining_time": "0:34:13"}
{"current_steps": 3435, "total_steps": 3635, "loss": 0.2514, "lr": 3.715194845099657e-07, "epoch": 4.725562184488298, "percentage": 94.5, "elapsed_time": "9:34:58", "remaining_time": "0:33:28"}
{"current_steps": 3440, "total_steps": 3635, "loss": 0.248, "lr": 3.533197636721442e-07, "epoch": 4.732446076181735, "percentage": 94.64, "elapsed_time": "9:37:24", "remaining_time": "0:32:43"}
{"current_steps": 3445, "total_steps": 3635, "loss": 0.2602, "lr": 3.3557311460949537e-07, "epoch": 4.739329967875172, "percentage": 94.77, "elapsed_time": "9:39:42", "remaining_time": "0:31:58"}
{"current_steps": 3450, "total_steps": 3635, "loss": 0.2339, "lr": 3.182799465772135e-07, "epoch": 4.746213859568609, "percentage": 94.91, "elapsed_time": "9:42:08", "remaining_time": "0:31:12"}
{"current_steps": 3455, "total_steps": 3635, "loss": 0.2453, "lr": 3.014406583727603e-07, "epoch": 4.753097751262047, "percentage": 95.05, "elapsed_time": "9:44:31", "remaining_time": "0:30:27"}
{"current_steps": 3460, "total_steps": 3635, "loss": 0.2358, "lr": 2.8505563832669e-07, "epoch": 4.759981642955484, "percentage": 95.19, "elapsed_time": "9:46:57", "remaining_time": "0:29:41"}
{"current_steps": 3465, "total_steps": 3635, "loss": 0.2287, "lr": 2.691252642936859e-07, "epoch": 4.766865534648922, "percentage": 95.32, "elapsed_time": "9:49:23", "remaining_time": "0:28:55"}
{"current_steps": 3470, "total_steps": 3635, "loss": 0.2505, "lr": 2.5364990364384887e-07, "epoch": 4.773749426342359, "percentage": 95.46, "elapsed_time": "9:51:41", "remaining_time": "0:28:08"}
{"current_steps": 3475, "total_steps": 3635, "loss": 0.2485, "lr": 2.386299132542136e-07, "epoch": 4.780633318035797, "percentage": 95.6, "elapsed_time": "9:54:06", "remaining_time": "0:27:21"}
{"current_steps": 3480, "total_steps": 3635, "loss": 0.2544, "lr": 2.240656395005414e-07, "epoch": 4.787517209729233, "percentage": 95.74, "elapsed_time": "9:56:23", "remaining_time": "0:26:33"}
{"current_steps": 3485, "total_steps": 3635, "loss": 0.2246, "lr": 2.0995741824931138e-07, "epoch": 4.794401101422671, "percentage": 95.87, "elapsed_time": "9:58:49", "remaining_time": "0:25:46"}
{"current_steps": 3490, "total_steps": 3635, "loss": 0.2276, "lr": 1.9630557484998647e-07, "epoch": 4.801284993116108, "percentage": 96.01, "elapsed_time": "10:01:16", "remaining_time": "0:24:58"}
{"current_steps": 3495, "total_steps": 3635, "loss": 0.2375, "lr": 1.8311042412750613e-07, "epoch": 4.808168884809546, "percentage": 96.15, "elapsed_time": "10:03:41", "remaining_time": "0:24:10"}
{"current_steps": 3500, "total_steps": 3635, "loss": 0.2224, "lr": 1.7037227037502546e-07, "epoch": 4.815052776502983, "percentage": 96.29, "elapsed_time": "10:06:06", "remaining_time": "0:23:22"}
{"current_steps": 3505, "total_steps": 3635, "loss": 0.2342, "lr": 1.5809140734690753e-07, "epoch": 4.82193666819642, "percentage": 96.42, "elapsed_time": "10:08:20", "remaining_time": "0:22:33"}
{"current_steps": 3510, "total_steps": 3635, "loss": 0.2247, "lr": 1.4626811825192878e-07, "epoch": 4.8288205598898575, "percentage": 96.56, "elapsed_time": "10:10:46", "remaining_time": "0:21:45"}
{"current_steps": 3515, "total_steps": 3635, "loss": 0.2326, "lr": 1.3490267574677528e-07, "epoch": 4.835704451583295, "percentage": 96.7, "elapsed_time": "10:13:10", "remaining_time": "0:20:56"}
{"current_steps": 3520, "total_steps": 3635, "loss": 0.1956, "lr": 1.2399534192972797e-07, "epoch": 4.8425883432767325, "percentage": 96.84, "elapsed_time": "10:14:30", "remaining_time": "0:20:04"}
{"current_steps": 3525, "total_steps": 3635, "loss": 0.1817, "lr": 1.1354636833463827e-07, "epoch": 4.84947223497017, "percentage": 96.97, "elapsed_time": "10:15:49", "remaining_time": "0:19:13"}
{"current_steps": 3530, "total_steps": 3635, "loss": 0.1773, "lr": 1.0355599592511534e-07, "epoch": 4.8563561266636075, "percentage": 97.11, "elapsed_time": "10:17:09", "remaining_time": "0:18:21"}
{"current_steps": 3535, "total_steps": 3635, "loss": 0.1784, "lr": 9.402445508897685e-08, "epoch": 4.863240018357045, "percentage": 97.25, "elapsed_time": "10:18:32", "remaining_time": "0:17:29"}
{"current_steps": 3540, "total_steps": 3635, "loss": 0.1737, "lr": 8.495196563293118e-08, "epoch": 4.870123910050482, "percentage": 97.39, "elapsed_time": "10:19:47", "remaining_time": "0:16:37"}
{"current_steps": 3545, "total_steps": 3635, "loss": 0.1744, "lr": 7.633873677751036e-08, "epoch": 4.877007801743919, "percentage": 97.52, "elapsed_time": "10:21:04", "remaining_time": "0:15:46"}
{"current_steps": 3550, "total_steps": 3635, "loss": 0.1711, "lr": 6.818496715224721e-08, "epoch": 4.883891693437357, "percentage": 97.66, "elapsed_time": "10:22:22", "remaining_time": "0:14:54"}
{"current_steps": 3555, "total_steps": 3635, "loss": 0.1704, "lr": 6.049084479109013e-08, "epoch": 4.890775585130794, "percentage": 97.8, "elapsed_time": "10:23:43", "remaining_time": "0:14:02"}
{"current_steps": 3560, "total_steps": 3635, "loss": 0.1705, "lr": 5.325654712807105e-08, "epoch": 4.897659476824232, "percentage": 97.94, "elapsed_time": "10:25:06", "remaining_time": "0:13:10"}
{"current_steps": 3565, "total_steps": 3635, "loss": 0.1689, "lr": 4.648224099321086e-08, "epoch": 4.904543368517668, "percentage": 98.07, "elapsed_time": "10:26:27", "remaining_time": "0:12:18"}
{"current_steps": 3570, "total_steps": 3635, "loss": 0.1664, "lr": 4.0168082608673666e-08, "epoch": 4.911427260211106, "percentage": 98.21, "elapsed_time": "10:27:44", "remaining_time": "0:11:25"}
{"current_steps": 3575, "total_steps": 3635, "loss": 0.1702, "lr": 3.431421758516518e-08, "epoch": 4.918311151904543, "percentage": 98.35, "elapsed_time": "10:29:10", "remaining_time": "0:10:33"}
{"current_steps": 3580, "total_steps": 3635, "loss": 0.1686, "lr": 2.8920780918570978e-08, "epoch": 4.925195043597981, "percentage": 98.49, "elapsed_time": "10:30:31", "remaining_time": "0:09:41"}
{"current_steps": 3585, "total_steps": 3635, "loss": 0.1683, "lr": 2.398789698684789e-08, "epoch": 4.932078935291418, "percentage": 98.62, "elapsed_time": "10:31:52", "remaining_time": "0:08:48"}
{"current_steps": 3590, "total_steps": 3635, "loss": 0.1863, "lr": 1.9515679547148504e-08, "epoch": 4.938962826984856, "percentage": 98.76, "elapsed_time": "10:33:19", "remaining_time": "0:07:56"}
{"current_steps": 3595, "total_steps": 3635, "loss": 0.2581, "lr": 1.550423173321214e-08, "epoch": 4.945846718678293, "percentage": 98.9, "elapsed_time": "10:35:23", "remaining_time": "0:07:04"}
{"current_steps": 3600, "total_steps": 3635, "loss": 0.2376, "lr": 1.1953646052969004e-08, "epoch": 4.95273061037173, "percentage": 99.04, "elapsed_time": "10:37:29", "remaining_time": "0:06:11"}
{"current_steps": 3605, "total_steps": 3635, "loss": 0.2382, "lr": 8.864004386419655e-09, "epoch": 4.959614502065167, "percentage": 99.17, "elapsed_time": "10:39:46", "remaining_time": "0:05:19"}
{"current_steps": 3610, "total_steps": 3635, "loss": 0.239, "lr": 6.235377983738744e-09, "epoch": 4.966498393758605, "percentage": 99.31, "elapsed_time": "10:41:51", "remaining_time": "0:04:26"}
{"current_steps": 3615, "total_steps": 3635, "loss": 0.2314, "lr": 4.067827463638541e-09, "epoch": 4.973382285452042, "percentage": 99.45, "elapsed_time": "10:43:52", "remaining_time": "0:03:33"}
{"current_steps": 3620, "total_steps": 3635, "loss": 0.2298, "lr": 2.3614028119656186e-09, "epoch": 4.98026617714548, "percentage": 99.59, "elapsed_time": "10:45:59", "remaining_time": "0:02:40"}
{"current_steps": 3625, "total_steps": 3635, "loss": 0.2197, "lr": 1.116143380552881e-09, "epoch": 4.9871500688389165, "percentage": 99.72, "elapsed_time": "10:47:59", "remaining_time": "0:01:47"}
{"current_steps": 3630, "total_steps": 3635, "loss": 0.2122, "lr": 3.320778863069585e-10, "epoch": 4.994033960532354, "percentage": 99.86, "elapsed_time": "10:50:05", "remaining_time": "0:00:53"}
{"current_steps": 3635, "total_steps": 3635, "loss": 0.2367, "lr": 9.224410548736018e-12, "epoch": 5.0, "percentage": 100.0, "elapsed_time": "10:51:52", "remaining_time": "0:00:00"}
{"current_steps": 3635, "total_steps": 3635, "epoch": 5.0, "percentage": 100.0, "elapsed_time": "10:52:00", "remaining_time": "0:00:00"}
8044
trainer_state.json
Normal file
File diff suppressed because it is too large
3
training_args.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:eca78f1c46faf226208ad149fd47f971acd5e1832d6ce0abac92af69d1978c8a
size 8785
BIN
training_loss.png
Normal file
Binary file not shown.
Size: 60 KiB
1
vocab.json
Normal file
File diff suppressed because one or more lines are too long