Initialize the project; model provided by the ModelHub XC community
Model: penfever/nl2bash-verified-GLM-4_6-traces-32ep-32k-dft Source: Original Platform
.gitattributes (vendored, new file, 36 lines)
@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
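Each line above is a standard Git LFS tracking rule: a glob pattern followed by attribute tokens, where `filter=lfs`/`diff=lfs`/`merge=lfs` route the file through LFS and `-text` unsets the `text` attribute so Git never applies newline conversion to the binary payload. A minimal sketch of how such a line decomposes (hypothetical helper names, not part of any Git tooling):

```python
def parse_attr(token):
    """Parse one gitattributes attribute token."""
    if token.startswith("-"):        # "-text": attribute explicitly unset
        return token[1:], False
    if "=" in token:                 # "filter=lfs": attribute set to a value
        key, value = token.split("=", 1)
        return key, value
    return token, True               # bare attribute: set

def parse_rule(line):
    """Split a .gitattributes line into (pattern, {attribute: state})."""
    pattern, *attrs = line.split()
    return pattern, dict(parse_attr(t) for t in attrs)

print(parse_rule("*.safetensors filter=lfs diff=lfs merge=lfs -text"))
```

Running `git lfs track "*.safetensors"` appends exactly such a rule to `.gitattributes`.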
README.md (new file, 60 lines)
@@ -0,0 +1,60 @@
---
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen3-8B
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: nl2bash-verified-GLM-4_6-traces-32ep-32k-dft
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# nl2bash-verified-GLM-4_6-traces-32ep-32k-dft

This model is a fine-tuned version of [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on the penfever/nl2bash-verified-GLM-4.6-traces-32ep-32k dataset.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 4e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 16
- total_train_batch_size: 16
- total_eval_batch_size: 128
- optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 7.0

### Training results

### Framework versions

- Transformers 4.56.0
- Pytorch 2.7.0+cu128
- Datasets 4.4.1
- Tokenizers 0.22.1
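The batch-size fields in the card are mutually consistent: one sample per device across 16 GPUs gives an effective train batch of 16, and 8 eval samples per device give 128. A quick sanity check (gradient accumulation of 1 is an assumption; the card lists no accumulation setting):

```python
# Sanity-check the hyperparameters listed in the model card.
train_batch_size = 1          # per-device
num_devices = 16
grad_accum_steps = 1          # assumed: not listed in the card
total_train_batch_size = train_batch_size * num_devices * grad_accum_steps
assert total_train_batch_size == 16    # matches "total_train_batch_size: 16"

eval_batch_size = 8           # per-device
total_eval_batch_size = eval_batch_size * num_devices
assert total_eval_batch_size == 128    # matches "total_eval_batch_size: 128"
```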
added_tokens.json (new file, 28 lines)
@@ -0,0 +1,28 @@
{
  "</think>": 151668,
  "</tool_call>": 151658,
  "</tool_response>": 151666,
  "<think>": 151667,
  "<tool_call>": 151657,
  "<tool_response>": 151665,
  "<|box_end|>": 151649,
  "<|box_start|>": 151648,
  "<|endoftext|>": 151643,
  "<|file_sep|>": 151664,
  "<|fim_middle|>": 151660,
  "<|fim_pad|>": 151662,
  "<|fim_prefix|>": 151659,
  "<|fim_suffix|>": 151661,
  "<|im_end|>": 151645,
  "<|im_start|>": 151644,
  "<|image_pad|>": 151655,
  "<|object_ref_end|>": 151647,
  "<|object_ref_start|>": 151646,
  "<|quad_end|>": 151651,
  "<|quad_start|>": 151650,
  "<|repo_name|>": 151663,
  "<|video_pad|>": 151656,
  "<|vision_end|>": 151653,
  "<|vision_pad|>": 151654,
  "<|vision_start|>": 151652
}
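The 26 special tokens above occupy one contiguous ID block (151643–151668) appended after the base vocabulary, which is what lets the embedding matrix (`vocab_size: 151936` in `config.json`) cover them without gaps. That can be checked mechanically:

```python
added_tokens = {
    "</think>": 151668, "</tool_call>": 151658, "</tool_response>": 151666,
    "<think>": 151667, "<tool_call>": 151657, "<tool_response>": 151665,
    "<|box_end|>": 151649, "<|box_start|>": 151648, "<|endoftext|>": 151643,
    "<|file_sep|>": 151664, "<|fim_middle|>": 151660, "<|fim_pad|>": 151662,
    "<|fim_prefix|>": 151659, "<|fim_suffix|>": 151661, "<|im_end|>": 151645,
    "<|im_start|>": 151644, "<|image_pad|>": 151655, "<|object_ref_end|>": 151647,
    "<|object_ref_start|>": 151646, "<|quad_end|>": 151651, "<|quad_start|>": 151650,
    "<|repo_name|>": 151663, "<|video_pad|>": 151656, "<|vision_end|>": 151653,
    "<|vision_pad|>": 151654, "<|vision_start|>": 151652,
}
ids = sorted(added_tokens.values())
# 26 distinct IDs forming one contiguous run: 151643 .. 151668
assert ids == list(range(151643, 151669))
assert max(ids) < 151936   # fits inside the model's vocab_size
```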
all_results.json (new file, 14 lines)
@@ -0,0 +1,14 @@
{
  "achieved_tflops_per_gpu": 0.002469331217094423,
  "achieved_tflops_per_gpu_theoretical": 1243.3609745561027,
  "epoch": 7.0,
  "loss_nan_ranks": 0,
  "loss_rank_avg": 0.002985658124089241,
  "mfu_percent": 0.00017451104007734438,
  "mfu_percent_theoretical": 87.87003353753376,
  "total_flos": 331629159579648.0,
  "train_loss": 0.006270654577846554,
  "train_runtime": 8393.6988,
  "train_samples_per_second": 5.86,
  "train_steps_per_second": 0.367
}
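From `train_runtime` and the throughput rates one can recover the approximate totals behind this run (back-of-the-envelope only, since the JSON reports rounded rates):

```python
train_runtime = 8393.6988      # seconds
samples_per_second = 5.86
steps_per_second = 0.367

approx_samples = train_runtime * samples_per_second   # samples processed over 7 epochs
approx_steps = train_runtime * steps_per_second       # optimizer steps
assert round(approx_samples) == 49187
assert round(approx_steps) == 3080
```

About 49,187 samples over 7 epochs works out to roughly 7,000 samples per epoch, consistent with a total train batch of 16 at ~440 steps per epoch.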
chat_template.jinja (new file, 89 lines)
@@ -0,0 +1,89 @@
{%- if tools %}
    {{- '<|im_start|>system\n' }}
    {%- if messages[0].role == 'system' %}
        {{- messages[0].content + '\n\n' }}
    {%- endif %}
    {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
    {%- for tool in tools %}
        {{- "\n" }}
        {{- tool | tojson }}
    {%- endfor %}
    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
    {%- if messages[0].role == 'system' %}
        {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
    {%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
    {%- set index = (messages|length - 1) - loop.index0 %}
    {%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
        {%- set ns.multi_step_tool = false %}
        {%- set ns.last_query_index = index %}
    {%- endif %}
{%- endfor %}
{%- for message in messages %}
    {%- if message.content is string %}
        {%- set content = message.content %}
    {%- else %}
        {%- set content = '' %}
    {%- endif %}
    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
        {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
    {%- elif message.role == "assistant" %}
        {%- set reasoning_content = '' %}
        {%- if message.reasoning_content is string %}
            {%- set reasoning_content = message.reasoning_content %}
        {%- else %}
            {%- if '</think>' in content %}
                {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
                {%- set content = content.split('</think>')[-1].lstrip('\n') %}
            {%- endif %}
        {%- endif %}
        {%- if loop.index0 > ns.last_query_index %}
            {%- if loop.last or (not loop.last and reasoning_content) %}
                {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
            {%- else %}
                {{- '<|im_start|>' + message.role + '\n' + content }}
            {%- endif %}
        {%- else %}
            {{- '<|im_start|>' + message.role + '\n' + content }}
        {%- endif %}
        {%- if message.tool_calls %}
            {%- for tool_call in message.tool_calls %}
                {%- if (loop.first and content) or (not loop.first) %}
                    {{- '\n' }}
                {%- endif %}
                {%- if tool_call.function %}
                    {%- set tool_call = tool_call.function %}
                {%- endif %}
                {{- '<tool_call>\n{"name": "' }}
                {{- tool_call.name }}
                {{- '", "arguments": ' }}
                {%- if tool_call.arguments is string %}
                    {{- tool_call.arguments }}
                {%- else %}
                    {{- tool_call.arguments | tojson }}
                {%- endif %}
                {{- '}\n</tool_call>' }}
            {%- endfor %}
        {%- endif %}
        {{- '<|im_end|>\n' }}
    {%- elif message.role == "tool" %}
        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
            {{- '<|im_start|>user' }}
        {%- endif %}
        {{- '\n<tool_response>\n' }}
        {{- content }}
        {{- '\n</tool_response>' }}
        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
            {{- '<|im_end|>\n' }}
        {%- endif %}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
    {%- if enable_thinking is defined and enable_thinking is false %}
        {{- '<think>\n\n</think>\n\n' }}
    {%- endif %}
{%- endif %}
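For plain system/user turns with no tools and no prior assistant messages, the template above reduces to the familiar ChatML framing. The sketch below is a deliberately simplified pure-Python re-implementation of just that path (use `tokenizer.apply_chat_template` for real prompts; this covers none of the tool-call or reasoning-content branches):

```python
def render_simple(messages, add_generation_prompt=True, enable_thinking=True):
    """Minimal ChatML rendering: string-content system/user turns only."""
    out = []
    for m in messages:
        out.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        out.append("<|im_start|>assistant\n")
        if not enable_thinking:
            # mirrors the template's empty <think> block when thinking is disabled
            out.append("<think>\n\n</think>\n\n")
    return "".join(out)

prompt = render_simple([{"role": "user", "content": "List all .txt files"}])
```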
config.json (new file, 68 lines)
@@ -0,0 +1,68 @@
{
  "architectures": [
    "Qwen3ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "dtype": "bfloat16",
  "eos_token_id": 151645,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 12288,
  "layer_types": [
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention"
  ],
  "max_position_embeddings": 40960,
  "max_window_layers": 36,
  "model_type": "qwen3",
  "num_attention_heads": 32,
  "num_hidden_layers": 36,
  "num_key_value_heads": 8,
  "pad_token_id": 151643,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "transformers_version": "4.56.0",
  "use_cache": false,
  "use_sliding_window": false,
  "vocab_size": 151936
}
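The attention settings above (36 layers, 8 grouped-query KV heads of dimension 128, bfloat16) determine the KV-cache footprint per token of context, worth knowing when sizing the 40,960-token window:

```python
# KV-cache footprint implied by the config above (rough arithmetic).
num_hidden_layers = 36
num_key_value_heads = 8
head_dim = 128
bytes_per_value = 2            # bfloat16

# K and V caches, per token of context, across all layers
kv_bytes_per_token = 2 * num_hidden_layers * num_key_value_heads * head_dim * bytes_per_value
assert kv_bytes_per_token == 147456          # 144 KiB per token

max_position_embeddings = 40960
full_context_gib = kv_bytes_per_token * max_position_embeddings / 2**30
# 5.625 GiB for a single sequence at the full window
```

Grouped-query attention keeps this tractable: with all 32 query heads cached instead of 8 KV heads, the figure would be four times larger.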
generation_config.json (new file, 12 lines)
@@ -0,0 +1,12 @@
{
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "temperature": 0.6,
  "top_k": 20,
  "top_p": 0.95,
  "transformers_version": "4.56.0"
}
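These sampling defaults compose as: scale logits by 1/temperature, keep the 20 highest-scoring tokens, then keep the smallest prefix of the sorted distribution whose cumulative probability reaches 0.95. A minimal sketch of the filtering step (illustrative only, not the transformers implementation):

```python
import math

def filter_candidates(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return token indices that survive top-k then top-p (nucleus) filtering."""
    scaled = [x / temperature for x in logits]
    # top-k: indices of the k largest scaled logits, best first
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # softmax over the survivors (max-subtracted for numerical stability)
    m = max(scaled[i] for i in order)
    exps = [math.exp(scaled[i] - m) for i in order]
    z = sum(exps)
    # nucleus: smallest prefix whose cumulative probability reaches top_p
    survivors, cumulative = [], 0.0
    for idx, e in zip(order, exps):
        survivors.append(idx)
        cumulative += e / z
        if cumulative >= top_p:
            break
    return survivors
```

The final token is then drawn at random from the surviving candidates, proportionally to their renormalized probabilities.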
merges.txt (new file, 151388 lines)
File diff suppressed because it is too large.
model-00001-of-00004.safetensors (new file, LFS pointer)
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:90db05b067a1d440e80e28b11dd70fcc398ebb6fb2d4145c99a43a0f7a03a9db
size 4902257696
model-00002-of-00004.safetensors (new file, LFS pointer)
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b38753ca180b325b1bba65270224af389a692a957eb2f12720ccebce73694b56
size 4915960368
model-00003-of-00004.safetensors (new file, LFS pointer)
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9690a2794dd0439674dc96a042ff439e61e22ed038ebeec52ae182710fb829a0
size 4983068496
model-00004-of-00004.safetensors (new file, LFS pointer)
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f8ce5db167df3353c7b774ee0e275134db1241f05ef37e5b80fb1b319b0e311e
size 1580230264
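Each of the four shard files is stored as a Git LFS pointer: three `key value` lines naming the spec version, the SHA-256 of the real payload, and its byte size. Parsing one pointer and summing the four shard sizes gives the checkpoint's total weight on disk, consistent with an ~8B-parameter model in bfloat16:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict of its key/value lines."""
    return dict(line.split(" ", 1) for line in text.strip().splitlines())

pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:f8ce5db167df3353c7b774ee0e275134db1241f05ef37e5b80fb1b319b0e311e\n"
    "size 1580230264\n"
)
fields = parse_lfs_pointer(pointer)
assert fields["size"] == "1580230264"

# Byte sizes of the four shards, from the pointers above
shard_sizes = [4902257696, 4915960368, 4983068496, 1580230264]
total_bytes = sum(shard_sizes)
assert total_bytes == 16381516824   # ~16.4 GB across the four shards
```

This sum slightly exceeds the `total_size` of 16,381,470,720 in `model.safetensors.index.json`, which counts tensor bytes only; the small difference is likely the per-file safetensors headers.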
model.safetensors.index.json (new file, 407 lines)
@@ -0,0 +1,407 @@
{
  "metadata": {
    "total_parameters": 308224,
    "total_size": 16381470720
  },
  "weight_map": {
    "lm_head.weight": "model-00004-of-00004.safetensors",
    "model.embed_tokens.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.10.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.19.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.19.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.19.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.19.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.19.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.19.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.19.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.19.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.19.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.19.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.2.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.20.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.21.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.22.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.22.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.22.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
    "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.23.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.28.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.28.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.28.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.28.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.28.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.28.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.28.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.28.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.28.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
    "model.layers.28.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.3.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.30.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.35.input_layernorm.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.k_norm.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.q_norm.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.4.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.9.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.9.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.9.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.norm.weight": "model-00004-of-00004.safetensors"
|
||||
}
|
||||
}
|
||||
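The `weight_map` in `model.safetensors.index.json` maps every tensor name to the shard file that stores it. A loader can invert that mapping so each shard is opened once and all of its tensors are read together. A minimal sketch, using an illustrative three-entry excerpt rather than the full map:

```python
from collections import defaultdict

# Illustrative excerpt of the "weight_map" object above (not the full map).
weight_map = {
    "model.layers.35.input_layernorm.weight": "model-00004-of-00004.safetensors",
    "model.layers.35.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.norm.weight": "model-00004-of-00004.safetensors",
}

def tensors_by_shard(weight_map):
    """Invert tensor -> shard into shard -> [tensor names]."""
    shards = defaultdict(list)
    for name, shard in weight_map.items():
        shards[shard].append(name)
    return dict(shards)

groups = tensors_by_shard(weight_map)
```

With the excerpt above, `groups` holds one entry per shard file; note that layer 35 is split across shards 3 and 4, which is why grouping by shard (rather than by layer) avoids reopening files.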
12
run_summary.json
Normal file
@@ -0,0 +1,12 @@
{
"agent_name": "32ep-32k",
"training_start": null,
"training_end": null,
"created_by": "DCAgent",
"base_model_name": "Qwen/Qwen3-8B",
"dataset_name": "penfever/nl2bash-verified-GLM-4.6-traces-32ep-32k",
"training_type": "SFT",
"training_parameters": "https://huggingface.co/penfever/nl2bash-verified-GLM-4_6-traces-32ep-32k-dft/blob/main/config.json",
"wandb_link": "https://wandb.ai/dogml/dc-agent/runs/nl2b-veri-GLM-4.6-trac-32ep-32k_hub-mode-id_nl2b-veri-GLM-4.6-trac-32ep-32k-dft_Qwen3-8B",
"traces_location_s3": null
}
31
special_tokens_map.json
Normal file
@@ -0,0 +1,31 @@
{
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"eos_token": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}
3
tokenizer.json
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aeb13307a71acd8fe81861d94ad54ab689df773318809eed3cbe794b4492dae4
size 11422654
240
tokenizer_config.json
Normal file
@@ -0,0 +1,240 @@
{
"add_bos_token": false,
"add_prefix_space": false,
"added_tokens_decoder": {
"151643": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151644": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151645": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151646": {
"content": "<|object_ref_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151647": {
"content": "<|object_ref_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151648": {
"content": "<|box_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151649": {
"content": "<|box_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151650": {
"content": "<|quad_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151651": {
"content": "<|quad_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151652": {
"content": "<|vision_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151653": {
"content": "<|vision_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151654": {
"content": "<|vision_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151655": {
"content": "<|image_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151656": {
"content": "<|video_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151657": {
"content": "<tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151658": {
"content": "</tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151659": {
"content": "<|fim_prefix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151660": {
"content": "<|fim_middle|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151661": {
"content": "<|fim_suffix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151662": {
"content": "<|fim_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151663": {
"content": "<|repo_name|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151664": {
"content": "<|file_sep|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151665": {
"content": "<tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151666": {
"content": "</tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151667": {
"content": "<think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151668": {
"content": "</think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"bos_token": null,
"clean_up_tokenization_spaces": false,
"eos_token": "<|im_end|>",
"errors": "replace",
"extra_special_tokens": {},
"model_max_length": 32768,
"pad_token": "<|endoftext|>",
"padding_side": "right",
"split_special_tokens": false,
"tokenizer_class": "Qwen2Tokenizer",
"unk_token": null
}
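Each entry in `added_tokens_decoder` above keys a token id to its surface form and a `special` flag; only the `special: true` entries (the ChatML and vision tokens) are also listed in `additional_special_tokens`. A minimal sketch of extracting the special ids, using an illustrative three-entry excerpt of that object:

```python
# Illustrative excerpt of tokenizer_config.json's "added_tokens_decoder".
added_tokens_decoder = {
    "151643": {"content": "<|endoftext|>", "special": True},
    "151645": {"content": "<|im_end|>", "special": True},
    "151668": {"content": "</think>", "special": False},
}

def special_token_ids(decoder):
    """Return sorted integer ids of entries flagged as special tokens."""
    return sorted(int(i) for i, tok in decoder.items() if tok["special"])
```

Note that `<think>`/`</think>` are added tokens but not special, so they are decoded back into visible text rather than stripped.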
14
train_results.json
Normal file
@@ -0,0 +1,14 @@
{
"achieved_tflops_per_gpu": 0.002469331217094423,
"achieved_tflops_per_gpu_theoretical": 1243.3609745561027,
"epoch": 7.0,
"loss_nan_ranks": 0,
"loss_rank_avg": 0.002985658124089241,
"mfu_percent": 0.00017451104007734438,
"mfu_percent_theoretical": 87.87003353753376,
"total_flos": 331629159579648.0,
"train_loss": 0.006270654577846554,
"train_runtime": 8393.6988,
"train_samples_per_second": 5.86,
"train_steps_per_second": 0.367
}
617
trainer_log.jsonl
Normal file
@@ -0,0 +1,617 @@
{"current_steps": 5, "total_steps": 3080, "loss": 0.0702, "lr": 5.194805194805196e-07, "epoch": 0.011363636363636364, "percentage": 0.16, "elapsed_time": "0:00:22", "remaining_time": "3:51:44"}
|
||||
{"current_steps": 10, "total_steps": 3080, "loss": 0.0688, "lr": 1.168831168831169e-06, "epoch": 0.022727272727272728, "percentage": 0.32, "elapsed_time": "0:00:35", "remaining_time": "3:02:54"}
|
||||
{"current_steps": 15, "total_steps": 3080, "loss": 0.0638, "lr": 1.8181818181818183e-06, "epoch": 0.03409090909090909, "percentage": 0.49, "elapsed_time": "0:00:48", "remaining_time": "2:45:25"}
|
||||
{"current_steps": 20, "total_steps": 3080, "loss": 0.0499, "lr": 2.4675324675324676e-06, "epoch": 0.045454545454545456, "percentage": 0.65, "elapsed_time": "0:01:01", "remaining_time": "2:36:38"}
|
||||
{"current_steps": 25, "total_steps": 3080, "loss": 0.0348, "lr": 3.116883116883117e-06, "epoch": 0.056818181818181816, "percentage": 0.81, "elapsed_time": "0:01:14", "remaining_time": "2:31:02"}
|
||||
{"current_steps": 30, "total_steps": 3080, "loss": 0.0255, "lr": 3.7662337662337666e-06, "epoch": 0.06818181818181818, "percentage": 0.97, "elapsed_time": "0:01:27", "remaining_time": "2:27:48"}
|
||||
{"current_steps": 35, "total_steps": 3080, "loss": 0.0214, "lr": 4.415584415584416e-06, "epoch": 0.07954545454545454, "percentage": 1.14, "elapsed_time": "0:01:40", "remaining_time": "2:25:26"}
|
||||
{"current_steps": 40, "total_steps": 3080, "loss": 0.0193, "lr": 5.064935064935065e-06, "epoch": 0.09090909090909091, "percentage": 1.3, "elapsed_time": "0:01:53", "remaining_time": "2:23:25"}
|
||||
{"current_steps": 45, "total_steps": 3080, "loss": 0.0183, "lr": 5.7142857142857145e-06, "epoch": 0.10227272727272728, "percentage": 1.46, "elapsed_time": "0:02:05", "remaining_time": "2:21:14"}
|
||||
{"current_steps": 50, "total_steps": 3080, "loss": 0.0179, "lr": 6.363636363636364e-06, "epoch": 0.11363636363636363, "percentage": 1.62, "elapsed_time": "0:02:19", "remaining_time": "2:20:46"}
|
||||
{"current_steps": 55, "total_steps": 3080, "loss": 0.0172, "lr": 7.012987012987014e-06, "epoch": 0.125, "percentage": 1.79, "elapsed_time": "0:02:32", "remaining_time": "2:19:34"}
|
||||
{"current_steps": 60, "total_steps": 3080, "loss": 0.0165, "lr": 7.662337662337663e-06, "epoch": 0.13636363636363635, "percentage": 1.95, "elapsed_time": "0:02:45", "remaining_time": "2:18:42"}
|
||||
{"current_steps": 65, "total_steps": 3080, "loss": 0.016, "lr": 8.311688311688313e-06, "epoch": 0.14772727272727273, "percentage": 2.11, "elapsed_time": "0:02:58", "remaining_time": "2:17:40"}
|
||||
{"current_steps": 70, "total_steps": 3080, "loss": 0.0159, "lr": 8.96103896103896e-06, "epoch": 0.1590909090909091, "percentage": 2.27, "elapsed_time": "0:03:11", "remaining_time": "2:17:00"}
|
||||
{"current_steps": 75, "total_steps": 3080, "loss": 0.015, "lr": 9.610389610389611e-06, "epoch": 0.17045454545454544, "percentage": 2.44, "elapsed_time": "0:03:24", "remaining_time": "2:16:14"}
|
||||
{"current_steps": 80, "total_steps": 3080, "loss": 0.0145, "lr": 1.025974025974026e-05, "epoch": 0.18181818181818182, "percentage": 2.6, "elapsed_time": "0:03:37", "remaining_time": "2:16:08"}
|
||||
{"current_steps": 85, "total_steps": 3080, "loss": 0.0147, "lr": 1.0909090909090909e-05, "epoch": 0.19318181818181818, "percentage": 2.76, "elapsed_time": "0:03:50", "remaining_time": "2:15:15"}
|
||||
{"current_steps": 90, "total_steps": 3080, "loss": 0.0142, "lr": 1.155844155844156e-05, "epoch": 0.20454545454545456, "percentage": 2.92, "elapsed_time": "0:04:03", "remaining_time": "2:14:55"}
|
||||
{"current_steps": 95, "total_steps": 3080, "loss": 0.0133, "lr": 1.2207792207792208e-05, "epoch": 0.2159090909090909, "percentage": 3.08, "elapsed_time": "0:04:16", "remaining_time": "2:14:23"}
|
||||
{"current_steps": 100, "total_steps": 3080, "loss": 0.013, "lr": 1.2857142857142859e-05, "epoch": 0.22727272727272727, "percentage": 3.25, "elapsed_time": "0:04:29", "remaining_time": "2:13:45"}
|
||||
{"current_steps": 105, "total_steps": 3080, "loss": 0.013, "lr": 1.3506493506493508e-05, "epoch": 0.23863636363636365, "percentage": 3.41, "elapsed_time": "0:04:42", "remaining_time": "2:13:27"}
|
||||
{"current_steps": 110, "total_steps": 3080, "loss": 0.0123, "lr": 1.4155844155844157e-05, "epoch": 0.25, "percentage": 3.57, "elapsed_time": "0:04:55", "remaining_time": "2:12:59"}
|
||||
{"current_steps": 115, "total_steps": 3080, "loss": 0.0125, "lr": 1.4805194805194807e-05, "epoch": 0.26136363636363635, "percentage": 3.73, "elapsed_time": "0:05:08", "remaining_time": "2:12:39"}
|
||||
{"current_steps": 120, "total_steps": 3080, "loss": 0.0124, "lr": 1.5454545454545454e-05, "epoch": 0.2727272727272727, "percentage": 3.9, "elapsed_time": "0:05:21", "remaining_time": "2:12:11"}
|
||||
{"current_steps": 125, "total_steps": 3080, "loss": 0.0116, "lr": 1.6103896103896105e-05, "epoch": 0.2840909090909091, "percentage": 4.06, "elapsed_time": "0:05:33", "remaining_time": "2:11:34"}
|
||||
{"current_steps": 130, "total_steps": 3080, "loss": 0.0118, "lr": 1.6753246753246756e-05, "epoch": 0.29545454545454547, "percentage": 4.22, "elapsed_time": "0:05:45", "remaining_time": "2:10:45"}
|
||||
{"current_steps": 135, "total_steps": 3080, "loss": 0.0118, "lr": 1.7402597402597403e-05, "epoch": 0.3068181818181818, "percentage": 4.38, "elapsed_time": "0:05:58", "remaining_time": "2:10:21"}
|
||||
{"current_steps": 140, "total_steps": 3080, "loss": 0.0114, "lr": 1.8051948051948053e-05, "epoch": 0.3181818181818182, "percentage": 4.55, "elapsed_time": "0:06:10", "remaining_time": "2:09:46"}
|
||||
{"current_steps": 145, "total_steps": 3080, "loss": 0.0115, "lr": 1.8701298701298704e-05, "epoch": 0.32954545454545453, "percentage": 4.71, "elapsed_time": "0:06:25", "remaining_time": "2:09:54"}
|
||||
{"current_steps": 150, "total_steps": 3080, "loss": 0.0113, "lr": 1.9350649350649354e-05, "epoch": 0.3409090909090909, "percentage": 4.87, "elapsed_time": "0:06:38", "remaining_time": "2:09:39"}
|
||||
{"current_steps": 155, "total_steps": 3080, "loss": 0.0111, "lr": 2e-05, "epoch": 0.3522727272727273, "percentage": 5.03, "elapsed_time": "0:06:51", "remaining_time": "2:09:31"}
|
||||
{"current_steps": 160, "total_steps": 3080, "loss": 0.011, "lr": 2.0649350649350652e-05, "epoch": 0.36363636363636365, "percentage": 5.19, "elapsed_time": "0:07:05", "remaining_time": "2:09:27"}
|
||||
{"current_steps": 165, "total_steps": 3080, "loss": 0.0106, "lr": 2.12987012987013e-05, "epoch": 0.375, "percentage": 5.36, "elapsed_time": "0:07:18", "remaining_time": "2:09:07"}
|
||||
{"current_steps": 170, "total_steps": 3080, "loss": 0.0106, "lr": 2.194805194805195e-05, "epoch": 0.38636363636363635, "percentage": 5.52, "elapsed_time": "0:07:30", "remaining_time": "2:08:37"}
|
||||
{"current_steps": 175, "total_steps": 3080, "loss": 0.0103, "lr": 2.25974025974026e-05, "epoch": 0.3977272727272727, "percentage": 5.68, "elapsed_time": "0:07:43", "remaining_time": "2:08:09"}
|
||||
{"current_steps": 180, "total_steps": 3080, "loss": 0.0108, "lr": 2.324675324675325e-05, "epoch": 0.4090909090909091, "percentage": 5.84, "elapsed_time": "0:07:55", "remaining_time": "2:07:36"}
|
||||
{"current_steps": 185, "total_steps": 3080, "loss": 0.0113, "lr": 2.3896103896103898e-05, "epoch": 0.42045454545454547, "percentage": 6.01, "elapsed_time": "0:08:07", "remaining_time": "2:07:10"}
|
||||
{"current_steps": 190, "total_steps": 3080, "loss": 0.0106, "lr": 2.454545454545455e-05, "epoch": 0.4318181818181818, "percentage": 6.17, "elapsed_time": "0:08:19", "remaining_time": "2:06:38"}
|
||||
{"current_steps": 195, "total_steps": 3080, "loss": 0.011, "lr": 2.51948051948052e-05, "epoch": 0.4431818181818182, "percentage": 6.33, "elapsed_time": "0:08:33", "remaining_time": "2:06:37"}
|
||||
{"current_steps": 200, "total_steps": 3080, "loss": 0.0098, "lr": 2.5844155844155843e-05, "epoch": 0.45454545454545453, "percentage": 6.49, "elapsed_time": "0:08:45", "remaining_time": "2:06:04"}
|
||||
{"current_steps": 205, "total_steps": 3080, "loss": 0.0104, "lr": 2.6493506493506497e-05, "epoch": 0.4659090909090909, "percentage": 6.66, "elapsed_time": "0:10:06", "remaining_time": "2:21:45"}
|
||||
{"current_steps": 210, "total_steps": 3080, "loss": 0.0097, "lr": 2.7142857142857148e-05, "epoch": 0.4772727272727273, "percentage": 6.82, "elapsed_time": "0:10:20", "remaining_time": "2:21:15"}
{"current_steps": 215, "total_steps": 3080, "loss": 0.0101, "lr": 2.779220779220779e-05, "epoch": 0.48863636363636365, "percentage": 6.98, "elapsed_time": "0:10:33", "remaining_time": "2:20:39"}
{"current_steps": 220, "total_steps": 3080, "loss": 0.0103, "lr": 2.8441558441558442e-05, "epoch": 0.5, "percentage": 7.14, "elapsed_time": "0:10:47", "remaining_time": "2:20:17"}
{"current_steps": 225, "total_steps": 3080, "loss": 0.0103, "lr": 2.9090909090909093e-05, "epoch": 0.5113636363636364, "percentage": 7.31, "elapsed_time": "0:10:59", "remaining_time": "2:19:31"}
{"current_steps": 230, "total_steps": 3080, "loss": 0.0106, "lr": 2.9740259740259743e-05, "epoch": 0.5227272727272727, "percentage": 7.47, "elapsed_time": "0:11:12", "remaining_time": "2:18:52"}
{"current_steps": 235, "total_steps": 3080, "loss": 0.0098, "lr": 3.038961038961039e-05, "epoch": 0.5340909090909091, "percentage": 7.63, "elapsed_time": "0:11:25", "remaining_time": "2:18:17"}
{"current_steps": 240, "total_steps": 3080, "loss": 0.0098, "lr": 3.103896103896104e-05, "epoch": 0.5454545454545454, "percentage": 7.79, "elapsed_time": "0:11:38", "remaining_time": "2:17:41"}
{"current_steps": 245, "total_steps": 3080, "loss": 0.0098, "lr": 3.1688311688311695e-05, "epoch": 0.5568181818181818, "percentage": 7.95, "elapsed_time": "0:11:50", "remaining_time": "2:17:02"}
{"current_steps": 250, "total_steps": 3080, "loss": 0.0087, "lr": 3.233766233766234e-05, "epoch": 0.5681818181818182, "percentage": 8.12, "elapsed_time": "0:12:05", "remaining_time": "2:16:56"}
{"current_steps": 255, "total_steps": 3080, "loss": 0.0095, "lr": 3.298701298701299e-05, "epoch": 0.5795454545454546, "percentage": 8.28, "elapsed_time": "0:12:18", "remaining_time": "2:16:25"}
{"current_steps": 260, "total_steps": 3080, "loss": 0.0096, "lr": 3.363636363636364e-05, "epoch": 0.5909090909090909, "percentage": 8.44, "elapsed_time": "0:12:30", "remaining_time": "2:15:42"}
{"current_steps": 265, "total_steps": 3080, "loss": 0.0098, "lr": 3.4285714285714284e-05, "epoch": 0.6022727272727273, "percentage": 8.6, "elapsed_time": "0:12:43", "remaining_time": "2:15:06"}
{"current_steps": 270, "total_steps": 3080, "loss": 0.0096, "lr": 3.493506493506494e-05, "epoch": 0.6136363636363636, "percentage": 8.77, "elapsed_time": "0:12:55", "remaining_time": "2:14:34"}
{"current_steps": 275, "total_steps": 3080, "loss": 0.0092, "lr": 3.5584415584415585e-05, "epoch": 0.625, "percentage": 8.93, "elapsed_time": "0:13:07", "remaining_time": "2:13:56"}
{"current_steps": 280, "total_steps": 3080, "loss": 0.0088, "lr": 3.623376623376624e-05, "epoch": 0.6363636363636364, "percentage": 9.09, "elapsed_time": "0:13:20", "remaining_time": "2:13:20"}
{"current_steps": 285, "total_steps": 3080, "loss": 0.0088, "lr": 3.6883116883116886e-05, "epoch": 0.6477272727272727, "percentage": 9.25, "elapsed_time": "0:13:33", "remaining_time": "2:12:57"}
{"current_steps": 290, "total_steps": 3080, "loss": 0.009, "lr": 3.753246753246753e-05, "epoch": 0.6590909090909091, "percentage": 9.42, "elapsed_time": "0:13:47", "remaining_time": "2:12:37"}
{"current_steps": 295, "total_steps": 3080, "loss": 0.0091, "lr": 3.818181818181819e-05, "epoch": 0.6704545454545454, "percentage": 9.58, "elapsed_time": "0:13:59", "remaining_time": "2:12:07"}
{"current_steps": 300, "total_steps": 3080, "loss": 0.0089, "lr": 3.8831168831168834e-05, "epoch": 0.6818181818181818, "percentage": 9.74, "elapsed_time": "0:14:11", "remaining_time": "2:11:32"}
{"current_steps": 305, "total_steps": 3080, "loss": 0.0092, "lr": 3.948051948051948e-05, "epoch": 0.6931818181818182, "percentage": 9.9, "elapsed_time": "0:14:24", "remaining_time": "2:11:09"}
{"current_steps": 310, "total_steps": 3080, "loss": 0.0088, "lr": 3.999998715561701e-05, "epoch": 0.7045454545454546, "percentage": 10.06, "elapsed_time": "0:14:37", "remaining_time": "2:10:37"}
{"current_steps": 315, "total_steps": 3080, "loss": 0.0084, "lr": 3.999953760394435e-05, "epoch": 0.7159090909090909, "percentage": 10.23, "elapsed_time": "0:14:49", "remaining_time": "2:10:07"}
{"current_steps": 320, "total_steps": 3080, "loss": 0.0084, "lr": 3.99984458496195e-05, "epoch": 0.7272727272727273, "percentage": 10.39, "elapsed_time": "0:15:01", "remaining_time": "2:09:38"}
{"current_steps": 325, "total_steps": 3080, "loss": 0.0085, "lr": 3.999671192769966e-05, "epoch": 0.7386363636363636, "percentage": 10.55, "elapsed_time": "0:15:14", "remaining_time": "2:09:11"}
{"current_steps": 330, "total_steps": 3080, "loss": 0.0092, "lr": 3.999433589386259e-05, "epoch": 0.75, "percentage": 10.71, "elapsed_time": "0:15:27", "remaining_time": "2:08:45"}
{"current_steps": 335, "total_steps": 3080, "loss": 0.0089, "lr": 3.9991317824404785e-05, "epoch": 0.7613636363636364, "percentage": 10.88, "elapsed_time": "0:15:40", "remaining_time": "2:08:25"}
{"current_steps": 340, "total_steps": 3080, "loss": 0.0086, "lr": 3.9987657816239124e-05, "epoch": 0.7727272727272727, "percentage": 11.04, "elapsed_time": "0:15:52", "remaining_time": "2:07:54"}
{"current_steps": 345, "total_steps": 3080, "loss": 0.0085, "lr": 3.9983355986891664e-05, "epoch": 0.7840909090909091, "percentage": 11.2, "elapsed_time": "0:16:04", "remaining_time": "2:07:24"}
{"current_steps": 350, "total_steps": 3080, "loss": 0.0085, "lr": 3.99784124744979e-05, "epoch": 0.7954545454545454, "percentage": 11.36, "elapsed_time": "0:16:16", "remaining_time": "2:06:59"}
{"current_steps": 355, "total_steps": 3080, "loss": 0.0086, "lr": 3.997282743779835e-05, "epoch": 0.8068181818181818, "percentage": 11.53, "elapsed_time": "0:16:29", "remaining_time": "2:06:32"}
{"current_steps": 360, "total_steps": 3080, "loss": 0.0084, "lr": 3.996660105613343e-05, "epoch": 0.8181818181818182, "percentage": 11.69, "elapsed_time": "0:16:42", "remaining_time": "2:06:15"}
{"current_steps": 365, "total_steps": 3080, "loss": 0.008, "lr": 3.995973352943769e-05, "epoch": 0.8295454545454546, "percentage": 11.85, "elapsed_time": "0:16:55", "remaining_time": "2:05:53"}
{"current_steps": 370, "total_steps": 3080, "loss": 0.0083, "lr": 3.9952225078233435e-05, "epoch": 0.8409090909090909, "percentage": 12.01, "elapsed_time": "0:17:07", "remaining_time": "2:05:25"}
{"current_steps": 375, "total_steps": 3080, "loss": 0.0088, "lr": 3.9944075943623605e-05, "epoch": 0.8522727272727273, "percentage": 12.18, "elapsed_time": "0:17:20", "remaining_time": "2:05:04"}
{"current_steps": 380, "total_steps": 3080, "loss": 0.0086, "lr": 3.9935286387284035e-05, "epoch": 0.8636363636363636, "percentage": 12.34, "elapsed_time": "0:17:32", "remaining_time": "2:04:41"}
{"current_steps": 385, "total_steps": 3080, "loss": 0.0085, "lr": 3.9925856691455075e-05, "epoch": 0.875, "percentage": 12.5, "elapsed_time": "0:17:45", "remaining_time": "2:04:16"}
{"current_steps": 390, "total_steps": 3080, "loss": 0.0082, "lr": 3.9915787158932505e-05, "epoch": 0.8863636363636364, "percentage": 12.66, "elapsed_time": "0:17:56", "remaining_time": "2:03:46"}
{"current_steps": 395, "total_steps": 3080, "loss": 0.0081, "lr": 3.990507811305782e-05, "epoch": 0.8977272727272727, "percentage": 12.82, "elapsed_time": "0:18:08", "remaining_time": "2:03:21"}
{"current_steps": 400, "total_steps": 3080, "loss": 0.0079, "lr": 3.989372989770787e-05, "epoch": 0.9090909090909091, "percentage": 12.99, "elapsed_time": "0:18:20", "remaining_time": "2:02:54"}
{"current_steps": 405, "total_steps": 3080, "loss": 0.008, "lr": 3.988174287728376e-05, "epoch": 0.9204545454545454, "percentage": 13.15, "elapsed_time": "0:19:36", "remaining_time": "2:09:28"}
{"current_steps": 410, "total_steps": 3080, "loss": 0.0084, "lr": 3.986911743669923e-05, "epoch": 0.9318181818181818, "percentage": 13.31, "elapsed_time": "0:19:48", "remaining_time": "2:09:02"}
{"current_steps": 415, "total_steps": 3080, "loss": 0.0083, "lr": 3.9855853981368196e-05, "epoch": 0.9431818181818182, "percentage": 13.47, "elapsed_time": "0:19:59", "remaining_time": "2:08:25"}
{"current_steps": 420, "total_steps": 3080, "loss": 0.0083, "lr": 3.984195293719182e-05, "epoch": 0.9545454545454546, "percentage": 13.64, "elapsed_time": "0:20:13", "remaining_time": "2:08:04"}
{"current_steps": 425, "total_steps": 3080, "loss": 0.008, "lr": 3.982741475054481e-05, "epoch": 0.9659090909090909, "percentage": 13.8, "elapsed_time": "0:20:26", "remaining_time": "2:07:42"}
{"current_steps": 430, "total_steps": 3080, "loss": 0.0076, "lr": 3.9812239888261054e-05, "epoch": 0.9772727272727273, "percentage": 13.96, "elapsed_time": "0:20:37", "remaining_time": "2:07:04"}
{"current_steps": 435, "total_steps": 3080, "loss": 0.0082, "lr": 3.979642883761866e-05, "epoch": 0.9886363636363636, "percentage": 14.12, "elapsed_time": "0:20:50", "remaining_time": "2:06:45"}
{"current_steps": 440, "total_steps": 3080, "loss": 0.0077, "lr": 3.9779982106324284e-05, "epoch": 1.0, "percentage": 14.29, "elapsed_time": "0:21:03", "remaining_time": "2:06:18"}
{"current_steps": 445, "total_steps": 3080, "loss": 0.0073, "lr": 3.976290022249687e-05, "epoch": 1.0113636363636365, "percentage": 14.45, "elapsed_time": "0:21:14", "remaining_time": "2:05:49"}
{"current_steps": 450, "total_steps": 3080, "loss": 0.0071, "lr": 3.974518373465066e-05, "epoch": 1.0227272727272727, "percentage": 14.61, "elapsed_time": "0:21:27", "remaining_time": "2:05:23"}
{"current_steps": 455, "total_steps": 3080, "loss": 0.0075, "lr": 3.9726833211677576e-05, "epoch": 1.0340909090909092, "percentage": 14.77, "elapsed_time": "0:21:39", "remaining_time": "2:04:58"}
{"current_steps": 460, "total_steps": 3080, "loss": 0.0073, "lr": 3.970784924282896e-05, "epoch": 1.0454545454545454, "percentage": 14.94, "elapsed_time": "0:21:51", "remaining_time": "2:04:27"}
{"current_steps": 465, "total_steps": 3080, "loss": 0.0074, "lr": 3.968823243769667e-05, "epoch": 1.0568181818181819, "percentage": 15.1, "elapsed_time": "0:22:04", "remaining_time": "2:04:07"}
{"current_steps": 470, "total_steps": 3080, "loss": 0.0075, "lr": 3.9667983426193485e-05, "epoch": 1.0681818181818181, "percentage": 15.26, "elapsed_time": "0:22:16", "remaining_time": "2:03:40"}
{"current_steps": 475, "total_steps": 3080, "loss": 0.0077, "lr": 3.964710285853287e-05, "epoch": 1.0795454545454546, "percentage": 15.42, "elapsed_time": "0:22:28", "remaining_time": "2:03:16"}
{"current_steps": 480, "total_steps": 3080, "loss": 0.0076, "lr": 3.9625591405208145e-05, "epoch": 1.0909090909090908, "percentage": 15.58, "elapsed_time": "0:22:41", "remaining_time": "2:02:52"}
{"current_steps": 485, "total_steps": 3080, "loss": 0.0072, "lr": 3.9603449756970877e-05, "epoch": 1.1022727272727273, "percentage": 15.75, "elapsed_time": "0:22:53", "remaining_time": "2:02:28"}
{"current_steps": 490, "total_steps": 3080, "loss": 0.0074, "lr": 3.958067862480878e-05, "epoch": 1.1136363636363635, "percentage": 15.91, "elapsed_time": "0:23:07", "remaining_time": "2:02:11"}
{"current_steps": 495, "total_steps": 3080, "loss": 0.0069, "lr": 3.955727873992283e-05, "epoch": 1.125, "percentage": 16.07, "elapsed_time": "0:23:19", "remaining_time": "2:01:47"}
{"current_steps": 500, "total_steps": 3080, "loss": 0.0072, "lr": 3.95332508537038e-05, "epoch": 1.1363636363636362, "percentage": 16.23, "elapsed_time": "0:23:31", "remaining_time": "2:01:25"}
{"current_steps": 505, "total_steps": 3080, "loss": 0.0071, "lr": 3.950859573770815e-05, "epoch": 1.1477272727272727, "percentage": 16.4, "elapsed_time": "0:23:44", "remaining_time": "2:01:01"}
{"current_steps": 510, "total_steps": 3080, "loss": 0.0068, "lr": 3.9483314183633206e-05, "epoch": 1.1590909090909092, "percentage": 16.56, "elapsed_time": "0:23:56", "remaining_time": "2:00:40"}
{"current_steps": 515, "total_steps": 3080, "loss": 0.007, "lr": 3.9457407003291826e-05, "epoch": 1.1704545454545454, "percentage": 16.72, "elapsed_time": "0:24:09", "remaining_time": "2:00:16"}
{"current_steps": 520, "total_steps": 3080, "loss": 0.0072, "lr": 3.943087502858621e-05, "epoch": 1.1818181818181819, "percentage": 16.88, "elapsed_time": "0:24:19", "remaining_time": "1:59:46"}
{"current_steps": 525, "total_steps": 3080, "loss": 0.0077, "lr": 3.9403719111481295e-05, "epoch": 1.1931818181818181, "percentage": 17.05, "elapsed_time": "0:24:33", "remaining_time": "1:59:28"}
{"current_steps": 530, "total_steps": 3080, "loss": 0.0079, "lr": 3.937594012397734e-05, "epoch": 1.2045454545454546, "percentage": 17.21, "elapsed_time": "0:24:45", "remaining_time": "1:59:09"}
{"current_steps": 535, "total_steps": 3080, "loss": 0.0074, "lr": 3.934753895808193e-05, "epoch": 1.2159090909090908, "percentage": 17.37, "elapsed_time": "0:24:57", "remaining_time": "1:58:43"}
{"current_steps": 540, "total_steps": 3080, "loss": 0.0073, "lr": 3.931851652578137e-05, "epoch": 1.2272727272727273, "percentage": 17.53, "elapsed_time": "0:25:09", "remaining_time": "1:58:18"}
{"current_steps": 545, "total_steps": 3080, "loss": 0.0073, "lr": 3.928887375901134e-05, "epoch": 1.2386363636363638, "percentage": 17.69, "elapsed_time": "0:25:21", "remaining_time": "1:57:57"}
{"current_steps": 550, "total_steps": 3080, "loss": 0.007, "lr": 3.9258611609627035e-05, "epoch": 1.25, "percentage": 17.86, "elapsed_time": "0:25:33", "remaining_time": "1:57:34"}
{"current_steps": 555, "total_steps": 3080, "loss": 0.0072, "lr": 3.922773104937254e-05, "epoch": 1.2613636363636362, "percentage": 18.02, "elapsed_time": "0:25:46", "remaining_time": "1:57:14"}
{"current_steps": 560, "total_steps": 3080, "loss": 0.0073, "lr": 3.919623306984967e-05, "epoch": 1.2727272727272727, "percentage": 18.18, "elapsed_time": "0:25:58", "remaining_time": "1:56:51"}
{"current_steps": 565, "total_steps": 3080, "loss": 0.0072, "lr": 3.91641186824861e-05, "epoch": 1.2840909090909092, "percentage": 18.34, "elapsed_time": "0:26:11", "remaining_time": "1:56:33"}
{"current_steps": 570, "total_steps": 3080, "loss": 0.0073, "lr": 3.9131388918502914e-05, "epoch": 1.2954545454545454, "percentage": 18.51, "elapsed_time": "0:26:23", "remaining_time": "1:56:13"}
{"current_steps": 575, "total_steps": 3080, "loss": 0.0073, "lr": 3.9098044828881476e-05, "epoch": 1.3068181818181819, "percentage": 18.67, "elapsed_time": "0:26:34", "remaining_time": "1:55:48"}
{"current_steps": 580, "total_steps": 3080, "loss": 0.0072, "lr": 3.906408748432968e-05, "epoch": 1.3181818181818181, "percentage": 18.83, "elapsed_time": "0:26:46", "remaining_time": "1:55:24"}
{"current_steps": 585, "total_steps": 3080, "loss": 0.0072, "lr": 3.902951797524757e-05, "epoch": 1.3295454545454546, "percentage": 18.99, "elapsed_time": "0:26:59", "remaining_time": "1:55:05"}
{"current_steps": 590, "total_steps": 3080, "loss": 0.0072, "lr": 3.899433741169233e-05, "epoch": 1.3409090909090908, "percentage": 19.16, "elapsed_time": "0:27:11", "remaining_time": "1:54:44"}
{"current_steps": 595, "total_steps": 3080, "loss": 0.0073, "lr": 3.895854692334264e-05, "epoch": 1.3522727272727273, "percentage": 19.32, "elapsed_time": "0:27:24", "remaining_time": "1:54:27"}
{"current_steps": 600, "total_steps": 3080, "loss": 0.0069, "lr": 3.892214765946239e-05, "epoch": 1.3636363636363638, "percentage": 19.48, "elapsed_time": "0:27:38", "remaining_time": "1:54:14"}
{"current_steps": 605, "total_steps": 3080, "loss": 0.0071, "lr": 3.8885140788863814e-05, "epoch": 1.375, "percentage": 19.64, "elapsed_time": "0:28:50", "remaining_time": "1:57:57"}
{"current_steps": 610, "total_steps": 3080, "loss": 0.0073, "lr": 3.8847527499869884e-05, "epoch": 1.3863636363636362, "percentage": 19.81, "elapsed_time": "0:29:02", "remaining_time": "1:57:35"}
{"current_steps": 615, "total_steps": 3080, "loss": 0.0072, "lr": 3.8809309000276234e-05, "epoch": 1.3977272727272727, "percentage": 19.97, "elapsed_time": "0:29:14", "remaining_time": "1:57:12"}
{"current_steps": 620, "total_steps": 3080, "loss": 0.0068, "lr": 3.877048651731232e-05, "epoch": 1.4090909090909092, "percentage": 20.13, "elapsed_time": "0:29:26", "remaining_time": "1:56:49"}
{"current_steps": 625, "total_steps": 3080, "loss": 0.0074, "lr": 3.873106129760206e-05, "epoch": 1.4204545454545454, "percentage": 20.29, "elapsed_time": "0:29:37", "remaining_time": "1:56:23"}
{"current_steps": 630, "total_steps": 3080, "loss": 0.0075, "lr": 3.8691034607123725e-05, "epoch": 1.4318181818181819, "percentage": 20.45, "elapsed_time": "0:29:49", "remaining_time": "1:56:00"}
{"current_steps": 635, "total_steps": 3080, "loss": 0.0073, "lr": 3.8650407731169395e-05, "epoch": 1.4431818181818181, "percentage": 20.62, "elapsed_time": "0:30:01", "remaining_time": "1:55:35"}
{"current_steps": 640, "total_steps": 3080, "loss": 0.0073, "lr": 3.8609181974303596e-05, "epoch": 1.4545454545454546, "percentage": 20.78, "elapsed_time": "0:30:13", "remaining_time": "1:55:13"}
{"current_steps": 645, "total_steps": 3080, "loss": 0.0069, "lr": 3.856735866032145e-05, "epoch": 1.4659090909090908, "percentage": 20.94, "elapsed_time": "0:30:24", "remaining_time": "1:54:49"}
{"current_steps": 650, "total_steps": 3080, "loss": 0.0069, "lr": 3.852493913220618e-05, "epoch": 1.4772727272727273, "percentage": 21.1, "elapsed_time": "0:30:37", "remaining_time": "1:54:28"}
{"current_steps": 655, "total_steps": 3080, "loss": 0.0071, "lr": 3.8481924752085935e-05, "epoch": 1.4886363636363638, "percentage": 21.27, "elapsed_time": "0:30:49", "remaining_time": "1:54:08"}
{"current_steps": 660, "total_steps": 3080, "loss": 0.0069, "lr": 3.84383169011901e-05, "epoch": 1.5, "percentage": 21.43, "elapsed_time": "0:31:02", "remaining_time": "1:53:50"}
{"current_steps": 665, "total_steps": 3080, "loss": 0.007, "lr": 3.839411697980493e-05, "epoch": 1.5113636363636362, "percentage": 21.59, "elapsed_time": "0:31:14", "remaining_time": "1:53:27"}
{"current_steps": 670, "total_steps": 3080, "loss": 0.0072, "lr": 3.834932640722857e-05, "epoch": 1.5227272727272727, "percentage": 21.75, "elapsed_time": "0:31:25", "remaining_time": "1:53:01"}
{"current_steps": 675, "total_steps": 3080, "loss": 0.0069, "lr": 3.830394662172551e-05, "epoch": 1.5340909090909092, "percentage": 21.92, "elapsed_time": "0:31:37", "remaining_time": "1:52:42"}
{"current_steps": 680, "total_steps": 3080, "loss": 0.0071, "lr": 3.8257979080480356e-05, "epoch": 1.5454545454545454, "percentage": 22.08, "elapsed_time": "0:31:48", "remaining_time": "1:52:17"}
{"current_steps": 685, "total_steps": 3080, "loss": 0.007, "lr": 3.821142525955109e-05, "epoch": 1.5568181818181817, "percentage": 22.24, "elapsed_time": "0:32:00", "remaining_time": "1:51:54"}
{"current_steps": 690, "total_steps": 3080, "loss": 0.0071, "lr": 3.8164286653821633e-05, "epoch": 1.5681818181818183, "percentage": 22.4, "elapsed_time": "0:32:12", "remaining_time": "1:51:32"}
{"current_steps": 695, "total_steps": 3080, "loss": 0.0066, "lr": 3.811656477695385e-05, "epoch": 1.5795454545454546, "percentage": 22.56, "elapsed_time": "0:32:28", "remaining_time": "1:51:27"}
{"current_steps": 700, "total_steps": 3080, "loss": 0.0072, "lr": 3.806826116133898e-05, "epoch": 1.5909090909090908, "percentage": 22.73, "elapsed_time": "0:32:41", "remaining_time": "1:51:07"}
{"current_steps": 705, "total_steps": 3080, "loss": 0.0072, "lr": 3.801937735804838e-05, "epoch": 1.6022727272727273, "percentage": 22.89, "elapsed_time": "0:32:52", "remaining_time": "1:50:46"}
{"current_steps": 710, "total_steps": 3080, "loss": 0.0074, "lr": 3.7969914936783754e-05, "epoch": 1.6136363636363638, "percentage": 23.05, "elapsed_time": "0:33:06", "remaining_time": "1:50:30"}
{"current_steps": 715, "total_steps": 3080, "loss": 0.0069, "lr": 3.791987548582672e-05, "epoch": 1.625, "percentage": 23.21, "elapsed_time": "0:33:17", "remaining_time": "1:50:08"}
{"current_steps": 720, "total_steps": 3080, "loss": 0.0075, "lr": 3.7869260611987834e-05, "epoch": 1.6363636363636362, "percentage": 23.38, "elapsed_time": "0:33:30", "remaining_time": "1:49:51"}
{"current_steps": 725, "total_steps": 3080, "loss": 0.0078, "lr": 3.781807194055499e-05, "epoch": 1.6477272727272727, "percentage": 23.54, "elapsed_time": "0:33:44", "remaining_time": "1:49:34"}
{"current_steps": 730, "total_steps": 3080, "loss": 0.0076, "lr": 3.776631111524121e-05, "epoch": 1.6590909090909092, "percentage": 23.7, "elapsed_time": "0:33:58", "remaining_time": "1:49:20"}
{"current_steps": 735, "total_steps": 3080, "loss": 0.0073, "lr": 3.7713979798131886e-05, "epoch": 1.6704545454545454, "percentage": 23.86, "elapsed_time": "0:34:09", "remaining_time": "1:48:58"}
{"current_steps": 740, "total_steps": 3080, "loss": 0.0066, "lr": 3.766107966963141e-05, "epoch": 1.6818181818181817, "percentage": 24.03, "elapsed_time": "0:34:22", "remaining_time": "1:48:40"}
{"current_steps": 745, "total_steps": 3080, "loss": 0.0068, "lr": 3.760761242840918e-05, "epoch": 1.6931818181818183, "percentage": 24.19, "elapsed_time": "0:34:33", "remaining_time": "1:48:18"}
{"current_steps": 750, "total_steps": 3080, "loss": 0.0071, "lr": 3.755357979134511e-05, "epoch": 1.7045454545454546, "percentage": 24.35, "elapsed_time": "0:34:45", "remaining_time": "1:48:00"}
{"current_steps": 755, "total_steps": 3080, "loss": 0.0072, "lr": 3.749898349347446e-05, "epoch": 1.7159090909090908, "percentage": 24.51, "elapsed_time": "0:34:58", "remaining_time": "1:47:42"}
{"current_steps": 760, "total_steps": 3080, "loss": 0.007, "lr": 3.744382528793211e-05, "epoch": 1.7272727272727273, "percentage": 24.68, "elapsed_time": "0:35:10", "remaining_time": "1:47:22"}
{"current_steps": 765, "total_steps": 3080, "loss": 0.0068, "lr": 3.738810694589631e-05, "epoch": 1.7386363636363638, "percentage": 24.84, "elapsed_time": "0:35:22", "remaining_time": "1:47:03"}
{"current_steps": 770, "total_steps": 3080, "loss": 0.0072, "lr": 3.733183025653178e-05, "epoch": 1.75, "percentage": 25.0, "elapsed_time": "0:35:34", "remaining_time": "1:46:44"}
{"current_steps": 775, "total_steps": 3080, "loss": 0.0066, "lr": 3.7274997026932256e-05, "epoch": 1.7613636363636362, "percentage": 25.16, "elapsed_time": "0:35:45", "remaining_time": "1:46:22"}
{"current_steps": 780, "total_steps": 3080, "loss": 0.0066, "lr": 3.721760908206247e-05, "epoch": 1.7727272727272727, "percentage": 25.32, "elapsed_time": "0:35:57", "remaining_time": "1:46:00"}
{"current_steps": 785, "total_steps": 3080, "loss": 0.0069, "lr": 3.7159668264699546e-05, "epoch": 1.7840909090909092, "percentage": 25.49, "elapsed_time": "0:36:09", "remaining_time": "1:45:41"}
{"current_steps": 790, "total_steps": 3080, "loss": 0.0066, "lr": 3.710117643537383e-05, "epoch": 1.7954545454545454, "percentage": 25.65, "elapsed_time": "0:36:22", "remaining_time": "1:45:25"}
{"current_steps": 795, "total_steps": 3080, "loss": 0.0067, "lr": 3.7042135472309134e-05, "epoch": 1.8068181818181817, "percentage": 25.81, "elapsed_time": "0:36:33", "remaining_time": "1:45:03"}
{"current_steps": 800, "total_steps": 3080, "loss": 0.0067, "lr": 3.698254727136245e-05, "epoch": 1.8181818181818183, "percentage": 25.97, "elapsed_time": "0:36:46", "remaining_time": "1:44:48"}
{"current_steps": 805, "total_steps": 3080, "loss": 0.0071, "lr": 3.692241374596306e-05, "epoch": 1.8295454545454546, "percentage": 26.14, "elapsed_time": "0:37:56", "remaining_time": "1:47:14"}
{"current_steps": 810, "total_steps": 3080, "loss": 0.0072, "lr": 3.6861736827051066e-05, "epoch": 1.8409090909090908, "percentage": 26.3, "elapsed_time": "0:38:09", "remaining_time": "1:46:55"}
{"current_steps": 815, "total_steps": 3080, "loss": 0.0069, "lr": 3.680051846301543e-05, "epoch": 1.8522727272727273, "percentage": 26.46, "elapsed_time": "0:38:21", "remaining_time": "1:46:36"}
{"current_steps": 820, "total_steps": 3080, "loss": 0.0071, "lr": 3.67387606196314e-05, "epoch": 1.8636363636363638, "percentage": 26.62, "elapsed_time": "0:38:32", "remaining_time": "1:46:14"}
{"current_steps": 825, "total_steps": 3080, "loss": 0.007, "lr": 3.6676465279997343e-05, "epoch": 1.875, "percentage": 26.79, "elapsed_time": "0:38:44", "remaining_time": "1:45:53"}
{"current_steps": 830, "total_steps": 3080, "loss": 0.0072, "lr": 3.6613634444471145e-05, "epoch": 1.8863636363636362, "percentage": 26.95, "elapsed_time": "0:38:58", "remaining_time": "1:45:38"}
{"current_steps": 835, "total_steps": 3080, "loss": 0.0072, "lr": 3.655027013060591e-05, "epoch": 1.8977272727272727, "percentage": 27.11, "elapsed_time": "0:39:09", "remaining_time": "1:45:16"}
{"current_steps": 840, "total_steps": 3080, "loss": 0.0067, "lr": 3.648637437308519e-05, "epoch": 1.9090909090909092, "percentage": 27.27, "elapsed_time": "0:39:21", "remaining_time": "1:44:56"}
{"current_steps": 845, "total_steps": 3080, "loss": 0.0069, "lr": 3.642194922365766e-05, "epoch": 1.9204545454545454, "percentage": 27.44, "elapsed_time": "0:39:32", "remaining_time": "1:44:34"}
{"current_steps": 850, "total_steps": 3080, "loss": 0.0068, "lr": 3.635699675107126e-05, "epoch": 1.9318181818181817, "percentage": 27.6, "elapsed_time": "0:39:44", "remaining_time": "1:44:16"}
{"current_steps": 855, "total_steps": 3080, "loss": 0.0065, "lr": 3.629151904100672e-05, "epoch": 1.9431818181818183, "percentage": 27.76, "elapsed_time": "0:39:57", "remaining_time": "1:43:58"}
{"current_steps": 860, "total_steps": 3080, "loss": 0.0065, "lr": 3.622551819601058e-05, "epoch": 1.9545454545454546, "percentage": 27.92, "elapsed_time": "0:40:08", "remaining_time": "1:43:37"}
{"current_steps": 865, "total_steps": 3080, "loss": 0.0066, "lr": 3.615899633542775e-05, "epoch": 1.9659090909090908, "percentage": 28.08, "elapsed_time": "0:40:19", "remaining_time": "1:43:16"}
{"current_steps": 870, "total_steps": 3080, "loss": 0.0068, "lr": 3.609195559533337e-05, "epoch": 1.9772727272727273, "percentage": 28.25, "elapsed_time": "0:40:31", "remaining_time": "1:42:56"}
{"current_steps": 875, "total_steps": 3080, "loss": 0.0073, "lr": 3.6024398128464264e-05, "epoch": 1.9886363636363638, "percentage": 28.41, "elapsed_time": "0:40:42", "remaining_time": "1:42:35"}
{"current_steps": 880, "total_steps": 3080, "loss": 0.0067, "lr": 3.595632610414981e-05, "epoch": 2.0, "percentage": 28.57, "elapsed_time": "0:40:54", "remaining_time": "1:42:16"}
{"current_steps": 885, "total_steps": 3080, "loss": 0.0062, "lr": 3.588774170824225e-05, "epoch": 2.0113636363636362, "percentage": 28.73, "elapsed_time": "0:41:05", "remaining_time": "1:41:54"}
{"current_steps": 890, "total_steps": 3080, "loss": 0.0065, "lr": 3.581864714304659e-05, "epoch": 2.022727272727273, "percentage": 28.9, "elapsed_time": "0:41:16", "remaining_time": "1:41:33"}
{"current_steps": 895, "total_steps": 3080, "loss": 0.0061, "lr": 3.5749044627249744e-05, "epoch": 2.034090909090909, "percentage": 29.06, "elapsed_time": "0:41:28", "remaining_time": "1:41:14"}
{"current_steps": 900, "total_steps": 3080, "loss": 0.0063, "lr": 3.56789363958494e-05, "epoch": 2.0454545454545454, "percentage": 29.22, "elapsed_time": "0:41:40", "remaining_time": "1:40:57"}
{"current_steps": 905, "total_steps": 3080, "loss": 0.0066, "lr": 3.560832470008223e-05, "epoch": 2.0568181818181817, "percentage": 29.38, "elapsed_time": "0:41:53", "remaining_time": "1:40:39"}
{"current_steps": 910, "total_steps": 3080, "loss": 0.0062, "lr": 3.553721180735157e-05, "epoch": 2.0681818181818183, "percentage": 29.55, "elapsed_time": "0:42:04", "remaining_time": "1:40:19"}
{"current_steps": 915, "total_steps": 3080, "loss": 0.0064, "lr": 3.546560000115461e-05, "epoch": 2.0795454545454546, "percentage": 29.71, "elapsed_time": "0:42:15", "remaining_time": "1:40:00"}
{"current_steps": 920, "total_steps": 3080, "loss": 0.006, "lr": 3.539349158100912e-05, "epoch": 2.090909090909091, "percentage": 29.87, "elapsed_time": "0:42:27", "remaining_time": "1:39:42"}
{"current_steps": 925, "total_steps": 3080, "loss": 0.0066, "lr": 3.532088886237956e-05, "epoch": 2.102272727272727, "percentage": 30.03, "elapsed_time": "0:42:40", "remaining_time": "1:39:25"}
{"current_steps": 930, "total_steps": 3080, "loss": 0.0061, "lr": 3.524779417660277e-05, "epoch": 2.1136363636363638, "percentage": 30.19, "elapsed_time": "0:42:52", "remaining_time": "1:39:06"}
{"current_steps": 935, "total_steps": 3080, "loss": 0.0062, "lr": 3.517420987081304e-05, "epoch": 2.125, "percentage": 30.36, "elapsed_time": "0:43:03", "remaining_time": "1:38:46"}
{"current_steps": 940, "total_steps": 3080, "loss": 0.006, "lr": 3.510013830786685e-05, "epoch": 2.1363636363636362, "percentage": 30.52, "elapsed_time": "0:43:14", "remaining_time": "1:38:27"}
{"current_steps": 945, "total_steps": 3080, "loss": 0.0062, "lr": 3.502558186626687e-05, "epoch": 2.147727272727273, "percentage": 30.68, "elapsed_time": "0:43:26", "remaining_time": "1:38:08"}
{"current_steps": 950, "total_steps": 3080, "loss": 0.0063, "lr": 3.4950542940085695e-05, "epoch": 2.159090909090909, "percentage": 30.84, "elapsed_time": "0:43:37", "remaining_time": "1:37:49"}
{"current_steps": 955, "total_steps": 3080, "loss": 0.006, "lr": 3.48750239388889e-05, "epoch": 2.1704545454545454, "percentage": 31.01, "elapsed_time": "0:43:49", "remaining_time": "1:37:30"}
{"current_steps": 960, "total_steps": 3080, "loss": 0.0061, "lr": 3.479902728765768e-05, "epoch": 2.1818181818181817, "percentage": 31.17, "elapsed_time": "0:44:00", "remaining_time": "1:37:10"}
{"current_steps": 965, "total_steps": 3080, "loss": 0.0061, "lr": 3.4722555426711017e-05, "epoch": 2.1931818181818183, "percentage": 31.33, "elapsed_time": "0:44:12", "remaining_time": "1:36:52"}
{"current_steps": 970, "total_steps": 3080, "loss": 0.006, "lr": 3.464561081162728e-05, "epoch": 2.2045454545454546, "percentage": 31.49, "elapsed_time": "0:44:23", "remaining_time": "1:36:34"}
{"current_steps": 975, "total_steps": 3080, "loss": 0.006, "lr": 3.456819591316539e-05, "epoch": 2.215909090909091, "percentage": 31.66, "elapsed_time": "0:44:35", "remaining_time": "1:36:17"}
{"current_steps": 980, "total_steps": 3080, "loss": 0.0063, "lr": 3.4490313217185454e-05, "epoch": 2.227272727272727, "percentage": 31.82, "elapsed_time": "0:44:47", "remaining_time": "1:35:58"}
{"current_steps": 985, "total_steps": 3080, "loss": 0.0062, "lr": 3.4411965224569006e-05, "epoch": 2.2386363636363638, "percentage": 31.98, "elapsed_time": "0:45:00", "remaining_time": "1:35:43"}
{"current_steps": 990, "total_steps": 3080, "loss": 0.0067, "lr": 3.4333154451138644e-05, "epoch": 2.25, "percentage": 32.14, "elapsed_time": "0:45:11", "remaining_time": "1:35:24"}
{"current_steps": 995, "total_steps": 3080, "loss": 0.0061, "lr": 3.4253883427577266e-05, "epoch": 2.2613636363636362, "percentage": 32.31, "elapsed_time": "0:45:23", "remaining_time": "1:35:07"}
{"current_steps": 1000, "total_steps": 3080, "loss": 0.006, "lr": 3.417415469934678e-05, "epoch": 2.2727272727272725, "percentage": 32.47, "elapsed_time": "0:45:35", "remaining_time": "1:34:50"}
{"current_steps": 1005, "total_steps": 3080, "loss": 0.0059, "lr": 3.409397082660643e-05, "epoch": 2.284090909090909, "percentage": 32.63, "elapsed_time": "0:46:49", "remaining_time": "1:36:40"}
{"current_steps": 1010, "total_steps": 3080, "loss": 0.0062, "lr": 3.401333438413053e-05, "epoch": 2.2954545454545454, "percentage": 32.79, "elapsed_time": "0:47:03", "remaining_time": "1:36:27"}
{"current_steps": 1015, "total_steps": 3080, "loss": 0.0062, "lr": 3.3932247961225805e-05, "epoch": 2.3068181818181817, "percentage": 32.95, "elapsed_time": "0:47:14", "remaining_time": "1:36:07"}
{"current_steps": 1020, "total_steps": 3080, "loss": 0.006, "lr": 3.385071416164824e-05, "epoch": 2.3181818181818183, "percentage": 33.12, "elapsed_time": "0:47:27", "remaining_time": "1:35:50"}
{"current_steps": 1025, "total_steps": 3080, "loss": 0.0063, "lr": 3.376873560351948e-05, "epoch": 2.3295454545454546, "percentage": 33.28, "elapsed_time": "0:47:38", "remaining_time": "1:35:30"}
{"current_steps": 1030, "total_steps": 3080, "loss": 0.0062, "lr": 3.368631491924277e-05, "epoch": 2.340909090909091, "percentage": 33.44, "elapsed_time": "0:47:49", "remaining_time": "1:35:12"}
{"current_steps": 1035, "total_steps": 3080, "loss": 0.0065, "lr": 3.360345475541839e-05, "epoch": 2.3522727272727275, "percentage": 33.6, "elapsed_time": "0:48:01", "remaining_time": "1:34:53"}
{"current_steps": 1040, "total_steps": 3080, "loss": 0.0062, "lr": 3.3520157772758716e-05, "epoch": 2.3636363636363638, "percentage": 33.77, "elapsed_time": "0:48:12", "remaining_time": "1:34:34"}
{"current_steps": 1045, "total_steps": 3080, "loss": 0.0061, "lr": 3.343642664600273e-05, "epoch": 2.375, "percentage": 33.93, "elapsed_time": "0:48:24", "remaining_time": "1:34:16"}
{"current_steps": 1050, "total_steps": 3080, "loss": 0.0062, "lr": 3.3352264063830184e-05, "epoch": 2.3863636363636362, "percentage": 34.09, "elapsed_time": "0:48:36", "remaining_time": "1:33:59"}
{"current_steps": 1055, "total_steps": 3080, "loss": 0.0062, "lr": 3.3267672728775245e-05, "epoch": 2.3977272727272725, "percentage": 34.25, "elapsed_time": "0:48:48", "remaining_time": "1:33:41"}
{"current_steps": 1060, "total_steps": 3080, "loss": 0.0064, "lr": 3.3182655357139686e-05, "epoch": 2.409090909090909, "percentage": 34.42, "elapsed_time": "0:49:01", "remaining_time": "1:33:25"}
{"current_steps": 1065, "total_steps": 3080, "loss": 0.0061, "lr": 3.309721467890571e-05, "epoch": 2.4204545454545454, "percentage": 34.58, "elapsed_time": "0:49:12", "remaining_time": "1:33:06"}
{"current_steps": 1070, "total_steps": 3080, "loss": 0.0059, "lr": 3.301135343764824e-05, "epoch": 2.4318181818181817, "percentage": 34.74, "elapsed_time": "0:49:24", "remaining_time": "1:32:49"}
{"current_steps": 1075, "total_steps": 3080, "loss": 0.0059, "lr": 3.292507439044689e-05, "epoch": 2.4431818181818183, "percentage": 34.9, "elapsed_time": "0:49:36", "remaining_time": "1:32:30"}
{"current_steps": 1080, "total_steps": 3080, "loss": 0.0063, "lr": 3.283838030779733e-05, "epoch": 2.4545454545454546, "percentage": 35.06, "elapsed_time": "0:49:47", "remaining_time": "1:32:12"}
{"current_steps": 1085, "total_steps": 3080, "loss": 0.0061, "lr": 3.275127397352243e-05, "epoch": 2.465909090909091, "percentage": 35.23, "elapsed_time": "0:49:59", "remaining_time": "1:31:54"}
{"current_steps": 1090, "total_steps": 3080, "loss": 0.0065, "lr": 3.2663758184682804e-05, "epoch": 2.4772727272727275, "percentage": 35.39, "elapsed_time": "0:50:11", "remaining_time": "1:31:38"}
{"current_steps": 1095, "total_steps": 3080, "loss": 0.0063, "lr": 3.257583575148699e-05, "epoch": 2.4886363636363638, "percentage": 35.55, "elapsed_time": "0:50:23", "remaining_time": "1:31:20"}
{"current_steps": 1100, "total_steps": 3080, "loss": 0.006, "lr": 3.2487509497201274e-05, "epoch": 2.5, "percentage": 35.71, "elapsed_time": "0:50:36", "remaining_time": "1:31:05"}
{"current_steps": 1105, "total_steps": 3080, "loss": 0.0066, "lr": 3.239878225805895e-05, "epoch": 2.5113636363636362, "percentage": 35.88, "elapsed_time": "0:50:50", "remaining_time": "1:30:51"}
{"current_steps": 1110, "total_steps": 3080, "loss": 0.006, "lr": 3.230965688316931e-05, "epoch": 2.5227272727272725, "percentage": 36.04, "elapsed_time": "0:51:03", "remaining_time": "1:30:37"}
{"current_steps": 1115, "total_steps": 3080, "loss": 0.006, "lr": 3.222013623442613e-05, "epoch": 2.534090909090909, "percentage": 36.2, "elapsed_time": "0:51:15", "remaining_time": "1:30:19"}
{"current_steps": 1120, "total_steps": 3080, "loss": 0.0053, "lr": 3.2130223186415766e-05, "epoch": 2.5454545454545454, "percentage": 36.36, "elapsed_time": "0:51:27", "remaining_time": "1:30:02"}
{"current_steps": 1125, "total_steps": 3080, "loss": 0.0059, "lr": 3.203992062632487e-05, "epoch": 2.5568181818181817, "percentage": 36.53, "elapsed_time": "0:51:38", "remaining_time": "1:29:43"}
{"current_steps": 1130, "total_steps": 3080, "loss": 0.006, "lr": 3.194923145384766e-05, "epoch": 2.5681818181818183, "percentage": 36.69, "elapsed_time": "0:51:50", "remaining_time": "1:29:27"}
{"current_steps": 1135, "total_steps": 3080, "loss": 0.0057, "lr": 3.1858158581092816e-05, "epoch": 2.5795454545454546, "percentage": 36.85, "elapsed_time": "0:52:03", "remaining_time": "1:29:12"}
{"current_steps": 1140, "total_steps": 3080, "loss": 0.0064, "lr": 3.176670493248997e-05, "epoch": 2.590909090909091, "percentage": 37.01, "elapsed_time": "0:52:15", "remaining_time": "1:28:55"}
{"current_steps": 1145, "total_steps": 3080, "loss": 0.0062, "lr": 3.1674873444695804e-05, "epoch": 2.6022727272727275, "percentage": 37.18, "elapsed_time": "0:52:26", "remaining_time": "1:28:36"}
{"current_steps": 1150, "total_steps": 3080, "loss": 0.0068, "lr": 3.158266706649974e-05, "epoch": 2.6136363636363638, "percentage": 37.34, "elapsed_time": "0:52:38", "remaining_time": "1:28:20"}
{"current_steps": 1155, "total_steps": 3080, "loss": 0.0065, "lr": 3.1490088758729274e-05, "epoch": 2.625, "percentage": 37.5, "elapsed_time": "0:52:51", "remaining_time": "1:28:05"}
{"current_steps": 1160, "total_steps": 3080, "loss": 0.0063, "lr": 3.1397141494154864e-05, "epoch": 2.6363636363636362, "percentage": 37.66, "elapsed_time": "0:53:02", "remaining_time": "1:27:47"}
{"current_steps": 1165, "total_steps": 3080, "loss": 0.0062, "lr": 3.13038282573945e-05, "epoch": 2.6477272727272725, "percentage": 37.82, "elapsed_time": "0:53:14", "remaining_time": "1:27:30"}
{"current_steps": 1170, "total_steps": 3080, "loss": 0.006, "lr": 3.121015204481788e-05, "epoch": 2.659090909090909, "percentage": 37.99, "elapsed_time": "0:53:26", "remaining_time": "1:27:15"}
{"current_steps": 1175, "total_steps": 3080, "loss": 0.0059, "lr": 3.111611586445015e-05, "epoch": 2.6704545454545454, "percentage": 38.15, "elapsed_time": "0:53:37", "remaining_time": "1:26:56"}
{"current_steps": 1180, "total_steps": 3080, "loss": 0.0063, "lr": 3.1021722735875345e-05, "epoch": 2.6818181818181817, "percentage": 38.31, "elapsed_time": "0:53:48", "remaining_time": "1:26:37"}
{"current_steps": 1185, "total_steps": 3080, "loss": 0.0061, "lr": 3.0926975690139415e-05, "epoch": 2.6931818181818183, "percentage": 38.47, "elapsed_time": "0:53:58", "remaining_time": "1:26:19"}
{"current_steps": 1190, "total_steps": 3080, "loss": 0.0061, "lr": 3.0831877769652905e-05, "epoch": 2.7045454545454546, "percentage": 38.64, "elapsed_time": "0:54:10", "remaining_time": "1:26:02"}
{"current_steps": 1195, "total_steps": 3080, "loss": 0.0064, "lr": 3.073643202809325e-05, "epoch": 2.715909090909091, "percentage": 38.8, "elapsed_time": "0:54:22", "remaining_time": "1:25:45"}
{"current_steps": 1200, "total_steps": 3080, "loss": 0.006, "lr": 3.064064153030673e-05, "epoch": 2.7272727272727275, "percentage": 38.96, "elapsed_time": "0:54:35", "remaining_time": "1:25:30"}
{"current_steps": 1205, "total_steps": 3080, "loss": 0.0058, "lr": 3.054450935221005e-05, "epoch": 2.7386363636363638, "percentage": 39.12, "elapsed_time": "0:55:44", "remaining_time": "1:26:43"}
{"current_steps": 1210, "total_steps": 3080, "loss": 0.006, "lr": 3.0448038580691563e-05, "epoch": 2.75, "percentage": 39.29, "elapsed_time": "0:55:56", "remaining_time": "1:26:27"}
{"current_steps": 1215, "total_steps": 3080, "loss": 0.0059, "lr": 3.0351232313512145e-05, "epoch": 2.7613636363636362, "percentage": 39.45, "elapsed_time": "0:56:08", "remaining_time": "1:26:10"}
{"current_steps": 1220, "total_steps": 3080, "loss": 0.006, "lr": 3.0254093659205752e-05, "epoch": 2.7727272727272725, "percentage": 39.61, "elapsed_time": "0:56:20", "remaining_time": "1:25:53"}
{"current_steps": 1225, "total_steps": 3080, "loss": 0.0059, "lr": 3.015662573697957e-05, "epoch": 2.784090909090909, "percentage": 39.77, "elapsed_time": "0:56:31", "remaining_time": "1:25:36"}
{"current_steps": 1230, "total_steps": 3080, "loss": 0.0058, "lr": 3.0058831676613854e-05, "epoch": 2.7954545454545454, "percentage": 39.94, "elapsed_time": "0:56:44", "remaining_time": "1:25:20"}
{"current_steps": 1235, "total_steps": 3080, "loss": 0.006, "lr": 2.996071461836147e-05, "epoch": 2.8068181818181817, "percentage": 40.1, "elapsed_time": "0:56:55", "remaining_time": "1:25:02"}
{"current_steps": 1240, "total_steps": 3080, "loss": 0.0059, "lr": 2.986227771284701e-05, "epoch": 2.8181818181818183, "percentage": 40.26, "elapsed_time": "0:57:07", "remaining_time": "1:24:46"}
{"current_steps": 1245, "total_steps": 3080, "loss": 0.0057, "lr": 2.976352412096563e-05, "epoch": 2.8295454545454546, "percentage": 40.42, "elapsed_time": "0:57:22", "remaining_time": "1:24:34"}
{"current_steps": 1250, "total_steps": 3080, "loss": 0.006, "lr": 2.9664457013781588e-05, "epoch": 2.840909090909091, "percentage": 40.58, "elapsed_time": "0:57:33", "remaining_time": "1:24:16"}
{"current_steps": 1255, "total_steps": 3080, "loss": 0.0059, "lr": 2.956507957242637e-05, "epoch": 2.8522727272727275, "percentage": 40.75, "elapsed_time": "0:57:46", "remaining_time": "1:24:01"}
{"current_steps": 1260, "total_steps": 3080, "loss": 0.0059, "lr": 2.9465394987996575e-05, "epoch": 2.8636363636363638, "percentage": 40.91, "elapsed_time": "0:57:58", "remaining_time": "1:23:44"}
{"current_steps": 1265, "total_steps": 3080, "loss": 0.0061, "lr": 2.9365406461451442e-05, "epoch": 2.875, "percentage": 41.07, "elapsed_time": "0:58:10", "remaining_time": "1:23:27"}
{"current_steps": 1270, "total_steps": 3080, "loss": 0.006, "lr": 2.9265117203510045e-05, "epoch": 2.8863636363636362, "percentage": 41.23, "elapsed_time": "0:58:22", "remaining_time": "1:23:11"}
{"current_steps": 1275, "total_steps": 3080, "loss": 0.006, "lr": 2.916453043454821e-05, "epoch": 2.8977272727272725, "percentage": 41.4, "elapsed_time": "0:58:33", "remaining_time": "1:22:53"}
{"current_steps": 1280, "total_steps": 3080, "loss": 0.0059, "lr": 2.9063649384495104e-05, "epoch": 2.909090909090909, "percentage": 41.56, "elapsed_time": "0:58:46", "remaining_time": "1:22:38"}
{"current_steps": 1285, "total_steps": 3080, "loss": 0.006, "lr": 2.896247729272951e-05, "epoch": 2.9204545454545454, "percentage": 41.72, "elapsed_time": "0:58:57", "remaining_time": "1:22:21"}
{"current_steps": 1290, "total_steps": 3080, "loss": 0.0062, "lr": 2.8861017407975828e-05, "epoch": 2.9318181818181817, "percentage": 41.88, "elapsed_time": "0:59:07", "remaining_time": "1:22:03"}
{"current_steps": 1295, "total_steps": 3080, "loss": 0.0062, "lr": 2.8759272988199724e-05, "epoch": 2.9431818181818183, "percentage": 42.05, "elapsed_time": "0:59:19", "remaining_time": "1:21:46"}
{"current_steps": 1300, "total_steps": 3080, "loss": 0.0062, "lr": 2.865724730050356e-05, "epoch": 2.9545454545454546, "percentage": 42.21, "elapsed_time": "0:59:31", "remaining_time": "1:21:30"}
{"current_steps": 1305, "total_steps": 3080, "loss": 0.0063, "lr": 2.855494362102142e-05, "epoch": 2.965909090909091, "percentage": 42.37, "elapsed_time": "0:59:43", "remaining_time": "1:21:14"}
{"current_steps": 1310, "total_steps": 3080, "loss": 0.0062, "lr": 2.8452365234813992e-05, "epoch": 2.9772727272727275, "percentage": 42.53, "elapsed_time": "0:59:56", "remaining_time": "1:20:59"}
{"current_steps": 1315, "total_steps": 3080, "loss": 0.0063, "lr": 2.8349515435763e-05, "epoch": 2.9886363636363638, "percentage": 42.69, "elapsed_time": "1:00:07", "remaining_time": "1:20:42"}
{"current_steps": 1320, "total_steps": 3080, "loss": 0.0059, "lr": 2.824639752646549e-05, "epoch": 3.0, "percentage": 42.86, "elapsed_time": "1:00:19", "remaining_time": "1:20:26"}
{"current_steps": 1325, "total_steps": 3080, "loss": 0.0053, "lr": 2.814301481812776e-05, "epoch": 3.0113636363636362, "percentage": 43.02, "elapsed_time": "1:00:31", "remaining_time": "1:20:09"}
{"current_steps": 1330, "total_steps": 3080, "loss": 0.0055, "lr": 2.8039370630459026e-05, "epoch": 3.022727272727273, "percentage": 43.18, "elapsed_time": "1:00:42", "remaining_time": "1:19:52"}
{"current_steps": 1335, "total_steps": 3080, "loss": 0.0055, "lr": 2.793546829156485e-05, "epoch": 3.034090909090909, "percentage": 43.34, "elapsed_time": "1:00:53", "remaining_time": "1:19:35"}
{"current_steps": 1340, "total_steps": 3080, "loss": 0.0052, "lr": 2.7831311137840252e-05, "epoch": 3.0454545454545454, "percentage": 43.51, "elapsed_time": "1:01:05", "remaining_time": "1:19:19"}
{"current_steps": 1345, "total_steps": 3080, "loss": 0.0053, "lr": 2.7726902513862572e-05, "epoch": 3.0568181818181817, "percentage": 43.67, "elapsed_time": "1:01:17", "remaining_time": "1:19:03"}
{"current_steps": 1350, "total_steps": 3080, "loss": 0.0053, "lr": 2.7622245772284086e-05, "epoch": 3.0681818181818183, "percentage": 43.83, "elapsed_time": "1:01:29", "remaining_time": "1:18:48"}
{"current_steps": 1355, "total_steps": 3080, "loss": 0.0053, "lr": 2.7517344273724344e-05, "epoch": 3.0795454545454546, "percentage": 43.99, "elapsed_time": "1:01:41", "remaining_time": "1:18:32"}
{"current_steps": 1360, "total_steps": 3080, "loss": 0.0051, "lr": 2.7412201386662247e-05, "epoch": 3.090909090909091, "percentage": 44.16, "elapsed_time": "1:01:52", "remaining_time": "1:18:15"}
{"current_steps": 1365, "total_steps": 3080, "loss": 0.0052, "lr": 2.7306820487327906e-05, "epoch": 3.102272727272727, "percentage": 44.32, "elapsed_time": "1:02:03", "remaining_time": "1:17:58"}
{"current_steps": 1370, "total_steps": 3080, "loss": 0.0052, "lr": 2.72012049595942e-05, "epoch": 3.1136363636363638, "percentage": 44.48, "elapsed_time": "1:02:15", "remaining_time": "1:17:41"}
{"current_steps": 1375, "total_steps": 3080, "loss": 0.0055, "lr": 2.7095358194868146e-05, "epoch": 3.125, "percentage": 44.64, "elapsed_time": "1:02:26", "remaining_time": "1:17:25"}
{"current_steps": 1380, "total_steps": 3080, "loss": 0.0057, "lr": 2.698928359198197e-05, "epoch": 3.1363636363636362, "percentage": 44.81, "elapsed_time": "1:02:39", "remaining_time": "1:17:11"}
{"current_steps": 1385, "total_steps": 3080, "loss": 0.0052, "lr": 2.6882984557083987e-05, "epoch": 3.147727272727273, "percentage": 44.97, "elapsed_time": "1:02:51", "remaining_time": "1:16:55"}
{"current_steps": 1390, "total_steps": 3080, "loss": 0.0053, "lr": 2.677646450352923e-05, "epoch": 3.159090909090909, "percentage": 45.13, "elapsed_time": "1:03:03", "remaining_time": "1:16:39"}
{"current_steps": 1395, "total_steps": 3080, "loss": 0.0054, "lr": 2.6669726851769814e-05, "epoch": 3.1704545454545454, "percentage": 45.29, "elapsed_time": "1:03:13", "remaining_time": "1:16:22"}
{"current_steps": 1400, "total_steps": 3080, "loss": 0.0056, "lr": 2.656277502924514e-05, "epoch": 3.1818181818181817, "percentage": 45.45, "elapsed_time": "1:03:25", "remaining_time": "1:16:06"}
{"current_steps": 1405, "total_steps": 3080, "loss": 0.0054, "lr": 2.6455612470271805e-05, "epoch": 3.1931818181818183, "percentage": 45.62, "elapsed_time": "1:05:08", "remaining_time": "1:17:40"}
{"current_steps": 1410, "total_steps": 3080, "loss": 0.0056, "lr": 2.6348242615933348e-05, "epoch": 3.2045454545454546, "percentage": 45.78, "elapsed_time": "1:05:20", "remaining_time": "1:17:23"}
{"current_steps": 1415, "total_steps": 3080, "loss": 0.0056, "lr": 2.6240668913969743e-05, "epoch": 3.215909090909091, "percentage": 45.94, "elapsed_time": "1:05:33", "remaining_time": "1:17:08"}
{"current_steps": 1420, "total_steps": 3080, "loss": 0.0052, "lr": 2.613289481866669e-05, "epoch": 3.227272727272727, "percentage": 46.1, "elapsed_time": "1:05:46", "remaining_time": "1:16:53"}
{"current_steps": 1425, "total_steps": 3080, "loss": 0.0052, "lr": 2.6024923790744686e-05, "epoch": 3.2386363636363638, "percentage": 46.27, "elapsed_time": "1:05:58", "remaining_time": "1:16:37"}
{"current_steps": 1430, "total_steps": 3080, "loss": 0.0055, "lr": 2.5916759297247917e-05, "epoch": 3.25, "percentage": 46.43, "elapsed_time": "1:06:09", "remaining_time": "1:16:20"}
{"current_steps": 1435, "total_steps": 3080, "loss": 0.0057, "lr": 2.5808404811432918e-05, "epoch": 3.2613636363636362, "percentage": 46.59, "elapsed_time": "1:06:22", "remaining_time": "1:16:05"}
{"current_steps": 1440, "total_steps": 3080, "loss": 0.0053, "lr": 2.5699863812657033e-05, "epoch": 3.2727272727272725, "percentage": 46.75, "elapsed_time": "1:06:34", "remaining_time": "1:15:48"}
{"current_steps": 1445, "total_steps": 3080, "loss": 0.0056, "lr": 2.5591139786266705e-05, "epoch": 3.284090909090909, "percentage": 46.92, "elapsed_time": "1:06:45", "remaining_time": "1:15:32"}
{"current_steps": 1450, "total_steps": 3080, "loss": 0.0058, "lr": 2.5482236223485557e-05, "epoch": 3.2954545454545454, "percentage": 47.08, "elapsed_time": "1:06:56", "remaining_time": "1:15:15"}
{"current_steps": 1455, "total_steps": 3080, "loss": 0.0055, "lr": 2.537315662130228e-05, "epoch": 3.3068181818181817, "percentage": 47.24, "elapsed_time": "1:07:07", "remaining_time": "1:14:57"}
{"current_steps": 1460, "total_steps": 3080, "loss": 0.0057, "lr": 2.5263904482358353e-05, "epoch": 3.3181818181818183, "percentage": 47.4, "elapsed_time": "1:07:18", "remaining_time": "1:14:41"}
{"current_steps": 1465, "total_steps": 3080, "loss": 0.0056, "lr": 2.515448331483555e-05, "epoch": 3.3295454545454546, "percentage": 47.56, "elapsed_time": "1:07:30", "remaining_time": "1:14:24"}
{"current_steps": 1470, "total_steps": 3080, "loss": 0.0055, "lr": 2.5044896632343303e-05, "epoch": 3.340909090909091, "percentage": 47.73, "elapsed_time": "1:07:43", "remaining_time": "1:14:10"}
{"current_steps": 1475, "total_steps": 3080, "loss": 0.0052, "lr": 2.493514795380587e-05, "epoch": 3.3522727272727275, "percentage": 47.89, "elapsed_time": "1:07:54", "remaining_time": "1:13:54"}
{"current_steps": 1480, "total_steps": 3080, "loss": 0.0051, "lr": 2.4825240803349368e-05, "epoch": 3.3636363636363638, "percentage": 48.05, "elapsed_time": "1:08:06", "remaining_time": "1:13:38"}
{"current_steps": 1485, "total_steps": 3080, "loss": 0.0051, "lr": 2.471517871018855e-05, "epoch": 3.375, "percentage": 48.21, "elapsed_time": "1:08:17", "remaining_time": "1:13:21"}
{"current_steps": 1490, "total_steps": 3080, "loss": 0.0051, "lr": 2.4604965208513535e-05, "epoch": 3.3863636363636362, "percentage": 48.38, "elapsed_time": "1:08:29", "remaining_time": "1:13:05"}
{"current_steps": 1495, "total_steps": 3080, "loss": 0.0052, "lr": 2.44946038373763e-05, "epoch": 3.3977272727272725, "percentage": 48.54, "elapsed_time": "1:08:41", "remaining_time": "1:12:49"}
{"current_steps": 1500, "total_steps": 3080, "loss": 0.0052, "lr": 2.4384098140577048e-05, "epoch": 3.409090909090909, "percentage": 48.7, "elapsed_time": "1:08:53", "remaining_time": "1:12:34"}
{"current_steps": 1505, "total_steps": 3080, "loss": 0.0053, "lr": 2.4273451666550382e-05, "epoch": 3.4204545454545454, "percentage": 48.86, "elapsed_time": "1:09:06", "remaining_time": "1:12:18"}
{"current_steps": 1510, "total_steps": 3080, "loss": 0.0051, "lr": 2.4162667968251414e-05, "epoch": 3.4318181818181817, "percentage": 49.03, "elapsed_time": "1:09:18", "remaining_time": "1:12:03"}
{"current_steps": 1515, "total_steps": 3080, "loss": 0.0053, "lr": 2.4051750603041623e-05, "epoch": 3.4431818181818183, "percentage": 49.19, "elapsed_time": "1:09:29", "remaining_time": "1:11:47"}
{"current_steps": 1520, "total_steps": 3080, "loss": 0.005, "lr": 2.3940703132574664e-05, "epoch": 3.4545454545454546, "percentage": 49.35, "elapsed_time": "1:09:41", "remaining_time": "1:11:31"}
{"current_steps": 1525, "total_steps": 3080, "loss": 0.0053, "lr": 2.3829529122681977e-05, "epoch": 3.465909090909091, "percentage": 49.51, "elapsed_time": "1:09:53", "remaining_time": "1:11:16"}
{"current_steps": 1530, "total_steps": 3080, "loss": 0.0052, "lr": 2.3718232143258296e-05, "epoch": 3.4772727272727275, "percentage": 49.68, "elapsed_time": "1:10:05", "remaining_time": "1:11:00"}
{"current_steps": 1535, "total_steps": 3080, "loss": 0.0053, "lr": 2.360681576814702e-05, "epoch": 3.4886363636363638, "percentage": 49.84, "elapsed_time": "1:10:15", "remaining_time": "1:10:43"}
{"current_steps": 1540, "total_steps": 3080, "loss": 0.0051, "lr": 2.3495283575025445e-05, "epoch": 3.5, "percentage": 50.0, "elapsed_time": "1:10:27", "remaining_time": "1:10:27"}
{"current_steps": 1545, "total_steps": 3080, "loss": 0.0052, "lr": 2.3383639145289882e-05, "epoch": 3.5113636363636362, "percentage": 50.16, "elapsed_time": "1:10:38", "remaining_time": "1:10:10"}
{"current_steps": 1550, "total_steps": 3080, "loss": 0.0051, "lr": 2.3271886063940655e-05, "epoch": 3.5227272727272725, "percentage": 50.32, "elapsed_time": "1:10:50", "remaining_time": "1:09:56"}
{"current_steps": 1555, "total_steps": 3080, "loss": 0.0051, "lr": 2.3160027919467e-05, "epoch": 3.534090909090909, "percentage": 50.49, "elapsed_time": "1:11:02", "remaining_time": "1:09:39"}
{"current_steps": 1560, "total_steps": 3080, "loss": 0.0054, "lr": 2.3048068303731808e-05, "epoch": 3.5454545454545454, "percentage": 50.65, "elapsed_time": "1:11:13", "remaining_time": "1:09:24"}
{"current_steps": 1565, "total_steps": 3080, "loss": 0.0053, "lr": 2.2936010811856302e-05, "epoch": 3.5568181818181817, "percentage": 50.81, "elapsed_time": "1:11:26", "remaining_time": "1:09:09"}
{"current_steps": 1570, "total_steps": 3080, "loss": 0.0054, "lr": 2.2823859042104596e-05, "epoch": 3.5681818181818183, "percentage": 50.97, "elapsed_time": "1:11:38", "remaining_time": "1:08:53"}
{"current_steps": 1575, "total_steps": 3080, "loss": 0.0054, "lr": 2.2711616595768157e-05, "epoch": 3.5795454545454546, "percentage": 51.14, "elapsed_time": "1:11:51", "remaining_time": "1:08:40"}
{"current_steps": 1580, "total_steps": 3080, "loss": 0.0051, "lr": 2.259928707705015e-05, "epoch": 3.590909090909091, "percentage": 51.3, "elapsed_time": "1:12:03", "remaining_time": "1:08:24"}
{"current_steps": 1585, "total_steps": 3080, "loss": 0.0052, "lr": 2.2486874092949708e-05, "epoch": 3.6022727272727275, "percentage": 51.46, "elapsed_time": "1:12:14", "remaining_time": "1:08:08"}
{"current_steps": 1590, "total_steps": 3080, "loss": 0.0054, "lr": 2.2374381253146105e-05, "epoch": 3.6136363636363638, "percentage": 51.62, "elapsed_time": "1:12:25", "remaining_time": "1:07:52"}
{"current_steps": 1595, "total_steps": 3080, "loss": 0.0054, "lr": 2.226181216988287e-05, "epoch": 3.625, "percentage": 51.79, "elapsed_time": "1:12:37", "remaining_time": "1:07:37"}
{"current_steps": 1600, "total_steps": 3080, "loss": 0.0054, "lr": 2.2149170457851767e-05, "epoch": 3.6363636363636362, "percentage": 51.95, "elapsed_time": "1:12:49", "remaining_time": "1:07:22"}
{"current_steps": 1605, "total_steps": 3080, "loss": 0.0052, "lr": 2.2036459734076715e-05, "epoch": 3.6477272727272725, "percentage": 52.11, "elapsed_time": "1:14:05", "remaining_time": "1:08:04"}
{"current_steps": 1610, "total_steps": 3080, "loss": 0.0052, "lr": 2.1923683617797685e-05, "epoch": 3.659090909090909, "percentage": 52.27, "elapsed_time": "1:14:15", "remaining_time": "1:07:48"}
{"current_steps": 1615, "total_steps": 3080, "loss": 0.0055, "lr": 2.1810845730354458e-05, "epoch": 3.6704545454545454, "percentage": 52.44, "elapsed_time": "1:14:27", "remaining_time": "1:07:32"}
{"current_steps": 1620, "total_steps": 3080, "loss": 0.0054, "lr": 2.1697949695070326e-05, "epoch": 3.6818181818181817, "percentage": 52.6, "elapsed_time": "1:14:38", "remaining_time": "1:07:16"}
{"current_steps": 1625, "total_steps": 3080, "loss": 0.0055, "lr": 2.158499913713577e-05, "epoch": 3.6931818181818183, "percentage": 52.76, "elapsed_time": "1:14:51", "remaining_time": "1:07:01"}
{"current_steps": 1630, "total_steps": 3080, "loss": 0.0056, "lr": 2.1471997683492036e-05, "epoch": 3.7045454545454546, "percentage": 52.92, "elapsed_time": "1:15:02", "remaining_time": "1:06:45"}
{"current_steps": 1635, "total_steps": 3080, "loss": 0.0054, "lr": 2.1358948962714684e-05, "epoch": 3.715909090909091, "percentage": 53.08, "elapsed_time": "1:15:13", "remaining_time": "1:06:29"}
{"current_steps": 1640, "total_steps": 3080, "loss": 0.0054, "lr": 2.1245856604897045e-05, "epoch": 3.7272727272727275, "percentage": 53.25, "elapsed_time": "1:15:25", "remaining_time": "1:06:13"}
{"current_steps": 1645, "total_steps": 3080, "loss": 0.0054, "lr": 2.1132724241533692e-05, "epoch": 3.7386363636363638, "percentage": 53.41, "elapsed_time": "1:15:37", "remaining_time": "1:05:58"}
{"current_steps": 1650, "total_steps": 3080, "loss": 0.0054, "lr": 2.10195555054038e-05, "epoch": 3.75, "percentage": 53.57, "elapsed_time": "1:15:49", "remaining_time": "1:05:42"}
{"current_steps": 1655, "total_steps": 3080, "loss": 0.0055, "lr": 2.0906354030454515e-05, "epoch": 3.7613636363636362, "percentage": 53.73, "elapsed_time": "1:16:01", "remaining_time": "1:05:27"}
{"current_steps": 1660, "total_steps": 3080, "loss": 0.0055, "lr": 2.0793123451684248e-05, "epoch": 3.7727272727272725, "percentage": 53.9, "elapsed_time": "1:16:12", "remaining_time": "1:05:11"}
{"current_steps": 1665, "total_steps": 3080, "loss": 0.005, "lr": 2.0679867405025956e-05, "epoch": 3.784090909090909, "percentage": 54.06, "elapsed_time": "1:16:23", "remaining_time": "1:04:55"}
{"current_steps": 1670, "total_steps": 3080, "loss": 0.0052, "lr": 2.0566589527230404e-05, "epoch": 3.7954545454545454, "percentage": 54.22, "elapsed_time": "1:16:34", "remaining_time": "1:04:39"}
{"current_steps": 1675, "total_steps": 3080, "loss": 0.0053, "lr": 2.045329345574936e-05, "epoch": 3.8068181818181817, "percentage": 54.38, "elapsed_time": "1:16:46", "remaining_time": "1:04:23"}
{"current_steps": 1680, "total_steps": 3080, "loss": 0.0054, "lr": 2.0339982828618826e-05, "epoch": 3.8181818181818183, "percentage": 54.55, "elapsed_time": "1:16:57", "remaining_time": "1:04:08"}
{"current_steps": 1685, "total_steps": 3080, "loss": 0.0056, "lr": 2.0226661284342168e-05, "epoch": 3.8295454545454546, "percentage": 54.71, "elapsed_time": "1:17:09", "remaining_time": "1:03:52"}
{"current_steps": 1690, "total_steps": 3080, "loss": 0.0053, "lr": 2.0113332461773344e-05, "epoch": 3.840909090909091, "percentage": 54.87, "elapsed_time": "1:17:22", "remaining_time": "1:03:38"}
{"current_steps": 1695, "total_steps": 3080, "loss": 0.0053, "lr": 2e-05, "epoch": 3.8522727272727275, "percentage": 55.03, "elapsed_time": "1:17:32", "remaining_time": "1:03:21"}
{"current_steps": 1700, "total_steps": 3080, "loss": 0.0052, "lr": 1.9886667538226663e-05, "epoch": 3.8636363636363638, "percentage": 55.19, "elapsed_time": "1:17:43", "remaining_time": "1:03:05"}
{"current_steps": 1705, "total_steps": 3080, "loss": 0.0052, "lr": 1.977333871565784e-05, "epoch": 3.875, "percentage": 55.36, "elapsed_time": "1:17:55", "remaining_time": "1:02:50"}
{"current_steps": 1710, "total_steps": 3080, "loss": 0.0051, "lr": 1.966001717138118e-05, "epoch": 3.8863636363636362, "percentage": 55.52, "elapsed_time": "1:18:08", "remaining_time": "1:02:36"}
{"current_steps": 1715, "total_steps": 3080, "loss": 0.0055, "lr": 1.9546706544250646e-05, "epoch": 3.8977272727272725, "percentage": 55.68, "elapsed_time": "1:18:20", "remaining_time": "1:02:20"}
{"current_steps": 1720, "total_steps": 3080, "loss": 0.0053, "lr": 1.94334104727696e-05, "epoch": 3.909090909090909, "percentage": 55.84, "elapsed_time": "1:18:31", "remaining_time": "1:02:05"}
{"current_steps": 1725, "total_steps": 3080, "loss": 0.0055, "lr": 1.9320132594974047e-05, "epoch": 3.9204545454545454, "percentage": 56.01, "elapsed_time": "1:18:42", "remaining_time": "1:01:49"}
{"current_steps": 1730, "total_steps": 3080, "loss": 0.0054, "lr": 1.9206876548315755e-05, "epoch": 3.9318181818181817, "percentage": 56.17, "elapsed_time": "1:18:58", "remaining_time": "1:01:37"}
{"current_steps": 1735, "total_steps": 3080, "loss": 0.0053, "lr": 1.9093645969545488e-05, "epoch": 3.9431818181818183, "percentage": 56.33, "elapsed_time": "1:19:10", "remaining_time": "1:01:22"}
{"current_steps": 1740, "total_steps": 3080, "loss": 0.0049, "lr": 1.89804444945962e-05, "epoch": 3.9545454545454546, "percentage": 56.49, "elapsed_time": "1:19:21", "remaining_time": "1:01:07"}
{"current_steps": 1745, "total_steps": 3080, "loss": 0.0055, "lr": 1.886727575846631e-05, "epoch": 3.965909090909091, "percentage": 56.66, "elapsed_time": "1:19:33", "remaining_time": "1:00:52"}
{"current_steps": 1750, "total_steps": 3080, "loss": 0.0052, "lr": 1.8754143395102958e-05, "epoch": 3.9772727272727275, "percentage": 56.82, "elapsed_time": "1:19:45", "remaining_time": "1:00:36"}
{"current_steps": 1755, "total_steps": 3080, "loss": 0.0052, "lr": 1.8641051037285322e-05, "epoch": 3.9886363636363638, "percentage": 56.98, "elapsed_time": "1:19:57", "remaining_time": "1:00:22"}
{"current_steps": 1760, "total_steps": 3080, "loss": 0.0052, "lr": 1.8528002316507964e-05, "epoch": 4.0, "percentage": 57.14, "elapsed_time": "1:20:09", "remaining_time": "1:00:07"}
{"current_steps": 1765, "total_steps": 3080, "loss": 0.0051, "lr": 1.841500086286423e-05, "epoch": 4.011363636363637, "percentage": 57.31, "elapsed_time": "1:20:20", "remaining_time": "0:59:51"}
{"current_steps": 1770, "total_steps": 3080, "loss": 0.0049, "lr": 1.8302050304929674e-05, "epoch": 4.0227272727272725, "percentage": 57.47, "elapsed_time": "1:20:31", "remaining_time": "0:59:35"}
{"current_steps": 1775, "total_steps": 3080, "loss": 0.0047, "lr": 1.8189154269645552e-05, "epoch": 4.034090909090909, "percentage": 57.63, "elapsed_time": "1:20:43", "remaining_time": "0:59:20"}
{"current_steps": 1780, "total_steps": 3080, "loss": 0.0052, "lr": 1.8076316382202325e-05, "epoch": 4.045454545454546, "percentage": 57.79, "elapsed_time": "1:20:55", "remaining_time": "0:59:05"}
{"current_steps": 1785, "total_steps": 3080, "loss": 0.0049, "lr": 1.7963540265923298e-05, "epoch": 4.056818181818182, "percentage": 57.95, "elapsed_time": "1:21:07", "remaining_time": "0:58:51"}
{"current_steps": 1790, "total_steps": 3080, "loss": 0.0048, "lr": 1.7850829542148247e-05, "epoch": 4.068181818181818, "percentage": 58.12, "elapsed_time": "1:21:18", "remaining_time": "0:58:35"}
{"current_steps": 1795, "total_steps": 3080, "loss": 0.0047, "lr": 1.7738187830117134e-05, "epoch": 4.079545454545454, "percentage": 58.28, "elapsed_time": "1:21:31", "remaining_time": "0:58:21"}
{"current_steps": 1800, "total_steps": 3080, "loss": 0.0047, "lr": 1.7625618746853902e-05, "epoch": 4.090909090909091, "percentage": 58.44, "elapsed_time": "1:21:43", "remaining_time": "0:58:07"}
{"current_steps": 1805, "total_steps": 3080, "loss": 0.0046, "lr": 1.7513125907050302e-05, "epoch": 4.1022727272727275, "percentage": 58.6, "elapsed_time": "1:22:57", "remaining_time": "0:58:35"}
{"current_steps": 1810, "total_steps": 3080, "loss": 0.0047, "lr": 1.7400712922949854e-05, "epoch": 4.113636363636363, "percentage": 58.77, "elapsed_time": "1:23:09", "remaining_time": "0:58:20"}
{"current_steps": 1815, "total_steps": 3080, "loss": 0.0044, "lr": 1.7288383404231846e-05, "epoch": 4.125, "percentage": 58.93, "elapsed_time": "1:23:20", "remaining_time": "0:58:05"}
{"current_steps": 1820, "total_steps": 3080, "loss": 0.0048, "lr": 1.7176140957895407e-05, "epoch": 4.136363636363637, "percentage": 59.09, "elapsed_time": "1:23:31", "remaining_time": "0:57:49"}
{"current_steps": 1825, "total_steps": 3080, "loss": 0.0048, "lr": 1.7063989188143705e-05, "epoch": 4.1477272727272725, "percentage": 59.25, "elapsed_time": "1:23:44", "remaining_time": "0:57:35"}
{"current_steps": 1830, "total_steps": 3080, "loss": 0.0045, "lr": 1.69519316962682e-05, "epoch": 4.159090909090909, "percentage": 59.42, "elapsed_time": "1:23:55", "remaining_time": "0:57:19"}
{"current_steps": 1835, "total_steps": 3080, "loss": 0.0045, "lr": 1.6839972080533004e-05, "epoch": 4.170454545454546, "percentage": 59.58, "elapsed_time": "1:24:08", "remaining_time": "0:57:04"}
{"current_steps": 1840, "total_steps": 3080, "loss": 0.0047, "lr": 1.6728113936059348e-05, "epoch": 4.181818181818182, "percentage": 59.74, "elapsed_time": "1:24:19", "remaining_time": "0:56:49"}
{"current_steps": 1845, "total_steps": 3080, "loss": 0.0048, "lr": 1.6616360854710125e-05, "epoch": 4.193181818181818, "percentage": 59.9, "elapsed_time": "1:24:30", "remaining_time": "0:56:34"}
{"current_steps": 1850, "total_steps": 3080, "loss": 0.0042, "lr": 1.650471642497456e-05, "epoch": 4.204545454545454, "percentage": 60.06, "elapsed_time": "1:24:44", "remaining_time": "0:56:20"}
{"current_steps": 1855, "total_steps": 3080, "loss": 0.0047, "lr": 1.6393184231852983e-05, "epoch": 4.215909090909091, "percentage": 60.23, "elapsed_time": "1:24:55", "remaining_time": "0:56:05"}
{"current_steps": 1860, "total_steps": 3080, "loss": 0.0045, "lr": 1.628176785674171e-05, "epoch": 4.2272727272727275, "percentage": 60.39, "elapsed_time": "1:25:07", "remaining_time": "0:55:50"}
{"current_steps": 1865, "total_steps": 3080, "loss": 0.0048, "lr": 1.617047087731803e-05, "epoch": 4.238636363636363, "percentage": 60.55, "elapsed_time": "1:25:20", "remaining_time": "0:55:36"}
{"current_steps": 1870, "total_steps": 3080, "loss": 0.0047, "lr": 1.6059296867425343e-05, "epoch": 4.25, "percentage": 60.71, "elapsed_time": "1:25:31", "remaining_time": "0:55:20"}
{"current_steps": 1875, "total_steps": 3080, "loss": 0.0045, "lr": 1.5948249396958384e-05, "epoch": 4.261363636363637, "percentage": 60.88, "elapsed_time": "1:25:43", "remaining_time": "0:55:05"}
{"current_steps": 1880, "total_steps": 3080, "loss": 0.0046, "lr": 1.583733203174859e-05, "epoch": 4.2727272727272725, "percentage": 61.04, "elapsed_time": "1:25:54", "remaining_time": "0:54:50"}
{"current_steps": 1885, "total_steps": 3080, "loss": 0.0047, "lr": 1.572654833344962e-05, "epoch": 4.284090909090909, "percentage": 61.2, "elapsed_time": "1:26:06", "remaining_time": "0:54:35"}
{"current_steps": 1890, "total_steps": 3080, "loss": 0.0043, "lr": 1.5615901859422956e-05, "epoch": 4.295454545454546, "percentage": 61.36, "elapsed_time": "1:26:17", "remaining_time": "0:54:20"}
{"current_steps": 1895, "total_steps": 3080, "loss": 0.0048, "lr": 1.55053961626237e-05, "epoch": 4.306818181818182, "percentage": 61.53, "elapsed_time": "1:26:29", "remaining_time": "0:54:05"}
{"current_steps": 1900, "total_steps": 3080, "loss": 0.0046, "lr": 1.539503479148647e-05, "epoch": 4.318181818181818, "percentage": 61.69, "elapsed_time": "1:26:41", "remaining_time": "0:53:50"}
{"current_steps": 1905, "total_steps": 3080, "loss": 0.0045, "lr": 1.5284821289811453e-05, "epoch": 4.329545454545454, "percentage": 61.85, "elapsed_time": "1:26:52", "remaining_time": "0:53:35"}
{"current_steps": 1910, "total_steps": 3080, "loss": 0.0047, "lr": 1.5174759196650637e-05, "epoch": 4.340909090909091, "percentage": 62.01, "elapsed_time": "1:27:04", "remaining_time": "0:53:20"}
{"current_steps": 1915, "total_steps": 3080, "loss": 0.0045, "lr": 1.5064852046194127e-05, "epoch": 4.3522727272727275, "percentage": 62.18, "elapsed_time": "1:27:15", "remaining_time": "0:53:04"}
{"current_steps": 1920, "total_steps": 3080, "loss": 0.0043, "lr": 1.49551033676567e-05, "epoch": 4.363636363636363, "percentage": 62.34, "elapsed_time": "1:27:27", "remaining_time": "0:52:50"}
{"current_steps": 1925, "total_steps": 3080, "loss": 0.0046, "lr": 1.484551668516446e-05, "epoch": 4.375, "percentage": 62.5, "elapsed_time": "1:27:39", "remaining_time": "0:52:35"}
{"current_steps": 1930, "total_steps": 3080, "loss": 0.0046, "lr": 1.4736095517641654e-05, "epoch": 4.386363636363637, "percentage": 62.66, "elapsed_time": "1:27:51", "remaining_time": "0:52:21"}
{"current_steps": 1935, "total_steps": 3080, "loss": 0.0048, "lr": 1.4626843378697725e-05, "epoch": 4.3977272727272725, "percentage": 62.82, "elapsed_time": "1:28:02", "remaining_time": "0:52:05"}
{"current_steps": 1940, "total_steps": 3080, "loss": 0.0048, "lr": 1.4517763776514453e-05, "epoch": 4.409090909090909, "percentage": 62.99, "elapsed_time": "1:28:14", "remaining_time": "0:51:51"}
{"current_steps": 1945, "total_steps": 3080, "loss": 0.0047, "lr": 1.4408860213733307e-05, "epoch": 4.420454545454546, "percentage": 63.15, "elapsed_time": "1:28:26", "remaining_time": "0:51:36"}
{"current_steps": 1950, "total_steps": 3080, "loss": 0.0046, "lr": 1.4300136187342979e-05, "epoch": 4.431818181818182, "percentage": 63.31, "elapsed_time": "1:28:37", "remaining_time": "0:51:21"}
{"current_steps": 1955, "total_steps": 3080, "loss": 0.0045, "lr": 1.4191595188567089e-05, "epoch": 4.443181818181818, "percentage": 63.47, "elapsed_time": "1:28:48", "remaining_time": "0:51:06"}
{"current_steps": 1960, "total_steps": 3080, "loss": 0.0048, "lr": 1.4083240702752088e-05, "epoch": 4.454545454545454, "percentage": 63.64, "elapsed_time": "1:28:59", "remaining_time": "0:50:51"}
{"current_steps": 1965, "total_steps": 3080, "loss": 0.0048, "lr": 1.3975076209255321e-05, "epoch": 4.465909090909091, "percentage": 63.8, "elapsed_time": "1:29:10", "remaining_time": "0:50:36"}
{"current_steps": 1970, "total_steps": 3080, "loss": 0.0044, "lr": 1.3867105181333318e-05, "epoch": 4.4772727272727275, "percentage": 63.96, "elapsed_time": "1:29:22", "remaining_time": "0:50:21"}
{"current_steps": 1975, "total_steps": 3080, "loss": 0.0048, "lr": 1.375933108603026e-05, "epoch": 4.488636363636363, "percentage": 64.12, "elapsed_time": "1:29:35", "remaining_time": "0:50:07"}
{"current_steps": 1980, "total_steps": 3080, "loss": 0.0049, "lr": 1.3651757384066657e-05, "epoch": 4.5, "percentage": 64.29, "elapsed_time": "1:29:46", "remaining_time": "0:49:52"}
{"current_steps": 1985, "total_steps": 3080, "loss": 0.0047, "lr": 1.3544387529728203e-05, "epoch": 4.511363636363637, "percentage": 64.45, "elapsed_time": "1:29:59", "remaining_time": "0:49:38"}
{"current_steps": 1990, "total_steps": 3080, "loss": 0.0047, "lr": 1.3437224970754865e-05, "epoch": 4.5227272727272725, "percentage": 64.61, "elapsed_time": "1:30:11", "remaining_time": "0:49:24"}
{"current_steps": 1995, "total_steps": 3080, "loss": 0.0047, "lr": 1.3330273148230191e-05, "epoch": 4.534090909090909, "percentage": 64.77, "elapsed_time": "1:30:23", "remaining_time": "0:49:09"}
{"current_steps": 2000, "total_steps": 3080, "loss": 0.0048, "lr": 1.3223535496470775e-05, "epoch": 4.545454545454545, "percentage": 64.94, "elapsed_time": "1:30:34", "remaining_time": "0:48:54"}
{"current_steps": 2005, "total_steps": 3080, "loss": 0.0045, "lr": 1.3117015442916014e-05, "epoch": 4.556818181818182, "percentage": 65.1, "elapsed_time": "1:32:04", "remaining_time": "0:49:21"}
{"current_steps": 2010, "total_steps": 3080, "loss": 0.0049, "lr": 1.3010716408018037e-05, "epoch": 4.568181818181818, "percentage": 65.26, "elapsed_time": "1:32:16", "remaining_time": "0:49:07"}
{"current_steps": 2015, "total_steps": 3080, "loss": 0.0043, "lr": 1.290464180513186e-05, "epoch": 4.579545454545455, "percentage": 65.42, "elapsed_time": "1:32:27", "remaining_time": "0:48:52"}
{"current_steps": 2020, "total_steps": 3080, "loss": 0.0048, "lr": 1.2798795040405804e-05, "epoch": 4.590909090909091, "percentage": 65.58, "elapsed_time": "1:32:39", "remaining_time": "0:48:37"}
{"current_steps": 2025, "total_steps": 3080, "loss": 0.0045, "lr": 1.26931795126721e-05, "epoch": 4.6022727272727275, "percentage": 65.75, "elapsed_time": "1:32:50", "remaining_time": "0:48:22"}
{"current_steps": 2030, "total_steps": 3080, "loss": 0.0048, "lr": 1.2587798613337758e-05, "epoch": 4.613636363636363, "percentage": 65.91, "elapsed_time": "1:33:02", "remaining_time": "0:48:07"}
{"current_steps": 2035, "total_steps": 3080, "loss": 0.0044, "lr": 1.2482655726275661e-05, "epoch": 4.625, "percentage": 66.07, "elapsed_time": "1:33:13", "remaining_time": "0:47:52"}
{"current_steps": 2040, "total_steps": 3080, "loss": 0.0044, "lr": 1.2377754227715915e-05, "epoch": 4.636363636363637, "percentage": 66.23, "elapsed_time": "1:33:24", "remaining_time": "0:47:37"}
{"current_steps": 2045, "total_steps": 3080, "loss": 0.0045, "lr": 1.2273097486137426e-05, "epoch": 4.6477272727272725, "percentage": 66.4, "elapsed_time": "1:33:35", "remaining_time": "0:47:22"}
{"current_steps": 2050, "total_steps": 3080, "loss": 0.0047, "lr": 1.2168688862159748e-05, "epoch": 4.659090909090909, "percentage": 66.56, "elapsed_time": "1:33:46", "remaining_time": "0:47:07"}
{"current_steps": 2055, "total_steps": 3080, "loss": 0.0048, "lr": 1.2064531708435148e-05, "epoch": 4.670454545454545, "percentage": 66.72, "elapsed_time": "1:33:58", "remaining_time": "0:46:52"}
{"current_steps": 2060, "total_steps": 3080, "loss": 0.0045, "lr": 1.1960629369540974e-05, "epoch": 4.681818181818182, "percentage": 66.88, "elapsed_time": "1:34:09", "remaining_time": "0:46:37"}
{"current_steps": 2065, "total_steps": 3080, "loss": 0.0046, "lr": 1.1856985181872242e-05, "epoch": 4.693181818181818, "percentage": 67.05, "elapsed_time": "1:34:21", "remaining_time": "0:46:22"}
{"current_steps": 2070, "total_steps": 3080, "loss": 0.0045, "lr": 1.1753602473534514e-05, "epoch": 4.704545454545455, "percentage": 67.21, "elapsed_time": "1:34:32", "remaining_time": "0:46:07"}
{"current_steps": 2075, "total_steps": 3080, "loss": 0.0049, "lr": 1.165048456423701e-05, "epoch": 4.715909090909091, "percentage": 67.37, "elapsed_time": "1:34:44", "remaining_time": "0:45:53"}
{"current_steps": 2080, "total_steps": 3080, "loss": 0.0045, "lr": 1.1547634765186016e-05, "epoch": 4.7272727272727275, "percentage": 67.53, "elapsed_time": "1:34:55", "remaining_time": "0:45:38"}
{"current_steps": 2085, "total_steps": 3080, "loss": 0.0045, "lr": 1.1445056378978588e-05, "epoch": 4.738636363636363, "percentage": 67.69, "elapsed_time": "1:35:06", "remaining_time": "0:45:23"}
{"current_steps": 2090, "total_steps": 3080, "loss": 0.0044, "lr": 1.1342752699496452e-05, "epoch": 4.75, "percentage": 67.86, "elapsed_time": "1:35:19", "remaining_time": "0:45:09"}
{"current_steps": 2095, "total_steps": 3080, "loss": 0.0046, "lr": 1.1240727011800288e-05, "epoch": 4.761363636363637, "percentage": 68.02, "elapsed_time": "1:35:30", "remaining_time": "0:44:54"}
{"current_steps": 2100, "total_steps": 3080, "loss": 0.0044, "lr": 1.113898259202419e-05, "epoch": 4.7727272727272725, "percentage": 68.18, "elapsed_time": "1:35:42", "remaining_time": "0:44:39"}
{"current_steps": 2105, "total_steps": 3080, "loss": 0.0046, "lr": 1.1037522707270498e-05, "epoch": 4.784090909090909, "percentage": 68.34, "elapsed_time": "1:35:53", "remaining_time": "0:44:25"}
{"current_steps": 2110, "total_steps": 3080, "loss": 0.0046, "lr": 1.093635061550491e-05, "epoch": 4.795454545454545, "percentage": 68.51, "elapsed_time": "1:36:04", "remaining_time": "0:44:10"}
{"current_steps": 2115, "total_steps": 3080, "loss": 0.0046, "lr": 1.0835469565451792e-05, "epoch": 4.806818181818182, "percentage": 68.67, "elapsed_time": "1:36:16", "remaining_time": "0:43:55"}
{"current_steps": 2120, "total_steps": 3080, "loss": 0.0045, "lr": 1.073488279648996e-05, "epoch": 4.818181818181818, "percentage": 68.83, "elapsed_time": "1:36:27", "remaining_time": "0:43:40"}
{"current_steps": 2125, "total_steps": 3080, "loss": 0.0045, "lr": 1.063459353854856e-05, "epoch": 4.829545454545455, "percentage": 68.99, "elapsed_time": "1:36:39", "remaining_time": "0:43:26"}
{"current_steps": 2130, "total_steps": 3080, "loss": 0.0041, "lr": 1.0534605012003429e-05, "epoch": 4.840909090909091, "percentage": 69.16, "elapsed_time": "1:36:50", "remaining_time": "0:43:11"}
{"current_steps": 2135, "total_steps": 3080, "loss": 0.0045, "lr": 1.0434920427573643e-05, "epoch": 4.8522727272727275, "percentage": 69.32, "elapsed_time": "1:37:01", "remaining_time": "0:42:56"}
{"current_steps": 2140, "total_steps": 3080, "loss": 0.0047, "lr": 1.033554298621842e-05, "epoch": 4.863636363636363, "percentage": 69.48, "elapsed_time": "1:37:13", "remaining_time": "0:42:42"}
{"current_steps": 2145, "total_steps": 3080, "loss": 0.0047, "lr": 1.023647587903438e-05, "epoch": 4.875, "percentage": 69.64, "elapsed_time": "1:37:25", "remaining_time": "0:42:27"}
{"current_steps": 2150, "total_steps": 3080, "loss": 0.0045, "lr": 1.0137722287152995e-05, "epoch": 4.886363636363637, "percentage": 69.81, "elapsed_time": "1:37:36", "remaining_time": "0:42:13"}
{"current_steps": 2155, "total_steps": 3080, "loss": 0.0043, "lr": 1.0039285381638532e-05, "epoch": 4.8977272727272725, "percentage": 69.97, "elapsed_time": "1:37:47", "remaining_time": "0:41:58"}
{"current_steps": 2160, "total_steps": 3080, "loss": 0.0044, "lr": 9.941168323386146e-06, "epoch": 4.909090909090909, "percentage": 70.13, "elapsed_time": "1:37:59", "remaining_time": "0:41:44"}
{"current_steps": 2165, "total_steps": 3080, "loss": 0.0048, "lr": 9.84337426302044e-06, "epoch": 4.920454545454545, "percentage": 70.29, "elapsed_time": "1:38:11", "remaining_time": "0:41:29"}
{"current_steps": 2170, "total_steps": 3080, "loss": 0.0045, "lr": 9.74590634079425e-06, "epoch": 4.931818181818182, "percentage": 70.45, "elapsed_time": "1:38:25", "remaining_time": "0:41:16"}
{"current_steps": 2175, "total_steps": 3080, "loss": 0.0044, "lr": 9.648767686487859e-06, "epoch": 4.943181818181818, "percentage": 70.62, "elapsed_time": "1:38:36", "remaining_time": "0:41:01"}
{"current_steps": 2180, "total_steps": 3080, "loss": 0.0047, "lr": 9.551961419308447e-06, "epoch": 4.954545454545455, "percentage": 70.78, "elapsed_time": "1:38:48", "remaining_time": "0:40:47"}
{"current_steps": 2185, "total_steps": 3080, "loss": 0.0045, "lr": 9.455490647789951e-06, "epoch": 4.965909090909091, "percentage": 70.94, "elapsed_time": "1:39:01", "remaining_time": "0:40:33"}
{"current_steps": 2190, "total_steps": 3080, "loss": 0.0044, "lr": 9.359358469693272e-06, "epoch": 4.9772727272727275, "percentage": 71.1, "elapsed_time": "1:39:12", "remaining_time": "0:40:19"}
{"current_steps": 2195, "total_steps": 3080, "loss": 0.0046, "lr": 9.263567971906748e-06, "epoch": 4.988636363636363, "percentage": 71.27, "elapsed_time": "1:39:23", "remaining_time": "0:40:04"}
{"current_steps": 2200, "total_steps": 3080, "loss": 0.0046, "lr": 9.168122230347098e-06, "epoch": 5.0, "percentage": 71.43, "elapsed_time": "1:39:35", "remaining_time": "0:39:50"}
{"current_steps": 2205, "total_steps": 3080, "loss": 0.004, "lr": 9.073024309860583e-06, "epoch": 5.011363636363637, "percentage": 71.59, "elapsed_time": "1:40:54", "remaining_time": "0:40:02"}
{"current_steps": 2210, "total_steps": 3080, "loss": 0.0039, "lr": 8.978277264124655e-06, "epoch": 5.0227272727272725, "percentage": 71.75, "elapsed_time": "1:41:06", "remaining_time": "0:39:48"}
{"current_steps": 2215, "total_steps": 3080, "loss": 0.0039, "lr": 8.883884135549852e-06, "epoch": 5.034090909090909, "percentage": 71.92, "elapsed_time": "1:41:18", "remaining_time": "0:39:33"}
{"current_steps": 2220, "total_steps": 3080, "loss": 0.004, "lr": 8.789847955182118e-06, "epoch": 5.045454545454546, "percentage": 72.08, "elapsed_time": "1:41:30", "remaining_time": "0:39:19"}
{"current_steps": 2225, "total_steps": 3080, "loss": 0.0037, "lr": 8.696171742605508e-06, "epoch": 5.056818181818182, "percentage": 72.24, "elapsed_time": "1:41:42", "remaining_time": "0:39:04"}
{"current_steps": 2230, "total_steps": 3080, "loss": 0.004, "lr": 8.602858505845149e-06, "epoch": 5.068181818181818, "percentage": 72.4, "elapsed_time": "1:41:54", "remaining_time": "0:38:50"}
{"current_steps": 2235, "total_steps": 3080, "loss": 0.004, "lr": 8.509911241270741e-06, "epoch": 5.079545454545454, "percentage": 72.56, "elapsed_time": "1:42:06", "remaining_time": "0:38:36"}
{"current_steps": 2240, "total_steps": 3080, "loss": 0.0039, "lr": 8.417332933500267e-06, "epoch": 5.090909090909091, "percentage": 72.73, "elapsed_time": "1:42:17", "remaining_time": "0:38:21"}
{"current_steps": 2245, "total_steps": 3080, "loss": 0.0039, "lr": 8.325126555304208e-06, "epoch": 5.1022727272727275, "percentage": 72.89, "elapsed_time": "1:42:29", "remaining_time": "0:38:07"}
{"current_steps": 2250, "total_steps": 3080, "loss": 0.0039, "lr": 8.233295067510036e-06, "epoch": 5.113636363636363, "percentage": 73.05, "elapsed_time": "1:42:41", "remaining_time": "0:37:52"}
{"current_steps": 2255, "total_steps": 3080, "loss": 0.0038, "lr": 8.141841418907194e-06, "epoch": 5.125, "percentage": 73.21, "elapsed_time": "1:42:53", "remaining_time": "0:37:38"}
{"current_steps": 2260, "total_steps": 3080, "loss": 0.0041, "lr": 8.050768546152352e-06, "epoch": 5.136363636363637, "percentage": 73.38, "elapsed_time": "1:43:04", "remaining_time": "0:37:24"}
{"current_steps": 2265, "total_steps": 3080, "loss": 0.0039, "lr": 7.960079373675135e-06, "epoch": 5.1477272727272725, "percentage": 73.54, "elapsed_time": "1:43:15", "remaining_time": "0:37:09"}
{"current_steps": 2270, "total_steps": 3080, "loss": 0.0041, "lr": 7.86977681358424e-06, "epoch": 5.159090909090909, "percentage": 73.7, "elapsed_time": "1:43:26", "remaining_time": "0:36:54"}
{"current_steps": 2275, "total_steps": 3080, "loss": 0.004, "lr": 7.779863765573874e-06, "epoch": 5.170454545454546, "percentage": 73.86, "elapsed_time": "1:43:38", "remaining_time": "0:36:40"}
{"current_steps": 2280, "total_steps": 3080, "loss": 0.0039, "lr": 7.690343116830696e-06, "epoch": 5.181818181818182, "percentage": 74.03, "elapsed_time": "1:43:49", "remaining_time": "0:36:25"}
{"current_steps": 2285, "total_steps": 3080, "loss": 0.0039, "lr": 7.601217741941054e-06, "epoch": 5.193181818181818, "percentage": 74.19, "elapsed_time": "1:44:01", "remaining_time": "0:36:11"}
{"current_steps": 2290, "total_steps": 3080, "loss": 0.0038, "lr": 7.512490502798735e-06, "epoch": 5.204545454545454, "percentage": 74.35, "elapsed_time": "1:44:14", "remaining_time": "0:35:57"}
{"current_steps": 2295, "total_steps": 3080, "loss": 0.0042, "lr": 7.424164248513017e-06, "epoch": 5.215909090909091, "percentage": 74.51, "elapsed_time": "1:44:24", "remaining_time": "0:35:42"}
{"current_steps": 2300, "total_steps": 3080, "loss": 0.0042, "lr": 7.336241815317207e-06, "epoch": 5.2272727272727275, "percentage": 74.68, "elapsed_time": "1:44:35", "remaining_time": "0:35:28"}
{"current_steps": 2305, "total_steps": 3080, "loss": 0.0041, "lr": 7.248726026477577e-06, "epoch": 5.238636363636363, "percentage": 74.84, "elapsed_time": "1:44:46", "remaining_time": "0:35:13"}
{"current_steps": 2310, "total_steps": 3080, "loss": 0.0039, "lr": 7.161619692202675e-06, "epoch": 5.25, "percentage": 75.0, "elapsed_time": "1:44:58", "remaining_time": "0:34:59"}
{"current_steps": 2315, "total_steps": 3080, "loss": 0.0039, "lr": 7.0749256095531225e-06, "epoch": 5.261363636363637, "percentage": 75.16, "elapsed_time": "1:45:09", "remaining_time": "0:34:44"}
{"current_steps": 2320, "total_steps": 3080, "loss": 0.0041, "lr": 6.98864656235176e-06, "epoch": 5.2727272727272725, "percentage": 75.32, "elapsed_time": "1:45:20", "remaining_time": "0:34:30"}
{"current_steps": 2325, "total_steps": 3080, "loss": 0.0038, "lr": 6.902785321094301e-06, "epoch": 5.284090909090909, "percentage": 75.49, "elapsed_time": "1:45:30", "remaining_time": "0:34:15"}
{"current_steps": 2330, "total_steps": 3080, "loss": 0.0037, "lr": 6.817344642860322e-06, "epoch": 5.295454545454546, "percentage": 75.65, "elapsed_time": "1:45:42", "remaining_time": "0:34:01"}
{"current_steps": 2335, "total_steps": 3080, "loss": 0.0039, "lr": 6.732327271224759e-06, "epoch": 5.306818181818182, "percentage": 75.81, "elapsed_time": "1:45:54", "remaining_time": "0:33:47"}
{"current_steps": 2340, "total_steps": 3080, "loss": 0.004, "lr": 6.647735936169819e-06, "epoch": 5.318181818181818, "percentage": 75.97, "elapsed_time": "1:46:07", "remaining_time": "0:33:33"}
{"current_steps": 2345, "total_steps": 3080, "loss": 0.004, "lr": 6.563573353997275e-06, "epoch": 5.329545454545454, "percentage": 76.14, "elapsed_time": "1:46:18", "remaining_time": "0:33:19"}
{"current_steps": 2350, "total_steps": 3080, "loss": 0.0039, "lr": 6.479842227241293e-06, "epoch": 5.340909090909091, "percentage": 76.3, "elapsed_time": "1:46:29", "remaining_time": "0:33:04"}
{"current_steps": 2355, "total_steps": 3080, "loss": 0.004, "lr": 6.396545244581609e-06, "epoch": 5.3522727272727275, "percentage": 76.46, "elapsed_time": "1:46:40", "remaining_time": "0:32:50"}
{"current_steps": 2360, "total_steps": 3080, "loss": 0.0039, "lr": 6.313685080757235e-06, "epoch": 5.363636363636363, "percentage": 76.62, "elapsed_time": "1:46:51", "remaining_time": "0:32:35"}
{"current_steps": 2365, "total_steps": 3080, "loss": 0.0039, "lr": 6.231264396480525e-06, "epoch": 5.375, "percentage": 76.79, "elapsed_time": "1:47:04", "remaining_time": "0:32:22"}
{"current_steps": 2370, "total_steps": 3080, "loss": 0.004, "lr": 6.149285838351766e-06, "epoch": 5.386363636363637, "percentage": 76.95, "elapsed_time": "1:47:15", "remaining_time": "0:32:07"}
{"current_steps": 2375, "total_steps": 3080, "loss": 0.004, "lr": 6.067752038774204e-06, "epoch": 5.3977272727272725, "percentage": 77.11, "elapsed_time": "1:47:26", "remaining_time": "0:31:53"}
{"current_steps": 2380, "total_steps": 3080, "loss": 0.0037, "lr": 5.986665615869478e-06, "epoch": 5.409090909090909, "percentage": 77.27, "elapsed_time": "1:47:40", "remaining_time": "0:31:40"}
{"current_steps": 2385, "total_steps": 3080, "loss": 0.004, "lr": 5.9060291733935795e-06, "epoch": 5.420454545454546, "percentage": 77.44, "elapsed_time": "1:47:52", "remaining_time": "0:31:26"}
{"current_steps": 2390, "total_steps": 3080, "loss": 0.0039, "lr": 5.825845300653226e-06, "epoch": 5.431818181818182, "percentage": 77.6, "elapsed_time": "1:48:03", "remaining_time": "0:31:11"}
{"current_steps": 2395, "total_steps": 3080, "loss": 0.0041, "lr": 5.7461165724227485e-06, "epoch": 5.443181818181818, "percentage": 77.76, "elapsed_time": "1:48:15", "remaining_time": "0:30:57"}
{"current_steps": 2400, "total_steps": 3080, "loss": 0.004, "lr": 5.666845548861357e-06, "epoch": 5.454545454545454, "percentage": 77.92, "elapsed_time": "1:48:26", "remaining_time": "0:30:43"}
{"current_steps": 2405, "total_steps": 3080, "loss": 0.004, "lr": 5.588034775430997e-06, "epoch": 5.465909090909091, "percentage": 78.08, "elapsed_time": "1:49:38", "remaining_time": "0:30:46"}
{"current_steps": 2410, "total_steps": 3080, "loss": 0.0039, "lr": 5.509686782814547e-06, "epoch": 5.4772727272727275, "percentage": 78.25, "elapsed_time": "1:49:50", "remaining_time": "0:30:32"}
{"current_steps": 2415, "total_steps": 3080, "loss": 0.0039, "lr": 5.43180408683462e-06, "epoch": 5.488636363636363, "percentage": 78.41, "elapsed_time": "1:50:01", "remaining_time": "0:30:17"}
{"current_steps": 2420, "total_steps": 3080, "loss": 0.0039, "lr": 5.354389188372726e-06, "epoch": 5.5, "percentage": 78.57, "elapsed_time": "1:50:14", "remaining_time": "0:30:03"}
{"current_steps": 2425, "total_steps": 3080, "loss": 0.0039, "lr": 5.277444573288986e-06, "epoch": 5.511363636363637, "percentage": 78.73, "elapsed_time": "1:50:25", "remaining_time": "0:29:49"}
{"current_steps": 2430, "total_steps": 3080, "loss": 0.0039, "lr": 5.200972712342327e-06, "epoch": 5.5227272727272725, "percentage": 78.9, "elapsed_time": "1:50:36", "remaining_time": "0:29:35"}
{"current_steps": 2435, "total_steps": 3080, "loss": 0.0038, "lr": 5.124976061111109e-06, "epoch": 5.534090909090909, "percentage": 79.06, "elapsed_time": "1:50:47", "remaining_time": "0:29:20"}
{"current_steps": 2440, "total_steps": 3080, "loss": 0.0039, "lr": 5.049457059914311e-06, "epoch": 5.545454545454545, "percentage": 79.22, "elapsed_time": "1:50:58", "remaining_time": "0:29:06"}
{"current_steps": 2445, "total_steps": 3080, "loss": 0.0042, "lr": 4.97441813373313e-06, "epoch": 5.556818181818182, "percentage": 79.38, "elapsed_time": "1:51:10", "remaining_time": "0:28:52"}
{"current_steps": 2450, "total_steps": 3080, "loss": 0.0039, "lr": 4.899861692133157e-06, "epoch": 5.568181818181818, "percentage": 79.55, "elapsed_time": "1:51:22", "remaining_time": "0:28:38"}
{"current_steps": 2455, "total_steps": 3080, "loss": 0.004, "lr": 4.82579012918696e-06, "epoch": 5.579545454545455, "percentage": 79.71, "elapsed_time": "1:51:34", "remaining_time": "0:28:24"}
{"current_steps": 2460, "total_steps": 3080, "loss": 0.0038, "lr": 4.752205823397236e-06, "epoch": 5.590909090909091, "percentage": 79.87, "elapsed_time": "1:51:44", "remaining_time": "0:28:09"}
{"current_steps": 2465, "total_steps": 3080, "loss": 0.0038, "lr": 4.679111137620442e-06, "epoch": 5.6022727272727275, "percentage": 80.03, "elapsed_time": "1:51:56", "remaining_time": "0:27:55"}
{"current_steps": 2470, "total_steps": 3080, "loss": 0.0038, "lr": 4.606508418990885e-06, "epoch": 5.613636363636363, "percentage": 80.19, "elapsed_time": "1:52:07", "remaining_time": "0:27:41"}
{"current_steps": 2475, "total_steps": 3080, "loss": 0.004, "lr": 4.534399998845396e-06, "epoch": 5.625, "percentage": 80.36, "elapsed_time": "1:52:18", "remaining_time": "0:27:27"}
{"current_steps": 2480, "total_steps": 3080, "loss": 0.0039, "lr": 4.462788192648435e-06, "epoch": 5.636363636363637, "percentage": 80.52, "elapsed_time": "1:52:30", "remaining_time": "0:27:13"}
{"current_steps": 2485, "total_steps": 3080, "loss": 0.0039, "lr": 4.391675299917768e-06, "epoch": 5.6477272727272725, "percentage": 80.68, "elapsed_time": "1:52:41", "remaining_time": "0:26:58"}
{"current_steps": 2490, "total_steps": 3080, "loss": 0.0039, "lr": 4.3210636041506e-06, "epoch": 5.659090909090909, "percentage": 80.84, "elapsed_time": "1:52:52", "remaining_time": "0:26:44"}
{"current_steps": 2495, "total_steps": 3080, "loss": 0.0038, "lr": 4.25095537275026e-06, "epoch": 5.670454545454545, "percentage": 81.01, "elapsed_time": "1:53:03", "remaining_time": "0:26:30"}
{"current_steps": 2500, "total_steps": 3080, "loss": 0.0037, "lr": 4.181352856953418e-06, "epoch": 5.681818181818182, "percentage": 81.17, "elapsed_time": "1:53:14", "remaining_time": "0:26:16"}
{"current_steps": 2505, "total_steps": 3080, "loss": 0.0039, "lr": 4.112258291757747e-06, "epoch": 5.693181818181818, "percentage": 81.33, "elapsed_time": "1:53:25", "remaining_time": "0:26:02"}
{"current_steps": 2510, "total_steps": 3080, "loss": 0.0036, "lr": 4.043673895850202e-06, "epoch": 5.704545454545455, "percentage": 81.49, "elapsed_time": "1:53:37", "remaining_time": "0:25:48"}
{"current_steps": 2515, "total_steps": 3080, "loss": 0.0038, "lr": 3.975601871535739e-06, "epoch": 5.715909090909091, "percentage": 81.66, "elapsed_time": "1:53:49", "remaining_time": "0:25:34"}
{"current_steps": 2520, "total_steps": 3080, "loss": 0.004, "lr": 3.908044404666633e-06, "epoch": 5.7272727272727275, "percentage": 81.82, "elapsed_time": "1:54:00", "remaining_time": "0:25:20"}
{"current_steps": 2525, "total_steps": 3080, "loss": 0.0039, "lr": 3.841003664572251e-06, "epoch": 5.738636363636363, "percentage": 81.98, "elapsed_time": "1:54:11", "remaining_time": "0:25:06"}
{"current_steps": 2530, "total_steps": 3080, "loss": 0.0039, "lr": 3.7744818039894202e-06, "epoch": 5.75, "percentage": 82.14, "elapsed_time": "1:54:23", "remaining_time": "0:24:52"}
{"current_steps": 2535, "total_steps": 3080, "loss": 0.0038, "lr": 3.708480958993286e-06, "epoch": 5.761363636363637, "percentage": 82.31, "elapsed_time": "1:54:34", "remaining_time": "0:24:38"}
{"current_steps": 2540, "total_steps": 3080, "loss": 0.0038, "lr": 3.6430032489287424e-06, "epoch": 5.7727272727272725, "percentage": 82.47, "elapsed_time": "1:54:47", "remaining_time": "0:24:24"}
{"current_steps": 2545, "total_steps": 3080, "loss": 0.0038, "lr": 3.5780507763423456e-06, "epoch": 5.784090909090909, "percentage": 82.63, "elapsed_time": "1:54:58", "remaining_time": "0:24:10"}
{"current_steps": 2550, "total_steps": 3080, "loss": 0.0037, "lr": 3.513625626914823e-06, "epoch": 5.795454545454545, "percentage": 82.79, "elapsed_time": "1:55:09", "remaining_time": "0:23:56"}
{"current_steps": 2555, "total_steps": 3080, "loss": 0.0038, "lr": 3.4497298693941007e-06, "epoch": 5.806818181818182, "percentage": 82.95, "elapsed_time": "1:55:21", "remaining_time": "0:23:42"}
{"current_steps": 2560, "total_steps": 3080, "loss": 0.0038, "lr": 3.3863655555288564e-06, "epoch": 5.818181818181818, "percentage": 83.12, "elapsed_time": "1:55:33", "remaining_time": "0:23:28"}
{"current_steps": 2565, "total_steps": 3080, "loss": 0.0039, "lr": 3.323534720002659e-06, "epoch": 5.829545454545455, "percentage": 83.28, "elapsed_time": "1:55:45", "remaining_time": "0:23:14"}
{"current_steps": 2570, "total_steps": 3080, "loss": 0.0039, "lr": 3.261239380368606e-06, "epoch": 5.840909090909091, "percentage": 83.44, "elapsed_time": "1:55:56", "remaining_time": "0:23:00"}
{"current_steps": 2575, "total_steps": 3080, "loss": 0.0041, "lr": 3.199481536984572e-06, "epoch": 5.8522727272727275, "percentage": 83.6, "elapsed_time": "1:56:09", "remaining_time": "0:22:46"}
{"current_steps": 2580, "total_steps": 3080, "loss": 0.0039, "lr": 3.138263172948943e-06, "epoch": 5.863636363636363, "percentage": 83.77, "elapsed_time": "1:56:20", "remaining_time": "0:22:32"}
{"current_steps": 2585, "total_steps": 3080, "loss": 0.0039, "lr": 3.077586254036946e-06, "epoch": 5.875, "percentage": 83.93, "elapsed_time": "1:56:32", "remaining_time": "0:22:18"}
{"current_steps": 2590, "total_steps": 3080, "loss": 0.0037, "lr": 3.017452728637553e-06, "epoch": 5.886363636363637, "percentage": 84.09, "elapsed_time": "1:56:42", "remaining_time": "0:22:04"}
{"current_steps": 2595, "total_steps": 3080, "loss": 0.004, "lr": 2.95786452769087e-06, "epoch": 5.8977272727272725, "percentage": 84.25, "elapsed_time": "1:56:53", "remaining_time": "0:21:50"}
{"current_steps": 2600, "total_steps": 3080, "loss": 0.0038, "lr": 2.898823564626181e-06, "epoch": 5.909090909090909, "percentage": 84.42, "elapsed_time": "1:57:04", "remaining_time": "0:21:36"}
{"current_steps": 2605, "total_steps": 3080, "loss": 0.0036, "lr": 2.840331735300459e-06, "epoch": 5.920454545454545, "percentage": 84.58, "elapsed_time": "1:58:17", "remaining_time": "0:21:34"}
{"current_steps": 2610, "total_steps": 3080, "loss": 0.004, "lr": 2.7823909179375365e-06, "epoch": 5.931818181818182, "percentage": 84.74, "elapsed_time": "1:58:29", "remaining_time": "0:21:20"}
{"current_steps": 2615, "total_steps": 3080, "loss": 0.0038, "lr": 2.7250029730677498e-06, "epoch": 5.943181818181818, "percentage": 84.9, "elapsed_time": "1:58:41", "remaining_time": "0:21:06"}
{"current_steps": 2620, "total_steps": 3080, "loss": 0.0038, "lr": 2.668169743468223e-06, "epoch": 5.954545454545455, "percentage": 85.06, "elapsed_time": "1:58:52", "remaining_time": "0:20:52"}
{"current_steps": 2625, "total_steps": 3080, "loss": 0.0039, "lr": 2.6118930541036937e-06, "epoch": 5.965909090909091, "percentage": 85.23, "elapsed_time": "1:59:03", "remaining_time": "0:20:38"}
{"current_steps": 2630, "total_steps": 3080, "loss": 0.0036, "lr": 2.556174712067894e-06, "epoch": 5.9772727272727275, "percentage": 85.39, "elapsed_time": "1:59:15", "remaining_time": "0:20:24"}
{"current_steps": 2635, "total_steps": 3080, "loss": 0.004, "lr": 2.5010165065255465e-06, "epoch": 5.988636363636363, "percentage": 85.55, "elapsed_time": "1:59:27", "remaining_time": "0:20:10"}
{"current_steps": 2640, "total_steps": 3080, "loss": 0.0037, "lr": 2.4464202086548874e-06, "epoch": 6.0, "percentage": 85.71, "elapsed_time": "1:59:39", "remaining_time": "0:19:56"}
{"current_steps": 2645, "total_steps": 3080, "loss": 0.0034, "lr": 2.3923875715908195e-06, "epoch": 6.011363636363637, "percentage": 85.88, "elapsed_time": "1:59:51", "remaining_time": "0:19:42"}
{"current_steps": 2650, "total_steps": 3080, "loss": 0.0033, "lr": 2.338920330368599e-06, "epoch": 6.0227272727272725, "percentage": 86.04, "elapsed_time": "2:00:02", "remaining_time": "0:19:28"}
{"current_steps": 2655, "total_steps": 3080, "loss": 0.0033, "lr": 2.2860202018681153e-06, "epoch": 6.034090909090909, "percentage": 86.2, "elapsed_time": "2:00:14", "remaining_time": "0:19:14"}
{"current_steps": 2660, "total_steps": 3080, "loss": 0.0032, "lr": 2.2336888847587978e-06, "epoch": 6.045454545454546, "percentage": 86.36, "elapsed_time": "2:00:25", "remaining_time": "0:19:00"}
{"current_steps": 2665, "total_steps": 3080, "loss": 0.0034, "lr": 2.1819280594450134e-06, "epoch": 6.056818181818182, "percentage": 86.53, "elapsed_time": "2:00:37", "remaining_time": "0:18:47"}
{"current_steps": 2670, "total_steps": 3080, "loss": 0.0035, "lr": 2.1307393880121684e-06, "epoch": 6.068181818181818, "percentage": 86.69, "elapsed_time": "2:00:50", "remaining_time": "0:18:33"}
{"current_steps": 2675, "total_steps": 3080, "loss": 0.0033, "lr": 2.080124514173285e-06, "epoch": 6.079545454545454, "percentage": 86.85, "elapsed_time": "2:01:00", "remaining_time": "0:18:19"}
{"current_steps": 2680, "total_steps": 3080, "loss": 0.0033, "lr": 2.0300850632162517e-06, "epoch": 6.090909090909091, "percentage": 87.01, "elapsed_time": "2:01:13", "remaining_time": "0:18:05"}
{"current_steps": 2685, "total_steps": 3080, "loss": 0.0033, "lr": 1.9806226419516195e-06, "epoch": 6.1022727272727275, "percentage": 87.18, "elapsed_time": "2:01:24", "remaining_time": "0:17:51"}
{"current_steps": 2690, "total_steps": 3080, "loss": 0.0033, "lr": 1.9317388386610213e-06, "epoch": 6.113636363636363, "percentage": 87.34, "elapsed_time": "2:01:36", "remaining_time": "0:17:37"}
{"current_steps": 2695, "total_steps": 3080, "loss": 0.0031, "lr": 1.8834352230461506e-06, "epoch": 6.125, "percentage": 87.5, "elapsed_time": "2:01:46", "remaining_time": "0:17:23"}
{"current_steps": 2700, "total_steps": 3080, "loss": 0.0033, "lr": 1.835713346178376e-06, "epoch": 6.136363636363637, "percentage": 87.66, "elapsed_time": "2:01:58", "remaining_time": "0:17:09"}
{"current_steps": 2705, "total_steps": 3080, "loss": 0.0032, "lr": 1.7885747404489162e-06, "epoch": 6.1477272727272725, "percentage": 87.82, "elapsed_time": "2:02:09", "remaining_time": "0:16:56"}
{"current_steps": 2710, "total_steps": 3080, "loss": 0.0032, "lr": 1.7420209195196447e-06, "epoch": 6.159090909090909, "percentage": 87.99, "elapsed_time": "2:02:23", "remaining_time": "0:16:42"}
{"current_steps": 2715, "total_steps": 3080, "loss": 0.0034, "lr": 1.6960533782744938e-06, "epoch": 6.170454545454546, "percentage": 88.15, "elapsed_time": "2:02:36", "remaining_time": "0:16:28"}
{"current_steps": 2720, "total_steps": 3080, "loss": 0.0033, "lr": 1.650673592771428e-06, "epoch": 6.181818181818182, "percentage": 88.31, "elapsed_time": "2:02:47", "remaining_time": "0:16:15"}
{"current_steps": 2725, "total_steps": 3080, "loss": 0.0031, "lr": 1.6058830201950738e-06, "epoch": 6.193181818181818, "percentage": 88.47, "elapsed_time": "2:02:58", "remaining_time": "0:16:01"}
{"current_steps": 2730, "total_steps": 3080, "loss": 0.0032, "lr": 1.561683098809903e-06, "epoch": 6.204545454545454, "percentage": 88.64, "elapsed_time": "2:03:09", "remaining_time": "0:15:47"}
{"current_steps": 2735, "total_steps": 3080, "loss": 0.0031, "lr": 1.5180752479140727e-06, "epoch": 6.215909090909091, "percentage": 88.8, "elapsed_time": "2:03:21", "remaining_time": "0:15:33"}
{"current_steps": 2740, "total_steps": 3080, "loss": 0.0032, "lr": 1.475060867793827e-06, "epoch": 6.2272727272727275, "percentage": 88.96, "elapsed_time": "2:03:34", "remaining_time": "0:15:20"}
{"current_steps": 2745, "total_steps": 3080, "loss": 0.0033, "lr": 1.4326413396785488e-06, "epoch": 6.238636363636363, "percentage": 89.12, "elapsed_time": "2:03:46", "remaining_time": "0:15:06"}
{"current_steps": 2750, "total_steps": 3080, "loss": 0.0032, "lr": 1.3908180256964078e-06, "epoch": 6.25, "percentage": 89.29, "elapsed_time": "2:03:57", "remaining_time": "0:14:52"}
{"current_steps": 2755, "total_steps": 3080, "loss": 0.0034, "lr": 1.3495922688306063e-06, "epoch": 6.261363636363637, "percentage": 89.45, "elapsed_time": "2:04:09", "remaining_time": "0:14:38"}
{"current_steps": 2760, "total_steps": 3080, "loss": 0.0033, "lr": 1.3089653928762758e-06, "epoch": 6.2727272727272725, "percentage": 89.61, "elapsed_time": "2:04:21", "remaining_time": "0:14:25"}
{"current_steps": 2765, "total_steps": 3080, "loss": 0.0033, "lr": 1.268938702397946e-06, "epoch": 6.284090909090909, "percentage": 89.77, "elapsed_time": "2:04:32", "remaining_time": "0:14:11"}
{"current_steps": 2770, "total_steps": 3080, "loss": 0.0032, "lr": 1.2295134826876788e-06, "epoch": 6.295454545454546, "percentage": 89.94, "elapsed_time": "2:04:44", "remaining_time": "0:13:57"}
{"current_steps": 2775, "total_steps": 3080, "loss": 0.0035, "lr": 1.1906909997237714e-06, "epoch": 6.306818181818182, "percentage": 90.1, "elapsed_time": "2:04:56", "remaining_time": "0:13:43"}
{"current_steps": 2780, "total_steps": 3080, "loss": 0.0032, "lr": 1.1524725001301197e-06, "epoch": 6.318181818181818, "percentage": 90.26, "elapsed_time": "2:05:08", "remaining_time": "0:13:30"}
{"current_steps": 2785, "total_steps": 3080, "loss": 0.0034, "lr": 1.1148592111361945e-06, "epoch": 6.329545454545454, "percentage": 90.42, "elapsed_time": "2:05:19", "remaining_time": "0:13:16"}
{"current_steps": 2790, "total_steps": 3080, "loss": 0.0033, "lr": 1.0778523405376084e-06, "epoch": 6.340909090909091, "percentage": 90.58, "elapsed_time": "2:05:31", "remaining_time": "0:13:02"}
{"current_steps": 2795, "total_steps": 3080, "loss": 0.0034, "lr": 1.0414530766573661e-06, "epoch": 6.3522727272727275, "percentage": 90.75, "elapsed_time": "2:05:42", "remaining_time": "0:12:49"}
{"current_steps": 2800, "total_steps": 3080, "loss": 0.0031, "lr": 1.005662588307672e-06, "epoch": 6.363636363636363, "percentage": 90.91, "elapsed_time": "2:05:53", "remaining_time": "0:12:35"}
{"current_steps": 2805, "total_steps": 3080, "loss": 0.0034, "lr": 9.704820247524328e-07, "epoch": 6.375, "percentage": 91.07, "elapsed_time": "2:07:03", "remaining_time": "0:12:27"}
{"current_steps": 2810, "total_steps": 3080, "loss": 0.0032, "lr": 9.359125156703208e-07, "epoch": 6.386363636363637, "percentage": 91.23, "elapsed_time": "2:07:14", "remaining_time": "0:12:13"}
{"current_steps": 2815, "total_steps": 3080, "loss": 0.0033, "lr": 9.019551711185226e-07, "epoch": 6.3977272727272725, "percentage": 91.4, "elapsed_time": "2:07:25", "remaining_time": "0:11:59"}
{"current_steps": 2820, "total_steps": 3080, "loss": 0.0033, "lr": 8.68611081497086e-07, "epoch": 6.409090909090909, "percentage": 91.56, "elapsed_time": "2:07:36", "remaining_time": "0:11:45"}
{"current_steps": 2825, "total_steps": 3080, "loss": 0.0032, "lr": 8.358813175139046e-07, "epoch": 6.420454545454546, "percentage": 91.72, "elapsed_time": "2:07:48", "remaining_time": "0:11:32"}
{"current_steps": 2830, "total_steps": 3080, "loss": 0.0034, "lr": 8.037669301503403e-07, "epoch": 6.431818181818182, "percentage": 91.88, "elapsed_time": "2:08:00", "remaining_time": "0:11:18"}
{"current_steps": 2835, "total_steps": 3080, "loss": 0.0032, "lr": 7.722689506274639e-07, "epoch": 6.443181818181818, "percentage": 92.05, "elapsed_time": "2:08:11", "remaining_time": "0:11:04"}
{"current_steps": 2840, "total_steps": 3080, "loss": 0.0032, "lr": 7.413883903729701e-07, "epoch": 6.454545454545454, "percentage": 92.21, "elapsed_time": "2:08:23", "remaining_time": "0:10:51"}
{"current_steps": 2845, "total_steps": 3080, "loss": 0.0035, "lr": 7.111262409886599e-07, "epoch": 6.465909090909091, "percentage": 92.37, "elapsed_time": "2:08:34", "remaining_time": "0:10:37"}
{"current_steps": 2850, "total_steps": 3080, "loss": 0.0033, "lr": 6.814834742186361e-07, "epoch": 6.4772727272727275, "percentage": 92.53, "elapsed_time": "2:08:47", "remaining_time": "0:10:23"}
{"current_steps": 2855, "total_steps": 3080, "loss": 0.0032, "lr": 6.52461041918071e-07, "epoch": 6.488636363636363, "percentage": 92.69, "elapsed_time": "2:08:59", "remaining_time": "0:10:09"}
{"current_steps": 2860, "total_steps": 3080, "loss": 0.0033, "lr": 6.240598760226691e-07, "epoch": 6.5, "percentage": 92.86, "elapsed_time": "2:09:10", "remaining_time": "0:09:56"}
{"current_steps": 2865, "total_steps": 3080, "loss": 0.0034, "lr": 5.962808885187121e-07, "epoch": 6.511363636363637, "percentage": 93.02, "elapsed_time": "2:09:22", "remaining_time": "0:09:42"}
{"current_steps": 2870, "total_steps": 3080, "loss": 0.0032, "lr": 5.691249714137948e-07, "epoch": 6.5227272727272725, "percentage": 93.18, "elapsed_time": "2:09:33", "remaining_time": "0:09:28"}
{"current_steps": 2875, "total_steps": 3080, "loss": 0.0031, "lr": 5.425929967081822e-07, "epoch": 6.534090909090909, "percentage": 93.34, "elapsed_time": "2:09:45", "remaining_time": "0:09:15"}
{"current_steps": 2880, "total_steps": 3080, "loss": 0.0032, "lr": 5.166858163667932e-07, "epoch": 6.545454545454545, "percentage": 93.51, "elapsed_time": "2:09:56", "remaining_time": "0:09:01"}
{"current_steps": 2885, "total_steps": 3080, "loss": 0.0033, "lr": 4.914042622918591e-07, "epoch": 6.556818181818182, "percentage": 93.67, "elapsed_time": "2:10:08", "remaining_time": "0:08:47"}
{"current_steps": 2890, "total_steps": 3080, "loss": 0.0032, "lr": 4.667491462962037e-07, "epoch": 6.568181818181818, "percentage": 93.83, "elapsed_time": "2:10:19", "remaining_time": "0:08:34"}
{"current_steps": 2895, "total_steps": 3080, "loss": 0.0033, "lr": 4.4272126007717466e-07, "epoch": 6.579545454545455, "percentage": 93.99, "elapsed_time": "2:10:31", "remaining_time": "0:08:20"}
{"current_steps": 2900, "total_steps": 3080, "loss": 0.0031, "lr": 4.1932137519122175e-07, "epoch": 6.590909090909091, "percentage": 94.16, "elapsed_time": "2:10:42", "remaining_time": "0:08:06"}
{"current_steps": 2905, "total_steps": 3080, "loss": 0.0034, "lr": 3.965502430291235e-07, "epoch": 6.6022727272727275, "percentage": 94.32, "elapsed_time": "2:10:54", "remaining_time": "0:07:53"}
{"current_steps": 2910, "total_steps": 3080, "loss": 0.0032, "lr": 3.7440859479185967e-07, "epoch": 6.613636363636363, "percentage": 94.48, "elapsed_time": "2:11:06", "remaining_time": "0:07:39"}
{"current_steps": 2915, "total_steps": 3080, "loss": 0.0034, "lr": 3.5289714146713004e-07, "epoch": 6.625, "percentage": 94.64, "elapsed_time": "2:11:17", "remaining_time": "0:07:25"}
{"current_steps": 2920, "total_steps": 3080, "loss": 0.0032, "lr": 3.3201657380652173e-07, "epoch": 6.636363636363637, "percentage": 94.81, "elapsed_time": "2:11:28", "remaining_time": "0:07:12"}
{"current_steps": 2925, "total_steps": 3080, "loss": 0.0033, "lr": 3.117675623033334e-07, "epoch": 6.6477272727272725, "percentage": 94.97, "elapsed_time": "2:11:39", "remaining_time": "0:06:58"}
{"current_steps": 2930, "total_steps": 3080, "loss": 0.0034, "lr": 2.9215075717104356e-07, "epoch": 6.659090909090909, "percentage": 95.13, "elapsed_time": "2:11:50", "remaining_time": "0:06:44"}
{"current_steps": 2935, "total_steps": 3080, "loss": 0.0031, "lr": 2.731667883224298e-07, "epoch": 6.670454545454545, "percentage": 95.29, "elapsed_time": "2:12:03", "remaining_time": "0:06:31"}
{"current_steps": 2940, "total_steps": 3080, "loss": 0.0033, "lr": 2.548162653493402e-07, "epoch": 6.681818181818182, "percentage": 95.45, "elapsed_time": "2:12:14", "remaining_time": "0:06:17"}
{"current_steps": 2945, "total_steps": 3080, "loss": 0.0033, "lr": 2.370997775031292e-07, "epoch": 6.693181818181818, "percentage": 95.62, "elapsed_time": "2:12:25", "remaining_time": "0:06:04"}
{"current_steps": 2950, "total_steps": 3080, "loss": 0.0034, "lr": 2.2001789367571692e-07, "epoch": 6.704545454545455, "percentage": 95.78, "elapsed_time": "2:12:36", "remaining_time": "0:05:50"}
{"current_steps": 2955, "total_steps": 3080, "loss": 0.0035, "lr": 2.0357116238134633e-07, "epoch": 6.715909090909091, "percentage": 95.94, "elapsed_time": "2:12:48", "remaining_time": "0:05:37"}
{"current_steps": 2960, "total_steps": 3080, "loss": 0.0032, "lr": 1.8776011173894383e-07, "epoch": 6.7272727272727275, "percentage": 96.1, "elapsed_time": "2:13:00", "remaining_time": "0:05:23"}
{"current_steps": 2965, "total_steps": 3080, "loss": 0.0032, "lr": 1.725852494551883e-07, "epoch": 6.738636363636363, "percentage": 96.27, "elapsed_time": "2:13:11", "remaining_time": "0:05:09"}
{"current_steps": 2970, "total_steps": 3080, "loss": 0.0033, "lr": 1.5804706280817984e-07, "epoch": 6.75, "percentage": 96.43, "elapsed_time": "2:13:22", "remaining_time": "0:04:56"}
{"current_steps": 2975, "total_steps": 3080, "loss": 0.0033, "lr": 1.441460186318122e-07, "epoch": 6.761363636363637, "percentage": 96.59, "elapsed_time": "2:13:34", "remaining_time": "0:04:42"}
{"current_steps": 2980, "total_steps": 3080, "loss": 0.0031, "lr": 1.3088256330078264e-07, "epoch": 6.7727272727272725, "percentage": 96.75, "elapsed_time": "2:13:45", "remaining_time": "0:04:29"}
{"current_steps": 2985, "total_steps": 3080, "loss": 0.0032, "lr": 1.1825712271624102e-07, "epoch": 6.784090909090909, "percentage": 96.92, "elapsed_time": "2:13:56", "remaining_time": "0:04:15"}
{"current_steps": 2990, "total_steps": 3080, "loss": 0.0032, "lr": 1.0627010229213641e-07, "epoch": 6.795454545454545, "percentage": 97.08, "elapsed_time": "2:14:07", "remaining_time": "0:04:02"}
{"current_steps": 2995, "total_steps": 3080, "loss": 0.0033, "lr": 9.492188694218085e-08, "epoch": 6.806818181818182, "percentage": 97.24, "elapsed_time": "2:14:20", "remaining_time": "0:03:48"}
{"current_steps": 3000, "total_steps": 3080, "loss": 0.0033, "lr": 8.421284106750139e-08, "epoch": 6.818181818181818, "percentage": 97.4, "elapsed_time": "2:14:31", "remaining_time": "0:03:35"}
{"current_steps": 3005, "total_steps": 3080, "loss": 0.0032, "lr": 7.414330854492946e-08, "epoch": 6.829545454545455, "percentage": 97.56, "elapsed_time": "2:15:47", "remaining_time": "0:03:23"}
{"current_steps": 3010, "total_steps": 3080, "loss": 0.0032, "lr": 6.471361271596754e-08, "epoch": 6.840909090909091, "percentage": 97.73, "elapsed_time": "2:15:58", "remaining_time": "0:03:09"}
{"current_steps": 3015, "total_steps": 3080, "loss": 0.0033, "lr": 5.592405637639742e-08, "epoch": 6.8522727272727275, "percentage": 97.89, "elapsed_time": "2:16:09", "remaining_time": "0:02:56"}
{"current_steps": 3020, "total_steps": 3080, "loss": 0.0032, "lr": 4.777492176656351e-08, "epoch": 6.863636363636363, "percentage": 98.05, "elapsed_time": "2:16:20", "remaining_time": "0:02:42"}
{"current_steps": 3025, "total_steps": 3080, "loss": 0.0032, "lr": 4.026647056231348e-08, "epoch": 6.875, "percentage": 98.21, "elapsed_time": "2:16:31", "remaining_time": "0:02:28"}
{"current_steps": 3030, "total_steps": 3080, "loss": 0.0031, "lr": 3.339894386657827e-08, "epoch": 6.886363636363637, "percentage": 98.38, "elapsed_time": "2:16:43", "remaining_time": "0:02:15"}
{"current_steps": 3035, "total_steps": 3080, "loss": 0.0031, "lr": 2.7172562201656096e-08, "epoch": 6.8977272727272725, "percentage": 98.54, "elapsed_time": "2:16:54", "remaining_time": "0:02:01"}
{"current_steps": 3040, "total_steps": 3080, "loss": 0.0033, "lr": 2.1587525502104745e-08, "epoch": 6.909090909090909, "percentage": 98.7, "elapsed_time": "2:17:06", "remaining_time": "0:01:48"}
{"current_steps": 3045, "total_steps": 3080, "loss": 0.0032, "lr": 1.6644013108342294e-08, "epoch": 6.920454545454545, "percentage": 98.86, "elapsed_time": "2:17:17", "remaining_time": "0:01:34"}
{"current_steps": 3050, "total_steps": 3080, "loss": 0.003, "lr": 1.2342183760878368e-08, "epoch": 6.931818181818182, "percentage": 99.03, "elapsed_time": "2:17:32", "remaining_time": "0:01:21"}
{"current_steps": 3055, "total_steps": 3080, "loss": 0.0032, "lr": 8.682175595216002e-09, "epoch": 6.943181818181818, "percentage": 99.19, "elapsed_time": "2:17:44", "remaining_time": "0:01:07"}
{"current_steps": 3060, "total_steps": 3080, "loss": 0.0031, "lr": 5.664106137419634e-09, "epoch": 6.954545454545455, "percentage": 99.35, "elapsed_time": "2:17:56", "remaining_time": "0:00:54"}
{"current_steps": 3065, "total_steps": 3080, "loss": 0.0033, "lr": 3.288072300340339e-09, "epoch": 6.965909090909091, "percentage": 99.51, "elapsed_time": "2:18:08", "remaining_time": "0:00:40"}
{"current_steps": 3070, "total_steps": 3080, "loss": 0.0033, "lr": 1.5541503805027725e-09, "epoch": 6.9772727272727275, "percentage": 99.68, "elapsed_time": "2:18:19", "remaining_time": "0:00:27"}
{"current_steps": 3075, "total_steps": 3080, "loss": 0.0033, "lr": 4.623960556604523e-10, "epoch": 6.988636363636363, "percentage": 99.84, "elapsed_time": "2:18:30", "remaining_time": "0:00:13"}
{"current_steps": 3080, "total_steps": 3080, "loss": 0.0031, "lr": 1.2844382997201365e-11, "epoch": 7.0, "percentage": 100.0, "elapsed_time": "2:18:41", "remaining_time": "0:00:00"}
{"current_steps": 3080, "total_steps": 3080, "epoch": 7.0, "percentage": 100.0, "elapsed_time": "2:19:51", "remaining_time": "0:00:00"}
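The records above are one JSON object per line, each carrying `current_steps`, `total_steps`, `loss`, `lr`, and `epoch` (the final summary record omits `loss` and `lr`). A minimal sketch for summarizing such a log, assuming it is saved locally as a JSONL file (the path name here is illustrative, not part of this repository's layout):

```python
import json

def summarize_trainer_log(path: str) -> dict:
    """Summarize a per-step trainer log in JSONL form (one JSON record per line)."""
    records = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line.startswith("{"):
                continue  # skip blank lines and any non-JSON residue
            records.append(json.loads(line))
    # Not every record carries a loss (the final summary record omits it).
    losses = [r["loss"] for r in records if "loss" in r]
    return {
        "steps": records[-1]["current_steps"],
        "total_steps": records[-1]["total_steps"],
        "final_epoch": records[-1]["epoch"],
        "final_loss": losses[-1],
        "min_loss": min(losses),
    }
```

For the log shown here, this would report 3080 of 3080 steps completed over 7.0 epochs, with the loss plateauing around 0.003 in the final epoch.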
5589
trainer_state.json
Normal file
File diff suppressed because it is too large
3
training_args.bin
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:78ee58ed30dee3a36656e2e4411b474deb5d436a6a6142e8e80fdbb1c83d12be
size 8721
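`training_args.bin` is stored as a Git LFS pointer: a three-line text stub recording the spec version, the SHA-256 of the real blob, and its byte size. A minimal sketch of verifying a downloaded blob against such a pointer (the function name is illustrative; `git lfs` performs this check itself):

```python
import hashlib
from pathlib import Path

def verify_lfs_pointer(pointer_text: str, blob_path: str) -> bool:
    """Check a downloaded blob against the oid/size fields of a Git LFS pointer."""
    # Each pointer line is "key value"; parse into a dict.
    fields = dict(line.split(" ", 1) for line in pointer_text.strip().splitlines())
    expected_oid = fields["oid"].removeprefix("sha256:")
    expected_size = int(fields["size"])
    data = Path(blob_path).read_bytes()
    return len(data) == expected_size and hashlib.sha256(data).hexdigest() == expected_oid
```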
BIN
training_loss.png
Normal file
Binary file not shown.
After: Size: 36 KiB
1
vocab.json
Normal file
File diff suppressed because one or more lines are too long