Initialize project; model provided by the ModelHub XC community
Model: laion/Qwen3-8B_exp_tas_tmux_large_traces_save-strategy_steps
Source: Original Platform
36
.gitattributes
vendored
Normal file
@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
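As a rough illustration of what these attribute rules do, the sketch below checks which file names from this commit would be routed through Git LFS. It approximates gitattributes matching for slash-free patterns by applying Python's `fnmatchcase` to the base name. The helper name `tracked_by_lfs` and the shortened pattern list are assumptions made for the example, and directory patterns such as `saved_model/**/*` are deliberately left out because `fnmatch` does not model them.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Subset of the slash-free patterns listed in .gitattributes.
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.pt", "*tfevents*", "tokenizer.json"]


def tracked_by_lfs(path: str) -> bool:
    """Approximate check: does the base name match any LFS pattern?"""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)


for candidate in ["model-00001-of-00004.safetensors", "tokenizer.json", "config.json"]:
    print(candidate, tracked_by_lfs(candidate))
```

This agrees with the rest of the commit, where `config.json` appears as a plain-text diff while the `.safetensors` shards appear only as LFS pointers.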
60
README.md
Normal file
@@ -0,0 +1,60 @@
---
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen3-8B
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: Qwen3-8B_exp_tas_tmux_large_traces_save-strategy_steps
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# Qwen3-8B_exp_tas_tmux_large_traces_save-strategy_steps

This model is a fine-tuned version of [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on the DCAgent/exp_tas_tmux_large_traces dataset.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 32
- total_train_batch_size: 32
- total_eval_batch_size: 256
- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.87,0.99) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.005
- num_epochs: 8.0

### Training results



### Framework versions

- Transformers 4.55.0
- Pytorch 2.7.1+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1
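To make the scheduler settings in the model card concrete, here is a minimal sketch of a linear-warmup, cosine-decay learning-rate curve using the listed `learning_rate` and `lr_scheduler_warmup_ratio`. The total step count is not given in the model card; `TOTAL_STEPS = 2531` is an assumption (a rough estimate from the throughput figures in `all_results.json`), as are the function name `lr_at` and the choice to round warmup steps up. It also spells out the batch-size arithmetic, where an accumulation factor of 1 is inferred rather than stated.

```python
import math

PEAK_LR = 1e-4        # learning_rate from the model card
WARMUP_RATIO = 0.005  # lr_scheduler_warmup_ratio from the model card
TOTAL_STEPS = 2531    # assumption: not stated in the model card


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup followed by a single half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# Effective batch size: per-device batch x devices x accumulation steps.
# The accumulation factor of 1 is inferred from 1 * 32 == 32, not stated.
effective_batch = 1 * 32 * 1

for step in (0, 13, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, f"{lr_at(step):.2e}")
```

With a warmup ratio this small, the schedule spends only about a dozen steps ramping up before the cosine decay takes over.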
28
added_tokens.json
Normal file
@@ -0,0 +1,28 @@
{
  "</think>": 151668,
  "</tool_call>": 151658,
  "</tool_response>": 151666,
  "<think>": 151667,
  "<tool_call>": 151657,
  "<tool_response>": 151665,
  "<|box_end|>": 151649,
  "<|box_start|>": 151648,
  "<|endoftext|>": 151643,
  "<|file_sep|>": 151664,
  "<|fim_middle|>": 151660,
  "<|fim_pad|>": 151662,
  "<|fim_prefix|>": 151659,
  "<|fim_suffix|>": 151661,
  "<|im_end|>": 151645,
  "<|im_start|>": 151644,
  "<|image_pad|>": 151655,
  "<|object_ref_end|>": 151647,
  "<|object_ref_start|>": 151646,
  "<|quad_end|>": 151651,
  "<|quad_start|>": 151650,
  "<|repo_name|>": 151663,
  "<|video_pad|>": 151656,
  "<|vision_end|>": 151653,
  "<|vision_pad|>": 151654,
  "<|vision_start|>": 151652
}
16
all_results.json
Normal file
@@ -0,0 +1,16 @@
{
  "achieved_tflops_per_gpu": 0.0011402087434851426,
  "achieved_tflops_per_gpu_theoretical": 226.98960570915227,
  "epoch": 8.0,
  "loss_nan_ranks": 0,
  "loss_rank_avg": 0.016571061685681343,
  "mfu_percent": 0.0003654515203478021,
  "mfu_percent_theoretical": 72.75307875293342,
  "total_flos": 1378268975529984.0,
  "train_loss": 0.16813087239582386,
  "train_runtime": 37774.5792,
  "train_samples_per_second": 2.146,
  "train_steps_per_second": 0.067,
  "valid_targets_mean": 4493.3,
  "valid_targets_min": 905
}
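The throughput fields in `all_results.json` can be cross-checked with a little arithmetic. The sketch below assumes that MFU is defined as achieved throughput divided by a hardware peak, and that `num_devices` is 32 as listed in the model card; neither definition is stated in the file itself, so treat the result as a reading of the numbers rather than a documented fact.

```python
results = {
    "achieved_tflops_per_gpu": 0.0011402087434851426,
    "achieved_tflops_per_gpu_theoretical": 226.98960570915227,
    "mfu_percent": 0.0003654515203478021,
    "mfu_percent_theoretical": 72.75307875293342,
    "total_flos": 1378268975529984.0,
    "train_runtime": 37774.5792,
    "train_samples_per_second": 2.146,
    "train_steps_per_second": 0.067,
}
NUM_DEVICES = 32  # num_devices from the model card

# If MFU = achieved / peak, the peak implied by each pair is achieved / MFU.
implied_peak = results["achieved_tflops_per_gpu"] / (results["mfu_percent"] / 100)
implied_peak_theoretical = results["achieved_tflops_per_gpu_theoretical"] / (
    results["mfu_percent_theoretical"] / 100
)

# Per-GPU throughput recomputed from total_flos, runtime and device count.
tflops_from_flos = results["total_flos"] / results["train_runtime"] / NUM_DEVICES / 1e12

# Rough totals; the per-second rates in the file are rounded to three decimals.
approx_steps = results["train_steps_per_second"] * results["train_runtime"]
approx_samples = results["train_samples_per_second"] * results["train_runtime"]

print(f"implied peak TFLOPS: {implied_peak:.1f} and {implied_peak_theoretical:.1f}")
print(f"TFLOPS per GPU from total_flos: {tflops_from_flos:.10f}")
print(f"approx. steps: {approx_steps:.0f}, approx. samples: {approx_samples:.0f}")
```

Both MFU pairs imply the same peak of about 312 TFLOPS per GPU, and `achieved_tflops_per_gpu` matches `total_flos` divided by runtime and device count. That suggests the very small non-theoretical MFU follows from how `total_flos` was counted, although the source does not say so.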
89
chat_template.jinja
Normal file
@@ -0,0 +1,89 @@
{%- if tools %}
    {{- '<|im_start|>system\n' }}
    {%- if messages[0].role == 'system' %}
        {{- messages[0].content + '\n\n' }}
    {%- endif %}
    {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
    {%- for tool in tools %}
        {{- "\n" }}
        {{- tool | tojson }}
    {%- endfor %}
    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
    {%- if messages[0].role == 'system' %}
        {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
    {%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
    {%- set index = (messages|length - 1) - loop.index0 %}
    {%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
        {%- set ns.multi_step_tool = false %}
        {%- set ns.last_query_index = index %}
    {%- endif %}
{%- endfor %}
{%- for message in messages %}
    {%- if message.content is string %}
        {%- set content = message.content %}
    {%- else %}
        {%- set content = '' %}
    {%- endif %}
    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
        {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
    {%- elif message.role == "assistant" %}
        {%- set reasoning_content = '' %}
        {%- if message.reasoning_content is string %}
            {%- set reasoning_content = message.reasoning_content %}
        {%- else %}
            {%- if '</think>' in content %}
                {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
                {%- set content = content.split('</think>')[-1].lstrip('\n') %}
            {%- endif %}
        {%- endif %}
        {%- if loop.index0 > ns.last_query_index %}
            {%- if loop.last or (not loop.last and reasoning_content) %}
                {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
            {%- else %}
                {{- '<|im_start|>' + message.role + '\n' + content }}
            {%- endif %}
        {%- else %}
            {{- '<|im_start|>' + message.role + '\n' + content }}
        {%- endif %}
        {%- if message.tool_calls %}
            {%- for tool_call in message.tool_calls %}
                {%- if (loop.first and content) or (not loop.first) %}
                    {{- '\n' }}
                {%- endif %}
                {%- if tool_call.function %}
                    {%- set tool_call = tool_call.function %}
                {%- endif %}
                {{- '<tool_call>\n{"name": "' }}
                {{- tool_call.name }}
                {{- '", "arguments": ' }}
                {%- if tool_call.arguments is string %}
                    {{- tool_call.arguments }}
                {%- else %}
                    {{- tool_call.arguments | tojson }}
                {%- endif %}
                {{- '}\n</tool_call>' }}
            {%- endfor %}
        {%- endif %}
        {{- '<|im_end|>\n' }}
    {%- elif message.role == "tool" %}
        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
            {{- '<|im_start|>user' }}
        {%- endif %}
        {{- '\n<tool_response>\n' }}
        {{- content }}
        {{- '\n</tool_response>' }}
        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
            {{- '<|im_end|>\n' }}
        {%- endif %}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
    {%- if enable_thinking is defined and enable_thinking is false %}
        {{- '<think>\n\n</think>\n\n' }}
    {%- endif %}
{%- endif %}
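For orientation, the simplest path through this template (no tools, only string-valued system and user turns) can be mirrored in a few lines of plain Python. This is a hand-written sketch, not the Jinja engine: the function name `render_simple` is hypothetical, and assistant turns, tool calls and tool responses are intentionally rejected rather than handled.

```python
def render_simple(messages, add_generation_prompt=True, enable_thinking=True):
    """Render system/user turns the way the no-tools branch of the template does."""
    parts = []
    for message in messages:
        role, content = message["role"], message["content"]
        if role not in ("system", "user"):
            raise ValueError(f"role {role!r} is outside this simplified sketch")
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
        if enable_thinking is False:
            # The template pre-fills an empty think block in this case.
            parts.append("<think>\n\n</think>\n\n")
    return "".join(parts)


prompt = render_simple(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "List the files in /tmp."},
    ],
    enable_thinking=False,
)
print(prompt)
```

Each turn is wrapped in `<|im_start|>` and `<|im_end|>`, and the generation prompt opens an assistant turn that the model is expected to complete.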
68
config.json
Normal file
@@ -0,0 +1,68 @@
{
  "architectures": [
    "Qwen3ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 12288,
  "layer_types": [
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention"
  ],
  "max_position_embeddings": 40960,
  "max_window_layers": 36,
  "model_type": "qwen3",
  "num_attention_heads": 32,
  "num_hidden_layers": 36,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.55.0",
  "use_cache": false,
  "use_sliding_window": false,
  "vocab_size": 151936
}
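The shapes in `config.json` are enough to estimate the parameter count by hand. The sketch below is a back-of-the-envelope tally under a few assumptions: no projection biases (`attention_bias` is false), untied input and output embeddings (`tie_word_embeddings` is false), the per-layer tensors named in `model.safetensors.index.json`, and one final norm vector, which is assumed here because it is not visible in this commit view.

```python
config = {
    "hidden_size": 4096,
    "intermediate_size": 12288,
    "num_hidden_layers": 36,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "vocab_size": 151936,
}

h = config["hidden_size"]
q_dim = config["num_attention_heads"] * config["head_dim"]
kv_dim = config["num_key_value_heads"] * config["head_dim"]

attention = h * q_dim + 2 * h * kv_dim + q_dim * h  # q, k, v, o projections
qk_norm = 2 * config["head_dim"]                     # q_norm and k_norm
mlp = 3 * h * config["intermediate_size"]            # gate, up, down projections
layer_norms = 2 * h                                  # input and post-attention norms
per_layer = attention + qk_norm + mlp + layer_norms

embeddings = 2 * config["vocab_size"] * h            # embed_tokens + untied lm_head
final_norm = h                                       # assumed final norm vector
total_params = config["num_hidden_layers"] * per_layer + embeddings + final_norm
total_bytes = total_params * 2                       # bfloat16 = 2 bytes per weight

print(f"{total_params:,} parameters, {total_bytes:,} bytes in bfloat16")
```

Under these assumptions the tally comes to 8,190,735,360 parameters and 16,381,470,720 bytes, which equals the `total_size` recorded in `model.safetensors.index.json`. The `total_parameters` value of 308224 in that same file does not match this count, and the source does not explain it.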
13
generation_config.json
Normal file
@@ -0,0 +1,13 @@
{
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "temperature": 0.6,
  "top_k": 20,
  "top_p": 0.95,
  "transformers_version": "4.55.0"
}
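To show how the three sampling knobs in `generation_config.json` interact, here is a small standard-library sketch that applies temperature scaling, then top-k, then top-p (nucleus) filtering to a list of logits. It is an illustration only: the function names are hypothetical, the order of the filters is an assumption that follows common practice, and it is not the implementation used by any particular library.

```python
import math
import random


def sampling_distribution(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # numerically stable softmax

    # top-k: keep the k highest-scoring candidates.
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:top_k]
    k_mass = sum(weights[i] for i in ranked)

    # top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += weights[i] / k_mass
        if cumulative >= top_p:
            break

    kept_mass = sum(weights[i] for i in kept)
    return {i: weights[i] / kept_mass for i in kept}


def sample(logits, rng=random):
    """Draw one token index from the filtered distribution."""
    dist = sampling_distribution(logits)
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]


print(sampling_distribution([2.0, 1.0, 0.0, -1.0]))
```

A temperature below 1 sharpens the distribution before filtering, so with these defaults the low-probability tail is usually removed by the top-p cut-off rather than by the top-k limit.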
151388
merges.txt
Normal file
File diff suppressed because it is too large
3
model-00001-of-00004.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:42fba83abc939aa942639715ff32735d9fd7623bd3e50be0a21d174efb2b0855
size 4902257696
3
model-00002-of-00004.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:28c0b557ccffef59e2b02d373ccc5bc9a4cd1b4b21f94b61cd26d7f5d442ab18
size 4915960368
3
model-00003-of-00004.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:df52e20750ce7810443361829c217d78bb18b1823ade0220e3f569150e54a9eb
size 4983068496
3
model-00004-of-00004.safetensors
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a312c1bfdad56434820140d7ef1127aa6e268203c0f69ee404cee79c77f754ed
size 1580230264
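Each shard is stored in the repository as a three-line Git LFS pointer rather than as the weights themselves. The sketch below parses one such pointer into its fields and adds up the four recorded sizes; the helper name `parse_lfs_pointer` is hypothetical and the parser only handles the `version`, `oid` and `size` keys shown in this commit.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a312c1bfdad56434820140d7ef1127aa6e268203c0f69ee404cee79c77f754ed
size 1580230264
"""
info = parse_lfs_pointer(pointer)

# Sizes of the four shards as recorded in the pointers of this commit.
shard_sizes = [4902257696, 4915960368, 4983068496, 1580230264]
print(info["algorithm"], info["size"], sum(shard_sizes))
```

The shard sizes sum to 16,381,516,824 bytes, which is 46,104 bytes more than the `total_size` of 16381470720 in `model.safetensors.index.json`. That gap would be consistent with per-file header overhead, though the source does not confirm it.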
407
model.safetensors.index.json
Normal file
@@ -0,0 +1,407 @@
|
||||
{
|
||||
"metadata": {
|
||||
"total_parameters": 308224,
|
||||
"total_size": 16381470720
|
||||
},
|
||||
"weight_map": {
|
||||
"lm_head.weight": "model-00004-of-00004.safetensors",
|
||||
"model.embed_tokens.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.0.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.0.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.0.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.1.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.1.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.1.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.10.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.10.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.11.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.12.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.13.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.14.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.15.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.16.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.17.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.18.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.19.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.2.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.2.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.2.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.20.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.20.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.20.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.20.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.20.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.20.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.20.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.20.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.20.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.20.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.20.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.21.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.21.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.21.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.21.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.21.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.21.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.21.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.21.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.21.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.21.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.21.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.22.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.22.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.22.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.22.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.22.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.22.self_attn.k_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.22.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.22.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.22.self_attn.q_norm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.22.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.22.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.23.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.23.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.23.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.23.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.23.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.23.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.23.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.23.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.23.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.23.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.23.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.24.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.24.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.24.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.24.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.24.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.24.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.24.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.24.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.24.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.24.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.24.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.25.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.25.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.25.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.25.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.25.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.25.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.25.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.25.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.25.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.25.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.25.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.26.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.26.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.26.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.26.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.26.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.26.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.26.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.26.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.26.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.26.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.26.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.27.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.28.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.29.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.3.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.30.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.30.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.31.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.32.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.33.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.input_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.k_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.q_norm.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.34.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.35.input_layernorm.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.mlp.gate_proj.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.mlp.up_proj.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.post_attention_layernorm.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.k_norm.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.q_norm.weight": "model-00004-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.35.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
|
||||
"model.layers.4.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.4.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.5.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.6.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.7.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.input_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.8.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.input_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.9.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.9.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.9.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.k_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.q_norm.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.layers.9.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
|
||||
"model.norm.weight": "model-00004-of-00004.safetensors"
|
||||
}
|
||||
}
|
||||
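The index that ends above maps each tensor name to the shard file that stores it. A minimal sketch of grouping those entries by shard follows. It assumes the mapping sits under a top-level `weight_map` key, which is the usual Hugging Face layout but is not visible in this part of the diff. The helper name `tensors_by_shard` and the three-entry excerpt are illustrative only.

```python
from collections import defaultdict


def tensors_by_shard(index: dict) -> dict:
    """Group tensor names by the shard file that stores them."""
    shards = defaultdict(list)
    # Assumption: the tensor-to-shard mapping lives under "weight_map".
    for tensor_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(tensor_name)
    return dict(shards)


# Small excerpt copied from the index above; a real run would json.load()
# the complete index file instead.
sample_index = {
    "weight_map": {
        "model.layers.35.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
        "model.layers.35.self_attn.o_proj.weight": "model-00004-of-00004.safetensors",
        "model.norm.weight": "model-00004-of-00004.safetensors",
    }
}

grouped = tensors_by_shard(sample_index)
for shard_file in sorted(grouped):
    print(shard_file, len(grouped[shard_file]))
```

A grouping like this makes it easy to see that a single layer can straddle two shards, as layer 35 does in the index above.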
12
run_summary.json
Normal file
@@ -0,0 +1,12 @@
|
||||
{
|
||||
"agent_name": null,
|
||||
"training_start": null,
|
||||
"training_end": null,
|
||||
"created_by": "DCAgent",
|
||||
"base_model_name": "Qwen/Qwen3-8B",
|
||||
"dataset_name": "DCAgent/exp_tas_tmux_large_traces",
|
||||
"training_type": "SFT",
|
||||
"training_parameters": "https://huggingface.co/laion/Qwen3-8B_exp_tas_tmux_large_traces_save-strategy_steps/blob/main/config.json",
|
||||
"wandb_link": "https://wandb.ai/dogml/dc-agent/runs/Qwen3-8B_exp_tas_tmux_large_traces_save-strategy_steps",
|
||||
"traces_location_s3": null
|
||||
}
|
||||
31
special_tokens_map.json
Normal file
@@ -0,0 +1,31 @@
|
||||
{
|
||||
"additional_special_tokens": [
|
||||
"<|im_start|>",
|
||||
"<|im_end|>",
|
||||
"<|object_ref_start|>",
|
||||
"<|object_ref_end|>",
|
||||
"<|box_start|>",
|
||||
"<|box_end|>",
|
||||
"<|quad_start|>",
|
||||
"<|quad_end|>",
|
||||
"<|vision_start|>",
|
||||
"<|vision_end|>",
|
||||
"<|vision_pad|>",
|
||||
"<|image_pad|>",
|
||||
"<|video_pad|>"
|
||||
],
|
||||
"eos_token": {
|
||||
"content": "<|im_end|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false
|
||||
},
|
||||
"pad_token": {
|
||||
"content": "<|endoftext|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false
|
||||
}
|
||||
}
|
||||
BIN
tokenizer.json
(Stored with Git LFS)
Normal file
Binary file not shown.
240
tokenizer_config.json
Normal file
@@ -0,0 +1,240 @@
|
||||
{
|
||||
"add_bos_token": false,
|
||||
"add_prefix_space": false,
|
||||
"added_tokens_decoder": {
|
||||
"151643": {
|
||||
"content": "<|endoftext|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151644": {
|
||||
"content": "<|im_start|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151645": {
|
||||
"content": "<|im_end|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151646": {
|
||||
"content": "<|object_ref_start|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151647": {
|
||||
"content": "<|object_ref_end|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151648": {
|
||||
"content": "<|box_start|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151649": {
|
||||
"content": "<|box_end|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151650": {
|
||||
"content": "<|quad_start|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151651": {
|
||||
"content": "<|quad_end|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151652": {
|
||||
"content": "<|vision_start|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151653": {
|
||||
"content": "<|vision_end|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151654": {
|
||||
"content": "<|vision_pad|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151655": {
|
||||
"content": "<|image_pad|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151656": {
|
||||
"content": "<|video_pad|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": true
|
||||
},
|
||||
"151657": {
|
||||
"content": "<tool_call>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": false
|
||||
},
|
||||
"151658": {
|
||||
"content": "</tool_call>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": false
|
||||
},
|
||||
"151659": {
|
||||
"content": "<|fim_prefix|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": false
|
||||
},
|
||||
"151660": {
|
||||
"content": "<|fim_middle|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": false
|
||||
},
|
||||
"151661": {
|
||||
"content": "<|fim_suffix|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": false
|
||||
},
|
||||
"151662": {
|
||||
"content": "<|fim_pad|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": false
|
||||
},
|
||||
"151663": {
|
||||
"content": "<|repo_name|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": false
|
||||
},
|
||||
"151664": {
|
||||
"content": "<|file_sep|>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": false
|
||||
},
|
||||
"151665": {
|
||||
"content": "<tool_response>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": false
|
||||
},
|
||||
"151666": {
|
||||
"content": "</tool_response>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": false
|
||||
},
|
||||
"151667": {
|
||||
"content": "<think>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": false
|
||||
},
|
||||
"151668": {
|
||||
"content": "</think>",
|
||||
"lstrip": false,
|
||||
"normalized": false,
|
||||
"rstrip": false,
|
||||
"single_word": false,
|
||||
"special": false
|
||||
}
|
||||
},
|
||||
"additional_special_tokens": [
|
||||
"<|im_start|>",
|
||||
"<|im_end|>",
|
||||
"<|object_ref_start|>",
|
||||
"<|object_ref_end|>",
|
||||
"<|box_start|>",
|
||||
"<|box_end|>",
|
||||
"<|quad_start|>",
|
||||
"<|quad_end|>",
|
||||
"<|vision_start|>",
|
||||
"<|vision_end|>",
|
||||
"<|vision_pad|>",
|
||||
"<|image_pad|>",
|
||||
"<|video_pad|>"
|
||||
],
|
||||
"bos_token": null,
|
||||
"clean_up_tokenization_spaces": false,
|
||||
"eos_token": "<|im_end|>",
|
||||
"errors": "replace",
|
||||
"extra_special_tokens": {},
|
||||
"model_max_length": 32768,
|
||||
"pad_token": "<|endoftext|>",
|
||||
"padding_side": "right",
|
||||
"split_special_tokens": false,
|
||||
"tokenizer_class": "Qwen2Tokenizer",
|
||||
"unk_token": null
|
||||
}
|
||||
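The tokenizer configuration above names its `eos_token` and `pad_token` by content, while `added_tokens_decoder` is keyed by token id. A minimal sketch of resolving one to the other with plain dictionaries is shown here. The helper `token_id_for` is hypothetical, and the dictionary is a three-entry excerpt of the file, trimmed to the fields the lookup needs.

```python
from typing import Optional

# Excerpt of tokenizer_config.json as shown above (fields trimmed).
tokenizer_config = {
    "added_tokens_decoder": {
        "151643": {"content": "<|endoftext|>", "special": True},
        "151645": {"content": "<|im_end|>", "special": True},
        "151667": {"content": "<think>", "special": False},
    },
    "eos_token": "<|im_end|>",
    "pad_token": "<|endoftext|>",
}


def token_id_for(config: dict, content: str) -> Optional[int]:
    """Return the id whose added-token entry has the given content, if any."""
    for token_id_str, entry in config["added_tokens_decoder"].items():
        if entry["content"] == content:
            return int(token_id_str)  # JSON object keys are strings
    return None


eos_id = token_id_for(tokenizer_config, tokenizer_config["eos_token"])
pad_id = token_id_for(tokenizer_config, tokenizer_config["pad_token"])
print("eos id:", eos_id, "pad id:", pad_id)
```

Note that the pad and end-of-sequence tokens are distinct entries here, so padding positions can be masked without also hiding `<|im_end|>`.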
16
train_results.json
Normal file
@@ -0,0 +1,16 @@
|
||||
{
|
||||
"achieved_tflops_per_gpu": 0.0011402087434851426,
|
||||
"achieved_tflops_per_gpu_theoretical": 226.98960570915227,
|
||||
"epoch": 8.0,
|
||||
"loss_nan_ranks": 0,
|
||||
"loss_rank_avg": 0.016571061685681343,
|
||||
"mfu_percent": 0.0003654515203478021,
|
||||
"mfu_percent_theoretical": 72.75307875293342,
|
||||
"total_flos": 1378268975529984.0,
|
||||
"train_loss": 0.16813087239582386,
|
||||
"train_runtime": 37774.5792,
|
||||
"train_samples_per_second": 2.146,
|
||||
"train_steps_per_second": 0.067,
|
||||
"valid_targets_mean": 4493.3,
|
||||
"valid_targets_min": 905
|
||||
}
|
||||
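The throughput fields in the results above can be cross-checked against one another with simple arithmetic. The sketch below derives an approximate number of samples per optimizer step, total step count, and samples per epoch. Reading the samples-per-step ratio as an effective batch size is an assumption, and because the two rate fields are rounded to three decimals, every derived value is approximate.

```python
# Values copied from train_results.json above.
train_results = {
    "epoch": 8.0,
    "train_runtime": 37774.5792,        # seconds
    "train_samples_per_second": 2.146,  # rounded in the source
    "train_steps_per_second": 0.067,    # rounded in the source
}

runtime = train_results["train_runtime"]
samples_per_second = train_results["train_samples_per_second"]
steps_per_second = train_results["train_steps_per_second"]

# Samples consumed per optimizer step (assumed to be the effective batch size).
samples_per_step = samples_per_second / steps_per_second

# Steps and samples implied by the rates and the wall-clock runtime.
approx_total_steps = steps_per_second * runtime
approx_samples_per_epoch = samples_per_second * runtime / train_results["epoch"]

print(f"samples per step:  {samples_per_step:.1f}")
print(f"total steps:       {approx_total_steps:.0f}")
print(f"samples per epoch: {approx_samples_per_epoch:.0f}")
```

The implied step count lands within a few steps of the `total_steps` value of 2536 reported in `trainer_log.jsonl`, which is about what the rounding of `train_steps_per_second` allows.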
508
trainer_log.jsonl
Normal file
@@ -0,0 +1,508 @@
|
||||
{"current_steps": 5, "total_steps": 2536, "loss": 0.9249, "lr": 3.0769230769230774e-05, "epoch": 0.015772870662460567, "percentage": 0.2, "elapsed_time": "0:07:11", "remaining_time": "2 days, 12:39:28"}
|
||||
{"current_steps": 10, "total_steps": 2536, "loss": 0.7835, "lr": 6.923076923076924e-05, "epoch": 0.031545741324921134, "percentage": 0.39, "elapsed_time": "0:09:29", "remaining_time": "1 day, 15:58:05"}
|
||||
{"current_steps": 15, "total_steps": 2536, "loss": 0.6389, "lr": 9.99999612380875e-05, "epoch": 0.0473186119873817, "percentage": 0.59, "elapsed_time": "0:11:10", "remaining_time": "1 day, 7:19:14"}
|
||||
{"current_steps": 20, "total_steps": 2536, "loss": 0.6338, "lr": 9.999860457746025e-05, "epoch": 0.06309148264984227, "percentage": 0.79, "elapsed_time": "0:12:55", "remaining_time": "1 day, 3:05:53"}
|
||||
{"current_steps": 25, "total_steps": 2536, "loss": 0.6127, "lr": 9.999530988130677e-05, "epoch": 0.07886435331230283, "percentage": 0.99, "elapsed_time": "0:14:40", "remaining_time": "1 day, 0:33:32"}
|
||||
{"current_steps": 30, "total_steps": 2536, "loss": 0.6193, "lr": 9.999007727733537e-05, "epoch": 0.0946372239747634, "percentage": 1.18, "elapsed_time": "0:16:19", "remaining_time": "22:43:33"}
|
||||
{"current_steps": 35, "total_steps": 2536, "loss": 0.5795, "lr": 9.998290696837115e-05, "epoch": 0.11041009463722397, "percentage": 1.38, "elapsed_time": "0:17:41", "remaining_time": "21:04:28"}
|
||||
{"current_steps": 40, "total_steps": 2536, "loss": 0.5804, "lr": 9.997379923234816e-05, "epoch": 0.12618296529968454, "percentage": 1.58, "elapsed_time": "0:18:55", "remaining_time": "19:40:54"}
|
||||
{"current_steps": 45, "total_steps": 2536, "loss": 0.5701, "lr": 9.996275442229857e-05, "epoch": 0.14195583596214512, "percentage": 1.77, "elapsed_time": "0:20:23", "remaining_time": "18:48:26"}
|
||||
{"current_steps": 50, "total_steps": 2536, "loss": 0.5859, "lr": 9.994977296633902e-05, "epoch": 0.15772870662460567, "percentage": 1.97, "elapsed_time": "0:21:45", "remaining_time": "18:01:51"}
|
||||
{"current_steps": 55, "total_steps": 2536, "loss": 0.5602, "lr": 9.993485536765398e-05, "epoch": 0.17350157728706625, "percentage": 2.17, "elapsed_time": "0:23:11", "remaining_time": "17:26:11"}
|
||||
{"current_steps": 60, "total_steps": 2536, "loss": 0.5922, "lr": 9.991800220447634e-05, "epoch": 0.1892744479495268, "percentage": 2.37, "elapsed_time": "0:24:27", "remaining_time": "16:49:34"}
|
||||
{"current_steps": 65, "total_steps": 2536, "loss": 0.5805, "lr": 9.989921413006489e-05, "epoch": 0.20504731861198738, "percentage": 2.56, "elapsed_time": "0:25:39", "remaining_time": "16:15:11"}
|
||||
{"current_steps": 70, "total_steps": 2536, "loss": 0.5505, "lr": 9.987849187267908e-05, "epoch": 0.22082018927444794, "percentage": 2.76, "elapsed_time": "0:27:03", "remaining_time": "15:53:24"}
|
||||
{"current_steps": 75, "total_steps": 2536, "loss": 0.5571, "lr": 9.985583623555076e-05, "epoch": 0.23659305993690852, "percentage": 2.96, "elapsed_time": "0:28:30", "remaining_time": "15:35:37"}
|
||||
{"current_steps": 80, "total_steps": 2536, "loss": 0.5465, "lr": 9.9831248096853e-05, "epoch": 0.25236593059936907, "percentage": 3.15, "elapsed_time": "0:29:44", "remaining_time": "15:12:50"}
|
||||
{"current_steps": 85, "total_steps": 2536, "loss": 0.558, "lr": 9.980472840966614e-05, "epoch": 0.26813880126182965, "percentage": 3.35, "elapsed_time": "0:31:08", "remaining_time": "14:57:57"}
|
||||
{"current_steps": 90, "total_steps": 2536, "loss": 0.5496, "lr": 9.977627820194082e-05, "epoch": 0.28391167192429023, "percentage": 3.55, "elapsed_time": "0:32:25", "remaining_time": "14:41:18"}
|
||||
{"current_steps": 95, "total_steps": 2536, "loss": 0.5503, "lr": 9.974589857645802e-05, "epoch": 0.2996845425867508, "percentage": 3.75, "elapsed_time": "0:33:40", "remaining_time": "14:25:22"}
|
||||
{"current_steps": 100, "total_steps": 2536, "loss": 0.5593, "lr": 9.97135907107865e-05, "epoch": 0.31545741324921134, "percentage": 3.94, "elapsed_time": "0:34:59", "remaining_time": "14:12:16"}
|
||||
{"current_steps": 105, "total_steps": 2536, "loss": 0.5566, "lr": 9.967935585723706e-05, "epoch": 0.3312302839116719, "percentage": 4.14, "elapsed_time": "1:00:56", "remaining_time": "23:31:03"}
|
||||
{"current_steps": 110, "total_steps": 2536, "loss": 0.562, "lr": 9.964319534281397e-05, "epoch": 0.3470031545741325, "percentage": 4.34, "elapsed_time": "1:02:11", "remaining_time": "22:51:32"}
|
||||
{"current_steps": 115, "total_steps": 2536, "loss": 0.5599, "lr": 9.960511056916357e-05, "epoch": 0.3627760252365931, "percentage": 4.53, "elapsed_time": "1:03:26", "remaining_time": "22:15:25"}
|
||||
{"current_steps": 120, "total_steps": 2536, "loss": 0.5354, "lr": 9.956510301251995e-05, "epoch": 0.3785488958990536, "percentage": 4.73, "elapsed_time": "1:04:42", "remaining_time": "21:42:57"}
|
||||
{"current_steps": 125, "total_steps": 2536, "loss": 0.5783, "lr": 9.952317422364772e-05, "epoch": 0.3943217665615142, "percentage": 4.93, "elapsed_time": "1:05:58", "remaining_time": "21:12:41"}
|
||||
{"current_steps": 130, "total_steps": 2536, "loss": 0.5513, "lr": 9.947932582778188e-05, "epoch": 0.41009463722397477, "percentage": 5.13, "elapsed_time": "1:07:17", "remaining_time": "20:45:19"}
|
||||
{"current_steps": 135, "total_steps": 2536, "loss": 0.5508, "lr": 9.943355952456483e-05, "epoch": 0.42586750788643535, "percentage": 5.32, "elapsed_time": "1:08:37", "remaining_time": "20:20:29"}
|
||||
{"current_steps": 140, "total_steps": 2536, "loss": 0.5545, "lr": 9.938587708798053e-05, "epoch": 0.4416403785488959, "percentage": 5.52, "elapsed_time": "1:09:47", "remaining_time": "19:54:33"}
|
||||
{"current_steps": 145, "total_steps": 2536, "loss": 0.5707, "lr": 9.933628036628569e-05, "epoch": 0.45741324921135645, "percentage": 5.72, "elapsed_time": "1:11:04", "remaining_time": "19:31:53"}
|
||||
{"current_steps": 150, "total_steps": 2536, "loss": 0.5351, "lr": 9.92847712819381e-05, "epoch": 0.47318611987381703, "percentage": 5.91, "elapsed_time": "1:12:19", "remaining_time": "19:10:30"}
|
||||
{"current_steps": 155, "total_steps": 2536, "loss": 0.5435, "lr": 9.923135183152224e-05, "epoch": 0.4889589905362776, "percentage": 6.11, "elapsed_time": "1:13:26", "remaining_time": "18:48:13"}
|
||||
{"current_steps": 160, "total_steps": 2536, "loss": 0.5583, "lr": 9.91760240856717e-05, "epoch": 0.5047318611987381, "percentage": 6.31, "elapsed_time": "1:14:36", "remaining_time": "18:27:54"}
|
||||
{"current_steps": 165, "total_steps": 2536, "loss": 0.5518, "lr": 9.91187901889891e-05, "epoch": 0.5205047318611987, "percentage": 6.51, "elapsed_time": "1:15:49", "remaining_time": "18:09:33"}
|
||||
{"current_steps": 170, "total_steps": 2536, "loss": 0.5438, "lr": 9.905965235996286e-05, "epoch": 0.5362776025236593, "percentage": 6.7, "elapsed_time": "1:16:57", "remaining_time": "17:51:09"}
|
||||
{"current_steps": 175, "total_steps": 2536, "loss": 0.5728, "lr": 9.899861289088121e-05, "epoch": 0.5520504731861199, "percentage": 6.9, "elapsed_time": "1:18:07", "remaining_time": "17:33:59"}
|
||||
{"current_steps": 180, "total_steps": 2536, "loss": 0.5182, "lr": 9.893567414774341e-05, "epoch": 0.5678233438485805, "percentage": 7.1, "elapsed_time": "1:19:16", "remaining_time": "17:17:35"}
|
||||
{"current_steps": 185, "total_steps": 2536, "loss": 0.5362, "lr": 9.88708385701679e-05, "epoch": 0.583596214511041, "percentage": 7.29, "elapsed_time": "1:20:32", "remaining_time": "17:03:31"}
|
||||
{"current_steps": 190, "total_steps": 2536, "loss": 0.5185, "lr": 9.88041086712979e-05, "epoch": 0.5993690851735016, "percentage": 7.49, "elapsed_time": "1:21:44", "remaining_time": "16:49:13"}
|
||||
{"current_steps": 195, "total_steps": 2536, "loss": 0.5143, "lr": 9.873548703770388e-05, "epoch": 0.6151419558359621, "percentage": 7.69, "elapsed_time": "1:22:57", "remaining_time": "16:35:58"}
|
||||
{"current_steps": 200, "total_steps": 2536, "loss": 0.5225, "lr": 9.866497632928336e-05, "epoch": 0.6309148264984227, "percentage": 7.89, "elapsed_time": "1:24:09", "remaining_time": "16:22:56"}
|
||||
{"current_steps": 205, "total_steps": 2536, "loss": 0.5376, "lr": 9.859257927915774e-05, "epoch": 0.6466876971608833, "percentage": 8.08, "elapsed_time": "2:04:18", "remaining_time": "23:33:26"}
|
||||
{"current_steps": 210, "total_steps": 2536, "loss": 0.522, "lr": 9.851829869356651e-05, "epoch": 0.6624605678233438, "percentage": 8.28, "elapsed_time": "2:05:28", "remaining_time": "23:09:52"}
|
||||
{"current_steps": 215, "total_steps": 2536, "loss": 0.5332, "lr": 9.844213745175826e-05, "epoch": 0.6782334384858044, "percentage": 8.48, "elapsed_time": "2:06:43", "remaining_time": "22:48:01"}
|
||||
{"current_steps": 220, "total_steps": 2536, "loss": 0.5092, "lr": 9.83640985058792e-05, "epoch": 0.694006309148265, "percentage": 8.68, "elapsed_time": "2:07:54", "remaining_time": "22:26:32"}
|
||||
{"current_steps": 225, "total_steps": 2536, "loss": 0.5302, "lr": 9.828418488085877e-05, "epoch": 0.7097791798107256, "percentage": 8.87, "elapsed_time": "2:09:00", "remaining_time": "22:04:59"}
|
||||
{"current_steps": 230, "total_steps": 2536, "loss": 0.5371, "lr": 9.820239967429233e-05, "epoch": 0.7255520504731862, "percentage": 9.07, "elapsed_time": "2:10:07", "remaining_time": "21:44:40"}
|
||||
{"current_steps": 235, "total_steps": 2536, "loss": 0.5283, "lr": 9.811874605632104e-05, "epoch": 0.7413249211356467, "percentage": 9.27, "elapsed_time": "2:11:23", "remaining_time": "21:26:27"}
|
||||
{"current_steps": 240, "total_steps": 2536, "loss": 0.5378, "lr": 9.803322726950905e-05, "epoch": 0.7570977917981072, "percentage": 9.46, "elapsed_time": "2:12:37", "remaining_time": "21:08:42"}
|
||||
{"current_steps": 245, "total_steps": 2536, "loss": 0.5135, "lr": 9.794584662871787e-05, "epoch": 0.7728706624605678, "percentage": 9.66, "elapsed_time": "2:13:52", "remaining_time": "20:51:56"}
|
||||
{"current_steps": 250, "total_steps": 2536, "loss": 0.511, "lr": 9.785660752097768e-05, "epoch": 0.7886435331230284, "percentage": 9.86, "elapsed_time": "2:15:08", "remaining_time": "20:35:41"}
|
||||
{"current_steps": 255, "total_steps": 2536, "loss": 0.5271, "lr": 9.77655134053563e-05, "epoch": 0.804416403785489, "percentage": 10.06, "elapsed_time": "2:16:23", "remaining_time": "20:20:01"}
|
||||
{"current_steps": 260, "total_steps": 2536, "loss": 0.5459, "lr": 9.767256781282486e-05, "epoch": 0.8201892744479495, "percentage": 10.25, "elapsed_time": "2:17:33", "remaining_time": "20:04:11"}
|
||||
{"current_steps": 265, "total_steps": 2536, "loss": 0.5321, "lr": 9.757777434612116e-05, "epoch": 0.8359621451104101, "percentage": 10.45, "elapsed_time": "2:18:33", "remaining_time": "19:47:22"}
|
||||
{"current_steps": 270, "total_steps": 2536, "loss": 0.5416, "lr": 9.748113667960987e-05, "epoch": 0.8517350157728707, "percentage": 10.65, "elapsed_time": "2:19:43", "remaining_time": "19:32:37"}
|
||||
{"current_steps": 275, "total_steps": 2536, "loss": 0.4984, "lr": 9.738265855914013e-05, "epoch": 0.8675078864353313, "percentage": 10.84, "elapsed_time": "2:20:51", "remaining_time": "19:18:10"}
|
||||
{"current_steps": 280, "total_steps": 2536, "loss": 0.5217, "lr": 9.728234380190038e-05, "epoch": 0.8832807570977917, "percentage": 11.04, "elapsed_time": "2:22:00", "remaining_time": "19:04:11"}
|
||||
{"current_steps": 285, "total_steps": 2536, "loss": 0.5383, "lr": 9.718019629627045e-05, "epoch": 0.8990536277602523, "percentage": 11.24, "elapsed_time": "2:23:09", "remaining_time": "18:50:41"}
|
||||
{"current_steps": 290, "total_steps": 2536, "loss": 0.5156, "lr": 9.70762200016707e-05, "epoch": 0.9148264984227129, "percentage": 11.44, "elapsed_time": "2:24:24", "remaining_time": "18:38:23"}
|
||||
{"current_steps": 295, "total_steps": 2536, "loss": 0.5172, "lr": 9.697041894840865e-05, "epoch": 0.9305993690851735, "percentage": 11.63, "elapsed_time": "2:25:37", "remaining_time": "18:26:17"}
|
||||
{"current_steps": 300, "total_steps": 2536, "loss": 0.5197, "lr": 9.68627972375228e-05, "epoch": 0.9463722397476341, "percentage": 11.83, "elapsed_time": "2:26:53", "remaining_time": "18:14:48"}
|
||||
{"current_steps": 305, "total_steps": 2536, "loss": 0.509, "lr": 9.675335904062353e-05, "epoch": 0.9621451104100947, "percentage": 12.03, "elapsed_time": "0:01:55", "remaining_time": "0:14:07"}
|
||||
{"current_steps": 310, "total_steps": 2536, "loss": 0.515, "lr": 9.66421085997315e-05, "epoch": 0.9779179810725552, "percentage": 12.22, "elapsed_time": "0:03:14", "remaining_time": "0:23:15"}
|
||||
{"current_steps": 315, "total_steps": 2536, "loss": 0.5298, "lr": 9.65290502271132e-05, "epoch": 0.9936908517350158, "percentage": 12.42, "elapsed_time": "0:04:26", "remaining_time": "0:31:17"}
|
||||
{"current_steps": 320, "total_steps": 2536, "loss": 0.469, "lr": 9.641418830511377e-05, "epoch": 1.0094637223974763, "percentage": 12.62, "elapsed_time": "0:05:38", "remaining_time": "0:39:02"}
|
||||
{"current_steps": 325, "total_steps": 2536, "loss": 0.4145, "lr": 9.62975272859872e-05, "epoch": 1.025236593059937, "percentage": 12.82, "elapsed_time": "0:06:49", "remaining_time": "0:46:27"}
|
||||
{"current_steps": 330, "total_steps": 2536, "loss": 0.4115, "lr": 9.617907169172367e-05, "epoch": 1.0410094637223974, "percentage": 13.01, "elapsed_time": "0:07:59", "remaining_time": "0:53:22"}
|
||||
{"current_steps": 335, "total_steps": 2536, "loss": 0.4526, "lr": 9.605882611387432e-05, "epoch": 1.0567823343848581, "percentage": 13.21, "elapsed_time": "0:09:10", "remaining_time": "1:00:15"}
|
||||
{"current_steps": 340, "total_steps": 2536, "loss": 0.4484, "lr": 9.593679521337327e-05, "epoch": 1.0725552050473186, "percentage": 13.41, "elapsed_time": "0:10:19", "remaining_time": "1:06:40"}
|
||||
{"current_steps": 345, "total_steps": 2536, "loss": 0.4302, "lr": 9.581298372035695e-05, "epoch": 1.088328075709779, "percentage": 13.6, "elapsed_time": "0:11:32", "remaining_time": "1:13:18"}
|
||||
{"current_steps": 350, "total_steps": 2536, "loss": 0.4462, "lr": 9.56873964339807e-05, "epoch": 1.1041009463722398, "percentage": 13.8, "elapsed_time": "0:12:42", "remaining_time": "1:19:21"}
|
||||
{"current_steps": 355, "total_steps": 2536, "loss": 0.4148, "lr": 9.556003822223287e-05, "epoch": 1.1198738170347002, "percentage": 14.0, "elapsed_time": "0:13:54", "remaining_time": "1:25:27"}
|
||||
{"current_steps": 360, "total_steps": 2536, "loss": 0.4371, "lr": 9.5430914021746e-05, "epoch": 1.135646687697161, "percentage": 14.2, "elapsed_time": "0:15:04", "remaining_time": "1:31:05"}
|
||||
{"current_steps": 365, "total_steps": 2536, "loss": 0.4218, "lr": 9.530002883760552e-05, "epoch": 1.1514195583596214, "percentage": 14.39, "elapsed_time": "0:16:14", "remaining_time": "1:36:36"}
|
||||
{"current_steps": 370, "total_steps": 2536, "loss": 0.4395, "lr": 9.516738774315577e-05, "epoch": 1.167192429022082, "percentage": 14.59, "elapsed_time": "0:17:16", "remaining_time": "1:41:09"}
|
||||
{"current_steps": 375, "total_steps": 2536, "loss": 0.4434, "lr": 9.503299587980331e-05, "epoch": 1.1829652996845426, "percentage": 14.79, "elapsed_time": "0:18:32", "remaining_time": "1:46:53"}
|
||||
{"current_steps": 380, "total_steps": 2536, "loss": 0.4428, "lr": 9.489685845681762e-05, "epoch": 1.1987381703470033, "percentage": 14.98, "elapsed_time": "0:19:50", "remaining_time": "1:52:35"}
|
||||
{"current_steps": 385, "total_steps": 2536, "loss": 0.4292, "lr": 9.47589807511292e-05, "epoch": 1.2145110410094637, "percentage": 15.18, "elapsed_time": "0:20:57", "remaining_time": "1:57:08"}
|
||||
{"current_steps": 390, "total_steps": 2536, "loss": 0.4611, "lr": 9.461936810712507e-05, "epoch": 1.2302839116719242, "percentage": 15.38, "elapsed_time": "0:22:04", "remaining_time": "2:01:29"}
|
||||
{"current_steps": 395, "total_steps": 2536, "loss": 0.4318, "lr": 9.447802593644152e-05, "epoch": 1.2460567823343849, "percentage": 15.58, "elapsed_time": "0:23:12", "remaining_time": "2:05:45"}
|
||||
{"current_steps": 400, "total_steps": 2536, "loss": 0.4169, "lr": 9.433495971775444e-05, "epoch": 1.2618296529968454, "percentage": 15.77, "elapsed_time": "0:24:23", "remaining_time": "2:10:13"}
|
||||
{"current_steps": 405, "total_steps": 2536, "loss": 0.4393, "lr": 9.419017499656686e-05, "epoch": 1.277602523659306, "percentage": 15.97, "elapsed_time": "0:26:59", "remaining_time": "2:22:03"}
|
||||
{"current_steps": 410, "total_steps": 2536, "loss": 0.4616, "lr": 9.404367738499409e-05, "epoch": 1.2933753943217665, "percentage": 16.17, "elapsed_time": "0:28:10", "remaining_time": "2:26:05"}
|
||||
{"current_steps": 415, "total_steps": 2536, "loss": 0.4729, "lr": 9.38954725615461e-05, "epoch": 1.3091482649842272, "percentage": 16.36, "elapsed_time": "0:29:15", "remaining_time": "2:29:29"}
|
||||
{"current_steps": 420, "total_steps": 2536, "loss": 0.4627, "lr": 9.374556627090749e-05, "epoch": 1.3249211356466877, "percentage": 16.56, "elapsed_time": "0:30:27", "remaining_time": "2:33:28"}
|
||||
{"current_steps": 425, "total_steps": 2536, "loss": 0.4677, "lr": 9.359396432371476e-05, "epoch": 1.3406940063091484, "percentage": 16.76, "elapsed_time": "0:31:40", "remaining_time": "2:37:19"}
|
||||
{"current_steps": 430, "total_steps": 2536, "loss": 0.4323, "lr": 9.344067259633112e-05, "epoch": 1.3564668769716088, "percentage": 16.96, "elapsed_time": "0:32:51", "remaining_time": "2:40:56"}
|
||||
{"current_steps": 435, "total_steps": 2536, "loss": 0.4336, "lr": 9.328569703061862e-05, "epoch": 1.3722397476340693, "percentage": 17.15, "elapsed_time": "0:34:01", "remaining_time": "2:44:21"}
|
||||
{"current_steps": 440, "total_steps": 2536, "loss": 0.4421, "lr": 9.3129043633708e-05, "epoch": 1.38801261829653, "percentage": 17.35, "elapsed_time": "0:35:15", "remaining_time": "2:47:58"}
|
||||
{"current_steps": 445, "total_steps": 2536, "loss": 0.4423, "lr": 9.297071847776568e-05, "epoch": 1.4037854889589905, "percentage": 17.55, "elapsed_time": "0:36:23", "remaining_time": "2:51:00"}
|
||||
{"current_steps": 450, "total_steps": 2536, "loss": 0.4443, "lr": 9.281072769975847e-05, "epoch": 1.4195583596214512, "percentage": 17.74, "elapsed_time": "0:37:26", "remaining_time": "2:53:31"}
|
||||
{"current_steps": 455, "total_steps": 2536, "loss": 0.4334, "lr": 9.264907750121568e-05, "epoch": 1.4353312302839116, "percentage": 17.94, "elapsed_time": "0:38:30", "remaining_time": "2:56:05"}
|
||||
{"current_steps": 460, "total_steps": 2536, "loss": 0.4416, "lr": 9.248577414798871e-05, "epoch": 1.4511041009463723, "percentage": 18.14, "elapsed_time": "0:39:42", "remaining_time": "2:59:10"}
|
||||
{"current_steps": 465, "total_steps": 2536, "loss": 0.4275, "lr": 9.232082397000826e-05, "epoch": 1.4668769716088328, "percentage": 18.34, "elapsed_time": "0:40:54", "remaining_time": "3:02:12"}
|
||||
{"current_steps": 470, "total_steps": 2536, "loss": 0.4431, "lr": 9.215423336103884e-05, "epoch": 1.4826498422712935, "percentage": 18.53, "elapsed_time": "0:42:04", "remaining_time": "3:04:57"}
|
||||
{"current_steps": 475, "total_steps": 2536, "loss": 0.4412, "lr": 9.198600877843105e-05, "epoch": 1.498422712933754, "percentage": 18.73, "elapsed_time": "0:43:17", "remaining_time": "3:07:48"}
|
||||
{"current_steps": 480, "total_steps": 2536, "loss": 0.4461, "lr": 9.181615674287121e-05, "epoch": 1.5141955835962144, "percentage": 18.93, "elapsed_time": "0:44:24", "remaining_time": "3:10:11"}
|
||||
{"current_steps": 485, "total_steps": 2536, "loss": 0.4403, "lr": 9.164468383812864e-05, "epoch": 1.5299684542586751, "percentage": 19.12, "elapsed_time": "0:45:33", "remaining_time": "3:12:40"}
|
||||
{"current_steps": 490, "total_steps": 2536, "loss": 0.4559, "lr": 9.147159671080049e-05, "epoch": 1.5457413249211358, "percentage": 19.32, "elapsed_time": "0:46:42", "remaining_time": "3:15:00"}
|
||||
{"current_steps": 495, "total_steps": 2536, "loss": 0.4551, "lr": 9.129690207005402e-05, "epoch": 1.5615141955835963, "percentage": 19.52, "elapsed_time": "0:47:49", "remaining_time": "3:17:12"}
|
||||
{"current_steps": 500, "total_steps": 2536, "loss": 0.439, "lr": 9.11206066873666e-05, "epoch": 1.5772870662460567, "percentage": 19.72, "elapsed_time": "0:49:03", "remaining_time": "3:19:46"}
|
||||
{"current_steps": 505, "total_steps": 2536, "loss": 0.4559, "lr": 9.094271739626326e-05, "epoch": 1.5930599369085172, "percentage": 19.91, "elapsed_time": "0:57:37", "remaining_time": "3:51:45"}
|
||||
{"current_steps": 510, "total_steps": 2536, "loss": 0.4507, "lr": 9.076324109205174e-05, "epoch": 1.608832807570978, "percentage": 20.11, "elapsed_time": "0:58:54", "remaining_time": "3:54:00"}
|
||||
{"current_steps": 515, "total_steps": 2536, "loss": 0.4333, "lr": 9.058218473155528e-05, "epoch": 1.6246056782334386, "percentage": 20.31, "elapsed_time": "1:00:02", "remaining_time": "3:55:38"}
|
||||
{"current_steps": 520, "total_steps": 2536, "loss": 0.4527, "lr": 9.039955533284292e-05, "epoch": 1.640378548895899, "percentage": 20.5, "elapsed_time": "1:01:12", "remaining_time": "3:57:17"}
|
||||
{"current_steps": 525, "total_steps": 2536, "loss": 0.471, "lr": 9.021535997495749e-05, "epoch": 1.6561514195583595, "percentage": 20.7, "elapsed_time": "1:02:19", "remaining_time": "3:58:45"}
|
||||
{"current_steps": 530, "total_steps": 2536, "loss": 0.4503, "lr": 9.002960579764116e-05, "epoch": 1.6719242902208202, "percentage": 20.9, "elapsed_time": "1:03:28", "remaining_time": "4:00:12"}
|
||||
{"current_steps": 535, "total_steps": 2536, "loss": 0.4455, "lr": 8.984230000105882e-05, "epoch": 1.687697160883281, "percentage": 21.1, "elapsed_time": "1:04:39", "remaining_time": "4:01:48"}
|
||||
{"current_steps": 540, "total_steps": 2536, "loss": 0.4431, "lr": 8.965344984551882e-05, "epoch": 1.7034700315457414, "percentage": 21.29, "elapsed_time": "1:05:50", "remaining_time": "4:03:21"}
|
||||
{"current_steps": 545, "total_steps": 2536, "loss": 0.434, "lr": 8.946306265119167e-05, "epoch": 1.7192429022082019, "percentage": 21.49, "elapsed_time": "1:07:08", "remaining_time": "4:05:15"}
|
||||
{"current_steps": 550, "total_steps": 2536, "loss": 0.4376, "lr": 8.927114579782625e-05, "epoch": 1.7350157728706623, "percentage": 21.69, "elapsed_time": "1:08:21", "remaining_time": "4:06:50"}
|
||||
{"current_steps": 555, "total_steps": 2536, "loss": 0.4574, "lr": 8.907770672446381e-05, "epoch": 1.750788643533123, "percentage": 21.88, "elapsed_time": "1:09:26", "remaining_time": "4:07:53"}
|
||||
{"current_steps": 560, "total_steps": 2536, "loss": 0.4397, "lr": 8.888275292914948e-05, "epoch": 1.7665615141955837, "percentage": 22.08, "elapsed_time": "1:10:33", "remaining_time": "4:08:58"}
|
||||
{"current_steps": 565, "total_steps": 2536, "loss": 0.4264, "lr": 8.868629196864182e-05, "epoch": 1.7823343848580442, "percentage": 22.28, "elapsed_time": "1:11:45", "remaining_time": "4:10:19"}
|
||||
{"current_steps": 570, "total_steps": 2536, "loss": 0.4486, "lr": 8.848833145811976e-05, "epoch": 1.7981072555205047, "percentage": 22.48, "elapsed_time": "1:12:57", "remaining_time": "4:11:40"}
|
||||
{"current_steps": 575, "total_steps": 2536, "loss": 0.4338, "lr": 8.828887907088753e-05, "epoch": 1.8138801261829653, "percentage": 22.67, "elapsed_time": "1:14:06", "remaining_time": "4:12:43"}
|
||||
{"current_steps": 580, "total_steps": 2536, "loss": 0.4275, "lr": 8.808794253807707e-05, "epoch": 1.8296529968454258, "percentage": 22.87, "elapsed_time": "1:15:13", "remaining_time": "4:13:42"}
|
||||
{"current_steps": 585, "total_steps": 2536, "loss": 0.4468, "lr": 8.788552964834859e-05, "epoch": 1.8454258675078865, "percentage": 23.07, "elapsed_time": "1:16:24", "remaining_time": "4:14:49"}
|
||||
{"current_steps": 590, "total_steps": 2536, "loss": 0.4558, "lr": 8.768164824758846e-05, "epoch": 1.861198738170347, "percentage": 23.26, "elapsed_time": "1:17:32", "remaining_time": "4:15:45"}
|
||||
{"current_steps": 595, "total_steps": 2536, "loss": 0.4274, "lr": 8.747630623860521e-05, "epoch": 1.8769716088328074, "percentage": 23.46, "elapsed_time": "1:18:40", "remaining_time": "4:16:39"}
|
||||
{"current_steps": 600, "total_steps": 2536, "loss": 0.4339, "lr": 8.726951158082311e-05, "epoch": 1.8927444794952681, "percentage": 23.66, "elapsed_time": "1:19:50", "remaining_time": "4:17:36"}
|
||||
{"current_steps": 605, "total_steps": 2536, "loss": 0.4539, "lr": 8.706127228997376e-05, "epoch": 1.9085173501577288, "percentage": 23.86, "elapsed_time": "1:27:14", "remaining_time": "4:38:28"}
|
||||
{"current_steps": 610, "total_steps": 2536, "loss": 0.431, "lr": 8.685159643778528e-05, "epoch": 1.9242902208201893, "percentage": 24.05, "elapsed_time": "1:28:16", "remaining_time": "4:38:42"}
|
||||
{"current_steps": 615, "total_steps": 2536, "loss": 0.4606, "lr": 8.664049215166955e-05, "epoch": 1.9400630914826498, "percentage": 24.25, "elapsed_time": "1:29:29", "remaining_time": "4:39:30"}
|
||||
{"current_steps": 620, "total_steps": 2536, "loss": 0.4387, "lr": 8.6427967614407e-05, "epoch": 1.9558359621451105, "percentage": 24.45, "elapsed_time": "1:30:39", "remaining_time": "4:40:10"}
|
||||
{"current_steps": 625, "total_steps": 2536, "loss": 0.4398, "lr": 8.621403106382968e-05, "epoch": 1.971608832807571, "percentage": 24.65, "elapsed_time": "1:31:49", "remaining_time": "4:40:44"}
|
||||
{"current_steps": 630, "total_steps": 2536, "loss": 0.4582, "lr": 8.599869079250165e-05, "epoch": 1.9873817034700316, "percentage": 24.84, "elapsed_time": "1:32:55", "remaining_time": "4:41:09"}
|
||||
{"current_steps": 635, "total_steps": 2536, "loss": 0.4383, "lr": 8.578195514739784e-05, "epoch": 2.003154574132492, "percentage": 25.04, "elapsed_time": "1:34:07", "remaining_time": "4:41:47"}
|
||||
{"current_steps": 640, "total_steps": 2536, "loss": 0.3511, "lr": 8.556383252958026e-05, "epoch": 2.0189274447949526, "percentage": 25.24, "elapsed_time": "1:35:22", "remaining_time": "4:42:33"}
|
||||
{"current_steps": 645, "total_steps": 2536, "loss": 0.3508, "lr": 8.534433139387259e-05, "epoch": 2.034700315457413, "percentage": 25.43, "elapsed_time": "1:36:32", "remaining_time": "4:43:02"}
|
||||
{"current_steps": 650, "total_steps": 2536, "loss": 0.3432, "lr": 8.512346024853219e-05, "epoch": 2.050473186119874, "percentage": 25.63, "elapsed_time": "1:37:40", "remaining_time": "4:43:25"}
|
||||
{"current_steps": 655, "total_steps": 2536, "loss": 0.3577, "lr": 8.490122765492057e-05, "epoch": 2.0662460567823344, "percentage": 25.83, "elapsed_time": "1:38:51", "remaining_time": "4:43:52"}
|
||||
{"current_steps": 660, "total_steps": 2536, "loss": 0.3287, "lr": 8.467764222717136e-05, "epoch": 2.082018927444795, "percentage": 26.03, "elapsed_time": "1:40:05", "remaining_time": "4:44:30"}
|
||||
{"current_steps": 665, "total_steps": 2536, "loss": 0.3454, "lr": 8.445271263185646e-05, "epoch": 2.0977917981072554, "percentage": 26.22, "elapsed_time": "1:41:13", "remaining_time": "4:44:48"}
|
||||
{"current_steps": 670, "total_steps": 2536, "loss": 0.3308, "lr": 8.422644758765012e-05, "epoch": 2.1135646687697163, "percentage": 26.42, "elapsed_time": "1:42:21", "remaining_time": "4:45:04"}
|
||||
{"current_steps": 675, "total_steps": 2536, "loss": 0.3229, "lr": 8.399885586499101e-05, "epoch": 2.1293375394321767, "percentage": 26.62, "elapsed_time": "1:43:30", "remaining_time": "4:45:23"}
|
||||
{"current_steps": 680, "total_steps": 2536, "loss": 0.3333, "lr": 8.376994628574219e-05, "epoch": 2.145110410094637, "percentage": 26.81, "elapsed_time": "1:44:36", "remaining_time": "4:45:32"}
|
||||
{"current_steps": 685, "total_steps": 2536, "loss": 0.3452, "lr": 8.353972772284927e-05, "epoch": 2.1608832807570977, "percentage": 27.01, "elapsed_time": "1:45:51", "remaining_time": "4:46:03"}
|
||||
{"current_steps": 690, "total_steps": 2536, "loss": 0.3384, "lr": 8.330820909999633e-05, "epoch": 2.176656151419558, "percentage": 27.21, "elapsed_time": "1:47:00", "remaining_time": "4:46:17"}
|
||||
{"current_steps": 695, "total_steps": 2536, "loss": 0.3262, "lr": 8.307539939126016e-05, "epoch": 2.192429022082019, "percentage": 27.41, "elapsed_time": "1:48:09", "remaining_time": "4:46:29"}
|
||||
{"current_steps": 700, "total_steps": 2536, "loss": 0.3276, "lr": 8.284130762076235e-05, "epoch": 2.2082018927444795, "percentage": 27.6, "elapsed_time": "1:49:14", "remaining_time": "4:46:30"}
|
||||
{"current_steps": 705, "total_steps": 2536, "loss": 0.3275, "lr": 8.260594286231947e-05, "epoch": 2.22397476340694, "percentage": 27.8, "elapsed_time": "1:57:46", "remaining_time": "5:05:52"}
|
||||
{"current_steps": 710, "total_steps": 2536, "loss": 0.3422, "lr": 8.236931423909138e-05, "epoch": 2.2397476340694005, "percentage": 28.0, "elapsed_time": "1:58:51", "remaining_time": "5:05:42"}
|
||||
{"current_steps": 715, "total_steps": 2536, "loss": 0.319, "lr": 8.213143092322769e-05, "epoch": 2.2555205047318614, "percentage": 28.19, "elapsed_time": "1:59:57", "remaining_time": "5:05:31"}
|
||||
{"current_steps": 720, "total_steps": 2536, "loss": 0.3456, "lr": 8.189230213551202e-05, "epoch": 2.271293375394322, "percentage": 28.39, "elapsed_time": "2:01:04", "remaining_time": "5:05:23"}
|
||||
{"current_steps": 725, "total_steps": 2536, "loss": 0.3593, "lr": 8.165193714500481e-05, "epoch": 2.2870662460567823, "percentage": 28.59, "elapsed_time": "2:02:12", "remaining_time": "5:05:15"}
|
||||
{"current_steps": 730, "total_steps": 2536, "loss": 0.3396, "lr": 8.141034526868389e-05, "epoch": 2.302839116719243, "percentage": 28.79, "elapsed_time": "2:03:21", "remaining_time": "5:05:10"}
|
||||
{"current_steps": 735, "total_steps": 2536, "loss": 0.3275, "lr": 8.116753587108339e-05, "epoch": 2.3186119873817033, "percentage": 28.98, "elapsed_time": "2:04:30", "remaining_time": "5:05:05"}
|
||||
{"current_steps": 740, "total_steps": 2536, "loss": 0.3394, "lr": 8.092351836393076e-05, "epoch": 2.334384858044164, "percentage": 29.18, "elapsed_time": "2:05:47", "remaining_time": "5:05:17"}
|
||||
{"current_steps": 745, "total_steps": 2536, "loss": 0.3717, "lr": 8.067830220578191e-05, "epoch": 2.3501577287066246, "percentage": 29.38, "elapsed_time": "2:06:56", "remaining_time": "5:05:09"}
|
||||
{"current_steps": 750, "total_steps": 2536, "loss": 0.3322, "lr": 8.043189690165467e-05, "epoch": 2.365930599369085, "percentage": 29.57, "elapsed_time": "2:08:04", "remaining_time": "5:04:59"}
|
||||
{"current_steps": 755, "total_steps": 2536, "loss": 0.3334, "lr": 8.018431200266023e-05, "epoch": 2.3817034700315456, "percentage": 29.77, "elapsed_time": "2:09:02", "remaining_time": "5:04:24"}
|
||||
{"current_steps": 760, "total_steps": 2536, "loss": 0.3301, "lr": 7.993555710563303e-05, "epoch": 2.3974763406940065, "percentage": 29.97, "elapsed_time": "2:10:13", "remaining_time": "5:04:19"}
|
||||
{"current_steps": 765, "total_steps": 2536, "loss": 0.3503, "lr": 7.968564185275873e-05, "epoch": 2.413249211356467, "percentage": 30.17, "elapsed_time": "2:11:20", "remaining_time": "5:04:03"}
|
||||
{"current_steps": 770, "total_steps": 2536, "loss": 0.348, "lr": 7.943457593120045e-05, "epoch": 2.4290220820189274, "percentage": 30.36, "elapsed_time": "2:12:31", "remaining_time": "5:03:56"}
|
||||
{"current_steps": 775, "total_steps": 2536, "loss": 0.3453, "lr": 7.918236907272327e-05, "epoch": 2.444794952681388, "percentage": 30.56, "elapsed_time": "2:13:39", "remaining_time": "5:03:41"}
|
||||
{"current_steps": 780, "total_steps": 2536, "loss": 0.34, "lr": 7.892903105331712e-05, "epoch": 2.4605678233438484, "percentage": 30.76, "elapsed_time": "2:14:50", "remaining_time": "5:03:33"}
|
||||
{"current_steps": 785, "total_steps": 2536, "loss": 0.3489, "lr": 7.867457169281765e-05, "epoch": 2.4763406940063093, "percentage": 30.95, "elapsed_time": "2:15:56", "remaining_time": "5:03:14"}
|
||||
{"current_steps": 790, "total_steps": 2536, "loss": 0.3527, "lr": 7.841900085452574e-05, "epoch": 2.4921135646687698, "percentage": 31.15, "elapsed_time": "2:16:53", "remaining_time": "5:02:32"}
|
||||
{"current_steps": 795, "total_steps": 2536, "loss": 0.3336, "lr": 7.816232844482516e-05, "epoch": 2.5078864353312302, "percentage": 31.35, "elapsed_time": "2:18:03", "remaining_time": "5:02:20"}
|
||||
{"current_steps": 800, "total_steps": 2536, "loss": 0.3512, "lr": 7.790456441279853e-05, "epoch": 2.5236593059936907, "percentage": 31.55, "elapsed_time": "2:19:12", "remaining_time": "5:02:05"}
|
||||
{"current_steps": 805, "total_steps": 2536, "loss": 0.3511, "lr": 7.764571874984174e-05, "epoch": 2.5394321766561516, "percentage": 31.74, "elapsed_time": "2:31:01", "remaining_time": "5:24:44"}
|
||||
{"current_steps": 810, "total_steps": 2536, "loss": 0.3504, "lr": 7.73858014892766e-05, "epoch": 2.555205047318612, "percentage": 31.94, "elapsed_time": "2:32:11", "remaining_time": "5:24:17"}
|
||||
{"current_steps": 815, "total_steps": 2536, "loss": 0.328, "lr": 7.712482270596199e-05, "epoch": 2.5709779179810726, "percentage": 32.14, "elapsed_time": "2:33:16", "remaining_time": "5:23:39"}
|
||||
{"current_steps": 820, "total_steps": 2536, "loss": 0.3359, "lr": 7.686279251590331e-05, "epoch": 2.586750788643533, "percentage": 32.33, "elapsed_time": "2:34:27", "remaining_time": "5:23:13"}
|
||||
{"current_steps": 825, "total_steps": 2536, "loss": 0.3533, "lr": 7.659972107586035e-05, "epoch": 2.6025236593059935, "percentage": 32.53, "elapsed_time": "2:35:34", "remaining_time": "5:22:39"}
|
||||
{"current_steps": 830, "total_steps": 2536, "loss": 0.3614, "lr": 7.633561858295364e-05, "epoch": 2.6182965299684544, "percentage": 32.73, "elapsed_time": "2:36:44", "remaining_time": "5:22:10"}
|
||||
{"current_steps": 835, "total_steps": 2536, "loss": 0.3531, "lr": 7.607049527426916e-05, "epoch": 2.634069400630915, "percentage": 32.93, "elapsed_time": "2:37:56", "remaining_time": "5:21:45"}
|
||||
{"current_steps": 840, "total_steps": 2536, "loss": 0.3615, "lr": 7.580436142646155e-05, "epoch": 2.6498422712933754, "percentage": 33.12, "elapsed_time": "2:39:05", "remaining_time": "5:21:13"}
|
||||
{"current_steps": 845, "total_steps": 2536, "loss": 0.3458, "lr": 7.55372273553557e-05, "epoch": 2.665615141955836, "percentage": 33.32, "elapsed_time": "2:40:13", "remaining_time": "5:20:38"}
|
||||
{"current_steps": 850, "total_steps": 2536, "loss": 0.3463, "lr": 7.526910341554703e-05, "epoch": 2.6813880126182967, "percentage": 33.52, "elapsed_time": "2:41:19", "remaining_time": "5:19:58"}
|
||||
{"current_steps": 855, "total_steps": 2536, "loss": 0.351, "lr": 7.500000000000001e-05, "epoch": 2.697160883280757, "percentage": 33.71, "elapsed_time": "2:42:23", "remaining_time": "5:19:17"}
|
||||
{"current_steps": 860, "total_steps": 2536, "loss": 0.3578, "lr": 7.472992753964532e-05, "epoch": 2.7129337539432177, "percentage": 33.91, "elapsed_time": "2:43:35", "remaining_time": "5:18:48"}
|
||||
{"current_steps": 865, "total_steps": 2536, "loss": 0.3374, "lr": 7.445889650297559e-05, "epoch": 2.728706624605678, "percentage": 34.11, "elapsed_time": "2:44:42", "remaining_time": "5:18:10"}
|
||||
{"current_steps": 870, "total_steps": 2536, "loss": 0.3316, "lr": 7.418691739563957e-05, "epoch": 2.7444794952681386, "percentage": 34.31, "elapsed_time": "2:45:51", "remaining_time": "5:17:35"}
|
||||
{"current_steps": 875, "total_steps": 2536, "loss": 0.3363, "lr": 7.391400076003492e-05, "epoch": 2.7602523659305995, "percentage": 34.5, "elapsed_time": "2:46:55", "remaining_time": "5:16:52"}
|
||||
{"current_steps": 880, "total_steps": 2536, "loss": 0.3435, "lr": 7.36401571748996e-05, "epoch": 2.77602523659306, "percentage": 34.7, "elapsed_time": "2:48:06", "remaining_time": "5:16:20"}
|
||||
{"current_steps": 885, "total_steps": 2536, "loss": 0.3664, "lr": 7.336539725490178e-05, "epoch": 2.7917981072555205, "percentage": 34.9, "elapsed_time": "2:49:10", "remaining_time": "5:15:35"}
|
||||
{"current_steps": 890, "total_steps": 2536, "loss": 0.3513, "lr": 7.30897316502284e-05, "epoch": 2.807570977917981, "percentage": 35.09, "elapsed_time": "2:50:20", "remaining_time": "5:15:02"}
|
||||
{"current_steps": 895, "total_steps": 2536, "loss": 0.3526, "lr": 7.281317104617239e-05, "epoch": 2.823343848580442, "percentage": 35.29, "elapsed_time": "2:51:23", "remaining_time": "5:14:14"}
|
||||
{"current_steps": 900, "total_steps": 2536, "loss": 0.3543, "lr": 7.253572616271844e-05, "epoch": 2.8391167192429023, "percentage": 35.49, "elapsed_time": "2:52:32", "remaining_time": "5:13:38"}
|
||||
{"current_steps": 905, "total_steps": 2536, "loss": 0.3455, "lr": 7.225740775412751e-05, "epoch": 2.854889589905363, "percentage": 35.69, "elapsed_time": "3:02:43", "remaining_time": "5:29:18"}
|
||||
{"current_steps": 910, "total_steps": 2536, "loss": 0.3538, "lr": 7.197822660851991e-05, "epoch": 2.8706624605678233, "percentage": 35.88, "elapsed_time": "3:03:57", "remaining_time": "5:28:41"}
|
||||
{"current_steps": 915, "total_steps": 2536, "loss": 0.3552, "lr": 7.169819354745725e-05, "epoch": 2.8864353312302837, "percentage": 36.08, "elapsed_time": "3:05:04", "remaining_time": "5:27:52"}
|
||||
{"current_steps": 920, "total_steps": 2536, "loss": 0.3686, "lr": 7.141731942552288e-05, "epoch": 2.9022082018927446, "percentage": 36.28, "elapsed_time": "3:06:13", "remaining_time": "5:27:06"}
|
||||
{"current_steps": 925, "total_steps": 2536, "loss": 0.3568, "lr": 7.113561512990119e-05, "epoch": 2.917981072555205, "percentage": 36.47, "elapsed_time": "3:07:24", "remaining_time": "5:26:23"}
|
||||
{"current_steps": 930, "total_steps": 2536, "loss": 0.3656, "lr": 7.085309157995557e-05, "epoch": 2.9337539432176656, "percentage": 36.67, "elapsed_time": "3:08:32", "remaining_time": "5:25:34"}
|
||||
{"current_steps": 935, "total_steps": 2536, "loss": 0.3781, "lr": 7.056975972680517e-05, "epoch": 2.949526813880126, "percentage": 36.87, "elapsed_time": "3:09:40", "remaining_time": "5:24:47"}
|
||||
{"current_steps": 940, "total_steps": 2536, "loss": 0.3673, "lr": 7.028563055290044e-05, "epoch": 2.965299684542587, "percentage": 37.07, "elapsed_time": "3:10:42", "remaining_time": "5:23:47"}
|
||||
{"current_steps": 945, "total_steps": 2536, "loss": 0.3358, "lr": 7.000071507159744e-05, "epoch": 2.9810725552050474, "percentage": 37.26, "elapsed_time": "3:11:52", "remaining_time": "5:23:03"}
|
||||
{"current_steps": 950, "total_steps": 2536, "loss": 0.3437, "lr": 6.971502432673085e-05, "epoch": 2.996845425867508, "percentage": 37.46, "elapsed_time": "3:13:04", "remaining_time": "5:22:20"}
|
||||
{"current_steps": 955, "total_steps": 2536, "loss": 0.2638, "lr": 6.942856939218599e-05, "epoch": 3.0126182965299684, "percentage": 37.66, "elapsed_time": "3:14:12", "remaining_time": "5:21:31"}
|
||||
{"current_steps": 960, "total_steps": 2536, "loss": 0.2445, "lr": 6.914136137146951e-05, "epoch": 3.028391167192429, "percentage": 37.85, "elapsed_time": "3:15:22", "remaining_time": "5:20:44"}
|
||||
{"current_steps": 965, "total_steps": 2536, "loss": 0.2397, "lr": 6.885341139727912e-05, "epoch": 3.0441640378548898, "percentage": 38.05, "elapsed_time": "3:16:31", "remaining_time": "5:19:56"}
|
||||
{"current_steps": 970, "total_steps": 2536, "loss": 0.246, "lr": 6.856473063107187e-05, "epoch": 3.0599369085173502, "percentage": 38.25, "elapsed_time": "3:17:38", "remaining_time": "5:19:04"}
|
||||
{"current_steps": 975, "total_steps": 2536, "loss": 0.241, "lr": 6.827533026263169e-05, "epoch": 3.0757097791798107, "percentage": 38.45, "elapsed_time": "3:18:46", "remaining_time": "5:18:14"}
|
||||
{"current_steps": 980, "total_steps": 2536, "loss": 0.2418, "lr": 6.798522150963552e-05, "epoch": 3.091482649842271, "percentage": 38.64, "elapsed_time": "3:19:54", "remaining_time": "5:17:23"}
|
||||
{"current_steps": 985, "total_steps": 2536, "loss": 0.2322, "lr": 6.769441561721863e-05, "epoch": 3.107255520504732, "percentage": 38.84, "elapsed_time": "3:21:05", "remaining_time": "5:16:38"}
|
||||
{"current_steps": 990, "total_steps": 2536, "loss": 0.237, "lr": 6.740292385753858e-05, "epoch": 3.1230283911671926, "percentage": 39.04, "elapsed_time": "3:22:10", "remaining_time": "5:15:43"}
|
||||
{"current_steps": 995, "total_steps": 2536, "loss": 0.2383, "lr": 6.711075752933847e-05, "epoch": 3.138801261829653, "percentage": 39.24, "elapsed_time": "3:23:16", "remaining_time": "5:14:49"}
|
||||
{"current_steps": 1000, "total_steps": 2536, "loss": 0.2309, "lr": 6.681792795750875e-05, "epoch": 3.1545741324921135, "percentage": 39.43, "elapsed_time": "3:24:26", "remaining_time": "5:14:01"}
{"current_steps": 1005, "total_steps": 2536, "loss": 0.2362, "lr": 6.652444649264856e-05, "epoch": 3.170347003154574, "percentage": 39.63, "elapsed_time": "3:27:26", "remaining_time": "5:16:00"}
{"current_steps": 1010, "total_steps": 2536, "loss": 0.2287, "lr": 6.623032451062542e-05, "epoch": 3.186119873817035, "percentage": 39.83, "elapsed_time": "3:28:33", "remaining_time": "5:15:07"}
{"current_steps": 1015, "total_steps": 2536, "loss": 0.2328, "lr": 6.593557341213457e-05, "epoch": 3.2018927444794953, "percentage": 40.02, "elapsed_time": "3:29:36", "remaining_time": "5:14:06"}
{"current_steps": 1020, "total_steps": 2536, "loss": 0.2404, "lr": 6.564020462225679e-05, "epoch": 3.217665615141956, "percentage": 40.22, "elapsed_time": "3:30:48", "remaining_time": "5:13:19"}
{"current_steps": 1025, "total_steps": 2536, "loss": 0.235, "lr": 6.534422959001585e-05, "epoch": 3.2334384858044163, "percentage": 40.42, "elapsed_time": "3:31:56", "remaining_time": "5:12:26"}
{"current_steps": 1030, "total_steps": 2536, "loss": 0.2355, "lr": 6.504765978793443e-05, "epoch": 3.249211356466877, "percentage": 40.62, "elapsed_time": "3:33:07", "remaining_time": "5:11:36"}
{"current_steps": 1035, "total_steps": 2536, "loss": 0.2297, "lr": 6.475050671158961e-05, "epoch": 3.2649842271293377, "percentage": 40.81, "elapsed_time": "3:34:08", "remaining_time": "5:10:33"}
{"current_steps": 1040, "total_steps": 2536, "loss": 0.2355, "lr": 6.445278187916722e-05, "epoch": 3.280757097791798, "percentage": 41.01, "elapsed_time": "3:35:19", "remaining_time": "5:09:43"}
{"current_steps": 1045, "total_steps": 2536, "loss": 0.2402, "lr": 6.415449683101537e-05, "epoch": 3.2965299684542586, "percentage": 41.21, "elapsed_time": "3:36:25", "remaining_time": "5:08:47"}
{"current_steps": 1050, "total_steps": 2536, "loss": 0.2397, "lr": 6.385566312919716e-05, "epoch": 3.312302839116719, "percentage": 41.4, "elapsed_time": "3:37:30", "remaining_time": "5:07:49"}
{"current_steps": 1055, "total_steps": 2536, "loss": 0.2423, "lr": 6.355629235704248e-05, "epoch": 3.32807570977918, "percentage": 41.6, "elapsed_time": "3:38:40", "remaining_time": "5:06:58"}
{"current_steps": 1060, "total_steps": 2536, "loss": 0.2309, "lr": 6.3256396118699e-05, "epoch": 3.3438485804416405, "percentage": 41.8, "elapsed_time": "3:39:47", "remaining_time": "5:06:02"}
{"current_steps": 1065, "total_steps": 2536, "loss": 0.2302, "lr": 6.295598603868246e-05, "epoch": 3.359621451104101, "percentage": 42.0, "elapsed_time": "3:40:54", "remaining_time": "5:05:06"}
{"current_steps": 1070, "total_steps": 2536, "loss": 0.2366, "lr": 6.265507376142594e-05, "epoch": 3.3753943217665614, "percentage": 42.19, "elapsed_time": "3:42:06", "remaining_time": "5:04:17"}
{"current_steps": 1075, "total_steps": 2536, "loss": 0.2402, "lr": 6.235367095082867e-05, "epoch": 3.3911671924290223, "percentage": 42.39, "elapsed_time": "3:43:10", "remaining_time": "5:03:18"}
{"current_steps": 1080, "total_steps": 2536, "loss": 0.2425, "lr": 6.205178928980377e-05, "epoch": 3.406940063091483, "percentage": 42.59, "elapsed_time": "3:44:18", "remaining_time": "5:02:24"}
{"current_steps": 1085, "total_steps": 2536, "loss": 0.2501, "lr": 6.174944047982549e-05, "epoch": 3.4227129337539433, "percentage": 42.78, "elapsed_time": "3:45:25", "remaining_time": "5:01:28"}
{"current_steps": 1090, "total_steps": 2536, "loss": 0.2508, "lr": 6.144663624047564e-05, "epoch": 3.4384858044164037, "percentage": 42.98, "elapsed_time": "3:46:33", "remaining_time": "5:00:32"}
{"current_steps": 1095, "total_steps": 2536, "loss": 0.2466, "lr": 6.114338830898922e-05, "epoch": 3.454258675078864, "percentage": 43.18, "elapsed_time": "3:47:38", "remaining_time": "4:59:34"}
{"current_steps": 1100, "total_steps": 2536, "loss": 0.2293, "lr": 6.083970843979957e-05, "epoch": 3.470031545741325, "percentage": 43.38, "elapsed_time": "3:48:49", "remaining_time": "4:58:43"}
{"current_steps": 1105, "total_steps": 2536, "loss": 0.2527, "lr": 6.0535608404082724e-05, "epoch": 3.4858044164037856, "percentage": 43.57, "elapsed_time": "3:55:32", "remaining_time": "5:05:01"}
{"current_steps": 1110, "total_steps": 2536, "loss": 0.2364, "lr": 6.0231099989301086e-05, "epoch": 3.501577287066246, "percentage": 43.77, "elapsed_time": "3:56:42", "remaining_time": "5:04:05"}
{"current_steps": 1115, "total_steps": 2536, "loss": 0.2366, "lr": 5.9926194998746624e-05, "epoch": 3.5173501577287065, "percentage": 43.97, "elapsed_time": "3:57:45", "remaining_time": "5:03:00"}
{"current_steps": 1120, "total_steps": 2536, "loss": 0.2321, "lr": 5.9620905251083196e-05, "epoch": 3.5331230283911674, "percentage": 44.16, "elapsed_time": "3:58:49", "remaining_time": "5:01:56"}
{"current_steps": 1125, "total_steps": 2536, "loss": 0.2445, "lr": 5.931524257988864e-05, "epoch": 3.548895899053628, "percentage": 44.36, "elapsed_time": "3:59:58", "remaining_time": "5:00:59"}
{"current_steps": 1130, "total_steps": 2536, "loss": 0.2344, "lr": 5.900921883319591e-05, "epoch": 3.5646687697160884, "percentage": 44.56, "elapsed_time": "4:01:07", "remaining_time": "5:00:01"}
{"current_steps": 1135, "total_steps": 2536, "loss": 0.2349, "lr": 5.870284587303394e-05, "epoch": 3.580441640378549, "percentage": 44.76, "elapsed_time": "4:02:13", "remaining_time": "4:58:59"}
{"current_steps": 1140, "total_steps": 2536, "loss": 0.2345, "lr": 5.839613557496776e-05, "epoch": 3.5962145110410093, "percentage": 44.95, "elapsed_time": "4:03:16", "remaining_time": "4:57:53"}
{"current_steps": 1145, "total_steps": 2536, "loss": 0.2425, "lr": 5.808909982763825e-05, "epoch": 3.61198738170347, "percentage": 45.15, "elapsed_time": "4:04:26", "remaining_time": "4:56:57"}
{"current_steps": 1150, "total_steps": 2536, "loss": 0.2351, "lr": 5.778175053230126e-05, "epoch": 3.6277602523659307, "percentage": 45.35, "elapsed_time": "4:05:36", "remaining_time": "4:56:00"}
{"current_steps": 1155, "total_steps": 2536, "loss": 0.2263, "lr": 5.747409960236637e-05, "epoch": 3.643533123028391, "percentage": 45.54, "elapsed_time": "4:06:43", "remaining_time": "4:55:00"}
{"current_steps": 1160, "total_steps": 2536, "loss": 0.225, "lr": 5.716615896293501e-05, "epoch": 3.6593059936908516, "percentage": 45.74, "elapsed_time": "4:07:57", "remaining_time": "4:54:08"}
{"current_steps": 1165, "total_steps": 2536, "loss": 0.2369, "lr": 5.68579405503383e-05, "epoch": 3.6750788643533125, "percentage": 45.94, "elapsed_time": "4:09:00", "remaining_time": "4:53:02"}
{"current_steps": 1170, "total_steps": 2536, "loss": 0.246, "lr": 5.654945631167433e-05, "epoch": 3.690851735015773, "percentage": 46.14, "elapsed_time": "4:10:08", "remaining_time": "4:52:03"}
{"current_steps": 1175, "total_steps": 2536, "loss": 0.2285, "lr": 5.624071820434508e-05, "epoch": 3.7066246056782335, "percentage": 46.33, "elapsed_time": "4:11:13", "remaining_time": "4:50:59"}
{"current_steps": 1180, "total_steps": 2536, "loss": 0.2469, "lr": 5.593173819559294e-05, "epoch": 3.722397476340694, "percentage": 46.53, "elapsed_time": "4:12:21", "remaining_time": "4:49:59"}
{"current_steps": 1185, "total_steps": 2536, "loss": 0.2442, "lr": 5.562252826203687e-05, "epoch": 3.7381703470031544, "percentage": 46.73, "elapsed_time": "4:13:28", "remaining_time": "4:48:59"}
{"current_steps": 1190, "total_steps": 2536, "loss": 0.2403, "lr": 5.531310038920805e-05, "epoch": 3.753943217665615, "percentage": 46.92, "elapsed_time": "4:14:40", "remaining_time": "4:48:03"}
{"current_steps": 1195, "total_steps": 2536, "loss": 0.2368, "lr": 5.500346657108545e-05, "epoch": 3.769716088328076, "percentage": 47.12, "elapsed_time": "4:15:52", "remaining_time": "4:47:08"}
{"current_steps": 1200, "total_steps": 2536, "loss": 0.2558, "lr": 5.469363880963082e-05, "epoch": 3.7854889589905363, "percentage": 47.32, "elapsed_time": "4:17:02", "remaining_time": "4:46:10"}
{"current_steps": 1205, "total_steps": 2536, "loss": 0.2477, "lr": 5.438362911432347e-05, "epoch": 3.8012618296529967, "percentage": 47.52, "elapsed_time": "4:22:55", "remaining_time": "4:50:25"}
{"current_steps": 1210, "total_steps": 2536, "loss": 0.2553, "lr": 5.407344950169486e-05, "epoch": 3.8170347003154577, "percentage": 47.71, "elapsed_time": "4:24:01", "remaining_time": "4:49:20"}
{"current_steps": 1215, "total_steps": 2536, "loss": 0.2492, "lr": 5.376311199486268e-05, "epoch": 3.832807570977918, "percentage": 47.91, "elapsed_time": "4:25:10", "remaining_time": "4:48:18"}
{"current_steps": 1220, "total_steps": 2536, "loss": 0.2358, "lr": 5.3452628623064934e-05, "epoch": 3.8485804416403786, "percentage": 48.11, "elapsed_time": "4:26:18", "remaining_time": "4:47:15"}
{"current_steps": 1225, "total_steps": 2536, "loss": 0.2574, "lr": 5.31420114211936e-05, "epoch": 3.864353312302839, "percentage": 48.3, "elapsed_time": "4:27:29", "remaining_time": "4:46:15"}
{"current_steps": 1230, "total_steps": 2536, "loss": 0.2411, "lr": 5.2831272429328116e-05, "epoch": 3.8801261829652995, "percentage": 48.5, "elapsed_time": "4:28:38", "remaining_time": "4:45:13"}
{"current_steps": 1235, "total_steps": 2536, "loss": 0.2452, "lr": 5.2520423692268775e-05, "epoch": 3.89589905362776, "percentage": 48.7, "elapsed_time": "4:29:43", "remaining_time": "4:44:08"}
{"current_steps": 1240, "total_steps": 2536, "loss": 0.2349, "lr": 5.220947725906975e-05, "epoch": 3.911671924290221, "percentage": 48.9, "elapsed_time": "4:30:54", "remaining_time": "4:43:08"}
{"current_steps": 1245, "total_steps": 2536, "loss": 0.2574, "lr": 5.18984451825721e-05, "epoch": 3.9274447949526814, "percentage": 49.09, "elapsed_time": "4:32:02", "remaining_time": "4:42:05"}
{"current_steps": 1250, "total_steps": 2536, "loss": 0.2504, "lr": 5.1587339518936585e-05, "epoch": 3.943217665615142, "percentage": 49.29, "elapsed_time": "4:33:16", "remaining_time": "4:41:09"}
{"current_steps": 1255, "total_steps": 2536, "loss": 0.2435, "lr": 5.127617232717631e-05, "epoch": 3.958990536277603, "percentage": 49.49, "elapsed_time": "4:34:29", "remaining_time": "4:40:10"}
{"current_steps": 1260, "total_steps": 2536, "loss": 0.2608, "lr": 5.096495566868935e-05, "epoch": 3.9747634069400632, "percentage": 49.68, "elapsed_time": "4:35:32", "remaining_time": "4:39:02"}
{"current_steps": 1265, "total_steps": 2536, "loss": 0.2419, "lr": 5.065370160679115e-05, "epoch": 3.9905362776025237, "percentage": 49.88, "elapsed_time": "4:36:37", "remaining_time": "4:37:56"}
{"current_steps": 1270, "total_steps": 2536, "loss": 0.2013, "lr": 5.034242220624706e-05, "epoch": 4.006309148264984, "percentage": 50.08, "elapsed_time": "4:37:39", "remaining_time": "4:36:46"}
{"current_steps": 1275, "total_steps": 2536, "loss": 0.1431, "lr": 5.003112953280452e-05, "epoch": 4.022082018927445, "percentage": 50.28, "elapsed_time": "4:38:52", "remaining_time": "4:35:48"}
{"current_steps": 1280, "total_steps": 2536, "loss": 0.1438, "lr": 4.971983565272553e-05, "epoch": 4.037854889589905, "percentage": 50.47, "elapsed_time": "4:39:59", "remaining_time": "4:34:44"}
{"current_steps": 1285, "total_steps": 2536, "loss": 0.1387, "lr": 4.940855263231873e-05, "epoch": 4.053627760252366, "percentage": 50.67, "elapsed_time": "4:41:07", "remaining_time": "4:33:40"}
{"current_steps": 1290, "total_steps": 2536, "loss": 0.1464, "lr": 4.909729253747197e-05, "epoch": 4.069400630914826, "percentage": 50.87, "elapsed_time": "4:42:14", "remaining_time": "4:32:37"}
{"current_steps": 1295, "total_steps": 2536, "loss": 0.1425, "lr": 4.878606743318439e-05, "epoch": 4.085173501577287, "percentage": 51.06, "elapsed_time": "4:43:16", "remaining_time": "4:31:27"}
{"current_steps": 1300, "total_steps": 2536, "loss": 0.1452, "lr": 4.8474889383098855e-05, "epoch": 4.100946372239748, "percentage": 51.26, "elapsed_time": "4:44:27", "remaining_time": "4:30:26"}
{"current_steps": 1305, "total_steps": 2536, "loss": 0.1429, "lr": 4.816377044903428e-05, "epoch": 4.116719242902208, "percentage": 51.46, "elapsed_time": "4:50:53", "remaining_time": "4:34:24"}
{"current_steps": 1310, "total_steps": 2536, "loss": 0.1509, "lr": 4.7852722690518196e-05, "epoch": 4.132492113564669, "percentage": 51.66, "elapsed_time": "4:51:59", "remaining_time": "4:33:16"}
{"current_steps": 1315, "total_steps": 2536, "loss": 0.1389, "lr": 4.75417581643192e-05, "epoch": 4.148264984227129, "percentage": 51.85, "elapsed_time": "4:53:06", "remaining_time": "4:32:09"}
{"current_steps": 1320, "total_steps": 2536, "loss": 0.1387, "lr": 4.723088892397968e-05, "epoch": 4.16403785488959, "percentage": 52.05, "elapsed_time": "4:54:07", "remaining_time": "4:30:57"}
{"current_steps": 1325, "total_steps": 2536, "loss": 0.1432, "lr": 4.6920127019348556e-05, "epoch": 4.17981072555205, "percentage": 52.25, "elapsed_time": "4:55:14", "remaining_time": "4:29:50"}
{"current_steps": 1330, "total_steps": 2536, "loss": 0.1384, "lr": 4.6609484496114256e-05, "epoch": 4.195583596214511, "percentage": 52.44, "elapsed_time": "4:56:27", "remaining_time": "4:28:48"}
{"current_steps": 1335, "total_steps": 2536, "loss": 0.1371, "lr": 4.629897339533771e-05, "epoch": 4.211356466876971, "percentage": 52.64, "elapsed_time": "4:57:30", "remaining_time": "4:27:38"}
{"current_steps": 1340, "total_steps": 2536, "loss": 0.1429, "lr": 4.598860575298575e-05, "epoch": 4.2271293375394325, "percentage": 52.84, "elapsed_time": "4:58:39", "remaining_time": "4:26:33"}
{"current_steps": 1345, "total_steps": 2536, "loss": 0.1533, "lr": 4.5678393599464435e-05, "epoch": 4.242902208201893, "percentage": 53.04, "elapsed_time": "4:59:46", "remaining_time": "4:25:27"}
{"current_steps": 1350, "total_steps": 2536, "loss": 0.1555, "lr": 4.5368348959152864e-05, "epoch": 4.2586750788643535, "percentage": 53.23, "elapsed_time": "5:00:53", "remaining_time": "4:24:20"}
{"current_steps": 1355, "total_steps": 2536, "loss": 0.1421, "lr": 4.505848384993696e-05, "epoch": 4.274447949526814, "percentage": 53.43, "elapsed_time": "5:01:59", "remaining_time": "4:23:13"}
{"current_steps": 1360, "total_steps": 2536, "loss": 0.1479, "lr": 4.474881028274375e-05, "epoch": 4.290220820189274, "percentage": 53.63, "elapsed_time": "5:03:10", "remaining_time": "4:22:09"}
{"current_steps": 1365, "total_steps": 2536, "loss": 0.1459, "lr": 4.4439340261075716e-05, "epoch": 4.305993690851735, "percentage": 53.82, "elapsed_time": "5:04:13", "remaining_time": "4:20:59"}
{"current_steps": 1370, "total_steps": 2536, "loss": 0.1486, "lr": 4.413008578054558e-05, "epoch": 4.321766561514195, "percentage": 54.02, "elapsed_time": "5:05:07", "remaining_time": "4:19:41"}
{"current_steps": 1375, "total_steps": 2536, "loss": 0.1408, "lr": 4.3821058828411244e-05, "epoch": 4.337539432176656, "percentage": 54.22, "elapsed_time": "5:06:16", "remaining_time": "4:18:36"}
{"current_steps": 1380, "total_steps": 2536, "loss": 0.1453, "lr": 4.35122713831113e-05, "epoch": 4.353312302839116, "percentage": 54.42, "elapsed_time": "5:07:24", "remaining_time": "4:17:30"}
{"current_steps": 1385, "total_steps": 2536, "loss": 0.1446, "lr": 4.320373541380054e-05, "epoch": 4.369085173501578, "percentage": 54.61, "elapsed_time": "5:08:33", "remaining_time": "4:16:25"}
{"current_steps": 1390, "total_steps": 2536, "loss": 0.1488, "lr": 4.289546287988614e-05, "epoch": 4.384858044164038, "percentage": 54.81, "elapsed_time": "5:09:47", "remaining_time": "4:15:24"}
{"current_steps": 1395, "total_steps": 2536, "loss": 0.152, "lr": 4.258746573056401e-05, "epoch": 4.400630914826499, "percentage": 55.01, "elapsed_time": "5:10:52", "remaining_time": "4:14:16"}
{"current_steps": 1400, "total_steps": 2536, "loss": 0.1518, "lr": 4.2279755904355704e-05, "epoch": 4.416403785488959, "percentage": 55.21, "elapsed_time": "5:12:02", "remaining_time": "4:13:12"}
{"current_steps": 1405, "total_steps": 2536, "loss": 0.138, "lr": 4.197234532864558e-05, "epoch": 4.4321766561514195, "percentage": 55.4, "elapsed_time": "5:23:22", "remaining_time": "4:20:18"}
{"current_steps": 1410, "total_steps": 2536, "loss": 0.1509, "lr": 4.1665245919218544e-05, "epoch": 4.44794952681388, "percentage": 55.6, "elapsed_time": "5:24:30", "remaining_time": "4:19:08"}
{"current_steps": 1415, "total_steps": 2536, "loss": 0.1486, "lr": 4.135846957979811e-05, "epoch": 4.4637223974763405, "percentage": 55.8, "elapsed_time": "5:25:39", "remaining_time": "4:17:59"}
{"current_steps": 1420, "total_steps": 2536, "loss": 0.148, "lr": 4.105202820158503e-05, "epoch": 4.479495268138801, "percentage": 55.99, "elapsed_time": "5:26:55", "remaining_time": "4:16:56"}
{"current_steps": 1425, "total_steps": 2536, "loss": 0.1347, "lr": 4.074593366279636e-05, "epoch": 4.495268138801261, "percentage": 56.19, "elapsed_time": "5:28:03", "remaining_time": "4:15:45"}
{"current_steps": 1430, "total_steps": 2536, "loss": 0.1656, "lr": 4.044019782820505e-05, "epoch": 4.511041009463723, "percentage": 56.39, "elapsed_time": "5:29:18", "remaining_time": "4:14:42"}
{"current_steps": 1435, "total_steps": 2536, "loss": 0.1479, "lr": 4.0134832548680006e-05, "epoch": 4.526813880126183, "percentage": 56.59, "elapsed_time": "5:30:27", "remaining_time": "4:13:32"}
{"current_steps": 1440, "total_steps": 2536, "loss": 0.1444, "lr": 3.982984966072677e-05, "epoch": 4.542586750788644, "percentage": 56.78, "elapsed_time": "5:31:35", "remaining_time": "4:12:22"}
{"current_steps": 1445, "total_steps": 2536, "loss": 0.1512, "lr": 3.952526098602873e-05, "epoch": 4.558359621451104, "percentage": 56.98, "elapsed_time": "5:32:40", "remaining_time": "4:11:10"}
{"current_steps": 1450, "total_steps": 2536, "loss": 0.1397, "lr": 3.9221078330988806e-05, "epoch": 4.574132492113565, "percentage": 57.18, "elapsed_time": "5:33:49", "remaining_time": "4:10:01"}
{"current_steps": 1455, "total_steps": 2536, "loss": 0.1505, "lr": 3.89173134862719e-05, "epoch": 4.589905362776025, "percentage": 57.37, "elapsed_time": "5:34:57", "remaining_time": "4:08:51"}
{"current_steps": 1460, "total_steps": 2536, "loss": 0.1515, "lr": 3.861397822634784e-05, "epoch": 4.605678233438486, "percentage": 57.57, "elapsed_time": "5:36:09", "remaining_time": "4:07:44"}
{"current_steps": 1465, "total_steps": 2536, "loss": 0.14, "lr": 3.831108430903494e-05, "epoch": 4.621451104100946, "percentage": 57.77, "elapsed_time": "5:37:14", "remaining_time": "4:06:32"}
{"current_steps": 1470, "total_steps": 2536, "loss": 0.1513, "lr": 3.800864347504437e-05, "epoch": 4.6372239747634065, "percentage": 57.97, "elapsed_time": "5:38:24", "remaining_time": "4:05:24"}
{"current_steps": 1475, "total_steps": 2536, "loss": 0.1524, "lr": 3.7706667447524876e-05, "epoch": 4.652996845425868, "percentage": 58.16, "elapsed_time": "5:39:33", "remaining_time": "4:04:15"}
{"current_steps": 1480, "total_steps": 2536, "loss": 0.1465, "lr": 3.740516793160855e-05, "epoch": 4.668769716088328, "percentage": 58.36, "elapsed_time": "5:40:35", "remaining_time": "4:03:01"}
{"current_steps": 1485, "total_steps": 2536, "loss": 0.1503, "lr": 3.710415661395699e-05, "epoch": 4.684542586750789, "percentage": 58.56, "elapsed_time": "5:41:46", "remaining_time": "4:01:53"}
{"current_steps": 1490, "total_steps": 2536, "loss": 0.1495, "lr": 3.6803645162308376e-05, "epoch": 4.700315457413249, "percentage": 58.75, "elapsed_time": "5:42:57", "remaining_time": "4:00:45"}
{"current_steps": 1495, "total_steps": 2536, "loss": 0.1606, "lr": 3.6503645225025175e-05, "epoch": 4.71608832807571, "percentage": 58.95, "elapsed_time": "5:44:03", "remaining_time": "3:59:34"}
{"current_steps": 1500, "total_steps": 2536, "loss": 0.1449, "lr": 3.620416843064266e-05, "epoch": 4.73186119873817, "percentage": 59.15, "elapsed_time": "5:45:06", "remaining_time": "3:58:20"}
{"current_steps": 1505, "total_steps": 2536, "loss": 0.1483, "lr": 3.5905226387418126e-05, "epoch": 4.747634069400631, "percentage": 59.35, "elapsed_time": "5:52:06", "remaining_time": "4:01:12"}
{"current_steps": 1510, "total_steps": 2536, "loss": 0.1556, "lr": 3.5606830682880965e-05, "epoch": 4.763406940063091, "percentage": 59.54, "elapsed_time": "5:53:11", "remaining_time": "3:59:59"}
{"current_steps": 1515, "total_steps": 2536, "loss": 0.1563, "lr": 3.530899288338352e-05, "epoch": 4.779179810725552, "percentage": 59.74, "elapsed_time": "5:54:15", "remaining_time": "3:58:44"}
{"current_steps": 1520, "total_steps": 2536, "loss": 0.1502, "lr": 3.501172453365268e-05, "epoch": 4.794952681388013, "percentage": 59.94, "elapsed_time": "5:55:22", "remaining_time": "3:57:32"}
{"current_steps": 1525, "total_steps": 2536, "loss": 0.1403, "lr": 3.471503715634252e-05, "epoch": 4.8107255520504735, "percentage": 60.13, "elapsed_time": "5:56:34", "remaining_time": "3:56:23"}
{"current_steps": 1530, "total_steps": 2536, "loss": 0.1482, "lr": 3.44189422515875e-05, "epoch": 4.826498422712934, "percentage": 60.33, "elapsed_time": "5:57:41", "remaining_time": "3:55:11"}
{"current_steps": 1535, "total_steps": 2536, "loss": 0.1532, "lr": 3.4123451296556845e-05, "epoch": 4.842271293375394, "percentage": 60.53, "elapsed_time": "5:58:47", "remaining_time": "3:53:58"}
{"current_steps": 1540, "total_steps": 2536, "loss": 0.1443, "lr": 3.382857574500957e-05, "epoch": 4.858044164037855, "percentage": 60.73, "elapsed_time": "6:00:01", "remaining_time": "3:52:50"}
{"current_steps": 1545, "total_steps": 2536, "loss": 0.1521, "lr": 3.3534327026850574e-05, "epoch": 4.873817034700315, "percentage": 60.92, "elapsed_time": "6:01:07", "remaining_time": "3:51:38"}
{"current_steps": 1550, "total_steps": 2536, "loss": 0.1333, "lr": 3.324071654768754e-05, "epoch": 4.889589905362776, "percentage": 61.12, "elapsed_time": "6:02:20", "remaining_time": "3:50:29"}
{"current_steps": 1555, "total_steps": 2536, "loss": 0.1449, "lr": 3.2947755688388874e-05, "epoch": 4.905362776025236, "percentage": 61.32, "elapsed_time": "6:03:26", "remaining_time": "3:49:16"}
{"current_steps": 1560, "total_steps": 2536, "loss": 0.1553, "lr": 3.26554558046426e-05, "epoch": 4.921135646687697, "percentage": 61.51, "elapsed_time": "6:04:35", "remaining_time": "3:48:06"}
{"current_steps": 1565, "total_steps": 2536, "loss": 0.149, "lr": 3.236382822651606e-05, "epoch": 4.936908517350158, "percentage": 61.71, "elapsed_time": "6:05:44", "remaining_time": "3:46:55"}
{"current_steps": 1570, "total_steps": 2536, "loss": 0.1398, "lr": 3.207288425801689e-05, "epoch": 4.952681388012619, "percentage": 61.91, "elapsed_time": "6:06:49", "remaining_time": "3:45:42"}
{"current_steps": 1575, "total_steps": 2536, "loss": 0.1568, "lr": 3.1782635176654764e-05, "epoch": 4.968454258675079, "percentage": 62.11, "elapsed_time": "6:07:56", "remaining_time": "3:44:29"}
{"current_steps": 1580, "total_steps": 2536, "loss": 0.1456, "lr": 3.149309223300428e-05, "epoch": 4.9842271293375395, "percentage": 62.3, "elapsed_time": "6:09:03", "remaining_time": "3:43:18"}
{"current_steps": 1585, "total_steps": 2536, "loss": 0.1485, "lr": 3.120426665026891e-05, "epoch": 5.0, "percentage": 62.5, "elapsed_time": "6:10:12", "remaining_time": "3:42:07"}
{"current_steps": 1590, "total_steps": 2536, "loss": 0.0813, "lr": 3.091616962384587e-05, "epoch": 5.0157728706624605, "percentage": 62.7, "elapsed_time": "6:11:20", "remaining_time": "3:40:56"}
{"current_steps": 1595, "total_steps": 2536, "loss": 0.0813, "lr": 3.06288123208923e-05, "epoch": 5.031545741324921, "percentage": 62.89, "elapsed_time": "6:12:26", "remaining_time": "3:39:43"}
{"current_steps": 1600, "total_steps": 2536, "loss": 0.0785, "lr": 3.034220587989226e-05, "epoch": 5.047318611987381, "percentage": 63.09, "elapsed_time": "6:13:38", "remaining_time": "3:38:34"}
{"current_steps": 1605, "total_steps": 2536, "loss": 0.0806, "lr": 3.005636141022512e-05, "epoch": 5.063091482649842, "percentage": 63.29, "elapsed_time": "6:19:37", "remaining_time": "3:40:12"}
{"current_steps": 1610, "total_steps": 2536, "loss": 0.0793, "lr": 2.977128999173482e-05, "epoch": 5.078864353312303, "percentage": 63.49, "elapsed_time": "6:20:49", "remaining_time": "3:39:01"}
{"current_steps": 1615, "total_steps": 2536, "loss": 0.0754, "lr": 2.948700267430049e-05, "epoch": 5.094637223974764, "percentage": 63.68, "elapsed_time": "6:21:54", "remaining_time": "3:37:47"}
{"current_steps": 1620, "total_steps": 2536, "loss": 0.0773, "lr": 2.920351047740808e-05, "epoch": 5.110410094637224, "percentage": 63.88, "elapsed_time": "6:23:02", "remaining_time": "3:36:35"}
{"current_steps": 1625, "total_steps": 2536, "loss": 0.0854, "lr": 2.892082438972325e-05, "epoch": 5.126182965299685, "percentage": 64.08, "elapsed_time": "6:24:10", "remaining_time": "3:35:22"}
{"current_steps": 1630, "total_steps": 2536, "loss": 0.0779, "lr": 2.863895536866541e-05, "epoch": 5.141955835962145, "percentage": 64.27, "elapsed_time": "6:25:21", "remaining_time": "3:34:11"}
{"current_steps": 1635, "total_steps": 2536, "loss": 0.084, "lr": 2.835791433998301e-05, "epoch": 5.157728706624606, "percentage": 64.47, "elapsed_time": "6:26:32", "remaining_time": "3:33:00"}
{"current_steps": 1640, "total_steps": 2536, "loss": 0.0783, "lr": 2.807771219733004e-05, "epoch": 5.173501577287066, "percentage": 64.67, "elapsed_time": "6:27:45", "remaining_time": "3:31:50"}
{"current_steps": 1645, "total_steps": 2536, "loss": 0.0764, "lr": 2.7798359801843766e-05, "epoch": 5.1892744479495265, "percentage": 64.87, "elapsed_time": "6:29:01", "remaining_time": "3:30:42"}
{"current_steps": 1650, "total_steps": 2536, "loss": 0.0811, "lr": 2.7519867981723712e-05, "epoch": 5.205047318611987, "percentage": 65.06, "elapsed_time": "6:30:08", "remaining_time": "3:29:29"}
{"current_steps": 1655, "total_steps": 2536, "loss": 0.081, "lr": 2.724224753181197e-05, "epoch": 5.220820189274448, "percentage": 65.26, "elapsed_time": "6:31:16", "remaining_time": "3:28:17"}
{"current_steps": 1660, "total_steps": 2536, "loss": 0.0758, "lr": 2.6965509213174777e-05, "epoch": 5.236593059936909, "percentage": 65.46, "elapsed_time": "6:32:24", "remaining_time": "3:27:04"}
{"current_steps": 1665, "total_steps": 2536, "loss": 0.0791, "lr": 2.6689663752685334e-05, "epoch": 5.252365930599369, "percentage": 65.65, "elapsed_time": "6:33:33", "remaining_time": "3:25:52"}
{"current_steps": 1670, "total_steps": 2536, "loss": 0.0814, "lr": 2.641472184260809e-05, "epoch": 5.26813880126183, "percentage": 65.85, "elapsed_time": "6:34:46", "remaining_time": "3:24:42"}
{"current_steps": 1675, "total_steps": 2536, "loss": 0.0758, "lr": 2.614069414018428e-05, "epoch": 5.28391167192429, "percentage": 66.05, "elapsed_time": "6:35:56", "remaining_time": "3:23:31"}
{"current_steps": 1680, "total_steps": 2536, "loss": 0.0818, "lr": 2.5867591267218805e-05, "epoch": 5.299684542586751, "percentage": 66.25, "elapsed_time": "6:37:06", "remaining_time": "3:22:19"}
{"current_steps": 1685, "total_steps": 2536, "loss": 0.076, "lr": 2.5595423809668452e-05, "epoch": 5.315457413249211, "percentage": 66.44, "elapsed_time": "6:38:10", "remaining_time": "3:21:05"}
{"current_steps": 1690, "total_steps": 2536, "loss": 0.078, "lr": 2.532420231723172e-05, "epoch": 5.331230283911672, "percentage": 66.64, "elapsed_time": "6:39:14", "remaining_time": "3:19:51"}
{"current_steps": 1695, "total_steps": 2536, "loss": 0.0831, "lr": 2.5053937302939767e-05, "epoch": 5.347003154574132, "percentage": 66.84, "elapsed_time": "6:40:22", "remaining_time": "3:18:39"}
{"current_steps": 1700, "total_steps": 2536, "loss": 0.0824, "lr": 2.4784639242748953e-05, "epoch": 5.3627760252365935, "percentage": 67.03, "elapsed_time": "6:41:33", "remaining_time": "3:17:28"}
{"current_steps": 1705, "total_steps": 2536, "loss": 0.0821, "lr": 2.451631857513472e-05, "epoch": 5.378548895899054, "percentage": 67.23, "elapsed_time": "6:46:25", "remaining_time": "3:18:05"}
{"current_steps": 1710, "total_steps": 2536, "loss": 0.0891, "lr": 2.4248985700687084e-05, "epoch": 5.394321766561514, "percentage": 67.43, "elapsed_time": "6:47:28", "remaining_time": "3:16:49"}
{"current_steps": 1715, "total_steps": 2536, "loss": 0.0751, "lr": 2.39826509817074e-05, "epoch": 5.410094637223975, "percentage": 67.63, "elapsed_time": "6:48:35", "remaining_time": "3:15:35"}
{"current_steps": 1720, "total_steps": 2536, "loss": 0.0775, "lr": 2.3717324741806718e-05, "epoch": 5.425867507886435, "percentage": 67.82, "elapsed_time": "6:49:38", "remaining_time": "3:14:20"}
{"current_steps": 1725, "total_steps": 2536, "loss": 0.0856, "lr": 2.3453017265505673e-05, "epoch": 5.441640378548896, "percentage": 68.02, "elapsed_time": "6:50:50", "remaining_time": "3:13:09"}
{"current_steps": 1730, "total_steps": 2536, "loss": 0.0788, "lr": 2.3189738797835708e-05, "epoch": 5.457413249211356, "percentage": 68.22, "elapsed_time": "6:51:59", "remaining_time": "3:11:56"}
{"current_steps": 1735, "total_steps": 2536, "loss": 0.0767, "lr": 2.292749954394216e-05, "epoch": 5.473186119873817, "percentage": 68.41, "elapsed_time": "6:53:02", "remaining_time": "3:10:41"}
{"current_steps": 1740, "total_steps": 2536, "loss": 0.0846, "lr": 2.266630966868852e-05, "epoch": 5.488958990536277, "percentage": 68.61, "elapsed_time": "6:54:05", "remaining_time": "3:09:26"}
{"current_steps": 1745, "total_steps": 2536, "loss": 0.0759, "lr": 2.2406179296262453e-05, "epoch": 5.504731861198739, "percentage": 68.81, "elapsed_time": "6:55:15", "remaining_time": "3:08:13"}
{"current_steps": 1750, "total_steps": 2536, "loss": 0.0821, "lr": 2.2147118509783445e-05, "epoch": 5.520504731861199, "percentage": 69.01, "elapsed_time": "6:56:18", "remaining_time": "3:06:58"}
{"current_steps": 1755, "total_steps": 2536, "loss": 0.0728, "lr": 2.1889137350911894e-05, "epoch": 5.5362776025236595, "percentage": 69.2, "elapsed_time": "6:57:27", "remaining_time": "3:05:46"}
{"current_steps": 1760, "total_steps": 2536, "loss": 0.0814, "lr": 2.1632245819459913e-05, "epoch": 5.55205047318612, "percentage": 69.4, "elapsed_time": "6:58:34", "remaining_time": "3:04:33"}
{"current_steps": 1765, "total_steps": 2536, "loss": 0.0729, "lr": 2.1376453873003664e-05, "epoch": 5.5678233438485805, "percentage": 69.6, "elapsed_time": "6:59:43", "remaining_time": "3:03:20"}
{"current_steps": 1770, "total_steps": 2536, "loss": 0.0825, "lr": 2.112177142649746e-05, "epoch": 5.583596214511041, "percentage": 69.79, "elapsed_time": "7:00:48", "remaining_time": "3:02:06"}
{"current_steps": 1775, "total_steps": 2536, "loss": 0.0767, "lr": 2.0868208351889402e-05, "epoch": 5.599369085173501, "percentage": 69.99, "elapsed_time": "7:01:57", "remaining_time": "3:00:54"}
{"current_steps": 1780, "total_steps": 2536, "loss": 0.0829, "lr": 2.0615774477738738e-05, "epoch": 5.615141955835962, "percentage": 70.19, "elapsed_time": "7:03:02", "remaining_time": "2:59:40"}
{"current_steps": 1785, "total_steps": 2536, "loss": 0.0808, "lr": 2.0364479588834835e-05, "epoch": 5.630914826498422, "percentage": 70.39, "elapsed_time": "7:04:07", "remaining_time": "2:58:26"}
{"current_steps": 1790, "total_steps": 2536, "loss": 0.0884, "lr": 2.0114333425817993e-05, "epoch": 5.646687697160884, "percentage": 70.58, "elapsed_time": "7:05:14", "remaining_time": "2:57:13"}
{"current_steps": 1795, "total_steps": 2536, "loss": 0.0787, "lr": 1.9865345684801846e-05, "epoch": 5.662460567823344, "percentage": 70.78, "elapsed_time": "7:06:19", "remaining_time": "2:55:59"}
|
||||
{"current_steps": 1800, "total_steps": 2536, "loss": 0.0819, "lr": 1.9617526016997486e-05, "epoch": 5.678233438485805, "percentage": 70.98, "elapsed_time": "7:07:28", "remaining_time": "2:54:47"}
|
||||
{"current_steps": 1805, "total_steps": 2536, "loss": 0.0753, "lr": 1.937088402833943e-05, "epoch": 5.694006309148265, "percentage": 71.18, "elapsed_time": "7:10:14", "remaining_time": "2:54:14"}
|
||||
{"current_steps": 1810, "total_steps": 2536, "loss": 0.0753, "lr": 1.9125429279113173e-05, "epoch": 5.709779179810726, "percentage": 71.37, "elapsed_time": "7:11:26", "remaining_time": "2:53:03"}
|
||||
{"current_steps": 1815, "total_steps": 2536, "loss": 0.0773, "lr": 1.8881171283584752e-05, "epoch": 5.725552050473186, "percentage": 71.57, "elapsed_time": "7:12:40", "remaining_time": "2:51:52"}
|
||||
{"current_steps": 1820, "total_steps": 2536, "loss": 0.0843, "lr": 1.8638119509631853e-05, "epoch": 5.7413249211356465, "percentage": 71.77, "elapsed_time": "7:13:41", "remaining_time": "2:50:36"}
|
||||
{"current_steps": 1825, "total_steps": 2536, "loss": 0.0798, "lr": 1.839628337837686e-05, "epoch": 5.757097791798107, "percentage": 71.96, "elapsed_time": "7:14:50", "remaining_time": "2:49:24"}
|
||||
{"current_steps": 1830, "total_steps": 2536, "loss": 0.0791, "lr": 1.8155672263821666e-05, "epoch": 5.7728706624605675, "percentage": 72.16, "elapsed_time": "7:15:58", "remaining_time": "2:48:11"}
|
||||
{"current_steps": 1835, "total_steps": 2536, "loss": 0.0823, "lr": 1.7916295492484315e-05, "epoch": 5.788643533123029, "percentage": 72.36, "elapsed_time": "7:17:04", "remaining_time": "2:46:58"}
|
||||
{"current_steps": 1840, "total_steps": 2536, "loss": 0.0889, "lr": 1.7678162343037524e-05, "epoch": 5.804416403785489, "percentage": 72.56, "elapsed_time": "7:18:12", "remaining_time": "2:45:45"}
|
||||
{"current_steps": 1845, "total_steps": 2536, "loss": 0.0728, "lr": 1.744128204594893e-05, "epoch": 5.82018927444795, "percentage": 72.75, "elapsed_time": "7:19:21", "remaining_time": "2:44:33"}
|
||||
{"current_steps": 1850, "total_steps": 2536, "loss": 0.083, "lr": 1.7205663783123436e-05, "epoch": 5.83596214511041, "percentage": 72.95, "elapsed_time": "7:20:32", "remaining_time": "2:43:21"}
|
||||
{"current_steps": 1855, "total_steps": 2536, "loss": 0.0759, "lr": 1.6971316687547213e-05, "epoch": 5.851735015772871, "percentage": 73.15, "elapsed_time": "7:21:40", "remaining_time": "2:42:08"}
|
||||
{"current_steps": 1860, "total_steps": 2536, "loss": 0.0804, "lr": 1.6738249842933697e-05, "epoch": 5.867507886435331, "percentage": 73.34, "elapsed_time": "7:22:41", "remaining_time": "2:40:53"}
|
||||
{"current_steps": 1865, "total_steps": 2536, "loss": 0.0753, "lr": 1.6506472283371527e-05, "epoch": 5.883280757097792, "percentage": 73.54, "elapsed_time": "7:23:48", "remaining_time": "2:39:40"}
|
||||
{"current_steps": 1870, "total_steps": 2536, "loss": 0.0734, "lr": 1.6275992992974308e-05, "epoch": 5.899053627760252, "percentage": 73.74, "elapsed_time": "7:24:56", "remaining_time": "2:38:28"}
|
||||
{"current_steps": 1875, "total_steps": 2536, "loss": 0.077, "lr": 1.604682090553243e-05, "epoch": 5.914826498422713, "percentage": 73.94, "elapsed_time": "7:26:06", "remaining_time": "2:37:15"}
|
||||
{"current_steps": 1880, "total_steps": 2536, "loss": 0.0795, "lr": 1.5818964904166756e-05, "epoch": 5.930599369085174, "percentage": 74.13, "elapsed_time": "7:27:15", "remaining_time": "2:36:03"}
|
||||
{"current_steps": 1885, "total_steps": 2536, "loss": 0.0806, "lr": 1.55924338209843e-05, "epoch": 5.946372239747634, "percentage": 74.33, "elapsed_time": "7:28:21", "remaining_time": "2:34:50"}
|
||||
{"current_steps": 1890, "total_steps": 2536, "loss": 0.0822, "lr": 1.536723643673582e-05, "epoch": 5.962145110410095, "percentage": 74.53, "elapsed_time": "7:29:33", "remaining_time": "2:33:39"}
|
||||
{"current_steps": 1895, "total_steps": 2536, "loss": 0.0835, "lr": 1.5143381480475583e-05, "epoch": 5.977917981072555, "percentage": 74.72, "elapsed_time": "7:30:44", "remaining_time": "2:32:27"}
|
||||
{"current_steps": 1900, "total_steps": 2536, "loss": 0.0805, "lr": 1.49208776292229e-05, "epoch": 5.993690851735016, "percentage": 74.92, "elapsed_time": "7:31:53", "remaining_time": "2:31:15"}
|
||||
{"current_steps": 1905, "total_steps": 2536, "loss": 0.0539, "lr": 1.4699733507625862e-05, "epoch": 6.009463722397476, "percentage": 75.12, "elapsed_time": "7:37:50", "remaining_time": "2:31:39"}
|
||||
{"current_steps": 1910, "total_steps": 2536, "loss": 0.0445, "lr": 1.4479957687626933e-05, "epoch": 6.025236593059937, "percentage": 75.32, "elapsed_time": "7:38:57", "remaining_time": "2:30:25"}
|
||||
{"current_steps": 1915, "total_steps": 2536, "loss": 0.04, "lr": 1.4261558688130838e-05, "epoch": 6.041009463722397, "percentage": 75.51, "elapsed_time": "7:40:11", "remaining_time": "2:29:14"}
|
||||
{"current_steps": 1920, "total_steps": 2536, "loss": 0.0408, "lr": 1.4044544974674246e-05, "epoch": 6.056782334384858, "percentage": 75.71, "elapsed_time": "7:41:19", "remaining_time": "2:28:00"}
|
||||
{"current_steps": 1925, "total_steps": 2536, "loss": 0.0375, "lr": 1.3828924959097612e-05, "epoch": 6.072555205047319, "percentage": 75.91, "elapsed_time": "7:42:25", "remaining_time": "2:26:46"}
|
||||
{"current_steps": 1930, "total_steps": 2536, "loss": 0.0437, "lr": 1.3614706999219213e-05, "epoch": 6.0883280757097795, "percentage": 76.1, "elapsed_time": "7:43:22", "remaining_time": "2:25:29"}
|
||||
{"current_steps": 1935, "total_steps": 2536, "loss": 0.0382, "lr": 1.340189939851112e-05, "epoch": 6.10410094637224, "percentage": 76.3, "elapsed_time": "7:44:32", "remaining_time": "2:24:17"}
|
||||
{"current_steps": 1940, "total_steps": 2536, "loss": 0.0403, "lr": 1.3190510405777345e-05, "epoch": 6.1198738170347005, "percentage": 76.5, "elapsed_time": "7:45:37", "remaining_time": "2:23:02"}
|
||||
{"current_steps": 1945, "total_steps": 2536, "loss": 0.0411, "lr": 1.2980548214834142e-05, "epoch": 6.135646687697161, "percentage": 76.7, "elapsed_time": "7:46:48", "remaining_time": "2:21:50"}
|
||||
{"current_steps": 1950, "total_steps": 2536, "loss": 0.0381, "lr": 1.2772020964192316e-05, "epoch": 6.151419558359621, "percentage": 76.89, "elapsed_time": "7:47:55", "remaining_time": "2:20:37"}
|
||||
{"current_steps": 1955, "total_steps": 2536, "loss": 0.0426, "lr": 1.2564936736741867e-05, "epoch": 6.167192429022082, "percentage": 77.09, "elapsed_time": "7:49:01", "remaining_time": "2:19:23"}
|
||||
{"current_steps": 1960, "total_steps": 2536, "loss": 0.0375, "lr": 1.23593035594386e-05, "epoch": 6.182965299684542, "percentage": 77.29, "elapsed_time": "7:50:10", "remaining_time": "2:18:10"}
|
||||
{"current_steps": 1965, "total_steps": 2536, "loss": 0.0379, "lr": 1.215512940299305e-05, "epoch": 6.198738170347003, "percentage": 77.48, "elapsed_time": "7:51:17", "remaining_time": "2:16:57"}
|
||||
{"current_steps": 1970, "total_steps": 2536, "loss": 0.0399, "lr": 1.1952422181561424e-05, "epoch": 6.214511041009464, "percentage": 77.68, "elapsed_time": "7:52:23", "remaining_time": "2:15:43"}
|
||||
{"current_steps": 1975, "total_steps": 2536, "loss": 0.0427, "lr": 1.1751189752438957e-05, "epoch": 6.230283911671925, "percentage": 77.88, "elapsed_time": "7:53:27", "remaining_time": "2:14:29"}
|
||||
{"current_steps": 1980, "total_steps": 2536, "loss": 0.0389, "lr": 1.1551439915755274e-05, "epoch": 6.246056782334385, "percentage": 78.08, "elapsed_time": "7:54:35", "remaining_time": "2:13:16"}
|
||||
{"current_steps": 1985, "total_steps": 2536, "loss": 0.0428, "lr": 1.135318041417207e-05, "epoch": 6.261829652996846, "percentage": 78.27, "elapsed_time": "7:55:42", "remaining_time": "2:12:02"}
|
||||
{"current_steps": 1990, "total_steps": 2536, "loss": 0.0388, "lr": 1.1156418932582941e-05, "epoch": 6.277602523659306, "percentage": 78.47, "elapsed_time": "7:56:50", "remaining_time": "2:10:49"}
|
||||
{"current_steps": 1995, "total_steps": 2536, "loss": 0.0391, "lr": 1.096116309781558e-05, "epoch": 6.2933753943217665, "percentage": 78.67, "elapsed_time": "7:58:00", "remaining_time": "2:09:37"}
|
||||
{"current_steps": 2000, "total_steps": 2536, "loss": 0.0414, "lr": 1.0767420478336093e-05, "epoch": 6.309148264984227, "percentage": 78.86, "elapsed_time": "7:59:11", "remaining_time": "2:08:25"}
|
||||
{"current_steps": 2005, "total_steps": 2536, "loss": 0.042, "lr": 1.0575198583955698e-05, "epoch": 6.3249211356466875, "percentage": 79.06, "elapsed_time": "8:05:56", "remaining_time": "2:08:41"}
|
||||
{"current_steps": 2010, "total_steps": 2536, "loss": 0.0418, "lr": 1.0384504865539497e-05, "epoch": 6.340694006309148, "percentage": 79.26, "elapsed_time": "8:07:08", "remaining_time": "2:07:28"}
|
||||
{"current_steps": 2015, "total_steps": 2536, "loss": 0.0374, "lr": 1.0195346714717813e-05, "epoch": 6.356466876971609, "percentage": 79.46, "elapsed_time": "8:08:17", "remaining_time": "2:06:15"}
|
||||
{"current_steps": 2020, "total_steps": 2536, "loss": 0.0417, "lr": 1.0007731463599601e-05, "epoch": 6.37223974763407, "percentage": 79.65, "elapsed_time": "8:09:17", "remaining_time": "2:04:59"}
|
||||
{"current_steps": 2025, "total_steps": 2536, "loss": 0.0404, "lr": 9.82166638448827e-06, "epoch": 6.38801261829653, "percentage": 79.85, "elapsed_time": "8:10:22", "remaining_time": "2:03:44"}
|
||||
{"current_steps": 2030, "total_steps": 2536, "loss": 0.042, "lr": 9.637158689599746e-06, "epoch": 6.403785488958991, "percentage": 80.05, "elapsed_time": "8:11:32", "remaining_time": "2:02:31"}
|
||||
{"current_steps": 2035, "total_steps": 2536, "loss": 0.0422, "lr": 9.454215530782994e-06, "epoch": 6.419558359621451, "percentage": 80.24, "elapsed_time": "8:12:43", "remaining_time": "2:01:18"}
|
||||
{"current_steps": 2040, "total_steps": 2536, "loss": 0.041, "lr": 9.272843999242736e-06, "epoch": 6.435331230283912, "percentage": 80.44, "elapsed_time": "8:13:51", "remaining_time": "2:00:04"}
|
||||
{"current_steps": 2045, "total_steps": 2536, "loss": 0.041, "lr": 9.093051125264623e-06, "epoch": 6.451104100946372, "percentage": 80.64, "elapsed_time": "8:14:57", "remaining_time": "1:58:50"}
|
||||
{"current_steps": 2050, "total_steps": 2536, "loss": 0.0395, "lr": 8.91484387794267e-06, "epoch": 6.466876971608833, "percentage": 80.84, "elapsed_time": "8:16:08", "remaining_time": "1:57:37"}
|
||||
{"current_steps": 2055, "total_steps": 2536, "loss": 0.0421, "lr": 8.73822916490919e-06, "epoch": 6.482649842271293, "percentage": 81.03, "elapsed_time": "8:17:20", "remaining_time": "1:56:24"}
|
||||
{"current_steps": 2060, "total_steps": 2536, "loss": 0.0394, "lr": 8.563213832067014e-06, "epoch": 6.498422712933754, "percentage": 81.23, "elapsed_time": "8:18:30", "remaining_time": "1:55:11"}
|
||||
{"current_steps": 2065, "total_steps": 2536, "loss": 0.0402, "lr": 8.389804663324142e-06, "epoch": 6.514195583596215, "percentage": 81.43, "elapsed_time": "8:19:37", "remaining_time": "1:53:57"}
|
||||
{"current_steps": 2070, "total_steps": 2536, "loss": 0.0421, "lr": 8.218008380330723e-06, "epoch": 6.529968454258675, "percentage": 81.62, "elapsed_time": "8:20:50", "remaining_time": "1:52:44"}
|
||||
{"current_steps": 2075, "total_steps": 2536, "loss": 0.0423, "lr": 8.047831642218611e-06, "epoch": 6.545741324921136, "percentage": 81.82, "elapsed_time": "8:21:59", "remaining_time": "1:51:31"}
|
||||
{"current_steps": 2080, "total_steps": 2536, "loss": 0.0398, "lr": 7.879281045343184e-06, "epoch": 6.561514195583596, "percentage": 82.02, "elapsed_time": "8:23:07", "remaining_time": "1:50:17"}
|
||||
{"current_steps": 2085, "total_steps": 2536, "loss": 0.0399, "lr": 7.712363123027678e-06, "epoch": 6.577287066246057, "percentage": 82.22, "elapsed_time": "8:24:12", "remaining_time": "1:49:03"}
|
||||
{"current_steps": 2090, "total_steps": 2536, "loss": 0.0395, "lr": 7.547084345309924e-06, "epoch": 6.593059936908517, "percentage": 82.41, "elapsed_time": "8:25:17", "remaining_time": "1:47:49"}
|
||||
{"current_steps": 2095, "total_steps": 2536, "loss": 0.0381, "lr": 7.383451118691576e-06, "epoch": 6.608832807570978, "percentage": 82.61, "elapsed_time": "8:26:22", "remaining_time": "1:46:35"}
|
||||
{"current_steps": 2100, "total_steps": 2536, "loss": 0.0399, "lr": 7.221469785889784e-06, "epoch": 6.624605678233438, "percentage": 82.81, "elapsed_time": "8:27:32", "remaining_time": "1:45:22"}
|
||||
{"current_steps": 2105, "total_steps": 2536, "loss": 0.0397, "lr": 7.061146625591331e-06, "epoch": 6.6403785488958995, "percentage": 83.0, "elapsed_time": "8:34:18", "remaining_time": "1:45:18"}
|
||||
{"current_steps": 2110, "total_steps": 2536, "loss": 0.0372, "lr": 6.902487852209238e-06, "epoch": 6.65615141955836, "percentage": 83.2, "elapsed_time": "8:35:22", "remaining_time": "1:44:03"}
|
||||
{"current_steps": 2115, "total_steps": 2536, "loss": 0.0364, "lr": 6.7454996156419485e-06, "epoch": 6.6719242902208205, "percentage": 83.4, "elapsed_time": "8:36:31", "remaining_time": "1:42:48"}
|
||||
{"current_steps": 2120, "total_steps": 2536, "loss": 0.0377, "lr": 6.590188001034864e-06, "epoch": 6.687697160883281, "percentage": 83.6, "elapsed_time": "8:37:39", "remaining_time": "1:41:34"}
|
||||
{"current_steps": 2125, "total_steps": 2536, "loss": 0.0395, "lr": 6.436559028544559e-06, "epoch": 6.703470031545741, "percentage": 83.79, "elapsed_time": "8:38:47", "remaining_time": "1:40:20"}
|
||||
{"current_steps": 2130, "total_steps": 2536, "loss": 0.0391, "lr": 6.284618653105328e-06, "epoch": 6.719242902208202, "percentage": 83.99, "elapsed_time": "8:40:02", "remaining_time": "1:39:07"}
|
||||
{"current_steps": 2135, "total_steps": 2536, "loss": 0.0394, "lr": 6.134372764198465e-06, "epoch": 6.735015772870662, "percentage": 84.19, "elapsed_time": "8:41:10", "remaining_time": "1:37:53"}
|
||||
{"current_steps": 2140, "total_steps": 2536, "loss": 0.0363, "lr": 5.985827185623899e-06, "epoch": 6.750788643533123, "percentage": 84.38, "elapsed_time": "8:42:19", "remaining_time": "1:36:39"}
|
||||
{"current_steps": 2145, "total_steps": 2536, "loss": 0.0402, "lr": 5.8389876752745045e-06, "epoch": 6.766561514195583, "percentage": 84.58, "elapsed_time": "8:43:25", "remaining_time": "1:35:24"}
|
||||
{"current_steps": 2150, "total_steps": 2536, "loss": 0.0384, "lr": 5.693859924912892e-06, "epoch": 6.782334384858045, "percentage": 84.78, "elapsed_time": "8:44:33", "remaining_time": "1:34:10"}
|
||||
{"current_steps": 2155, "total_steps": 2536, "loss": 0.0373, "lr": 5.550449559950755e-06, "epoch": 6.798107255520505, "percentage": 84.98, "elapsed_time": "8:45:39", "remaining_time": "1:32:56"}
|
||||
{"current_steps": 2160, "total_steps": 2536, "loss": 0.0389, "lr": 5.408762139230888e-06, "epoch": 6.813880126182966, "percentage": 85.17, "elapsed_time": "8:46:48", "remaining_time": "1:31:42"}
|
||||
{"current_steps": 2165, "total_steps": 2536, "loss": 0.0359, "lr": 5.268803154811669e-06, "epoch": 6.829652996845426, "percentage": 85.37, "elapsed_time": "8:47:57", "remaining_time": "1:30:28"}
|
||||
{"current_steps": 2170, "total_steps": 2536, "loss": 0.0414, "lr": 5.1305780317541855e-06, "epoch": 6.8454258675078865, "percentage": 85.57, "elapsed_time": "8:49:04", "remaining_time": "1:29:14"}
|
||||
{"current_steps": 2175, "total_steps": 2536, "loss": 0.0393, "lr": 4.99409212791192e-06, "epoch": 6.861198738170347, "percentage": 85.76, "elapsed_time": "8:50:11", "remaining_time": "1:28:00"}
|
||||
{"current_steps": 2180, "total_steps": 2536, "loss": 0.0419, "lr": 4.8593507337231666e-06, "epoch": 6.8769716088328074, "percentage": 85.96, "elapsed_time": "8:51:17", "remaining_time": "1:26:45"}
|
||||
{"current_steps": 2185, "total_steps": 2536, "loss": 0.0429, "lr": 4.726359072005859e-06, "epoch": 6.892744479495268, "percentage": 86.16, "elapsed_time": "8:52:28", "remaining_time": "1:25:32"}
|
||||
{"current_steps": 2190, "total_steps": 2536, "loss": 0.0414, "lr": 4.5951222977551444e-06, "epoch": 6.908517350157728, "percentage": 86.36, "elapsed_time": "8:53:33", "remaining_time": "1:24:17"}
|
||||
{"current_steps": 2195, "total_steps": 2536, "loss": 0.0359, "lr": 4.465645497943621e-06, "epoch": 6.92429022082019, "percentage": 86.55, "elapsed_time": "8:54:43", "remaining_time": "1:23:04"}
|
||||
{"current_steps": 2200, "total_steps": 2536, "loss": 0.0361, "lr": 4.337933691324109e-06, "epoch": 6.94006309148265, "percentage": 86.75, "elapsed_time": "8:55:54", "remaining_time": "1:21:50"}
|
||||
{"current_steps": 2205, "total_steps": 2536, "loss": 0.0359, "lr": 4.21199182823514e-06, "epoch": 6.955835962145111, "percentage": 86.95, "elapsed_time": "9:01:00", "remaining_time": "1:21:12"}
|
||||
{"current_steps": 2210, "total_steps": 2536, "loss": 0.0413, "lr": 4.08782479040905e-06, "epoch": 6.971608832807571, "percentage": 87.15, "elapsed_time": "9:02:05", "remaining_time": "1:19:57"}
|
||||
{"current_steps": 2215, "total_steps": 2536, "loss": 0.0391, "lr": 3.9654373907827665e-06, "epoch": 6.987381703470032, "percentage": 87.34, "elapsed_time": "9:03:12", "remaining_time": "1:18:43"}
|
||||
{"current_steps": 2220, "total_steps": 2536, "loss": 0.0368, "lr": 3.844834373311257e-06, "epoch": 7.003154574132492, "percentage": 87.54, "elapsed_time": "9:04:25", "remaining_time": "1:17:29"}
|
||||
{"current_steps": 2225, "total_steps": 2536, "loss": 0.0241, "lr": 3.7260204127836316e-06, "epoch": 7.018927444794953, "percentage": 87.74, "elapsed_time": "9:05:34", "remaining_time": "1:16:15"}
|
||||
{"current_steps": 2230, "total_steps": 2536, "loss": 0.0222, "lr": 3.609000114641964e-06, "epoch": 7.034700315457413, "percentage": 87.93, "elapsed_time": "9:06:43", "remaining_time": "1:15:01"}
|
||||
{"current_steps": 2235, "total_steps": 2536, "loss": 0.021, "lr": 3.4937780148027344e-06, "epoch": 7.0504731861198735, "percentage": 88.13, "elapsed_time": "9:07:52", "remaining_time": "1:13:47"}
|
||||
{"current_steps": 2240, "total_steps": 2536, "loss": 0.0219, "lr": 3.3803585794810466e-06, "epoch": 7.066246056782334, "percentage": 88.33, "elapsed_time": "9:08:59", "remaining_time": "1:12:32"}
|
||||
{"current_steps": 2245, "total_steps": 2536, "loss": 0.0233, "lr": 3.2687462050175034e-06, "epoch": 7.082018927444795, "percentage": 88.53, "elapsed_time": "9:10:08", "remaining_time": "1:11:18"}
|
||||
{"current_steps": 2250, "total_steps": 2536, "loss": 0.0239, "lr": 3.1589452177077815e-06, "epoch": 7.097791798107256, "percentage": 88.72, "elapsed_time": "9:11:19", "remaining_time": "1:10:04"}
|
||||
{"current_steps": 2255, "total_steps": 2536, "loss": 0.0232, "lr": 3.0509598736349343e-06, "epoch": 7.113564668769716, "percentage": 88.92, "elapsed_time": "9:12:29", "remaining_time": "1:08:50"}
|
||||
{"current_steps": 2260, "total_steps": 2536, "loss": 0.0209, "lr": 2.9447943585044545e-06, "epoch": 7.129337539432177, "percentage": 89.12, "elapsed_time": "9:13:34", "remaining_time": "1:07:36"}
|
||||
{"current_steps": 2265, "total_steps": 2536, "loss": 0.0229, "lr": 2.840452787481979e-06, "epoch": 7.145110410094637, "percentage": 89.31, "elapsed_time": "9:14:43", "remaining_time": "1:06:22"}
|
||||
{"current_steps": 2270, "total_steps": 2536, "loss": 0.0226, "lr": 2.7379392050338236e-06, "epoch": 7.160883280757098, "percentage": 89.51, "elapsed_time": "9:15:50", "remaining_time": "1:05:08"}
|
||||
{"current_steps": 2275, "total_steps": 2536, "loss": 0.0226, "lr": 2.63725758477017e-06, "epoch": 7.176656151419558, "percentage": 89.71, "elapsed_time": "9:17:00", "remaining_time": "1:03:54"}
|
||||
{"current_steps": 2280, "total_steps": 2536, "loss": 0.0226, "lr": 2.5384118292910818e-06, "epoch": 7.192429022082019, "percentage": 89.91, "elapsed_time": "9:18:08", "remaining_time": "1:02:40"}
|
||||
{"current_steps": 2285, "total_steps": 2536, "loss": 0.0212, "lr": 2.4414057700351934e-06, "epoch": 7.208201892744479, "percentage": 90.1, "elapsed_time": "9:19:16", "remaining_time": "1:01:26"}
|
||||
{"current_steps": 2290, "total_steps": 2536, "loss": 0.0229, "lr": 2.34624316713124e-06, "epoch": 7.2239747634069404, "percentage": 90.3, "elapsed_time": "9:20:22", "remaining_time": "1:00:11"}
|
||||
{"current_steps": 2295, "total_steps": 2536, "loss": 0.0226, "lr": 2.2529277092522503e-06, "epoch": 7.239747634069401, "percentage": 90.5, "elapsed_time": "9:21:32", "remaining_time": "0:58:58"}
|
||||
{"current_steps": 2300, "total_steps": 2536, "loss": 0.0197, "lr": 2.1614630134726367e-06, "epoch": 7.255520504731861, "percentage": 90.69, "elapsed_time": "9:22:42", "remaining_time": "0:57:44"}
|
||||
{"current_steps": 2305, "total_steps": 2536, "loss": 0.0214, "lr": 2.0718526251279346e-06, "epoch": 7.271293375394322, "percentage": 90.89, "elapsed_time": "9:27:43", "remaining_time": "0:56:53"}
|
||||
{"current_steps": 2310, "total_steps": 2536, "loss": 0.0214, "lr": 1.9841000176774148e-06, "epoch": 7.287066246056782, "percentage": 91.09, "elapsed_time": "9:28:52", "remaining_time": "0:55:39"}
|
||||
{"current_steps": 2315, "total_steps": 2536, "loss": 0.021, "lr": 1.898208592569406e-06, "epoch": 7.302839116719243, "percentage": 91.29, "elapsed_time": "9:30:00", "remaining_time": "0:54:24"}
|
||||
{"current_steps": 2320, "total_steps": 2536, "loss": 0.0236, "lr": 1.8141816791095e-06, "epoch": 7.318611987381703, "percentage": 91.48, "elapsed_time": "9:31:08", "remaining_time": "0:53:10"}
|
||||
{"current_steps": 2325, "total_steps": 2536, "loss": 0.0211, "lr": 1.7320225343314566e-06, "epoch": 7.334384858044164, "percentage": 91.68, "elapsed_time": "9:32:22", "remaining_time": "0:51:56"}
|
||||
{"current_steps": 2330, "total_steps": 2536, "loss": 0.0238, "lr": 1.6517343428709975e-06, "epoch": 7.350157728706624, "percentage": 91.88, "elapsed_time": "9:33:29", "remaining_time": "0:50:42"}
|
||||
{"current_steps": 2335, "total_steps": 2536, "loss": 0.0218, "lr": 1.5733202168423055e-06, "epoch": 7.365930599369086, "percentage": 92.07, "elapsed_time": "9:34:39", "remaining_time": "0:49:28"}
|
||||
{"current_steps": 2340, "total_steps": 2536, "loss": 0.0215, "lr": 1.4967831957174606e-06, "epoch": 7.381703470031546, "percentage": 92.27, "elapsed_time": "9:35:41", "remaining_time": "0:48:13"}
|
||||
{"current_steps": 2345, "total_steps": 2536, "loss": 0.0251, "lr": 1.4221262462085715e-06, "epoch": 7.3974763406940065, "percentage": 92.47, "elapsed_time": "9:36:51", "remaining_time": "0:46:59"}
|
||||
{"current_steps": 2350, "total_steps": 2536, "loss": 0.0244, "lr": 1.3493522621528088e-06, "epoch": 7.413249211356467, "percentage": 92.67, "elapsed_time": "9:38:04", "remaining_time": "0:45:45"}
|
||||
{"current_steps": 2355, "total_steps": 2536, "loss": 0.0214, "lr": 1.2784640644002366e-06, "epoch": 7.429022082018927, "percentage": 92.86, "elapsed_time": "9:39:17", "remaining_time": "0:44:31"}
|
||||
{"current_steps": 2360, "total_steps": 2536, "loss": 0.0206, "lr": 1.209464400704452e-06, "epoch": 7.444794952681388, "percentage": 93.06, "elapsed_time": "9:40:25", "remaining_time": "0:43:17"}
|
||||
{"current_steps": 2365, "total_steps": 2536, "loss": 0.0223, "lr": 1.1423559456160803e-06, "epoch": 7.460567823343848, "percentage": 93.26, "elapsed_time": "9:41:33", "remaining_time": "0:42:02"}
|
||||
{"current_steps": 2370, "total_steps": 2536, "loss": 0.0202, "lr": 1.0771413003791253e-06, "epoch": 7.476340694006309, "percentage": 93.45, "elapsed_time": "9:42:36", "remaining_time": "0:40:48"}
|
||||
{"current_steps": 2375, "total_steps": 2536, "loss": 0.0212, "lr": 1.0138229928301212e-06, "epoch": 7.492113564668769, "percentage": 93.65, "elapsed_time": "9:43:42", "remaining_time": "0:39:34"}
|
||||
{"current_steps": 2380, "total_steps": 2536, "loss": 0.0193, "lr": 9.524034773001511e-07, "epoch": 7.50788643533123, "percentage": 93.85, "elapsed_time": "9:44:54", "remaining_time": "0:38:20"}
|
||||
{"current_steps": 2385, "total_steps": 2536, "loss": 0.0229, "lr": 8.928851345197165e-07, "epoch": 7.523659305993691, "percentage": 94.05, "elapsed_time": "9:46:05", "remaining_time": "0:37:06"}
|
||||
{"current_steps": 2390, "total_steps": 2536, "loss": 0.022, "lr": 8.352702715264726e-07, "epoch": 7.539432176656152, "percentage": 94.24, "elapsed_time": "9:47:11", "remaining_time": "0:35:52"}
|
||||
{"current_steps": 2395, "total_steps": 2536, "loss": 0.0229, "lr": 7.795611215757615e-07, "epoch": 7.555205047318612, "percentage": 94.44, "elapsed_time": "9:48:15", "remaining_time": "0:34:37"}
|
||||
{"current_steps": 2400, "total_steps": 2536, "loss": 0.025, "lr": 7.257598440540802e-07, "epoch": 7.570977917981073, "percentage": 94.64, "elapsed_time": "9:49:18", "remaining_time": "0:33:23"}
|
||||
{"current_steps": 2405, "total_steps": 2536, "loss": 0.0245, "lr": 6.738685243953769e-07, "epoch": 7.586750788643533, "percentage": 94.83, "elapsed_time": "9:53:21", "remaining_time": "0:32:19"}
|
||||
{"current_steps": 2410, "total_steps": 2536, "loss": 0.0213, "lr": 6.238891740002195e-07, "epoch": 7.6025236593059935, "percentage": 95.03, "elapsed_time": "9:54:32", "remaining_time": "0:31:05"}
|
||||
{"current_steps": 2415, "total_steps": 2536, "loss": 0.0218, "lr": 5.758237301577874e-07, "epoch": 7.618296529968454, "percentage": 95.23, "elapsed_time": "9:55:45", "remaining_time": "0:29:50"}
|
||||
{"current_steps": 2420, "total_steps": 2536, "loss": 0.0224, "lr": 5.296740559708413e-07, "epoch": 7.634069400630915, "percentage": 95.43, "elapsed_time": "9:56:55", "remaining_time": "0:28:36"}
|
||||
{"current_steps": 2425, "total_steps": 2536, "loss": 0.0231, "lr": 4.854419402834709e-07, "epoch": 7.649842271293375, "percentage": 95.62, "elapsed_time": "9:58:01", "remaining_time": "0:27:22"}
|
||||
{"current_steps": 2430, "total_steps": 2536, "loss": 0.0224, "lr": 4.431290976117497e-07, "epoch": 7.665615141955836, "percentage": 95.82, "elapsed_time": "9:59:01", "remaining_time": "0:26:07"}
|
||||
{"current_steps": 2435, "total_steps": 2536, "loss": 0.0223, "lr": 4.0273716807731067e-07, "epoch": 7.681388012618297, "percentage": 96.02, "elapsed_time": "10:00:11", "remaining_time": "0:24:53"}
|
||||
{"current_steps": 2440, "total_steps": 2536, "loss": 0.023, "lr": 3.642677173437137e-07, "epoch": 7.697160883280757, "percentage": 96.21, "elapsed_time": "10:01:13", "remaining_time": "0:23:39"}
|
||||
{"current_steps": 2445, "total_steps": 2536, "loss": 0.0221, "lr": 3.2772223655583857e-07, "epoch": 7.712933753943218, "percentage": 96.41, "elapsed_time": "10:02:22", "remaining_time": "0:22:25"}
|
||||
{"current_steps": 2450, "total_steps": 2536, "loss": 0.0235, "lr": 2.9310214228202013e-07, "epoch": 7.728706624605678, "percentage": 96.61, "elapsed_time": "10:03:32", "remaining_time": "0:21:11"}
|
||||
{"current_steps": 2455, "total_steps": 2536, "loss": 0.0219, "lr": 2.604087764591534e-07, "epoch": 7.744479495268139, "percentage": 96.81, "elapsed_time": "10:04:39", "remaining_time": "0:19:57"}
|
||||
{"current_steps": 2460, "total_steps": 2536, "loss": 0.0232, "lr": 2.2964340634069603e-07, "epoch": 7.760252365930599, "percentage": 97.0, "elapsed_time": "10:05:46", "remaining_time": "0:18:42"}
|
||||
{"current_steps": 2465, "total_steps": 2536, "loss": 0.0234, "lr": 2.0080722444754118e-07, "epoch": 7.7760252365930596, "percentage": 97.2, "elapsed_time": "10:06:50", "remaining_time": "0:17:28"}
|
||||
{"current_steps": 2470, "total_steps": 2536, "loss": 0.0258, "lr": 1.7390134852177664e-07, "epoch": 7.79179810725552, "percentage": 97.4, "elapsed_time": "10:07:58", "remaining_time": "0:16:14"}
|
||||
{"current_steps": 2475, "total_steps": 2536, "loss": 0.0249, "lr": 1.48926821483375e-07, "epoch": 7.807570977917981, "percentage": 97.59, "elapsed_time": "10:09:08", "remaining_time": "0:15:00"}
|
||||
{"current_steps": 2480, "total_steps": 2536, "loss": 0.0205, "lr": 1.2588461138977604e-07, "epoch": 7.823343848580442, "percentage": 97.79, "elapsed_time": "10:10:12", "remaining_time": "0:13:46"}
|
||||
{"current_steps": 2485, "total_steps": 2536, "loss": 0.024, "lr": 1.0477561139832781e-07, "epoch": 7.839116719242902, "percentage": 97.99, "elapsed_time": "10:11:19", "remaining_time": "0:12:32"}
|
||||
{"current_steps": 2490, "total_steps": 2536, "loss": 0.0246, "lr": 8.560063973171439e-08, "epoch": 7.854889589905363, "percentage": 98.19, "elapsed_time": "10:12:27", "remaining_time": "0:11:18"}
|
||||
{"current_steps": 2495, "total_steps": 2536, "loss": 0.0213, "lr": 6.836043964620342e-08, "epoch": 7.870662460567823, "percentage": 98.38, "elapsed_time": "10:13:33", "remaining_time": "0:10:04"}
|
||||
{"current_steps": 2500, "total_steps": 2536, "loss": 0.0222, "lr": 5.3055679402846946e-08, "epoch": 7.886435331230284, "percentage": 98.58, "elapsed_time": "10:14:46", "remaining_time": "0:08:51"}
|
||||
{"current_steps": 2505, "total_steps": 2536, "loss": 0.0221, "lr": 3.968695224158547e-08, "epoch": 7.902208201892744, "percentage": 98.78, "elapsed_time": "10:20:29", "remaining_time": "0:07:40"}
|
||||
{"current_steps": 2510, "total_steps": 2536, "loss": 0.0232, "lr": 2.8254776358238588e-08, "epoch": 7.917981072555205, "percentage": 98.97, "elapsed_time": "10:21:35", "remaining_time": "0:06:26"}
|
||||
{"current_steps": 2515, "total_steps": 2536, "loss": 0.0214, "lr": 1.8759594884443233e-08, "epoch": 7.933753943217665, "percentage": 99.17, "elapsed_time": "10:22:37", "remaining_time": "0:05:11"}
|
||||
{"current_steps": 2520, "total_steps": 2536, "loss": 0.0208, "lr": 1.1201775870445242e-08, "epoch": 7.9495268138801265, "percentage": 99.37, "elapsed_time": "10:23:44", "remaining_time": "0:03:57"}
|
||||
{"current_steps": 2525, "total_steps": 2536, "loss": 0.0223, "lr": 5.581612270855186e-09, "epoch": 7.965299684542587, "percentage": 99.57, "elapsed_time": "10:24:52", "remaining_time": "0:02:43"}
|
||||
{"current_steps": 2530, "total_steps": 2536, "loss": 0.0214, "lr": 1.8993219332907877e-09, "epoch": 7.981072555205047, "percentage": 99.76, "elapsed_time": "10:26:07", "remaining_time": "0:01:29"}
|
||||
{"current_steps": 2535, "total_steps": 2536, "loss": 0.0225, "lr": 1.5504758992257451e-10, "epoch": 7.996845425867508, "percentage": 99.96, "elapsed_time": "10:27:10", "remaining_time": "0:00:14"}
|
||||
{"current_steps": 2536, "total_steps": 2536, "epoch": 8.0, "percentage": 100.0, "elapsed_time": "10:29:44", "remaining_time": "0:00:00"}
|
||||
5624
trainer_state.json
Normal file
File diff suppressed because it is too large
3
training_args.bin
Normal file
@@ -0,0 +1,3 @@
|
||||
version https://git-lfs.github.com/spec/v1
|
||||
oid sha256:51c59af6abee80ceb59d8343ef6643d9d848f264b9af26725f19e9a70ea0a4e4
|
||||
size 8785
|
||||
BIN
training_loss.png
Normal file
Binary file not shown.
Size: 39 KiB
1
vocab.json
Normal file
File diff suppressed because one or more lines are too long