Initialize project; model provided by the ModelHub XC community

Model: ecnu-icalk/PsychAgent-Qwen3-32B
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-06-04 10:54:17 +08:00
commit 90763e8666
32 changed files with 158858 additions and 0 deletions

36
.gitattributes vendored Normal file

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
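To see which files in this commit the rules above would route through Git LFS, a rough check can be sketched in Python. This is an approximation and not Git's own matcher: it applies `fnmatch` to the base name for patterns without a slash and a simple first-directory test for `saved_model/**/*`. The helper name `is_lfs_tracked` and the shortened pattern list are illustrative assumptions.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# A subset of the patterns listed in the .gitattributes above (not the full list).
LFS_PATTERNS = [
    "*.bin",
    "*.safetensors",
    "*.pt",
    "saved_model/**/*",
    "*tfevents*",
    "tokenizer.json",
]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate gitattributes matching for the simple patterns used here."""
    posix = PurePosixPath(path)
    for pattern in patterns:
        if "/" in pattern:
            # Directory-style rule such as saved_model/**/*: treat it as
            # "anything below that top-level directory" (hypothetical shortcut).
            top_dir = pattern.split("/", 1)[0]
            if len(posix.parts) > 1 and posix.parts[0] == top_dir:
                return True
        elif fnmatch(posix.name, pattern):
            # Patterns without a slash are compared against the base name.
            return True
    return False


for name in ["model-00001-of-00014.safetensors", "tokenizer.json", "config.json", "merges.txt"]:
    print(name, is_lfs_tracked(name))
```

Under these rules the weight shards and `tokenizer.json` are stored as LFS pointers, while small text files such as `config.json` stay in regular Git storage.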

201
README.md Normal file

@@ -0,0 +1,201 @@
---
library_name: transformers
license: other
base_model: Qwen/Qwen3-32B
pipeline_tag: text-generation
tags:
- psychological-counseling
- counseling-agent
- qwen3
- llama-factory
- full
model-index:
- name: PsychAgent-Qwen3-32B
  results: []
---
# PsychAgent-Qwen3-32B
PsychAgent-Qwen3-32B is a psychological counseling model built on top of `Qwen/Qwen3-32B`. It is the 32B instantiation of **PsychAgent**, an experience-driven lifelong learning framework for AI psychological counseling.
This checkpoint was trained from `Qwen/Qwen3-32B` using two training variants:
- `rft_explicit_skill_0218`: the system prompt retains the skill candidate pool.
- `rft_implicit_skill_0218`: the system prompt removes the skill candidate pool to encourage stronger skill internalization.
## Model description
PsychAgent is designed for **multi-session psychological counseling**. Unlike static SFT-only counseling models, it improves longitudinal consistency and counseling quality through a closed-loop framework with three components:
- **Memory-Augmented Planning Engine (MAPE)**: maintains an evolving client profile and session summaries, then performs session-level planning for longitudinal continuity.
- **Skill Evolution Engine (SEE)**: extracts and organizes practice-grounded therapeutic skills into a hierarchical skill tree.
- **Reinforced Internalization Engine (RIE)**: internalizes successful counseling trajectories via rejection fine-tuning, so that useful strategies become part of the model's own behavior rather than depending on the prompt.
According to the paper, the 32B model is trained with a maximum context length of **32,768 tokens**, using **DeepSpeed ZeRO-3**, **bf16** precision, **history masking**, and a rollout number of **8**.
## Key features
- Longitudinal multi-session counseling with memory-augmented planning.
- Experience-driven skill evolution from historical counseling trajectories.
- Reinforced internalization of high-quality trajectories through rejection fine-tuning.
- Higher PsychEval scores than the compared baselines on all four aggregated dimensions, covering both counselor-side and client-side metrics.
## Intended uses
This model is intended for:
- research on AI psychological counseling and longitudinal dialogue agents;
- experiments on memory, planning, skill evolution, and lifelong learning for counseling agents;
- benchmarking on multi-session counseling settings similar to PsychEval.
## Out-of-scope use and limitations
This model is **not** a licensed mental health professional and should **not** be used as a substitute for clinical care.
It should **not** be relied on in emergencies, crisis intervention, suicide risk handling, or any high-stakes clinical scenario requiring qualified professionals.
The paper evaluates the model on benchmarked multi-session counseling tasks rather than real-world clinical deployment. The authors also note that future work is needed for more realistic counseling settings and for stronger safety and privacy protections. Some observed improvements should therefore be interpreted as **benchmark trends rather than direct clinical evidence**.
## Training and evaluation data
The experiments are built on **PsychEval**. Following the benchmark protocol, the authors first perform supervised fine-tuning on the released multi-session counseling corpus, and then reuse a pool of **2,000+ client profiles** for rollout training and evaluation.
For each therapeutic school, **140** client profiles are sampled, with **120** used for training and **20** for evaluation.
Reported metrics include shared counseling metrics such as:
- **PANAS**
- **RRO**
- **SRS**
- **CUSTOM_DIM**
- **HTAIS**
- **WAI**
- **DIALOGUE_PLANNING**
The benchmark also includes school-specific metrics such as:
- Behavioral Therapy: **MITI**, **STAI**
- Cognitive Behavioral Therapy: **CTRS**, **BDI_II**
- Postmodernist Therapy: **EFT_TFS**, **SFBT**
- Humanistic-Existential Therapy: **TES**, **CCT**
- Psychodynamic Therapy: **PSC**, **IPO**
## Performance
In the paper, PsychAgent outperforms the compared general-purpose and psychology-specific baselines on all four aggregated PsychEval dimensions.
| Model | Counselor Shared | Counselor Specific | Client Shared | Client Specific |
|---|---:|---:|---:|---:|
| GPT-5.4 | 5.54 | 7.41 | 5.07 | 7.72 |
| Gemini-3 | 5.34 | 7.04 | 4.97 | 7.52 |
| Qwen3-Max | 5.88 | 7.74 | 5.41 | 7.81 |
| DeepSeek-V3.2 | 5.54 | 7.12 | 5.06 | 7.70 |
| PsyLLM | 5.30 | 4.67 | 5.63 | 7.93 |
| PsyDTLLM | 6.10 | 5.43 | 5.27 | 7.42 |
| CPsyCounX | 4.21 | 2.51 | 4.73 | 7.27 |
| TheraMind | 6.25 | 6.94 | 5.48 | 7.83 |
| PsychAgent† (8B) | 7.35 | 7.78 | 5.94 | 8.19 |
| **PsychAgent (32B)** | **7.32** | **7.91** | **5.92** | **8.24** |
The paper also reports human evaluation on **522 matched multi-session dialogues** rated by two human annotators and one LLM rater (**Gemini-3**) across four dimensions:
- **Ethics**
- **Interaction**
- **Intervention**
- **Perception**
PsychAgent ranks first in all three rater columns, ahead of Qwen3-Max and TheraMind. The paper further reports moderate-to-strong inter-rater agreement, with:
- **human-human QWK = 0.675**
- **LLM-human QWK = 0.770 / 0.877**
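For reference, quadratic weighted kappa (QWK) measures agreement between two raters on an ordinal scale and penalizes each disagreement by the squared distance between the two ratings. The sketch below is a minimal pure-Python version; the function name `quadratic_weighted_kappa` and the example ratings are illustrative assumptions and are not taken from the paper's evaluation code or data.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Cohen's kappa with quadratic weights for two lists of integer ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    num_levels = max_rating - min_rating + 1
    if num_levels < 2:
        raise ValueError("need at least two rating levels")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of (a, b) pairs
    hist_a = Counter(rater_a)                  # marginal counts per rater
    hist_b = Counter(rater_b)

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_rating, max_rating + 1):
        for j in range(min_rating, max_rating + 1):
            weight = (i - j) ** 2 / (num_levels - 1) ** 2
            weighted_observed += weight * observed[(i, j)] / n
            weighted_expected += weight * hist_a[i] * hist_b[j] / (n * n)

    if weighted_expected == 0.0:
        # Both raters used one identical rating throughout; treat as full agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical ratings on a 1-5 scale, for illustration only.
human = [5, 4, 4, 3, 5, 2, 4]
llm = [5, 4, 3, 3, 5, 2, 5]
print(round(quadratic_weighted_kappa(human, llm, 1, 5), 3))
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement.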
## Ablation summary
The paper shows that removing any of the three main modules—**MAPE**, **SEE**, or **RIE**—degrades performance.
Among them, removing **SEE** causes the largest drop in the reported ablation, suggesting that skill evolution is especially important under the reported setting.
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 16
- gradient_accumulation_steps: 8
- total_train_batch_size: 128
- total_eval_batch_size: 128
- optimizer: adamw_torch_fused
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3.0
The paper additionally states that training used:
- **bf16** precision
- **DeepSpeed ZeRO-3**
- maximum context length **32,768**
- rollout number **N = 8**
- **10% warmup**
- two servers with **8 NVIDIA H200 GPUs each**
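The total batch sizes in the list above follow from the per-device values. As a quick arithmetic check (variable names mirror the hyperparameter list; reading `train_batch_size` and `eval_batch_size` as per-device values is an assumption based on how the Hugging Face Trainer usually reports them):

```python
# Values copied from the hyperparameter list above.
train_batch_size = 1              # assumed to be per device
eval_batch_size = 8               # assumed to be per device
num_devices = 16                  # two servers with 8 GPUs each
gradient_accumulation_steps = 8

# Sequences contributing to one optimizer step across all devices.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps

# Evaluation does not accumulate gradients, so only the device count multiplies.
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)  # 128 128
```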
### Framework versions
- Transformers 4.55.0
- PyTorch 2.9.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "ecnu-icalk/PsychAgent-Qwen3-32B"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {
        "role": "system",
        "content": "You are a supportive and cautious psychological counseling assistant. Do not claim to be a licensed clinician.",
    },
    {
        "role": "user",
        "content": "I've been feeling increasingly anxious about job hunting and sleeping poorly for two nights. Can we talk through it step by step?",
    },
]

# Render the conversation with the chat template bundled with the tokenizer.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```
## Citation
If you use this model, please cite the **PsychAgent** paper.
```bibtex
@article{yang2026psychagent,
title={PsychAgent: An Experience-Driven Lifelong Learning Agent for Self-Evolving Psychological Counselor},
author={Yang, Yutao and Li, Junsong and Pan, Qianjun and Zhou, Jie and Chen, Kai and Chen, Qin and Zhao, Jingyuan and Zhou, Ningning and Li, Xin and He, Liang},
journal={arXiv preprint arXiv},
year={2026}
}
```

28
added_tokens.json Normal file

@@ -0,0 +1,28 @@
{
"</think>": 151668,
"</tool_call>": 151658,
"</tool_response>": 151666,
"<think>": 151667,
"<tool_call>": 151657,
"<tool_response>": 151665,
"<|box_end|>": 151649,
"<|box_start|>": 151648,
"<|endoftext|>": 151643,
"<|file_sep|>": 151664,
"<|fim_middle|>": 151660,
"<|fim_pad|>": 151662,
"<|fim_prefix|>": 151659,
"<|fim_suffix|>": 151661,
"<|im_end|>": 151645,
"<|im_start|>": 151644,
"<|image_pad|>": 151655,
"<|object_ref_end|>": 151647,
"<|object_ref_start|>": 151646,
"<|quad_end|>": 151651,
"<|quad_start|>": 151650,
"<|repo_name|>": 151663,
"<|video_pad|>": 151656,
"<|vision_end|>": 151653,
"<|vision_pad|>": 151654,
"<|vision_start|>": 151652
}

8
all_results.json Normal file

@@ -0,0 +1,8 @@
{
"epoch": 3.0,
"total_flos": 4127658346151936.0,
"train_loss": 0.21015015432889433,
"train_runtime": 144374.0796,
"train_samples_per_second": 1.311,
"train_steps_per_second": 0.01
}
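The throughput figures above can be turned into more readable quantities with a little arithmetic. The sketch below assumes these results belong to the same run as the README hyperparameters (total train batch size 128, 3 epochs); the derived numbers are estimates, because `train_samples_per_second` and `train_steps_per_second` are rounded in the file.

```python
# Values copied from all_results.json above.
train_runtime_s = 144374.0796
train_samples_per_second = 1.311
train_steps_per_second = 0.01
epochs = 3.0

# Assumption: same run as the README hyperparameters.
total_train_batch_size = 128

runtime_hours = train_runtime_s / 3600
samples_seen = train_samples_per_second * train_runtime_s
samples_per_epoch = samples_seen / epochs

# Two independent step estimates; they differ because both rates are rounded.
steps_from_samples = samples_seen / total_train_batch_size
steps_from_rate = train_steps_per_second * train_runtime_s

print(f"runtime: {runtime_hours:.1f} h")
print(f"samples per epoch: ~{samples_per_epoch:,.0f}")
print(f"optimizer steps: ~{steps_from_samples:,.0f} (from samples), ~{steps_from_rate:,.0f} (from step rate)")
```

Both step estimates land near 1,450 optimizer steps, which is consistent with roughly 40 hours of training on 16 GPUs at the reported rate.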

89
chat_template.jinja Normal file

@@ -0,0 +1,89 @@
{%- if tools %}
{{- '<|im_start|>system\n' }}
{%- if messages[0].role == 'system' %}
{{- messages[0].content + '\n\n' }}
{%- endif %}
{{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
{%- for tool in tools %}
{{- "\n" }}
{{- tool | tojson }}
{%- endfor %}
{{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
{%- if messages[0].role == 'system' %}
{{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
{%- set index = (messages|length - 1) - loop.index0 %}
{%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
{%- set ns.multi_step_tool = false %}
{%- set ns.last_query_index = index %}
{%- endif %}
{%- endfor %}
{%- for message in messages %}
{%- if message.content is string %}
{%- set content = message.content %}
{%- else %}
{%- set content = '' %}
{%- endif %}
{%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
{{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
{%- elif message.role == "assistant" %}
{%- set reasoning_content = '' %}
{%- if message.reasoning_content is string %}
{%- set reasoning_content = message.reasoning_content %}
{%- else %}
{%- if '</think>' in content %}
{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
{%- set content = content.split('</think>')[-1].lstrip('\n') %}
{%- endif %}
{%- endif %}
{%- if loop.index0 > ns.last_query_index %}
{%- if loop.last or (not loop.last and reasoning_content) %}
{{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- if message.tool_calls %}
{%- for tool_call in message.tool_calls %}
{%- if (loop.first and content) or (not loop.first) %}
{{- '\n' }}
{%- endif %}
{%- if tool_call.function %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '<tool_call>\n{"name": "' }}
{{- tool_call.name }}
{{- '", "arguments": ' }}
{%- if tool_call.arguments is string %}
{{- tool_call.arguments }}
{%- else %}
{{- tool_call.arguments | tojson }}
{%- endif %}
{{- '}\n</tool_call>' }}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "tool" %}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|im_start|>user' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{{- content }}
{{- '\n</tool_response>' }}
{%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- if enable_thinking is defined and enable_thinking is false %}
{{- '<think>\n\n</think>\n\n' }}
{%- endif %}
{%- endif %}
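To make the template's output easier to picture, the sketch below re-implements two small parts of it in plain Python: the ChatML layout for system and user turns (including the empty `<think>` block emitted when `enable_thinking` is false) and the splitting of an assistant message into reasoning and answer. It is a simplified illustration, not a replacement for the template: tools, tool responses, and assistant turns in the history are left out, and the function names `render_simple_chat` and `split_reasoning` are hypothetical.

```python
def render_simple_chat(messages, add_generation_prompt=True, enable_thinking=True):
    """Render system/user turns in the ChatML layout used by the template above."""
    parts = []
    for message in messages:
        role, content = message["role"], message["content"]
        if role not in ("system", "user"):
            raise ValueError(f"role not covered by this sketch: {role}")
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
        # The template pre-fills an empty think block only when thinking is disabled.
        if enable_thinking is False:
            parts.append("<think>\n\n</think>\n\n")
    return "".join(parts)


def split_reasoning(content):
    """Mirror the template's split of '<think>...</think>answer' into two strings."""
    if "</think>" not in content:
        return "", content
    reasoning = content.split("</think>")[0].rstrip("\n").split("<think>")[-1].lstrip("\n")
    answer = content.split("</think>")[-1].lstrip("\n")
    return reasoning, answer


prompt = render_simple_chat(
    [
        {"role": "system", "content": "You are a supportive counseling assistant."},
        {"role": "user", "content": "I have trouble sleeping."},
    ],
    enable_thinking=False,
)
print(prompt)
print(split_reasoning("<think>\nplan the session\n</think>\n\nLet's start with your sleep routine."))
```

With the real tokenizer, the same switch is normally passed as a keyword argument to `apply_chat_template`, which forwards it to the template as the `enable_thinking` variable.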

96
config.json Normal file

@@ -0,0 +1,96 @@
{
"architectures": [
"Qwen3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 151643,
"eos_token_id": 151645,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 5120,
"initializer_range": 0.02,
"intermediate_size": 25600,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 40960,
"max_window_layers": 64,
"model_type": "qwen3",
"num_attention_heads": 64,
"num_hidden_layers": 64,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.55.0",
"use_cache": false,
"use_sliding_window": false,
"vocab_size": 151936
}
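The configuration above is enough to estimate the parameter count and the per-token KV-cache footprint. The sketch below assumes the tensor shapes implied by the weight names in the safetensors index (bias-free projections, per-head `q_norm` and `k_norm` of size `head_dim`, untied input and output embeddings); these shapes are an inference from the config, not something the commit states explicitly.

```python
# Values copied from config.json above.
hidden_size = 5120
intermediate_size = 25600
num_hidden_layers = 64
num_attention_heads = 64
num_key_value_heads = 8
head_dim = 128
vocab_size = 151936
bytes_per_value = 2  # bfloat16

q_out = num_attention_heads * head_dim    # 8192
kv_out = num_key_value_heads * head_dim   # 1024, grouped-query attention

# q_proj, k_proj, v_proj, o_proj (attention_bias is false, so no bias terms).
attention = hidden_size * q_out + 2 * hidden_size * kv_out + q_out * hidden_size
# gate_proj, up_proj, down_proj.
mlp = 3 * hidden_size * intermediate_size
# input_layernorm, post_attention_layernorm, q_norm, k_norm.
norms = 2 * hidden_size + 2 * head_dim

per_layer = attention + mlp + norms
# embed_tokens and lm_head are separate because tie_word_embeddings is false.
embeddings = 2 * vocab_size * hidden_size
final_norm = hidden_size

total_params = num_hidden_layers * per_layer + embeddings + final_norm
print(total_params)                    # 32762123264
print(total_params * bytes_per_value)  # 65524246528

# Keys and values for every layer, per cached token.
kv_cache_bytes_per_token = 2 * num_hidden_layers * kv_out * bytes_per_value
print(kv_cache_bytes_per_token)        # 262144
```

The byte count matches the `total_size` of 65524246528 recorded in the weight index later in this commit, which supports the assumed shapes. At 262,144 bytes per token, a full 32,768-token context needs about 8 GiB of KV cache per sequence in bf16.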

13
generation_config.json Normal file

@@ -0,0 +1,13 @@
{
"bos_token_id": 151643,
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"temperature": 0.6,
"top_k": 20,
"top_p": 0.95,
"transformers_version": "4.55.0"
}
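The defaults above enable sampling with temperature 0.6, top-k 20, and top-p 0.95. The sketch below shows how these three settings typically combine when picking one token from a list of logits. It illustrates the general technique in plain Python and is not the Transformers implementation; the function name `sample_token` and the toy logits are assumptions.

```python
import math
import random


def sample_token(logits, temperature=0.6, top_k=20, top_p=0.95, rng=random):
    """Pick one token index using temperature scaling, then top-k, then top-p."""
    # Temperature below 1 sharpens the distribution.
    scaled = [value / temperature for value in logits]

    # Top-k: keep only the k highest-scoring candidates, best first.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]

    # Softmax over the surviving candidates (shifted by the max for stability).
    peak = scaled[ranked[0]]
    weights = [math.exp(scaled[i] - peak) for i in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept_ids, kept_probs, cumulative = [], [], 0.0
    for token_id, prob in zip(ranked, probs):
        kept_ids.append(token_id)
        kept_probs.append(prob)
        cumulative += prob
        if cumulative >= top_p:
            break

    # random.choices renormalizes the remaining weights internally.
    return rng.choices(kept_ids, weights=kept_probs, k=1)[0]


toy_logits = [2.0, 1.5, 0.3, -1.0, -4.0]  # hypothetical 5-token vocabulary
print(sample_token(toy_logits, rng=random.Random(0)))
```

Passing explicit values to `generate` (for example `do_sample=False`) overrides these defaults for a single call.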

151388
merges.txt Normal file

File diff suppressed because it is too large


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ff9e062e123e8ff2bf3f9cc1006a0bed93c8d65a5d6f2aa44b0b78043b7ca2ec
size 4932307584


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5dd631eea30d529dbf5e74d752e9841d4f16e5a18b780529c90eae796a0f4864
size 4875989696


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:be8e28dbb22bbcc0cc9e3d5f191f5437dfa3717302e68a4a7a2ea6bc81b2c37a
size 4875989720


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8e8889f113d91488face14d700457df6b1d39be1cccac796fb3a4adb77422bc7
size 4875989752


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:859e974b654aab605867a544b0134a0dfc2356e05dcb8813fa26fa0d2a826e86
size 4875989752


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:feeac00d1eb98b7aeea50c45a1d8fd96eac7fdf018a88f2b789fd239f4a5319a
size 4875989752


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ed9d3a2b0566efe409d2df6ad3d4ce4a8e8be7d7fbcbd5b7d852d643cd7b0c89
size 4875989752


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f11adac709a5b36cbaf8b81925ab26d11724b473444c725ca98ff9a0d9ec4c03
size 4875989752


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cdd578421aa2e697d8d8c3732f010ddb9da3afc10dfd7121bec9a2233bfd2d87
size 4875989752


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ba154cf8de8e423a987196e0951d7a0eb2d0a3d358add722e33bcb8c7d6711c3
size 4875989752


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:eca819d45c7ca103618c421fa6329c663895d3ab200cc0ecaebdc40f8922f692
size 4875989752


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6900352469d719ea954507fe6412d242eae66611c69b156c49dc8900ad5fd807
size 4875989752


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:677e6cfbd79c8597acf464559293742f93ed8e24390f526027eef415c3f3b0af
size 4875989752


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2255e3ba40961becc7ee1a17126f807af2bd7981ae4ca1ce7b9770def63cfa49
size 2080144040
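Each of the short three-line files above is a Git LFS pointer: it records the spec version, a SHA-256 object ID, and the size in bytes of the real file. The sketch below parses one pointer and sums the fourteen sizes listed above. It assumes these pointers are the `model-0000N-of-00014.safetensors` shards in order, which the page view does not state because the file names are missing; the helper name `parse_lfs_pointer` is hypothetical.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


first_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:ff9e062e123e8ff2bf3f9cc1006a0bed93c8d65a5d6f2aa44b0b78043b7ca2ec
size 4932307584
"""
info = parse_lfs_pointer(first_pointer)
print(info["hash_algo"], info["size"])

# Sizes copied from the fourteen pointers above, in the order shown.
shard_sizes = (
    [4932307584, 4875989696, 4875989720]
    + [4875989752] * 10
    + [2080144040]
)
total_bytes = sum(shard_sizes)
print(len(shard_sizes), total_bytes, f"{total_bytes / 2**30:.1f} GiB")
```

The sum is slightly larger than the `total_size` in the weight index, which would be expected if each shard also carries a small header in addition to the tensor data.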


@@ -0,0 +1,715 @@
{
"metadata": {
"total_parameters": 676864,
"total_size": 65524246528
},
"weight_map": {
"lm_head.weight": "model-00014-of-00014.safetensors",
"model.embed_tokens.weight": "model-00001-of-00014.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00014.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00014.safetensors",
"model.layers.0.self_attn.k_norm.weight": "model-00001-of-00014.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.0.self_attn.q_norm.weight": "model-00001-of-00014.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00014.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00014.safetensors",
"model.layers.1.self_attn.k_norm.weight": "model-00001-of-00014.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.self_attn.q_norm.weight": "model-00001-of-00014.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.10.input_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.10.self_attn.k_norm.weight": "model-00003-of-00014.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.10.self_attn.q_norm.weight": "model-00003-of-00014.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.11.input_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.11.self_attn.k_norm.weight": "model-00003-of-00014.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.11.self_attn.q_norm.weight": "model-00003-of-00014.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.12.input_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.12.self_attn.k_norm.weight": "model-00003-of-00014.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.12.self_attn.q_norm.weight": "model-00003-of-00014.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.13.input_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.13.self_attn.k_norm.weight": "model-00003-of-00014.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.13.self_attn.q_norm.weight": "model-00003-of-00014.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.14.input_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.14.self_attn.k_norm.weight": "model-00004-of-00014.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.14.self_attn.q_norm.weight": "model-00004-of-00014.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.15.input_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.15.self_attn.k_norm.weight": "model-00004-of-00014.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.15.self_attn.q_norm.weight": "model-00004-of-00014.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.16.input_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.16.self_attn.k_norm.weight": "model-00004-of-00014.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.16.self_attn.q_norm.weight": "model-00004-of-00014.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.17.input_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00004-of-00014.safetensors",
"model.layers.17.self_attn.k_norm.weight": "model-00004-of-00014.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.17.self_attn.q_norm.weight": "model-00004-of-00014.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.18.input_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.18.self_attn.k_norm.weight": "model-00004-of-00014.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.18.self_attn.q_norm.weight": "model-00004-of-00014.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00004-of-00014.safetensors",
"model.layers.19.input_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.19.self_attn.k_norm.weight": "model-00005-of-00014.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.19.self_attn.q_norm.weight": "model-00005-of-00014.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.2.input_layernorm.weight": "model-00001-of-00014.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00014.safetensors",
"model.layers.2.self_attn.k_norm.weight": "model-00001-of-00014.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.2.self_attn.q_norm.weight": "model-00001-of-00014.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.20.input_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.20.self_attn.k_norm.weight": "model-00005-of-00014.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.20.self_attn.q_norm.weight": "model-00005-of-00014.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.21.input_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.21.self_attn.k_norm.weight": "model-00005-of-00014.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.21.self_attn.q_norm.weight": "model-00005-of-00014.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.22.input_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00005-of-00014.safetensors",
"model.layers.22.self_attn.k_norm.weight": "model-00005-of-00014.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.22.self_attn.q_norm.weight": "model-00005-of-00014.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.23.input_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.23.self_attn.k_norm.weight": "model-00005-of-00014.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.23.self_attn.q_norm.weight": "model-00005-of-00014.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00005-of-00014.safetensors",
"model.layers.24.input_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.24.self_attn.k_norm.weight": "model-00006-of-00014.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.24.self_attn.q_norm.weight": "model-00006-of-00014.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.25.input_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.25.self_attn.k_norm.weight": "model-00006-of-00014.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.25.self_attn.q_norm.weight": "model-00006-of-00014.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.26.input_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.26.self_attn.k_norm.weight": "model-00006-of-00014.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.26.self_attn.q_norm.weight": "model-00006-of-00014.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.27.input_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00006-of-00014.safetensors",
"model.layers.27.self_attn.k_norm.weight": "model-00006-of-00014.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.27.self_attn.q_norm.weight": "model-00006-of-00014.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.28.input_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.28.self_attn.k_norm.weight": "model-00006-of-00014.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.28.self_attn.q_norm.weight": "model-00006-of-00014.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00006-of-00014.safetensors",
"model.layers.29.input_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.29.self_attn.k_norm.weight": "model-00007-of-00014.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.29.self_attn.q_norm.weight": "model-00007-of-00014.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.3.input_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.3.self_attn.k_norm.weight": "model-00001-of-00014.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.3.self_attn.q_norm.weight": "model-00001-of-00014.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00014.safetensors",
"model.layers.30.input_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.30.self_attn.k_norm.weight": "model-00007-of-00014.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.30.self_attn.q_norm.weight": "model-00007-of-00014.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.31.input_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.31.self_attn.k_norm.weight": "model-00007-of-00014.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.31.self_attn.q_norm.weight": "model-00007-of-00014.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.32.input_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.32.mlp.down_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.32.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.32.mlp.up_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.32.post_attention_layernorm.weight": "model-00007-of-00014.safetensors",
"model.layers.32.self_attn.k_norm.weight": "model-00007-of-00014.safetensors",
"model.layers.32.self_attn.k_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.32.self_attn.o_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.32.self_attn.q_norm.weight": "model-00007-of-00014.safetensors",
"model.layers.32.self_attn.q_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.32.self_attn.v_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.33.input_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.33.mlp.down_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.33.mlp.gate_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.33.mlp.up_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.33.post_attention_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.33.self_attn.k_norm.weight": "model-00007-of-00014.safetensors",
"model.layers.33.self_attn.k_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.33.self_attn.o_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.33.self_attn.q_norm.weight": "model-00007-of-00014.safetensors",
"model.layers.33.self_attn.q_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.33.self_attn.v_proj.weight": "model-00007-of-00014.safetensors",
"model.layers.34.input_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.34.mlp.down_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.34.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.34.mlp.up_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.34.post_attention_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.34.self_attn.k_norm.weight": "model-00008-of-00014.safetensors",
"model.layers.34.self_attn.k_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.34.self_attn.o_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.34.self_attn.q_norm.weight": "model-00008-of-00014.safetensors",
"model.layers.34.self_attn.q_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.34.self_attn.v_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.35.input_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.35.mlp.down_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.35.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.35.mlp.up_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.35.post_attention_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.35.self_attn.k_norm.weight": "model-00008-of-00014.safetensors",
"model.layers.35.self_attn.k_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.35.self_attn.o_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.35.self_attn.q_norm.weight": "model-00008-of-00014.safetensors",
"model.layers.35.self_attn.q_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.35.self_attn.v_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.36.input_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.36.mlp.down_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.36.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.36.mlp.up_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.36.post_attention_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.36.self_attn.k_norm.weight": "model-00008-of-00014.safetensors",
"model.layers.36.self_attn.k_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.36.self_attn.o_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.36.self_attn.q_norm.weight": "model-00008-of-00014.safetensors",
"model.layers.36.self_attn.q_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.36.self_attn.v_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.37.input_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.37.mlp.down_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.37.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.37.mlp.up_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.37.post_attention_layernorm.weight": "model-00008-of-00014.safetensors",
"model.layers.37.self_attn.k_norm.weight": "model-00008-of-00014.safetensors",
"model.layers.37.self_attn.k_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.37.self_attn.o_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.37.self_attn.q_norm.weight": "model-00008-of-00014.safetensors",
"model.layers.37.self_attn.q_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.37.self_attn.v_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.38.input_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.38.mlp.down_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.38.mlp.gate_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.38.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.38.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.38.self_attn.k_norm.weight": "model-00008-of-00014.safetensors",
"model.layers.38.self_attn.k_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.38.self_attn.o_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.38.self_attn.q_norm.weight": "model-00008-of-00014.safetensors",
"model.layers.38.self_attn.q_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.38.self_attn.v_proj.weight": "model-00008-of-00014.safetensors",
"model.layers.39.input_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.39.mlp.down_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.39.mlp.gate_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.39.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.39.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.39.self_attn.k_norm.weight": "model-00009-of-00014.safetensors",
"model.layers.39.self_attn.k_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.39.self_attn.o_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.39.self_attn.q_norm.weight": "model-00009-of-00014.safetensors",
"model.layers.39.self_attn.q_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.39.self_attn.v_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.4.input_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.4.self_attn.k_norm.weight": "model-00002-of-00014.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.4.self_attn.q_norm.weight": "model-00002-of-00014.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.40.input_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.40.mlp.down_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.40.mlp.gate_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.40.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.40.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.40.self_attn.k_norm.weight": "model-00009-of-00014.safetensors",
"model.layers.40.self_attn.k_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.40.self_attn.o_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.40.self_attn.q_norm.weight": "model-00009-of-00014.safetensors",
"model.layers.40.self_attn.q_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.40.self_attn.v_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.41.input_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.41.mlp.down_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.41.mlp.gate_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.41.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.41.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.41.self_attn.k_norm.weight": "model-00009-of-00014.safetensors",
"model.layers.41.self_attn.k_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.41.self_attn.o_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.41.self_attn.q_norm.weight": "model-00009-of-00014.safetensors",
"model.layers.41.self_attn.q_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.41.self_attn.v_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.42.input_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.42.mlp.down_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.42.mlp.gate_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.42.mlp.up_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.42.post_attention_layernorm.weight": "model-00009-of-00014.safetensors",
"model.layers.42.self_attn.k_norm.weight": "model-00009-of-00014.safetensors",
"model.layers.42.self_attn.k_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.42.self_attn.o_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.42.self_attn.q_norm.weight": "model-00009-of-00014.safetensors",
"model.layers.42.self_attn.q_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.42.self_attn.v_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.43.input_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.43.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.43.mlp.gate_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.43.mlp.up_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.43.post_attention_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.43.self_attn.k_norm.weight": "model-00009-of-00014.safetensors",
"model.layers.43.self_attn.k_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.43.self_attn.o_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.43.self_attn.q_norm.weight": "model-00009-of-00014.safetensors",
"model.layers.43.self_attn.q_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.43.self_attn.v_proj.weight": "model-00009-of-00014.safetensors",
"model.layers.44.input_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.44.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.44.mlp.gate_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.44.mlp.up_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.44.post_attention_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.44.self_attn.k_norm.weight": "model-00010-of-00014.safetensors",
"model.layers.44.self_attn.k_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.44.self_attn.o_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.44.self_attn.q_norm.weight": "model-00010-of-00014.safetensors",
"model.layers.44.self_attn.q_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.44.self_attn.v_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.45.input_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.45.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.45.mlp.gate_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.45.mlp.up_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.45.post_attention_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.45.self_attn.k_norm.weight": "model-00010-of-00014.safetensors",
"model.layers.45.self_attn.k_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.45.self_attn.o_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.45.self_attn.q_norm.weight": "model-00010-of-00014.safetensors",
"model.layers.45.self_attn.q_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.45.self_attn.v_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.46.input_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.46.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.46.mlp.gate_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.46.mlp.up_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.46.post_attention_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.46.self_attn.k_norm.weight": "model-00010-of-00014.safetensors",
"model.layers.46.self_attn.k_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.46.self_attn.o_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.46.self_attn.q_norm.weight": "model-00010-of-00014.safetensors",
"model.layers.46.self_attn.q_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.46.self_attn.v_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.47.input_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.47.mlp.down_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.47.mlp.gate_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.47.mlp.up_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.47.post_attention_layernorm.weight": "model-00010-of-00014.safetensors",
"model.layers.47.self_attn.k_norm.weight": "model-00010-of-00014.safetensors",
"model.layers.47.self_attn.k_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.47.self_attn.o_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.47.self_attn.q_norm.weight": "model-00010-of-00014.safetensors",
"model.layers.47.self_attn.q_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.47.self_attn.v_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.48.input_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.48.mlp.down_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.48.mlp.gate_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.48.mlp.up_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.48.post_attention_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.48.self_attn.k_norm.weight": "model-00010-of-00014.safetensors",
"model.layers.48.self_attn.k_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.48.self_attn.o_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.48.self_attn.q_norm.weight": "model-00010-of-00014.safetensors",
"model.layers.48.self_attn.q_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.48.self_attn.v_proj.weight": "model-00010-of-00014.safetensors",
"model.layers.49.input_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.49.mlp.down_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.49.mlp.gate_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.49.mlp.up_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.49.post_attention_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.49.self_attn.k_norm.weight": "model-00011-of-00014.safetensors",
"model.layers.49.self_attn.k_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.49.self_attn.o_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.49.self_attn.q_norm.weight": "model-00011-of-00014.safetensors",
"model.layers.49.self_attn.q_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.49.self_attn.v_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.5.input_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.5.self_attn.k_norm.weight": "model-00002-of-00014.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.5.self_attn.q_norm.weight": "model-00002-of-00014.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.50.input_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.50.mlp.down_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.50.mlp.gate_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.50.mlp.up_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.50.post_attention_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.50.self_attn.k_norm.weight": "model-00011-of-00014.safetensors",
"model.layers.50.self_attn.k_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.50.self_attn.o_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.50.self_attn.q_norm.weight": "model-00011-of-00014.safetensors",
"model.layers.50.self_attn.q_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.50.self_attn.v_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.51.input_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.51.mlp.down_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.51.mlp.gate_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.51.mlp.up_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.51.post_attention_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.51.self_attn.k_norm.weight": "model-00011-of-00014.safetensors",
"model.layers.51.self_attn.k_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.51.self_attn.o_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.51.self_attn.q_norm.weight": "model-00011-of-00014.safetensors",
"model.layers.51.self_attn.q_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.51.self_attn.v_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.52.input_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.52.mlp.down_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.52.mlp.gate_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.52.mlp.up_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.52.post_attention_layernorm.weight": "model-00011-of-00014.safetensors",
"model.layers.52.self_attn.k_norm.weight": "model-00011-of-00014.safetensors",
"model.layers.52.self_attn.k_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.52.self_attn.o_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.52.self_attn.q_norm.weight": "model-00011-of-00014.safetensors",
"model.layers.52.self_attn.q_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.52.self_attn.v_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.53.input_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.53.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.53.mlp.gate_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.53.mlp.up_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.53.post_attention_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.53.self_attn.k_norm.weight": "model-00011-of-00014.safetensors",
"model.layers.53.self_attn.k_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.53.self_attn.o_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.53.self_attn.q_norm.weight": "model-00011-of-00014.safetensors",
"model.layers.53.self_attn.q_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.53.self_attn.v_proj.weight": "model-00011-of-00014.safetensors",
"model.layers.54.input_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.54.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.54.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.54.mlp.up_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.54.post_attention_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.54.self_attn.k_norm.weight": "model-00012-of-00014.safetensors",
"model.layers.54.self_attn.k_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.54.self_attn.o_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.54.self_attn.q_norm.weight": "model-00012-of-00014.safetensors",
"model.layers.54.self_attn.q_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.54.self_attn.v_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.55.input_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.55.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.55.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.55.mlp.up_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.55.post_attention_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.55.self_attn.k_norm.weight": "model-00012-of-00014.safetensors",
"model.layers.55.self_attn.k_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.55.self_attn.o_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.55.self_attn.q_norm.weight": "model-00012-of-00014.safetensors",
"model.layers.55.self_attn.q_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.55.self_attn.v_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.56.input_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.56.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.56.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.56.mlp.up_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.56.post_attention_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.56.self_attn.k_norm.weight": "model-00012-of-00014.safetensors",
"model.layers.56.self_attn.k_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.56.self_attn.o_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.56.self_attn.q_norm.weight": "model-00012-of-00014.safetensors",
"model.layers.56.self_attn.q_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.56.self_attn.v_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.57.input_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.57.mlp.down_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.57.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.57.mlp.up_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.57.post_attention_layernorm.weight": "model-00012-of-00014.safetensors",
"model.layers.57.self_attn.k_norm.weight": "model-00012-of-00014.safetensors",
"model.layers.57.self_attn.k_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.57.self_attn.o_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.57.self_attn.q_norm.weight": "model-00012-of-00014.safetensors",
"model.layers.57.self_attn.q_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.57.self_attn.v_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.58.input_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.58.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.58.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.58.mlp.up_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.58.post_attention_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.58.self_attn.k_norm.weight": "model-00012-of-00014.safetensors",
"model.layers.58.self_attn.k_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.58.self_attn.o_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.58.self_attn.q_norm.weight": "model-00012-of-00014.safetensors",
"model.layers.58.self_attn.q_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.58.self_attn.v_proj.weight": "model-00012-of-00014.safetensors",
"model.layers.59.input_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.59.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.59.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.59.mlp.up_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.59.post_attention_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.59.self_attn.k_norm.weight": "model-00013-of-00014.safetensors",
"model.layers.59.self_attn.k_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.59.self_attn.o_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.59.self_attn.q_norm.weight": "model-00013-of-00014.safetensors",
"model.layers.59.self_attn.q_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.59.self_attn.v_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.6.input_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.6.self_attn.k_norm.weight": "model-00002-of-00014.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.6.self_attn.q_norm.weight": "model-00002-of-00014.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.60.input_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.60.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.60.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.60.mlp.up_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.60.post_attention_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.60.self_attn.k_norm.weight": "model-00013-of-00014.safetensors",
"model.layers.60.self_attn.k_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.60.self_attn.o_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.60.self_attn.q_norm.weight": "model-00013-of-00014.safetensors",
"model.layers.60.self_attn.q_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.60.self_attn.v_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.61.input_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.61.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.61.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.61.mlp.up_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.61.post_attention_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.61.self_attn.k_norm.weight": "model-00013-of-00014.safetensors",
"model.layers.61.self_attn.k_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.61.self_attn.o_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.61.self_attn.q_norm.weight": "model-00013-of-00014.safetensors",
"model.layers.61.self_attn.q_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.61.self_attn.v_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.62.input_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.62.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.62.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.62.mlp.up_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.62.post_attention_layernorm.weight": "model-00013-of-00014.safetensors",
"model.layers.62.self_attn.k_norm.weight": "model-00013-of-00014.safetensors",
"model.layers.62.self_attn.k_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.62.self_attn.o_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.62.self_attn.q_norm.weight": "model-00013-of-00014.safetensors",
"model.layers.62.self_attn.q_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.62.self_attn.v_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.63.input_layernorm.weight": "model-00014-of-00014.safetensors",
"model.layers.63.mlp.down_proj.weight": "model-00014-of-00014.safetensors",
"model.layers.63.mlp.gate_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.63.mlp.up_proj.weight": "model-00014-of-00014.safetensors",
"model.layers.63.post_attention_layernorm.weight": "model-00014-of-00014.safetensors",
"model.layers.63.self_attn.k_norm.weight": "model-00013-of-00014.safetensors",
"model.layers.63.self_attn.k_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.63.self_attn.o_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.63.self_attn.q_norm.weight": "model-00013-of-00014.safetensors",
"model.layers.63.self_attn.q_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.63.self_attn.v_proj.weight": "model-00013-of-00014.safetensors",
"model.layers.7.input_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00002-of-00014.safetensors",
"model.layers.7.self_attn.k_norm.weight": "model-00002-of-00014.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.7.self_attn.q_norm.weight": "model-00002-of-00014.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.8.input_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.8.self_attn.k_norm.weight": "model-00002-of-00014.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.8.self_attn.q_norm.weight": "model-00002-of-00014.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00002-of-00014.safetensors",
"model.layers.9.input_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00003-of-00014.safetensors",
"model.layers.9.self_attn.k_norm.weight": "model-00003-of-00014.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.9.self_attn.q_norm.weight": "model-00003-of-00014.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00003-of-00014.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00003-of-00014.safetensors",
"model.norm.weight": "model-00014-of-00014.safetensors"
}
}
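
Note on the mapping above: most layers sit in a single shard, but a few straddle a shard boundary (layer 58 is split between shards 12 and 13, for example). The sketch below is one way to spot such layers. It assumes the enclosing key is `weight_map`, as in the usual Hugging Face sharded-checkpoint index, and it uses a four-entry excerpt copied from the listing rather than the full file. The helper name `shards_per_layer` is hypothetical.

```python
import json
from collections import defaultdict

# Excerpt copied from the mapping above (layer 58 only). A real run would
# load the complete index JSON from disk instead of this inline string.
index_excerpt = json.loads("""
{
  "weight_map": {
    "model.layers.58.input_layernorm.weight": "model-00013-of-00014.safetensors",
    "model.layers.58.mlp.down_proj.weight": "model-00013-of-00014.safetensors",
    "model.layers.58.mlp.gate_proj.weight": "model-00012-of-00014.safetensors",
    "model.layers.58.self_attn.q_proj.weight": "model-00012-of-00014.safetensors"
  }
}
""")


def shards_per_layer(weight_map):
    """Map each layer index to the sorted list of shard files holding its tensors."""
    layers = defaultdict(set)
    for name, shard in weight_map.items():
        parts = name.split(".")
        # Only names of the form "model.layers.<n>...." belong to a decoder layer.
        if len(parts) > 2 and parts[0] == "model" and parts[1] == "layers":
            layers[int(parts[2])].add(shard)
    return {layer: sorted(files) for layer, files in layers.items()}


layer_shards = shards_per_layer(index_excerpt["weight_map"])
# Layers whose tensors are spread over more than one shard file.
split_layers = [layer for layer, files in layer_shards.items() if len(files) > 1]
print(layer_shards)
print(split_layers)
```

A loader that reads one layer at a time would need to open both shard files for any layer reported in `split_layers`.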

31
special_tokens_map.json Normal file

@@ -0,0 +1,31 @@
{
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"eos_token": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

BIN
tokenizer.json (Stored with Git LFS) Normal file

Binary file not shown.

240
tokenizer_config.json Normal file

@@ -0,0 +1,240 @@
{
"add_bos_token": false,
"add_prefix_space": false,
"added_tokens_decoder": {
"151643": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151644": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151645": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151646": {
"content": "<|object_ref_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151647": {
"content": "<|object_ref_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151648": {
"content": "<|box_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151649": {
"content": "<|box_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151650": {
"content": "<|quad_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151651": {
"content": "<|quad_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151652": {
"content": "<|vision_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151653": {
"content": "<|vision_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151654": {
"content": "<|vision_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151655": {
"content": "<|image_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151656": {
"content": "<|video_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151657": {
"content": "<tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151658": {
"content": "</tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151659": {
"content": "<|fim_prefix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151660": {
"content": "<|fim_middle|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151661": {
"content": "<|fim_suffix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151662": {
"content": "<|fim_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151663": {
"content": "<|repo_name|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151664": {
"content": "<|file_sep|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151665": {
"content": "<tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151666": {
"content": "</tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151667": {
"content": "<think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151668": {
"content": "</think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"bos_token": null,
"clean_up_tokenization_spaces": false,
"eos_token": "<|im_end|>",
"errors": "replace",
"extra_special_tokens": {},
"model_max_length": 131072,
"pad_token": "<|endoftext|>",
"padding_side": "right",
"split_special_tokens": false,
"tokenizer_class": "Qwen2Tokenizer",
"unk_token": null
}
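
Note on the config above: `eos_token` and `pad_token` are given as strings, and their integer IDs have to be looked up in `added_tokens_decoder`. The sketch below shows that lookup on a small excerpt copied from the file, using only the standard `json` module. It does not load the real tokenizer, and the variable names are illustrative.

```python
import json

# Excerpt copied from tokenizer_config.json above, reduced to the fields used here.
config_excerpt = json.loads("""
{
  "added_tokens_decoder": {
    "151643": {"content": "<|endoftext|>", "special": true},
    "151645": {"content": "<|im_end|>", "special": true},
    "151657": {"content": "<tool_call>", "special": false},
    "151667": {"content": "<think>", "special": false}
  },
  "eos_token": "<|im_end|>",
  "pad_token": "<|endoftext|>"
}
""")

decoder = config_excerpt["added_tokens_decoder"]

# Invert the ID -> entry table so token strings can be resolved to integer IDs.
content_to_id = {entry["content"]: int(token_id) for token_id, entry in decoder.items()}

eos_id = content_to_id[config_excerpt["eos_token"]]
pad_id = content_to_id[config_excerpt["pad_token"]]

# Added tokens that are not flagged as special in this excerpt.
non_special = sorted(entry["content"] for entry in decoder.values() if not entry["special"])

print(eos_id, pad_id, non_special)
```

In this excerpt, `<think>` and `<tool_call>` are added tokens with `"special": false`, which is why they appear in `non_special` rather than alongside the chat-control tokens.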

8
train_results.json Normal file

@@ -0,0 +1,8 @@
{
"epoch": 3.0,
"total_flos": 4127658346151936.0,
"train_loss": 0.21015015432889433,
"train_runtime": 144374.0796,
"train_samples_per_second": 1.311,
"train_steps_per_second": 0.01
}
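
Note on the figures above: the values can be cross-checked against each other with a little arithmetic. The sketch below converts the runtime to hours, recomputes steps per second from the `total_steps` value of 1479 reported in trainer_log.jsonl in this commit, and estimates how many samples each optimizer step covered. That last number is an inference from rounded throughput figures, not a logged value; the actual batch size, gradient accumulation, and device count are not stated in these files.

```python
# Values copied from train_results.json above.
train_results = {
    "epoch": 3.0,
    "train_runtime": 144374.0796,       # seconds
    "train_samples_per_second": 1.311,  # rounded in the source
    "train_steps_per_second": 0.01,     # rounded in the source
}

# "total_steps" as reported on every line of trainer_log.jsonl.
total_steps = 1479

runtime_hours = train_results["train_runtime"] / 3600
steps_per_second = total_steps / train_results["train_runtime"]

# Approximate number of samples processed over all epochs, then per optimizer step.
approx_samples = train_results["train_samples_per_second"] * train_results["train_runtime"]
approx_samples_per_step = approx_samples / total_steps
steps_per_epoch = total_steps / train_results["epoch"]

print(f"runtime: {runtime_hours:.1f} h")
print(f"steps/s: {steps_per_second:.4f}")
print(f"samples per step (estimate): {approx_samples_per_step:.1f}")
print(f"steps per epoch: {steps_per_epoch:.0f}")
```

The recomputed steps-per-second value rounds to the logged 0.01, so the two files agree with each other at the precision reported.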

740
trainer_log.jsonl Normal file

@@ -0,0 +1,740 @@
{"current_steps": 2, "total_steps": 1479, "loss": 0.3778, "lr": 6.756756756756757e-08, "epoch": 0.004058853373921867, "percentage": 0.14, "elapsed_time": "0:03:15", "remaining_time": "1 day, 16:04:46"}
{"current_steps": 4, "total_steps": 1479, "loss": 0.3539, "lr": 2.0270270270270273e-07, "epoch": 0.008117706747843734, "percentage": 0.27, "elapsed_time": "0:06:21", "remaining_time": "1 day, 15:04:39"}
{"current_steps": 6, "total_steps": 1479, "loss": 0.3621, "lr": 3.378378378378379e-07, "epoch": 0.0121765601217656, "percentage": 0.41, "elapsed_time": "0:09:43", "remaining_time": "1 day, 15:48:57"}
{"current_steps": 8, "total_steps": 1479, "loss": 0.3345, "lr": 4.7297297297297305e-07, "epoch": 0.016235413495687467, "percentage": 0.54, "elapsed_time": "0:12:53", "remaining_time": "1 day, 15:30:33"}
{"current_steps": 10, "total_steps": 1479, "loss": 0.3408, "lr": 6.081081081081082e-07, "epoch": 0.020294266869609334, "percentage": 0.68, "elapsed_time": "0:16:11", "remaining_time": "1 day, 15:39:41"}
{"current_steps": 12, "total_steps": 1479, "loss": 0.351, "lr": 7.432432432432434e-07, "epoch": 0.0243531202435312, "percentage": 0.81, "elapsed_time": "0:19:36", "remaining_time": "1 day, 15:57:57"}
{"current_steps": 14, "total_steps": 1479, "loss": 0.3319, "lr": 8.783783783783785e-07, "epoch": 0.028411973617453068, "percentage": 0.95, "elapsed_time": "0:22:54", "remaining_time": "1 day, 15:58:02"}
{"current_steps": 16, "total_steps": 1479, "loss": 0.3631, "lr": 1.0135135135135136e-06, "epoch": 0.032470826991374935, "percentage": 1.08, "elapsed_time": "0:26:03", "remaining_time": "1 day, 15:42:49"}
{"current_steps": 18, "total_steps": 1479, "loss": 0.3369, "lr": 1.148648648648649e-06, "epoch": 0.0365296803652968, "percentage": 1.22, "elapsed_time": "0:29:38", "remaining_time": "1 day, 16:05:38"}
{"current_steps": 20, "total_steps": 1479, "loss": 0.3401, "lr": 1.2837837837837838e-06, "epoch": 0.04058853373921867, "percentage": 1.35, "elapsed_time": "0:32:36", "remaining_time": "1 day, 15:38:18"}
{"current_steps": 22, "total_steps": 1479, "loss": 0.4064, "lr": 1.418918918918919e-06, "epoch": 0.044647387113140535, "percentage": 1.49, "elapsed_time": "0:35:43", "remaining_time": "1 day, 15:26:00"}
{"current_steps": 24, "total_steps": 1479, "loss": 0.3511, "lr": 1.5540540540540541e-06, "epoch": 0.0487062404870624, "percentage": 1.62, "elapsed_time": "0:38:58", "remaining_time": "1 day, 15:23:13"}
{"current_steps": 26, "total_steps": 1479, "loss": 0.3303, "lr": 1.6891891891891894e-06, "epoch": 0.05276509386098427, "percentage": 1.76, "elapsed_time": "0:42:22", "remaining_time": "1 day, 15:28:03"}
{"current_steps": 28, "total_steps": 1479, "loss": 0.3458, "lr": 1.8243243243243245e-06, "epoch": 0.056823947234906136, "percentage": 1.89, "elapsed_time": "0:45:43", "remaining_time": "1 day, 15:29:42"}
{"current_steps": 30, "total_steps": 1479, "loss": 0.3401, "lr": 1.9594594594594595e-06, "epoch": 0.060882800608828, "percentage": 2.03, "elapsed_time": "0:48:48", "remaining_time": "1 day, 15:17:27"}
{"current_steps": 32, "total_steps": 1479, "loss": 0.341, "lr": 2.0945945945945946e-06, "epoch": 0.06494165398274987, "percentage": 2.16, "elapsed_time": "0:51:55", "remaining_time": "1 day, 15:08:04"}
{"current_steps": 34, "total_steps": 1479, "loss": 0.333, "lr": 2.22972972972973e-06, "epoch": 0.06900050735667174, "percentage": 2.3, "elapsed_time": "0:54:55", "remaining_time": "1 day, 14:53:57"}
{"current_steps": 36, "total_steps": 1479, "loss": 0.3389, "lr": 2.364864864864865e-06, "epoch": 0.0730593607305936, "percentage": 2.43, "elapsed_time": "0:57:45", "remaining_time": "1 day, 14:35:04"}
{"current_steps": 38, "total_steps": 1479, "loss": 0.3422, "lr": 2.5e-06, "epoch": 0.07711821410451547, "percentage": 2.57, "elapsed_time": "1:01:03", "remaining_time": "1 day, 14:35:13"}
{"current_steps": 40, "total_steps": 1479, "loss": 0.348, "lr": 2.6351351351351353e-06, "epoch": 0.08117706747843734, "percentage": 2.7, "elapsed_time": "1:04:13", "remaining_time": "1 day, 14:30:23"}
{"current_steps": 42, "total_steps": 1479, "loss": 0.3376, "lr": 2.7702702702702703e-06, "epoch": 0.0852359208523592, "percentage": 2.84, "elapsed_time": "1:07:07", "remaining_time": "1 day, 14:16:35"}
{"current_steps": 44, "total_steps": 1479, "loss": 0.3556, "lr": 2.9054054054054054e-06, "epoch": 0.08929477422628107, "percentage": 2.97, "elapsed_time": "1:10:25", "remaining_time": "1 day, 14:16:49"}
{"current_steps": 46, "total_steps": 1479, "loss": 0.3524, "lr": 3.040540540540541e-06, "epoch": 0.09335362760020294, "percentage": 3.11, "elapsed_time": "1:13:40", "remaining_time": "1 day, 14:15:18"}
{"current_steps": 48, "total_steps": 1479, "loss": 0.3279, "lr": 3.1756756756756755e-06, "epoch": 0.0974124809741248, "percentage": 3.25, "elapsed_time": "1:16:54", "remaining_time": "1 day, 14:12:41"}
{"current_steps": 50, "total_steps": 1479, "loss": 0.3063, "lr": 3.310810810810811e-06, "epoch": 0.10147133434804667, "percentage": 3.38, "elapsed_time": "1:20:10", "remaining_time": "1 day, 14:11:33"}
{"current_steps": 52, "total_steps": 1479, "loss": 0.3453, "lr": 3.445945945945946e-06, "epoch": 0.10553018772196854, "percentage": 3.52, "elapsed_time": "1:23:18", "remaining_time": "1 day, 14:06:05"}
{"current_steps": 54, "total_steps": 1479, "loss": 0.3151, "lr": 3.5810810810810816e-06, "epoch": 0.1095890410958904, "percentage": 3.65, "elapsed_time": "1:26:36", "remaining_time": "1 day, 14:05:26"}
{"current_steps": 56, "total_steps": 1479, "loss": 0.3267, "lr": 3.7162162162162162e-06, "epoch": 0.11364789446981227, "percentage": 3.79, "elapsed_time": "1:29:36", "remaining_time": "1 day, 13:56:54"}
{"current_steps": 58, "total_steps": 1479, "loss": 0.3426, "lr": 3.851351351351352e-06, "epoch": 0.11770674784373414, "percentage": 3.92, "elapsed_time": "1:32:44", "remaining_time": "1 day, 13:52:17"}
{"current_steps": 60, "total_steps": 1479, "loss": 0.3431, "lr": 3.986486486486487e-06, "epoch": 0.121765601217656, "percentage": 4.06, "elapsed_time": "1:35:51", "remaining_time": "1 day, 13:47:04"}
{"current_steps": 62, "total_steps": 1479, "loss": 0.3328, "lr": 4.121621621621622e-06, "epoch": 0.12582445459157787, "percentage": 4.19, "elapsed_time": "1:38:57", "remaining_time": "1 day, 13:41:35"}
{"current_steps": 64, "total_steps": 1479, "loss": 0.318, "lr": 4.256756756756757e-06, "epoch": 0.12988330796549974, "percentage": 4.33, "elapsed_time": "1:42:14", "remaining_time": "1 day, 13:40:36"}
{"current_steps": 66, "total_steps": 1479, "loss": 0.3457, "lr": 4.391891891891892e-06, "epoch": 0.1339421613394216, "percentage": 4.46, "elapsed_time": "1:45:15", "remaining_time": "1 day, 13:33:28"}
{"current_steps": 68, "total_steps": 1479, "loss": 0.329, "lr": 4.527027027027027e-06, "epoch": 0.13800101471334347, "percentage": 4.6, "elapsed_time": "1:48:30", "remaining_time": "1 day, 13:31:35"}
{"current_steps": 70, "total_steps": 1479, "loss": 0.3346, "lr": 4.6621621621621625e-06, "epoch": 0.14205986808726534, "percentage": 4.73, "elapsed_time": "1:51:43", "remaining_time": "1 day, 13:28:46"}
{"current_steps": 72, "total_steps": 1479, "loss": 0.33, "lr": 4.797297297297297e-06, "epoch": 0.1461187214611872, "percentage": 4.87, "elapsed_time": "1:54:50", "remaining_time": "1 day, 13:24:05"}
{"current_steps": 74, "total_steps": 1479, "loss": 0.3111, "lr": 4.932432432432433e-06, "epoch": 0.15017757483510907, "percentage": 5.0, "elapsed_time": "1:57:56", "remaining_time": "1 day, 13:19:24"}
{"current_steps": 76, "total_steps": 1479, "loss": 0.3164, "lr": 5.067567567567568e-06, "epoch": 0.15423642820903094, "percentage": 5.14, "elapsed_time": "2:01:17", "remaining_time": "1 day, 13:18:59"}
{"current_steps": 78, "total_steps": 1479, "loss": 0.3462, "lr": 5.202702702702704e-06, "epoch": 0.1582952815829528, "percentage": 5.27, "elapsed_time": "2:04:47", "remaining_time": "1 day, 13:21:20"}
{"current_steps": 80, "total_steps": 1479, "loss": 0.328, "lr": 5.337837837837838e-06, "epoch": 0.16235413495687467, "percentage": 5.41, "elapsed_time": "2:08:08", "remaining_time": "1 day, 13:20:58"}
{"current_steps": 82, "total_steps": 1479, "loss": 0.3356, "lr": 5.472972972972973e-06, "epoch": 0.16641298833079654, "percentage": 5.54, "elapsed_time": "2:11:13", "remaining_time": "1 day, 13:15:40"}
{"current_steps": 84, "total_steps": 1479, "loss": 0.3317, "lr": 5.608108108108109e-06, "epoch": 0.1704718417047184, "percentage": 5.68, "elapsed_time": "2:14:26", "remaining_time": "1 day, 13:12:46"}
{"current_steps": 86, "total_steps": 1479, "loss": 0.3197, "lr": 5.743243243243244e-06, "epoch": 0.17453069507864027, "percentage": 5.81, "elapsed_time": "2:18:00", "remaining_time": "1 day, 13:15:24"}
{"current_steps": 88, "total_steps": 1479, "loss": 0.3328, "lr": 5.8783783783783786e-06, "epoch": 0.17858954845256214, "percentage": 5.95, "elapsed_time": "2:20:54", "remaining_time": "1 day, 13:07:21"}
{"current_steps": 90, "total_steps": 1479, "loss": 0.3081, "lr": 6.013513513513514e-06, "epoch": 0.182648401826484, "percentage": 6.09, "elapsed_time": "2:24:27", "remaining_time": "1 day, 13:09:24"}
{"current_steps": 92, "total_steps": 1479, "loss": 0.3276, "lr": 6.1486486486486495e-06, "epoch": 0.18670725520040587, "percentage": 6.22, "elapsed_time": "2:27:51", "remaining_time": "1 day, 13:09:14"}
{"current_steps": 94, "total_steps": 1479, "loss": 0.3523, "lr": 6.283783783783784e-06, "epoch": 0.19076610857432774, "percentage": 6.36, "elapsed_time": "2:30:56", "remaining_time": "1 day, 13:04:01"}
{"current_steps": 96, "total_steps": 1479, "loss": 0.3612, "lr": 6.41891891891892e-06, "epoch": 0.1948249619482496, "percentage": 6.49, "elapsed_time": "2:33:50", "remaining_time": "1 day, 12:56:20"}
{"current_steps": 98, "total_steps": 1479, "loss": 0.3106, "lr": 6.554054054054054e-06, "epoch": 0.19888381532217148, "percentage": 6.63, "elapsed_time": "2:37:15", "remaining_time": "1 day, 12:55:58"}
{"current_steps": 100, "total_steps": 1479, "loss": 0.3441, "lr": 6.689189189189191e-06, "epoch": 0.20294266869609334, "percentage": 6.76, "elapsed_time": "2:40:25", "remaining_time": "1 day, 12:52:19"}
{"current_steps": 102, "total_steps": 1479, "loss": 0.334, "lr": 6.824324324324325e-06, "epoch": 0.2070015220700152, "percentage": 6.9, "elapsed_time": "2:43:24", "remaining_time": "1 day, 12:46:07"}
{"current_steps": 104, "total_steps": 1479, "loss": 0.3365, "lr": 6.95945945945946e-06, "epoch": 0.21106037544393708, "percentage": 7.03, "elapsed_time": "2:46:50", "remaining_time": "1 day, 12:45:44"}
{"current_steps": 106, "total_steps": 1479, "loss": 0.3306, "lr": 7.0945945945945946e-06, "epoch": 0.21511922881785894, "percentage": 7.17, "elapsed_time": "2:50:04", "remaining_time": "1 day, 12:42:51"}
{"current_steps": 108, "total_steps": 1479, "loss": 0.3435, "lr": 7.229729729729731e-06, "epoch": 0.2191780821917808, "percentage": 7.3, "elapsed_time": "2:53:13", "remaining_time": "1 day, 12:38:55"}
{"current_steps": 110, "total_steps": 1479, "loss": 0.3698, "lr": 7.3648648648648655e-06, "epoch": 0.22323693556570268, "percentage": 7.44, "elapsed_time": "2:56:26", "remaining_time": "1 day, 12:35:52"}
{"current_steps": 112, "total_steps": 1479, "loss": 0.3209, "lr": 7.500000000000001e-06, "epoch": 0.22729578893962454, "percentage": 7.57, "elapsed_time": "2:59:52", "remaining_time": "1 day, 12:35:27"}
{"current_steps": 114, "total_steps": 1479, "loss": 0.3488, "lr": 7.635135135135135e-06, "epoch": 0.2313546423135464, "percentage": 7.71, "elapsed_time": "3:03:22", "remaining_time": "1 day, 12:35:42"}
{"current_steps": 116, "total_steps": 1479, "loss": 0.3506, "lr": 7.77027027027027e-06, "epoch": 0.23541349568746828, "percentage": 7.84, "elapsed_time": "3:06:37", "remaining_time": "1 day, 12:32:52"}
{"current_steps": 118, "total_steps": 1479, "loss": 0.3481, "lr": 7.905405405405406e-06, "epoch": 0.23947234906139014, "percentage": 7.98, "elapsed_time": "3:10:06", "remaining_time": "1 day, 12:32:36"}
{"current_steps": 120, "total_steps": 1479, "loss": 0.3603, "lr": 8.040540540540541e-06, "epoch": 0.243531202435312, "percentage": 8.11, "elapsed_time": "3:13:12", "remaining_time": "1 day, 12:28:05"}
{"current_steps": 122, "total_steps": 1479, "loss": 0.3351, "lr": 8.175675675675677e-06, "epoch": 0.24759005580923388, "percentage": 8.25, "elapsed_time": "3:16:41", "remaining_time": "1 day, 12:27:43"}
{"current_steps": 124, "total_steps": 1479, "loss": 0.3579, "lr": 8.31081081081081e-06, "epoch": 0.25164890918315574, "percentage": 8.38, "elapsed_time": "3:20:11", "remaining_time": "1 day, 12:27:30"}
{"current_steps": 126, "total_steps": 1479, "loss": 0.3344, "lr": 8.445945945945948e-06, "epoch": 0.2557077625570776, "percentage": 8.52, "elapsed_time": "3:23:42", "remaining_time": "1 day, 12:27:23"}
{"current_steps": 128, "total_steps": 1479, "loss": 0.3671, "lr": 8.581081081081082e-06, "epoch": 0.2597666159309995, "percentage": 8.65, "elapsed_time": "3:27:06", "remaining_time": "1 day, 12:25:54"}
{"current_steps": 130, "total_steps": 1479, "loss": 0.3668, "lr": 8.716216216216217e-06, "epoch": 0.26382546930492135, "percentage": 8.79, "elapsed_time": "3:30:21", "remaining_time": "1 day, 12:22:52"}
{"current_steps": 132, "total_steps": 1479, "loss": 0.3701, "lr": 8.851351351351351e-06, "epoch": 0.2678843226788432, "percentage": 8.92, "elapsed_time": "3:33:40", "remaining_time": "1 day, 12:20:28"}
{"current_steps": 134, "total_steps": 1479, "loss": 0.3718, "lr": 8.986486486486488e-06, "epoch": 0.2719431760527651, "percentage": 9.06, "elapsed_time": "3:36:55", "remaining_time": "1 day, 12:17:19"}
{"current_steps": 136, "total_steps": 1479, "loss": 0.3576, "lr": 9.121621621621622e-06, "epoch": 0.27600202942668695, "percentage": 9.2, "elapsed_time": "3:40:14", "remaining_time": "1 day, 12:14:55"}
{"current_steps": 138, "total_steps": 1479, "loss": 0.3838, "lr": 9.256756756756757e-06, "epoch": 0.2800608828006088, "percentage": 9.33, "elapsed_time": "3:43:40", "remaining_time": "1 day, 12:13:29"}
{"current_steps": 140, "total_steps": 1479, "loss": 0.3782, "lr": 9.391891891891893e-06, "epoch": 0.2841197361745307, "percentage": 9.47, "elapsed_time": "3:47:10", "remaining_time": "1 day, 12:12:46"}
{"current_steps": 142, "total_steps": 1479, "loss": 0.3666, "lr": 9.527027027027028e-06, "epoch": 0.28817858954845255, "percentage": 9.6, "elapsed_time": "3:50:06", "remaining_time": "1 day, 12:06:35"}
{"current_steps": 144, "total_steps": 1479, "loss": 0.392, "lr": 9.662162162162164e-06, "epoch": 0.2922374429223744, "percentage": 9.74, "elapsed_time": "3:52:57", "remaining_time": "1 day, 11:59:47"}
{"current_steps": 146, "total_steps": 1479, "loss": 0.3852, "lr": 9.797297297297298e-06, "epoch": 0.2962962962962963, "percentage": 9.87, "elapsed_time": "3:56:21", "remaining_time": "1 day, 11:57:55"}
{"current_steps": 148, "total_steps": 1479, "loss": 0.3947, "lr": 9.932432432432433e-06, "epoch": 0.30035514967021815, "percentage": 10.01, "elapsed_time": "3:59:46", "remaining_time": "1 day, 11:56:23"}
{"current_steps": 150, "total_steps": 1479, "loss": 0.3778, "lr": 9.999986072170506e-06, "epoch": 0.30441400304414, "percentage": 10.14, "elapsed_time": "4:03:28", "remaining_time": "1 day, 11:57:09"}
{"current_steps": 152, "total_steps": 1479, "loss": 0.3634, "lr": 9.99987465000011e-06, "epoch": 0.3084728564180619, "percentage": 10.28, "elapsed_time": "4:07:04", "remaining_time": "1 day, 11:57:05"}
{"current_steps": 154, "total_steps": 1479, "loss": 0.3629, "lr": 9.999651808142305e-06, "epoch": 0.31253170979198375, "percentage": 10.41, "elapsed_time": "4:10:28", "remaining_time": "1 day, 11:55:00"}
{"current_steps": 156, "total_steps": 1479, "loss": 0.3674, "lr": 9.999317551563011e-06, "epoch": 0.3165905631659056, "percentage": 10.55, "elapsed_time": "4:13:46", "remaining_time": "1 day, 11:52:09"}
{"current_steps": 158, "total_steps": 1479, "loss": 0.3844, "lr": 9.998871887710965e-06, "epoch": 0.3206494165398275, "percentage": 10.68, "elapsed_time": "4:17:04", "remaining_time": "1 day, 11:49:17"}
{"current_steps": 160, "total_steps": 1479, "loss": 0.3986, "lr": 9.998314826517564e-06, "epoch": 0.32470826991374935, "percentage": 10.82, "elapsed_time": "4:20:18", "remaining_time": "1 day, 11:45:52"}
{"current_steps": 162, "total_steps": 1479, "loss": 0.3934, "lr": 9.997646380396633e-06, "epoch": 0.3287671232876712, "percentage": 10.95, "elapsed_time": "4:23:36", "remaining_time": "1 day, 11:42:58"}
{"current_steps": 164, "total_steps": 1479, "loss": 0.3958, "lr": 9.996866564244158e-06, "epoch": 0.3328259766615931, "percentage": 11.09, "elapsed_time": "4:26:51", "remaining_time": "1 day, 11:39:41"}
{"current_steps": 166, "total_steps": 1479, "loss": 0.3697, "lr": 9.995975395437952e-06, "epoch": 0.33688483003551495, "percentage": 11.22, "elapsed_time": "4:30:04", "remaining_time": "1 day, 11:36:11"}
{"current_steps": 168, "total_steps": 1479, "loss": 0.382, "lr": 9.994972893837259e-06, "epoch": 0.3409436834094368, "percentage": 11.36, "elapsed_time": "4:33:14", "remaining_time": "1 day, 11:32:18"}
{"current_steps": 170, "total_steps": 1479, "loss": 0.3989, "lr": 9.993859081782322e-06, "epoch": 0.3450025367833587, "percentage": 11.49, "elapsed_time": "4:36:45", "remaining_time": "1 day, 11:30:58"}
{"current_steps": 172, "total_steps": 1479, "loss": 0.3746, "lr": 9.992633984093886e-06, "epoch": 0.34906139015728055, "percentage": 11.63, "elapsed_time": "4:40:15", "remaining_time": "1 day, 11:29:39"}
{"current_steps": 174, "total_steps": 1479, "loss": 0.3965, "lr": 9.991297628072632e-06, "epoch": 0.3531202435312024, "percentage": 11.76, "elapsed_time": "4:43:38", "remaining_time": "1 day, 11:27:20"}
{"current_steps": 176, "total_steps": 1479, "loss": 0.418, "lr": 9.98985004349858e-06, "epoch": 0.3571790969051243, "percentage": 11.9, "elapsed_time": "4:46:47", "remaining_time": "1 day, 11:23:14"}
{"current_steps": 178, "total_steps": 1479, "loss": 0.3954, "lr": 9.988291262630425e-06, "epoch": 0.36123795027904615, "percentage": 12.04, "elapsed_time": "4:50:20", "remaining_time": "1 day, 11:22:09"}
{"current_steps": 180, "total_steps": 1479, "loss": 0.3944, "lr": 9.986621320204813e-06, "epoch": 0.365296803652968, "percentage": 12.17, "elapsed_time": "4:53:21", "remaining_time": "1 day, 11:17:03"}
{"current_steps": 182, "total_steps": 1479, "loss": 0.396, "lr": 9.984840253435569e-06, "epoch": 0.3693556570268899, "percentage": 12.31, "elapsed_time": "4:56:54", "remaining_time": "1 day, 11:15:52"}
{"current_steps": 184, "total_steps": 1479, "loss": 0.3783, "lr": 9.982948102012866e-06, "epoch": 0.37341451040081175, "percentage": 12.44, "elapsed_time": "5:00:13", "remaining_time": "1 day, 11:13:03"}
{"current_steps": 186, "total_steps": 1479, "loss": 0.4078, "lr": 9.98094490810235e-06, "epoch": 0.3774733637747336, "percentage": 12.58, "elapsed_time": "5:03:07", "remaining_time": "1 day, 11:07:15"}
{"current_steps": 188, "total_steps": 1479, "loss": 0.3892, "lr": 9.978830716344185e-06, "epoch": 0.3815322171486555, "percentage": 12.71, "elapsed_time": "5:06:48", "remaining_time": "1 day, 11:06:49"}
{"current_steps": 190, "total_steps": 1479, "loss": 0.3696, "lr": 9.976605573852071e-06, "epoch": 0.38559107052257735, "percentage": 12.85, "elapsed_time": "5:10:03", "remaining_time": "1 day, 11:03:29"}
{"current_steps": 192, "total_steps": 1479, "loss": 0.405, "lr": 9.974269530212185e-06, "epoch": 0.3896499238964992, "percentage": 12.98, "elapsed_time": "5:13:21", "remaining_time": "1 day, 11:00:29"}
{"current_steps": 194, "total_steps": 1479, "loss": 0.4135, "lr": 9.971822637482085e-06, "epoch": 0.3937087772704211, "percentage": 13.12, "elapsed_time": "5:16:28", "remaining_time": "1 day, 10:56:15"}
{"current_steps": 196, "total_steps": 1479, "loss": 0.4006, "lr": 9.969264950189539e-06, "epoch": 0.39776763064434295, "percentage": 13.25, "elapsed_time": "5:19:48", "remaining_time": "1 day, 10:53:28"}
{"current_steps": 198, "total_steps": 1479, "loss": 0.4188, "lr": 9.966596525331324e-06, "epoch": 0.4018264840182648, "percentage": 13.39, "elapsed_time": "5:22:55", "remaining_time": "1 day, 10:49:16"}
{"current_steps": 200, "total_steps": 1479, "loss": 0.4042, "lr": 9.96381742237194e-06, "epoch": 0.4058853373921867, "percentage": 13.52, "elapsed_time": "5:26:10", "remaining_time": "1 day, 10:45:53"}
{"current_steps": 202, "total_steps": 1479, "loss": 0.3956, "lr": 9.960927703242298e-06, "epoch": 0.40994419076610855, "percentage": 13.66, "elapsed_time": "5:29:26", "remaining_time": "1 day, 10:42:40"}
{"current_steps": 204, "total_steps": 1479, "loss": 0.3813, "lr": 9.957927432338332e-06, "epoch": 0.4140030441400304, "percentage": 13.79, "elapsed_time": "5:32:44", "remaining_time": "1 day, 10:39:35"}
{"current_steps": 206, "total_steps": 1479, "loss": 0.3846, "lr": 9.954816676519569e-06, "epoch": 0.4180618975139523, "percentage": 13.93, "elapsed_time": "5:36:04", "remaining_time": "1 day, 10:36:46"}
{"current_steps": 208, "total_steps": 1479, "loss": 0.3826, "lr": 9.951595505107633e-06, "epoch": 0.42212075088787415, "percentage": 14.06, "elapsed_time": "5:39:05", "remaining_time": "1 day, 10:32:01"}
{"current_steps": 210, "total_steps": 1479, "loss": 0.4118, "lr": 9.948263989884708e-06, "epoch": 0.426179604261796, "percentage": 14.2, "elapsed_time": "5:42:23", "remaining_time": "1 day, 10:28:59"}
{"current_steps": 212, "total_steps": 1479, "loss": 0.3974, "lr": 9.944822205091929e-06, "epoch": 0.4302384576357179, "percentage": 14.33, "elapsed_time": "5:45:25", "remaining_time": "1 day, 10:24:23"}
{"current_steps": 214, "total_steps": 1479, "loss": 0.3784, "lr": 9.94127022742774e-06, "epoch": 0.43429731100963975, "percentage": 14.47, "elapsed_time": "5:48:37", "remaining_time": "1 day, 10:20:47"}
{"current_steps": 216, "total_steps": 1479, "loss": 0.3857, "lr": 9.937608136046171e-06, "epoch": 0.4383561643835616, "percentage": 14.6, "elapsed_time": "5:51:43", "remaining_time": "1 day, 10:16:35"}
{"current_steps": 218, "total_steps": 1479, "loss": 0.4089, "lr": 9.933836012555083e-06, "epoch": 0.4424150177574835, "percentage": 14.74, "elapsed_time": "5:55:10", "remaining_time": "1 day, 10:14:30"}
{"current_steps": 220, "total_steps": 1479, "loss": 0.3649, "lr": 9.929953941014349e-06, "epoch": 0.44647387113140535, "percentage": 14.87, "elapsed_time": "5:58:46", "remaining_time": "1 day, 10:13:12"}
{"current_steps": 222, "total_steps": 1479, "loss": 0.4093, "lr": 9.925962007933975e-06, "epoch": 0.4505327245053272, "percentage": 15.01, "elapsed_time": "6:02:09", "remaining_time": "1 day, 10:10:33"}
{"current_steps": 224, "total_steps": 1479, "loss": 0.3959, "lr": 9.921860302272184e-06, "epoch": 0.4545915778792491, "percentage": 15.15, "elapsed_time": "6:05:44", "remaining_time": "1 day, 10:09:07"}
{"current_steps": 226, "total_steps": 1479, "loss": 0.4271, "lr": 9.917648915433413e-06, "epoch": 0.45865043125317095, "percentage": 15.28, "elapsed_time": "6:09:04", "remaining_time": "1 day, 10:06:17"}
{"current_steps": 228, "total_steps": 1479, "loss": 0.3963, "lr": 9.9133279412663e-06, "epoch": 0.4627092846270928, "percentage": 15.42, "elapsed_time": "6:12:29", "remaining_time": "1 day, 10:03:46"}
{"current_steps": 230, "total_steps": 1479, "loss": 0.4128, "lr": 9.908897476061576e-06, "epoch": 0.4667681380010147, "percentage": 15.55, "elapsed_time": "6:15:49", "remaining_time": "1 day, 10:00:55"}
{"current_steps": 232, "total_steps": 1479, "loss": 0.4032, "lr": 9.904357618549925e-06, "epoch": 0.47082699137493655, "percentage": 15.69, "elapsed_time": "6:19:36", "remaining_time": "1 day, 10:00:25"}
{"current_steps": 234, "total_steps": 1479, "loss": 0.402, "lr": 9.899708469899786e-06, "epoch": 0.4748858447488584, "percentage": 15.82, "elapsed_time": "6:22:47", "remaining_time": "1 day, 9:56:39"}
{"current_steps": 236, "total_steps": 1479, "loss": 0.4079, "lr": 9.894950133715094e-06, "epoch": 0.4789446981227803, "percentage": 15.96, "elapsed_time": "6:25:56", "remaining_time": "1 day, 9:52:46"}
{"current_steps": 238, "total_steps": 1479, "loss": 0.3908, "lr": 9.89008271603297e-06, "epoch": 0.48300355149670215, "percentage": 16.09, "elapsed_time": "6:29:12", "remaining_time": "1 day, 9:49:24"}
{"current_steps": 240, "total_steps": 1479, "loss": 0.418, "lr": 9.885106325321371e-06, "epoch": 0.487062404870624, "percentage": 16.23, "elapsed_time": "6:32:32", "remaining_time": "1 day, 9:46:31"}
{"current_steps": 242, "total_steps": 1479, "loss": 0.4136, "lr": 9.880021072476651e-06, "epoch": 0.4911212582445459, "percentage": 16.36, "elapsed_time": "6:35:44", "remaining_time": "1 day, 9:42:53"}
{"current_steps": 244, "total_steps": 1479, "loss": 0.4037, "lr": 9.874827070821112e-06, "epoch": 0.49518011161846776, "percentage": 16.5, "elapsed_time": "6:38:35", "remaining_time": "1 day, 9:37:26"}
{"current_steps": 246, "total_steps": 1479, "loss": 0.3723, "lr": 9.869524436100458e-06, "epoch": 0.4992389649923896, "percentage": 16.63, "elapsed_time": "6:41:53", "remaining_time": "1 day, 9:34:22"}
{"current_steps": 248, "total_steps": 1479, "loss": 0.3665, "lr": 9.864113286481237e-06, "epoch": 0.5032978183663115, "percentage": 16.77, "elapsed_time": "6:45:05", "remaining_time": "1 day, 9:30:44"}
{"current_steps": 250, "total_steps": 1479, "loss": 0.4243, "lr": 9.85859374254819e-06, "epoch": 0.5073566717402334, "percentage": 16.9, "elapsed_time": "6:48:27", "remaining_time": "1 day, 9:27:56"}
{"current_steps": 252, "total_steps": 1479, "loss": 0.3945, "lr": 9.852965927301573e-06, "epoch": 0.5114155251141552, "percentage": 17.04, "elapsed_time": "6:51:50", "remaining_time": "1 day, 9:25:15"}
{"current_steps": 254, "total_steps": 1479, "loss": 0.4303, "lr": 9.847229966154415e-06, "epoch": 0.5154743784880771, "percentage": 17.17, "elapsed_time": "6:55:22", "remaining_time": "1 day, 9:23:15"}
{"current_steps": 256, "total_steps": 1479, "loss": 0.4223, "lr": 9.841385986929716e-06, "epoch": 0.519533231861999, "percentage": 17.31, "elapsed_time": "6:59:03", "remaining_time": "1 day, 9:21:58"}
{"current_steps": 258, "total_steps": 1479, "loss": 0.4124, "lr": 9.835434119857612e-06, "epoch": 0.5235920852359208, "percentage": 17.44, "elapsed_time": "7:02:01", "remaining_time": "1 day, 9:17:13"}
{"current_steps": 260, "total_steps": 1479, "loss": 0.4156, "lr": 9.829374497572461e-06, "epoch": 0.5276509386098427, "percentage": 17.58, "elapsed_time": "7:05:11", "remaining_time": "1 day, 9:13:31"}
{"current_steps": 262, "total_steps": 1479, "loss": 0.3865, "lr": 9.823207255109891e-06, "epoch": 0.5317097919837646, "percentage": 17.71, "elapsed_time": "7:08:32", "remaining_time": "1 day, 9:10:36"}
{"current_steps": 264, "total_steps": 1479, "loss": 0.381, "lr": 9.816932529903795e-06, "epoch": 0.5357686453576864, "percentage": 17.85, "elapsed_time": "7:11:52", "remaining_time": "1 day, 9:07:36"}
{"current_steps": 266, "total_steps": 1479, "loss": 0.4209, "lr": 9.810550461783261e-06, "epoch": 0.5398274987316083, "percentage": 17.99, "elapsed_time": "7:15:22", "remaining_time": "1 day, 9:05:22"}
{"current_steps": 268, "total_steps": 1479, "loss": 0.3935, "lr": 9.804061192969465e-06, "epoch": 0.5438863521055302, "percentage": 18.12, "elapsed_time": "7:18:26", "remaining_time": "1 day, 9:01:10"}
{"current_steps": 270, "total_steps": 1479, "loss": 0.4055, "lr": 9.797464868072489e-06, "epoch": 0.547945205479452, "percentage": 18.26, "elapsed_time": "7:22:03", "remaining_time": "1 day, 8:59:25"}
{"current_steps": 272, "total_steps": 1479, "loss": 0.4016, "lr": 9.790761634088108e-06, "epoch": 0.5520040588533739, "percentage": 18.39, "elapsed_time": "7:25:36", "remaining_time": "1 day, 8:57:22"}
{"current_steps": 274, "total_steps": 1479, "loss": 0.4066, "lr": 9.78395164039452e-06, "epoch": 0.5560629122272958, "percentage": 18.53, "elapsed_time": "7:28:54", "remaining_time": "1 day, 8:54:14"}
{"current_steps": 276, "total_steps": 1479, "loss": 0.4072, "lr": 9.777035038749002e-06, "epoch": 0.5601217656012176, "percentage": 18.66, "elapsed_time": "7:31:58", "remaining_time": "1 day, 8:50:02"}
{"current_steps": 278, "total_steps": 1479, "loss": 0.385, "lr": 9.77001198328453e-06, "epoch": 0.5641806189751395, "percentage": 18.8, "elapsed_time": "7:35:26", "remaining_time": "1 day, 8:47:35"}
{"current_steps": 280, "total_steps": 1479, "loss": 0.4138, "lr": 9.762882630506366e-06, "epoch": 0.5682394723490614, "percentage": 18.93, "elapsed_time": "7:38:48", "remaining_time": "1 day, 8:44:41"}
{"current_steps": 282, "total_steps": 1479, "loss": 0.4108, "lr": 9.75564713928854e-06, "epoch": 0.5722983257229832, "percentage": 19.07, "elapsed_time": "7:41:58", "remaining_time": "1 day, 8:40:55"}
{"current_steps": 284, "total_steps": 1479, "loss": 0.4105, "lr": 9.748305670870326e-06, "epoch": 0.5763571790969051, "percentage": 19.2, "elapsed_time": "7:45:00", "remaining_time": "1 day, 8:36:36"}
{"current_steps": 286, "total_steps": 1479, "loss": 0.4187, "lr": 9.740858388852652e-06, "epoch": 0.580416032470827, "percentage": 19.34, "elapsed_time": "7:47:54", "remaining_time": "1 day, 8:31:49"}
{"current_steps": 288, "total_steps": 1479, "loss": 0.4026, "lr": 9.733305459194444e-06, "epoch": 0.5844748858447488, "percentage": 19.47, "elapsed_time": "7:50:52", "remaining_time": "1 day, 8:27:17"}
{"current_steps": 290, "total_steps": 1479, "loss": 0.4194, "lr": 9.725647050208936e-06, "epoch": 0.5885337392186707, "percentage": 19.61, "elapsed_time": "7:54:11", "remaining_time": "1 day, 8:24:09"}
{"current_steps": 292, "total_steps": 1479, "loss": 0.4043, "lr": 9.717883332559911e-06, "epoch": 0.5925925925925926, "percentage": 19.74, "elapsed_time": "7:57:34", "remaining_time": "1 day, 8:21:21"}
{"current_steps": 294, "total_steps": 1479, "loss": 0.4279, "lr": 9.710014479257906e-06, "epoch": 0.5966514459665144, "percentage": 19.88, "elapsed_time": "8:00:44", "remaining_time": "1 day, 8:17:39"}
{"current_steps": 296, "total_steps": 1479, "loss": 0.417, "lr": 9.702040665656353e-06, "epoch": 0.6007102993404363, "percentage": 20.01, "elapsed_time": "8:04:00", "remaining_time": "1 day, 8:14:24"}
{"current_steps": 298, "total_steps": 1479, "loss": 0.4399, "lr": 9.693962069447669e-06, "epoch": 0.6047691527143582, "percentage": 20.15, "elapsed_time": "8:07:22", "remaining_time": "1 day, 8:11:31"}
{"current_steps": 300, "total_steps": 1479, "loss": 0.4024, "lr": 9.685778870659301e-06, "epoch": 0.60882800608828, "percentage": 20.28, "elapsed_time": "8:10:31", "remaining_time": "1 day, 8:07:47"}
{"current_steps": 302, "total_steps": 1479, "loss": 0.3912, "lr": 9.677491251649711e-06, "epoch": 0.6128868594622019, "percentage": 20.42, "elapsed_time": "8:13:52", "remaining_time": "1 day, 8:04:46"}
{"current_steps": 304, "total_steps": 1479, "loss": 0.4174, "lr": 9.669099397104314e-06, "epoch": 0.6169457128361238, "percentage": 20.55, "elapsed_time": "8:16:56", "remaining_time": "1 day, 8:00:43"}
{"current_steps": 306, "total_steps": 1479, "loss": 0.3918, "lr": 9.660603494031358e-06, "epoch": 0.6210045662100456, "percentage": 20.69, "elapsed_time": "8:19:55", "remaining_time": "1 day, 7:56:22"}
{"current_steps": 308, "total_steps": 1479, "loss": 0.4157, "lr": 9.652003731757763e-06, "epoch": 0.6250634195839675, "percentage": 20.82, "elapsed_time": "8:23:02", "remaining_time": "1 day, 7:52:31"}
{"current_steps": 310, "total_steps": 1479, "loss": 0.4015, "lr": 9.643300301924902e-06, "epoch": 0.6291222729578894, "percentage": 20.96, "elapsed_time": "8:26:13", "remaining_time": "1 day, 7:48:57"}
{"current_steps": 312, "total_steps": 1479, "loss": 0.4128, "lr": 9.634493398484319e-06, "epoch": 0.6331811263318112, "percentage": 21.1, "elapsed_time": "8:29:21", "remaining_time": "1 day, 7:45:11"}
{"current_steps": 314, "total_steps": 1479, "loss": 0.3874, "lr": 9.625583217693419e-06, "epoch": 0.6372399797057331, "percentage": 21.23, "elapsed_time": "8:32:25", "remaining_time": "1 day, 7:41:12"}
{"current_steps": 316, "total_steps": 1479, "loss": 0.4219, "lr": 9.616569958111097e-06, "epoch": 0.641298833079655, "percentage": 21.37, "elapsed_time": "8:35:16", "remaining_time": "1 day, 7:36:25"}
{"current_steps": 318, "total_steps": 1479, "loss": 0.4138, "lr": 9.607453820593297e-06, "epoch": 0.6453576864535768, "percentage": 21.5, "elapsed_time": "8:38:30", "remaining_time": "1 day, 7:33:03"}
{"current_steps": 320, "total_steps": 1479, "loss": 0.4075, "lr": 9.598235008288551e-06, "epoch": 0.6494165398274987, "percentage": 21.64, "elapsed_time": "8:41:46", "remaining_time": "1 day, 7:29:49"}
{"current_steps": 322, "total_steps": 1479, "loss": 0.4111, "lr": 9.58891372663345e-06, "epoch": 0.6534753932014206, "percentage": 21.77, "elapsed_time": "8:45:08", "remaining_time": "1 day, 7:26:54"}
{"current_steps": 324, "total_steps": 1479, "loss": 0.4358, "lr": 9.579490183348052e-06, "epoch": 0.6575342465753424, "percentage": 21.91, "elapsed_time": "8:48:36", "remaining_time": "1 day, 7:24:22"}
{"current_steps": 326, "total_steps": 1479, "loss": 0.3986, "lr": 9.56996458843128e-06, "epoch": 0.6615930999492643, "percentage": 22.04, "elapsed_time": "8:52:13", "remaining_time": "1 day, 7:22:21"}
{"current_steps": 328, "total_steps": 1479, "loss": 0.4075, "lr": 9.56033715415621e-06, "epoch": 0.6656519533231862, "percentage": 22.18, "elapsed_time": "8:55:13", "remaining_time": "1 day, 7:18:10"}
{"current_steps": 330, "total_steps": 1479, "loss": 0.4071, "lr": 9.550608095065367e-06, "epoch": 0.669710806697108, "percentage": 22.31, "elapsed_time": "8:58:29", "remaining_time": "1 day, 7:14:57"}
{"current_steps": 332, "total_steps": 1479, "loss": 0.3957, "lr": 9.540777627965933e-06, "epoch": 0.6737696600710299, "percentage": 22.45, "elapsed_time": "9:01:57", "remaining_time": "1 day, 7:12:21"}
{"current_steps": 334, "total_steps": 1479, "loss": 0.4158, "lr": 9.53084597192491e-06, "epoch": 0.6778285134449518, "percentage": 22.58, "elapsed_time": "9:04:54", "remaining_time": "1 day, 7:08:01"}
{"current_steps": 336, "total_steps": 1479, "loss": 0.4277, "lr": 9.520813348264252e-06, "epoch": 0.6818873668188736, "percentage": 22.72, "elapsed_time": "9:08:19", "remaining_time": "1 day, 7:05:15"}
{"current_steps": 338, "total_steps": 1479, "loss": 0.3995, "lr": 9.510679980555922e-06, "epoch": 0.6859462201927955, "percentage": 22.85, "elapsed_time": "9:11:17", "remaining_time": "1 day, 7:01:00"}
{"current_steps": 340, "total_steps": 1479, "loss": 0.4005, "lr": 9.500446094616911e-06, "epoch": 0.6900050735667174, "percentage": 22.99, "elapsed_time": "9:14:28", "remaining_time": "1 day, 6:57:28"}
{"current_steps": 342, "total_steps": 1479, "loss": 0.4169, "lr": 9.490111918504213e-06, "epoch": 0.6940639269406392, "percentage": 23.12, "elapsed_time": "9:17:33", "remaining_time": "1 day, 6:53:38"}
{"current_steps": 344, "total_steps": 1479, "loss": 0.3986, "lr": 9.479677682509737e-06, "epoch": 0.6981227803145611, "percentage": 23.26, "elapsed_time": "9:20:42", "remaining_time": "1 day, 6:50:00"}
{"current_steps": 346, "total_steps": 1479, "loss": 0.3923, "lr": 9.469143619155172e-06, "epoch": 0.702181633688483, "percentage": 23.39, "elapsed_time": "9:24:06", "remaining_time": "1 day, 6:47:13"}
{"current_steps": 348, "total_steps": 1479, "loss": 0.4043, "lr": 9.458509963186815e-06, "epoch": 0.7062404870624048, "percentage": 23.53, "elapsed_time": "9:27:29", "remaining_time": "1 day, 6:44:21"}
{"current_steps": 350, "total_steps": 1479, "loss": 0.4066, "lr": 9.44777695157033e-06, "epoch": 0.7102993404363267, "percentage": 23.66, "elapsed_time": "9:30:59", "remaining_time": "1 day, 6:41:52"}
{"current_steps": 352, "total_steps": 1479, "loss": 0.4146, "lr": 9.436944823485475e-06, "epoch": 0.7143581938102486, "percentage": 23.8, "elapsed_time": "9:34:18", "remaining_time": "1 day, 6:38:46"}
{"current_steps": 354, "total_steps": 1479, "loss": 0.4206, "lr": 9.426013820320764e-06, "epoch": 0.7184170471841704, "percentage": 23.94, "elapsed_time": "9:37:36", "remaining_time": "1 day, 6:35:38"}
{"current_steps": 356, "total_steps": 1479, "loss": 0.3991, "lr": 9.414984185668097e-06, "epoch": 0.7224759005580923, "percentage": 24.07, "elapsed_time": "9:40:56", "remaining_time": "1 day, 6:32:33"}
{"current_steps": 358, "total_steps": 1479, "loss": 0.4359, "lr": 9.403856165317322e-06, "epoch": 0.7265347539320142, "percentage": 24.21, "elapsed_time": "9:44:02", "remaining_time": "1 day, 6:28:47"}
{"current_steps": 360, "total_steps": 1479, "loss": 0.4415, "lr": 9.392630007250769e-06, "epoch": 0.730593607305936, "percentage": 24.34, "elapsed_time": "9:47:13", "remaining_time": "1 day, 6:25:17"}
{"current_steps": 362, "total_steps": 1479, "loss": 0.4219, "lr": 9.381305961637713e-06, "epoch": 0.7346524606798579, "percentage": 24.48, "elapsed_time": "9:50:15", "remaining_time": "1 day, 6:21:18"}
{"current_steps": 364, "total_steps": 1479, "loss": 0.4308, "lr": 9.369884280828806e-06, "epoch": 0.7387113140537798, "percentage": 24.61, "elapsed_time": "9:53:39", "remaining_time": "1 day, 6:18:28"}
{"current_steps": 366, "total_steps": 1479, "loss": 0.4376, "lr": 9.358365219350448e-06, "epoch": 0.7427701674277016, "percentage": 24.75, "elapsed_time": "9:57:02", "remaining_time": "1 day, 6:15:35"}
{"current_steps": 368, "total_steps": 1479, "loss": 0.4331, "lr": 9.346749033899121e-06, "epoch": 0.7468290208016235, "percentage": 24.88, "elapsed_time": "10:00:18", "remaining_time": "1 day, 6:12:19"}
{"current_steps": 370, "total_steps": 1479, "loss": 0.3992, "lr": 9.335035983335667e-06, "epoch": 0.7508878741755454, "percentage": 25.02, "elapsed_time": "10:03:25", "remaining_time": "1 day, 6:08:38"}
{"current_steps": 372, "total_steps": 1479, "loss": 0.4044, "lr": 9.323226328679512e-06, "epoch": 0.7549467275494672, "percentage": 25.15, "elapsed_time": "10:06:50", "remaining_time": "1 day, 6:05:50"}
{"current_steps": 374, "total_steps": 1479, "loss": 0.3954, "lr": 9.311320333102864e-06, "epoch": 0.7590055809233891, "percentage": 25.29, "elapsed_time": "10:10:01", "remaining_time": "1 day, 6:02:20"}
{"current_steps": 376, "total_steps": 1479, "loss": 0.3998, "lr": 9.299318261924834e-06, "epoch": 0.763064434297311, "percentage": 25.42, "elapsed_time": "10:13:08", "remaining_time": "1 day, 5:58:39"}
{"current_steps": 378, "total_steps": 1479, "loss": 0.4042, "lr": 9.287220382605532e-06, "epoch": 0.7671232876712328, "percentage": 25.56, "elapsed_time": "10:16:48", "remaining_time": "1 day, 5:56:33"}
{"current_steps": 380, "total_steps": 1479, "loss": 0.4067, "lr": 9.275026964740101e-06, "epoch": 0.7711821410451547, "percentage": 25.69, "elapsed_time": "10:20:12", "remaining_time": "1 day, 5:53:43"}
{"current_steps": 382, "total_steps": 1479, "loss": 0.4183, "lr": 9.262738280052715e-06, "epoch": 0.7752409944190766, "percentage": 25.83, "elapsed_time": "10:23:22", "remaining_time": "1 day, 5:50:09"}
{"current_steps": 384, "total_steps": 1479, "loss": 0.4409, "lr": 9.250354602390523e-06, "epoch": 0.7792998477929984, "percentage": 25.96, "elapsed_time": "10:26:22", "remaining_time": "1 day, 5:46:10"}
{"current_steps": 386, "total_steps": 1479, "loss": 0.4029, "lr": 9.237876207717538e-06, "epoch": 0.7833587011669203, "percentage": 26.1, "elapsed_time": "10:29:40", "remaining_time": "1 day, 5:42:59"}
{"current_steps": 388, "total_steps": 1479, "loss": 0.4178, "lr": 9.225303374108503e-06, "epoch": 0.7874175545408422, "percentage": 26.23, "elapsed_time": "10:33:06", "remaining_time": "1 day, 5:40:11"}
{"current_steps": 390, "total_steps": 1479, "loss": 0.4197, "lr": 9.212636381742676e-06, "epoch": 0.791476407914764, "percentage": 26.37, "elapsed_time": "10:36:16", "remaining_time": "1 day, 5:36:39"}
{"current_steps": 392, "total_steps": 1479, "loss": 0.4173, "lr": 9.199875512897602e-06, "epoch": 0.7955352612886859, "percentage": 26.5, "elapsed_time": "10:39:25", "remaining_time": "1 day, 5:33:07"}
{"current_steps": 394, "total_steps": 1479, "loss": 0.4145, "lr": 9.187021051942814e-06, "epoch": 0.7995941146626078, "percentage": 26.64, "elapsed_time": "10:42:49", "remaining_time": "1 day, 5:30:12"}
{"current_steps": 396, "total_steps": 1479, "loss": 0.4181, "lr": 9.174073285333498e-06, "epoch": 0.8036529680365296, "percentage": 26.77, "elapsed_time": "10:46:03", "remaining_time": "1 day, 5:26:51"}
{"current_steps": 398, "total_steps": 1479, "loss": 0.3949, "lr": 9.161032501604106e-06, "epoch": 0.8077118214104515, "percentage": 26.91, "elapsed_time": "10:49:05", "remaining_time": "1 day, 5:22:59"}
{"current_steps": 400, "total_steps": 1479, "loss": 0.4076, "lr": 9.147898991361936e-06, "epoch": 0.8117706747843734, "percentage": 27.05, "elapsed_time": "10:52:08", "remaining_time": "1 day, 5:19:09"}
{"current_steps": 402, "total_steps": 1479, "loss": 0.4233, "lr": 9.134673047280644e-06, "epoch": 0.8158295281582952, "percentage": 27.18, "elapsed_time": "10:55:30", "remaining_time": "1 day, 5:16:09"}
{"current_steps": 404, "total_steps": 1479, "loss": 0.4127, "lr": 9.121354964093732e-06, "epoch": 0.8198883815322171, "percentage": 27.32, "elapsed_time": "10:58:36", "remaining_time": "1 day, 5:12:28"}
{"current_steps": 406, "total_steps": 1479, "loss": 0.4226, "lr": 9.107945038587974e-06, "epoch": 0.823947234906139, "percentage": 27.45, "elapsed_time": "11:01:53", "remaining_time": "1 day, 5:09:17"}
{"current_steps": 408, "total_steps": 1479, "loss": 0.4033, "lr": 9.094443569596802e-06, "epoch": 0.8280060882800608, "percentage": 27.59, "elapsed_time": "11:05:09", "remaining_time": "1 day, 5:06:01"}
{"current_steps": 410, "total_steps": 1479, "loss": 0.4088, "lr": 9.08085085799365e-06, "epoch": 0.8320649416539827, "percentage": 27.72, "elapsed_time": "11:08:15", "remaining_time": "1 day, 5:02:21"}
{"current_steps": 412, "total_steps": 1479, "loss": 0.4124, "lr": 9.067167206685248e-06, "epoch": 0.8361237950279046, "percentage": 27.86, "elapsed_time": "11:11:26", "remaining_time": "1 day, 4:58:54"}
{"current_steps": 414, "total_steps": 1479, "loss": 0.434, "lr": 9.05339292060487e-06, "epoch": 0.8401826484018264, "percentage": 27.99, "elapsed_time": "11:14:31", "remaining_time": "1 day, 4:55:10"}
{"current_steps": 416, "total_steps": 1479, "loss": 0.425, "lr": 9.039528306705543e-06, "epoch": 0.8442415017757483, "percentage": 28.13, "elapsed_time": "11:17:56", "remaining_time": "1 day, 4:52:19"}
{"current_steps": 418, "total_steps": 1479, "loss": 0.4423, "lr": 9.025573673953201e-06, "epoch": 0.8483003551496702, "percentage": 28.26, "elapsed_time": "11:20:58", "remaining_time": "1 day, 4:48:30"}
{"current_steps": 420, "total_steps": 1479, "loss": 0.3987, "lr": 9.011529333319804e-06, "epoch": 0.852359208523592, "percentage": 28.4, "elapsed_time": "11:24:18", "remaining_time": "1 day, 4:45:25"}
{"current_steps": 422, "total_steps": 1479, "loss": 0.3908, "lr": 8.997395597776404e-06, "epoch": 0.8564180618975139, "percentage": 28.53, "elapsed_time": "11:27:45", "remaining_time": "1 day, 4:42:38"}
{"current_steps": 424, "total_steps": 1479, "loss": 0.4055, "lr": 8.98317278228618e-06, "epoch": 0.8604769152714358, "percentage": 28.67, "elapsed_time": "11:31:03", "remaining_time": "1 day, 4:39:29"}
{"current_steps": 426, "total_steps": 1479, "loss": 0.4241, "lr": 8.96886120379741e-06, "epoch": 0.8645357686453576, "percentage": 28.8, "elapsed_time": "11:34:10", "remaining_time": "1 day, 4:35:53"}
{"current_steps": 428, "total_steps": 1479, "loss": 0.4343, "lr": 8.954461181236406e-06, "epoch": 0.8685946220192795, "percentage": 28.94, "elapsed_time": "11:37:20", "remaining_time": "1 day, 4:32:23"}
{"current_steps": 430, "total_steps": 1479, "loss": 0.4012, "lr": 8.939973035500418e-06, "epoch": 0.8726534753932014, "percentage": 29.07, "elapsed_time": "11:40:06", "remaining_time": "1 day, 4:27:57"}
{"current_steps": 432, "total_steps": 1479, "loss": 0.4116, "lr": 8.925397089450473e-06, "epoch": 0.8767123287671232, "percentage": 29.21, "elapsed_time": "11:43:19", "remaining_time": "1 day, 4:24:34"}
{"current_steps": 434, "total_steps": 1479, "loss": 0.4128, "lr": 8.910733667904186e-06, "epoch": 0.8807711821410451, "percentage": 29.34, "elapsed_time": "11:46:17", "remaining_time": "1 day, 4:20:39"}
{"current_steps": 436, "total_steps": 1479, "loss": 0.4332, "lr": 8.895983097628515e-06, "epoch": 0.884830035514967, "percentage": 29.48, "elapsed_time": "11:49:21", "remaining_time": "1 day, 4:16:55"}
{"current_steps": 438, "total_steps": 1479, "loss": 0.4005, "lr": 8.88114570733249e-06, "epoch": 0.8888888888888888, "percentage": 29.61, "elapsed_time": "11:52:29", "remaining_time": "1 day, 4:13:24"}
{"current_steps": 440, "total_steps": 1479, "loss": 0.4233, "lr": 8.866221827659876e-06, "epoch": 0.8929477422628107, "percentage": 29.75, "elapsed_time": "11:55:24", "remaining_time": "1 day, 4:09:21"}
{"current_steps": 442, "total_steps": 1479, "loss": 0.4133, "lr": 8.851211791181813e-06, "epoch": 0.8970065956367326, "percentage": 29.89, "elapsed_time": "11:58:39", "remaining_time": "1 day, 4:06:04"}
{"current_steps": 444, "total_steps": 1479, "loss": 0.447, "lr": 8.8361159323894e-06, "epoch": 0.9010654490106544, "percentage": 30.02, "elapsed_time": "12:01:38", "remaining_time": "1 day, 4:02:13"}
{"current_steps": 446, "total_steps": 1479, "loss": 0.3884, "lr": 8.820934587686247e-06, "epoch": 0.9051243023845763, "percentage": 30.16, "elapsed_time": "12:04:57", "remaining_time": "1 day, 3:59:06"}
{"current_steps": 448, "total_steps": 1479, "loss": 0.4139, "lr": 8.805668095380969e-06, "epoch": 0.9091831557584982, "percentage": 30.29, "elapsed_time": "12:08:26", "remaining_time": "1 day, 3:56:23"}
{"current_steps": 450, "total_steps": 1479, "loss": 0.4258, "lr": 8.790316795679654e-06, "epoch": 0.91324200913242, "percentage": 30.43, "elapsed_time": "12:11:35", "remaining_time": "1 day, 3:52:53"}
{"current_steps": 452, "total_steps": 1479, "loss": 0.4039, "lr": 8.774881030678284e-06, "epoch": 0.9173008625063419, "percentage": 30.56, "elapsed_time": "12:14:52", "remaining_time": "1 day, 3:49:42"}
{"current_steps": 454, "total_steps": 1479, "loss": 0.4186, "lr": 8.759361144355103e-06, "epoch": 0.9213597158802638, "percentage": 30.7, "elapsed_time": "12:18:20", "remaining_time": "1 day, 3:46:57"}
{"current_steps": 456, "total_steps": 1479, "loss": 0.43, "lr": 8.74375748256296e-06, "epoch": 0.9254185692541856, "percentage": 30.83, "elapsed_time": "12:21:27", "remaining_time": "1 day, 3:43:24"}
{"current_steps": 458, "total_steps": 1479, "loss": 0.3952, "lr": 8.728070393021595e-06, "epoch": 0.9294774226281075, "percentage": 30.97, "elapsed_time": "12:24:58", "remaining_time": "1 day, 3:40:45"}
{"current_steps": 460, "total_steps": 1479, "loss": 0.419, "lr": 8.712300225309894e-06, "epoch": 0.9335362760020294, "percentage": 31.1, "elapsed_time": "12:28:18", "remaining_time": "1 day, 3:37:40"}
{"current_steps": 462, "total_steps": 1479, "loss": 0.4295, "lr": 8.6964473308581e-06, "epoch": 0.9375951293759512, "percentage": 31.24, "elapsed_time": "12:31:32", "remaining_time": "1 day, 3:34:21"}
{"current_steps": 464, "total_steps": 1479, "loss": 0.3994, "lr": 8.680512062939976e-06, "epoch": 0.9416539827498731, "percentage": 31.37, "elapsed_time": "12:34:53", "remaining_time": "1 day, 3:31:19"}
{"current_steps": 466, "total_steps": 1479, "loss": 0.4305, "lr": 8.664494776664942e-06, "epoch": 0.945712836123795, "percentage": 31.51, "elapsed_time": "12:38:28", "remaining_time": "1 day, 3:28:47"}
{"current_steps": 468, "total_steps": 1479, "loss": 0.4247, "lr": 8.64839582897015e-06, "epoch": 0.9497716894977168, "percentage": 31.64, "elapsed_time": "12:41:44", "remaining_time": "1 day, 3:25:32"}
{"current_steps": 470, "total_steps": 1479, "loss": 0.414, "lr": 8.63221557861254e-06, "epoch": 0.9538305428716387, "percentage": 31.78, "elapsed_time": "12:45:01", "remaining_time": "1 day, 3:22:21"}
{"current_steps": 472, "total_steps": 1479, "loss": 0.3944, "lr": 8.615954386160836e-06, "epoch": 0.9578893962455606, "percentage": 31.91, "elapsed_time": "12:48:08", "remaining_time": "1 day, 3:18:48"}
{"current_steps": 474, "total_steps": 1479, "loss": 0.4138, "lr": 8.599612613987522e-06, "epoch": 0.9619482496194824, "percentage": 32.05, "elapsed_time": "12:51:33", "remaining_time": "1 day, 3:15:52"}
{"current_steps": 476, "total_steps": 1479, "loss": 0.408, "lr": 8.583190626260754e-06, "epoch": 0.9660071029934043, "percentage": 32.18, "elapsed_time": "12:54:53", "remaining_time": "1 day, 3:12:48"}
{"current_steps": 478, "total_steps": 1479, "loss": 0.4326, "lr": 8.566688788936254e-06, "epoch": 0.9700659563673262, "percentage": 32.32, "elapsed_time": "12:57:56", "remaining_time": "1 day, 3:09:07"}
{"current_steps": 480, "total_steps": 1479, "loss": 0.4095, "lr": 8.550107469749159e-06, "epoch": 0.974124809741248, "percentage": 32.45, "elapsed_time": "13:01:19", "remaining_time": "1 day, 3:06:07"}
{"current_steps": 482, "total_steps": 1479, "loss": 0.4019, "lr": 8.533447038205805e-06, "epoch": 0.9781836631151699, "percentage": 32.59, "elapsed_time": "13:04:30", "remaining_time": "1 day, 3:02:43"}
{"current_steps": 484, "total_steps": 1479, "loss": 0.4301, "lr": 8.516707865575515e-06, "epoch": 0.9822425164890918, "percentage": 32.72, "elapsed_time": "13:07:46", "remaining_time": "1 day, 2:59:29"}
{"current_steps": 486, "total_steps": 1479, "loss": 0.3998, "lr": 8.499890324882323e-06, "epoch": 0.9863013698630136, "percentage": 32.86, "elapsed_time": "13:11:12", "remaining_time": "1 day, 2:56:36"}
{"current_steps": 488, "total_steps": 1479, "loss": 0.4422, "lr": 8.482994790896645e-06, "epoch": 0.9903602232369355, "percentage": 33.0, "elapsed_time": "13:14:06", "remaining_time": "1 day, 2:52:37"}
{"current_steps": 490, "total_steps": 1479, "loss": 0.4122, "lr": 8.466021640126946e-06, "epoch": 0.9944190766108574, "percentage": 33.13, "elapsed_time": "13:17:23", "remaining_time": "1 day, 2:49:26"}
{"current_steps": 492, "total_steps": 1479, "loss": 0.4137, "lr": 8.448971250811337e-06, "epoch": 0.9984779299847792, "percentage": 33.27, "elapsed_time": "13:20:49", "remaining_time": "1 day, 2:46:32"}
{"current_steps": 494, "total_steps": 1479, "loss": 0.3068, "lr": 8.431844002909153e-06, "epoch": 1.002029426686961, "percentage": 33.4, "elapsed_time": "13:25:25", "remaining_time": "1 day, 2:45:58"}
{"current_steps": 496, "total_steps": 1479, "loss": 0.2196, "lr": 8.414640278092485e-06, "epoch": 1.0060882800608828, "percentage": 33.54, "elapsed_time": "13:28:30", "remaining_time": "1 day, 2:42:21"}
{"current_steps": 498, "total_steps": 1479, "loss": 0.214, "lr": 8.397360459737673e-06, "epoch": 1.0101471334348047, "percentage": 33.67, "elapsed_time": "13:31:29", "remaining_time": "1 day, 2:38:31"}
{"current_steps": 500, "total_steps": 1479, "loss": 0.1968, "lr": 8.38000493291676e-06, "epoch": 1.0142059868087265, "percentage": 33.81, "elapsed_time": "13:34:39", "remaining_time": "1 day, 2:35:06"}
{"current_steps": 502, "total_steps": 1479, "loss": 0.2037, "lr": 8.362574084388921e-06, "epoch": 1.0182648401826484, "percentage": 33.94, "elapsed_time": "13:38:07", "remaining_time": "1 day, 2:32:15"}
{"current_steps": 504, "total_steps": 1479, "loss": 0.1732, "lr": 8.34506830259183e-06, "epoch": 1.0223236935565703, "percentage": 34.08, "elapsed_time": "13:41:29", "remaining_time": "1 day, 2:29:12"}
{"current_steps": 506, "total_steps": 1479, "loss": 0.198, "lr": 8.327487977633013e-06, "epoch": 1.0263825469304921, "percentage": 34.21, "elapsed_time": "13:44:26", "remaining_time": "1 day, 2:25:20"}
{"current_steps": 508, "total_steps": 1479, "loss": 0.1968, "lr": 8.309833501281159e-06, "epoch": 1.030441400304414, "percentage": 34.35, "elapsed_time": "13:47:46", "remaining_time": "1 day, 2:22:13"}
{"current_steps": 510, "total_steps": 1479, "loss": 0.2058, "lr": 8.292105266957372e-06, "epoch": 1.0345002536783359, "percentage": 34.48, "elapsed_time": "13:51:14", "remaining_time": "1 day, 2:19:21"}
{"current_steps": 512, "total_steps": 1479, "loss": 0.1837, "lr": 8.274303669726427e-06, "epoch": 1.0385591070522577, "percentage": 34.62, "elapsed_time": "13:55:01", "remaining_time": "1 day, 2:17:04"}
{"current_steps": 514, "total_steps": 1479, "loss": 0.1937, "lr": 8.256429106287944e-06, "epoch": 1.0426179604261796, "percentage": 34.75, "elapsed_time": "13:58:13", "remaining_time": "1 day, 2:13:42"}
{"current_steps": 516, "total_steps": 1479, "loss": 0.2044, "lr": 8.238481974967567e-06, "epoch": 1.0466768138001015, "percentage": 34.89, "elapsed_time": "14:01:01", "remaining_time": "1 day, 2:09:34"}
{"current_steps": 518, "total_steps": 1479, "loss": 0.2025, "lr": 8.220462675708075e-06, "epoch": 1.0507356671740233, "percentage": 35.02, "elapsed_time": "14:04:08", "remaining_time": "1 day, 2:06:03"}
{"current_steps": 520, "total_steps": 1479, "loss": 0.1944, "lr": 8.202371610060471e-06, "epoch": 1.0547945205479452, "percentage": 35.16, "elapsed_time": "14:07:19", "remaining_time": "1 day, 2:02:40"}
{"current_steps": 522, "total_steps": 1479, "loss": 0.1949, "lr": 8.184209181175038e-06, "epoch": 1.058853373921867, "percentage": 35.29, "elapsed_time": "14:10:29", "remaining_time": "1 day, 1:59:13"}
{"current_steps": 524, "total_steps": 1479, "loss": 0.1923, "lr": 8.165975793792355e-06, "epoch": 1.062912227295789, "percentage": 35.43, "elapsed_time": "14:13:40", "remaining_time": "1 day, 1:55:50"}
{"current_steps": 526, "total_steps": 1479, "loss": 0.18, "lr": 8.14767185423427e-06, "epoch": 1.0669710806697108, "percentage": 35.56, "elapsed_time": "14:16:49", "remaining_time": "1 day, 1:52:22"}
{"current_steps": 528, "total_steps": 1479, "loss": 0.199, "lr": 8.129297770394855e-06, "epoch": 1.0710299340436327, "percentage": 35.7, "elapsed_time": "14:19:51", "remaining_time": "1 day, 1:48:43"}
{"current_steps": 530, "total_steps": 1479, "loss": 0.1758, "lr": 8.11085395173131e-06, "epoch": 1.0750887874175545, "percentage": 35.84, "elapsed_time": "14:23:15", "remaining_time": "1 day, 1:45:44"}
{"current_steps": 532, "total_steps": 1479, "loss": 0.183, "lr": 8.092340809254844e-06, "epoch": 1.0791476407914764, "percentage": 35.97, "elapsed_time": "14:26:32", "remaining_time": "1 day, 1:42:29"}
{"current_steps": 534, "total_steps": 1479, "loss": 0.2001, "lr": 8.073758755521506e-06, "epoch": 1.0832064941653983, "percentage": 36.11, "elapsed_time": "14:29:37", "remaining_time": "1 day, 1:38:56"}
{"current_steps": 536, "total_steps": 1479, "loss": 0.1854, "lr": 8.055108204623001e-06, "epoch": 1.0872653475393201, "percentage": 36.24, "elapsed_time": "14:33:00", "remaining_time": "1 day, 1:35:55"}
{"current_steps": 538, "total_steps": 1479, "loss": 0.1887, "lr": 8.03638957217746e-06, "epoch": 1.091324200913242, "percentage": 36.38, "elapsed_time": "14:35:56", "remaining_time": "1 day, 1:32:04"}
{"current_steps": 540, "total_steps": 1479, "loss": 0.206, "lr": 8.017603275320176e-06, "epoch": 1.0953830542871639, "percentage": 36.51, "elapsed_time": "14:39:31", "remaining_time": "1 day, 1:29:23"}
{"current_steps": 542, "total_steps": 1479, "loss": 0.1852, "lr": 7.998749732694308e-06, "epoch": 1.0994419076610857, "percentage": 36.65, "elapsed_time": "14:42:43", "remaining_time": "1 day, 1:26:02"}
{"current_steps": 544, "total_steps": 1479, "loss": 0.1792, "lr": 7.979829364441555e-06, "epoch": 1.1035007610350076, "percentage": 36.78, "elapsed_time": "14:45:58", "remaining_time": "1 day, 1:22:46"}
{"current_steps": 546, "total_steps": 1479, "loss": 0.1914, "lr": 7.960842592192792e-06, "epoch": 1.1075596144089295, "percentage": 36.92, "elapsed_time": "14:49:05", "remaining_time": "1 day, 1:19:15"}
{"current_steps": 548, "total_steps": 1479, "loss": 0.1947, "lr": 7.94178983905867e-06, "epoch": 1.1116184677828513, "percentage": 37.05, "elapsed_time": "14:52:19", "remaining_time": "1 day, 1:15:58"}
{"current_steps": 550, "total_steps": 1479, "loss": 0.1901, "lr": 7.922671529620192e-06, "epoch": 1.1156773211567732, "percentage": 37.19, "elapsed_time": "14:55:37", "remaining_time": "1 day, 1:12:47"}
{"current_steps": 552, "total_steps": 1479, "loss": 0.1732, "lr": 7.903488089919253e-06, "epoch": 1.119736174530695, "percentage": 37.32, "elapsed_time": "14:58:42", "remaining_time": "1 day, 1:09:13"}
{"current_steps": 554, "total_steps": 1479, "loss": 0.2106, "lr": 7.88423994744914e-06, "epoch": 1.123795027904617, "percentage": 37.46, "elapsed_time": "15:01:52", "remaining_time": "1 day, 1:05:49"}
{"current_steps": 556, "total_steps": 1479, "loss": 0.1868, "lr": 7.864927531145012e-06, "epoch": 1.1278538812785388, "percentage": 37.59, "elapsed_time": "15:05:24", "remaining_time": "1 day, 1:03:03"}
{"current_steps": 558, "total_steps": 1479, "loss": 0.1814, "lr": 7.845551271374333e-06, "epoch": 1.1319127346524607, "percentage": 37.73, "elapsed_time": "15:08:45", "remaining_time": "1 day, 0:59:56"}
{"current_steps": 560, "total_steps": 1479, "loss": 0.1851, "lr": 7.82611159992729e-06, "epoch": 1.1359715880263825, "percentage": 37.86, "elapsed_time": "15:12:03", "remaining_time": "1 day, 0:56:45"}
{"current_steps": 562, "total_steps": 1479, "loss": 0.196, "lr": 7.80660895000717e-06, "epoch": 1.1400304414003044, "percentage": 38.0, "elapsed_time": "15:15:03", "remaining_time": "1 day, 0:53:05"}
{"current_steps": 564, "total_steps": 1479, "loss": 0.1721, "lr": 7.787043756220698e-06, "epoch": 1.1440892947742263, "percentage": 38.13, "elapsed_time": "15:18:46", "remaining_time": "1 day, 0:50:33"}
{"current_steps": 566, "total_steps": 1479, "loss": 0.1848, "lr": 7.767416454568358e-06, "epoch": 1.1481481481481481, "percentage": 38.27, "elapsed_time": "15:21:49", "remaining_time": "1 day, 0:46:58"}
{"current_steps": 568, "total_steps": 1479, "loss": 0.2007, "lr": 7.747727482434679e-06, "epoch": 1.15220700152207, "percentage": 38.4, "elapsed_time": "15:24:54", "remaining_time": "1 day, 0:43:26"}
{"current_steps": 570, "total_steps": 1479, "loss": 0.1881, "lr": 7.727977278578484e-06, "epoch": 1.1562658548959919, "percentage": 38.54, "elapsed_time": "15:28:23", "remaining_time": "1 day, 0:40:32"}
{"current_steps": 572, "total_steps": 1479, "loss": 0.1945, "lr": 7.708166283123118e-06, "epoch": 1.1603247082699137, "percentage": 38.67, "elapsed_time": "15:31:39", "remaining_time": "1 day, 0:37:17"}
{"current_steps": 574, "total_steps": 1479, "loss": 0.1858, "lr": 7.68829493754663e-06, "epoch": 1.1643835616438356, "percentage": 38.81, "elapsed_time": "15:34:35", "remaining_time": "1 day, 0:33:31"}
{"current_steps": 576, "total_steps": 1479, "loss": 0.1857, "lr": 7.668363684671947e-06, "epoch": 1.1684424150177575, "percentage": 38.95, "elapsed_time": "15:37:57", "remaining_time": "1 day, 0:30:26"}
{"current_steps": 578, "total_steps": 1479, "loss": 0.1786, "lr": 7.648372968656995e-06, "epoch": 1.1725012683916793, "percentage": 39.08, "elapsed_time": "15:41:16", "remaining_time": "1 day, 0:27:17"}
{"current_steps": 580, "total_steps": 1479, "loss": 0.1848, "lr": 7.628323234984806e-06, "epoch": 1.1765601217656012, "percentage": 39.22, "elapsed_time": "15:44:49", "remaining_time": "1 day, 0:24:28"}
{"current_steps": 582, "total_steps": 1479, "loss": 0.2032, "lr": 7.608214930453597e-06, "epoch": 1.180618975139523, "percentage": 39.35, "elapsed_time": "15:47:49", "remaining_time": "1 day, 0:20:49"}
{"current_steps": 584, "total_steps": 1479, "loss": 0.1933, "lr": 7.588048503166801e-06, "epoch": 1.184677828513445, "percentage": 39.49, "elapsed_time": "15:51:15", "remaining_time": "1 day, 0:17:49"}
{"current_steps": 586, "total_steps": 1479, "loss": 0.1842, "lr": 7.5678244025230894e-06, "epoch": 1.1887366818873668, "percentage": 39.62, "elapsed_time": "15:54:34", "remaining_time": "1 day, 0:14:40"}
{"current_steps": 588, "total_steps": 1479, "loss": 0.1711, "lr": 7.547543079206355e-06, "epoch": 1.1927955352612887, "percentage": 39.76, "elapsed_time": "15:58:00", "remaining_time": "1 day, 0:11:39"}
{"current_steps": 590, "total_steps": 1479, "loss": 0.2027, "lr": 7.5272049851756716e-06, "epoch": 1.1968543886352105, "percentage": 39.89, "elapsed_time": "16:01:09", "remaining_time": "1 day, 0:08:15"}
{"current_steps": 592, "total_steps": 1479, "loss": 0.1852, "lr": 7.506810573655215e-06, "epoch": 1.2009132420091324, "percentage": 40.03, "elapsed_time": "16:04:30", "remaining_time": "1 day, 0:05:08"}
{"current_steps": 594, "total_steps": 1479, "loss": 0.1887, "lr": 7.486360299124169e-06, "epoch": 1.2049720953830543, "percentage": 40.16, "elapsed_time": "16:07:45", "remaining_time": "1 day, 0:01:51"}
{"current_steps": 596, "total_steps": 1479, "loss": 0.1826, "lr": 7.4658546173066005e-06, "epoch": 1.2090309487569761, "percentage": 40.3, "elapsed_time": "16:11:09", "remaining_time": "23:58:49"}
{"current_steps": 598, "total_steps": 1479, "loss": 0.1722, "lr": 7.445293985161296e-06, "epoch": 1.213089802130898, "percentage": 40.43, "elapsed_time": "16:14:32", "remaining_time": "23:55:44"}
{"current_steps": 600, "total_steps": 1479, "loss": 0.1754, "lr": 7.424678860871584e-06, "epoch": 1.2171486555048199, "percentage": 40.57, "elapsed_time": "16:17:52", "remaining_time": "23:52:34"}
{"current_steps": 602, "total_steps": 1479, "loss": 0.1828, "lr": 7.404009703835121e-06, "epoch": 1.2212075088787417, "percentage": 40.7, "elapsed_time": "16:21:00", "remaining_time": "23:49:09"}
{"current_steps": 604, "total_steps": 1479, "loss": 0.2043, "lr": 7.383286974653659e-06, "epoch": 1.2252663622526636, "percentage": 40.84, "elapsed_time": "16:24:02", "remaining_time": "23:45:32"}
{"current_steps": 606, "total_steps": 1479, "loss": 0.2, "lr": 7.362511135122779e-06, "epoch": 1.2293252156265855, "percentage": 40.97, "elapsed_time": "16:27:21", "remaining_time": "23:42:22"}
{"current_steps": 608, "total_steps": 1479, "loss": 0.1823, "lr": 7.341682648221591e-06, "epoch": 1.2333840690005073, "percentage": 41.11, "elapsed_time": "16:30:30", "remaining_time": "23:38:57"}
{"current_steps": 610, "total_steps": 1479, "loss": 0.1826, "lr": 7.320801978102434e-06, "epoch": 1.2374429223744292, "percentage": 41.24, "elapsed_time": "16:33:54", "remaining_time": "23:35:54"}
{"current_steps": 612, "total_steps": 1479, "loss": 0.1916, "lr": 7.299869590080524e-06, "epoch": 1.241501775748351, "percentage": 41.38, "elapsed_time": "16:37:35", "remaining_time": "23:33:15"}
{"current_steps": 614, "total_steps": 1479, "loss": 0.2034, "lr": 7.278885950623578e-06, "epoch": 1.245560629122273, "percentage": 41.51, "elapsed_time": "16:40:44", "remaining_time": "23:29:50"}
{"current_steps": 616, "total_steps": 1479, "loss": 0.2007, "lr": 7.257851527341429e-06, "epoch": 1.2496194824961948, "percentage": 41.65, "elapsed_time": "16:44:13", "remaining_time": "23:26:53"}
{"current_steps": 618, "total_steps": 1479, "loss": 0.1926, "lr": 7.236766788975603e-06, "epoch": 1.2536783358701167, "percentage": 41.78, "elapsed_time": "16:47:31", "remaining_time": "23:23:41"}
{"current_steps": 620, "total_steps": 1479, "loss": 0.1738, "lr": 7.215632205388872e-06, "epoch": 1.2577371892440385, "percentage": 41.92, "elapsed_time": "16:50:54", "remaining_time": "23:20:36"}
{"current_steps": 622, "total_steps": 1479, "loss": 0.1895, "lr": 7.19444824755478e-06, "epoch": 1.2617960426179604, "percentage": 42.06, "elapsed_time": "16:54:05", "remaining_time": "23:17:13"}
{"current_steps": 624, "total_steps": 1479, "loss": 0.1798, "lr": 7.173215387547155e-06, "epoch": 1.2658548959918823, "percentage": 42.19, "elapsed_time": "16:57:00", "remaining_time": "23:13:30"}
{"current_steps": 626, "total_steps": 1479, "loss": 0.1876, "lr": 7.151934098529583e-06, "epoch": 1.2699137493658041, "percentage": 42.33, "elapsed_time": "17:00:06", "remaining_time": "23:10:01"}
{"current_steps": 628, "total_steps": 1479, "loss": 0.1959, "lr": 7.130604854744871e-06, "epoch": 1.273972602739726, "percentage": 42.46, "elapsed_time": "17:03:25", "remaining_time": "23:06:50"}
{"current_steps": 630, "total_steps": 1479, "loss": 0.2055, "lr": 7.109228131504465e-06, "epoch": 1.2780314561136479, "percentage": 42.6, "elapsed_time": "17:06:42", "remaining_time": "23:03:37"}
{"current_steps": 632, "total_steps": 1479, "loss": 0.1866, "lr": 7.087804405177876e-06, "epoch": 1.2820903094875697, "percentage": 42.73, "elapsed_time": "17:09:50", "remaining_time": "23:00:11"}
{"current_steps": 634, "total_steps": 1479, "loss": 0.1805, "lr": 7.066334153182049e-06, "epoch": 1.2861491628614916, "percentage": 42.87, "elapsed_time": "17:13:20", "remaining_time": "22:57:14"}
{"current_steps": 636, "total_steps": 1479, "loss": 0.1866, "lr": 7.044817853970732e-06, "epoch": 1.2902080162354135, "percentage": 43.0, "elapsed_time": "17:16:47", "remaining_time": "22:54:13"}
{"current_steps": 638, "total_steps": 1479, "loss": 0.182, "lr": 7.023255987023813e-06, "epoch": 1.2942668696093353, "percentage": 43.14, "elapsed_time": "17:19:52", "remaining_time": "22:50:44"}
{"current_steps": 640, "total_steps": 1479, "loss": 0.1863, "lr": 7.001649032836631e-06, "epoch": 1.2983257229832572, "percentage": 43.27, "elapsed_time": "17:23:04", "remaining_time": "22:47:24"}
{"current_steps": 642, "total_steps": 1479, "loss": 0.1732, "lr": 6.9799974729092765e-06, "epoch": 1.302384576357179, "percentage": 43.41, "elapsed_time": "17:26:10", "remaining_time": "22:43:55"}
{"current_steps": 644, "total_steps": 1479, "loss": 0.1763, "lr": 6.958301789735853e-06, "epoch": 1.306443429731101, "percentage": 43.54, "elapsed_time": "17:29:30", "remaining_time": "22:40:45"}
{"current_steps": 646, "total_steps": 1479, "loss": 0.21, "lr": 6.936562466793724e-06, "epoch": 1.3105022831050228, "percentage": 43.68, "elapsed_time": "17:32:43", "remaining_time": "22:37:27"}
{"current_steps": 648, "total_steps": 1479, "loss": 0.1889, "lr": 6.914779988532755e-06, "epoch": 1.3145611364789447, "percentage": 43.81, "elapsed_time": "17:35:59", "remaining_time": "22:34:12"}
{"current_steps": 650, "total_steps": 1479, "loss": 0.2028, "lr": 6.892954840364493e-06, "epoch": 1.3186199898528665, "percentage": 43.95, "elapsed_time": "17:39:09", "remaining_time": "22:30:49"}
{"current_steps": 652, "total_steps": 1479, "loss": 0.1884, "lr": 6.871087508651373e-06, "epoch": 1.3226788432267884, "percentage": 44.08, "elapsed_time": "17:41:59", "remaining_time": "22:27:02"}
{"current_steps": 654, "total_steps": 1479, "loss": 0.2021, "lr": 6.8491784806958616e-06, "epoch": 1.3267376966007103, "percentage": 44.22, "elapsed_time": "17:45:32", "remaining_time": "22:24:08"}
{"current_steps": 656, "total_steps": 1479, "loss": 0.1932, "lr": 6.827228244729609e-06, "epoch": 1.3307965499746321, "percentage": 44.35, "elapsed_time": "17:48:56", "remaining_time": "22:21:04"}
{"current_steps": 658, "total_steps": 1479, "loss": 0.1965, "lr": 6.805237289902565e-06, "epoch": 1.334855403348554, "percentage": 44.49, "elapsed_time": "17:52:08", "remaining_time": "22:17:44"}
{"current_steps": 660, "total_steps": 1479, "loss": 0.198, "lr": 6.783206106272076e-06, "epoch": 1.3389142567224759, "percentage": 44.62, "elapsed_time": "17:55:18", "remaining_time": "22:14:21"}
{"current_steps": 662, "total_steps": 1479, "loss": 0.1846, "lr": 6.761135184791969e-06, "epoch": 1.3429731100963977, "percentage": 44.76, "elapsed_time": "17:58:52", "remaining_time": "22:11:29"}
{"current_steps": 664, "total_steps": 1479, "loss": 0.181, "lr": 6.7390250173016104e-06, "epoch": 1.3470319634703196, "percentage": 44.9, "elapsed_time": "18:02:00", "remaining_time": "22:08:04"}
{"current_steps": 666, "total_steps": 1479, "loss": 0.2095, "lr": 6.716876096514944e-06, "epoch": 1.3510908168442415, "percentage": 45.03, "elapsed_time": "18:05:06", "remaining_time": "22:04:36"}
{"current_steps": 668, "total_steps": 1479, "loss": 0.1848, "lr": 6.694688916009505e-06, "epoch": 1.3551496702181633, "percentage": 45.17, "elapsed_time": "18:08:06", "remaining_time": "22:01:03"}
{"current_steps": 670, "total_steps": 1479, "loss": 0.1961, "lr": 6.672463970215436e-06, "epoch": 1.3592085235920852, "percentage": 45.3, "elapsed_time": "18:11:20", "remaining_time": "21:57:44"}
{"current_steps": 672, "total_steps": 1479, "loss": 0.1836, "lr": 6.650201754404455e-06, "epoch": 1.363267376966007, "percentage": 45.44, "elapsed_time": "18:14:37", "remaining_time": "21:54:31"}
{"current_steps": 674, "total_steps": 1479, "loss": 0.1881, "lr": 6.627902764678824e-06, "epoch": 1.367326230339929, "percentage": 45.57, "elapsed_time": "18:17:44", "remaining_time": "21:51:06"}
{"current_steps": 676, "total_steps": 1479, "loss": 0.1803, "lr": 6.605567497960295e-06, "epoch": 1.3713850837138508, "percentage": 45.71, "elapsed_time": "18:20:55", "remaining_time": "21:47:45"}
{"current_steps": 678, "total_steps": 1479, "loss": 0.1917, "lr": 6.583196451979031e-06, "epoch": 1.3754439370877727, "percentage": 45.84, "elapsed_time": "18:24:21", "remaining_time": "21:44:41"}
{"current_steps": 680, "total_steps": 1479, "loss": 0.1979, "lr": 6.560790125262524e-06, "epoch": 1.3795027904616946, "percentage": 45.98, "elapsed_time": "18:27:28", "remaining_time": "21:41:17"}
{"current_steps": 682, "total_steps": 1479, "loss": 0.1631, "lr": 6.538349017124472e-06, "epoch": 1.3835616438356164, "percentage": 46.11, "elapsed_time": "18:30:46", "remaining_time": "21:38:05"}
{"current_steps": 684, "total_steps": 1479, "loss": 0.1808, "lr": 6.515873627653663e-06, "epoch": 1.3876204972095383, "percentage": 46.25, "elapsed_time": "18:33:49", "remaining_time": "21:34:35"}
{"current_steps": 686, "total_steps": 1479, "loss": 0.1799, "lr": 6.493364457702831e-06, "epoch": 1.3916793505834602, "percentage": 46.38, "elapsed_time": "18:36:53", "remaining_time": "21:31:06"}
{"current_steps": 688, "total_steps": 1479, "loss": 0.1822, "lr": 6.470822008877482e-06, "epoch": 1.395738203957382, "percentage": 46.52, "elapsed_time": "18:40:12", "remaining_time": "21:27:54"}
{"current_steps": 690, "total_steps": 1479, "loss": 0.1919, "lr": 6.448246783524734e-06, "epoch": 1.3997970573313039, "percentage": 46.65, "elapsed_time": "18:43:31", "remaining_time": "21:24:43"}
{"current_steps": 692, "total_steps": 1479, "loss": 0.1851, "lr": 6.42563928472211e-06, "epoch": 1.4038559107052258, "percentage": 46.79, "elapsed_time": "18:46:44", "remaining_time": "21:21:25"}
{"current_steps": 694, "total_steps": 1479, "loss": 0.1866, "lr": 6.403000016266326e-06, "epoch": 1.4079147640791476, "percentage": 46.92, "elapsed_time": "18:50:05", "remaining_time": "21:18:15"}
{"current_steps": 696, "total_steps": 1479, "loss": 0.2035, "lr": 6.380329482662078e-06, "epoch": 1.4119736174530695, "percentage": 47.06, "elapsed_time": "18:53:10", "remaining_time": "21:14:48"}
{"current_steps": 698, "total_steps": 1479, "loss": 0.1717, "lr": 6.35762818911078e-06, "epoch": 1.4160324708269914, "percentage": 47.19, "elapsed_time": "18:56:27", "remaining_time": "21:11:35"}
{"current_steps": 700, "total_steps": 1479, "loss": 0.178, "lr": 6.334896641499324e-06, "epoch": 1.4200913242009132, "percentage": 47.33, "elapsed_time": "19:00:00", "remaining_time": "21:08:40"}
{"current_steps": 702, "total_steps": 1479, "loss": 0.1881, "lr": 6.312135346388793e-06, "epoch": 1.424150177574835, "percentage": 47.46, "elapsed_time": "19:03:24", "remaining_time": "21:05:34"}
{"current_steps": 704, "total_steps": 1479, "loss": 0.2033, "lr": 6.289344811003184e-06, "epoch": 1.428209030948757, "percentage": 47.6, "elapsed_time": "19:06:19", "remaining_time": "21:01:55"}
{"current_steps": 706, "total_steps": 1479, "loss": 0.1931, "lr": 6.2665255432180916e-06, "epoch": 1.4322678843226788, "percentage": 47.73, "elapsed_time": "19:09:17", "remaining_time": "20:58:21"}
{"current_steps": 708, "total_steps": 1479, "loss": 0.1837, "lr": 6.2436780515494035e-06, "epoch": 1.4363267376966007, "percentage": 47.87, "elapsed_time": "19:12:45", "remaining_time": "20:55:19"}
{"current_steps": 710, "total_steps": 1479, "loss": 0.2112, "lr": 6.2208028451419575e-06, "epoch": 1.4403855910705226, "percentage": 48.01, "elapsed_time": "19:15:51", "remaining_time": "20:51:54"}
{"current_steps": 712, "total_steps": 1479, "loss": 0.2021, "lr": 6.197900433758205e-06, "epoch": 1.4444444444444444, "percentage": 48.14, "elapsed_time": "19:19:06", "remaining_time": "20:48:38"}
{"current_steps": 714, "total_steps": 1479, "loss": 0.1958, "lr": 6.174971327766842e-06, "epoch": 1.4485032978183663, "percentage": 48.28, "elapsed_time": "19:22:23", "remaining_time": "20:45:24"}
{"current_steps": 716, "total_steps": 1479, "loss": 0.1854, "lr": 6.1520160381314465e-06, "epoch": 1.4525621511922882, "percentage": 48.41, "elapsed_time": "19:25:53", "remaining_time": "20:42:24"}
{"current_steps": 718, "total_steps": 1479, "loss": 0.1896, "lr": 6.129035076399077e-06, "epoch": 1.45662100456621, "percentage": 48.55, "elapsed_time": "19:28:56", "remaining_time": "20:38:57"}
{"current_steps": 720, "total_steps": 1479, "loss": 0.1903, "lr": 6.106028954688892e-06, "epoch": 1.4606798579401319, "percentage": 48.68, "elapsed_time": "19:32:12", "remaining_time": "20:35:42"}
{"current_steps": 722, "total_steps": 1479, "loss": 0.1872, "lr": 6.082998185680718e-06, "epoch": 1.4647387113140538, "percentage": 48.82, "elapsed_time": "19:35:38", "remaining_time": "20:32:37"}
{"current_steps": 724, "total_steps": 1479, "loss": 0.1983, "lr": 6.059943282603642e-06, "epoch": 1.4687975646879756, "percentage": 48.95, "elapsed_time": "19:38:54", "remaining_time": "20:29:22"}
{"current_steps": 726, "total_steps": 1479, "loss": 0.178, "lr": 6.03686475922456e-06, "epoch": 1.4728564180618975, "percentage": 49.09, "elapsed_time": "19:42:30", "remaining_time": "20:26:29"}
{"current_steps": 728, "total_steps": 1479, "loss": 0.1874, "lr": 6.013763129836739e-06, "epoch": 1.4769152714358194, "percentage": 49.22, "elapsed_time": "19:46:02", "remaining_time": "20:23:30"}
{"current_steps": 730, "total_steps": 1479, "loss": 0.1941, "lr": 5.990638909248352e-06, "epoch": 1.4809741248097412, "percentage": 49.36, "elapsed_time": "19:49:25", "remaining_time": "20:20:23"}
{"current_steps": 732, "total_steps": 1479, "loss": 0.1869, "lr": 5.967492612770999e-06, "epoch": 1.485032978183663, "percentage": 49.49, "elapsed_time": "19:52:43", "remaining_time": "20:17:10"}
{"current_steps": 734, "total_steps": 1479, "loss": 0.1807, "lr": 5.944324756208238e-06, "epoch": 1.489091831557585, "percentage": 49.63, "elapsed_time": "19:55:58", "remaining_time": "20:13:54"}
{"current_steps": 736, "total_steps": 1479, "loss": 0.1945, "lr": 5.92113585584408e-06, "epoch": 1.4931506849315068, "percentage": 49.76, "elapsed_time": "19:59:01", "remaining_time": "20:10:26"}
{"current_steps": 738, "total_steps": 1479, "loss": 0.193, "lr": 5.897926428431485e-06, "epoch": 1.4972095383054287, "percentage": 49.9, "elapsed_time": "20:02:23", "remaining_time": "20:07:17"}
{"current_steps": 740, "total_steps": 1479, "loss": 0.1941, "lr": 5.87469699118085e-06, "epoch": 1.5012683916793506, "percentage": 50.03, "elapsed_time": "20:05:31", "remaining_time": "20:03:53"}
{"current_steps": 742, "total_steps": 1479, "loss": 0.1954, "lr": 5.851448061748477e-06, "epoch": 1.5053272450532724, "percentage": 50.17, "elapsed_time": "20:08:50", "remaining_time": "20:00:41"}
{"current_steps": 744, "total_steps": 1479, "loss": 0.1812, "lr": 5.828180158225047e-06, "epoch": 1.5093860984271943, "percentage": 50.3, "elapsed_time": "20:12:34", "remaining_time": "19:57:54"}
{"current_steps": 746, "total_steps": 1479, "loss": 0.1892, "lr": 5.804893799124068e-06, "epoch": 1.5134449518011162, "percentage": 50.44, "elapsed_time": "20:15:31", "remaining_time": "19:54:20"}
{"current_steps": 748, "total_steps": 1479, "loss": 0.1965, "lr": 5.7815895033703164e-06, "epoch": 1.517503805175038, "percentage": 50.57, "elapsed_time": "20:18:32", "remaining_time": "19:50:51"}
{"current_steps": 750, "total_steps": 1479, "loss": 0.2082, "lr": 5.758267790288282e-06, "epoch": 1.52156265854896, "percentage": 50.71, "elapsed_time": "20:21:47", "remaining_time": "19:47:34"}
{"current_steps": 752, "total_steps": 1479, "loss": 0.1801, "lr": 5.734929179590593e-06, "epoch": 1.5256215119228818, "percentage": 50.85, "elapsed_time": "20:25:21", "remaining_time": "19:44:37"}
{"current_steps": 754, "total_steps": 1479, "loss": 0.1807, "lr": 5.711574191366427e-06, "epoch": 1.5296803652968036, "percentage": 50.98, "elapsed_time": "20:28:28", "remaining_time": "19:41:13"}
{"current_steps": 756, "total_steps": 1479, "loss": 0.1934, "lr": 5.6882033460699294e-06, "epoch": 1.5337392186707255, "percentage": 51.12, "elapsed_time": "20:31:58", "remaining_time": "19:38:11"}
{"current_steps": 758, "total_steps": 1479, "loss": 0.183, "lr": 5.664817164508614e-06, "epoch": 1.5377980720446474, "percentage": 51.25, "elapsed_time": "20:34:58", "remaining_time": "19:34:41"}
{"current_steps": 760, "total_steps": 1479, "loss": 0.1983, "lr": 5.641416167831752e-06, "epoch": 1.5418569254185692, "percentage": 51.39, "elapsed_time": "20:38:11", "remaining_time": "19:31:23"}
{"current_steps": 762, "total_steps": 1479, "loss": 0.205, "lr": 5.618000877518767e-06, "epoch": 1.545915778792491, "percentage": 51.52, "elapsed_time": "20:41:12", "remaining_time": "19:27:54"}
{"current_steps": 764, "total_steps": 1479, "loss": 0.1871, "lr": 5.594571815367602e-06, "epoch": 1.549974632166413, "percentage": 51.66, "elapsed_time": "20:44:18", "remaining_time": "19:24:29"}
{"current_steps": 766, "total_steps": 1479, "loss": 0.1588, "lr": 5.5711295034831034e-06, "epoch": 1.5540334855403348, "percentage": 51.79, "elapsed_time": "20:47:26", "remaining_time": "19:21:07"}
{"current_steps": 768, "total_steps": 1479, "loss": 0.1885, "lr": 5.547674464265384e-06, "epoch": 1.5580923389142567, "percentage": 51.93, "elapsed_time": "20:50:20", "remaining_time": "19:17:32"}
{"current_steps": 770, "total_steps": 1479, "loss": 0.1844, "lr": 5.524207220398169e-06, "epoch": 1.5621511922881786, "percentage": 52.06, "elapsed_time": "20:53:35", "remaining_time": "19:14:17"}
{"current_steps": 772, "total_steps": 1479, "loss": 0.1717, "lr": 5.500728294837168e-06, "epoch": 1.5662100456621004, "percentage": 52.2, "elapsed_time": "20:56:35", "remaining_time": "19:10:47"}
{"current_steps": 774, "total_steps": 1479, "loss": 0.1816, "lr": 5.477238210798406e-06, "epoch": 1.5702688990360223, "percentage": 52.33, "elapsed_time": "20:59:49", "remaining_time": "19:07:30"}
{"current_steps": 776, "total_steps": 1479, "loss": 0.1956, "lr": 5.453737491746572e-06, "epoch": 1.5743277524099442, "percentage": 52.47, "elapsed_time": "21:02:47", "remaining_time": "19:03:59"}
{"current_steps": 778, "total_steps": 1479, "loss": 0.1831, "lr": 5.430226661383348e-06, "epoch": 1.578386605783866, "percentage": 52.6, "elapsed_time": "21:06:04", "remaining_time": "19:00:46"}
{"current_steps": 780, "total_steps": 1479, "loss": 0.1859, "lr": 5.406706243635742e-06, "epoch": 1.582445459157788, "percentage": 52.74, "elapsed_time": "21:09:11", "remaining_time": "18:57:23"}
{"current_steps": 782, "total_steps": 1479, "loss": 0.1799, "lr": 5.383176762644416e-06, "epoch": 1.5865043125317098, "percentage": 52.87, "elapsed_time": "21:12:47", "remaining_time": "18:54:26"}
{"current_steps": 784, "total_steps": 1479, "loss": 0.1859, "lr": 5.359638742751994e-06, "epoch": 1.5905631659056316, "percentage": 53.01, "elapsed_time": "21:15:52", "remaining_time": "18:51:01"}
{"current_steps": 786, "total_steps": 1479, "loss": 0.1949, "lr": 5.3360927084913925e-06, "epoch": 1.5946220192795535, "percentage": 53.14, "elapsed_time": "21:19:04", "remaining_time": "18:47:44"}
{"current_steps": 788, "total_steps": 1479, "loss": 0.1795, "lr": 5.312539184574123e-06, "epoch": 1.5986808726534754, "percentage": 53.28, "elapsed_time": "21:22:24", "remaining_time": "18:44:32"}
{"current_steps": 790, "total_steps": 1479, "loss": 0.1842, "lr": 5.288978695878596e-06, "epoch": 1.6027397260273972, "percentage": 53.41, "elapsed_time": "21:25:40", "remaining_time": "18:41:18"}
{"current_steps": 792, "total_steps": 1479, "loss": 0.1892, "lr": 5.265411767438432e-06, "epoch": 1.606798579401319, "percentage": 53.55, "elapsed_time": "21:28:43", "remaining_time": "18:37:52"}
{"current_steps": 794, "total_steps": 1479, "loss": 0.1857, "lr": 5.241838924430757e-06, "epoch": 1.610857432775241, "percentage": 53.68, "elapsed_time": "21:31:55", "remaining_time": "18:34:34"}
{"current_steps": 796, "total_steps": 1479, "loss": 0.1839, "lr": 5.2182606921645e-06, "epoch": 1.6149162861491628, "percentage": 53.82, "elapsed_time": "21:35:20", "remaining_time": "18:31:27"}
{"current_steps": 798, "total_steps": 1479, "loss": 0.1974, "lr": 5.194677596068689e-06, "epoch": 1.6189751395230847, "percentage": 53.96, "elapsed_time": "21:38:34", "remaining_time": "18:28:11"}
{"current_steps": 800, "total_steps": 1479, "loss": 0.186, "lr": 5.171090161680736e-06, "epoch": 1.6230339928970066, "percentage": 54.09, "elapsed_time": "21:42:09", "remaining_time": "18:25:12"}
{"current_steps": 802, "total_steps": 1479, "loss": 0.1818, "lr": 5.1474989146347355e-06, "epoch": 1.6270928462709284, "percentage": 54.23, "elapsed_time": "21:45:30", "remaining_time": "18:22:02"}
{"current_steps": 804, "total_steps": 1479, "loss": 0.1878, "lr": 5.1239043806497365e-06, "epoch": 1.6311516996448503, "percentage": 54.36, "elapsed_time": "21:48:52", "remaining_time": "18:18:52"}
{"current_steps": 806, "total_steps": 1479, "loss": 0.1792, "lr": 5.100307085518046e-06, "epoch": 1.6352105530187722, "percentage": 54.5, "elapsed_time": "21:52:18", "remaining_time": "18:15:45"}
{"current_steps": 808, "total_steps": 1479, "loss": 0.1816, "lr": 5.076707555093491e-06, "epoch": 1.639269406392694, "percentage": 54.63, "elapsed_time": "21:55:24", "remaining_time": "18:12:22"}
{"current_steps": 810, "total_steps": 1479, "loss": 0.2025, "lr": 5.053106315279721e-06, "epoch": 1.643328259766616, "percentage": 54.77, "elapsed_time": "21:58:31", "remaining_time": "18:08:59"}
{"current_steps": 812, "total_steps": 1479, "loss": 0.1669, "lr": 5.029503892018472e-06, "epoch": 1.6473871131405378, "percentage": 54.9, "elapsed_time": "22:01:57", "remaining_time": "18:05:54"}
{"current_steps": 814, "total_steps": 1479, "loss": 0.1802, "lr": 5.005900811277856e-06, "epoch": 1.6514459665144596, "percentage": 55.04, "elapsed_time": "22:05:05", "remaining_time": "18:02:32"}
{"current_steps": 816, "total_steps": 1479, "loss": 0.1636, "lr": 4.982297599040633e-06, "epoch": 1.6555048198883815, "percentage": 55.17, "elapsed_time": "22:08:29", "remaining_time": "17:59:23"}
{"current_steps": 818, "total_steps": 1479, "loss": 0.1923, "lr": 4.958694781292496e-06, "epoch": 1.6595636732623034, "percentage": 55.31, "elapsed_time": "22:11:41", "remaining_time": "17:56:05"}
{"current_steps": 820, "total_steps": 1479, "loss": 0.1873, "lr": 4.935092884010347e-06, "epoch": 1.6636225266362252, "percentage": 55.44, "elapsed_time": "22:14:36", "remaining_time": "17:52:34"}
{"current_steps": 822, "total_steps": 1479, "loss": 0.1809, "lr": 4.911492433150573e-06, "epoch": 1.667681380010147, "percentage": 55.58, "elapsed_time": "22:17:41", "remaining_time": "17:49:10"}
{"current_steps": 824, "total_steps": 1479, "loss": 0.1864, "lr": 4.887893954637335e-06, "epoch": 1.671740233384069, "percentage": 55.71, "elapsed_time": "22:21:09", "remaining_time": "17:46:05"}
{"current_steps": 826, "total_steps": 1479, "loss": 0.1766, "lr": 4.86429797435083e-06, "epoch": 1.6757990867579908, "percentage": 55.85, "elapsed_time": "22:24:39", "remaining_time": "17:43:01"}
{"current_steps": 828, "total_steps": 1479, "loss": 0.1808, "lr": 4.840705018115595e-06, "epoch": 1.6798579401319127, "percentage": 55.98, "elapsed_time": "22:27:53", "remaining_time": "17:39:45"}
{"current_steps": 830, "total_steps": 1479, "loss": 0.1757, "lr": 4.8171156116887725e-06, "epoch": 1.6839167935058346, "percentage": 56.12, "elapsed_time": "22:30:57", "remaining_time": "17:36:21"}
{"current_steps": 832, "total_steps": 1479, "loss": 0.1924, "lr": 4.7935302807483965e-06, "epoch": 1.6879756468797564, "percentage": 56.25, "elapsed_time": "22:33:48", "remaining_time": "17:32:47"}
{"current_steps": 834, "total_steps": 1479, "loss": 0.1902, "lr": 4.769949550881687e-06, "epoch": 1.6920345002536783, "percentage": 56.39, "elapsed_time": "22:37:06", "remaining_time": "17:29:33"}
{"current_steps": 836, "total_steps": 1479, "loss": 0.1787, "lr": 4.746373947573325e-06, "epoch": 1.6960933536276002, "percentage": 56.52, "elapsed_time": "22:40:10", "remaining_time": "17:26:09"}
{"current_steps": 838, "total_steps": 1479, "loss": 0.197, "lr": 4.722803996193753e-06, "epoch": 1.700152207001522, "percentage": 56.66, "elapsed_time": "22:43:25", "remaining_time": "17:22:54"}
{"current_steps": 840, "total_steps": 1479, "loss": 0.1819, "lr": 4.699240221987461e-06, "epoch": 1.704211060375444, "percentage": 56.8, "elapsed_time": "22:46:31", "remaining_time": "17:19:32"}
{"current_steps": 842, "total_steps": 1479, "loss": 0.1888, "lr": 4.6756831500612846e-06, "epoch": 1.7082699137493658, "percentage": 56.93, "elapsed_time": "22:49:55", "remaining_time": "17:16:23"}
{"current_steps": 844, "total_steps": 1479, "loss": 0.1727, "lr": 4.652133305372705e-06, "epoch": 1.7123287671232876, "percentage": 57.07, "elapsed_time": "22:53:03", "remaining_time": "17:13:03"}
{"current_steps": 846, "total_steps": 1479, "loss": 0.1756, "lr": 4.628591212718144e-06, "epoch": 1.7163876204972095, "percentage": 57.2, "elapsed_time": "22:56:36", "remaining_time": "17:10:01"}
{"current_steps": 848, "total_steps": 1479, "loss": 0.1741, "lr": 4.605057396721275e-06, "epoch": 1.7204464738711314, "percentage": 57.34, "elapsed_time": "22:59:29", "remaining_time": "17:06:28"}
{"current_steps": 850, "total_steps": 1479, "loss": 0.1841, "lr": 4.58153238182133e-06, "epoch": 1.7245053272450532, "percentage": 57.47, "elapsed_time": "23:02:40", "remaining_time": "17:03:10"}
{"current_steps": 852, "total_steps": 1479, "loss": 0.1698, "lr": 4.558016692261412e-06, "epoch": 1.728564180618975, "percentage": 57.61, "elapsed_time": "23:06:07", "remaining_time": "17:00:04"}
{"current_steps": 854, "total_steps": 1479, "loss": 0.1886, "lr": 4.534510852076817e-06, "epoch": 1.732623033992897, "percentage": 57.74, "elapsed_time": "23:09:15", "remaining_time": "16:56:43"}
{"current_steps": 856, "total_steps": 1479, "loss": 0.1945, "lr": 4.511015385083345e-06, "epoch": 1.7366818873668188, "percentage": 57.88, "elapsed_time": "23:12:35", "remaining_time": "16:53:31"}
{"current_steps": 858, "total_steps": 1479, "loss": 0.1824, "lr": 4.487530814865646e-06, "epoch": 1.7407407407407407, "percentage": 58.01, "elapsed_time": "23:15:55", "remaining_time": "16:50:20"}
{"current_steps": 860, "total_steps": 1479, "loss": 0.1823, "lr": 4.464057664765532e-06, "epoch": 1.7447995941146626, "percentage": 58.15, "elapsed_time": "23:19:12", "remaining_time": "16:47:06"}
{"current_steps": 862, "total_steps": 1479, "loss": 0.1834, "lr": 4.440596457870327e-06, "epoch": 1.7488584474885844, "percentage": 58.28, "elapsed_time": "23:22:08", "remaining_time": "16:43:37"}
{"current_steps": 864, "total_steps": 1479, "loss": 0.1746, "lr": 4.417147717001205e-06, "epoch": 1.7529173008625063, "percentage": 58.42, "elapsed_time": "23:25:09", "remaining_time": "16:40:11"}
{"current_steps": 866, "total_steps": 1479, "loss": 0.1682, "lr": 4.393711964701541e-06, "epoch": 1.7569761542364282, "percentage": 58.55, "elapsed_time": "23:28:17", "remaining_time": "16:36:51"}
{"current_steps": 868, "total_steps": 1479, "loss": 0.1786, "lr": 4.37028972322527e-06, "epoch": 1.76103500761035, "percentage": 58.69, "elapsed_time": "23:31:22", "remaining_time": "16:33:29"}
{"current_steps": 870, "total_steps": 1479, "loss": 0.1791, "lr": 4.346881514525236e-06, "epoch": 1.765093860984272, "percentage": 58.82, "elapsed_time": "23:34:52", "remaining_time": "16:30:24"}
{"current_steps": 872, "total_steps": 1479, "loss": 0.1672, "lr": 4.323487860241582e-06, "epoch": 1.7691527143581938, "percentage": 58.96, "elapsed_time": "23:38:09", "remaining_time": "16:27:11"}
{"current_steps": 874, "total_steps": 1479, "loss": 0.1854, "lr": 4.3001092816901055e-06, "epoch": 1.7732115677321156, "percentage": 59.09, "elapsed_time": "23:41:13", "remaining_time": "16:23:48"}
{"current_steps": 876, "total_steps": 1479, "loss": 0.1823, "lr": 4.2767462998506485e-06, "epoch": 1.7772704211060375, "percentage": 59.23, "elapsed_time": "23:44:15", "remaining_time": "16:20:23"}
{"current_steps": 878, "total_steps": 1479, "loss": 0.1895, "lr": 4.253399435355492e-06, "epoch": 1.7813292744799594, "percentage": 59.36, "elapsed_time": "23:47:38", "remaining_time": "16:17:14"}
{"current_steps": 880, "total_steps": 1479, "loss": 0.175, "lr": 4.230069208477745e-06, "epoch": 1.7853881278538812, "percentage": 59.5, "elapsed_time": "23:51:12", "remaining_time": "16:14:12"}
{"current_steps": 882, "total_steps": 1479, "loss": 0.1953, "lr": 4.206756139119762e-06, "epoch": 1.789446981227803, "percentage": 59.63, "elapsed_time": "23:54:13", "remaining_time": "16:10:47"}
{"current_steps": 884, "total_steps": 1479, "loss": 0.1702, "lr": 4.183460746801546e-06, "epoch": 1.793505834601725, "percentage": 59.77, "elapsed_time": "23:58:02", "remaining_time": "16:07:54"}
{"current_steps": 886, "total_steps": 1479, "loss": 0.1778, "lr": 4.160183550649176e-06, "epoch": 1.7975646879756468, "percentage": 59.91, "elapsed_time": "1 day, 0:01:21", "remaining_time": "16:04:42"}
{"current_steps": 888, "total_steps": 1479, "loss": 0.1917, "lr": 4.136925069383243e-06, "epoch": 1.8016235413495687, "percentage": 60.04, "elapsed_time": "1 day, 0:04:54", "remaining_time": "16:01:38"}
{"current_steps": 890, "total_steps": 1479, "loss": 0.1898, "lr": 4.113685821307282e-06, "epoch": 1.8056823947234906, "percentage": 60.18, "elapsed_time": "1 day, 0:08:04", "remaining_time": "15:58:19"}
{"current_steps": 892, "total_steps": 1479, "loss": 0.1822, "lr": 4.090466324296228e-06, "epoch": 1.8097412480974124, "percentage": 60.31, "elapsed_time": "1 day, 0:11:00", "remaining_time": "15:54:52"}
{"current_steps": 894, "total_steps": 1479, "loss": 0.1841, "lr": 4.067267095784871e-06, "epoch": 1.8138001014713343, "percentage": 60.45, "elapsed_time": "1 day, 0:14:18", "remaining_time": "15:51:38"}
{"current_steps": 896, "total_steps": 1479, "loss": 0.1629, "lr": 4.044088652756332e-06, "epoch": 1.8178589548452562, "percentage": 60.58, "elapsed_time": "1 day, 0:17:34", "remaining_time": "15:48:23"}
{"current_steps": 898, "total_steps": 1479, "loss": 0.1774, "lr": 4.020931511730533e-06, "epoch": 1.821917808219178, "percentage": 60.72, "elapsed_time": "1 day, 0:20:29", "remaining_time": "15:44:55"}
{"current_steps": 900, "total_steps": 1479, "loss": 0.1733, "lr": 3.997796188752695e-06, "epoch": 1.8259766615931, "percentage": 60.85, "elapsed_time": "1 day, 0:23:53", "remaining_time": "15:41:46"}
{"current_steps": 902, "total_steps": 1479, "loss": 0.1685, "lr": 3.974683199381836e-06, "epoch": 1.8300355149670218, "percentage": 60.99, "elapsed_time": "1 day, 0:26:58", "remaining_time": "15:38:24"}
{"current_steps": 904, "total_steps": 1479, "loss": 0.1672, "lr": 3.951593058679276e-06, "epoch": 1.8340943683409436, "percentage": 61.12, "elapsed_time": "1 day, 0:30:18", "remaining_time": "15:35:12"}
{"current_steps": 906, "total_steps": 1479, "loss": 0.1749, "lr": 3.928526281197169e-06, "epoch": 1.8381532217148655, "percentage": 61.26, "elapsed_time": "1 day, 0:33:15", "remaining_time": "15:31:45"}
{"current_steps": 908, "total_steps": 1479, "loss": 0.1722, "lr": 3.905483380967027e-06, "epoch": 1.8422120750887874, "percentage": 61.39, "elapsed_time": "1 day, 0:36:50", "remaining_time": "15:28:42"}
{"current_steps": 910, "total_steps": 1479, "loss": 0.1693, "lr": 3.882464871488273e-06, "epoch": 1.8462709284627092, "percentage": 61.53, "elapsed_time": "1 day, 0:40:02", "remaining_time": "15:25:25"}
{"current_steps": 912, "total_steps": 1479, "loss": 0.1691, "lr": 3.859471265716791e-06, "epoch": 1.850329781836631, "percentage": 61.66, "elapsed_time": "1 day, 0:43:36", "remaining_time": "15:22:22"}
{"current_steps": 914, "total_steps": 1479, "loss": 0.1751, "lr": 3.836503076053501e-06, "epoch": 1.854388635210553, "percentage": 61.8, "elapsed_time": "1 day, 0:46:51", "remaining_time": "15:19:07"}
{"current_steps": 916, "total_steps": 1479, "loss": 0.1809, "lr": 3.8135608143329404e-06, "epoch": 1.8584474885844748, "percentage": 61.93, "elapsed_time": "1 day, 0:49:59", "remaining_time": "15:15:47"}
{"current_steps": 918, "total_steps": 1479, "loss": 0.1696, "lr": 3.7906449918118493e-06, "epoch": 1.8625063419583967, "percentage": 62.07, "elapsed_time": "1 day, 0:53:11", "remaining_time": "15:12:30"}
{"current_steps": 920, "total_steps": 1479, "loss": 0.17, "lr": 3.7677561191577873e-06, "epoch": 1.8665651953323186, "percentage": 62.2, "elapsed_time": "1 day, 0:56:21", "remaining_time": "15:09:12"}
{"current_steps": 922, "total_steps": 1479, "loss": 0.1768, "lr": 3.7448947064377496e-06, "epoch": 1.8706240487062404, "percentage": 62.34, "elapsed_time": "1 day, 0:59:30", "remaining_time": "15:05:52"}
{"current_steps": 924, "total_steps": 1479, "loss": 0.1712, "lr": 3.722061263106797e-06, "epoch": 1.8746829020801623, "percentage": 62.47, "elapsed_time": "1 day, 1:03:03", "remaining_time": "15:02:48"}
{"current_steps": 926, "total_steps": 1479, "loss": 0.1802, "lr": 3.699256297996714e-06, "epoch": 1.8787417554540842, "percentage": 62.61, "elapsed_time": "1 day, 1:06:12", "remaining_time": "14:59:29"}
{"current_steps": 928, "total_steps": 1479, "loss": 0.1787, "lr": 3.6764803193046538e-06, "epoch": 1.882800608828006, "percentage": 62.75, "elapsed_time": "1 day, 1:09:14", "remaining_time": "14:56:06"}
{"current_steps": 930, "total_steps": 1479, "loss": 0.177, "lr": 3.6537338345818273e-06, "epoch": 1.886859462201928, "percentage": 62.88, "elapsed_time": "1 day, 1:12:38", "remaining_time": "14:52:56"}
{"current_steps": 932, "total_steps": 1479, "loss": 0.1784, "lr": 3.6310173507221884e-06, "epoch": 1.8909183155758498, "percentage": 63.02, "elapsed_time": "1 day, 1:15:51", "remaining_time": "14:49:40"}
{"current_steps": 934, "total_steps": 1479, "loss": 0.1613, "lr": 3.6083313739511316e-06, "epoch": 1.8949771689497716, "percentage": 63.15, "elapsed_time": "1 day, 1:19:16", "remaining_time": "14:46:31"}
{"current_steps": 936, "total_steps": 1479, "loss": 0.1722, "lr": 3.5856764098142207e-06, "epoch": 1.8990360223236935, "percentage": 63.29, "elapsed_time": "1 day, 1:22:21", "remaining_time": "14:43:09"}
{"current_steps": 938, "total_steps": 1479, "loss": 0.1619, "lr": 3.563052963165915e-06, "epoch": 1.9030948756976154, "percentage": 63.42, "elapsed_time": "1 day, 1:25:51", "remaining_time": "14:40:03"}
{"current_steps": 940, "total_steps": 1479, "loss": 0.1786, "lr": 3.5404615381583264e-06, "epoch": 1.9071537290715372, "percentage": 63.56, "elapsed_time": "1 day, 1:29:03", "remaining_time": "14:36:46"}
{"current_steps": 942, "total_steps": 1479, "loss": 0.1635, "lr": 3.5179026382299752e-06, "epoch": 1.911212582445459, "percentage": 63.69, "elapsed_time": "1 day, 1:32:17", "remaining_time": "14:33:30"}
{"current_steps": 944, "total_steps": 1479, "loss": 0.1849, "lr": 3.4953767660945825e-06, "epoch": 1.915271435819381, "percentage": 63.83, "elapsed_time": "1 day, 1:35:13", "remaining_time": "14:30:03"}
{"current_steps": 946, "total_steps": 1479, "loss": 0.1824, "lr": 3.472884423729861e-06, "epoch": 1.9193302891933028, "percentage": 63.96, "elapsed_time": "1 day, 1:38:34", "remaining_time": "14:26:52"}
{"current_steps": 948, "total_steps": 1479, "loss": 0.1741, "lr": 3.4504261123663243e-06, "epoch": 1.9233891425672247, "percentage": 64.1, "elapsed_time": "1 day, 1:41:36", "remaining_time": "14:23:29"}
{"current_steps": 950, "total_steps": 1479, "loss": 0.1622, "lr": 3.4280023324761287e-06, "epoch": 1.9274479959411466, "percentage": 64.23, "elapsed_time": "1 day, 1:45:00", "remaining_time": "14:20:19"}
{"current_steps": 952, "total_steps": 1479, "loss": 0.1714, "lr": 3.4056135837619077e-06, "epoch": 1.9315068493150684, "percentage": 64.37, "elapsed_time": "1 day, 1:48:19", "remaining_time": "14:17:06"}
{"current_steps": 954, "total_steps": 1479, "loss": 0.1704, "lr": 3.3832603651456486e-06, "epoch": 1.9355657026889903, "percentage": 64.5, "elapsed_time": "1 day, 1:51:17", "remaining_time": "14:13:41"}
{"current_steps": 956, "total_steps": 1479, "loss": 0.1835, "lr": 3.360943174757564e-06, "epoch": 1.9396245560629122, "percentage": 64.64, "elapsed_time": "1 day, 1:54:05", "remaining_time": "14:10:11"}
{"current_steps": 958, "total_steps": 1479, "loss": 0.1722, "lr": 3.3386625099249957e-06, "epoch": 1.943683409436834, "percentage": 64.77, "elapsed_time": "1 day, 1:57:10", "remaining_time": "14:06:51"}
{"current_steps": 960, "total_steps": 1479, "loss": 0.1799, "lr": 3.3164188671613382e-06, "epoch": 1.947742262810756, "percentage": 64.91, "elapsed_time": "1 day, 2:00:12", "remaining_time": "14:03:29"}
{"current_steps": 962, "total_steps": 1479, "loss": 0.1665, "lr": 3.29421274215496e-06, "epoch": 1.9518011161846778, "percentage": 65.04, "elapsed_time": "1 day, 2:03:40", "remaining_time": "14:00:21"}
{"current_steps": 964, "total_steps": 1479, "loss": 0.1756, "lr": 3.2720446297581696e-06, "epoch": 1.9558599695585996, "percentage": 65.18, "elapsed_time": "1 day, 2:06:55", "remaining_time": "13:57:05"}
{"current_steps": 966, "total_steps": 1479, "loss": 0.1674, "lr": 3.2499150239761813e-06, "epoch": 1.9599188229325215, "percentage": 65.31, "elapsed_time": "1 day, 2:10:18", "remaining_time": "13:53:55"}
{"current_steps": 968, "total_steps": 1479, "loss": 0.176, "lr": 3.2278244179561107e-06, "epoch": 1.9639776763064434, "percentage": 65.45, "elapsed_time": "1 day, 2:13:33", "remaining_time": "13:50:40"}
{"current_steps": 970, "total_steps": 1479, "loss": 0.1649, "lr": 3.205773303975982e-06, "epoch": 1.9680365296803652, "percentage": 65.58, "elapsed_time": "1 day, 2:16:52", "remaining_time": "13:47:27"}
{"current_steps": 972, "total_steps": 1479, "loss": 0.1712, "lr": 3.1837621734337607e-06, "epoch": 1.972095383054287, "percentage": 65.72, "elapsed_time": "1 day, 2:20:12", "remaining_time": "13:44:14"}
{"current_steps": 974, "total_steps": 1479, "loss": 0.1835, "lr": 3.1617915168363994e-06, "epoch": 1.976154236428209, "percentage": 65.86, "elapsed_time": "1 day, 2:23:28", "remaining_time": "13:41:00"}
{"current_steps": 976, "total_steps": 1479, "loss": 0.1685, "lr": 3.1398618237889124e-06, "epoch": 1.9802130898021308, "percentage": 65.99, "elapsed_time": "1 day, 2:26:35", "remaining_time": "13:37:40"}
{"current_steps": 978, "total_steps": 1479, "loss": 0.1707, "lr": 3.11797358298346e-06, "epoch": 1.9842719431760527, "percentage": 66.13, "elapsed_time": "1 day, 2:29:49", "remaining_time": "13:34:25"}
{"current_steps": 980, "total_steps": 1479, "loss": 0.1687, "lr": 3.096127282188458e-06, "epoch": 1.9883307965499746, "percentage": 66.26, "elapsed_time": "1 day, 2:33:02", "remaining_time": "13:31:09"}
{"current_steps": 982, "total_steps": 1479, "loss": 0.1788, "lr": 3.074323408237716e-06, "epoch": 1.9923896499238964, "percentage": 66.4, "elapsed_time": "1 day, 2:36:17", "remaining_time": "13:27:54"}
{"current_steps": 984, "total_steps": 1479, "loss": 0.1878, "lr": 3.0525624470195746e-06, "epoch": 1.9964485032978183, "percentage": 66.53, "elapsed_time": "1 day, 2:39:30", "remaining_time": "13:24:38"}
{"current_steps": 986, "total_steps": 1479, "loss": 0.1664, "lr": 3.0308448834660953e-06, "epoch": 2.0, "percentage": 66.67, "elapsed_time": "1 day, 2:42:15", "remaining_time": "13:21:07"}
{"current_steps": 988, "total_steps": 1479, "loss": 0.073, "lr": 3.009171201542235e-06, "epoch": 2.004058853373922, "percentage": 66.8, "elapsed_time": "1 day, 2:47:05", "remaining_time": "13:18:39"}
{"current_steps": 990, "total_steps": 1479, "loss": 0.0666, "lr": 2.987541884235078e-06, "epoch": 2.0081177067478437, "percentage": 66.94, "elapsed_time": "1 day, 2:50:20", "remaining_time": "13:15:24"}
{"current_steps": 992, "total_steps": 1479, "loss": 0.068, "lr": 2.965957413543063e-06, "epoch": 2.0121765601217656, "percentage": 67.07, "elapsed_time": "1 day, 2:53:49", "remaining_time": "13:12:16"}
{"current_steps": 994, "total_steps": 1479, "loss": 0.0722, "lr": 2.944418270465243e-06, "epoch": 2.0162354134956875, "percentage": 67.21, "elapsed_time": "1 day, 2:56:54", "remaining_time": "13:08:56"}
{"current_steps": 996, "total_steps": 1479, "loss": 0.0636, "lr": 2.9229249349905686e-06, "epoch": 2.0202942668696093, "percentage": 67.34, "elapsed_time": "1 day, 3:00:02", "remaining_time": "13:05:37"}
{"current_steps": 998, "total_steps": 1479, "loss": 0.056, "lr": 2.9014778860871916e-06, "epoch": 2.024353120243531, "percentage": 67.48, "elapsed_time": "1 day, 3:03:16", "remaining_time": "13:02:21"}
{"current_steps": 1000, "total_steps": 1479, "loss": 0.0711, "lr": 2.880077601691793e-06, "epoch": 2.028411973617453, "percentage": 67.61, "elapsed_time": "1 day, 3:06:26", "remaining_time": "12:59:03"}
{"current_steps": 1002, "total_steps": 1479, "loss": 0.0619, "lr": 2.8587245586989265e-06, "epoch": 2.032470826991375, "percentage": 67.75, "elapsed_time": "1 day, 3:09:39", "remaining_time": "12:55:47"}
{"current_steps": 1004, "total_steps": 1479, "loss": 0.0604, "lr": 2.8374192329503934e-06, "epoch": 2.036529680365297, "percentage": 67.88, "elapsed_time": "1 day, 3:12:43", "remaining_time": "12:52:27"}
{"current_steps": 1006, "total_steps": 1479, "loss": 0.0616, "lr": 2.8161620992246497e-06, "epoch": 2.0405885337392187, "percentage": 68.02, "elapsed_time": "1 day, 3:16:06", "remaining_time": "12:49:15"}
{"current_steps": 1008, "total_steps": 1479, "loss": 0.0649, "lr": 2.7949536312262048e-06, "epoch": 2.0446473871131405, "percentage": 68.15, "elapsed_time": "1 day, 3:19:18", "remaining_time": "12:45:59"}
{"current_steps": 1010, "total_steps": 1479, "loss": 0.0699, "lr": 2.7737943015750862e-06, "epoch": 2.0487062404870624, "percentage": 68.29, "elapsed_time": "1 day, 3:22:05", "remaining_time": "12:42:30"}
{"current_steps": 1012, "total_steps": 1479, "loss": 0.0626, "lr": 2.752684581796292e-06, "epoch": 2.0527650938609843, "percentage": 68.42, "elapsed_time": "1 day, 3:25:20", "remaining_time": "12:39:15"}
{"current_steps": 1014, "total_steps": 1479, "loss": 0.0594, "lr": 2.7316249423092923e-06, "epoch": 2.056823947234906, "percentage": 68.56, "elapsed_time": "1 day, 3:29:00", "remaining_time": "12:36:12"}
{"current_steps": 1016, "total_steps": 1479, "loss": 0.0646, "lr": 2.7106158524175396e-06, "epoch": 2.060882800608828, "percentage": 68.7, "elapsed_time": "1 day, 3:32:20", "remaining_time": "12:32:59"}
{"current_steps": 1018, "total_steps": 1479, "loss": 0.0552, "lr": 2.689657780298019e-06, "epoch": 2.06494165398275, "percentage": 68.83, "elapsed_time": "1 day, 3:35:27", "remaining_time": "12:29:40"}
{"current_steps": 1020, "total_steps": 1479, "loss": 0.0633, "lr": 2.6687511929908093e-06, "epoch": 2.0690005073566717, "percentage": 68.97, "elapsed_time": "1 day, 3:38:36", "remaining_time": "12:26:22"}
{"current_steps": 1022, "total_steps": 1479, "loss": 0.0567, "lr": 2.6478965563886745e-06, "epoch": 2.0730593607305936, "percentage": 69.1, "elapsed_time": "1 day, 3:41:57", "remaining_time": "12:23:09"}
{"current_steps": 1024, "total_steps": 1479, "loss": 0.059, "lr": 2.627094335226682e-06, "epoch": 2.0771182141045155, "percentage": 69.24, "elapsed_time": "1 day, 3:45:19", "remaining_time": "12:19:57"}
{"current_steps": 1026, "total_steps": 1479, "loss": 0.0566, "lr": 2.6063449930718487e-06, "epoch": 2.0811770674784373, "percentage": 69.37, "elapsed_time": "1 day, 3:48:49", "remaining_time": "12:16:49"}
{"current_steps": 1028, "total_steps": 1479, "loss": 0.0573, "lr": 2.5856489923128136e-06, "epoch": 2.085235920852359, "percentage": 69.51, "elapsed_time": "1 day, 3:51:48", "remaining_time": "12:13:26"}
{"current_steps": 1030, "total_steps": 1479, "loss": 0.0543, "lr": 2.5650067941495236e-06, "epoch": 2.089294774226281, "percentage": 69.64, "elapsed_time": "1 day, 3:55:04", "remaining_time": "12:10:11"}
{"current_steps": 1032, "total_steps": 1479, "loss": 0.0573, "lr": 2.5444188585829634e-06, "epoch": 2.093353627600203, "percentage": 69.78, "elapsed_time": "1 day, 3:58:39", "remaining_time": "12:07:05"}
{"current_steps": 1034, "total_steps": 1479, "loss": 0.0629, "lr": 2.523885644404906e-06, "epoch": 2.097412480974125, "percentage": 69.91, "elapsed_time": "1 day, 4:01:56", "remaining_time": "12:03:51"}
{"current_steps": 1036, "total_steps": 1479, "loss": 0.0599, "lr": 2.5034076091876813e-06, "epoch": 2.1014713343480467, "percentage": 70.05, "elapsed_time": "1 day, 4:05:15", "remaining_time": "12:00:37"}
{"current_steps": 1038, "total_steps": 1479, "loss": 0.0685, "lr": 2.48298520927399e-06, "epoch": 2.1055301877219685, "percentage": 70.18, "elapsed_time": "1 day, 4:08:12", "remaining_time": "11:57:14"}
{"current_steps": 1040, "total_steps": 1479, "loss": 0.0528, "lr": 2.4626188997667224e-06, "epoch": 2.1095890410958904, "percentage": 70.32, "elapsed_time": "1 day, 4:11:46", "remaining_time": "11:54:07"}
{"current_steps": 1042, "total_steps": 1479, "loss": 0.0609, "lr": 2.4423091345188244e-06, "epoch": 2.1136478944698123, "percentage": 70.45, "elapsed_time": "1 day, 4:15:00", "remaining_time": "11:50:51"}
{"current_steps": 1044, "total_steps": 1479, "loss": 0.0607, "lr": 2.4220563661231793e-06, "epoch": 2.117706747843734, "percentage": 70.59, "elapsed_time": "1 day, 4:18:08", "remaining_time": "11:47:33"}
{"current_steps": 1046, "total_steps": 1479, "loss": 0.0614, "lr": 2.4018610459025317e-06, "epoch": 2.121765601217656, "percentage": 70.72, "elapsed_time": "1 day, 4:21:24", "remaining_time": "11:44:18"}
{"current_steps": 1048, "total_steps": 1479, "loss": 0.0576, "lr": 2.381723623899412e-06, "epoch": 2.125824454591578, "percentage": 70.86, "elapsed_time": "1 day, 4:24:43", "remaining_time": "11:41:04"}
{"current_steps": 1050, "total_steps": 1479, "loss": 0.0612, "lr": 2.361644548866127e-06, "epoch": 2.1298833079654997, "percentage": 70.99, "elapsed_time": "1 day, 4:28:08", "remaining_time": "11:37:53"}
{"current_steps": 1052, "total_steps": 1479, "loss": 0.0637, "lr": 2.341624268254747e-06, "epoch": 2.1339421613394216, "percentage": 71.13, "elapsed_time": "1 day, 4:31:19", "remaining_time": "11:34:36"}
{"current_steps": 1054, "total_steps": 1479, "loss": 0.0653, "lr": 2.3216632282071345e-06, "epoch": 2.1380010147133435, "percentage": 71.26, "elapsed_time": "1 day, 4:34:20", "remaining_time": "11:31:15"}
{"current_steps": 1056, "total_steps": 1479, "loss": 0.055, "lr": 2.3017618735450142e-06, "epoch": 2.1420598680872653, "percentage": 71.4, "elapsed_time": "1 day, 4:37:39", "remaining_time": "11:28:02"}
{"current_steps": 1058, "total_steps": 1479, "loss": 0.0593, "lr": 2.2819206477600462e-06, "epoch": 2.146118721461187, "percentage": 71.53, "elapsed_time": "1 day, 4:41:21", "remaining_time": "11:24:58"}
{"current_steps": 1060, "total_steps": 1479, "loss": 0.0576, "lr": 2.2621399930039493e-06, "epoch": 2.150177574835109, "percentage": 71.67, "elapsed_time": "1 day, 4:44:36", "remaining_time": "11:21:42"}
{"current_steps": 1062, "total_steps": 1479, "loss": 0.0565, "lr": 2.2424203500786473e-06, "epoch": 2.154236428209031, "percentage": 71.81, "elapsed_time": "1 day, 4:47:37", "remaining_time": "11:18:21"}
{"current_steps": 1064, "total_steps": 1479, "loss": 0.0609, "lr": 2.2227621584264505e-06, "epoch": 2.158295281582953, "percentage": 71.94, "elapsed_time": "1 day, 4:51:09", "remaining_time": "11:15:13"}
{"current_steps": 1066, "total_steps": 1479, "loss": 0.0486, "lr": 2.203165856120251e-06, "epoch": 2.1623541349568747, "percentage": 72.08, "elapsed_time": "1 day, 4:54:25", "remaining_time": "11:11:58"}
{"current_steps": 1068, "total_steps": 1479, "loss": 0.0592, "lr": 2.183631879853776e-06, "epoch": 2.1664129883307965, "percentage": 72.21, "elapsed_time": "1 day, 4:57:23", "remaining_time": "11:08:36"}
{"current_steps": 1070, "total_steps": 1479, "loss": 0.0564, "lr": 2.164160664931843e-06, "epoch": 2.1704718417047184, "percentage": 72.35, "elapsed_time": "1 day, 5:00:33", "remaining_time": "11:05:18"}
{"current_steps": 1072, "total_steps": 1479, "loss": 0.0608, "lr": 2.1447526452606658e-06, "epoch": 2.1745306950786403, "percentage": 72.48, "elapsed_time": "1 day, 5:03:37", "remaining_time": "11:01:59"}
{"current_steps": 1074, "total_steps": 1479, "loss": 0.0572, "lr": 2.125408253338183e-06, "epoch": 2.178589548452562, "percentage": 72.62, "elapsed_time": "1 day, 5:06:47", "remaining_time": "10:58:42"}
{"current_steps": 1076, "total_steps": 1479, "loss": 0.056, "lr": 2.106127920244423e-06, "epoch": 2.182648401826484, "percentage": 72.75, "elapsed_time": "1 day, 5:10:10", "remaining_time": "10:55:29"}
{"current_steps": 1078, "total_steps": 1479, "loss": 0.0579, "lr": 2.086912075631896e-06, "epoch": 2.186707255200406, "percentage": 72.89, "elapsed_time": "1 day, 5:13:43", "remaining_time": "10:52:21"}
{"current_steps": 1080, "total_steps": 1479, "loss": 0.0573, "lr": 2.067761147716017e-06, "epoch": 2.1907661085743277, "percentage": 73.02, "elapsed_time": "1 day, 5:17:02", "remaining_time": "10:49:07"}
{"current_steps": 1082, "total_steps": 1479, "loss": 0.0593, "lr": 2.0486755632655643e-06, "epoch": 2.1948249619482496, "percentage": 73.16, "elapsed_time": "1 day, 5:20:14", "remaining_time": "10:45:51"}
{"current_steps": 1084, "total_steps": 1479, "loss": 0.0605, "lr": 2.029655747593169e-06, "epoch": 2.1988838153221715, "percentage": 73.29, "elapsed_time": "1 day, 5:23:28", "remaining_time": "10:42:35"}
{"current_steps": 1086, "total_steps": 1479, "loss": 0.0598, "lr": 2.010702124545845e-06, "epoch": 2.2029426686960933, "percentage": 73.43, "elapsed_time": "1 day, 5:27:00", "remaining_time": "10:39:26"}
{"current_steps": 1088, "total_steps": 1479, "loss": 0.0617, "lr": 1.9918151164955303e-06, "epoch": 2.207001522070015, "percentage": 73.56, "elapsed_time": "1 day, 5:30:05", "remaining_time": "10:36:07"}
{"current_steps": 1090, "total_steps": 1479, "loss": 0.0564, "lr": 1.9729951443296823e-06, "epoch": 2.211060375443937, "percentage": 73.7, "elapsed_time": "1 day, 5:33:38", "remaining_time": "10:32:58"}
{"current_steps": 1092, "total_steps": 1479, "loss": 0.0628, "lr": 1.9542426274418975e-06, "epoch": 2.215119228817859, "percentage": 73.83, "elapsed_time": "1 day, 5:36:48", "remaining_time": "10:29:41"}
{"current_steps": 1094, "total_steps": 1479, "loss": 0.0601, "lr": 1.9355579837225673e-06, "epoch": 2.219178082191781, "percentage": 73.97, "elapsed_time": "1 day, 5:39:48", "remaining_time": "10:26:20"}
{"current_steps": 1096, "total_steps": 1479, "loss": 0.0562, "lr": 1.916941629549565e-06, "epoch": 2.2232369355657027, "percentage": 74.1, "elapsed_time": "1 day, 5:42:58", "remaining_time": "10:23:03"}
{"current_steps": 1098, "total_steps": 1479, "loss": 0.0604, "lr": 1.8983939797789624e-06, "epoch": 2.2272957889396245, "percentage": 74.24, "elapsed_time": "1 day, 5:46:10", "remaining_time": "10:19:47"}
{"current_steps": 1100, "total_steps": 1479, "loss": 0.057, "lr": 1.8799154477357883e-06, "epoch": 2.2313546423135464, "percentage": 74.37, "elapsed_time": "1 day, 5:49:25", "remaining_time": "10:16:32"}
{"current_steps": 1102, "total_steps": 1479, "loss": 0.0529, "lr": 1.8615064452048181e-06, "epoch": 2.2354134956874683, "percentage": 74.51, "elapsed_time": "1 day, 5:52:51", "remaining_time": "10:13:20"}
{"current_steps": 1104, "total_steps": 1479, "loss": 0.0607, "lr": 1.8431673824214013e-06, "epoch": 2.23947234906139, "percentage": 74.65, "elapsed_time": "1 day, 5:56:23", "remaining_time": "10:10:11"}
{"current_steps": 1106, "total_steps": 1479, "loss": 0.0524, "lr": 1.8248986680623077e-06, "epoch": 2.243531202435312, "percentage": 74.78, "elapsed_time": "1 day, 5:59:56", "remaining_time": "10:07:01"}
{"current_steps": 1108, "total_steps": 1479, "loss": 0.0633, "lr": 1.8067007092366368e-06, "epoch": 2.247590055809234, "percentage": 74.92, "elapsed_time": "1 day, 6:03:01", "remaining_time": "10:03:43"}
{"current_steps": 1110, "total_steps": 1479, "loss": 0.0575, "lr": 1.7885739114767292e-06, "epoch": 2.2516489091831557, "percentage": 75.05, "elapsed_time": "1 day, 6:06:00", "remaining_time": "10:00:22"}
{"current_steps": 1112, "total_steps": 1479, "loss": 0.0532, "lr": 1.770518678729139e-06, "epoch": 2.2557077625570776, "percentage": 75.19, "elapsed_time": "1 day, 6:09:17", "remaining_time": "9:57:07"}
{"current_steps": 1114, "total_steps": 1479, "loss": 0.0572, "lr": 1.752535413345634e-06, "epoch": 2.2597666159309995, "percentage": 75.32, "elapsed_time": "1 day, 6:12:39", "remaining_time": "9:53:54"}
{"current_steps": 1116, "total_steps": 1479, "loss": 0.0591, "lr": 1.734624516074221e-06, "epoch": 2.2638254693049213, "percentage": 75.46, "elapsed_time": "1 day, 6:15:52", "remaining_time": "9:50:39"}
{"current_steps": 1118, "total_steps": 1479, "loss": 0.0619, "lr": 1.716786386050221e-06, "epoch": 2.267884322678843, "percentage": 75.59, "elapsed_time": "1 day, 6:19:04", "remaining_time": "9:47:22"}
{"current_steps": 1120, "total_steps": 1479, "loss": 0.0603, "lr": 1.6990214207873723e-06, "epoch": 2.271943176052765, "percentage": 75.73, "elapsed_time": "1 day, 6:22:03", "remaining_time": "9:44:02"}
{"current_steps": 1122, "total_steps": 1479, "loss": 0.0583, "lr": 1.681330016168977e-06, "epoch": 2.276002029426687, "percentage": 75.86, "elapsed_time": "1 day, 6:25:13", "remaining_time": "9:40:45"}
{"current_steps": 1124, "total_steps": 1479, "loss": 0.06, "lr": 1.6637125664390747e-06, "epoch": 2.280060882800609, "percentage": 76.0, "elapsed_time": "1 day, 6:28:31", "remaining_time": "9:37:30"}
{"current_steps": 1126, "total_steps": 1479, "loss": 0.0532, "lr": 1.6461694641936544e-06, "epoch": 2.2841197361745307, "percentage": 76.13, "elapsed_time": "1 day, 6:32:03", "remaining_time": "9:34:21"}
{"current_steps": 1128, "total_steps": 1479, "loss": 0.0581, "lr": 1.6287011003719105e-06, "epoch": 2.2881785895484525, "percentage": 76.27, "elapsed_time": "1 day, 6:35:04", "remaining_time": "9:31:01"}
{"current_steps": 1130, "total_steps": 1479, "loss": 0.0578, "lr": 1.61130786424753e-06, "epoch": 2.2922374429223744, "percentage": 76.4, "elapsed_time": "1 day, 6:38:05", "remaining_time": "9:27:41"}
{"current_steps": 1132, "total_steps": 1479, "loss": 0.0587, "lr": 1.5939901434200145e-06, "epoch": 2.2962962962962963, "percentage": 76.54, "elapsed_time": "1 day, 6:41:35", "remaining_time": "9:24:31"}
{"current_steps": 1134, "total_steps": 1479, "loss": 0.0568, "lr": 1.5767483238060498e-06, "epoch": 2.300355149670218, "percentage": 76.67, "elapsed_time": "1 day, 6:45:06", "remaining_time": "9:21:20"}
{"current_steps": 1136, "total_steps": 1479, "loss": 0.0615, "lr": 1.5595827896308968e-06, "epoch": 2.30441400304414, "percentage": 76.81, "elapsed_time": "1 day, 6:48:27", "remaining_time": "9:18:07"}
{"current_steps": 1138, "total_steps": 1479, "loss": 0.0558, "lr": 1.5424939234198377e-06, "epoch": 2.308472856418062, "percentage": 76.94, "elapsed_time": "1 day, 6:51:46", "remaining_time": "9:14:52"}
{"current_steps": 1140, "total_steps": 1479, "loss": 0.0569, "lr": 1.5254821059896452e-06, "epoch": 2.3125317097919837, "percentage": 77.08, "elapsed_time": "1 day, 6:54:53", "remaining_time": "9:11:35"}
{"current_steps": 1142, "total_steps": 1479, "loss": 0.0564, "lr": 1.5085477164400975e-06, "epoch": 2.3165905631659056, "percentage": 77.21, "elapsed_time": "1 day, 6:58:14", "remaining_time": "9:08:21"}
{"current_steps": 1144, "total_steps": 1479, "loss": 0.0566, "lr": 1.4916911321455362e-06, "epoch": 2.3206494165398275, "percentage": 77.35, "elapsed_time": "1 day, 7:01:36", "remaining_time": "9:05:08"}
{"current_steps": 1146, "total_steps": 1479, "loss": 0.0566, "lr": 1.4749127287464483e-06, "epoch": 2.3247082699137493, "percentage": 77.48, "elapsed_time": "1 day, 7:04:42", "remaining_time": "9:01:50"}
{"current_steps": 1148, "total_steps": 1479, "loss": 0.0568, "lr": 1.458212880141099e-06, "epoch": 2.328767123287671, "percentage": 77.62, "elapsed_time": "1 day, 7:08:04", "remaining_time": "8:58:37"}
{"current_steps": 1150, "total_steps": 1479, "loss": 0.0547, "lr": 1.4415919584771999e-06, "epoch": 2.332825976661593, "percentage": 77.76, "elapsed_time": "1 day, 7:11:14", "remaining_time": "8:55:20"}
{"current_steps": 1152, "total_steps": 1479, "loss": 0.0586, "lr": 1.425050334143616e-06, "epoch": 2.336884830035515, "percentage": 77.89, "elapsed_time": "1 day, 7:14:08", "remaining_time": "8:51:58"}
{"current_steps": 1154, "total_steps": 1479, "loss": 0.0575, "lr": 1.408588375762114e-06, "epoch": 2.340943683409437, "percentage": 78.03, "elapsed_time": "1 day, 7:17:28", "remaining_time": "8:48:44"}
{"current_steps": 1156, "total_steps": 1479, "loss": 0.0575, "lr": 1.39220645017914e-06, "epoch": 2.3450025367833587, "percentage": 78.16, "elapsed_time": "1 day, 7:20:43", "remaining_time": "8:45:29"}
{"current_steps": 1158, "total_steps": 1479, "loss": 0.054, "lr": 1.3759049224576516e-06, "epoch": 2.3490613901572805, "percentage": 78.3, "elapsed_time": "1 day, 7:24:06", "remaining_time": "8:42:16"}
{"current_steps": 1160, "total_steps": 1479, "loss": 0.0611, "lr": 1.3596841558689788e-06, "epoch": 2.3531202435312024, "percentage": 78.43, "elapsed_time": "1 day, 7:27:29", "remaining_time": "8:39:03"}
{"current_steps": 1162, "total_steps": 1479, "loss": 0.0597, "lr": 1.3435445118847362e-06, "epoch": 2.3571790969051243, "percentage": 78.57, "elapsed_time": "1 day, 7:30:39", "remaining_time": "8:35:46"}
{"current_steps": 1164, "total_steps": 1479, "loss": 0.0582, "lr": 1.3274863501687546e-06, "epoch": 2.361237950279046, "percentage": 78.7, "elapsed_time": "1 day, 7:33:46", "remaining_time": "8:32:29"}
{"current_steps": 1166, "total_steps": 1479, "loss": 0.0586, "lr": 1.3115100285690795e-06, "epoch": 2.365296803652968, "percentage": 78.84, "elapsed_time": "1 day, 7:36:38", "remaining_time": "8:29:08"}
{"current_steps": 1168, "total_steps": 1479, "loss": 0.0572, "lr": 1.2956159031099874e-06, "epoch": 2.36935565702689, "percentage": 78.97, "elapsed_time": "1 day, 7:39:54", "remaining_time": "8:25:53"}
{"current_steps": 1170, "total_steps": 1479, "loss": 0.0573, "lr": 1.2798043279840544e-06, "epoch": 2.3734145104008117, "percentage": 79.11, "elapsed_time": "1 day, 7:43:03", "remaining_time": "8:22:36"}
{"current_steps": 1172, "total_steps": 1479, "loss": 0.0585, "lr": 1.2640756555442684e-06, "epoch": 2.3774733637747336, "percentage": 79.24, "elapsed_time": "1 day, 7:46:21", "remaining_time": "8:19:21"}
{"current_steps": 1174, "total_steps": 1479, "loss": 0.0531, "lr": 1.248430236296168e-06, "epoch": 2.3815322171486555, "percentage": 79.38, "elapsed_time": "1 day, 7:49:23", "remaining_time": "8:16:03"}
{"current_steps": 1176, "total_steps": 1479, "loss": 0.0562, "lr": 1.2328684188900392e-06, "epoch": 2.3855910705225774, "percentage": 79.51, "elapsed_time": "1 day, 7:52:41", "remaining_time": "8:12:48"}
{"current_steps": 1178, "total_steps": 1479, "loss": 0.0555, "lr": 1.2173905501131395e-06, "epoch": 2.389649923896499, "percentage": 79.65, "elapsed_time": "1 day, 7:55:49", "remaining_time": "8:09:31"}
{"current_steps": 1180, "total_steps": 1479, "loss": 0.0633, "lr": 1.2019969748819783e-06, "epoch": 2.393708777270421, "percentage": 79.78, "elapsed_time": "1 day, 7:58:55", "remaining_time": "8:06:14"}
{"current_steps": 1182, "total_steps": 1479, "loss": 0.0512, "lr": 1.186688036234625e-06, "epoch": 2.397767630644343, "percentage": 79.92, "elapsed_time": "1 day, 8:02:24", "remaining_time": "8:03:02"}
{"current_steps": 1184, "total_steps": 1479, "loss": 0.0523, "lr": 1.1714640753230628e-06, "epoch": 2.401826484018265, "percentage": 80.05, "elapsed_time": "1 day, 8:05:39", "remaining_time": "7:59:47"}
{"current_steps": 1186, "total_steps": 1479, "loss": 0.0553, "lr": 1.1563254314055893e-06, "epoch": 2.4058853373921867, "percentage": 80.19, "elapsed_time": "1 day, 8:08:49", "remaining_time": "7:56:30"}
{"current_steps": 1188, "total_steps": 1479, "loss": 0.0544, "lr": 1.1412724418392562e-06, "epoch": 2.4099441907661086, "percentage": 80.32, "elapsed_time": "1 day, 8:12:05", "remaining_time": "7:53:15"}
{"current_steps": 1190, "total_steps": 1479, "loss": 0.055, "lr": 1.126305442072354e-06, "epoch": 2.4140030441400304, "percentage": 80.46, "elapsed_time": "1 day, 8:15:07", "remaining_time": "7:49:57"}
{"current_steps": 1192, "total_steps": 1479, "loss": 0.0545, "lr": 1.1114247656369305e-06, "epoch": 2.4180618975139523, "percentage": 80.59, "elapsed_time": "1 day, 8:18:27", "remaining_time": "7:46:43"}
{"current_steps": 1194, "total_steps": 1479, "loss": 0.0581, "lr": 1.0966307441413598e-06, "epoch": 2.422120750887874, "percentage": 80.73, "elapsed_time": "1 day, 8:21:37", "remaining_time": "7:43:27"}
{"current_steps": 1196, "total_steps": 1479, "loss": 0.0597, "lr": 1.0819237072629606e-06, "epoch": 2.426179604261796, "percentage": 80.87, "elapsed_time": "1 day, 8:24:48", "remaining_time": "7:40:11"}
{"current_steps": 1198, "total_steps": 1479, "loss": 0.0592, "lr": 1.0673039827406373e-06, "epoch": 2.430238457635718, "percentage": 81.0, "elapsed_time": "1 day, 8:27:52", "remaining_time": "7:36:53"}
{"current_steps": 1200, "total_steps": 1479, "loss": 0.0543, "lr": 1.0527718963675871e-06, "epoch": 2.4342973110096398, "percentage": 81.14, "elapsed_time": "1 day, 8:31:35", "remaining_time": "7:33:44"}
{"current_steps": 1202, "total_steps": 1479, "loss": 0.051, "lr": 1.0383277719840318e-06, "epoch": 2.4383561643835616, "percentage": 81.27, "elapsed_time": "1 day, 8:35:17", "remaining_time": "7:30:35"}
{"current_steps": 1204, "total_steps": 1479, "loss": 0.0569, "lr": 1.0239719314700052e-06, "epoch": 2.4424150177574835, "percentage": 81.41, "elapsed_time": "1 day, 8:38:45", "remaining_time": "7:27:23"}
{"current_steps": 1206, "total_steps": 1479, "loss": 0.0622, "lr": 1.0097046947381805e-06, "epoch": 2.4464738711314054, "percentage": 81.54, "elapsed_time": "1 day, 8:41:52", "remaining_time": "7:24:06"}
{"current_steps": 1208, "total_steps": 1479, "loss": 0.0593, "lr": 9.955263797267379e-07, "epoch": 2.450532724505327, "percentage": 81.68, "elapsed_time": "1 day, 8:45:09", "remaining_time": "7:20:51"}
{"current_steps": 1210, "total_steps": 1479, "loss": 0.0573, "lr": 9.814373023922851e-07, "epoch": 2.454591577879249, "percentage": 81.81, "elapsed_time": "1 day, 8:48:23", "remaining_time": "7:17:36"}
{"current_steps": 1212, "total_steps": 1479, "loss": 0.0595, "lr": 9.674377767028142e-07, "epoch": 2.458650431253171, "percentage": 81.95, "elapsed_time": "1 day, 8:51:21", "remaining_time": "7:14:17"}
{"current_steps": 1214, "total_steps": 1479, "loss": 0.0539, "lr": 9.53528114630699e-07, "epoch": 2.462709284627093, "percentage": 82.08, "elapsed_time": "1 day, 8:54:30", "remaining_time": "7:11:00"}
{"current_steps": 1216, "total_steps": 1479, "loss": 0.0587, "lr": 9.397086261457511e-07, "epoch": 2.4667681380010147, "percentage": 82.22, "elapsed_time": "1 day, 8:57:47", "remaining_time": "7:07:45"}
{"current_steps": 1218, "total_steps": 1479, "loss": 0.0576, "lr": 9.259796192083071e-07, "epoch": 2.4708269913749366, "percentage": 82.35, "elapsed_time": "1 day, 9:00:57", "remaining_time": "7:04:29"}
{"current_steps": 1220, "total_steps": 1479, "loss": 0.0543, "lr": 9.123413997623714e-07, "epoch": 2.4748858447488584, "percentage": 82.49, "elapsed_time": "1 day, 9:04:12", "remaining_time": "7:01:14"}
{"current_steps": 1222, "total_steps": 1479, "loss": 0.0578, "lr": 8.987942717287923e-07, "epoch": 2.4789446981227803, "percentage": 82.62, "elapsed_time": "1 day, 9:07:26", "remaining_time": "6:57:58"}
{"current_steps": 1224, "total_steps": 1479, "loss": 0.0537, "lr": 8.853385369984901e-07, "epoch": 2.483003551496702, "percentage": 82.76, "elapsed_time": "1 day, 9:10:48", "remaining_time": "6:54:45"}
{"current_steps": 1226, "total_steps": 1479, "loss": 0.056, "lr": 8.719744954257375e-07, "epoch": 2.487062404870624, "percentage": 82.89, "elapsed_time": "1 day, 9:14:14", "remaining_time": "6:51:32"}
{"current_steps": 1228, "total_steps": 1479, "loss": 0.0541, "lr": 8.587024448214637e-07, "epoch": 2.491121258244546, "percentage": 83.03, "elapsed_time": "1 day, 9:17:23", "remaining_time": "6:48:15"}
{"current_steps": 1230, "total_steps": 1479, "loss": 0.0592, "lr": 8.455226809466327e-07, "epoch": 2.4951801116184678, "percentage": 83.16, "elapsed_time": "1 day, 9:20:27", "remaining_time": "6:44:58"}
{"current_steps": 1232, "total_steps": 1479, "loss": 0.0539, "lr": 8.324354975056403e-07, "epoch": 2.4992389649923896, "percentage": 83.3, "elapsed_time": "1 day, 9:23:33", "remaining_time": "6:41:41"}
{"current_steps": 1234, "total_steps": 1479, "loss": 0.0591, "lr": 8.19441186139776e-07, "epoch": 2.5032978183663115, "percentage": 83.43, "elapsed_time": "1 day, 9:26:33", "remaining_time": "6:38:23"}
{"current_steps": 1236, "total_steps": 1479, "loss": 0.0584, "lr": 8.065400364207194e-07, "epoch": 2.5073566717402334, "percentage": 83.57, "elapsed_time": "1 day, 9:29:33", "remaining_time": "6:35:05"}
{"current_steps": 1238, "total_steps": 1479, "loss": 0.0543, "lr": 7.937323358440935e-07, "epoch": 2.5114155251141552, "percentage": 83.71, "elapsed_time": "1 day, 9:32:53", "remaining_time": "6:31:50"}
{"current_steps": 1240, "total_steps": 1479, "loss": 0.0572, "lr": 7.810183698230539e-07, "epoch": 2.515474378488077, "percentage": 83.84, "elapsed_time": "1 day, 9:36:06", "remaining_time": "6:28:35"}
{"current_steps": 1242, "total_steps": 1479, "loss": 0.0545, "lr": 7.683984216819262e-07, "epoch": 2.519533231861999, "percentage": 83.98, "elapsed_time": "1 day, 9:39:03", "remaining_time": "6:25:16"}
{"current_steps": 1244, "total_steps": 1479, "loss": 0.0535, "lr": 7.55872772649896e-07, "epoch": 2.523592085235921, "percentage": 84.11, "elapsed_time": "1 day, 9:42:28", "remaining_time": "6:22:03"}
{"current_steps": 1246, "total_steps": 1479, "loss": 0.0514, "lr": 7.434417018547396e-07, "epoch": 2.5276509386098427, "percentage": 84.25, "elapsed_time": "1 day, 9:46:28", "remaining_time": "6:18:56"}
{"current_steps": 1248, "total_steps": 1479, "loss": 0.0588, "lr": 7.311054863166095e-07, "epoch": 2.5317097919837646, "percentage": 84.38, "elapsed_time": "1 day, 9:49:20", "remaining_time": "6:15:37"}
{"current_steps": 1250, "total_steps": 1479, "loss": 0.0603, "lr": 7.188644009418517e-07, "epoch": 2.5357686453576864, "percentage": 84.52, "elapsed_time": "1 day, 9:52:43", "remaining_time": "6:12:23"}
{"current_steps": 1252, "total_steps": 1479, "loss": 0.0531, "lr": 7.067187185168862e-07, "epoch": 2.5398274987316083, "percentage": 84.65, "elapsed_time": "1 day, 9:56:17", "remaining_time": "6:09:12"}
{"current_steps": 1254, "total_steps": 1479, "loss": 0.0544, "lr": 6.946687097021249e-07, "epoch": 2.54388635210553, "percentage": 84.79, "elapsed_time": "1 day, 10:00:00", "remaining_time": "6:06:01"}
{"current_steps": 1256, "total_steps": 1479, "loss": 0.0504, "lr": 6.827146430259446e-07, "epoch": 2.547945205479452, "percentage": 84.92, "elapsed_time": "1 day, 10:03:23", "remaining_time": "6:02:47"}
{"current_steps": 1258, "total_steps": 1479, "loss": 0.0576, "lr": 6.70856784878699e-07, "epoch": 2.552004058853374, "percentage": 85.06, "elapsed_time": "1 day, 10:06:48", "remaining_time": "5:59:34"}
{"current_steps": 1260, "total_steps": 1479, "loss": 0.0585, "lr": 6.590953995067812e-07, "epoch": 2.5560629122272958, "percentage": 85.19, "elapsed_time": "1 day, 10:10:11", "remaining_time": "5:56:20"}
{"current_steps": 1262, "total_steps": 1479, "loss": 0.0591, "lr": 6.474307490067383e-07, "epoch": 2.5601217656012176, "percentage": 85.33, "elapsed_time": "1 day, 10:13:10", "remaining_time": "5:53:02"}
{"current_steps": 1264, "total_steps": 1479, "loss": 0.0618, "lr": 6.358630933194282e-07, "epoch": 2.5641806189751395, "percentage": 85.46, "elapsed_time": "1 day, 10:16:26", "remaining_time": "5:49:47"}
{"current_steps": 1266, "total_steps": 1479, "loss": 0.0607, "lr": 6.24392690224232e-07, "epoch": 2.5682394723490614, "percentage": 85.6, "elapsed_time": "1 day, 10:19:45", "remaining_time": "5:46:32"}
{"current_steps": 1268, "total_steps": 1479, "loss": 0.0602, "lr": 6.130197953333017e-07, "epoch": 2.5722983257229832, "percentage": 85.73, "elapsed_time": "1 day, 10:22:55", "remaining_time": "5:43:16"}
{"current_steps": 1270, "total_steps": 1479, "loss": 0.0565, "lr": 6.017446620858708e-07, "epoch": 2.576357179096905, "percentage": 85.87, "elapsed_time": "1 day, 10:25:56", "remaining_time": "5:39:59"}
{"current_steps": 1272, "total_steps": 1479, "loss": 0.0572, "lr": 5.905675417426027e-07, "epoch": 2.580416032470827, "percentage": 86.0, "elapsed_time": "1 day, 10:29:03", "remaining_time": "5:36:42"}
{"current_steps": 1274, "total_steps": 1479, "loss": 0.0514, "lr": 5.794886833799923e-07, "epoch": 2.584474885844749, "percentage": 86.14, "elapsed_time": "1 day, 10:32:27", "remaining_time": "5:33:28"}
{"current_steps": 1276, "total_steps": 1479, "loss": 0.0509, "lr": 5.685083338848152e-07, "epoch": 2.5885337392186707, "percentage": 86.27, "elapsed_time": "1 day, 10:36:07", "remaining_time": "5:30:17"}
{"current_steps": 1278, "total_steps": 1479, "loss": 0.0608, "lr": 5.576267379486294e-07, "epoch": 2.5925925925925926, "percentage": 86.41, "elapsed_time": "1 day, 10:39:10", "remaining_time": "5:27:00"}
{"current_steps": 1280, "total_steps": 1479, "loss": 0.0619, "lr": 5.468441380623169e-07, "epoch": 2.5966514459665144, "percentage": 86.54, "elapsed_time": "1 day, 10:42:44", "remaining_time": "5:23:48"}
{"current_steps": 1282, "total_steps": 1479, "loss": 0.0534, "lr": 5.361607745106817e-07, "epoch": 2.6007102993404363, "percentage": 86.68, "elapsed_time": "1 day, 10:46:03", "remaining_time": "5:20:33"}
{"current_steps": 1284, "total_steps": 1479, "loss": 0.0568, "lr": 5.255768853671011e-07, "epoch": 2.604769152714358, "percentage": 86.82, "elapsed_time": "1 day, 10:49:28", "remaining_time": "5:17:19"}
{"current_steps": 1286, "total_steps": 1479, "loss": 0.0488, "lr": 5.150927064882089e-07, "epoch": 2.60882800608828, "percentage": 86.95, "elapsed_time": "1 day, 10:52:49", "remaining_time": "5:14:05"}
{"current_steps": 1288, "total_steps": 1479, "loss": 0.0627, "lr": 5.047084715086515e-07, "epoch": 2.612886859462202, "percentage": 87.09, "elapsed_time": "1 day, 10:55:47", "remaining_time": "5:10:47"}
{"current_steps": 1290, "total_steps": 1479, "loss": 0.0496, "lr": 4.944244118358721e-07, "epoch": 2.6169457128361238, "percentage": 87.22, "elapsed_time": "1 day, 10:59:17", "remaining_time": "5:07:34"}
{"current_steps": 1292, "total_steps": 1479, "loss": 0.0527, "lr": 4.842407566449591e-07, "epoch": 2.6210045662100456, "percentage": 87.36, "elapsed_time": "1 day, 11:02:31", "remaining_time": "5:04:18"}
{"current_steps": 1294, "total_steps": 1479, "loss": 0.0562, "lr": 4.741577328735364e-07, "epoch": 2.6250634195839675, "percentage": 87.49, "elapsed_time": "1 day, 11:05:43", "remaining_time": "5:01:03"}
{"current_steps": 1296, "total_steps": 1479, "loss": 0.0557, "lr": 4.641755652167107e-07, "epoch": 2.6291222729578894, "percentage": 87.63, "elapsed_time": "1 day, 11:08:44", "remaining_time": "4:57:45"}
{"current_steps": 1298, "total_steps": 1479, "loss": 0.0559, "lr": 4.5429447612205635e-07, "epoch": 2.6331811263318112, "percentage": 87.76, "elapsed_time": "1 day, 11:12:05", "remaining_time": "4:54:31"}
{"current_steps": 1300, "total_steps": 1479, "loss": 0.0505, "lr": 4.445146857846672e-07, "epoch": 2.637239979705733, "percentage": 87.9, "elapsed_time": "1 day, 11:15:46", "remaining_time": "4:51:19"}
{"current_steps": 1302, "total_steps": 1479, "loss": 0.0536, "lr": 4.3483641214224325e-07, "epoch": 2.641298833079655, "percentage": 88.03, "elapsed_time": "1 day, 11:18:55", "remaining_time": "4:48:03"}
{"current_steps": 1304, "total_steps": 1479, "loss": 0.0492, "lr": 4.2525987087023433e-07, "epoch": 2.645357686453577, "percentage": 88.17, "elapsed_time": "1 day, 11:22:22", "remaining_time": "4:44:49"}
{"current_steps": 1306, "total_steps": 1479, "loss": 0.061, "lr": 4.1578527537703973e-07, "epoch": 2.6494165398274987, "percentage": 88.3, "elapsed_time": "1 day, 11:25:26", "remaining_time": "4:41:32"}
{"current_steps": 1308, "total_steps": 1479, "loss": 0.0556, "lr": 4.064128367992459e-07, "epoch": 2.6534753932014206, "percentage": 88.44, "elapsed_time": "1 day, 11:28:36", "remaining_time": "4:38:16"}
{"current_steps": 1310, "total_steps": 1479, "loss": 0.0557, "lr": 3.971427639969233e-07, "epoch": 2.6575342465753424, "percentage": 88.57, "elapsed_time": "1 day, 11:31:47", "remaining_time": "4:35:00"}
{"current_steps": 1312, "total_steps": 1479, "loss": 0.0525, "lr": 3.879752635489736e-07, "epoch": 2.6615930999492643, "percentage": 88.71, "elapsed_time": "1 day, 11:34:52", "remaining_time": "4:31:44"}
{"current_steps": 1314, "total_steps": 1479, "loss": 0.0524, "lr": 3.7891053974852597e-07, "epoch": 2.665651953323186, "percentage": 88.84, "elapsed_time": "1 day, 11:38:12", "remaining_time": "4:28:29"}
{"current_steps": 1316, "total_steps": 1479, "loss": 0.0557, "lr": 3.6994879459838375e-07, "epoch": 2.669710806697108, "percentage": 88.98, "elapsed_time": "1 day, 11:41:20", "remaining_time": "4:25:13"}
{"current_steps": 1318, "total_steps": 1479, "loss": 0.0569, "lr": 3.6109022780652147e-07, "epoch": 2.67376966007103, "percentage": 89.11, "elapsed_time": "1 day, 11:44:33", "remaining_time": "4:21:58"}
{"current_steps": 1320, "total_steps": 1479, "loss": 0.0571, "lr": 3.5233503678163696e-07, "epoch": 2.6778285134449518, "percentage": 89.25, "elapsed_time": "1 day, 11:47:49", "remaining_time": "4:18:42"}
{"current_steps": 1322, "total_steps": 1479, "loss": 0.0535, "lr": 3.4368341662875004e-07, "epoch": 2.6818873668188736, "percentage": 89.38, "elapsed_time": "1 day, 11:51:10", "remaining_time": "4:15:28"}
{"current_steps": 1324, "total_steps": 1479, "loss": 0.0615, "lr": 3.3513556014485805e-07, "epoch": 2.6859462201927955, "percentage": 89.52, "elapsed_time": "1 day, 11:54:14", "remaining_time": "4:12:11"}
{"current_steps": 1326, "total_steps": 1479, "loss": 0.0592, "lr": 3.26691657814634e-07, "epoch": 2.6900050735667174, "percentage": 89.66, "elapsed_time": "1 day, 11:57:30", "remaining_time": "4:08:56"}
{"current_steps": 1328, "total_steps": 1479, "loss": 0.0555, "lr": 3.183518978061895e-07, "epoch": 2.6940639269406392, "percentage": 89.79, "elapsed_time": "1 day, 12:00:45", "remaining_time": "4:05:41"}
{"current_steps": 1330, "total_steps": 1479, "loss": 0.0557, "lr": 3.101164659668732e-07, "epoch": 2.698122780314561, "percentage": 89.93, "elapsed_time": "1 day, 12:04:22", "remaining_time": "4:02:28"}
{"current_steps": 1332, "total_steps": 1479, "loss": 0.0572, "lr": 3.0198554581913343e-07, "epoch": 2.702181633688483, "percentage": 90.06, "elapsed_time": "1 day, 12:07:35", "remaining_time": "3:59:12"}
{"current_steps": 1334, "total_steps": 1479, "loss": 0.0529, "lr": 2.9395931855643043e-07, "epoch": 2.706240487062405, "percentage": 90.2, "elapsed_time": "1 day, 12:10:46", "remaining_time": "3:55:57"}
{"current_steps": 1336, "total_steps": 1479, "loss": 0.0522, "lr": 2.860379630391935e-07, "epoch": 2.7102993404363267, "percentage": 90.33, "elapsed_time": "1 day, 12:13:55", "remaining_time": "3:52:41"}
{"current_steps": 1338, "total_steps": 1479, "loss": 0.0516, "lr": 2.7822165579084013e-07, "epoch": 2.7143581938102486, "percentage": 90.47, "elapsed_time": "1 day, 12:17:24", "remaining_time": "3:49:27"}
{"current_steps": 1340, "total_steps": 1479, "loss": 0.0522, "lr": 2.705105709938388e-07, "epoch": 2.7184170471841704, "percentage": 90.6, "elapsed_time": "1 day, 12:20:34", "remaining_time": "3:46:11"}
{"current_steps": 1342, "total_steps": 1479, "loss": 0.0571, "lr": 2.629048804858275e-07, "epoch": 2.7224759005580923, "percentage": 90.74, "elapsed_time": "1 day, 12:23:48", "remaining_time": "3:42:56"}
{"current_steps": 1344, "total_steps": 1479, "loss": 0.0579, "lr": 2.5540475375578967e-07, "epoch": 2.726534753932014, "percentage": 90.87, "elapsed_time": "1 day, 12:27:01", "remaining_time": "3:39:40"}
{"current_steps": 1346, "total_steps": 1479, "loss": 0.0537, "lr": 2.4801035794026987e-07, "epoch": 2.730593607305936, "percentage": 91.01, "elapsed_time": "1 day, 12:30:02", "remaining_time": "3:36:24"}
{"current_steps": 1348, "total_steps": 1479, "loss": 0.0521, "lr": 2.407218578196524e-07, "epoch": 2.734652460679858, "percentage": 91.14, "elapsed_time": "1 day, 12:33:26", "remaining_time": "3:33:09"}
{"current_steps": 1350, "total_steps": 1479, "loss": 0.0584, "lr": 2.3353941581449048e-07, "epoch": 2.7387113140537798, "percentage": 91.28, "elapsed_time": "1 day, 12:36:47", "remaining_time": "3:29:54"}
{"current_steps": 1352, "total_steps": 1479, "loss": 0.0531, "lr": 2.2646319198188495e-07, "epoch": 2.7427701674277016, "percentage": 91.41, "elapsed_time": "1 day, 12:39:49", "remaining_time": "3:26:38"}
{"current_steps": 1354, "total_steps": 1479, "loss": 0.0597, "lr": 2.1949334401192013e-07, "epoch": 2.7468290208016235, "percentage": 91.55, "elapsed_time": "1 day, 12:43:11", "remaining_time": "3:23:23"}
{"current_steps": 1356, "total_steps": 1479, "loss": 0.0593, "lr": 2.1263002722414383e-07, "epoch": 2.7508878741755454, "percentage": 91.68, "elapsed_time": "1 day, 12:46:23", "remaining_time": "3:20:08"}
{"current_steps": 1358, "total_steps": 1479, "loss": 0.0558, "lr": 2.0587339456411503e-07, "epoch": 2.7549467275494672, "percentage": 91.82, "elapsed_time": "1 day, 12:49:36", "remaining_time": "3:16:52"}
{"current_steps": 1360, "total_steps": 1479, "loss": 0.0535, "lr": 1.9922359659998724e-07, "epoch": 2.759005580923389, "percentage": 91.95, "elapsed_time": "1 day, 12:53:12", "remaining_time": "3:13:39"}
{"current_steps": 1362, "total_steps": 1479, "loss": 0.0572, "lr": 1.9268078151915724e-07, "epoch": 2.763064434297311, "percentage": 92.09, "elapsed_time": "1 day, 12:56:15", "remaining_time": "3:10:23"}
{"current_steps": 1364, "total_steps": 1479, "loss": 0.0567, "lr": 1.8624509512496336e-07, "epoch": 2.767123287671233, "percentage": 92.22, "elapsed_time": "1 day, 12:59:14", "remaining_time": "3:07:06"}
{"current_steps": 1366, "total_steps": 1479, "loss": 0.0561, "lr": 1.799166808334335e-07, "epoch": 2.7711821410451547, "percentage": 92.36, "elapsed_time": "1 day, 13:02:25", "remaining_time": "3:03:50"}
{"current_steps": 1368, "total_steps": 1479, "loss": 0.052, "lr": 1.7369567967009226e-07, "epoch": 2.7752409944190766, "percentage": 92.49, "elapsed_time": "1 day, 13:06:04", "remaining_time": "3:00:37"}
{"current_steps": 1370, "total_steps": 1479, "loss": 0.056, "lr": 1.6758223026681507e-07, "epoch": 2.7792998477929984, "percentage": 92.63, "elapsed_time": "1 day, 13:09:29", "remaining_time": "2:57:23"}
{"current_steps": 1372, "total_steps": 1479, "loss": 0.0508, "lr": 1.615764688587429e-07, "epoch": 2.7833587011669203, "percentage": 92.77, "elapsed_time": "1 day, 13:12:39", "remaining_time": "2:54:07"}
{"current_steps": 1374, "total_steps": 1479, "loss": 0.0571, "lr": 1.5567852928124237e-07, "epoch": 2.787417554540842, "percentage": 92.9, "elapsed_time": "1 day, 13:15:48", "remaining_time": "2:50:51"}
{"current_steps": 1376, "total_steps": 1479, "loss": 0.0503, "lr": 1.4988854296692557e-07, "epoch": 2.791476407914764, "percentage": 93.04, "elapsed_time": "1 day, 13:19:00", "remaining_time": "2:47:36"}
{"current_steps": 1378, "total_steps": 1479, "loss": 0.0599, "lr": 1.442066389427199e-07, "epoch": 2.795535261288686, "percentage": 93.17, "elapsed_time": "1 day, 13:22:15", "remaining_time": "2:44:20"}
{"current_steps": 1380, "total_steps": 1479, "loss": 0.0563, "lr": 1.386329438269929e-07, "epoch": 2.7995941146626078, "percentage": 93.31, "elapsed_time": "1 day, 13:25:15", "remaining_time": "2:41:04"}
{"current_steps": 1382, "total_steps": 1479, "loss": 0.0559, "lr": 1.3316758182673307e-07, "epoch": 2.8036529680365296, "percentage": 93.44, "elapsed_time": "1 day, 13:28:16", "remaining_time": "2:37:48"}
{"current_steps": 1384, "total_steps": 1479, "loss": 0.0553, "lr": 1.2781067473477905e-07, "epoch": 2.8077118214104515, "percentage": 93.58, "elapsed_time": "1 day, 13:31:04", "remaining_time": "2:34:31"}
{"current_steps": 1386, "total_steps": 1479, "loss": 0.0492, "lr": 1.225623419271055e-07, "epoch": 2.8117706747843734, "percentage": 93.71, "elapsed_time": "1 day, 13:34:11", "remaining_time": "2:31:15"}
{"current_steps": 1388, "total_steps": 1479, "loss": 0.0542, "lr": 1.1742270036016523e-07, "epoch": 2.8158295281582952, "percentage": 93.85, "elapsed_time": "1 day, 13:37:17", "remaining_time": "2:27:59"}
{"current_steps": 1390, "total_steps": 1479, "loss": 0.0551, "lr": 1.1239186456828033e-07, "epoch": 2.819888381532217, "percentage": 93.98, "elapsed_time": "1 day, 13:40:05", "remaining_time": "2:24:42"}
{"current_steps": 1392, "total_steps": 1479, "loss": 0.0573, "lr": 1.0746994666109234e-07, "epoch": 2.823947234906139, "percentage": 94.12, "elapsed_time": "1 day, 13:43:19", "remaining_time": "2:21:27"}
{"current_steps": 1394, "total_steps": 1479, "loss": 0.0546, "lr": 1.0265705632106216e-07, "epoch": 2.828006088280061, "percentage": 94.25, "elapsed_time": "1 day, 13:46:30", "remaining_time": "2:18:12"}
{"current_steps": 1396, "total_steps": 1479, "loss": 0.0541, "lr": 9.795330080102527e-08, "epoch": 2.8320649416539827, "percentage": 94.39, "elapsed_time": "1 day, 13:49:52", "remaining_time": "2:14:57"}
{"current_steps": 1398, "total_steps": 1479, "loss": 0.0519, "lr": 9.335878492180373e-08, "epoch": 2.8361237950279046, "percentage": 94.52, "elapsed_time": "1 day, 13:53:02", "remaining_time": "2:11:42"}
{"current_steps": 1400, "total_steps": 1479, "loss": 0.0557, "lr": 8.887361106986848e-08, "epoch": 2.8401826484018264, "percentage": 94.66, "elapsed_time": "1 day, 13:56:13", "remaining_time": "2:08:26"}
{"current_steps": 1402, "total_steps": 1479, "loss": 0.0594, "lr": 8.44978791950607e-08, "epoch": 2.8442415017757483, "percentage": 94.79, "elapsed_time": "1 day, 13:59:29", "remaining_time": "2:05:11"}
{"current_steps": 1404, "total_steps": 1479, "loss": 0.0599, "lr": 8.023168680835913e-08, "epoch": 2.84830035514967, "percentage": 94.93, "elapsed_time": "1 day, 14:02:42", "remaining_time": "2:01:56"}
{"current_steps": 1406, "total_steps": 1479, "loss": 0.0596, "lr": 7.60751289797118e-08, "epoch": 2.852359208523592, "percentage": 95.06, "elapsed_time": "1 day, 14:05:51", "remaining_time": "1:58:40"}
{"current_steps": 1408, "total_steps": 1479, "loss": 0.056, "lr": 7.202829833591496e-08, "epoch": 2.856418061897514, "percentage": 95.2, "elapsed_time": "1 day, 14:09:13", "remaining_time": "1:55:26"}
{"current_steps": 1410, "total_steps": 1479, "loss": 0.0556, "lr": 6.809128505855189e-08, "epoch": 2.8604769152714358, "percentage": 95.33, "elapsed_time": "1 day, 14:12:37", "remaining_time": "1:52:11"}
{"current_steps": 1412, "total_steps": 1479, "loss": 0.0532, "lr": 6.426417688197961e-08, "epoch": 2.8645357686453576, "percentage": 95.47, "elapsed_time": "1 day, 14:15:50", "remaining_time": "1:48:56"}
{"current_steps": 1414, "total_steps": 1479, "loss": 0.0496, "lr": 6.054705909137426e-08, "epoch": 2.8685946220192795, "percentage": 95.61, "elapsed_time": "1 day, 14:19:04", "remaining_time": "1:45:41"}
{"current_steps": 1416, "total_steps": 1479, "loss": 0.0539, "lr": 5.6940014520834865e-08, "epoch": 2.8726534753932014, "percentage": 95.74, "elapsed_time": "1 day, 14:22:25", "remaining_time": "1:42:26"}
{"current_steps": 1418, "total_steps": 1479, "loss": 0.0523, "lr": 5.344312355153036e-08, "epoch": 2.8767123287671232, "percentage": 95.88, "elapsed_time": "1 day, 14:25:42", "remaining_time": "1:39:11"}
{"current_steps": 1420, "total_steps": 1479, "loss": 0.0547, "lr": 5.005646410991549e-08, "epoch": 2.880771182141045, "percentage": 96.01, "elapsed_time": "1 day, 14:29:01", "remaining_time": "1:35:56"}
{"current_steps": 1422, "total_steps": 1479, "loss": 0.0619, "lr": 4.678011166598884e-08, "epoch": 2.884830035514967, "percentage": 96.15, "elapsed_time": "1 day, 14:31:54", "remaining_time": "1:32:40"}
{"current_steps": 1424, "total_steps": 1479, "loss": 0.0506, "lr": 4.3614139231614725e-08, "epoch": 2.888888888888889, "percentage": 96.28, "elapsed_time": "1 day, 14:34:56", "remaining_time": "1:29:24"}
{"current_steps": 1426, "total_steps": 1479, "loss": 0.054, "lr": 4.0558617358892326e-08, "epoch": 2.8929477422628107, "percentage": 96.42, "elapsed_time": "1 day, 14:38:04", "remaining_time": "1:26:09"}
{"current_steps": 1428, "total_steps": 1479, "loss": 0.0587, "lr": 3.7613614138587995e-08, "epoch": 2.8970065956367326, "percentage": 96.55, "elapsed_time": "1 day, 14:41:15", "remaining_time": "1:22:54"}
{"current_steps": 1430, "total_steps": 1479, "loss": 0.0537, "lr": 3.477919519861428e-08, "epoch": 2.9010654490106544, "percentage": 96.69, "elapsed_time": "1 day, 14:44:10", "remaining_time": "1:19:38"}
{"current_steps": 1432, "total_steps": 1479, "loss": 0.0549, "lr": 3.205542370256997e-08, "epoch": 2.9051243023845763, "percentage": 96.82, "elapsed_time": "1 day, 14:47:26", "remaining_time": "1:16:23"}
{"current_steps": 1434, "total_steps": 1479, "loss": 0.059, "lr": 2.944236034832959e-08, "epoch": 2.909183155758498, "percentage": 96.96, "elapsed_time": "1 day, 14:50:33", "remaining_time": "1:13:08"}
{"current_steps": 1436, "total_steps": 1479, "loss": 0.0545, "lr": 2.6940063366693303e-08, "epoch": 2.91324200913242, "percentage": 97.09, "elapsed_time": "1 day, 14:53:41", "remaining_time": "1:09:52"}
{"current_steps": 1438, "total_steps": 1479, "loss": 0.0544, "lr": 2.4548588520089123e-08, "epoch": 2.917300862506342, "percentage": 97.23, "elapsed_time": "1 day, 14:57:03", "remaining_time": "1:06:38"}
{"current_steps": 1440, "total_steps": 1479, "loss": 0.0531, "lr": 2.2267989101328878e-08, "epoch": 2.9213597158802638, "percentage": 97.36, "elapsed_time": "1 day, 15:00:06", "remaining_time": "1:03:22"}
{"current_steps": 1442, "total_steps": 1479, "loss": 0.0548, "lr": 2.0098315932421952e-08, "epoch": 2.9254185692541856, "percentage": 97.5, "elapsed_time": "1 day, 15:03:35", "remaining_time": "1:00:08"}
{"current_steps": 1444, "total_steps": 1479, "loss": 0.0489, "lr": 1.803961736344062e-08, "epoch": 2.9294774226281075, "percentage": 97.63, "elapsed_time": "1 day, 15:06:49", "remaining_time": "0:56:52"}
{"current_steps": 1446, "total_steps": 1479, "loss": 0.0541, "lr": 1.6091939271446478e-08, "epoch": 2.9335362760020294, "percentage": 97.77, "elapsed_time": "1 day, 15:10:03", "remaining_time": "0:53:37"}
{"current_steps": 1448, "total_steps": 1479, "loss": 0.057, "lr": 1.4255325059463477e-08, "epoch": 2.9375951293759512, "percentage": 97.9, "elapsed_time": "1 day, 15:13:18", "remaining_time": "0:50:22"}
{"current_steps": 1450, "total_steps": 1479, "loss": 0.0524, "lr": 1.252981565551481e-08, "epoch": 2.941653982749873, "percentage": 98.04, "elapsed_time": "1 day, 15:16:58", "remaining_time": "0:47:08"}
{"current_steps": 1452, "total_steps": 1479, "loss": 0.0546, "lr": 1.0915449511708088e-08, "epoch": 2.945712836123795, "percentage": 98.17, "elapsed_time": "1 day, 15:20:12", "remaining_time": "0:43:53"}
{"current_steps": 1454, "total_steps": 1479, "loss": 0.0544, "lr": 9.412262603378797e-09, "epoch": 2.949771689497717, "percentage": 98.31, "elapsed_time": "1 day, 15:23:32", "remaining_time": "0:40:38"}
{"current_steps": 1456, "total_steps": 1479, "loss": 0.0532, "lr": 8.020288428289836e-09, "epoch": 2.9538305428716387, "percentage": 98.44, "elapsed_time": "1 day, 15:27:05", "remaining_time": "0:37:23"}
{"current_steps": 1458, "total_steps": 1479, "loss": 0.0577, "lr": 6.739558005884883e-09, "epoch": 2.9578893962455606, "percentage": 98.58, "elapsed_time": "1 day, 15:30:25", "remaining_time": "0:34:08"}
{"current_steps": 1460, "total_steps": 1479, "loss": 0.0582, "lr": 5.570099876595625e-09, "epoch": 2.9619482496194824, "percentage": 98.72, "elapsed_time": "1 day, 15:33:52", "remaining_time": "0:30:53"}
{"current_steps": 1462, "total_steps": 1479, "loss": 0.0517, "lr": 4.511940101207812e-09, "epoch": 2.9660071029934043, "percentage": 98.85, "elapsed_time": "1 day, 15:37:06", "remaining_time": "0:27:38"}
{"current_steps": 1464, "total_steps": 1479, "loss": 0.0566, "lr": 3.565102260278397e-09, "epoch": 2.970065956367326, "percentage": 98.99, "elapsed_time": "1 day, 15:40:17", "remaining_time": "0:24:23"}
{"current_steps": 1466, "total_steps": 1479, "loss": 0.0521, "lr": 2.72960745361206e-09, "epoch": 2.974124809741248, "percentage": 99.12, "elapsed_time": "1 day, 15:43:20", "remaining_time": "0:21:08"}
{"current_steps": 1468, "total_steps": 1479, "loss": 0.0561, "lr": 2.0054742997893674e-09, "epoch": 2.97818366311517, "percentage": 99.26, "elapsed_time": "1 day, 15:46:49", "remaining_time": "0:17:53"}
{"current_steps": 1470, "total_steps": 1479, "loss": 0.0527, "lr": 1.392718935752102e-09, "epoch": 2.9822425164890918, "percentage": 99.39, "elapsed_time": "1 day, 15:50:03", "remaining_time": "0:14:37"}
{"current_steps": 1472, "total_steps": 1479, "loss": 0.053, "lr": 8.913550164463269e-10, "epoch": 2.9863013698630136, "percentage": 99.53, "elapsed_time": "1 day, 15:53:21", "remaining_time": "0:11:22"}
{"current_steps": 1474, "total_steps": 1479, "loss": 0.0563, "lr": 5.013937145131875e-10, "epoch": 2.9903602232369355, "percentage": 99.66, "elapsed_time": "1 day, 15:56:18", "remaining_time": "0:08:07"}
{"current_steps": 1476, "total_steps": 1479, "loss": 0.0562, "lr": 2.2284372004410804e-10, "epoch": 2.9944190766108574, "percentage": 99.8, "elapsed_time": "1 day, 16:00:02", "remaining_time": "0:04:52"}
{"current_steps": 1478, "total_steps": 1479, "loss": 0.0501, "lr": 5.5711240385392106e-11, "epoch": 2.9984779299847792, "percentage": 99.93, "elapsed_time": "1 day, 16:03:15", "remaining_time": "0:01:37"}
{"current_steps": 1479, "total_steps": 1479, "epoch": 3.0, "percentage": 100.0, "elapsed_time": "1 day, 16:06:14", "remaining_time": "0:00:00"}

5216
trainer_state.json Normal file

File diff suppressed because it is too large

3
training_args.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6015fc8e533802016e17825d7d7dc917a1ac96f1847d500c295a0374f98f4b48
size 8081

BIN
training_loss.png Normal file

Binary file not shown. Size: 46 KiB

1
vocab.json Normal file

File diff suppressed because one or more lines are too long