Initialize project; model provided by the ModelHub XC community
Model: W-61/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315 Source: Original Platform
36
.gitattributes
vendored
Normal file
@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
76
README.md
Normal file
@@ -0,0 +1,76 @@
---
library_name: transformers
base_model: W-61/qwen3-8b-base-sft-ultrachat-4xh200-batch-128
tags:
- alignment-handbook
- margin-dpo
- generated_from_trainer
datasets:
- HuggingFaceH4/ultrafeedback_binarized
model-index:
- name: qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315

This model is a fine-tuned version of [W-61/qwen3-8b-base-sft-ultrachat-4xh200-batch-128](https://huggingface.co/W-61/qwen3-8b-base-sft-ultrachat-4xh200-batch-128) on the HuggingFaceH4/ultrafeedback_binarized dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5602
- Margin Dpo/margin Mean: 48.7131
- Margin Dpo/margin Std: 68.1546
- Logps/chosen: -316.0414
- Logps/rejected: -345.1451
- Logps/ref Chosen: -281.4589
- Logps/ref Rejected: -261.8495
- Logits/chosen: 1.1933
- Logits/rejected: 1.2367
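
The reported margin mean appears to be numerically consistent with the reference-adjusted log-probability gap between chosen and rejected responses, `(logps_chosen - ref_chosen) - (logps_rejected - ref_rejected)`, with no extra scaling. This is an inference from the figures in this card, not a statement about the training code; the helper name `implied_margin` is hypothetical. A minimal check, using the evaluation figures reported at steps 400 and 200:

```python
def implied_margin(logp_chosen, logp_rejected, ref_chosen, ref_rejected):
    """Reference-adjusted log-prob gap (assumed definition, inferred from the card)."""
    return (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)


# Evaluation figures copied from this card (step 400, then step 200).
step_400 = implied_margin(-316.0414, -345.1451, -281.4589, -261.8495)
step_200 = implied_margin(-287.0227, -295.6718, -281.4589, -261.8495)

print(f"step 400: {step_400:.4f} (reported 48.7131)")
print(f"step 200: {step_200:.4f} (reported 28.2584)")
```

Both values agree with the reported means to within rounding of the four-decimal inputs, which suggests (but does not prove) that the logged margin is this quantity averaged over the evaluation set.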

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 8
- total_train_batch_size: 128
- total_eval_batch_size: 16
- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
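
The totals in the list above follow from the per-device values, and the learning-rate settings describe a warmup-then-cosine curve. The sketch below reproduces the batch-size arithmetic and an illustrative linear-warmup plus half-cosine schedule. It is not the Trainer's own code: `cosine_lr` is a hypothetical helper, and the step count of 477 is an assumption taken from the line count of `margin_logs/margins.jsonl` in this commit.

```python
import math

# Values copied from the hyperparameter list.
PER_DEVICE_TRAIN_BATCH = 4
PER_DEVICE_EVAL_BATCH = 4
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 8
BASE_LR = 5e-7
WARMUP_RATIO = 0.1

# Assumption: 477 optimizer steps (line count of margin_logs/margins.jsonl).
TOTAL_STEPS = 477

# Effective batch sizes: per-device batch x devices (x accumulation for training).
total_train_batch = PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS
total_eval_batch = PER_DEVICE_EVAL_BATCH * NUM_DEVICES


def cosine_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup followed by half-cosine decay to zero (illustrative only)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print("total_train_batch_size:", total_train_batch)
print("total_eval_batch_size:", total_eval_batch)
for s in (0, 48, 200, 400, 477):
    print(f"step {s:3d}: lr = {cosine_lr(s):.3e}")
```

Under these assumptions the rate peaks at `5e-07` once warmup ends and decays toward zero by the final step.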

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Margin Dpo/margin Mean | Margin Dpo/margin Std | Logps/chosen | Logps/rejected | Logps/ref Chosen | Logps/ref Rejected | Logits/chosen | Logits/rejected |
|:-------------:|:------:|:----:|:---------------:|:----------------------:|:---------------------:|:------------:|:--------------:|:----------------:|:------------------:|:-------------:|:---------------:|
| 4.9383        | 0.4188 | 200  | 0.5970          | 28.2584                | 39.0244               | -287.0227    | -295.6718      | -281.4589        | -261.8495          | 1.4300        | 1.4697          |
| 4.2739        | 0.8377 | 400  | 0.5602          | 48.7131                | 68.1546               | -316.0414    | -345.1451      | -281.4589        | -261.8495          | 1.1933        | 1.2367          |


### Framework versions

- Transformers 4.51.0
- Pytorch 2.3.1+cu121
- Datasets 2.21.0
- Tokenizers 0.21.4
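
With the framework versions listed above, loading the checkpoint for inference could look roughly like the sketch below. It assumes the repository id from the commit header resolves on the hosting platform (or is replaced by a local path), and that `transformers` and `torch` are installed. The helper names `load_model` and `generate` are hypothetical, and `bfloat16` is an assumption chosen to save memory, since `config.json` records `float32`.

```python
MODEL_ID = "W-61/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315"


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and model; needs `transformers` and `torch` installed."""
    # Imported lazily so this file can be read or imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Assumption: bfloat16 to reduce memory; the saved config uses float32.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for `prompt` with default decoding settings."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` downloads the weights on first use, so it is best run on a machine with enough memory for an 8B-parameter model.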
28
added_tokens.json
Normal file
@@ -0,0 +1,28 @@
{
  "</think>": 151668,
  "</tool_call>": 151658,
  "</tool_response>": 151666,
  "<think>": 151667,
  "<tool_call>": 151657,
  "<tool_response>": 151665,
  "<|box_end|>": 151649,
  "<|box_start|>": 151648,
  "<|endoftext|>": 151643,
  "<|file_sep|>": 151664,
  "<|fim_middle|>": 151660,
  "<|fim_pad|>": 151662,
  "<|fim_prefix|>": 151659,
  "<|fim_suffix|>": 151661,
  "<|im_end|>": 151645,
  "<|im_start|>": 151644,
  "<|image_pad|>": 151655,
  "<|object_ref_end|>": 151647,
  "<|object_ref_start|>": 151646,
  "<|quad_end|>": 151651,
  "<|quad_start|>": 151650,
  "<|repo_name|>": 151663,
  "<|video_pad|>": 151656,
  "<|vision_end|>": 151653,
  "<|vision_pad|>": 151654,
  "<|vision_start|>": 151652
}
22
all_results.json
Normal file
@@ -0,0 +1,22 @@
{
  "epoch": 0.9989528795811519,
  "eval_logits/chosen": 1.1346925497055054,
  "eval_logits/rejected": 1.1706666946411133,
  "eval_logps/chosen": -313.02630615234375,
  "eval_logps/ref_chosen": -281.4588928222656,
  "eval_logps/ref_rejected": -261.84954833984375,
  "eval_logps/rejected": -342.32696533203125,
  "eval_loss": 0.5572407245635986,
  "eval_margin_dpo/margin_mean": 48.91001510620117,
  "eval_margin_dpo/margin_std": 68.51956176757812,
  "eval_runtime": 92.7227,
  "eval_samples": 2000,
  "eval_samples_per_second": 21.57,
  "eval_steps_per_second": 1.348,
  "total_flos": 0.0,
  "train_loss": 4.779813265150698,
  "train_runtime": 7822.2821,
  "train_samples": 61135,
  "train_samples_per_second": 7.815,
  "train_steps_per_second": 0.061
}
30
config.json
Normal file
@@ -0,0 +1,30 @@
{
  "architectures": [
    "Qwen3ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 12288,
  "max_position_embeddings": 32768,
  "max_window_layers": 36,
  "model_type": "qwen3",
  "num_attention_heads": 32,
  "num_hidden_layers": 36,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "float32",
  "transformers_version": "4.51.0",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
}
16
eval_results.json
Normal file
@@ -0,0 +1,16 @@
{
  "epoch": 0.9989528795811519,
  "eval_logits/chosen": 1.1346925497055054,
  "eval_logits/rejected": 1.1706666946411133,
  "eval_logps/chosen": -313.02630615234375,
  "eval_logps/ref_chosen": -281.4588928222656,
  "eval_logps/ref_rejected": -261.84954833984375,
  "eval_logps/rejected": -342.32696533203125,
  "eval_loss": 0.5572407245635986,
  "eval_margin_dpo/margin_mean": 48.91001510620117,
  "eval_margin_dpo/margin_std": 68.51956176757812,
  "eval_runtime": 92.7227,
  "eval_samples": 2000,
  "eval_samples_per_second": 21.57,
  "eval_steps_per_second": 1.348
}
6
generation_config.json
Normal file
@@ -0,0 +1,6 @@
{
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "max_new_tokens": 2048,
  "transformers_version": "4.51.0"
}
477
margin_logs/margins.jsonl
Normal file
@@ -0,0 +1,477 @@
{"epoch": 0.0, "step": 1, "batch_size": 128, "mean": 0.01704716682434082, "std": 0.7752009630203247, "min": -2.2227783203125, "p10": -0.8393455505371094, "median": 0.0678863525390625, "p90": 0.8599296569824219, "max": 2.196014404296875, "pos_frac": 0.5546875, "sample": [0.6739997863769531, -0.3828582763671875, -1.298828125, 0.071014404296875, 0.8088531494140625, 0.6873779296875, -0.081512451171875, 1.77093505859375, -1.4920806884765625, 1.769927978515625, 0.0503692626953125, -0.1540679931640625, 0.15589141845703125, -0.27809906005859375, -0.009723663330078125, 0.3800621032714844, 0.111541748046875, 0.2325439453125, 1.35693359375, -0.482269287109375, 0.2686614990234375, 1.01312255859375, -0.5781745910644531, -0.1348876953125, 0.15979766845703125, -1.054656982421875, 0.01500701904296875, 0.2495269775390625, 0.5402507781982422, 1.80511474609375, -0.8189926147460938, -0.17108154296875, 0.2899322509765625, -0.8205108642578125, 0.650146484375, -0.6688079833984375, 0.080841064453125, -1.4810791015625, 0.07396697998046875, -1.242919921875, 0.086151123046875, 1.62188720703125, -0.38132476806640625, 0.317474365234375, 2.196014404296875, 0.5562095642089844, 0.1424732208251953, -0.7183074951171875, -0.6070556640625, -0.6271820068359375, -0.5325164794921875, 0.67987060546875, -1.229156494140625, -1.91693115234375, -0.143280029296875, 0.2532958984375, -0.349273681640625, 0.42700958251953125, -0.262420654296875, 1.2039794921875, 0.364410400390625, 0.8435211181640625, 0.00079345703125, -0.19837188720703125, -0.5950164794921875, 0.362579345703125, 0.7775421142578125, -0.015625, -0.0785369873046875, -0.6810150146484375, 0.31806182861328125, -0.27123260498046875, 0.00982666015625, 0.098541259765625, -1.4693603515625, 0.20909881591796875, 0.87872314453125, 0.0, 0.4891510009765625, 0.100677490234375, -0.3092498779296875, 0.839263916015625, -0.5568389892578125, -0.5927276611328125, 1.82940673828125, -1.8787841796875, 0.4121856689453125, 0.1029205322265625, -0.05619049072265625, 
0.41302490234375, -0.20123291015625, -1.11163330078125, 0.11328125, -0.6293182373046875, -0.4356689453125, 1.026641845703125, -0.4438629150390625, 0.06475830078125, 0.15139389038085938, 0.4244384765625, -0.3410911560058594, -0.8832931518554688, 0.38405609130859375, -0.6134033203125, 0.14745521545410156, -2.2227783203125, -0.3340606689453125, -0.4396820068359375, 0.543609619140625, 0.310455322265625, -0.6277847290039062, 0.28015899658203125, 0.1476898193359375, 0.3895263671875, 0.0498046875, -0.1731414794921875, 0.24071502685546875, -0.351348876953125, -0.0234375, 0.6951370239257812, 0.0606689453125, 0.8756256103515625, 0.680145263671875, 0.14440345764160156, 0.8532028198242188, -1.6363525390625, 1.4149551391601562, -0.476959228515625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000001.npy"}
{"epoch": 0.0020942408376963353, "step": 2, "batch_size": 128, "mean": 0.08979731798171997, "std": 1.0063953399658203, "min": -2.3612060546875, "p10": -1.1013275146484374, "median": 0.015979766845703125, "p90": 1.2868972778320311, "max": 4.49609375, "pos_frac": 0.53125, "sample": [0.006134033203125, 0.6754608154296875, 0.06409072875976562, 0.3348197937011719, 0.104522705078125, 0.848785400390625, -1.3365631103515625, 0.40827178955078125, 1.093963623046875, -0.233673095703125, 0.2210693359375, 0.34949493408203125, 1.295623779296875, -0.01953887939453125, -0.0232696533203125, -0.88238525390625, 1.92608642578125, -0.0975494384765625, -0.0151214599609375, -0.137054443359375, 0.18506240844726562, -0.7067947387695312, -1.04547119140625, 0.01041412353515625, 0.3020477294921875, -0.03971099853515625, -0.283477783203125, 0.6910400390625, 0.15057373046875, -0.1564178466796875, 0.561614990234375, 0.05670166015625, -0.48236846923828125, 1.7666015625, -0.0575408935546875, -1.0196533203125, -2.3612060546875, 2.6371917724609375, -0.6959686279296875, 0.836669921875, -0.16046142578125, -0.033050537109375, 0.0088958740234375, 1.2831573486328125, 1.604461669921875, -0.9659347534179688, 0.1966876983642578, -1.5263671875, -0.4609222412109375, -0.87158203125, 4.49609375, -0.16178131103515625, -1.54376220703125, 0.0208740234375, 0.0, 0.33994483947753906, 0.185943603515625, -0.30196380615234375, 3.5068359375, 1.463623046875, 0.9294891357421875, -0.02225494384765625, -1.140228271484375, 0.1556549072265625, -0.2645263671875, 0.90374755859375, 0.5412216186523438, -1.59906005859375, -0.3666229248046875, -0.4569854736328125, -0.6967010498046875, 0.36395263671875, 0.13336181640625, -0.068084716796875, -0.563751220703125, 0.5245513916015625, 0.082733154296875, -0.73480224609375, 0.570953369140625, 0.2091522216796875, -0.021331787109375, -0.547821044921875, 1.6211090087890625, 0.8912696838378906, 0.40868377685546875, 2.679595947265625, 0.1652679443359375, -0.8451919555664062, -0.22149658203125, 
-0.6046524047851562, -0.001617431640625, 0.72979736328125, -1.3320388793945312, -0.18963623046875, 1.436309814453125, 0.23449325561523438, 0.4606475830078125, 0.095184326171875, 0.49684906005859375, -0.4890289306640625, 0.9012298583984375, 1.9844970703125, 1.4336700439453125, -0.7718658447265625, 0.1373443603515625, -1.352508544921875, -1.7950439453125, 0.286163330078125, -0.2695465087890625, 0.72283935546875, 0.463165283203125, -1.27398681640625, 0.45489501953125, 0.01837158203125, 0.01849365234375, 0.36487579345703125, -1.4290771484375, 0.8643035888671875, -0.195465087890625, 0.240478515625, -0.7327880859375, -1.255859375, -0.87091064453125, -1.08465576171875, 1.1551513671875, 0.01358795166015625, -0.0644378662109375, -1.950225830078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000002.npy"}
{"epoch": 0.004188481675392671, "step": 3, "batch_size": 128, "mean": 0.06652943789958954, "std": 0.9142764210700989, "min": -2.4149169921875, "p10": -0.96231689453125, "median": 0.010241508483886719, "p90": 1.0579551696777343, "max": 3.880859375, "pos_frac": 0.515625, "sample": [0.515106201171875, 0.002620697021484375, -0.1837158203125, -0.3484039306640625, 0.241790771484375, -2.262786865234375, 0.178314208984375, -0.331878662109375, 0.0, -0.49114990234375, 1.47381591796875, -0.984954833984375, -0.7327880859375, 1.44024658203125, -0.33465576171875, 0.2147979736328125, -0.6630859375, 1.21923828125, 0.42364501953125, 0.2640380859375, -0.51776123046875, 0.24951171875, -0.40716552734375, -0.023590087890625, -0.37969970703125, -0.12957763671875, 1.09881591796875, -1.07525634765625, 0.024688720703125, 0.500274658203125, -0.08501815795898438, 0.08338546752929688, 0.6204833984375, 1.311126708984375, -0.537109375, 0.18043136596679688, -0.11798095703125, -0.791046142578125, -0.809539794921875, 1.936065673828125, -0.2339935302734375, 0.32330322265625, 1.005035400390625, 0.7603607177734375, -0.36358642578125, -1.600677490234375, 0.1370697021484375, 0.807159423828125, -0.399627685546875, -0.228363037109375, -0.1376953125, 0.22867965698242188, 0.480316162109375, 0.12942123413085938, 0.626739501953125, 0.12060546875, 0.5429840087890625, 2.08282470703125, 0.38067626953125, 0.7920799255371094, 0.0, -0.954498291015625, 3.613861083984375, -1.689788818359375, -0.6443252563476562, -0.4348602294921875, -0.12457275390625, -1.608428955078125, 0.09748458862304688, 0.31707000732421875, -0.018798828125, 0.1861572265625, 2.538665771484375, -2.4149169921875, -0.4970855712890625, 1.10107421875, -0.322509765625, -0.32891845703125, 0.8037109375, 0.12200927734375, -0.0377655029296875, 0.1877593994140625, 0.9709091186523438, 0.01605987548828125, -0.120025634765625, 1.64593505859375, -0.980560302734375, 0.51568603515625, 0.12811279296875, -1.399322509765625, 0.24663543701171875, 0.139923095703125, 
-0.6881103515625, -0.78826904296875, 0.18562698364257812, 0.4803924560546875, 0.12740325927734375, 1.0569686889648438, 0.2731285095214844, -0.13308334350585938, -1.0908470153808594, 0.6779632568359375, -0.24033355712890625, -0.456329345703125, 0.454559326171875, 0.351287841796875, -0.24166107177734375, 0.5218048095703125, 0.3153533935546875, -0.04250335693359375, -1.41534423828125, -1.190185546875, 1.0602569580078125, 0.7880859375, -0.5078277587890625, 0.7215118408203125, -0.3767547607421875, -0.3152923583984375, -0.450653076171875, -0.111053466796875, 3.880859375, -0.5643196105957031, 0.0230560302734375, 0.0044231414794921875, -1.4273605346679688, -0.148162841796875, 0.6254920959472656, -0.12353515625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000003.npy"}
{"epoch": 0.0062827225130890054, "step": 4, "batch_size": 128, "mean": 0.04047304391860962, "std": 1.0237667560577393, "min": -2.796875, "p10": -1.233058166503906, "median": 0.08390045166015625, "p90": 1.1627410888671874, "max": 4.143585205078125, "pos_frac": 0.53125, "sample": [-0.29274749755859375, 0.7579498291015625, -0.06168556213378906, 1.563720703125, -1.621490478515625, 0.082275390625, 0.19321441650390625, -0.4544677734375, 0.749053955078125, 0.637542724609375, -0.051544189453125, -1.8814697265625, 0.231536865234375, -0.4024810791015625, 0.16203689575195312, 0.82843017578125, 0.793609619140625, -1.773895263671875, 0.40386962890625, 2.06390380859375, 0.51446533203125, -1.42578125, -0.31646728515625, 0.4169921875, -2.796875, 1.0212860107421875, 4.143585205078125, -0.2604522705078125, 0.13275146484375, 0.44927215576171875, 0.2977294921875, -0.14313888549804688, 0.894134521484375, -1.108551025390625, 1.1500244140625, 2.09246826171875, 0.5927886962890625, 0.6392822265625, -0.0696868896484375, -0.837188720703125, -0.18714332580566406, 0.6250076293945312, -0.2453765869140625, 0.101104736328125, -1.0181884765625, -0.7226104736328125, -0.6060028076171875, -0.7268447875976562, 0.27484130859375, -2.257110595703125, 1.78369140625, 0.2735633850097656, -0.000152587890625, -1.859649658203125, -0.6530303955078125, -0.8171463012695312, 0.037109375, 1.645233154296875, 0.7010498046875, 0.381866455078125, 0.173736572265625, -1.5562286376953125, -0.01682281494140625, -0.9429473876953125, -0.202850341796875, -0.718658447265625, 0.1935882568359375, -1.3930511474609375, 0.213409423828125, -0.927703857421875, 0.5341949462890625, 1.262664794921875, 1.354827880859375, -1.16448974609375, 0.9023284912109375, -0.114013671875, -2.15045166015625, -0.072418212890625, 1.11700439453125, 0.05567169189453125, 0.6583938598632812, -0.02008819580078125, -0.45465087890625, -0.600555419921875, -0.0341796875, 0.462615966796875, 0.4788970947265625, 0.74078369140625, 0.979034423828125, 
-0.226104736328125, -0.1000518798828125, 0.43084716796875, -0.6466827392578125, -0.3905792236328125, -0.4959716796875, -0.108245849609375, 1.28515625, -0.20187759399414062, -2.005706787109375, 0.08660888671875, 0.7843017578125, 0.5416107177734375, 0.18561172485351562, -0.71783447265625, 1.6054840087890625, -2.4051513671875, -0.20859527587890625, -0.7619476318359375, 0.969970703125, 1.61566162109375, 0.392852783203125, 0.274322509765625, 0.0855255126953125, -0.26141357421875, 1.192413330078125, -0.63885498046875, -2.3831787109375, -0.4549407958984375, 0.058868408203125, -0.8582534790039062, 0.3757171630859375, 1.7520751953125, 0.423004150390625, 0.80419921875, 1.130767822265625, 0.82073974609375, -0.026153564453125, 0.4541015625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000004.npy"}
{"epoch": 0.008376963350785341, "step": 5, "batch_size": 128, "mean": -0.0679541677236557, "std": 0.857168436050415, "min": -2.48907470703125, "p10": -1.1549392700195311, "median": -0.05478477478027344, "p90": 1.0044921875, "max": 2.59912109375, "pos_frac": 0.46875, "sample": [-0.29166412353515625, -0.991943359375, -0.7362823486328125, -1.061492919921875, 0.0557098388671875, 1.1407470703125, -0.994537353515625, -1.1419525146484375, 0.7154388427734375, -1.688385009765625, -0.488555908203125, -0.465576171875, -0.28411102294921875, -0.6280746459960938, -0.1923370361328125, 0.550872802734375, 0.45101165771484375, 0.225067138671875, 0.305450439453125, -2.48907470703125, -0.202239990234375, 1.05316162109375, 0.10403060913085938, -0.09619140625, -0.71881103515625, 1.085662841796875, -0.8020172119140625, 0.25897979736328125, -0.0865631103515625, -0.09311294555664062, -0.29156494140625, -0.8238677978515625, 1.536102294921875, 0.0560150146484375, 1.0083160400390625, 0.1362152099609375, 0.059326171875, 0.461822509765625, -0.447418212890625, 1.6131591796875, -1.18524169921875, 0.50323486328125, 0.15728759765625, 1.0028533935546875, -0.14874267578125, -0.05299949645996094, -0.60223388671875, 0.52032470703125, 0.12845802307128906, -1.88787841796875, 0.193359375, -1.23394775390625, -0.4207305908203125, -0.05657005310058594, -0.44649505615234375, 1.068511962890625, 1.43701171875, -0.5117950439453125, 0.5001220703125, 2.59912109375, -1.24395751953125, 0.202239990234375, -0.1337127685546875, -0.36983489990234375, 0.06547164916992188, -0.7767219543457031, -0.0096893310546875, -0.2841796875, 0.213043212890625, 0.14703369140625, -0.688079833984375, 0.5413818359375, 0.980133056640625, -0.17806243896484375, 0.23873138427734375, 0.360015869140625, -0.975494384765625, 1.389892578125, 0.072479248046875, 0.4844970703125, -0.38641357421875, -1.6322021484375, 0.0673828125, -1.54229736328125, 0.834869384765625, -2.289886474609375, -0.6190185546875, 0.734375, 2.412750244140625, 
-0.4210357666015625, -1.676544189453125, 0.25970458984375, -1.962432861328125, -0.102752685546875, -0.273040771484375, 0.512786865234375, -1.582794189453125, -0.2206268310546875, -0.433013916015625, -0.339691162109375, -0.68804931640625, 0.15411376953125, 0.40227508544921875, -0.49285888671875, -0.8412399291992188, 0.15277099609375, 0.7791748046875, -0.009258270263671875, -0.17888641357421875, 0.54437255859375, 0.795989990234375, -0.23563385009765625, 1.5531005859375, -0.12969970703125, 0.0214080810546875, -0.012096405029296875, -0.06114959716796875, 0.302215576171875, 0.024749755859375, 0.74090576171875, -0.4268798828125, 0.02785491943359375, 0.91619873046875, -0.18024826049804688, -1.40447998046875, 0.132476806640625, -0.623809814453125, 1.29827880859375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000005.npy"}
{"epoch": 0.010471204188481676, "step": 6, "batch_size": 128, "mean": -0.1364934891462326, "std": 0.9510129690170288, "min": -3.309906005859375, "p10": -1.218207550048828, "median": -0.0892038345336914, "p90": 0.918285751342773, "max": 3.0152587890625, "pos_frac": 0.390625, "sample": [-0.9748382568359375, -0.28594970703125, -0.10379791259765625, 3.0152587890625, 0.027679443359375, 0.6424713134765625, 0.28692626953125, -0.1737823486328125, -0.1582489013671875, 0.185394287109375, -0.2953643798828125, 0.7171173095703125, 0.6766204833984375, -0.099365234375, 0.55126953125, -0.9931182861328125, 1.190673828125, -0.07904243469238281, 0.08103561401367188, 0.4193115234375, -2.12371826171875, 0.4153022766113281, 0.74517822265625, 1.1941375732421875, -0.46923828125, -0.542633056640625, -0.0313720703125, 0.35089111328125, 1.01666259765625, 0.47397422790527344, -0.12713623046875, -0.0452117919921875, 0.496673583984375, -0.7116241455078125, 0.2879638671875, -2.18890380859375, 0.3541412353515625, -1.5771484375, -1.132080078125, 1.987548828125, -0.525238037109375, -1.815399169921875, -3.309906005859375, 1.667510986328125, 0.04224395751953125, -0.5202789306640625, -0.065826416015625, -0.463470458984375, 1.1389312744140625, -1.03326416015625, 0.6374435424804688, 0.598480224609375, 1.28656005859375, -0.13568115234375, -0.28631591796875, 0.5135688781738281, -0.803863525390625, -1.07879638671875, -0.376678466796875, 0.4542236328125, 0.030963897705078125, -0.46905517578125, 0.8787612915039062, 0.19000244140625, -0.3760986328125, -0.825103759765625, -0.2801017761230469, 1.2816162109375, -0.04449462890625, -0.07086944580078125, 1.1248016357421875, -0.1287841796875, 0.203887939453125, -2.187744140625, 0.3600311279296875, -0.32904052734375, 0.72137451171875, -1.9743270874023438, 0.0, -0.21746826171875, -0.4379425048828125, -0.067840576171875, -1.2323226928710938, -0.02422332763671875, 0.875152587890625, -0.063934326171875, 0.6660003662109375, -1.0999298095703125, 0.03420257568359375, 
0.5595703125, 0.65557861328125, -0.79949951171875, 1.0105094909667969, -1.7874755859375, -0.05133056640625, 0.194488525390625, -1.212158203125, 0.470733642578125, -0.6071624755859375, -0.28936767578125, 2.243438720703125, -1.559295654296875, -0.9710845947265625, -0.60186767578125, -0.49680328369140625, 0.5296173095703125, -0.325408935546875, -1.95281982421875, -0.7571868896484375, -0.4367637634277344, -0.5784912109375, -0.011810302734375, 2.229949951171875, -0.47528076171875, -0.05712890625, -0.1943359375, -1.66156005859375, -0.218719482421875, -1.9971923828125, 0.3548736572265625, -0.05476570129394531, -0.3616943359375, -0.656890869140625, -0.6415252685546875, -0.95306396484375, -1.180908203125, 0.16304779052734375, -0.4567985534667969], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000006.npy"}
{"epoch": 0.012565445026178011, "step": 7, "batch_size": 128, "mean": -0.03095322847366333, "std": 0.9314342737197876, "min": -3.44091796875, "p10": -1.0310073852539061, "median": 0.0164642333984375, "p90": 1.1389503479003906, "max": 2.54736328125, "pos_frac": 0.5, "sample": [-1.064727783203125, 2.26171875, 0.8675537109375, -0.8339080810546875, 0.8603363037109375, 0.7361907958984375, 0.5269241333007812, -2.0054931640625, 0.2734375, 0.7644500732421875, -0.0122833251953125, 0.335906982421875, -1.720489501953125, -0.5012130737304688, -0.015106201171875, 0.575836181640625, 0.597259521484375, 1.1571884155273438, 0.0, 0.1326141357421875, 0.4608612060546875, -0.563385009765625, 0.23235321044921875, -0.0536956787109375, -0.743988037109375, -0.203582763671875, -1.16766357421875, -0.6612014770507812, 1.55535888671875, -1.0165557861328125, 0.032928466796875, -0.082794189453125, -0.3311500549316406, 1.4893341064453125, -0.083526611328125, 1.23187255859375, -0.164794921875, -0.564544677734375, -1.12188720703125, -0.14069366455078125, -1.7794189453125, 0.40685272216796875, -0.850799560546875, 0.207122802734375, 0.066925048828125, 0.173248291015625, 1.35491943359375, 0.035099029541015625, 0.2936725616455078, 1.0972900390625, 0.4657135009765625, -2.0701904296875, 1.18701171875, -0.9025726318359375, -0.79364013671875, -0.2639312744140625, -1.7798919677734375, 0.1282958984375, -0.8500213623046875, 0.3815765380859375, -0.4375, 2.54736328125, 0.3784217834472656, -2.07818603515625, -1.1144561767578125, 0.482421875, -0.994232177734375, 0.04393959045410156, 0.4752197265625, -0.889068603515625, -0.19256591796875, 0.501251220703125, -0.08046913146972656, 0.6837310791015625, -0.14661407470703125, -0.369964599609375, 0.32884979248046875, -0.39056396484375, -2.31842041015625, -0.75238037109375, -1.9105224609375, -0.3220710754394531, -0.287689208984375, -0.0373687744140625, 1.77947998046875, 0.2950592041015625, 0.39019775390625, 0.770782470703125, 1.131134033203125, -0.25183868408203125, 
0.129058837890625, -0.4675140380859375, -0.18155670166015625, -0.5851860046386719, -1.00848388671875, -0.7311553955078125, 0.20080184936523438, -0.19658470153808594, 0.32241058349609375, 0.3306884765625, 0.347991943359375, -0.599945068359375, -0.7174072265625, 1.45428466796875, 0.3536529541015625, 0.1883544921875, 1.27935791015625, 0.060791015625, 0.35577392578125, -0.6636962890625, -0.0660247802734375, -0.3396568298339844, 0.709747314453125, 1.3556671142578125, -0.3807830810546875, 0.15545654296875, -0.592193603515625, 0.1775360107421875, 1.0419921875, 0.87054443359375, 0.13140869140625, 0.0716552734375, -0.63458251953125, -0.2320384979248047, -0.14064979553222656, -3.44091796875, 2.218017578125, 0.48052978515625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000007.npy"}
{"epoch": 0.014659685863874346, "step": 8, "batch_size": 128, "mean": 0.015543490648269653, "std": 1.0598371028900146, "min": -3.647064208984375, "p10": -1.2311645507812499, "median": 0.04134941101074219, "p90": 1.1908935546875, "max": 3.6197509765625, "pos_frac": 0.546875, "sample": [-0.19757080078125, 0.09163665771484375, -0.008880615234375, -0.34429931640625, -0.13677978515625, -0.1588134765625, 0.95843505859375, 0.1691913604736328, -0.24256515502929688, -1.64495849609375, 0.12473297119140625, -1.3182373046875, 3.6197509765625, 1.47314453125, -0.03961181640625, 0.0286102294921875, -0.5353851318359375, -0.40380859375, 0.0067272186279296875, 0.9283447265625, 1.23712158203125, 0.7666015625, -1.5533447265625, 0.82769775390625, -0.678924560546875, 0.5670623779296875, -0.344268798828125, -0.89422607421875, -2.052886962890625, -3.4652099609375, -1.5493011474609375, 1.2386474609375, 0.183013916015625, -1.8070068359375, 1.4970703125, -1.179534912109375, 0.05225372314453125, -0.6058502197265625, 0.1473388671875, 0.1827545166015625, -2.08648681640625, -0.64947509765625, 1.17108154296875, 0.4807281494140625, 1.42254638671875, 0.174163818359375, 0.09475135803222656, -0.1681365966796875, 0.164215087890625, 0.4984130859375, -1.66473388671875, -0.608551025390625, 0.570831298828125, -0.0571441650390625, -0.568267822265625, -0.575714111328125, -0.4041748046875, 0.618377685546875, 0.100250244140625, -3.647064208984375, 1.458709716796875, -0.46923828125, -0.71929931640625, 1.1041259765625, 0.3036346435546875, 2.2513427734375, -0.5699310302734375, 0.08695220947265625, 0.4400634765625, 0.61480712890625, -0.202728271484375, 0.9310302734375, -0.2205047607421875, -0.7184066772460938, 1.07470703125, -1.74188232421875, 2.5518798828125, 0.6243476867675781, 0.18575668334960938, -0.38629150390625, -0.09115219116210938, 0.2374267578125, 0.897552490234375, 0.328887939453125, -0.2468414306640625, -0.16748619079589844, -2.1080780029296875, -0.4757843017578125, 0.105560302734375, 
2.75592041015625, 0.0675048828125, -2.5657958984375, 0.517333984375, -0.04058837890625, -0.20825958251953125, 0.6816940307617188, 0.60137939453125, 1.13037109375, 0.3304557800292969, -0.23126220703125, 0.1316986083984375, 1.60333251953125, 0.38829803466796875, 1.0009307861328125, -0.7921943664550781, 0.3822154998779297, -0.58221435546875, -0.0659942626953125, 1.155029296875, -1.19384765625, 0.035610198974609375, 1.37286376953125, 0.22406005859375, 0.28662109375, 0.17889404296875, 0.047088623046875, -0.626708984375, -0.542449951171875, -5.340576171875e-05, 0.0067901611328125, -0.7666778564453125, -0.6028900146484375, 0.601470947265625, -0.4424591064453125, 0.5635223388671875, 0.00853729248046875, 1.672698974609375, 0.023199081420898438], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000008.npy"}
|
||||
{"epoch": 0.016753926701570682, "step": 9, "batch_size": 128, "mean": -0.07542148232460022, "std": 0.8794937133789062, "min": -2.6163330078125, "p10": -1.192206573486328, "median": -0.0027923583984375, "p90": 1.0741851806640625, "max": 2.4525146484375, "pos_frac": 0.4765625, "sample": [0.2435302734375, -0.95489501953125, 0.4862518310546875, 0.6054840087890625, -0.841705322265625, -0.077850341796875, -0.26568603515625, -0.14599990844726562, 0.334075927734375, 0.292236328125, 0.25299072265625, -1.87713623046875, 0.1378173828125, -0.30450439453125, -2.14501953125, -0.45068359375, 0.869384765625, 0.08758544921875, 1.6279830932617188, -0.5518341064453125, 0.47808074951171875, 1.35186767578125, -0.9357070922851562, -0.07855224609375, 0.458160400390625, -0.4610137939453125, -1.0523681640625, -0.5319976806640625, 0.3683319091796875, -1.75909423828125, -1.230621337890625, -1.04595947265625, 0.144744873046875, -0.4771728515625, 0.51800537109375, -0.27742767333984375, 1.11981201171875, 0.12935638427734375, -0.0521240234375, 0.390289306640625, -0.85443115234375, -0.2704925537109375, 0.237152099609375, -0.3526458740234375, 1.20819091796875, 0.10662841796875, -1.181396484375, -1.449737548828125, -1.788330078125, -0.657562255859375, -0.204681396484375, -0.116668701171875, -1.67608642578125, 0.089996337890625, -0.434906005859375, 0.26680755615234375, -0.03558349609375, -0.050994873046875, -0.76507568359375, -0.911346435546875, -0.014617919921875, -0.9933853149414062, -0.62353515625, -0.27823448181152344, 0.30141448974609375, 0.5618896484375, 0.08574676513671875, 1.52166748046875, 0.02911376953125, 0.9400634765625, -0.237060546875, 0.011699676513671875, -2.6163330078125, 0.13365554809570312, -0.051544189453125, 0.22409820556640625, -0.251708984375, 0.73870849609375, 0.31250762939453125, 0.47027587890625, 1.088775634765625, 1.5306396484375, -0.80487060546875, -0.600494384765625, -0.0116119384765625, -0.66015625, 0.0, -0.2639312744140625, 1.13018798828125, -1.599853515625, 
0.1981201171875, 0.08966445922851562, 1.306671142578125, 0.10259246826171875, 1.45123291015625, 0.4762382507324219, 0.422637939453125, -1.73162841796875, -0.2780609130859375, 1.03912353515625, -0.005584716796875, 0.0, 1.1514892578125, 1.06793212890625, 0.86553955078125, 0.1584911346435547, 0.0, 0.91912841796875, 0.01593780517578125, -0.575714111328125, 0.770751953125, 0.5003738403320312, -1.03253173828125, -1.454071044921875, -1.127288818359375, -0.2288360595703125, -0.6530914306640625, 0.697540283203125, -1.387786865234375, 0.06211090087890625, 0.00946044921875, 2.4525146484375, -1.1553726196289062, -0.29583740234375, -1.2174301147460938, -0.61224365234375, 2.433441162109375, 0.299957275390625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000009.npy"}
|
||||
{"epoch": 0.018848167539267015, "step": 10, "batch_size": 128, "mean": 0.030749976634979248, "std": 0.8194061517715454, "min": -1.9461669921875, "p10": -0.9149505615234375, "median": -0.014186859130859375, "p90": 1.0807281494140621, "max": 2.56787109375, "pos_frac": 0.46875, "sample": [-0.601959228515625, 0.079986572265625, 0.023651123046875, -0.114532470703125, 0.34229278564453125, -0.232696533203125, -0.03435516357421875, -0.569976806640625, -0.00405120849609375, 0.14850616455078125, 0.91351318359375, -0.3385906219482422, -0.39220428466796875, 1.35540771484375, 2.000213623046875, -0.254180908203125, 0.0402069091796875, -1.741485595703125, -0.59722900390625, 1.258392333984375, -0.38433837890625, -0.5450973510742188, -0.0390625, 0.488525390625, 0.374908447265625, 0.24718475341796875, 1.162353515625, -0.1903839111328125, -0.305023193359375, -0.4617462158203125, -0.587005615234375, -0.396026611328125, 0.23469161987304688, -0.0436859130859375, 0.059177398681640625, -0.002471923828125, -0.518035888671875, -0.056243896484375, 0.694610595703125, 0.837799072265625, -0.266632080078125, 0.5712165832519531, -0.2372455596923828, -1.186431884765625, -0.21221923828125, 0.8514556884765625, -0.4218864440917969, -0.03202056884765625, 0.5830078125, 2.158050537109375, 0.01369476318359375, -0.30319976806640625, 1.52862548828125, -1.7386932373046875, 0.118988037109375, 0.0970306396484375, -1.543792724609375, 0.900421142578125, -0.1861591339111328, 0.355255126953125, -0.7135467529296875, -0.7459869384765625, 1.045745849609375, 0.12807846069335938, 0.008819580078125, -0.12347412109375, 0.3981781005859375, -0.35845947265625, 0.0948333740234375, 0.021331787109375, -0.55242919921875, 0.0447998046875, 1.9141387939453125, 0.2351837158203125, -0.46443939208984375, 0.41007232666015625, 0.2176532745361328, 0.601776123046875, -0.953125, 2.3466796875, 0.2671051025390625, 1.401611328125, 0.49066162109375, 0.4266357421875, 0.3280200958251953, -0.898590087890625, 1.30560302734375, 
-0.2849884033203125, 2.56787109375, -1.108551025390625, -0.094390869140625, -0.08098411560058594, 0.2814445495605469, -1.20947265625, -1.034942626953125, -0.44488525390625, 0.7174224853515625, -0.396636962890625, 0.3710174560546875, 0.677978515625, -0.0135498046875, -0.7299880981445312, 0.16546630859375, -0.114013671875, -0.30035400390625, -0.46368408203125, 0.6899948120117188, -1.1712646484375, 0.8733749389648438, -0.462188720703125, -0.74462890625, -0.2568550109863281, 0.371307373046875, 0.0, 0.0673980712890625, -1.9461669921875, -0.17917251586914062, -1.86761474609375, -0.6799888610839844, 0.81103515625, 1.88934326171875, -0.01482391357421875, 0.2867431640625, -0.552093505859375, -0.506683349609375, -0.995269775390625, -1.21343994140625, 1.254852294921875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000010.npy"}
|
||||
{"epoch": 0.020942408376963352, "step": 11, "batch_size": 128, "mean": 0.06434953212738037, "std": 1.1054400205612183, "min": -3.45330810546875, "p10": -1.1864685058593747, "median": 0.0326995849609375, "p90": 1.2553039550781246, "max": 4.155517578125, "pos_frac": 0.5078125, "sample": [-0.89141845703125, -3.45330810546875, 0.8856201171875, 1.106781005859375, 0.7432861328125, -0.11420822143554688, -0.67681884765625, -0.6153564453125, -0.3240966796875, 0.3120574951171875, 0.6227188110351562, 1.0326004028320312, 1.1124267578125, 1.17620849609375, 0.4539947509765625, -0.6002426147460938, -1.1173095703125, -0.08439064025878906, -0.69378662109375, -0.4912261962890625, -0.2518768310546875, -2.2000732421875, 1.77020263671875, 4.155517578125, 0.06829833984375, 1.2244720458984375, -0.1393585205078125, 0.159149169921875, 0.7639923095703125, -0.395782470703125, -0.294097900390625, 1.322509765625, 0.3736114501953125, -0.2122802734375, 0.6424407958984375, 1.4412689208984375, -0.839080810546875, 0.15438079833984375, 0.99896240234375, 0.12090682983398438, -0.408111572265625, -2.011627197265625, -2.109527587890625, -1.34783935546875, 0.4940032958984375, 0.87493896484375, 0.16845703125, 2.1359100341796875, 3.5616455078125, 0.038330078125, 0.029083251953125, -0.515869140625, 0.5886154174804688, -2.528411865234375, -0.292205810546875, 1.3935546875, -0.122283935546875, 1.22650146484375, 0.356475830078125, 0.7530670166015625, 0.2174072265625, 0.8426971435546875, 0.068084716796875, -0.15704345703125, -0.042510986328125, -0.06264305114746094, 0.3061676025390625, -0.16384506225585938, 0.07640838623046875, -0.3424491882324219, -0.354522705078125, 1.1127395629882812, 0.061492919921875, -1.4357757568359375, 0.3310050964355469, 0.561981201171875, -0.16800689697265625, -0.23955535888671875, -0.078948974609375, 2.81707763671875, 0.6184196472167969, -1.74761962890625, -0.46492767333984375, 0.09521484375, -0.9104766845703125, 0.384246826171875, -0.522308349609375, -0.5694580078125, -0.0283203125, 
1.0870361328125, -0.7794189453125, -0.5269622802734375, -1.388824462890625, 0.2646827697753906, -1.086273193359375, 1.54681396484375, 0.2220458984375, -1.704559326171875, -0.5557289123535156, -0.195098876953125, -0.8943328857421875, -0.327667236328125, 2.156280517578125, -0.6797027587890625, -0.17193603515625, 0.958404541015625, 0.05010986328125, -0.45458984375, 0.159210205078125, -0.042327880859375, 2.1189727783203125, -0.63824462890625, -1.76416015625, -0.6704254150390625, -0.39263916015625, -1.776519775390625, -1.0565032958984375, 0.611968994140625, -1.6499786376953125, 0.6945648193359375, 0.67401123046875, 1.7091522216796875, 0.5339202880859375, 0.4533538818359375, 0.03631591796875, 0.4185028076171875, -0.246124267578125, 1.8074798583984375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000011.npy"}
|
||||
{"epoch": 0.023036649214659685, "step": 12, "batch_size": 128, "mean": -0.15084238350391388, "std": 0.979263186454773, "min": -3.81494140625, "p10": -1.26202392578125, "median": -0.1440105438232422, "p90": 1.0219093322753907, "max": 2.98175048828125, "pos_frac": 0.421875, "sample": [-1.619659423828125, 0.13616943359375, -0.5543212890625, -0.557586669921875, -0.47554969787597656, -0.74603271484375, -0.4286651611328125, 0.0433502197265625, -1.6251220703125, -0.2482452392578125, -0.878692626953125, 1.440460205078125, -0.761627197265625, 0.09642791748046875, -1.05120849609375, 0.2997894287109375, 0.49444580078125, -0.4933929443359375, 0.02716064453125, 0.364471435546875, -2.624114990234375, 1.5465087890625, 2.729095458984375, 0.190765380859375, 0.7222900390625, 0.1717681884765625, -0.9238300323486328, -0.13525009155273438, -0.273193359375, -0.4014434814453125, -0.233123779296875, 0.8939666748046875, -0.33788299560546875, -1.660003662109375, -0.74127197265625, -0.094635009765625, 0.09882354736328125, -1.5779266357421875, -1.162933349609375, 0.7699356079101562, -0.2375335693359375, -2.67095947265625, -0.727508544921875, 1.2515869140625, -0.15277099609375, -0.2691993713378906, -1.39019775390625, 0.877593994140625, -1.127349853515625, -0.309234619140625, 1.7414703369140625, 0.1601104736328125, 0.7006072998046875, -0.65899658203125, -0.01840972900390625, 0.108123779296875, 0.27667236328125, -1.651641845703125, -0.6074066162109375, -0.387115478515625, -0.072662353515625, 0.555816650390625, -0.629852294921875, -0.5108489990234375, -0.4540519714355469, 1.0772705078125, -0.22179412841796875, 1.14727783203125, -0.7239532470703125, 0.10443115234375, -0.41072845458984375, -1.20794677734375, 0.6015701293945312, -0.1009521484375, -0.36258697509765625, 0.301422119140625, 1.0207443237304688, -0.1555023193359375, -0.4468841552734375, -0.45941162109375, 0.816986083984375, 0.028045654296875, -1.247039794921875, 0.9537353515625, 0.026599884033203125, 0.361907958984375, 
0.10580253601074219, 0.258453369140625, -2.182373046875, -0.9132080078125, 1.21771240234375, 0.55535888671875, -0.42929840087890625, 0.420806884765625, -1.3857421875, -0.12408447265625, -0.372314453125, -0.0381927490234375, -0.3851776123046875, 1.477294921875, 0.166900634765625, 0.067169189453125, 0.5078125, -0.305145263671875, -1.27911376953125, -0.30859375, -0.07470703125, 1.35772705078125, 0.8367767333984375, 0.4569091796875, 2.98175048828125, 0.5087890625, -0.71746826171875, -0.790863037109375, 2.08953857421875, 0.06634521484375, -0.7683219909667969, -3.81494140625, -1.81085205078125, -0.58258056640625, -0.11749267578125, -0.043701171875, 1.024627685546875, -1.25469970703125, 0.198638916015625, -0.707977294921875, -0.7642974853515625, 0.245758056640625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000012.npy"}
|
||||
{"epoch": 0.025130890052356022, "step": 13, "batch_size": 128, "mean": -0.011737614870071411, "std": 0.9619606733322144, "min": -2.675628662109375, "p10": -1.0780540466308595, "median": -0.013570785522460938, "p90": 0.944317626953125, "max": 3.9833984375, "pos_frac": 0.484375, "sample": [-0.5923233032226562, -0.14563941955566406, -1.0870742797851562, 0.69610595703125, 0.94342041015625, 0.949737548828125, 1.09613037109375, -1.632537841796875, -0.0832061767578125, 0.34462738037109375, 0.5106048583984375, -1.074188232421875, -0.27557373046875, -0.035205841064453125, 0.2263660430908203, 0.3878002166748047, -0.587677001953125, 3.9833984375, 0.89068603515625, -0.12590789794921875, -0.2101593017578125, -0.94573974609375, 0.9464111328125, 0.15174102783203125, 0.6798248291015625, -0.5193195343017578, -0.4038410186767578, -0.376922607421875, 0.1493682861328125, -0.775787353515625, 0.761566162109375, -0.5392913818359375, 0.420806884765625, 0.2775726318359375, -0.2594146728515625, 0.35694122314453125, 0.18697357177734375, -0.6602020263671875, -0.31894683837890625, 0.5082550048828125, 0.24350357055664062, 0.44500732421875, 0.6304931640625, -1.02398681640625, -0.19024658203125, 0.20050048828125, -2.675628662109375, -1.02813720703125, 0.84527587890625, -1.67938232421875, -0.6275177001953125, -1.6123046875, -0.225921630859375, 0.2126312255859375, 0.57354736328125, 0.0, -2.353271484375, 0.888702392578125, 0.1979217529296875, 0.6094284057617188, -0.02661895751953125, -2.4275665283203125, -1.3846359252929688, 0.03968620300292969, 2.90521240234375, -0.86785888671875, -0.535797119140625, 0.17529296875, 1.27264404296875, -0.21651649475097656, 0.949951171875, -0.3618621826171875, 0.48766326904296875, 0.880157470703125, -2.50994873046875, 2.479278564453125, -0.11967277526855469, 1.33148193359375, -0.03986358642578125, 0.4487762451171875, -0.012294769287109375, 0.7310256958007812, 0.1352081298828125, -0.2813720703125, 0.25948333740234375, 0.55303955078125, 0.01981353759765625, 
0.1446075439453125, 0.1949005126953125, -0.422698974609375, 1.17510986328125, -0.5399017333984375, 0.8436050415039062, -0.314453125, 0.07398223876953125, -0.0148468017578125, -0.15496444702148438, 0.03936767578125, 0.696441650390625, -0.72021484375, -0.925933837890625, -0.6945037841796875, -0.3832855224609375, -0.442474365234375, -0.687896728515625, 1.494384765625, 2.534393310546875, -0.7110595703125, 0.6439361572265625, -1.190521240234375, 0.3132362365722656, -0.1776123046875, -1.1334152221679688, 0.29611968994140625, -0.09619140625, -1.575103759765625, -0.72625732421875, 0.8822021484375, 1.0755157470703125, -0.2880859375, -0.6026611328125, 0.2242279052734375, -0.086151123046875, 0.4776153564453125, -1.20355224609375, 0.2969322204589844, -0.5472259521484375, -0.4046821594238281], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000013.npy"}
|
||||
{"epoch": 0.027225130890052355, "step": 14, "batch_size": 128, "mean": 0.041386738419532776, "std": 0.8340306878089905, "min": -2.497589111328125, "p10": -0.9374816894531249, "median": 0.08594703674316406, "p90": 1.0195938110351563, "max": 2.4493408203125, "pos_frac": 0.53125, "sample": [1.024688720703125, 0.21039581298828125, -0.7507400512695312, 0.2730712890625, 0.1210174560546875, -0.02276611328125, -0.3513031005859375, 0.298797607421875, 0.09392356872558594, 0.9979095458984375, 0.0462493896484375, -0.1116180419921875, -2.00579833984375, -2.144073486328125, 0.486297607421875, -0.0308837890625, -1.49432373046875, -0.295135498046875, 0.07799148559570312, 0.3198699951171875, -0.3335685729980469, -0.021024703979492188, 0.12347412109375, -0.34844970703125, 1.0784912109375, 0.1806640625, -0.5974578857421875, 0.04180145263671875, -1.121063232421875, -1.08282470703125, 1.878143310546875, -1.87158203125, -0.9342041015625, 0.6811370849609375, 0.34527587890625, -1.68634033203125, -1.697906494140625, -0.3949871063232422, 1.1434326171875, -0.647491455078125, 0.42862510681152344, 0.1549072265625, 0.13259124755859375, 0.99090576171875, 0.0, 0.39720916748046875, -0.3965301513671875, 0.701873779296875, -0.5018234252929688, -0.5761947631835938, 0.7987518310546875, 0.7979278564453125, 0.21480178833007812, 1.6099853515625, 0.3146514892578125, -0.84222412109375, 1.0002593994140625, 1.066314697265625, -0.423614501953125, 0.95263671875, -0.94512939453125, -2.497589111328125, 0.994873046875, -0.0576171875, 0.5826568603515625, 1.4000244140625, -0.770843505859375, -0.33795166015625, 0.10296630859375, 1.253570556640625, -0.15604019165039062, -0.7368316650390625, -0.6870994567871094, 1.336761474609375, 0.6597900390625, 0.27133941650390625, -0.32764434814453125, 0.920379638671875, 0.171630859375, -0.872772216796875, -0.0443115234375, -0.17388916015625, -0.07530975341796875, -0.3267974853515625, -0.75238037109375, -0.2093658447265625, 0.2181873321533203, -0.4842529296875, 
0.42954254150390625, -0.29467201232910156, 0.4892425537109375, 0.1459197998046875, 1.537353515625, -0.4483489990234375, 0.51800537109375, 0.386077880859375, -0.4389171600341797, -0.284759521484375, 0.945556640625, -0.1197662353515625, -0.58642578125, -0.9895782470703125, -0.20318603515625, 0.781494140625, 0.093902587890625, 0.2576904296875, -0.1241455078125, 0.697998046875, 0.571746826171875, -0.2014007568359375, 0.8899688720703125, -0.123626708984375, -1.732452392578125, -0.36917877197265625, 0.13665008544921875, -1.1800537109375, -0.543609619140625, 0.551910400390625, 0.8821258544921875, 0.5079994201660156, 2.4493408203125, 1.0462799072265625, -0.1629638671875, 0.109405517578125, 0.047149658203125, 1.5560302734375, 0.2952880859375, 1.0174102783203125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000014.npy"}
|
||||
{"epoch": 0.02931937172774869, "step": 15, "batch_size": 128, "mean": -0.011841341853141785, "std": 0.8913605809211731, "min": -3.14111328125, "p10": -1.0374931335449218, "median": -0.04686927795410156, "p90": 0.859796142578125, "max": 3.58935546875, "pos_frac": 0.484375, "sample": [0.4495086669921875, 0.4893951416015625, 0.1891632080078125, -0.06976318359375, 0.36859130859375, 0.5228805541992188, 0.33355712890625, -0.814544677734375, -1.0848846435546875, 0.1182861328125, 0.3044548034667969, 0.128875732421875, -0.2749176025390625, -0.19111251831054688, 0.3322601318359375, -1.16943359375, -1.0268020629882812, -0.13116836547851562, -0.12583160400390625, -0.1137542724609375, 0.51177978515625, -0.48738861083984375, -0.247833251953125, 0.69915771484375, -0.391632080078125, 1.1808319091796875, -1.38555908203125, -0.45709228515625, 1.726654052734375, -0.10384368896484375, 0.938385009765625, -0.519805908203125, -0.401641845703125, -0.36077880859375, 0.3997802734375, -0.10493087768554688, 1.63336181640625, -1.71136474609375, 0.478668212890625, 0.37891387939453125, -0.35482025146484375, -1.45458984375, 0.1022186279296875, -0.12381744384765625, -0.3355560302734375, -0.479705810546875, 0.06303787231445312, 0.0493621826171875, -0.282196044921875, 0.576416015625, 0.0705108642578125, 0.7939605712890625, 0.568389892578125, -0.263092041015625, -0.8367919921875, -1.25433349609375, 0.730438232421875, -0.16782379150390625, 0.1952075958251953, 0.644287109375, -0.40245819091796875, 0.073760986328125, -2.15045166015625, 0.010251998901367188, 0.400665283203125, -0.5848846435546875, 1.199432373046875, 0.2270965576171875, 1.509552001953125, -0.8851318359375, -0.11879348754882812, -0.55438232421875, 0.58514404296875, -1.06243896484375, -1.155426025390625, 0.8928375244140625, 0.5559844970703125, 0.30072021484375, -0.178192138671875, 0.4367523193359375, -0.460174560546875, -0.06848526000976562, 0.5535888671875, 0.85479736328125, -2.526123046875, 0.437042236328125, -0.709716796875, 
-0.3870849609375, 0.6149063110351562, 1.22235107421875, -0.19030380249023438, -0.7757568359375, -0.2238311767578125, -0.8771209716796875, 1.48004150390625, 3.30145263671875, 0.2623119354248047, 0.031951904296875, 0.193603515625, -1.65869140625, 0.3348388671875, 3.58935546875, -0.896820068359375, 0.24776458740234375, 0.221038818359375, 0.964599609375, -3.14111328125, -0.2265777587890625, -0.2339019775390625, -0.07925796508789062, 0.02088165283203125, 0.05770111083984375, -0.07456588745117188, -0.0712738037109375, 0.74542236328125, -0.01633453369140625, -1.01605224609375, 0.6216201782226562, 0.2583160400390625, -0.255767822265625, 0.567901611328125, -0.032802581787109375, -0.1565399169921875, -1.767974853515625, -0.214263916015625, 0.8714599609375, -0.06093597412109375, -0.19873046875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000015.npy"}
|
||||
{"epoch": 0.031413612565445025, "step": 16, "batch_size": 128, "mean": 0.13133111596107483, "std": 1.0349777936935425, "min": -2.0855712890625, "p10": -0.9542922973632812, "median": -0.001888275146484375, "p90": 1.2787658691406247, "max": 4.7010498046875, "pos_frac": 0.4921875, "sample": [-0.115447998046875, 0.35466766357421875, -0.00377655029296875, 0.3580322265625, -0.3526458740234375, -0.0648956298828125, 0.696624755859375, 1.437774658203125, -0.994384765625, 0.01513671875, 0.3861961364746094, -0.02997589111328125, -0.18048858642578125, -0.833251953125, 3.36566162109375, -0.5575485229492188, -0.11572265625, 0.1593017578125, 0.365478515625, -0.305633544921875, 0.6275482177734375, -0.34722900390625, -0.2892303466796875, 1.91204833984375, 1.94342041015625, -0.17439651489257812, 1.76483154296875, 2.24359130859375, -1.675628662109375, -0.7328338623046875, 0.332855224609375, 0.94244384765625, -0.915374755859375, -1.449737548828125, -1.293914794921875, -0.122406005859375, 1.612060546875, 0.40899658203125, -0.6037178039550781, -0.5299530029296875, -0.0057373046875, 0.8946075439453125, 0.1764068603515625, 0.48223876953125, -0.2054290771484375, -0.11162948608398438, 0.75872802734375, 0.0126495361328125, 1.002960205078125, 0.4149818420410156, 0.6596908569335938, 0.6666259765625, -0.34554290771484375, -1.625640869140625, -0.30889892578125, -0.26275634765625, 0.45849609375, -0.42535400390625, -0.0494384765625, 0.6092529296875, 0.0543212890625, 0.050930023193359375, 0.9300689697265625, -0.9511566162109375, -1.2723846435546875, -0.52581787109375, -0.12955093383789062, 1.367889404296875, -0.20458984375, -1.06329345703125, -0.07238006591796875, 0.3311767578125, -0.8515396118164062, 0.65478515625, 0.162322998046875, -0.3679351806640625, -0.877349853515625, 1.240570068359375, 0.3637809753417969, -0.209991455078125, -0.34931182861328125, -0.39171600341796875, -0.584320068359375, -0.458038330078125, -1.060394287109375, 0.71466064453125, 0.0, 0.49079132080078125, 0.483428955078125, 
0.89251708984375, -1.5762939453125, -1.05450439453125, 0.1179656982421875, 0.40081787109375, 1.1502685546875, 0.75982666015625, 0.2530250549316406, -0.7085723876953125, -0.2133636474609375, 0.3654632568359375, 0.807342529296875, 4.119903564453125, 1.090301513671875, -0.390350341796875, -0.36490631103515625, 0.4009857177734375, -0.96160888671875, -0.8930511474609375, -0.7657470703125, 1.3837890625, 0.7150802612304688, -0.37108612060546875, -2.0855712890625, 0.079833984375, -1.904022216796875, 1.379974365234375, 0.8133544921875, 0.0018768310546875, -0.178955078125, 0.1868743896484375, -0.7001724243164062, 1.08477783203125, 2.48504638671875, 0.0133056640625, 4.7010498046875, -0.37353515625, -0.60614013671875, -0.7547607421875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000016.npy"}
|
||||
{"epoch": 0.033507853403141365, "step": 17, "batch_size": 128, "mean": 0.10838770866394043, "std": 0.7019661068916321, "min": -1.7891845703125, "p10": -0.7583511352539062, "median": 0.11806488037109375, "p90": 1.0451385498046875, "max": 1.885498046875, "pos_frac": 0.5625, "sample": [-0.43616485595703125, -0.9479217529296875, -0.82568359375, 0.105224609375, -0.580902099609375, 0.74066162109375, 0.264068603515625, -0.27081298828125, 0.79620361328125, 0.54986572265625, 0.10480499267578125, -0.04547119140625, -0.746490478515625, 1.885498046875, -0.54351806640625, 0.7251815795898438, 1.2345123291015625, -0.27001953125, 0.312713623046875, 0.36529541015625, -1.092041015625, 1.1653594970703125, 1.08245849609375, -0.39571380615234375, 0.21331024169921875, -0.013885498046875, 0.656524658203125, 0.36376953125, -0.34271240234375, -0.639404296875, -0.4165191650390625, 0.00970458984375, 0.2971916198730469, 0.13330078125, -0.971710205078125, 0.5997200012207031, 0.956298828125, -0.97552490234375, 0.4803581237792969, -0.008625030517578125, 1.0205078125, -0.7860260009765625, -0.5014801025390625, -0.6872596740722656, -0.8464202880859375, 0.5649185180664062, 1.6959457397460938, -0.04279136657714844, 0.62506103515625, 0.8926544189453125, -0.041778564453125, -0.3874053955078125, -1.27728271484375, 0.1280975341796875, 0.2129364013671875, 0.4376373291015625, -0.15724945068359375, -0.1650390625, 0.9464645385742188, 0.27325439453125, 0.1961517333984375, 0.0340576171875, 0.32061004638671875, 0.2235565185546875, -0.850830078125, -0.3550567626953125, -0.9241943359375, 0.2919731140136719, 0.48979949951171875, -0.09264373779296875, -1.7891845703125, 0.2234344482421875, -0.59552001953125, -1.251495361328125, 0.21893310546875, 0.1080322265625, 1.334228515625, 0.27371978759765625, 0.6771392822265625, -0.05929374694824219, 1.029144287109375, -0.601104736328125, -0.4939727783203125, 0.315216064453125, -0.1195068359375, -0.108306884765625, 1.35052490234375, 1.2913055419921875, -1.51470947265625, 
0.12890625, 0.07264328002929688, 0.536651611328125, 0.202850341796875, 1.389404296875, 0.7962646484375, -0.6511383056640625, -0.4535675048828125, 0.099639892578125, 0.6265335083007812, -0.4053955078125, -0.4986915588378906, -0.40435791015625, 0.86907958984375, 1.692413330078125, -0.5280647277832031, 0.257965087890625, -0.15118026733398438, 0.2538909912109375, 0.554718017578125, 0.53271484375, -0.3142356872558594, -0.37175750732421875, -0.1021728515625, 0.09113311767578125, -0.242706298828125, -0.62139892578125, 0.5345611572265625, 1.41693115234375, -0.31976318359375, 0.96942138671875, 0.21283340454101562, 0.29998016357421875, -0.28985595703125, 0.5073661804199219, 0.1595458984375, 1.3600921630859375, -0.6178436279296875, 1.23455810546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000017.npy"}
|
||||
{"epoch": 0.0356020942408377, "step": 18, "batch_size": 128, "mean": -0.09466424584388733, "std": 0.9889122843742371, "min": -3.216461181640625, "p10": -1.2217987060546873, "median": -0.05652046203613281, "p90": 1.159130859375, "max": 2.063690185546875, "pos_frac": 0.46875, "sample": [-0.3220062255859375, -1.12335205078125, 0.017995834350585938, 0.941864013671875, 0.35604095458984375, -2.073883056640625, -0.051616668701171875, -0.3637847900390625, -0.06142425537109375, 0.1424560546875, 0.2198486328125, 0.3819732666015625, 0.5216064453125, 1.22412109375, 0.73455810546875, 0.17824554443359375, 1.035400390625, -2.432159423828125, -0.4130401611328125, -0.12519073486328125, -0.480072021484375, 0.9149627685546875, 1.211944580078125, 0.35800933837890625, 1.155029296875, -0.08080863952636719, -0.21710205078125, 1.112640380859375, 0.4716644287109375, 0.18220901489257812, 0.117340087890625, -0.585479736328125, -0.10109710693359375, -0.6136932373046875, 0.1701812744140625, -0.5643310546875, 0.624908447265625, 0.40767860412597656, 0.079803466796875, 1.58941650390625, 0.02806854248046875, -1.447601318359375, -3.12286376953125, 1.602813720703125, 0.06166839599609375, 0.45742034912109375, 1.80352783203125, 0.100555419921875, 0.886993408203125, -0.285614013671875, -0.4235687255859375, -0.50299072265625, -0.146820068359375, -0.08527374267578125, -2.85919189453125, -1.534454345703125, -0.680419921875, 1.56744384765625, -0.8267822265625, 0.195587158203125, 1.969482421875, 0.3912200927734375, -1.55999755859375, 0.537689208984375, -2.00970458984375, -0.859344482421875, -1.14093017578125, 2.063690185546875, -0.141998291015625, -0.90728759765625, 0.3934612274169922, 0.6202430725097656, -0.679656982421875, 0.0345458984375, -0.0319976806640625, 0.08683013916015625, -1.03863525390625, -0.0450592041015625, 0.8529205322265625, -0.4281158447265625, -2.102996826171875, -0.2692413330078125, -0.994140625, 1.509521484375, 0.6626129150390625, 0.471954345703125, -0.03008270263671875, 
-0.72894287109375, -0.395904541015625, -0.7759933471679688, -0.5950927734375, 0.34638214111328125, 0.72369384765625, -0.36925506591796875, -2.6409912109375, -0.91082763671875, -0.26703643798828125, 0.52880859375, -0.19512939453125, -0.2676544189453125, -0.384765625, 0.14593505859375, -0.70465087890625, -0.10693359375, -0.07122802734375, -0.13037109375, 1.4388427734375, -1.410491943359375, -0.328704833984375, 0.4539947509765625, -3.216461181640625, 0.439971923828125, -0.4578857421875, 1.20806884765625, -0.322265625, 0.105865478515625, -0.503662109375, 0.8447265625, -0.28173828125, 0.1683349609375, 1.168701171875, -0.380828857421875, 0.3900146484375, -0.724609375, 0.773193359375, -0.709228515625, 1.1920318603515625, -1.849273681640625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000018.npy"}
|
||||
{"epoch": 0.03769633507853403, "step": 19, "batch_size": 128, "mean": -0.09179211407899857, "std": 0.844662070274353, "min": -2.6348876953125, "p10": -1.144598388671875, "median": -0.09458160400390625, "p90": 0.9132873535156248, "max": 2.21917724609375, "pos_frac": 0.421875, "sample": [-0.3011894226074219, 0.3609619140625, 1.1981964111328125, 2.21917724609375, -1.35797119140625, -0.6805419921875, -0.2052001953125, -0.737030029296875, -0.2488250732421875, -0.47821044921875, 0.025529861450195312, -0.1927661895751953, 1.4384613037109375, -0.1058349609375, -0.028995513916015625, -0.10377883911132812, -0.264923095703125, -1.092376708984375, 0.5529251098632812, 1.5894622802734375, 0.1005401611328125, -0.093414306640625, -0.2835254669189453, 0.95562744140625, -0.077545166015625, 0.1775054931640625, 0.2533149719238281, -0.03231048583984375, 1.4854736328125, 0.5650634765625, -0.899078369140625, 0.894287109375, -1.425048828125, 0.0855712890625, 0.163787841796875, -0.17464065551757812, -0.046539306640625, 0.05841064453125, 0.5600738525390625, 0.3620719909667969, 0.09583282470703125, 0.115020751953125, -1.14117431640625, 0.0845794677734375, 0.1593017578125, 2.0960693359375, -0.0052337646484375, 0.2598114013671875, -1.298553466796875, -0.0957489013671875, -0.3152732849121094, -1.186248779296875, 0.7630157470703125, -1.2490692138671875, -0.6998138427734375, -1.05999755859375, 1.17791748046875, -1.152587890625, -0.02797698974609375, -0.27429962158203125, -0.7077102661132812, -0.0645751953125, -0.441436767578125, -0.3596382141113281, -0.853271484375, -0.6245269775390625, -0.0650634765625, 0.8142547607421875, -0.7295303344726562, 0.4909515380859375, -0.092193603515625, -1.1256103515625, -0.2733306884765625, -0.313812255859375, 0.28403472900390625, -0.7857513427734375, 0.519866943359375, 0.2768707275390625, 0.5362091064453125, 0.68731689453125, -0.5430450439453125, -1.5184326171875, 0.16411781311035156, 0.58251953125, -0.9279327392578125, -0.284698486328125, -0.1197662353515625, 
-1.352294921875, -0.3554534912109375, -0.17706298828125, -0.53070068359375, -2.00335693359375, -0.8451080322265625, 0.5768280029296875, 1.50775146484375, 1.543731689453125, 1.5829620361328125, -0.2456817626953125, -0.22052383422851562, 0.05320262908935547, -0.5456466674804688, 0.601593017578125, -1.0636749267578125, 0.0661468505859375, -2.6348876953125, -1.979949951171875, 0.300933837890625, -1.420654296875, 0.113372802734375, -1.064361572265625, 0.60186767578125, -0.404327392578125, 0.38714599609375, -0.6663818359375, 0.0301513671875, -1.1391143798828125, -0.408294677734375, -0.4880218505859375, 2.137725830078125, 1.18658447265625, 0.497222900390625, 0.420196533203125, 0.8951416015625, -0.3030548095703125, -0.34954833984375, -1.2574462890625, 0.3562164306640625, -0.1407012939453125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000019.npy"}
{"epoch": 0.039790575916230364, "step": 20, "batch_size": 128, "mean": -0.006401553750038147, "std": 0.8496471643447876, "min": -3.89080810546875, "p10": -1.1241607666015625, "median": 0.0137939453125, "p90": 0.9493873596191402, "max": 2.278411865234375, "pos_frac": 0.515625, "sample": [0.107330322265625, 1.3409576416015625, -0.510589599609375, 1.2718505859375, 0.5292816162109375, 0.42254638671875, -0.023101806640625, -1.105804443359375, 0.11639404296875, -0.49428558349609375, -1.181427001953125, -0.4525146484375, 0.435546875, 1.337005615234375, 0.3609619140625, 0.5366058349609375, -0.286895751953125, 0.3519287109375, 0.5780410766601562, -0.5985794067382812, 0.4976654052734375, 1.2064208984375, 2.278411865234375, 0.001743316650390625, 0.22884559631347656, 0.40488433837890625, 0.106903076171875, 0.09844970703125, -0.29738616943359375, -1.288818359375, -1.1177978515625, -0.08271980285644531, -0.06706619262695312, -1.409271240234375, -0.2520294189453125, -2.14215087890625, -0.366302490234375, 1.14227294921875, -0.509918212890625, -0.8235931396484375, -0.18255615234375, -3.89080810546875, -0.144500732421875, 0.01995849609375, -0.0433197021484375, 0.723388671875, -0.18760299682617188, -0.1273956298828125, -0.1518096923828125, 0.37844085693359375, 0.6863250732421875, -0.26898956298828125, 0.0, 1.23974609375, 0.6798095703125, 0.5152816772460938, -0.13652610778808594, 1.42706298828125, 0.408050537109375, 0.2545013427734375, 0.37694549560546875, 0.1859130859375, -1.280731201171875, 0.761138916015625, -1.247802734375, -0.584197998046875, 0.7012176513671875, 0.28720855712890625, -0.96319580078125, 0.91009521484375, 0.8726806640625, -0.00838470458984375, -0.379180908203125, 0.2791175842285156, -1.139007568359375, -1.4698486328125, -0.055690765380859375, 1.32904052734375, 1.0410690307617188, 1.2278289794921875, 0.63323974609375, -0.1895904541015625, -0.20433807373046875, 0.33415985107421875, -0.16766357421875, -0.3141937255859375, -1.84210205078125, -0.679443359375, 
0.03131103515625, 0.23443603515625, -0.17713165283203125, 1.65234375, 0.884033203125, -1.359375, -0.64813232421875, -0.53594970703125, -1.0465087890625, -0.45668792724609375, 0.7301254272460938, -0.2541179656982422, 0.42144775390625, -0.06365966796875, 0.00762939453125, -0.10857391357421875, -0.56854248046875, -1.0415496826171875, 0.170867919921875, -0.874969482421875, 0.7231979370117188, -0.574920654296875, -0.306182861328125, -0.6076507568359375, 0.43011474609375, 0.434051513671875, -1.32098388671875, 0.33135223388671875, 1.95770263671875, 0.2069549560546875, -1.40380859375, 0.35737037658691406, 0.064300537109375, 0.7828521728515625, 0.4304962158203125, -0.5453033447265625, 0.3437538146972656, 0.0479583740234375, 0.187255859375, 0.687957763671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000020.npy"}
{"epoch": 0.041884816753926704, "step": 21, "batch_size": 128, "mean": -0.039278268814086914, "std": 0.945741593837738, "min": -2.30328369140625, "p10": -1.2604087829589843, "median": 0.0036163330078125, "p90": 1.2288909912109371, "max": 2.300689697265625, "pos_frac": 0.5, "sample": [1.131134033203125, 1.014617919921875, -0.486572265625, -0.639556884765625, -0.822967529296875, -0.58148193359375, -0.890625, -1.98638916015625, 0.020172119140625, -0.8094482421875, 1.6533355712890625, -0.62298583984375, -0.0104217529296875, -1.0349578857421875, -1.257659912109375, -0.9321975708007812, 0.47552490234375, 0.726226806640625, -0.984039306640625, 0.07390403747558594, 2.01971435546875, 0.5803585052490234, -1.66778564453125, 1.19866943359375, -0.039093017578125, -0.3126220703125, 0.929840087890625, 1.784423828125, 0.3541107177734375, -1.7851181030273438, 0.17768096923828125, 0.83880615234375, 1.32635498046875, -0.7826156616210938, 0.3377227783203125, 0.33492279052734375, 0.0980987548828125, -1.046966552734375, -1.386260986328125, 2.300689697265625, -1.0692138671875, 0.5327911376953125, 0.144134521484375, 0.3447723388671875, -1.2668228149414062, -0.158355712890625, -0.2357616424560547, 1.55938720703125, -0.464324951171875, 0.35870361328125, -1.2830352783203125, 0.9354248046875, 0.0, -0.6040191650390625, -0.002105712890625, 0.02202606201171875, 0.007232666015625, -0.318267822265625, -1.0661735534667969, 0.624267578125, -1.36419677734375, -0.3568115234375, 0.1792278289794922, 1.73736572265625, 0.458740234375, 1.491180419921875, 0.20745849609375, -1.027618408203125, 0.2757415771484375, 0.117828369140625, -0.0728759765625, 0.75341796875, -2.12811279296875, -0.9896240234375, -0.88189697265625, -0.46337890625, -2.09759521484375, 1.015899658203125, -0.23065185546875, 1.15087890625, -0.8694992065429688, -0.016933441162109375, -0.95770263671875, -1.4945068359375, 0.422454833984375, 0.193206787109375, 0.036060333251953125, -0.5634307861328125, 1.3466644287109375, -0.3193473815917969, 
-1.48028564453125, -0.35880279541015625, 1.3861083984375, 0.019891738891601562, 1.183990478515625, 0.5531768798828125, 0.949920654296875, -0.24298095703125, 0.29949951171875, 1.032928466796875, -0.7020263671875, -0.454010009765625, 1.6651611328125, 0.3497314453125, -2.30328369140625, -0.942169189453125, -0.2391204833984375, -0.167999267578125, 0.09885406494140625, 0.643035888671875, -0.0567474365234375, -0.00567626953125, 1.478729248046875, 0.98779296875, 0.6457138061523438, 0.426788330078125, 0.014312744140625, 1.299407958984375, -1.154052734375, -1.54254150390625, -0.30975341796875, -0.68463134765625, -0.2740039825439453, 0.115478515625, -1.1027069091796875, 0.48907470703125, 0.3242759704589844, 0.12015533447265625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000021.npy"}
{"epoch": 0.04397905759162304, "step": 22, "batch_size": 128, "mean": 0.10819865763187408, "std": 0.724793553352356, "min": -1.24847412109375, "p10": -0.7185653686523437, "median": 0.033669471740722656, "p90": 1.0360519409179687, "max": 2.4114990234375, "pos_frac": 0.5078125, "sample": [2.092193603515625, 0.7025184631347656, 0.2882080078125, -0.7078857421875, 1.706939697265625, -0.7434844970703125, -0.2981109619140625, -0.267578125, 0.216094970703125, 0.690093994140625, -0.039520263671875, -0.21486663818359375, -1.1465911865234375, 0.2967720031738281, -1.176910400390625, -1.155303955078125, 0.17120361328125, 0.303955078125, 0.060760498046875, -0.08636474609375, 0.28253173828125, -0.5656814575195312, -0.4852294921875, -0.18444061279296875, -0.149078369140625, -0.3465118408203125, -0.40081787109375, 0.238800048828125, 0.33875274658203125, -0.20740127563476562, 1.053131103515625, 0.69219970703125, 0.0, 1.7513427734375, -0.9370956420898438, -1.24847412109375, -0.3987579345703125, 0.3716583251953125, 0.9836959838867188, -0.49652099609375, -0.267730712890625, 0.13763427734375, 0.066070556640625, 2.122314453125, 0.12158203125, -0.1633758544921875, -0.0533447265625, -0.1687774658203125, 0.540283203125, 0.091217041015625, -0.2795143127441406, 0.3768882751464844, 0.14031982421875, 1.0287322998046875, -0.37636566162109375, -0.198211669921875, -0.14146041870117188, 0.84014892578125, -0.1009521484375, 2.4114990234375, 0.1566162109375, -0.805023193359375, 1.20489501953125, -0.41793060302734375, -0.12478256225585938, -0.2846260070800781, 0.58135986328125, -1.0472412109375, 0.261444091796875, -1.110015869140625, -0.26272010803222656, 0.457763671875, -0.34332275390625, -0.01434326171875, -0.98480224609375, 0.509796142578125, -0.2777099609375, -0.5689773559570312, 0.82537841796875, 2.074188232421875, -0.630035400390625, 0.20379638671875, 0.1422271728515625, -0.10324859619140625, -0.11004638671875, 0.343597412109375, 0.8663558959960938, -0.691162109375, -0.5306396484375, 
-1.125213623046875, 1.5799713134765625, -0.02288055419921875, -0.11129379272460938, 0.1486968994140625, -0.4697418212890625, 0.05738067626953125, -0.94921875, -0.616607666015625, 1.473297119140625, 1.375946044921875, -0.17218017578125, -0.4999256134033203, 0.32476806640625, 0.1627197265625, 0.22125244140625, 0.6635894775390625, 0.2585296630859375, 1.3876190185546875, -0.091583251953125, -0.151153564453125, 0.2389678955078125, 0.742401123046875, -0.039134979248046875, -0.40642547607421875, 0.12634658813476562, 1.359619140625, 0.69317626953125, 0.895050048828125, 0.51025390625, 0.07177734375, -0.096282958984375, 0.62158203125, 0.19127655029296875, -0.1279144287109375, 0.05113983154296875, -0.925689697265625, 0.04237556457519531, 0.02496337890625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000022.npy"}
{"epoch": 0.04607329842931937, "step": 23, "batch_size": 128, "mean": -0.007470622658729553, "std": 0.8977943062782288, "min": -3.0615234375, "p10": -1.16015625, "median": 0.033496856689453125, "p90": 0.9997390747070312, "max": 2.239990234375, "pos_frac": 0.5078125, "sample": [0.12229537963867188, -0.3786773681640625, 0.420074462890625, -0.2734222412109375, 1.661590576171875, 2.239990234375, 1.50054931640625, 0.9906158447265625, -0.411285400390625, -0.0130615234375, 1.8726806640625, -0.38397216796875, -0.799346923828125, -0.596832275390625, -1.322235107421875, 0.28363037109375, 0.04473876953125, 0.335205078125, -0.20338821411132812, -0.7691192626953125, 0.2564697265625, -0.5142822265625, -0.335601806640625, -0.4593505859375, 0.525726318359375, 0.60321044921875, -1.141845703125, 0.3051300048828125, -0.8825607299804688, -0.006256103515625, 0.0, 0.30558013916015625, 0.1292266845703125, 0.74969482421875, -1.50933837890625, -0.3397369384765625, 0.217315673828125, -0.894561767578125, 0.454803466796875, -0.43267822265625, -0.0926361083984375, -1.38885498046875, 0.89739990234375, -0.5604972839355469, 0.452789306640625, -1.0673828125, -0.8405914306640625, -0.8234710693359375, 0.51416015625, 0.86041259765625, 0.47515869140625, 0.1254100799560547, -1.2394866943359375, -0.68658447265625, 1.80364990234375, 1.021026611328125, -0.02860260009765625, 0.2773590087890625, 0.8439407348632812, -0.0153350830078125, -0.94390869140625, 0.36490631103515625, 0.2794189453125, 0.4236907958984375, 0.15484619140625, -0.47620391845703125, -0.68194580078125, 1.5321044921875, 0.8350372314453125, 0.422454833984375, 0.0612030029296875, 0.5062255859375, -1.121185302734375, -0.6910400390625, 1.10546875, -0.7725067138671875, 1.86260986328125, 0.4047889709472656, 1.552459716796875, 0.89654541015625, -0.7240982055664062, -0.45323944091796875, 0.5482254028320312, 0.6909942626953125, 0.959625244140625, 0.13470458984375, -1.32757568359375, 0.2166595458984375, -0.15545654296875, 0.5567626953125, 
-0.7982406616210938, -0.12223243713378906, -0.030853271484375, 0.3288917541503906, -0.6840591430664062, -1.760986328125, 0.574493408203125, 1.3433837890625, -1.3956680297851562, -2.007080078125, 0.895172119140625, 0.07428741455078125, 0.02225494384765625, -0.8134689331054688, 0.7076873779296875, -0.634521484375, -0.338470458984375, 0.4705047607421875, 0.6653900146484375, 1.99072265625, 0.5467491149902344, -3.0615234375, 0.19775009155273438, -1.250518798828125, -0.1524524688720703, -1.42852783203125, -0.072509765625, -0.4326324462890625, -0.7647705078125, 0.61767578125, 1.277984619140625, -1.8732223510742188, 0.2869873046875, -0.0697174072265625, 0.8039398193359375, -1.202880859375, -0.1344451904296875, -0.7717437744140625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000023.npy"}
{"epoch": 0.048167539267015703, "step": 24, "batch_size": 128, "mean": -0.02253343164920807, "std": 0.9836527109146118, "min": -2.9935302734375, "p10": -1.116424560546875, "median": 0.03941535949707031, "p90": 1.2675506591796875, "max": 1.927978515625, "pos_frac": 0.53125, "sample": [0.2520294189453125, -0.9014129638671875, -0.4357414245605469, 1.8377838134765625, 0.470703125, 0.11444091796875, 0.7052001953125, 1.8026123046875, 0.17322540283203125, 1.083709716796875, -0.07098388671875, 0.01592254638671875, 0.2257080078125, -0.17845916748046875, 0.124908447265625, -0.933349609375, 0.215484619140625, 0.087738037109375, 0.57427978515625, 0.67156982421875, -0.464599609375, -0.3751220703125, -0.28887939453125, 0.0, 0.227874755859375, 0.22254562377929688, -1.781524658203125, -0.331024169921875, 0.078857421875, -0.129058837890625, -0.29644775390625, 0.045139312744140625, -0.664794921875, 0.803955078125, 0.532684326171875, -0.403778076171875, -1.6683349609375, -0.1262359619140625, -0.34686279296875, -0.2619781494140625, -0.1056365966796875, -0.04586029052734375, 0.22552871704101562, -0.87103271484375, -0.674530029296875, -0.18279266357421875, 0.0, -0.21405029296875, -1.1640625, 0.25091552734375, 0.562286376953125, 0.48381805419921875, 0.09454345703125, 0.1815643310546875, 1.09637451171875, 0.6583633422851562, -1.733306884765625, 0.014049530029296875, -0.3575439453125, 1.74090576171875, -1.433135986328125, -0.09822845458984375, 1.6348876953125, 0.09600448608398438, 0.27191162109375, 0.6674575805664062, 0.00289154052734375, 0.22364044189453125, -0.4229278564453125, -0.7225341796875, 0.60186767578125, 1.31378173828125, -0.7148818969726562, -2.9935302734375, -2.1860504150390625, 1.2958984375, 0.9761962890625, -0.9339599609375, 1.60015869140625, -0.4546356201171875, 1.927978515625, 0.2086639404296875, -0.8319091796875, -0.3139495849609375, 0.3624114990234375, 0.29296875, -0.71099853515625, 0.13128662109375, -2.1297760009765625, 0.7976150512695312, 1.0277099609375, 
0.47686767578125, -0.792572021484375, -0.36142730712890625, 1.255401611328125, -0.36635780334472656, 0.03369140625, -0.4376220703125, -0.04046630859375, 1.691619873046875, 0.9210205078125, 0.735595703125, 1.43804931640625, 1.672607421875, -1.07489013671875, -2.73553466796875, 0.296875, -1.073760986328125, 1.095855712890625, -1.46514892578125, 0.4622802734375, 0.4641876220703125, 0.04998779296875, 0.884765625, -0.2617950439453125, 1.5865478515625, -2.068695068359375, -0.074951171875, -0.6170196533203125, -1.0530853271484375, 1.8900146484375, 0.524444580078125, 0.06034088134765625, -2.45526123046875, -1.09600830078125, -0.0446929931640625, -2.605499267578125, 0.1462249755859375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000024.npy"}
{"epoch": 0.050261780104712044, "step": 25, "batch_size": 128, "mean": 0.12705141305923462, "std": 0.7369652986526489, "min": -1.351837158203125, "p10": -0.8202316284179687, "median": 0.06372451782226562, "p90": 1.0768402099609373, "max": 2.59722900390625, "pos_frac": 0.53125, "sample": [0.3700408935546875, -0.57891845703125, 1.281951904296875, 0.57342529296875, -0.7249755859375, 0.586669921875, -0.39923095703125, 0.763763427734375, 2.51202392578125, -1.068756103515625, 0.4904022216796875, -0.707794189453125, -0.05889892578125, 1.870025634765625, 0.13031005859375, -1.166168212890625, -0.32483673095703125, 0.05957794189453125, -1.351837158203125, 0.696258544921875, -0.01326751708984375, -0.2010631561279297, -0.2644500732421875, 0.1910552978515625, -0.63385009765625, 0.2826385498046875, -0.7424163818359375, -0.4556884765625, -0.6608963012695312, 1.59808349609375, 0.4269828796386719, -0.3365020751953125, 0.0, -0.055156707763671875, 0.872955322265625, -0.017724990844726562, 1.2643661499023438, -0.1568756103515625, -1.159515380859375, -0.823638916015625, -0.06390380859375, 0.0, -0.09228515625, -0.6064453125, 0.7940673828125, -0.097625732421875, -0.07221221923828125, -0.373870849609375, 0.42999267578125, -1.0503387451171875, -0.45116424560546875, 0.001007080078125, 1.310089111328125, -0.42584991455078125, 1.095062255859375, -0.1910552978515625, 0.21048736572265625, -0.2890167236328125, 0.6182403564453125, 0.12744140625, 0.636505126953125, 0.14801025390625, -0.1181640625, -0.828338623046875, -0.4199371337890625, -0.022340774536132812, 1.2036590576171875, 0.3011474609375, -0.93603515625, 0.64239501953125, 0.4388427734375, -0.002777099609375, -0.367462158203125, 1.8580322265625, -0.54986572265625, 0.36968994140625, -0.5822677612304688, -0.013698577880859375, 0.2790374755859375, -0.859710693359375, -1.16583251953125, 0.392486572265625, 0.7739105224609375, 2.59722900390625, -0.68310546875, 0.008544921875, -0.07447242736816406, 0.13077926635742188, -0.27834320068359375, 
1.4500579833984375, 0.66595458984375, -0.11902618408203125, 0.3104400634765625, 1.951629638671875, 0.4388427734375, -0.93292236328125, 0.2411651611328125, 0.560638427734375, 1.06903076171875, 0.128448486328125, 0.17581939697265625, 1.0166015625, 0.3079833984375, 0.08436203002929688, 0.06787109375, 1.17523193359375, 0.458282470703125, 0.175628662109375, -0.8187713623046875, 0.28399658203125, -0.3505096435546875, 0.9720382690429688, 0.318359375, 0.3261566162109375, -0.281036376953125, 0.35736083984375, 0.46875, 0.0294342041015625, 0.4083404541015625, 0.39534759521484375, 0.5268516540527344, 0.2017803192138672, 0.89447021484375, -0.44661712646484375, -0.834228515625, -0.12359619140625, -1.016845703125, -0.09334754943847656], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000025.npy"}
{"epoch": 0.05235602094240838, "step": 26, "batch_size": 128, "mean": 0.029970183968544006, "std": 0.9350574612617493, "min": -2.48089599609375, "p10": -1.0705810546874999, "median": 0.03905677795410156, "p90": 1.173170471191406, "max": 3.6612548828125, "pos_frac": 0.5234375, "sample": [-2.48089599609375, 0.06392097473144531, 2.83489990234375, -0.6729583740234375, -1.9908447265625, -0.4336204528808594, 0.15835952758789062, 0.24971771240234375, 0.1580810546875, 0.06917190551757812, -0.025562286376953125, -0.6640625, 0.0, -0.45752716064453125, 0.381988525390625, 0.396026611328125, 0.738372802734375, -0.6412811279296875, -0.1153564453125, 1.4224853515625, 1.643951416015625, -0.7520599365234375, -0.3824806213378906, 1.542236328125, -1.207763671875, 0.06972122192382812, -0.79254150390625, 0.0343017578125, 0.1660308837890625, -0.50042724609375, -0.26471710205078125, 0.59844970703125, 0.5537109375, -0.18408203125, -2.1351318359375, 0.70458984375, 1.2756500244140625, -1.10809326171875, -0.161102294921875, 0.452667236328125, 0.202178955078125, -0.13419151306152344, -0.3589324951171875, 0.4642333984375, -1.68548583984375, -0.205657958984375, -0.0623779296875, -1.0530242919921875, -0.0184783935546875, -0.08518218994140625, -0.413421630859375, -1.56402587890625, -0.44234466552734375, -0.2730712890625, 0.218536376953125, -0.3689384460449219, -1.4365692138671875, -1.05450439453125, 0.122955322265625, -1.7498779296875, 0.1884307861328125, 1.0048828125, 0.39324951171875, 0.48724365234375, 0.21722412109375, -0.2811737060546875, 0.077972412109375, 0.7999267578125, 1.1517333984375, 0.8262939453125, -0.9092864990234375, -0.5491619110107422, 0.799407958984375, -0.627227783203125, 0.23191452026367188, 0.043811798095703125, 0.16119384765625, 0.224456787109375, 0.63238525390625, 0.0692138671875, 2.322540283203125, 3.6612548828125, 0.1733551025390625, 0.4605712890625, 1.234222412109375, 0.6740818023681641, 1.2231903076171875, -1.0482177734375, 0.01085662841796875, -0.8862762451171875, 
0.61517333984375, 0.3315849304199219, 0.08768844604492188, -2.317291259765625, 0.689422607421875, -0.5589141845703125, 0.118133544921875, -0.1627635955810547, -0.1026763916015625, -0.4671630859375, -0.0947265625, -0.26346588134765625, 1.498504638671875, -0.484619140625, -0.155517578125, 1.970001220703125, 1.31756591796875, 0.0, 0.17535400390625, -0.11027145385742188, -0.010711669921875, 0.135772705078125, -0.374786376953125, -1.6946563720703125, 0.895782470703125, -1.2203521728515625, 0.48553466796875, 0.73431396484375, 0.498992919921875, 1.116455078125, -0.06884765625, -0.46826171875, -0.19742584228515625, 1.562255859375, 0.091339111328125, 0.013763427734375, 0.088836669921875, -1.251556396484375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000026.npy"}
{"epoch": 0.05445026178010471, "step": 27, "batch_size": 128, "mean": -0.04163980484008789, "std": 0.9359468817710876, "min": -4.53863525390625, "p10": -1.1707233428955077, "median": 0.0397491455078125, "p90": 1.1649780273437498, "max": 2.86053466796875, "pos_frac": 0.546875, "sample": [0.603912353515625, 0.79656982421875, 0.045780181884765625, 0.3564262390136719, 0.3853759765625, 0.314697265625, -0.7833633422851562, -0.491455078125, -0.4088134765625, 0.5587387084960938, 0.1222076416015625, -2.28033447265625, 1.658233642578125, -0.373016357421875, -1.27032470703125, 1.19915771484375, 0.038330078125, 0.248931884765625, 0.43365478515625, 0.93109130859375, 0.4374237060546875, 0.40265655517578125, 0.041168212890625, -0.3641204833984375, 2.86053466796875, 1.779449462890625, -0.0380859375, 1.33831787109375, 0.03759002685546875, 0.5498809814453125, -0.16522216796875, 0.0149688720703125, 1.047332763671875, 0.155517578125, 0.192596435546875, -0.778564453125, 0.32968902587890625, -1.169189453125, -0.5081024169921875, 0.234832763671875, 1.352294921875, -0.12445068359375, -1.682952880859375, 0.33339691162109375, -0.41754150390625, 1.15032958984375, 0.15529251098632812, -0.4405670166015625, 0.634979248046875, 0.01329803466796875, -0.717193603515625, 0.26123046875, -0.427032470703125, 0.038238525390625, 1.34759521484375, 0.288299560546875, -0.107940673828125, 0.2334442138671875, -0.0412445068359375, 0.826171875, -1.2587890625, -0.6541900634765625, -0.490753173828125, -1.1737060546875, -0.0869903564453125, 1.317901611328125, 0.637847900390625, 0.3753395080566406, 2.25408935546875, -1.02325439453125, -1.781219482421875, -1.5321044921875, -0.989013671875, -0.5103759765625, 0.45166015625, 0.1128692626953125, -1.43804931640625, 0.144622802734375, 1.208709716796875, 0.019927978515625, -0.5173568725585938, 0.0543212890625, -0.165191650390625, -0.5694122314453125, -1.804443359375, -0.272247314453125, 1.721160888671875, -0.4346904754638672, 0.2695960998535156, 0.211639404296875, 
0.13763427734375, -0.983978271484375, -4.53863525390625, -0.385101318359375, -1.3173675537109375, 0.24945068359375, -1.006256103515625, -0.5973720550537109, 0.3967132568359375, 1.04071044921875, -0.3586883544921875, 0.10077095031738281, -0.9214553833007812, -0.63665771484375, 0.16033935546875, 1.34869384765625, 0.210784912109375, -0.4546699523925781, -0.17125701904296875, 0.10205650329589844, -0.464447021484375, 0.07558631896972656, 0.1183319091796875, 1.304779052734375, -0.444305419921875, 0.546112060546875, -0.0904693603515625, -1.1694450378417969, -0.2318878173828125, 0.328399658203125, 0.2138671875, 0.27164459228515625, -0.047576904296875, -2.13275146484375, -0.046295166015625, 0.07187652587890625, -1.186187744140625, -0.06086158752441406], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000027.npy"}
{"epoch": 0.05654450261780105, "step": 28, "batch_size": 128, "mean": 0.10787186026573181, "std": 0.7490465044975281, "min": -1.955322265625, "p10": -0.8153961181640624, "median": 0.10210037231445312, "p90": 0.989900207519531, "max": 2.334686279296875, "pos_frac": 0.5625, "sample": [0.261810302734375, -1.24005126953125, -0.6835861206054688, 0.48980712890625, 0.21587371826171875, -0.7178478240966797, -0.2736663818359375, 0.44156646728515625, 0.31134986877441406, -0.0277099609375, 0.061126708984375, 0.4279632568359375, 0.02899169921875, 0.345062255859375, 0.113067626953125, 0.38824462890625, -0.521575927734375, -0.23825836181640625, -0.287841796875, 0.429351806640625, -0.16628265380859375, -0.0406951904296875, -0.32550811767578125, -0.3651123046875, -0.5546417236328125, -0.4052276611328125, 1.1632080078125, 0.18715858459472656, -0.798065185546875, -0.7037506103515625, 0.2069244384765625, -0.6293487548828125, -1.227508544921875, 0.577545166015625, 0.029193878173828125, 0.2147064208984375, 1.3852386474609375, 0.4370880126953125, 0.80255126953125, 0.43218994140625, -0.8558349609375, -1.955322265625, -0.1161956787109375, -1.427825927734375, 0.1187896728515625, 1.279266357421875, 0.143280029296875, 2.334686279296875, 0.80865478515625, 0.708526611328125, 0.06939697265625, 0.5288162231445312, -1.241546630859375, -1.175537109375, -0.469146728515625, -0.7870330810546875, -0.1151275634765625, -0.33795166015625, -0.13616943359375, 0.8146820068359375, -1.350616455078125, 0.7011642456054688, -0.1544952392578125, -0.4308586120605469, 2.2373046875, 0.3692779541015625, -0.0720062255859375, 0.8064727783203125, 1.296844482421875, 0.341461181640625, 0.40167236328125, -0.134002685546875, 0.28372955322265625, -0.43430328369140625, -0.399383544921875, 1.20574951171875, 0.1622314453125, -0.227691650390625, 0.01465606689453125, 0.8875732421875, 0.7896804809570312, -0.0130157470703125, -0.322906494140625, -0.4517822265625, 1.7890625, 0.22959136962890625, 0.5327301025390625, 
0.9695892333984375, 0.359039306640625, 0.1941680908203125, -0.351715087890625, -0.27655029296875, 0.674896240234375, 0.73565673828125, -0.15275001525878906, 0.327117919921875, 1.08935546875, 0.09113311767578125, 0.070037841796875, -0.550445556640625, 0.661346435546875, 0.32440185546875, -0.923980712890625, 0.781768798828125, 0.5146331787109375, -0.3807373046875, 0.6138534545898438, -0.41155242919921875, 1.761993408203125, -0.13636016845703125, 1.634246826171875, -0.967742919921875, -0.8580322265625, -0.33049774169921875, 0.19439697265625, -0.7836532592773438, 0.3753662109375, 1.4053153991699219, -1.122039794921875, 1.03729248046875, 0.0886993408203125, 0.85986328125, 0.93975830078125, 0.650787353515625, 0.249267578125, -0.3656730651855469, 0.0, -1.1745452880859375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000028.npy"}
{"epoch": 0.05863874345549738, "step": 29, "batch_size": 128, "mean": 0.0964423269033432, "std": 0.9489491581916809, "min": -2.59405517578125, "p10": -1.0295074462890625, "median": 0.0, "p90": 1.3790649414062492, "max": 3.23699951171875, "pos_frac": 0.4921875, "sample": [-0.03627204895019531, -1.10888671875, 2.65435791015625, -2.14892578125, -0.806304931640625, -0.484130859375, 1.6148681640625, 3.23699951171875, -0.61572265625, 1.58282470703125, 0.784698486328125, 1.202545166015625, 1.680908203125, -0.2217864990234375, 0.4650421142578125, 0.0, -1.094482421875, 0.598663330078125, 0.58184814453125, -0.2941436767578125, 1.550689697265625, 0.66363525390625, 0.1711578369140625, 0.17779541015625, -0.4702911376953125, 1.6163330078125, 1.72003173828125, -0.1260528564453125, -0.7251739501953125, 0.11211395263671875, -0.1265869140625, -0.549072265625, 0.68096923828125, 0.811370849609375, -0.7875213623046875, -0.21722412109375, -1.76617431640625, 0.201263427734375, 0.3775634765625, -0.8769760131835938, 0.237213134765625, 0.38323020935058594, 1.750518798828125, 0.9143524169921875, -0.449310302734375, -0.15947723388671875, -0.175994873046875, -0.357208251953125, -0.7800445556640625, -0.6804351806640625, 0.71002197265625, -1.5926513671875, 0.201934814453125, -0.7773895263671875, 2.240081787109375, 1.608184814453125, -0.34771728515625, -0.478759765625, 0.23944091796875, 0.4292755126953125, 0.032562255859375, 0.855224609375, -0.06561279296875, 1.653839111328125, 0.08475494384765625, 0.5232810974121094, -1.0201416015625, 0.2259521484375, 0.0, -0.44366455078125, 1.807220458984375, -1.528778076171875, 0.646270751953125, -1.257965087890625, 0.3688201904296875, -0.1194915771484375, 0.6385498046875, -0.14038848876953125, -1.57330322265625, 0.6691131591796875, -0.7728271484375, -0.4293060302734375, -0.16412353515625, 1.150726318359375, 1.305511474609375, 0.0557861328125, -0.39414215087890625, 1.0074310302734375, -0.667572021484375, 1.11138916015625, 0.31668853759765625, -0.52880859375, 
0.36345672607421875, 0.943206787109375, -0.060638427734375, -0.4385528564453125, -1.749664306640625, 1.142974853515625, -0.0946044921875, 1.0766143798828125, -0.08721923828125, 0.5086441040039062, 0.5828399658203125, -0.9386215209960938, 0.3987274169921875, 0.7704086303710938, -0.4501800537109375, -0.55377197265625, -0.426025390625, -1.136077880859375, 0.11126708984375, -0.442962646484375, -0.353515625, -0.8095703125, 0.9415130615234375, -2.59405517578125, 0.670379638671875, -1.051361083984375, 0.555419921875, -0.0132293701171875, -0.29824256896972656, -0.01844024658203125, 0.01287841796875, -1.1681289672851562, -0.093719482421875, 0.46630859375, -0.255279541015625, 0.5416259765625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000029.npy"}
{"epoch": 0.060732984293193716, "step": 30, "batch_size": 128, "mean": 0.08629591763019562, "std": 0.8240646123886108, "min": -1.867340087890625, "p10": -1.0079538345336914, "median": 0.031421661376953125, "p90": 1.1026748657226562, "max": 2.921661376953125, "pos_frac": 0.53125, "sample": [1.4947509765625, 0.259002685546875, -0.36859130859375, -0.885162353515625, -0.3675117492675781, 1.435089111328125, -0.0029449462890625, 0.0977325439453125, -0.439422607421875, 0.318267822265625, 0.5069808959960938, 0.03082275390625, 1.104736328125, 0.255767822265625, 0.8652496337890625, -1.4227294921875, -1.867340087890625, 0.9423828125, -0.31937408447265625, -0.1679840087890625, -0.770599365234375, 0.80426025390625, 0.219146728515625, -0.11750411987304688, 0.3772125244140625, -0.810211181640625, -1.01617431640625, 0.4035930633544922, -0.02251434326171875, 1.01861572265625, 0.873077392578125, -0.23939132690429688, -0.3109130859375, 0.382232666015625, -0.6771011352539062, 1.9677734375, -0.2449951171875, 0.06757354736328125, 0.03202056884765625, 0.01409912109375, -0.752044677734375, -0.03386688232421875, -0.4296875, 1.4427490234375, -0.5426025390625, 1.8631591796875, -0.17444610595703125, 0.382232666015625, 1.1017913818359375, -1.11962890625, 0.5909957885742188, 0.99456787109375, -0.66754150390625, 0.7043609619140625, 1.355865478515625, 0.4409027099609375, -1.46917724609375, 0.6231231689453125, -0.7730712890625, 0.0720062255859375, -0.43328857421875, 0.376953125, 0.89508056640625, -1.6463699340820312, 0.214599609375, -0.7501983642578125, -0.36444091796875, -1.14111328125, -0.10470199584960938, 0.7991943359375, -0.028039932250976562, -1.126953125, -0.6004257202148438, -0.32965850830078125, 1.0008544921875, 1.15655517578125, 0.0983428955078125, -0.9089508056640625, -0.207916259765625, -1.1981658935546875, 1.2216415405273438, 2.921661376953125, -1.18212890625, 0.63482666015625, 0.4751930236816406, 0.12506103515625, 0.38348388671875, -0.4786376953125, 1.298187255859375, 
-0.64947509765625, -0.37664794921875, 0.0303802490234375, 0.4983062744140625, -1.3336181640625, -1.24664306640625, -0.89849853515625, -0.51715087890625, -0.018627166748046875, 0.8828125, 0.5369873046875, 0.07611083984375, -0.523590087890625, 0.7618331909179688, -0.14586639404296875, -0.04937744140625, -0.24474334716796875, 0.195770263671875, -0.575286865234375, 0.8231048583984375, 0.89898681640625, 1.099212646484375, -0.0977020263671875, 1.256988525390625, 0.429351806640625, -0.1859149932861328, 0.12377166748046875, 0.915069580078125, 0.760223388671875, -1.0044307708740234, -1.397796630859375, 0.17401123046875, 0.5463333129882812, 1.207061767578125, -0.298004150390625, 0.011663436889648438, 0.7123489379882812, 0.881866455078125, -0.34320068359375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000030.npy"}
{"epoch": 0.06282722513089005, "step": 31, "batch_size": 128, "mean": 0.15123386681079865, "std": 0.8554779887199402, "min": -2.4102783203125, "p10": -0.7391891479492188, "median": 0.15804004669189453, "p90": 1.1858497619628903, "max": 3.070831298828125, "pos_frac": 0.578125, "sample": [-0.7318649291992188, 1.339599609375, -0.33917236328125, 0.707122802734375, 0.1744842529296875, 0.6666030883789062, 0.7913665771484375, 0.26708984375, 1.1463241577148438, 0.557159423828125, 0.5828857421875, -0.49139404296875, 1.83685302734375, -0.651763916015625, -0.821990966796875, 0.513031005859375, 0.7330188751220703, 0.11201095581054688, 0.6735076904296875, -0.6093902587890625, -2.4102783203125, -0.853515625, 0.501800537109375, 0.5958251953125, 1.407470703125, 2.020477294921875, -0.4224700927734375, 1.3647613525390625, -0.8188095092773438, 0.46521759033203125, 0.18896484375, 1.979522705078125, -0.2045440673828125, -0.6779594421386719, -1.611175537109375, -1.548492431640625, 0.1821441650390625, 0.02508544921875, -0.24951171875, 0.0, 0.3641510009765625, -0.2716522216796875, 0.0, 0.60400390625, 0.7913589477539062, -0.6238861083984375, 0.2006683349609375, 0.6224555969238281, -1.25274658203125, -0.9794769287109375, 0.2381591796875, -0.310028076171875, -0.142578125, 1.284820556640625, -0.3467979431152344, -0.33365631103515625, 1.278076171875, 3.070831298828125, -0.725860595703125, -0.39990997314453125, 0.0, -0.373779296875, 0.484405517578125, -1.69940185546875, 0.6450347900390625, 0.2751312255859375, -0.12860107421875, 0.007781982421875, 0.0206298828125, 0.0386962890625, -0.3165435791015625, -1.8826904296875, 0.19447708129882812, -0.27312469482421875, 1.138763427734375, 0.5881195068359375, -0.2909698486328125, -0.07759857177734375, -0.5382080078125, 0.947784423828125, 0.12108230590820312, 1.07330322265625, 0.69891357421875, 0.16776466369628906, -0.5479660034179688, 0.2739715576171875, 1.307861328125, -1.17584228515625, 0.10451507568359375, 1.05377197265625, -0.6319961547851562, 
0.035137176513671875, 0.83026123046875, 2.1353759765625, -2.2592315673828125, -0.3670654296875, 0.1483154296875, -0.6238861083984375, 0.3912811279296875, 0.0, -0.15240478515625, 0.71197509765625, -0.3004608154296875, 0.8549728393554688, 0.3634147644042969, 0.304473876953125, 0.50531005859375, -0.29802703857421875, -0.339691162109375, 0.22967529296875, 0.732666015625, -0.15401268005371094, -0.7562789916992188, -0.48602294921875, 1.441070556640625, 0.40657806396484375, -0.12263679504394531, 0.2348785400390625, 0.60369873046875, -0.0811767578125, 0.42822265625, 1.0635223388671875, 0.21159744262695312, 1.925811767578125, -0.22255516052246094, 0.5438995361328125, 0.132568359375, 0.6234664916992188], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000031.npy"}
|
||||
{"epoch": 0.06492146596858639, "step": 32, "batch_size": 128, "mean": 0.17317189276218414, "std": 0.9343750476837158, "min": -2.4798583984375, "p10": -0.9638654708862304, "median": 0.17930316925048828, "p90": 1.2920623779296871, "max": 3.8787841796875, "pos_frac": 0.59375, "sample": [-1.033538818359375, -0.0464019775390625, 0.643646240234375, 0.438629150390625, -0.730194091796875, -1.185455322265625, -0.109039306640625, 0.0798797607421875, 0.288726806640625, 0.1962890625, 0.22381591796875, -0.06824493408203125, 1.966156005859375, 0.7724380493164062, -0.73992919921875, -0.9632892608642578, -1.00689697265625, -0.1944580078125, 0.1387939453125, 1.067779541015625, 0.1209716796875, -0.4166259765625, -0.1015472412109375, -0.1749114990234375, 1.470733642578125, 1.124114990234375, -0.48052215576171875, -0.11677169799804688, 0.2921905517578125, 0.40691375732421875, -0.292388916015625, 0.07135772705078125, -0.7467880249023438, 0.30047607421875, -1.136322021484375, -0.018402099609375, -0.48223876953125, -1.294708251953125, -0.670318603515625, 0.426300048828125, 0.805999755859375, 0.5736083984375, 0.69091796875, -0.3367156982421875, 0.554718017578125, 0.8401031494140625, 0.5738525390625, -0.7342376708984375, -0.5572052001953125, 0.7366943359375, 0.695343017578125, -1.19293212890625, -1.28533935546875, 0.85504150390625, 3.8787841796875, -0.0649261474609375, 2.387664794921875, 2.3592529296875, 0.604095458984375, -0.9652099609375, -0.8118896484375, -1.648895263671875, 0.24298095703125, 0.33294677734375, -0.7529754638671875, 0.0657501220703125, -0.7294464111328125, 0.4156341552734375, 1.24755859375, 0.688720703125, 0.4759521484375, -0.23354339599609375, 0.3812751770019531, 0.152923583984375, -0.080352783203125, 0.4966888427734375, 0.3173675537109375, -0.20550537109375, 0.2857513427734375, 2.3564453125, 0.174530029296875, -1.53375244140625, 0.30199432373046875, -0.49465370178222656, 0.87786865234375, 1.1781158447265625, 0.0227203369140625, -1.0343475341796875, -0.05669403076171875, 
2.328857421875, 0.4468841552734375, -0.0926666259765625, -0.8771209716796875, 1.70831298828125, 0.289215087890625, 0.427215576171875, -0.493865966796875, 0.2919921875, 0.5606689453125, 0.106353759765625, -0.434356689453125, 2.00897216796875, 1.7268218994140625, 0.07609939575195312, 0.76470947265625, -0.5824661254882812, -0.5955886840820312, 0.25421142578125, -0.6280975341796875, -0.00852203369140625, 0.92730712890625, 0.46276092529296875, 0.1802978515625, 0.762969970703125, 1.395904541015625, 0.263763427734375, -2.4798583984375, -0.181915283203125, 0.677490234375, -1.7082061767578125, -0.807952880859375, 0.17830848693847656, 1.860626220703125, 1.596435546875, 0.21759033203125, 0.22869873046875, 0.038421630859375, 0.40283203125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000032.npy"}
|
||||
{"epoch": 0.06701570680628273, "step": 33, "batch_size": 128, "mean": 0.1293661743402481, "std": 1.1019879579544067, "min": -3.9110107421875, "p10": -1.1334976196289062, "median": 0.11634063720703125, "p90": 1.3282943725585936, "max": 4.572998046875, "pos_frac": 0.546875, "sample": [0.46518707275390625, -0.59661865234375, 0.8787078857421875, 1.8899383544921875, -0.05742645263671875, 0.4652862548828125, 1.15814208984375, -1.892913818359375, -0.419525146484375, -0.46988677978515625, 1.077484130859375, -0.5149307250976562, 4.572998046875, -1.5032958984375, 0.0, -0.8373184204101562, 0.609039306640625, -0.067596435546875, -0.744384765625, -0.90142822265625, 0.50628662109375, -0.13941192626953125, 0.08955192565917969, 1.420257568359375, 0.575347900390625, -0.52325439453125, 0.6369781494140625, 0.048248291015625, 1.26824951171875, 1.3111572265625, 0.42327880859375, -0.213836669921875, 0.422210693359375, -2.72479248046875, -0.02970123291015625, 1.82684326171875, -0.494842529296875, 0.4259490966796875, 0.3985748291015625, 1.2852783203125, 0.16913604736328125, -0.42901611328125, -0.6278305053710938, 0.31414794921875, -0.13214111328125, -0.2284393310546875, 1.435333251953125, -0.29290771484375, -0.116485595703125, -1.224517822265625, 0.3615264892578125, 0.178741455078125, -1.4604339599609375, 0.0, -0.671630859375, 0.11731719970703125, 0.19033241271972656, 2.372528076171875, -0.404266357421875, -0.509552001953125, 1.81219482421875, -0.79364013671875, -0.7782783508300781, 0.11536407470703125, -0.024675369262695312, 0.24932861328125, -0.10281753540039062, 0.015380859375, -1.0919189453125, 1.362091064453125, 0.389251708984375, 0.15346527099609375, -1.4456787109375, 2.00091552734375, -0.35223388671875, -0.22393798828125, -0.7592315673828125, 1.13232421875, 0.17928314208984375, -0.98516845703125, 0.38836669921875, 0.472930908203125, 2.48309326171875, 0.45987701416015625, 0.618316650390625, -1.100433349609375, 1.174835205078125, 1.13360595703125, -0.031585693359375, 
1.3173675537109375, 1.09197998046875, 0.08760452270507812, 1.2039566040039062, 0.097259521484375, 0.3552207946777344, -1.070953369140625, -0.1525115966796875, 0.863555908203125, -3.9110107421875, -0.1995849609375, -0.25713348388671875, -0.4007568359375, 1.27752685546875, 0.9100341796875, 1.1075439453125, -0.5146636962890625, -0.3756866455078125, 1.064117431640625, -1.2106475830078125, -1.56634521484375, -1.66766357421875, 1.49114990234375, -2.43988037109375, 0.61090087890625, 0.6979141235351562, -1.4063568115234375, -0.1191558837890625, 0.20800399780273438, 0.8406982421875, 0.446624755859375, 1.353790283203125, 0.6160507202148438, -0.59295654296875, 0.7398605346679688, -1.94073486328125, 0.12703323364257812, 0.3357353210449219, 2.424285888671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000033.npy"}
|
||||
{"epoch": 0.06910994764397906, "step": 34, "batch_size": 128, "mean": 0.14365874230861664, "std": 1.0441186428070068, "min": -3.09588623046875, "p10": -1.011199951171875, "median": 0.19400787353515625, "p90": 1.3409973144531249, "max": 3.139373779296875, "pos_frac": 0.5859375, "sample": [0.2139434814453125, 0.642486572265625, 1.377685546875, -0.16129302978515625, 0.1135711669921875, -0.796661376953125, 0.26458740234375, 0.10117340087890625, 0.4889984130859375, 1.572998046875, -2.1868896484375, -0.632568359375, 0.53057861328125, -0.06695556640625, -0.631683349609375, 1.553955078125, -0.5750732421875, 0.8263702392578125, 0.24786376953125, -0.41982269287109375, 0.4867744445800781, -2.33831787109375, 0.58154296875, 0.966827392578125, 1.501190185546875, 0.3140411376953125, 0.9259033203125, -0.271087646484375, 0.09723472595214844, -0.08038330078125, 0.1715545654296875, 1.0284042358398438, 0.46759033203125, -0.767822265625, -0.460174560546875, -0.504852294921875, 2.51885986328125, 1.500762939453125, -1.705841064453125, 1.0525054931640625, 1.1277618408203125, -1.049224853515625, 0.908233642578125, 1.33233642578125, -0.007415771484375, 0.44890594482421875, 0.363800048828125, 1.2588348388671875, 1.635894775390625, -0.8232574462890625, -0.3656463623046875, -1.7208251953125, -0.3484325408935547, 0.93939208984375, -2.2735595703125, 0.5463104248046875, 0.5061492919921875, -0.1949462890625, 3.10235595703125, -1.725128173828125, -0.705718994140625, 0.5526466369628906, 0.467742919921875, -0.2050628662109375, 0.0560302734375, 1.50689697265625, -1.38836669921875, -0.06527900695800781, 0.8050994873046875, -0.61993408203125, 0.27692413330078125, 0.740570068359375, -1.993133544921875, -0.2792205810546875, -0.872039794921875, 0.1947479248046875, -0.031444549560546875, 0.14882469177246094, 1.0281600952148438, -0.25967979431152344, 3.139373779296875, 0.707977294921875, -0.60369873046875, -0.707672119140625, 0.8794403076171875, -0.75616455078125, 0.193267822265625, 0.019683837890625, 
0.15533447265625, 0.281280517578125, -0.41118621826171875, -0.967620849609375, 1.29803466796875, -2.3419189453125, -0.3280487060546875, -0.1455078125, 0.2138519287109375, -0.6306610107421875, -1.24346923828125, 0.41602325439453125, -2.127197265625, 1.08343505859375, 0.29833984375, -0.994903564453125, 0.56768798828125, -0.14388275146484375, -0.2877044677734375, 0.12459564208984375, -0.08903884887695312, 1.712890625, 1.827484130859375, 0.7618408203125, -0.36346435546875, 1.133880615234375, 0.327423095703125, -0.054718017578125, -3.09588623046875, 1.05438232421875, -0.21893310546875, 0.160430908203125, 0.36865234375, 0.6102447509765625, 0.9089508056640625, 0.818267822265625, 1.3612060546875, 0.36392974853515625, 1.09381103515625, 1.050994873046875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000034.npy"}
|
||||
{"epoch": 0.0712041884816754, "step": 35, "batch_size": 128, "mean": 0.1330869495868683, "std": 0.9978698492050171, "min": -2.654693603515625, "p10": -1.0886978149414062, "median": 0.18111801147460938, "p90": 1.1516052246093746, "max": 3.3116455078125, "pos_frac": 0.59375, "sample": [-0.32398223876953125, 0.00152587890625, 0.2875823974609375, 1.5636749267578125, 0.185882568359375, 1.77325439453125, 0.75958251953125, -0.1083526611328125, -2.283599853515625, -1.5535888671875, 1.8494873046875, 0.43597412109375, 0.904541015625, -0.2671661376953125, 0.62359619140625, 0.3116111755371094, -2.479583740234375, -0.9720458984375, -0.1170501708984375, -0.007232666015625, 0.5839309692382812, 1.050872802734375, 0.72064208984375, -0.3649444580078125, 0.34228515625, 1.0005950927734375, 1.12042236328125, -0.2348480224609375, 1.025604248046875, -0.210205078125, 0.294769287109375, -0.321868896484375, -0.285491943359375, 0.0932769775390625, 2.2459716796875, 1.987060546875, 0.3459014892578125, 1.01617431640625, -0.11577606201171875, 0.0, 0.45941162109375, 1.21282958984375, 0.2309112548828125, 0.6456298828125, 0.5520095825195312, -1.332305908203125, 3.3116455078125, 0.263580322265625, 0.9980010986328125, 0.10791015625, -1.9880828857421875, -0.29804229736328125, 0.18634033203125, -1.682861328125, -0.40704345703125, -0.6885604858398438, 0.324371337890625, -2.654693603515625, 0.996856689453125, 0.21035385131835938, -1.00830078125, 2.41668701171875, 1.417938232421875, -0.23613739013671875, -0.4354248046875, -0.15106201171875, -0.73565673828125, 0.06549644470214844, 0.737396240234375, 0.48351287841796875, -1.1105194091796875, -0.722442626953125, -1.079345703125, 1.5431976318359375, -0.22222518920898438, 0.50799560546875, -0.5430335998535156, 0.521148681640625, 1.894561767578125, -1.33935546875, 0.654571533203125, 0.442138671875, -0.143341064453125, -0.23397064208984375, 1.2970428466796875, -0.28890228271484375, -0.23276138305664062, -0.999298095703125, 0.00421142578125, 0.6544303894042969, 
-2.1688232421875, 0.3983612060546875, 0.3612384796142578, 0.9547042846679688, 0.330230712890625, -0.5712890625, -0.546142578125, 0.587158203125, 0.12091064453125, 2.380828857421875, -2.235443115234375, 0.342254638671875, 0.0, 0.170928955078125, 0.6622467041015625, 0.457275390625, -0.184844970703125, 0.1759033203125, 1.090087890625, -0.9293670654296875, 0.174530029296875, -0.888519287109375, -1.3214111328125, 0.999542236328125, 0.42853546142578125, 0.05902099609375, -0.052825927734375, 1.1253662109375, 0.307830810546875, 0.17635345458984375, 0.5824737548828125, 0.9373779296875, 0.0599212646484375, -0.47808837890625, 0.5472145080566406, -0.67529296875, -1.2173309326171875, 0.388916015625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000035.npy"}
|
||||
{"epoch": 0.07329842931937172, "step": 36, "batch_size": 128, "mean": 0.27686381340026855, "std": 1.0341625213623047, "min": -3.430419921875, "p10": -0.84393310546875, "median": 0.161712646484375, "p90": 1.3870056152343748, "max": 3.65771484375, "pos_frac": 0.59375, "sample": [0.34228515625, -0.84246826171875, 1.3073883056640625, -0.26068115234375, -0.47869873046875, 1.1060791015625, -0.9818115234375, -0.19544219970703125, -1.1766357421875, -0.5272560119628906, -0.814239501953125, 0.6523895263671875, 0.01123046875, -0.37762451171875, -0.542236328125, -0.11352920532226562, -0.07863616943359375, 0.01080322265625, 1.00384521484375, 0.45101165771484375, 1.1252059936523438, -2.31304931640625, 0.328460693359375, 0.0, -0.28128814697265625, 0.378143310546875, -0.21783447265625, -0.84735107421875, -0.913818359375, 0.21273422241210938, 2.0902099609375, 0.202484130859375, 1.062255859375, 0.164581298828125, -0.294708251953125, -0.436553955078125, -0.099456787109375, -0.6029205322265625, 0.10097312927246094, 0.584991455078125, 3.65771484375, -0.124298095703125, 0.1756744384765625, 1.33502197265625, -0.1950531005859375, -0.021240234375, 1.81219482421875, 2.2836456298828125, -1.0286788940429688, 0.949005126953125, 1.10498046875, 0.4766998291015625, 0.64990234375, -0.5270919799804688, -1.212615966796875, 0.319732666015625, 0.7580108642578125, 0.494354248046875, 2.9722900390625, 0.1925048828125, 0.0, -1.68682861328125, 0.065582275390625, 0.11859893798828125, 1.0548858642578125, 1.559326171875, -0.5243415832519531, 0.8616943359375, -0.3790779113769531, 0.164642333984375, 0.99505615234375, 0.431610107421875, 1.06890869140625, -0.11346435546875, -0.74481201171875, 0.27411651611328125, 0.809967041015625, 2.892181396484375, -1.877838134765625, -0.5260009765625, 0.01171875, 0.1395416259765625, 1.7101898193359375, 0.9990005493164062, -3.430419921875, -0.27794837951660156, -0.73529052734375, -0.210479736328125, -0.12982177734375, 0.3184051513671875, 1.11505126953125, 1.35992431640625, 
1.8195648193359375, 0.955902099609375, -1.48291015625, -0.06011962890625, -0.13591957092285156, 1.187255859375, 0.6387786865234375, 1.1295013427734375, 0.2059326171875, 0.482025146484375, 1.154052734375, 1.34503173828125, 0.0390625, -0.13282012939453125, 0.689910888671875, 0.715576171875, 0.1356201171875, 0.04559326171875, -0.28308868408203125, 0.158843994140625, 0.09909820556640625, 1.4501953125, -0.5926666259765625, -1.03289794921875, 1.734619140625, 1.343994140625, 3.3935546875, -0.3383331298828125, 1.6566162109375, 1.0335693359375, -0.195953369140625, 0.2973194122314453, 0.50628662109375, 0.5893173217773438, -0.0665130615234375, -1.173095703125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000036.npy"}
|
||||
{"epoch": 0.07539267015706806, "step": 37, "batch_size": 128, "mean": 0.195171520113945, "std": 0.9629870057106018, "min": -2.72406005859375, "p10": -0.8439605712890624, "median": 0.12828826904296875, "p90": 1.2819610595703121, "max": 4.50933837890625, "pos_frac": 0.578125, "sample": [-0.47306060791015625, 0.11724853515625, 0.427825927734375, 0.943756103515625, 0.764862060546875, 0.028165817260742188, 0.64898681640625, 0.3182563781738281, 0.6058807373046875, 0.104522705078125, 2.220703125, 0.16809654235839844, -0.7371826171875, 0.33621978759765625, 0.8968353271484375, 2.5673828125, 1.383880615234375, 0.150726318359375, 0.97100830078125, -1.2543487548828125, 1.009857177734375, 0.6495361328125, -0.4755992889404297, -0.121429443359375, 0.38958740234375, 0.1688690185546875, 0.04488372802734375, 0.91796875, 0.614044189453125, -0.305572509765625, -0.1096954345703125, -0.30466461181640625, -0.3592967987060547, 1.478118896484375, 0.26920318603515625, 1.945037841796875, -1.40283203125, 0.78216552734375, 0.8641357421875, -0.40814208984375, 0.1494731903076172, -0.7733154296875, -0.37749481201171875, 0.60113525390625, 1.18536376953125, 1.651123046875, 0.16736221313476562, -0.1531829833984375, 0.16625213623046875, -0.339813232421875, 0.1638641357421875, 1.8061676025390625, -0.573638916015625, 2.08544921875, -0.959869384765625, -0.593017578125, -0.663330078125, 0.738433837890625, -1.2728118896484375, 0.288665771484375, 0.311248779296875, 0.7003173828125, 0.08294677734375, -0.486053466796875, -0.823333740234375, -0.34112548828125, -1.64453125, 0.625732421875, 0.54339599609375, -0.42388916015625, 0.494049072265625, 0.55120849609375, 1.51934814453125, 0.0319061279296875, 0.896148681640625, -0.0614013671875, 0.32543182373046875, -0.097320556640625, -0.89208984375, -1.201873779296875, 0.32146453857421875, 2.355987548828125, 0.7587661743164062, -1.5712890625, 0.039520263671875, -0.024200439453125, 0.797607421875, -0.4103240966796875, 1.2110595703125, 1.23828125, -0.12506103515625, 
0.518280029296875, 0.01513671875, -0.61285400390625, -0.2590179443359375, 0.22095870971679688, 0.34185791015625, -0.3423042297363281, -0.0580902099609375, 0.098388671875, 0.819549560546875, -0.318603515625, -0.57659912109375, -0.2994575500488281, -0.03033447265625, -0.603729248046875, 4.50933837890625, -0.5131683349609375, 0.2229461669921875, -0.053707122802734375, 0.0213775634765625, -0.08892822265625, -0.0205078125, -0.904327392578125, 2.149383544921875, -0.329864501953125, 1.06158447265625, -2.41650390625, -0.0513153076171875, -0.9905242919921875, -2.72406005859375, -0.9533538818359375, 0.593994140625, 0.1393280029296875, 1.07501220703125, 1.584075927734375, -0.319183349609375, 0.24249267578125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000037.npy"}
|
||||
{"epoch": 0.0774869109947644, "step": 38, "batch_size": 128, "mean": 0.13005219399929047, "std": 0.9023271799087524, "min": -2.250030517578125, "p10": -0.8705596923828125, "median": 0.110595703125, "p90": 1.2471107482910155, "max": 2.97711181640625, "pos_frac": 0.5390625, "sample": [0.791259765625, 0.9479217529296875, -0.46063232421875, -0.06000518798828125, -0.6819992065429688, -0.2757720947265625, 0.14385986328125, 0.852630615234375, -0.148651123046875, 0.471649169921875, 0.113311767578125, 0.14368438720703125, -0.092315673828125, 2.232452392578125, -0.197418212890625, 1.68988037109375, -0.86309814453125, -0.14385986328125, -0.7824859619140625, 0.7709503173828125, 0.695556640625, 0.24676513671875, 0.2711296081542969, -0.02208709716796875, 0.6493377685546875, -0.621337890625, -0.815460205078125, -0.81365966796875, -0.2089996337890625, -0.23682022094726562, -1.36260986328125, 0.9886474609375, -0.02899169921875, -1.279052734375, -0.5600814819335938, -0.6574249267578125, 0.107879638671875, -0.4727344512939453, 0.56671142578125, -2.250030517578125, -0.8446502685546875, -0.25925445556640625, 2.97711181640625, 0.0293731689453125, -1.333404541015625, 0.7301788330078125, -0.431640625, 0.619598388671875, -0.5992431640625, -0.8007965087890625, 0.749267578125, 0.44025421142578125, -2.221343994140625, -0.24877166748046875, 0.3157005310058594, -0.501678466796875, -0.11566162109375, 0.563232421875, 0.3279571533203125, 0.7205810546875, 0.5473175048828125, -1.06695556640625, 0.613922119140625, 0.028106689453125, -0.0631866455078125, -0.06252288818359375, 0.6362075805664062, -0.9138946533203125, 1.4093780517578125, 0.21722412109375, 0.8397216796875, -0.482818603515625, 0.968658447265625, -0.887969970703125, -0.4303436279296875, -1.89215087890625, 1.3347320556640625, 0.13159942626953125, 0.00608062744140625, -2.098876953125, 0.4458770751953125, -0.260528564453125, -0.340301513671875, -0.83282470703125, 0.41779327392578125, 1.309844970703125, 1.53326416015625, -0.09844970703125, 
0.213287353515625, 1.0221290588378906, 0.28218841552734375, 0.652618408203125, 2.366851806640625, 0.275665283203125, 1.2916259765625, 0.4647483825683594, 0.57330322265625, 1.1404266357421875, 1.8287353515625, -0.403961181640625, 0.15058135986328125, 0.0, -0.766815185546875, -0.04351806640625, 0.6590576171875, -0.4195556640625, 1.14013671875, 1.2688522338867188, 1.23779296875, 0.6812744140625, 0.7291259765625, 0.079132080078125, 0.2681121826171875, 1.83154296875, 1.7236328125, -1.0872802734375, 1.09954833984375, 0.9011077880859375, -0.0761566162109375, -1.0413818359375, -1.3675537109375, -0.7516632080078125, -0.63372802734375, 0.30918121337890625, -0.006500244140625, 1.068634033203125, -0.15070724487304688, 0.362396240234375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000038.npy"}
|
||||
{"epoch": 0.07958115183246073, "step": 39, "batch_size": 128, "mean": 0.49477773904800415, "std": 1.0289883613586426, "min": -1.876678466796875, "p10": -0.6030990600585937, "median": 0.323760986328125, "p90": 1.8036224365234368, "max": 3.66754150390625, "pos_frac": 0.65625, "sample": [1.29620361328125, 0.198394775390625, -1.63690185546875, -0.1257476806640625, 0.4822998046875, 1.1346282958984375, 0.2249603271484375, 0.20076751708984375, -1.553802490234375, 1.37841796875, 2.622039794921875, 1.4422225952148438, 0.44659423828125, 2.32647705078125, 0.28729248046875, 3.66754150390625, -0.15061187744140625, 2.380340576171875, 0.8792572021484375, -0.583465576171875, 0.08367919921875, 0.7110595703125, 0.8363037109375, -0.919342041015625, 2.786529541015625, 1.6815719604492188, 0.8044891357421875, -0.5753173828125, 0.285430908203125, 1.48504638671875, 2.371063232421875, 0.517730712890625, 0.524169921875, -0.6536865234375, 1.593994140625, -0.158538818359375, -0.03857421875, 0.3343048095703125, -0.76031494140625, 1.3132095336914062, -0.28546142578125, -0.03470611572265625, 1.733428955078125, 0.5430488586425781, 0.23429489135742188, 0.696075439453125, 1.521636962890625, 0.12288284301757812, 0.7457122802734375, 2.9088134765625, -0.6489105224609375, 0.07947540283203125, -0.3455810546875, 1.47357177734375, 1.159759521484375, 0.322967529296875, 0.96685791015625, 0.9750823974609375, -0.13092041015625, -1.876678466796875, -0.3903999328613281, -1.1854400634765625, 0.5985946655273438, 2.334625244140625, -0.19419097900390625, 1.1869049072265625, 0.1240386962890625, 0.4711647033691406, 1.9674072265625, -0.05126953125, -1.05804443359375, 0.78717041015625, -0.1726970672607422, -1.1181640625, 0.462738037109375, 1.245880126953125, -0.30078125, 0.842071533203125, 0.6713943481445312, -1.2264404296875, 0.79852294921875, -0.13405227661132812, 0.324554443359375, -0.074066162109375, 0.6773223876953125, -0.96343994140625, 3.565277099609375, 1.4891357421875, 0.3941192626953125, -0.12900543212890625, 
0.27817535400390625, -0.13458251953125, -0.3204345703125, -0.6556510925292969, 0.0, -0.0489501953125, 0.5382003784179688, 0.0291595458984375, 2.4227294921875, 0.058563232421875, 1.2247772216796875, -0.42102813720703125, 0.23382568359375, -0.0243072509765625, 3.08148193359375, 1.5590667724609375, 0.4283447265625, 0.4481658935546875, -0.36993408203125, 0.25714111328125, -0.1244354248046875, 0.3350677490234375, -0.3205375671386719, 0.5983695983886719, 0.22528076171875, -0.5080337524414062, 0.16949462890625, 1.32489013671875, 0.5638427734375, -0.09047317504882812, 0.351348876953125, 0.06790542602539062, 0.1963348388671875, 2.45343017578125, 0.64947509765625, 0.921417236328125, -0.14504432678222656, -0.16551971435546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000039.npy"}
|
||||
{"epoch": 0.08167539267015707, "step": 40, "batch_size": 128, "mean": 0.43317562341690063, "std": 1.16057288646698, "min": -3.3489990234375, "p10": -0.6896316528320312, "median": 0.28447723388671875, "p90": 1.8402374267578123, "max": 4.34014892578125, "pos_frac": 0.640625, "sample": [-0.30533790588378906, -2.965606689453125, -0.6073455810546875, 0.8130645751953125, -0.18818092346191406, 1.478179931640625, -3.3489990234375, 3.309356689453125, -0.095184326171875, 2.0263671875, 0.355316162109375, 0.695220947265625, -0.2776947021484375, -0.42889404296875, 1.288299560546875, 0.47966766357421875, 0.36803436279296875, 0.236785888671875, -0.11170196533203125, 0.6614303588867188, 0.19931793212890625, 1.819793701171875, -0.04796600341796875, 0.37300872802734375, 0.6799850463867188, 2.12786865234375, -0.2409820556640625, 0.6285858154296875, 0.1241607666015625, -0.8377532958984375, -0.7644844055175781, 0.494476318359375, -0.46331787109375, 1.0038909912109375, -0.02215576171875, 1.3433837890625, -0.27777099609375, -0.685882568359375, 2.22467041015625, 2.712890625, 4.34014892578125, 2.131317138671875, -2.81341552734375, 1.3800048828125, -0.7354660034179688, -0.49691009521484375, 0.2784423828125, 1.5890960693359375, 1.028778076171875, 1.9392547607421875, 0.1152496337890625, -0.0280914306640625, 0.49920654296875, -0.0748291015625, 0.9964599609375, 2.02935791015625, 1.2214202880859375, -0.15236663818359375, 4.18505859375, 0.092681884765625, 0.681610107421875, 0.2693195343017578, -0.605621337890625, -0.91961669921875, 1.7978515625, 0.87457275390625, 1.38714599609375, -0.187591552734375, 1.887939453125, 0.6536636352539062, 0.3837394714355469, 0.19167327880859375, 0.5185775756835938, 1.180267333984375, 0.284820556640625, -0.7114524841308594, 1.376678466796875, -0.013507843017578125, 0.0267181396484375, -0.45947265625, -0.2060699462890625, 1.302734375, -0.24460220336914062, -0.5465507507324219, -0.036651611328125, 0.34293174743652344, -0.6983795166015625, 1.708160400390625, 
0.25952911376953125, -0.10931396484375, -0.90325927734375, -0.676055908203125, 0.7894287109375, 1.27398681640625, -0.0263671875, 1.076995849609375, 2.401580810546875, -0.8673095703125, 0.59295654296875, 0.17034149169921875, -0.1099090576171875, 0.910003662109375, -1.83349609375, -0.025188446044921875, -0.405242919921875, 1.372650146484375, 0.2043914794921875, 0.00506591796875, 0.172210693359375, 0.485870361328125, -2.31744384765625, 0.6275634765625, 1.2045745849609375, 0.2841339111328125, 1.555145263671875, 0.2388916015625, -0.4515533447265625, 0.2134246826171875, 1.7698974609375, 0.475311279296875, 1.2947998046875, 0.11530303955078125, -0.6674346923828125, 0.655792236328125, 0.4401092529296875, 2.2648162841796875, 0.8134765625, 0.63201904296875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000040.npy"}
|
||||
{"epoch": 0.08376963350785341, "step": 41, "batch_size": 128, "mean": 0.2582607567310333, "std": 1.09701669216156, "min": -3.5087890625, "p10": -1.040289306640625, "median": 0.29401397705078125, "p90": 1.45872802734375, "max": 3.53656005859375, "pos_frac": 0.625, "sample": [2.27996826171875, 0.790191650390625, 0.40325927734375, -0.3252410888671875, 0.32623291015625, -1.05853271484375, 0.905548095703125, -0.3095703125, -0.1737213134765625, 1.0419464111328125, 1.2153053283691406, 0.473907470703125, -0.7948760986328125, 0.1461181640625, -1.1781158447265625, 0.222442626953125, 0.8979339599609375, -0.05950927734375, 1.33612060546875, 1.1729736328125, -0.66436767578125, 2.8857421875, -0.346588134765625, 0.5002899169921875, 0.50408935546875, -0.004913330078125, 1.39776611328125, 1.1507568359375, 0.593292236328125, 0.3952140808105469, 1.258087158203125, -0.381011962890625, -0.32209014892578125, 0.75054931640625, -0.3065643310546875, -0.2771949768066406, 2.24725341796875, 2.289947509765625, -0.3499755859375, 0.1089324951171875, 0.8968505859375, 0.699432373046875, -0.077911376953125, 0.907745361328125, 1.018768310546875, -0.822540283203125, 1.443603515625, 0.4876556396484375, 0.397064208984375, -0.76324462890625, 0.536102294921875, -0.4749603271484375, -1.032470703125, -0.8452606201171875, 3.53656005859375, -2.415771484375, 0.9334564208984375, 0.13641357421875, -0.56439208984375, 1.3120193481445312, -0.0165557861328125, 0.7022247314453125, 1.656463623046875, -1.716064453125, 0.261749267578125, -0.203460693359375, -1.214080810546875, 0.9467926025390625, 0.0, 0.800262451171875, -0.1683349609375, -2.564697265625, -0.8737030029296875, -1.173553466796875, -2.429443359375, -1.744384765625, 0.68865966796875, 0.0187835693359375, -0.684173583984375, 0.219329833984375, 0.6390914916992188, 0.40552520751953125, -0.06427001953125, 0.820037841796875, 1.1309814453125, 0.1139373779296875, 0.2617950439453125, 0.3394775390625, 3.0185089111328125, 0.2111358642578125, -0.810272216796875, 
0.3581390380859375, 0.7419891357421875, -0.5139617919921875, -0.98748779296875, 1.705352783203125, -1.102783203125, -0.9382781982421875, 0.015186309814453125, -0.4722747802734375, 1.88818359375, 1.330413818359375, 1.4940185546875, 0.39450836181640625, 0.16558837890625, 0.3400115966796875, 0.828765869140625, 0.1689453125, -0.091156005859375, -0.383270263671875, 0.8571205139160156, 1.19232177734375, 0.7598876953125, -1.293121337890625, 1.159942626953125, 0.052947998046875, -0.2320556640625, 0.115264892578125, 0.8989791870117188, 0.4883270263671875, -3.5087890625, 0.486053466796875, 2.00091552734375, 0.083221435546875, 1.5791015625, 1.81878662109375, 0.6843109130859375, -1.65020751953125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000041.npy"}
|
||||
{"epoch": 0.08586387434554973, "step": 42, "batch_size": 128, "mean": 0.3974078297615051, "std": 1.1172817945480347, "min": -2.81787109375, "p10": -0.8244903564453125, "median": 0.3104228973388672, "p90": 1.7963043212890624, "max": 4.5770263671875, "pos_frac": 0.6640625, "sample": [-0.42199134826660156, 0.08513641357421875, 2.317230224609375, 0.8118896484375, -0.455535888671875, 1.97979736328125, -0.1098785400390625, 1.768310546875, 1.23724365234375, -0.04563140869140625, 0.45977783203125, 2.188751220703125, -1.02484130859375, 0.07944488525390625, -1.9970703125, -0.0712890625, 0.301910400390625, 1.63848876953125, 0.3007354736328125, 4.5770263671875, 0.08641815185546875, -0.2369537353515625, -0.82208251953125, 0.456787109375, 1.706390380859375, -0.1897430419921875, -1.168212890625, -0.27255821228027344, 1.789154052734375, -0.206756591796875, -1.8800201416015625, 0.9810943603515625, 0.103271484375, 0.05609130859375, 2.3609619140625, 0.9778060913085938, 0.8447723388671875, 0.536834716796875, -0.0149993896484375, 1.1508941650390625, 0.35338592529296875, 0.021697998046875, 0.740081787109375, 0.47857666015625, 1.2498779296875, 0.307037353515625, 0.9447402954101562, 0.2013092041015625, 1.507293701171875, -0.044620513916015625, 0.578369140625, 0.3095836639404297, 1.1310195922851562, 0.43267822265625, 1.0594825744628906, 0.1674957275390625, -1.0659637451171875, -0.401123046875, 0.9652252197265625, 0.29192352294921875, 0.74072265625, -1.059661865234375, -0.7105865478515625, -0.830108642578125, -0.7972412109375, 1.0029296875, 0.3582763671875, -0.24941253662109375, -1.06121826171875, 0.13891029357910156, 0.225341796875, 0.5362281799316406, -0.208831787109375, 2.18499755859375, -2.4127197265625, 0.4457969665527344, -0.5360107421875, -0.8167037963867188, 1.81298828125, 0.10754013061523438, 0.31432342529296875, 0.77337646484375, 0.2914581298828125, 1.1898422241210938, 1.8543701171875, 0.6836395263671875, 0.84454345703125, 0.5091552734375, 0.3112621307373047, 0.4557952880859375, 
2.4781494140625, -1.1287841796875, 1.26678466796875, 0.5802688598632812, 1.075775146484375, 0.15069580078125, -0.296173095703125, 2.56024169921875, 1.2121353149414062, 0.11191558837890625, 0.53662109375, 0.11890411376953125, 0.0, 1.29339599609375, -0.1783447265625, -0.2512931823730469, 0.478912353515625, 0.4190826416015625, -0.2586517333984375, 0.59417724609375, 0.72235107421875, -1.48284912109375, 0.4681243896484375, 0.12482452392578125, 3.527587890625, 2.09552001953125, -0.410919189453125, -0.07427215576171875, 1.684600830078125, -0.2612762451171875, 0.90582275390625, 2.64703369140625, -2.81787109375, -0.094085693359375, -2.78509521484375, -0.1328582763671875, 1.135986328125, -0.35396575927734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000042.npy"}
|
||||
{"epoch": 0.08795811518324607, "step": 43, "batch_size": 128, "mean": 0.4518667459487915, "std": 1.2124199867248535, "min": -2.841796875, "p10": -0.8837036132812499, "median": 0.3795013427734375, "p90": 1.919036865234375, "max": 4.033599853515625, "pos_frac": 0.65625, "sample": [0.107666015625, 1.196044921875, 1.6385498046875, -1.796539306640625, 0.6659698486328125, 1.0372161865234375, 1.650146484375, 0.6867599487304688, 0.21764373779296875, -0.02375030517578125, 0.2955818176269531, -0.066558837890625, 0.371246337890625, 1.9384765625, -0.7469940185546875, 1.202972412109375, -0.9091033935546875, 0.240997314453125, 1.555938720703125, 0.635528564453125, 0.8311767578125, 0.21350860595703125, 2.8319091796875, 0.8327484130859375, -0.8728179931640625, -1.088592529296875, -1.1731719970703125, 2.8687591552734375, -0.09919357299804688, 1.720001220703125, 1.85870361328125, 0.61895751953125, -1.09332275390625, -2.744781494140625, 2.9315185546875, 0.4603385925292969, -0.6755218505859375, 0.8156890869140625, 0.5815792083740234, -0.2378387451171875, -0.537811279296875, 0.680816650390625, -0.6313591003417969, 0.220428466796875, 0.1851348876953125, 0.237030029296875, 0.7254791259765625, -0.6453781127929688, -2.841796875, -0.4821624755859375, -0.06561279296875, 0.9580230712890625, -0.2157421112060547, 0.8886947631835938, 1.66455078125, 1.4537353515625, 2.49322509765625, -0.5012054443359375, -0.1556396484375, 0.73974609375, 0.6446533203125, -1.9834136962890625, 0.994476318359375, 0.3601837158203125, 2.87213134765625, -2.18841552734375, 2.331756591796875, 0.3760986328125, 1.2532806396484375, 0.9705047607421875, 0.86395263671875, 1.987152099609375, 0.371856689453125, 0.349151611328125, -0.8523406982421875, -0.0353546142578125, -0.3036956787109375, -0.658935546875, 0.10770416259765625, 1.642547607421875, -0.09868621826171875, -0.254638671875, -0.15643310546875, -0.5485992431640625, 0.457244873046875, 0.6967926025390625, -0.1126251220703125, -0.27678680419921875, 0.0247955322265625, 
0.49137115478515625, 1.8152008056640625, 1.1917724609375, -1.7535400390625, 1.3954849243164062, 4.033599853515625, -0.2693939208984375, -0.06182098388671875, 3.3544921875, -1.031005859375, 0.4852294921875, 0.01861572265625, 0.9497528076171875, 2.2908935546875, -0.2636528015136719, 0.625701904296875, 1.91070556640625, 2.4228515625, 0.4874267578125, 0.77532958984375, 3.5997314453125, -0.634674072265625, 0.382904052734375, 0.33826446533203125, -2.5552749633789062, 1.010711669921875, 0.7154388427734375, 0.222564697265625, 0.9855117797851562, 0.8748779296875, 0.218505859375, 0.130950927734375, -1.0550537109375, 0.38794898986816406, 1.68438720703125, 0.89471435546875, -0.14369964599609375, 0.5960235595703125, -0.16585922241210938], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000043.npy"}
|
||||
{"epoch": 0.09005235602094241, "step": 44, "batch_size": 128, "mean": 0.5759499073028564, "std": 1.316592812538147, "min": -3.41424560546875, "p10": -0.906658935546875, "median": 0.4836883544921875, "p90": 2.470066833496093, "max": 3.742767333984375, "pos_frac": 0.6640625, "sample": [0.251983642578125, -0.32110595703125, 2.7830810546875, 0.5937957763671875, -0.7393112182617188, 3.68682861328125, 0.5305442810058594, -0.22085189819335938, -0.120361328125, -0.1896076202392578, 0.5519866943359375, 1.365753173828125, 0.0157623291015625, -0.36046600341796875, 3.742767333984375, 1.053497314453125, -0.920684814453125, 0.14271926879882812, -2.391937255859375, 2.5946197509765625, 1.9557342529296875, 3.08416748046875, 0.45904541015625, 0.41602325439453125, 0.9706840515136719, 0.31755828857421875, 1.03192138671875, 0.23768043518066406, -0.917633056640625, -0.45961761474609375, 1.7911529541015625, -0.661468505859375, -0.353851318359375, -0.9101715087890625, 0.10851669311523438, 0.88140869140625, 0.233489990234375, -0.37830162048339844, 3.376373291015625, 1.08782958984375, 3.084716796875, -0.38877105712890625, -0.336883544921875, 2.9139404296875, 2.274444580078125, 1.970733642578125, 1.825653076171875, 2.292572021484375, 1.2357177734375, -0.7293701171875, 0.03772735595703125, 0.73382568359375, -1.3295211791992188, 0.804656982421875, 0.5684738159179688, 0.8163909912109375, -0.001251220703125, -1.1568603515625, 0.17625045776367188, -0.125244140625, 0.76751708984375, 3.330780029296875, 0.2721672058105469, 1.027099609375, 1.56573486328125, 0.6276931762695312, 0.89947509765625, -0.992156982421875, 0.88446044921875, 3.00482177734375, -2.06402587890625, 0.6235427856445312, 0.107177734375, 1.395233154296875, 1.474090576171875, 0.851226806640625, 1.28472900390625, -1.087127685546875, 0.05686187744140625, -0.906402587890625, 2.94549560546875, 1.27569580078125, 1.2845458984375, 0.4621925354003906, 1.25042724609375, -0.4976959228515625, 0.504302978515625, 0.79656982421875, 3.7037353515625, 
0.16402435302734375, -0.07568359375, 0.682403564453125, 0.46307373046875, 1.8643798828125, -0.103515625, 2.078460693359375, -0.534942626953125, -2.1109619140625, -0.35626220703125, 0.4310455322265625, -0.125732421875, 0.06321144104003906, -0.907257080078125, 0.82342529296875, 0.210845947265625, -0.5496826171875, -0.5972900390625, 2.41668701171875, -0.64508056640625, -1.494659423828125, 2.2860107421875, -0.678680419921875, 0.3987541198730469, 3.106842041015625, 0.9801559448242188, -0.835784912109375, 0.6463165283203125, 0.6126251220703125, -3.41424560546875, -0.4776725769042969, -0.805694580078125, 1.3332138061523438, 0.8353729248046875, 1.729400634765625, 1.443695068359375, -0.3358917236328125, 0.58056640625, 0.785186767578125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000044.npy"}
|
||||
{"epoch": 0.09214659685863874, "step": 45, "batch_size": 128, "mean": 0.4835115671157837, "std": 1.4425796270370483, "min": -5.14617919921875, "p10": -1.098101806640625, "median": 0.26784324645996094, "p90": 2.4763641357421875, "max": 5.21002197265625, "pos_frac": 0.609375, "sample": [0.681671142578125, 0.2427978515625, 2.0148773193359375, -1.2161178588867188, -0.7564125061035156, 0.798614501953125, -0.0826263427734375, -0.16302490234375, 0.8754768371582031, 2.96929931640625, 1.116119384765625, 3.612701416015625, 2.927764892578125, -0.21216583251953125, 1.327423095703125, -0.0761566162109375, -0.31984710693359375, 0.1768798828125, 0.0, -2.95166015625, 1.652099609375, 0.099578857421875, -1.424957275390625, 1.0688667297363281, 0.39788818359375, 2.49591064453125, -0.14532470703125, 0.570068359375, 2.0454559326171875, -0.29486083984375, -0.5504608154296875, 0.2708854675292969, 2.149658203125, 0.4259490966796875, -0.49884796142578125, 0.28973388671875, 1.6539306640625, 0.22912979125976562, 3.687164306640625, 1.32269287109375, 3.151824951171875, -1.07647705078125, 0.9868316650390625, -0.8007965087890625, 2.942626953125, 0.5107917785644531, 1.29620361328125, 0.0366058349609375, 0.94921875, 0.264801025390625, 0.554046630859375, -0.2314910888671875, -1.1773681640625, -0.18609619140625, -0.2558250427246094, 0.273193359375, 0.0522003173828125, 0.699127197265625, -0.248046875, -5.14617919921875, -1.1485595703125, 2.45379638671875, -0.9739227294921875, -0.512420654296875, -1.16558837890625, 0.7387847900390625, -0.0982666015625, 0.1963348388671875, 0.91595458984375, -0.09160232543945312, 0.4554443359375, 1.9068756103515625, 0.468170166015625, 0.15079879760742188, 3.91607666015625, 0.669097900390625, 0.6753387451171875, -0.178192138671875, 2.04144287109375, 1.32049560546875, 0.8060455322265625, 3.8140869140625, 0.4410133361816406, 1.0979156494140625, 0.03827667236328125, -0.1450653076171875, -0.326416015625, 0.32975006103515625, -0.11981964111328125, -2.17608642578125, 
1.0406951904296875, 2.60430908203125, -2.4209136962890625, -1.484039306640625, 3.38946533203125, -0.030303955078125, 0.6932373046875, 1.25775146484375, -0.096435546875, 1.081756591796875, -1.6220703125, -0.3768463134765625, -0.24785614013671875, -0.462615966796875, 1.3854217529296875, -0.56109619140625, 1.055389404296875, -2.00054931640625, -0.06198883056640625, 0.5146560668945312, -0.21362686157226562, 0.7133636474609375, 0.4279022216796875, 2.12677001953125, 0.24690628051757812, 0.015228271484375, 1.021820068359375, -0.1563720703125, 5.21002197265625, 0.25726318359375, -0.28424072265625, 2.467987060546875, 0.5833053588867188, -0.1114501953125, 3.01812744140625, -0.367156982421875, -1.45855712890625, 0.229095458984375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000045.npy"}
|
||||
{"epoch": 0.09424083769633508, "step": 46, "batch_size": 128, "mean": 0.5446134209632874, "std": 1.3284629583358765, "min": -5.628082275390625, "p10": -0.8377960205078124, "median": 0.3214263916015625, "p90": 2.3732788085937497, "max": 3.96270751953125, "pos_frac": 0.671875, "sample": [2.9990234375, 2.82354736328125, 1.383636474609375, -0.2080078125, -0.1181640625, 1.04498291015625, 2.90679931640625, 0.21966552734375, -1.613433837890625, -0.04125213623046875, -1.257354736328125, -0.3407135009765625, 2.97467041015625, -2.2349853515625, 0.02667236328125, -1.3440704345703125, 0.62408447265625, -0.48331451416015625, 0.414031982421875, -0.46453857421875, -0.0860595703125, 0.315948486328125, 0.064208984375, 0.9431915283203125, -1.08709716796875, -0.25927734375, -1.3072509765625, -0.02347564697265625, 2.402587890625, 0.09971237182617188, 0.21002197265625, 0.28899383544921875, 1.04107666015625, 0.675048828125, 2.5780029296875, 3.96270751953125, 0.178741455078125, 0.42562103271484375, -0.8282012939453125, 1.35284423828125, 0.519317626953125, 0.843597412109375, 1.0537185668945312, 0.4282073974609375, 1.2095947265625, -1.6365966796875, 1.630584716796875, 0.056026458740234375, 0.6076812744140625, -0.218414306640625, 2.601226806640625, 2.9397125244140625, 0.728607177734375, 0.37750244140625, -0.547027587890625, 2.1299896240234375, 0.1641845703125, 2.663177490234375, 0.260498046875, 1.25860595703125, 0.3972663879394531, 0.6392745971679688, -0.5259170532226562, -0.8601837158203125, 2.752838134765625, 0.25029754638671875, 0.11812973022460938, -0.535369873046875, -0.7073974609375, 0.7076416015625, -0.286041259765625, -0.1743621826171875, 2.278167724609375, 1.519378662109375, -1.3695297241210938, 0.03515625, -0.19000625610351562, -1.18994140625, 0.38422393798828125, 0.27606201171875, 0.21950531005859375, 1.416473388671875, 1.912567138671875, -1.22869873046875, 1.11163330078125, 0.3517036437988281, -5.628082275390625, -0.2533721923828125, 1.4175567626953125, 0.151885986328125, 
-1.01043701171875, 0.488311767578125, -0.26629638671875, 0.7742080688476562, -0.19289016723632812, 1.241546630859375, 0.07563018798828125, -0.1303558349609375, 0.326904296875, 1.0846099853515625, -0.3084087371826172, -0.752044677734375, 1.2408599853515625, -0.63134765625, 1.95123291015625, 1.2208404541015625, 1.7274169921875, 1.9911346435546875, 1.91021728515625, 1.310638427734375, 2.21435546875, -0.246856689453125, 0.4917182922363281, 0.7347240447998047, 0.07427215576171875, 3.5540771484375, -0.56988525390625, 0.15027618408203125, 2.2347412109375, 3.1580810546875, 0.0, 1.561370849609375, 0.22021484375, -0.50933837890625, 0.0296173095703125, 1.73968505859375, 2.14129638671875, 2.3607177734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000046.npy"}
|
||||
{"epoch": 0.09633507853403141, "step": 47, "batch_size": 128, "mean": 0.4037870764732361, "std": 1.4760769605636597, "min": -4.871368408203125, "p10": -1.212115478515625, "median": 0.3729515075683594, "p90": 2.1771514892578123, "max": 4.059112548828125, "pos_frac": 0.6328125, "sample": [0.5426025390625, 0.26741790771484375, 1.2176666259765625, -0.7140045166015625, 2.725921630859375, -1.5927734375, -3.8359375, 1.2676849365234375, 1.65240478515625, 0.232452392578125, 0.15045166015625, 0.09583282470703125, 0.6519851684570312, -0.22524642944335938, -0.59228515625, -0.687255859375, 0.236785888671875, -0.1816253662109375, 0.809967041015625, -0.53204345703125, 0.971099853515625, -1.090362548828125, 0.5243682861328125, -1.09539794921875, 1.260650634765625, -0.8310546875, -0.05702972412109375, 0.1558685302734375, -4.871368408203125, -0.80010986328125, 1.8074493408203125, 0.0, 2.269683837890625, -0.6773529052734375, 2.15203857421875, 1.6887054443359375, 3.27862548828125, 0.74591064453125, 0.5386962890625, 0.79266357421875, -1.166229248046875, 0.5853652954101562, 2.1470947265625, 3.27740478515625, 1.6167449951171875, 1.446685791015625, -0.147216796875, 1.25836181640625, 0.79168701171875, 0.3451080322265625, 0.25403594970703125, -0.23397064208984375, 3.244781494140625, 0.5714569091796875, -2.315673828125, 1.370849609375, -2.879669189453125, 3.897796630859375, 1.280487060546875, 0.8399505615234375, -2.38153076171875, -0.063201904296875, -0.10742950439453125, 2.05078125, 0.6021804809570312, 0.18083953857421875, 0.17022705078125, 0.480072021484375, 0.4185943603515625, 1.4052581787109375, -0.1648712158203125, 0.3292713165283203, -0.6061859130859375, 0.6714324951171875, 1.3495941162109375, 0.925567626953125, 0.0, 1.2812042236328125, 3.890228271484375, 3.466766357421875, 0.37375640869140625, -0.41751861572265625, -1.26214599609375, -1.7562255859375, 1.02490234375, -0.415435791015625, -0.21773529052734375, -0.7606201171875, 0.48656463623046875, 0.8948898315429688, 1.1766510009765625, 
-1.18408203125, -0.6582775115966797, 1.8800048828125, 0.3112030029296875, -0.01904296875, 2.509796142578125, 1.1314697265625, 2.7508544921875, 0.48015594482421875, -0.3472900390625, 1.1735458374023438, 0.272186279296875, 0.35076904296875, -0.256134033203125, -1.63238525390625, 0.0404052734375, -1.464599609375, 0.0172576904296875, -0.2754402160644531, 0.591156005859375, 0.6709823608398438, 2.235748291015625, 0.40574073791503906, 0.566253662109375, 4.059112548828125, -0.8974685668945312, 0.6521930694580078, -0.1226043701171875, -1.5234375, -1.190673828125, 3.7975921630859375, 0.994659423828125, 0.38934326171875, 2.13677978515625, -2.72662353515625, -1.27056884765625, 0.3721466064453125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000047.npy"}
|
||||
{"epoch": 0.09842931937172775, "step": 48, "batch_size": 128, "mean": 0.6286492347717285, "std": 1.549891710281372, "min": -3.674835205078125, "p10": -1.0222274780273437, "median": 0.43422889709472656, "p90": 2.318688964843749, "max": 7.45672607421875, "pos_frac": 0.6484375, "sample": [0.508026123046875, 0.6817474365234375, 0.9887313842773438, 1.4564056396484375, 2.016845703125, 1.219085693359375, 1.10107421875, 3.86376953125, 1.31488037109375, -0.27419281005859375, -0.0601348876953125, -0.6346435546875, -0.1091766357421875, -2.2403106689453125, 0.12023544311523438, 5.4246826171875, -1.63116455078125, -1.4752655029296875, -1.0321502685546875, 0.369110107421875, 3.539276123046875, 0.70556640625, -0.813934326171875, -0.01025390625, 2.01953125, 0.37049102783203125, -1.579925537109375, 1.3955917358398438, -2.65032958984375, 0.18182373046875, 0.4659748077392578, 2.252899169921875, 1.4544677734375, -1.3701171875, 2.96661376953125, 0.5330886840820312, -0.188690185546875, 1.68438720703125, 1.498870849609375, 1.4797210693359375, -0.012054443359375, 3.013336181640625, -0.0326690673828125, -0.266143798828125, 0.686279296875, 0.6438674926757812, -1.251617431640625, -1.017974853515625, -1.5460205078125, 0.9453353881835938, 4.25445556640625, 0.5371360778808594, -0.7696075439453125, -0.0058746337890625, 0.036773681640625, 0.916259765625, 0.29480743408203125, 0.1447296142578125, 0.4197731018066406, 1.993072509765625, 0.383636474609375, 1.62286376953125, 0.489227294921875, 1.7611160278320312, 0.8944091796875, -2.225494384765625, 0.5615234375, -0.563079833984375, 1.1174240112304688, 0.4796142578125, -0.09716796875, 0.6258544921875, 0.0, 0.2953948974609375, 0.458740234375, 0.19429969787597656, 3.08404541015625, 0.6179275512695312, -0.0732879638671875, 0.373931884765625, 0.2022247314453125, 3.2876434326171875, 0.1669769287109375, 3.9986572265625, -0.848907470703125, 2.472198486328125, 1.92120361328125, 1.2538833618164062, -0.25613975524902344, 0.08087158203125, -1.10443115234375, 
0.0082244873046875, -0.7948455810546875, -0.553955078125, 2.20928955078125, 3.7679443359375, 0.984710693359375, -0.5189208984375, 0.4486846923828125, 1.2372207641601562, -0.6191329956054688, 1.638885498046875, 0.0, 1.776397705078125, 2.083831787109375, 0.471771240234375, 7.45672607421875, -0.08233642578125, -3.674835205078125, 1.0860519409179688, -0.119110107421875, 0.24707794189453125, 0.63580322265625, 1.99554443359375, -0.030071258544921875, 0.111419677734375, -0.4938507080078125, -0.6775627136230469, 0.472747802734375, 0.6721839904785156, -0.1331787109375, 1.90850830078125, -1.1924896240234375, -0.12603759765625, 0.5352630615234375, -0.058563232421875, 3.78363037109375, 0.308441162109375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000048.npy"}
|
||||
{"epoch": 0.10052356020942409, "step": 49, "batch_size": 128, "mean": 0.6291530132293701, "std": 1.5780830383300781, "min": -5.0806884765625, "p10": -1.1773162841796874, "median": 0.5218000411987305, "p90": 2.852032470703125, "max": 5.798858642578125, "pos_frac": 0.671875, "sample": [0.4004974365234375, 0.4302978515625, 0.9240760803222656, 0.995513916015625, -0.04595947265625, 1.0387420654296875, -0.725738525390625, 5.308990478515625, 1.587005615234375, 0.5056362152099609, 0.2934417724609375, -0.0438995361328125, -0.7525634765625, -0.2308807373046875, 0.3941192626953125, 3.00042724609375, 2.5977935791015625, 0.8643589019775391, -0.0627593994140625, 1.027374267578125, -1.623504638671875, -0.006717681884765625, 3.22210693359375, 1.1087646484375, 0.447784423828125, -1.115875244140625, -1.404571533203125, 5.798858642578125, 0.6240310668945312, 2.8797607421875, -0.013336181640625, 1.55950927734375, 3.5648956298828125, 0.8929367065429688, 2.2667083740234375, 0.6801528930664062, 0.9218902587890625, -0.369384765625, 0.04785919189453125, -1.135833740234375, 0.0714569091796875, 0.8927001953125, -1.46343994140625, 1.2761459350585938, 1.4417648315429688, 1.3319549560546875, 1.24676513671875, 2.27197265625, 2.635833740234375, -1.27410888671875, 0.5379638671875, -1.6403274536132812, -0.07183837890625, 1.8050537109375, -0.005802154541015625, 3.23480224609375, 0.0755615234375, 0.261810302734375, -1.3967208862304688, 0.812957763671875, 0.2162303924560547, 2.093231201171875, -0.9898681640625, 2.84014892578125, 0.552490234375, 0.81341552734375, -0.24517822265625, -1.331085205078125, 1.8773193359375, 0.83447265625, 1.569580078125, 1.8399505615234375, -0.12987518310546875, -0.428680419921875, -0.20928573608398438, 1.71417236328125, -4.1295166015625, -0.59320068359375, 1.7321624755859375, -0.23712158203125, 0.3730812072753906, 3.0504150390625, 0.16131591796875, -0.00792694091796875, 0.1143341064453125, 1.4013671875, -2.4012451171875, 2.0159912109375, 3.20654296875, 2.196319580078125, 
1.072357177734375, -0.05086517333984375, 0.9874954223632812, 0.675628662109375, 0.7065505981445312, 0.329376220703125, -0.380340576171875, 0.03399658203125, 0.5612754821777344, -0.0013427734375, 0.6466064453125, -0.440338134765625, -2.64227294921875, 0.9609832763671875, 3.535552978515625, 0.34478759765625, -0.329681396484375, 3.032958984375, 1.038421630859375, 0.2108612060546875, 4.3514862060546875, 0.9195404052734375, 0.4429359436035156, -0.03244781494140625, 0.8367462158203125, 0.35848236083984375, 0.4340705871582031, 3.1839599609375, 0.9676132202148438, 0.6342620849609375, -5.0806884765625, -0.04032325744628906, -1.395263671875, 1.302825927734375, 0.6381378173828125, -1.071319580078125, -2.0677490234375, 0.06073760986328125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000049.npy"}
|
||||
{"epoch": 0.10261780104712041, "step": 50, "batch_size": 128, "mean": 0.4272787570953369, "std": 1.7566577196121216, "min": -3.856475830078125, "p10": -1.6797210693359375, "median": 0.358551025390625, "p90": 2.300030517578124, "max": 7.33935546875, "pos_frac": 0.640625, "sample": [0.411376953125, -0.300018310546875, 0.6602783203125, 0.46504783630371094, -0.756744384765625, -1.0145645141601562, 0.76702880859375, -0.6084747314453125, -0.51007080078125, 6.757568359375, 2.53411865234375, -0.2488422393798828, 1.5771484375, 0.831878662109375, 3.95880126953125, 1.6233291625976562, 2.18011474609375, -3.856475830078125, 2.1957855224609375, 0.2451171875, 0.304107666015625, -1.41278076171875, -2.4328155517578125, 1.769989013671875, 1.77130126953125, 2.706787109375, 1.5219802856445312, 0.43735504150390625, 1.8204345703125, -0.928741455078125, -1.049591064453125, 3.35772705078125, 0.8722991943359375, -3.025299072265625, 0.6496658325195312, -0.4400634765625, -0.4113006591796875, -1.452972412109375, -1.349761962890625, 0.61798095703125, -0.19334793090820312, -1.764312744140625, 1.16619873046875, 1.542724609375, -0.62066650390625, -0.6517333984375, -1.18695068359375, 1.130615234375, -0.49764442443847656, 0.28237152099609375, 4.02508544921875, -0.7328033447265625, 0.33872222900390625, 0.1066436767578125, 0.8536224365234375, 2.660186767578125, 2.105224609375, -1.73883056640625, 1.979278564453125, 1.0350341796875, 0.8015899658203125, 0.2790985107421875, -2.59698486328125, 0.261749267578125, -1.80908203125, -2.54876708984375, 0.8402099609375, 0.9316635131835938, 1.75909423828125, -0.49155616760253906, 1.07830810546875, 0.03825187683105469, -0.013214111328125, 0.453125, 0.333282470703125, -2.07550048828125, 2.01019287109375, 0.77752685546875, 2.845001220703125, -0.34287261962890625, -0.6344528198242188, 0.409942626953125, -3.3739013671875, -1.337677001953125, 0.4921398162841797, 2.19970703125, -0.931304931640625, -3.706787109375, 0.13909912109375, -1.939422607421875, 0.665985107421875, 
-2.179931640625, 2.82916259765625, 0.0990753173828125, 0.3826446533203125, 0.3455810546875, -0.222625732421875, 0.3822784423828125, 1.9359130859375, 1.0478057861328125, 0.37152099609375, 4.20245361328125, -0.7486572265625, 2.5788421630859375, 0.718994140625, -0.22571563720703125, 0.7939338684082031, 0.822235107421875, -0.9368362426757812, -1.654388427734375, -1.32513427734375, 2.0237274169921875, 7.33935546875, 0.3157958984375, 0.199188232421875, 0.029327392578125, 3.53436279296875, 0.595428466796875, 0.875030517578125, 0.06822013854980469, 1.733154296875, 0.21457481384277344, 1.443115234375, 0.23760986328125, 1.24444580078125, -0.990264892578125, -0.1439208984375, 2.162811279296875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000050.npy"}
|
||||
{"epoch": 0.10471204188481675, "step": 51, "batch_size": 128, "mean": 0.6346527338027954, "std": 2.029217481613159, "min": -6.3603515625, "p10": -1.6010833740234374, "median": 0.5665550231933594, "p90": 3.0470947265625, "max": 5.5372314453125, "pos_frac": 0.65625, "sample": [1.33416748046875, 0.3378143310546875, 1.7426013946533203, 0.86102294921875, 1.0754680633544922, -1.1823272705078125, 2.0494842529296875, 5.137298583984375, 0.082916259765625, -0.13824462890625, 1.72821044921875, 2.227813720703125, 1.9373016357421875, 3.6292724609375, -1.103555679321289, -1.677154541015625, 2.2947998046875, -1.5684814453125, 0.4721221923828125, 2.933868408203125, -6.3603515625, -0.0496978759765625, 1.311187744140625, -0.2613983154296875, 2.658203125, 5.5372314453125, 0.5721359252929688, -0.01776123046875, 4.56640625, -0.5938949584960938, -1.791656494140625, 3.03680419921875, 3.98931884765625, 0.3802642822265625, 1.3260498046875, 2.4886474609375, 2.92218017578125, 1.7383575439453125, -1.736572265625, 0.151611328125, 1.1288604736328125, 0.193328857421875, -1.715484619140625, 0.86590576171875, -0.2353343963623047, 3.07110595703125, 0.7772674560546875, 0.100860595703125, -1.468994140625, -4.4814453125, 1.33544921875, 0.0735626220703125, -4.769439697265625, 0.81207275390625, -0.58941650390625, 2.574493408203125, 0.0347442626953125, 0.789794921875, -0.80157470703125, 2.3342437744140625, -0.341064453125, 0.34979248046875, -0.5484771728515625, 1.1570281982421875, -3.48419189453125, -1.17669677734375, 0.3000469207763672, -0.3128814697265625, 0.807708740234375, 2.89727783203125, 0.4011993408203125, 1.688201904296875, 0.6190032958984375, 4.04180908203125, -3.001068115234375, 0.0, 2.756683349609375, -0.81011962890625, 2.181243896484375, 3.42388916015625, -0.225494384765625, -0.38770294189453125, 0.4008331298828125, 0.91424560546875, -2.437286376953125, -0.720855712890625, -1.0254974365234375, -0.2078857421875, -1.273162841796875, 3.50128173828125, 1.0242385864257812, 0.0198822021484375, 
0.3634796142578125, -1.0049819946289062, 0.02559661865234375, 1.742218017578125, -0.4696044921875, 1.7900238037109375, 2.98797607421875, -1.0430755615234375, -2.672821044921875, -1.4237823486328125, 0.63262939453125, 4.980255126953125, 4.9891357421875, 3.111328125, 1.8203125, 2.2318115234375, 0.610870361328125, 0.676483154296875, 2.04400634765625, 0.22739601135253906, -0.151123046875, 0.9476661682128906, 2.24468994140625, 5.176605224609375, -2.5490875244140625, -3.861572265625, 1.56195068359375, 1.347259521484375, 0.56097412109375, 0.485107421875, 0.5923538208007812, -0.3212738037109375, 0.61083984375, 1.5023956298828125, -1.3174896240234375, 0.1835479736328125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000051.npy"}
|
||||
{"epoch": 0.1068062827225131, "step": 52, "batch_size": 128, "mean": 0.9743093848228455, "std": 2.0759057998657227, "min": -3.539337158203125, "p10": -0.9596839904785156, "median": 0.567413330078125, "p90": 3.9154144287109376, "max": 8.685760498046875, "pos_frac": 0.6328125, "sample": [2.130645751953125, 2.90716552734375, 0.93011474609375, -0.2928466796875, -0.25238800048828125, -0.66485595703125, 2.143890380859375, 0.7056503295898438, 0.04778099060058594, 6.27899169921875, 0.64715576171875, 0.35626220703125, -0.012298583984375, -3.5179443359375, 2.896484375, 3.87261962890625, -0.5096435546875, 1.8280487060546875, -0.595611572265625, -0.970306396484375, 1.153839111328125, 1.5308074951171875, 1.9117431640625, 0.7308731079101562, 0.6464385986328125, -0.6071243286132812, -1.001312255859375, 2.472900390625, 1.09344482421875, -2.926483154296875, -0.66717529296875, -0.9551315307617188, -0.05863761901855469, 0.75274658203125, 0.1436767578125, 1.7386322021484375, -0.526885986328125, -2.826019287109375, 0.21417617797851562, -0.2938232421875, 4.3232421875, -0.015209197998046875, -1.678680419921875, 1.047393798828125, 2.020050048828125, -0.07022857666015625, 0.367584228515625, 5.4332275390625, 0.81005859375, 5.31439208984375, 1.9084701538085938, 1.7621002197265625, 0.984405517578125, 0.4971771240234375, -0.22928237915039062, 2.33782958984375, -0.032573699951171875, 0.563201904296875, -0.181610107421875, -0.07000732421875, 1.138458251953125, -0.5117340087890625, 3.1850433349609375, -0.42987060546875, 3.9051513671875, 0.501708984375, -2.31793212890625, 0.5916252136230469, 2.3169174194335938, -0.17217159271240234, 0.314208984375, 0.82305908203125, 3.01202392578125, 2.3665618896484375, 3.08905029296875, -2.554718017578125, 1.0267143249511719, -0.21002197265625, 1.642974853515625, 2.39923095703125, 2.018524169921875, 5.78509521484375, 0.3037872314453125, 0.636688232421875, 3.4129638671875, -3.142364501953125, -0.4178886413574219, 4.37921142578125, 7.646820068359375, 
2.74224853515625, 2.7852096557617188, -1.1233444213867188, -0.5104827880859375, 3.95159912109375, -0.241943359375, 0.3052978515625, -0.0709228515625, 4.851226806640625, 0.1879119873046875, -0.5928115844726562, -0.58837890625, 1.44207763671875, 0.2881927490234375, -1.14727783203125, 3.939361572265625, 1.9404983520507812, -0.7650146484375, 1.2165679931640625, 8.685760498046875, 0.9637451171875, 1.26177978515625, 4.044586181640625, -0.2008819580078125, -0.6181106567382812, 2.4345855712890625, 0.05048942565917969, -0.0710296630859375, 0.603057861328125, 0.42964935302734375, 0.571624755859375, 0.29302978515625, 1.6980743408203125, -3.539337158203125, -1.1103057861328125, 0.453826904296875, -0.22670745849609375, 4.2144775390625, -0.12298583984375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000052.npy"}
|
||||
{"epoch": 0.10890052356020942, "step": 53, "batch_size": 128, "mean": 0.7802649140357971, "std": 1.9837459325790405, "min": -5.777496337890625, "p10": -1.29619140625, "median": 0.40831756591796875, "p90": 3.2877380371093747, "max": 6.48138427734375, "pos_frac": 0.65625, "sample": [-0.518798828125, -2.0204315185546875, 2.1297149658203125, -1.956787109375, 1.1660308837890625, -0.2330322265625, 5.151397705078125, -1.4782562255859375, -1.586395263671875, -0.06121063232421875, 2.23785400390625, 1.610809326171875, -0.7481842041015625, -3.830291748046875, -0.4398040771484375, 3.37579345703125, 1.85906982421875, -0.3352203369140625, 2.157440185546875, 0.36077880859375, -0.48333740234375, -0.7792205810546875, 0.529296875, 0.95208740234375, 2.490234375, 6.356109619140625, 2.825531005859375, 4.35614013671875, 1.62066650390625, 4.30706787109375, 1.07550048828125, 2.60870361328125, -0.10010910034179688, 0.291412353515625, 1.1793060302734375, -0.5641860961914062, 1.18609619140625, -5.777496337890625, -0.4554634094238281, 1.3584442138671875, 1.214141845703125, 0.0113983154296875, 0.2801170349121094, -2.451904296875, -0.8334846496582031, -2.60406494140625, -0.72564697265625, -0.223114013671875, 3.534271240234375, 0.9696044921875, 0.288848876953125, 4.900482177734375, 1.44500732421875, -1.34771728515625, 0.3292884826660156, -0.5252227783203125, 2.17724609375, -0.09589767456054688, -0.23598480224609375, 4.132598876953125, 3.541015625, 0.43349456787109375, 0.313568115234375, 1.818206787109375, 1.474456787109375, -0.5639419555664062, 2.592041015625, -0.8773117065429688, 0.16829299926757812, 0.8645858764648438, 0.40036773681640625, 0.25604248046875, 4.52362060546875, 1.954681396484375, 3.1673583984375, 2.861328125, -0.182861328125, 0.306396484375, 2.732421875, -0.0517578125, 2.377471923828125, 0.48792266845703125, 0.398223876953125, 0.4852752685546875, -0.29163360595703125, 3.219970703125, -3.00177001953125, 2.16729736328125, 0.13599395751953125, -1.27410888671875, 1.462127685546875, 
0.0264434814453125, 3.83905029296875, -1.17352294921875, 1.28729248046875, 6.48138427734375, 0.254730224609375, -2.09527587890625, 0.202667236328125, 0.1219482421875, -0.4313030242919922, 1.65966796875, -1.1425437927246094, -1.75616455078125, -0.06307220458984375, -0.6372756958007812, 0.41033935546875, 2.1673736572265625, 0.4062957763671875, -4.958221435546875, 0.46343231201171875, -0.0878143310546875, 3.171661376953125, 2.92706298828125, 0.441162109375, -0.22540283203125, 3.25, 0.0, 1.36297607421875, 0.24444580078125, 1.345855712890625, 0.173614501953125, 1.4528961181640625, 1.74053955078125, 2.397552490234375, 0.5938491821289062, 0.485321044921875, 3.5809326171875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000053.npy"}
|
||||
{"epoch": 0.11099476439790576, "step": 54, "batch_size": 128, "mean": 0.8152041435241699, "std": 2.1815574169158936, "min": -5.723388671875, "p10": -1.4140411376953124, "median": 0.5340766906738281, "p90": 3.4813262939453122, "max": 7.255126953125, "pos_frac": 0.671875, "sample": [2.38299560546875, 4.996978759765625, 3.1844024658203125, 2.191650390625, 0.35101318359375, 1.2485809326171875, -0.61669921875, 5.62298583984375, -0.025434494018554688, -1.019927978515625, 1.0769195556640625, -1.617034912109375, 0.45235443115234375, -1.38885498046875, 5.51409912109375, -0.4246711730957031, 6.21392822265625, 0.8093719482421875, -0.220367431640625, 0.7262420654296875, 0.770782470703125, 3.51287841796875, 0.673309326171875, 0.527008056640625, -0.7874298095703125, 0.8302688598632812, 0.3319091796875, 0.27801513671875, 0.619384765625, 0.2922859191894531, 0.7249908447265625, 0.920928955078125, -1.884918212890625, 0.5510101318359375, 0.76373291015625, 3.298492431640625, 6.28033447265625, -4.57586669921875, 2.3995361328125, 0.5682411193847656, 0.17163848876953125, 0.5208683013916016, 6.357666015625, -0.49237060546875, 4.592987060546875, 0.4386444091796875, -3.67974853515625, -0.2433929443359375, -0.9012603759765625, -1.69281005859375, 2.4638824462890625, 5.615020751953125, 0.5411453247070312, 0.6144561767578125, 0.13168716430664062, 0.5225715637207031, -0.059661865234375, 1.3069992065429688, 1.02630615234375, 0.17455673217773438, 2.535646438598633, 1.1890411376953125, -0.97674560546875, -0.507781982421875, 0.771392822265625, 1.3652267456054688, -0.850128173828125, 0.6755752563476562, 2.1996612548828125, 2.05841064453125, -0.1351776123046875, 1.6568450927734375, 0.207763671875, 2.75738525390625, -0.056793212890625, 0.7318878173828125, 2.771270751953125, -1.472808837890625, 0.6576995849609375, -0.26267242431640625, -0.1940155029296875, -0.6495132446289062, 0.432373046875, 0.30193328857421875, 0.66510009765625, 1.91192626953125, -0.4849853515625, -0.8671875, 1.4678230285644531, 
-0.1958160400390625, 0.37835693359375, 1.4774169921875, -2.342803955078125, 2.2686767578125, 0.8604583740234375, -1.0508499145507812, -1.270538330078125, 7.255126953125, 7.1658935546875, 2.38336181640625, 2.838531494140625, 3.420867919921875, 3.467803955078125, 0.22711181640625, 3.5384063720703125, -0.20491790771484375, -2.67120361328125, 0.7287445068359375, -5.723388671875, 2.16131591796875, -0.775115966796875, -2.6613922119140625, 0.341552734375, -0.0836181640625, 4.19964599609375, 0.0240478515625, -1.5515518188476562, 0.4029541015625, 0.79339599609375, -0.612762451171875, 0.4434814453125, 0.0809326171875, 1.0659503936767578, -1.58575439453125, -3.836090087890625, 1.2916259765625, 1.4022903442382812, -0.16778564453125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000054.npy"}
|
||||
{"epoch": 0.1130890052356021, "step": 55, "batch_size": 128, "mean": 0.5893000364303589, "std": 1.8107975721359253, "min": -5.7236328125, "p10": -1.5038291931152343, "median": 0.4136161804199219, "p90": 3.0142456054687496, "max": 7.574951171875, "pos_frac": 0.640625, "sample": [-0.751983642578125, 1.0184173583984375, 0.803985595703125, -0.01446533203125, -0.2642364501953125, -2.32379150390625, 0.5281505584716797, 1.265411376953125, 2.152130126953125, -1.8087005615234375, 1.5590362548828125, 2.4882736206054688, 3.6485595703125, 0.160308837890625, 0.2924346923828125, 0.4220142364501953, 1.5883560180664062, 0.01025390625, 0.01445770263671875, 1.2249603271484375, 0.465240478515625, 0.615447998046875, 3.379241943359375, 1.9465789794921875, 1.408721923828125, -0.5145797729492188, 1.0862503051757812, 2.646240234375, -0.411590576171875, -1.1376495361328125, -0.432342529296875, -0.10695648193359375, -1.5846710205078125, 1.57586669921875, 0.457672119140625, 1.2540512084960938, -0.67803955078125, 0.475372314453125, -0.08489990234375, 3.899688720703125, -2.7110595703125, -0.08469390869140625, 0.01599884033203125, 1.186737060546875, 1.1103515625, 0.416473388671875, -0.492889404296875, 0.2150115966796875, 0.3157958984375, 4.52691650390625, 1.3968963623046875, -1.0688629150390625, -2.0521240234375, 0.0206298828125, 0.270751953125, -2.0784149169921875, -2.088836669921875, 3.43890380859375, -0.761932373046875, 1.6302490234375, -1.8866729736328125, 0.43182373046875, 5.339874267578125, -0.29877471923828125, 1.0015869140625, -0.443634033203125, -1.4970474243164062, 2.1307144165039062, 1.1609878540039062, -1.8106040954589844, 0.492218017578125, 0.43355560302734375, -1.2350311279296875, 0.32781982421875, -1.4471435546875, 0.029905319213867188, -0.28216552734375, 2.3679656982421875, -0.2971611022949219, 1.4808769226074219, 0.41075897216796875, 2.07928466796875, -0.9110107421875, 0.3400230407714844, 0.4044189453125, -0.5011367797851562, -0.6215438842773438, 3.3158721923828125, 
4.1810302734375, 0.4514007568359375, -0.7837905883789062, 1.7226104736328125, 0.8958740234375, 0.9300537109375, 7.574951171875, -0.41701698303222656, 4.099822998046875, -1.0698089599609375, 0.312835693359375, 2.845306396484375, -0.487762451171875, 1.8686752319335938, 0.7817115783691406, 1.042724609375, 0.3076629638671875, -2.451385498046875, 2.9571533203125, 0.9764556884765625, 1.8605575561523438, -1.1534347534179688, -5.7236328125, 0.18514251708984375, 1.8878326416015625, -0.01807403564453125, -1.1768951416015625, -0.0347137451171875, 1.881439208984375, -1.9101409912109375, 1.3039169311523438, -1.5196533203125, 0.9873123168945312, 0.196746826171875, -1.288177490234375, 3.820648193359375, 1.8856201171875, 4.5014190673828125, 3.1474609375, 0.8636474609375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000055.npy"}
|
||||
{"epoch": 0.11518324607329843, "step": 56, "batch_size": 128, "mean": 0.9343713521957397, "std": 2.389112710952759, "min": -6.060943603515625, "p10": -1.4446762084960931, "median": 0.593109130859375, "p90": 3.7808227539062496, "max": 7.4324951171875, "pos_frac": 0.6484375, "sample": [0.275970458984375, -2.94989013671875, 2.7288818359375, 4.8887939453125, 0.6591339111328125, 1.9247970581054688, 0.13519287109375, -1.176605224609375, 0.4901580810546875, 5.888427734375, 1.6660003662109375, 2.67730712890625, -0.8235549926757812, -0.8221435546875, 1.3449325561523438, -0.7169189453125, -0.3143310546875, -0.3092803955078125, 3.940185546875, -2.202880859375, 1.899169921875, -0.04766845703125, 3.2603759765625, 0.590179443359375, -0.4199371337890625, 6.0906982421875, 1.194122314453125, 4.619140625, 2.5400238037109375, 0.6071014404296875, 2.5457763671875, 3.5594940185546875, 0.407867431640625, -0.4946441650390625, 6.485809326171875, 2.410858154296875, -3.2247161865234375, -0.11937713623046875, 1.5932464599609375, -0.22289276123046875, 6.13983154296875, 0.9660186767578125, 0.4054908752441406, -0.11218643188476562, 0.08841323852539062, 7.4324951171875, 1.96612548828125, 0.11167716979980469, -0.81158447265625, -0.4587860107421875, -0.8265380859375, -1.0523757934570312, 1.2225570678710938, -5.03948974609375, -0.1520843505859375, -5.29534912109375, 0.37465476989746094, -0.6881542205810547, 1.26068115234375, -0.07332611083984375, 2.2582626342773438, 5.01641845703125, 2.890960693359375, 3.4246826171875, -2.1927490234375, -1.8219451904296875, 2.20391845703125, 2.0157470703125, 4.520965576171875, 0.7161788940429688, -1.239166259765625, 0.04923820495605469, -4.12249755859375, 0.41748619079589844, 0.09308242797851562, -0.4364776611328125, -2.21148681640625, 2.74871826171875, 0.96514892578125, 1.0595550537109375, -6.060943603515625, 0.86761474609375, 2.5901947021484375, 0.619873046875, 0.596038818359375, 6.829833984375, -0.13812255859375, 3.100921630859375, -1.857513427734375, 
1.9080162048339844, 0.3733978271484375, 7.00579833984375, 0.721405029296875, 1.8734664916992188, -2.7548828125, 3.7125244140625, 1.3246612548828125, -0.1819000244140625, 0.19087600708007812, 0.3865814208984375, 1.9058837890625, -0.0680999755859375, 1.698455810546875, -0.5149078369140625, -3.3555908203125, 4.43707275390625, 0.31941986083984375, 2.371612548828125, -0.3406524658203125, 2.19195556640625, 2.2684478759765625, 0.5722198486328125, 2.267822265625, -1.282989501953125, 0.37391090393066406, 0.6166839599609375, -0.71038818359375, -0.0368194580078125, 1.8045654296875, 0.149139404296875, 3.496551513671875, -0.9296112060546875, 1.854156494140625, -0.077178955078125, 1.25689697265625, 3.5091705322265625, 2.4266510009765625, -0.10559844970703125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000056.npy"}
|
||||
{"epoch": 0.11727748691099477, "step": 57, "batch_size": 128, "mean": 1.350066900253296, "std": 2.286102294921875, "min": -4.64447021484375, "p10": -0.93467960357666, "median": 0.9130115509033203, "p90": 3.791851806640625, "max": 13.352783203125, "pos_frac": 0.734375, "sample": [-1.3301925659179688, 0.803192138671875, -2.5974273681640625, 1.401702880859375, -0.343353271484375, -1.47222900390625, 5.47662353515625, 0.3486328125, -0.18841552734375, 1.499267578125, -0.737762451171875, 0.14483642578125, 3.450439453125, 0.6077041625976562, 2.993194580078125, -0.052036285400390625, 0.094696044921875, 3.380828857421875, 1.7261962890625, 0.78302001953125, 0.8914012908935547, 3.4217071533203125, 0.30760955810546875, -0.49448394775390625, 0.7034912109375, 1.024871826171875, -1.586212158203125, -0.41986083984375, -0.29327392578125, -1.063812255859375, 2.1060791015625, 2.701995849609375, 1.3468170166015625, 0.83038330078125, 7.16143798828125, 13.352783203125, 0.7518157958984375, -0.435089111328125, -1.3580360412597656, 2.329132080078125, 3.56085205078125, 0.547210693359375, -0.377716064453125, 4.5844268798828125, 1.0890121459960938, 1.2644424438476562, 0.07532501220703125, -3.258941650390625, -0.10601806640625, 1.6776123046875, 0.9658203125, 0.4605293273925781, 0.46337890625, 5.296112060546875, 1.0745086669921875, 1.91229248046875, 1.3076210021972656, 0.7811279296875, 3.100494384765625, -0.35003662109375, 2.254058837890625, -0.6268310546875, 0.6594104766845703, 1.57537841796875, 1.12628173828125, 3.8255615234375, 4.95361328125, 1.878204345703125, -1.437774658203125, 1.05889892578125, 3.74615478515625, 0.00555419921875, -1.3261642456054688, -0.8974514007568359, 3.94573974609375, 2.886444091796875, 0.185699462890625, 5.107940673828125, 1.128448486328125, 1.61431884765625, 0.76953125, 0.03836822509765625, 7.350494384765625, 6.759552001953125, 0.23846435546875, 0.052734375, -1.265411376953125, 2.58209228515625, -2.16033935546875, -1.02154541015625, 2.25994873046875, 
3.77740478515625, -0.20038986206054688, 0.444793701171875, 1.600341796875, 0.6236839294433594, 4.80145263671875, 0.756439208984375, 1.85760498046875, -0.11431884765625, -4.64447021484375, -0.1261749267578125, 5.83380126953125, 2.6608123779296875, 0.18808746337890625, 3.689971923828125, 0.9082069396972656, 2.900726318359375, -0.25823974609375, 0.6558303833007812, 1.02374267578125, 2.49371337890625, 3.0096435546875, -0.5560226440429688, 1.9253196716308594, 2.280517578125, 3.16253662109375, 0.12761688232421875, 1.3013191223144531, -0.20921897888183594, 3.0333251953125, -0.3013916015625, 2.48211669921875, 0.917816162109375, 3.4079971313476562, -0.0188446044921875, 2.162567138671875, 2.6071319580078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000057.npy"}
|
||||
{"epoch": 0.1193717277486911, "step": 58, "batch_size": 128, "mean": 0.952062726020813, "std": 2.106245756149292, "min": -10.512725830078125, "p10": -1.2377288818359373, "median": 0.8861618041992188, "p90": 3.472825622558594, "max": 6.489959716796875, "pos_frac": 0.6875, "sample": [0.909454345703125, 4.674896240234375, 1.44287109375, 3.09454345703125, 0.905670166015625, 0.014873504638671875, 2.1425209045410156, -1.49127197265625, -0.181915283203125, 2.066741943359375, 1.6241912841796875, 2.8050537109375, 1.4743499755859375, 4.875579833984375, 1.328338623046875, -1.0726470947265625, 3.9815292358398438, 4.53375244140625, 2.2879791259765625, -1.31854248046875, 6.489959716796875, 4.3263397216796875, -0.708740234375, -0.6755828857421875, 3.210845947265625, 1.1748809814453125, 1.683258056640625, 1.1463623046875, 0.6466064453125, -3.0267333984375, -0.00433349609375, 1.361236572265625, 0.454864501953125, 0.24908447265625, 2.49322509765625, 0.8666534423828125, 1.170196533203125, 1.6364059448242188, -0.1678028106689453, 0.95208740234375, -2.6915283203125, 0.7247314453125, 0.3339805603027344, 1.11541748046875, 2.0794677734375, -1.3317108154296875, 0.5174427032470703, 4.339141845703125, -0.02306365966796875, -10.512725830078125, 0.6627655029296875, 4.740234375, -0.959442138671875, -2.0289840698242188, 0.94158935546875, 0.738983154296875, -0.21960067749023438, 0.7334213256835938, -1.203094482421875, 3.1009521484375, 1.4470977783203125, -0.05011749267578125, 3.472900390625, -0.449493408203125, 1.553680419921875, 0.763275146484375, -0.200927734375, 2.5676803588867188, 0.302734375, 1.53692626953125, -0.430206298828125, -2.05877685546875, -1.013519287109375, 2.7672271728515625, 1.3830375671386719, -0.555511474609375, 5.87396240234375, 0.6410560607910156, -1.168304443359375, 0.3402099609375, 1.6764297485351562, -0.89208984375, 5.873779296875, -2.05462646484375, 1.4214935302734375, 0.45342254638671875, 0.33794403076171875, 0.4087066650390625, 0.9741744995117188, 3.2281494140625, 
-0.10848236083984375, 3.4727935791015625, 3.1199951171875, 1.158355712890625, 1.69482421875, -0.361541748046875, 2.8320465087890625, -1.594573974609375, 3.49609375, 1.2665863037109375, 0.627410888671875, -0.1403331756591797, -0.47617340087890625, -0.996551513671875, 1.5648193359375, 0.8179931640625, 0.3170433044433594, 1.372406005859375, 3.69317626953125, 1.928985595703125, -2.34564208984375, 1.97540283203125, -0.17885589599609375, 2.81695556640625, 1.4700698852539062, 0.34352874755859375, -0.802093505859375, 0.415771484375, 3.1139450073242188, 2.3320770263671875, 2.7422637939453125, 0.7939891815185547, -1.73504638671875, 1.147369384765625, -0.1864776611328125, -2.258453369140625, 2.3095703125, -0.36029052734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000058.npy"}
|
||||
{"epoch": 0.12146596858638743, "step": 59, "batch_size": 128, "mean": 0.8896090388298035, "std": 2.594282865524292, "min": -6.565277099609375, "p10": -1.7189270019531249, "median": 0.6373953819274902, "p90": 4.772714233398437, "max": 8.7745361328125, "pos_frac": 0.609375, "sample": [-0.7318267822265625, 1.864898681640625, 0.03961181640625, -6.565277099609375, 1.175048828125, 2.64501953125, -0.609832763671875, 2.142578125, 0.748626708984375, -0.722442626953125, 0.5672454833984375, 0.031097412109375, 1.3535919189453125, 3.1062774658203125, -2.8200225830078125, 0.5407562255859375, -0.1277008056640625, -0.27217864990234375, -2.0433197021484375, 1.5291748046875, -0.6916351318359375, 1.0019035339355469, -0.022022247314453125, -0.3426055908203125, 1.3012237548828125, 1.3956375122070312, 1.8479156494140625, 5.0233154296875, -3.61871337890625, 0.85552978515625, -0.542724609375, 0.7918281555175781, 0.660919189453125, 1.69244384765625, 0.0, 5.8370361328125, -0.1620655059814453, 1.3154373168945312, -0.1879749298095703, -4.0714111328125, 0.6865806579589844, 3.54840087890625, 6.00390625, 8.7745361328125, 0.0478515625, -5.605682373046875, 2.5514984130859375, 7.168853759765625, 2.6368408203125, 6.2349853515625, 2.3048934936523438, 0.460784912109375, 4.709228515625, 4.732696533203125, 0.6890869140625, 1.1480789184570312, -0.03582000732421875, 1.80474853515625, -4.157562255859375, -1.870513916015625, 0.04449462890625, 2.2636566162109375, 1.1910400390625, -1.653961181640625, 0.598541259765625, 4.8660888671875, 5.662109375, -0.1632080078125, 3.1437530517578125, 2.6626663208007812, -0.48854827880859375, 1.54681396484375, 0.6510009765625, 6.285980224609375, -1.4822807312011719, -5.00750732421875, 0.6237897872924805, -3.87152099609375, -0.824188232421875, 0.4144859313964844, 2.7080078125, 0.374359130859375, 2.4500579833984375, 2.39068603515625, 2.4063796997070312, -2.1660919189453125, 1.8089599609375, -0.2181396484375, 0.256378173828125, -0.620361328125, 2.691314697265625, 
2.56317138671875, -0.254913330078125, -0.4551372528076172, -1.1506881713867188, 4.9302520751953125, 0.66876220703125, -0.9100341796875, -0.255401611328125, 1.33428955078125, -0.61309814453125, -0.32977294921875, 0.577850341796875, 1.4999237060546875, -0.36978912353515625, -0.1513519287109375, -1.1874618530273438, -3.487335205078125, -1.2705841064453125, 1.184814453125, 2.454193115234375, 2.689666748046875, 6.15521240234375, -0.6951751708984375, -0.3928718566894531, -0.31859588623046875, 7.935943603515625, 1.168426513671875, 1.184906005859375, -0.7961273193359375, -2.5625, 0.8608474731445312, 5.733734130859375, 2.146728515625, 0.15074729919433594, -0.48990631103515625, 2.255279541015625, -0.24756622314453125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000059.npy"}
|
||||
{"epoch": 0.12356020942408377, "step": 60, "batch_size": 128, "mean": 1.2687376737594604, "std": 2.289315938949585, "min": -5.484130859375, "p10": -1.3916105270385741, "median": 1.0084075927734375, "p90": 3.9569442749023436, "max": 9.692474365234375, "pos_frac": 0.7265625, "sample": [-4.11676025390625, 2.222991943359375, 4.823814392089844, 0.11260986328125, 3.714202880859375, 2.2397918701171875, -0.1866168975830078, 2.5832290649414062, 2.7860107421875, 1.8341064453125, 0.3258056640625, -0.7735328674316406, -0.4134788513183594, 1.02813720703125, 0.4077911376953125, 1.151123046875, 0.8892059326171875, -0.774566650390625, 2.3412322998046875, 3.01507568359375, 0.6747589111328125, -1.8150634765625, 2.57672119140625, -1.7609710693359375, -1.1606693267822266, 0.14392852783203125, 0.988677978515625, 0.0, -0.6943836212158203, 4.55072021484375, -1.173919677734375, 5.9921875, 3.447174072265625, 1.8344573974609375, 0.1136016845703125, -0.8558502197265625, 3.7948150634765625, -2.7755126953125, 2.9368896484375, 0.2025737762451172, -0.42899322509765625, 2.3328170776367188, 1.706756591796875, 3.8953018188476562, 0.8759078979492188, -0.770111083984375, -1.3870105743408203, -0.274688720703125, 1.0692138671875, 0.28583526611328125, -0.043064117431640625, 0.49102783203125, 1.58648681640625, 5.514556884765625, 0.21150588989257812, -0.245361328125, 2.8963775634765625, -1.40234375, 0.931549072265625, 3.5577621459960938, 0.5395889282226562, 1.70062255859375, 2.5520477294921875, 0.4376411437988281, -5.484130859375, 2.3640594482421875, 0.2154541015625, 1.2230377197265625, 0.8097076416015625, -1.6229705810546875, -2.028106689453125, 3.9529571533203125, 3.5525894165039062, 5.6920166015625, 3.072509765625, 0.934326171875, -0.263702392578125, 2.5288543701171875, 1.7179794311523438, 6.0563812255859375, 3.927337646484375, 0.164581298828125, -0.8503265380859375, 1.705780029296875, 5.3007354736328125, 0.5313873291015625, 1.1334991455078125, -3.944000244140625, 1.1829967498779297, 
0.9504547119140625, 1.75390625, 0.6104812622070312, -2.0208740234375, 0.123199462890625, 1.02850341796875, 3.5316162109375, 0.9293594360351562, 5.046875, -0.17572021484375, 0.216522216796875, 1.63262939453125, 2.90814208984375, -1.1944122314453125, -1.489990234375, 5.2610015869140625, 1.7334747314453125, -0.386566162109375, 2.7280120849609375, 9.692474365234375, 0.0, 6.619384765625, -1.55950927734375, 1.948760986328125, 0.0855712890625, -0.2714042663574219, 2.34906005859375, 2.68133544921875, 0.2167510986328125, 2.2382965087890625, -2.83758544921875, 1.74676513671875, 3.96624755859375, 1.1550178527832031, 2.78802490234375, 2.1390380859375, 0.9398956298828125, 2.7315521240234375, 4.4454498291015625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000060.npy"}
|
||||
{"epoch": 0.1256544502617801, "step": 61, "batch_size": 128, "mean": 1.0729765892028809, "std": 2.5711913108825684, "min": -6.6485595703125, "p10": -1.4951400756835935, "median": 0.7925567626953125, "p90": 4.160191345214844, "max": 10.946258544921875, "pos_frac": 0.6796875, "sample": [-0.88153076171875, -1.938995361328125, 0.6505336761474609, -0.18707275390625, 4.12017822265625, 5.284454345703125, -0.9853515625, 0.730438232421875, -0.2218189239501953, 0.610595703125, 1.506561279296875, -4.804534912109375, -1.082763671875, 0.4289703369140625, 0.3261871337890625, 0.7664241790771484, -1.734222412109375, -1.1960296630859375, -3.332305908203125, 0.18158721923828125, 1.020355224609375, 0.810394287109375, 7.180572509765625, 3.18365478515625, 3.91094970703125, 0.70050048828125, 5.110687255859375, 0.4434814453125, 2.65948486328125, 2.185791015625, 0.5387516021728516, 0.5066680908203125, 0.657745361328125, -0.6220321655273438, 3.446533203125, -0.415985107421875, -0.23410797119140625, 4.1298370361328125, 0.856964111328125, 0.8413162231445312, -0.47528839111328125, 1.73577880859375, 2.7261810302734375, -0.6609954833984375, 0.334716796875, 0.335968017578125, 3.082183837890625, 2.601776123046875, 2.589599609375, 2.0514373779296875, 0.4705810546875, 2.4208526611328125, 6.92041015625, 4.89337158203125, 4.1432952880859375, 2.11016845703125, 2.2591552734375, 1.3928718566894531, 6.4908294677734375, -0.303955078125, 0.83477783203125, 3.1677474975585938, 1.6593017578125, -0.7651443481445312, 0.5225830078125, 3.6678466796875, 2.545074462890625, 5.64935302734375, 2.4127197265625, 1.09783935546875, 2.2798233032226562, 0.9361114501953125, 1.879425048828125, -1.674530029296875, -0.7707710266113281, -0.217041015625, -0.000484466552734375, 0.77471923828125, 2.3602371215820312, -0.03033447265625, 6.544281005859375, 0.4717864990234375, 2.858154296875, 1.688751220703125, 0.886566162109375, 2.8516693115234375, 0.2975616455078125, -2.2320785522460938, -0.4104652404785156, 0.99542236328125, 
-0.2313709259033203, -3.43560791015625, -2.594482421875, 1.0836181640625, 0.030248641967773438, 1.414794921875, 0.7404537200927734, 4.83831787109375, -1.008819580078125, 6.46514892578125, -0.1925048828125, 10.946258544921875, 1.739288330078125, 0.0, -0.097412109375, 1.2828445434570312, 2.851165771484375, 0.736419677734375, 1.3717422485351562, -6.6485595703125, 4.199615478515625, -3.043487548828125, -0.8047027587890625, -1.1191253662109375, -5.093597412109375, 2.025390625, -6.091064453125, -0.469970703125, 0.88995361328125, -0.19171142578125, 1.06011962890625, 0.567901611328125, -1.4182586669921875, 1.4878082275390625, 2.0718765258789062, 4.479583740234375, 1.0637969970703125, -2.11737060546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000061.npy"}
|
||||
{"epoch": 0.12774869109947645, "step": 62, "batch_size": 128, "mean": 1.235146164894104, "std": 2.7426819801330566, "min": -5.87060546875, "p10": -1.9067001342773435, "median": 0.9776382446289062, "p90": 4.787193298339844, "max": 9.32659912109375, "pos_frac": 0.65625, "sample": [4.29425048828125, 0.6998367309570312, 9.32659912109375, 2.5620269775390625, 4.017730712890625, 0.7920989990234375, 3.89453125, 2.56781005859375, -3.310546875, 1.2962188720703125, -1.7039337158203125, -1.6206436157226562, -0.840240478515625, 3.92230224609375, 2.537872314453125, 0.5981063842773438, 2.248626708984375, 1.9256439208984375, -0.45433807373046875, -1.2332611083984375, 2.3216934204101562, 0.32888031005859375, 1.8922195434570312, 1.5548515319824219, -3.27825927734375, 7.52459716796875, -5.4855804443359375, -0.08349609375, 1.1342849731445312, 1.592041015625, -1.67376708984375, 0.70416259765625, 2.4927520751953125, 1.7912139892578125, -0.10727691650390625, 0.029119491577148438, 0.633392333984375, -0.3700714111328125, 1.866790771484375, -0.621490478515625, 1.335845947265625, 4.634002685546875, 3.6660232543945312, 0.16292572021484375, -0.1638641357421875, 1.42059326171875, 0.6916465759277344, 3.608642578125, 2.46087646484375, 8.16290283203125, -5.87060546875, 3.96417236328125, -0.82696533203125, 4.187408447265625, 3.5357666015625, -2.0374908447265625, 6.611083984375, 0.0, 5.388763427734375, -0.19545745849609375, -2.809112548828125, -1.1447982788085938, 4.7405242919921875, -0.38974761962890625, 1.0813827514648438, 0.05713653564453125, 0.06926727294921875, -1.75250244140625, 0.6888580322265625, 3.2491455078125, 3.7813720703125, 0.9127349853515625, 0.5766048431396484, -2.5622825622558594, 2.5230712890625, 5.7389678955078125, 1.8407440185546875, -3.3595733642578125, 0.8980178833007812, 1.8945178985595703, -0.4904632568359375, -0.800628662109375, 2.538970947265625, 2.500823974609375, -2.09381103515625, 4.216064453125, 8.57342529296875, 1.02783203125, -0.73236083984375, 3.03350830078125, 
0.953155517578125, -1.240020751953125, 7.6727294921875, 3.222625732421875, 5.4088134765625, 2.4987335205078125, -2.1576995849609375, 4.9232177734375, -1.495819091796875, -2.180572509765625, 4.896087646484375, -1.85064697265625, 1.0654525756835938, 2.145111083984375, 6.0671234130859375, 2.501068115234375, 1.79901123046875, 0.4232940673828125, 0.9406547546386719, 5.212860107421875, 1.421600341796875, -1.6777801513671875, -0.5885009765625, -2.0511474609375, 0.14324951171875, -0.8406219482421875, -2.60955810546875, -0.8719940185546875, 2.4540557861328125, 1.0150279998779297, 1.033355712890625, -0.377044677734375, 3.4463043212890625, 1.0021209716796875, -0.3825225830078125, 0.912139892578125, -1.303466796875, -1.742401123046875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000062.npy"}
|
||||
{"epoch": 0.12984293193717278, "step": 63, "batch_size": 128, "mean": 1.311095118522644, "std": 2.942580461502075, "min": -7.51214599609375, "p10": -2.071228790283203, "median": 1.3633222579956055, "p90": 4.733538818359375, "max": 10.2698974609375, "pos_frac": 0.6484375, "sample": [3.56201171875, 4.49713134765625, 0.5227546691894531, -5.3670654296875, -3.525146484375, -0.6296863555908203, 2.7721099853515625, -0.2119903564453125, -0.5533370971679688, 1.97943115234375, -5.370758056640625, -2.4119338989257812, -2.9633331298828125, -2.0891036987304688, 3.53289794921875, -0.6077041625976562, 1.61505126953125, 3.12890625, 7.26837158203125, 7.328277587890625, -2.063568115234375, -0.790771484375, 9.84747314453125, -1.6951904296875, 1.72625732421875, 2.719024658203125, 2.172698974609375, -2.8177642822265625, 0.148345947265625, 4.30560302734375, 0.05316925048828125, 1.963531494140625, -0.4077911376953125, -0.6574287414550781, 1.3484020233154297, 10.2698974609375, -2.435546875, 3.0933380126953125, -2.0435943603515625, 3.87127685546875, -1.876068115234375, 0.9392147064208984, 4.25335693359375, -3.797210693359375, 1.1176910400390625, 3.6039886474609375, 7.677978515625, -7.51214599609375, -1.5494384765625, 4.85107421875, 0.49344825744628906, 3.350860595703125, 2.729583740234375, 0.0, -0.45149803161621094, 2.05194091796875, 2.081298828125, -1.266632080078125, 0.16786766052246094, 1.4143486022949219, 0.9284477233886719, -0.0440673828125, 0.46343994140625, 4.920562744140625, -0.8795166015625, 1.3782424926757812, 2.34710693359375, 3.2025146484375, -0.9171600341796875, 3.524932861328125, 6.260009765625, 1.4519195556640625, 3.2982177734375, -0.7139892578125, 4.698486328125, -3.174072265625, 3.5188980102539062, 1.5425682067871094, 1.7730255126953125, -1.5883331298828125, -0.230712890625, 8.084716796875, 2.104705810546875, -1.2294464111328125, 2.7188034057617188, 0.25038909912109375, 2.5317001342773438, 0.17053985595703125, 1.9357986450195312, 1.988677978515625, 2.2740325927734375, 
-0.0224609375, -2.163238525390625, 4.215728759765625, 0.8477802276611328, 3.43927001953125, -0.27153778076171875, 3.474212646484375, 3.7610015869140625, 4.9796142578125, 4.12310791015625, 2.4199371337890625, 3.616119384765625, -0.5904388427734375, 0.9406814575195312, 0.8297195434570312, -0.9261474609375, -0.208953857421875, 4.75750732421875, 0.4851837158203125, 1.5234375, -1.362884521484375, 2.0662078857421875, 1.241058349609375, 1.5619583129882812, 6.0872802734375, 0.71453857421875, 1.5313568115234375, -0.2752685546875, 0.6631240844726562, -4.6785888671875, -0.3470916748046875, 2.69305419921875, 2.032257080078125, -1.2216796875, -0.02320098876953125, 5.22991943359375, 4.7232666015625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000063.npy"}
|
||||
{"epoch": 0.1319371727748691, "step": 64, "batch_size": 128, "mean": 1.6794004440307617, "std": 3.310208320617676, "min": -6.3260498046875, "p10": -1.8156589508056638, "median": 1.3893203735351562, "p90": 6.064959716796875, "max": 10.9488525390625, "pos_frac": 0.734375, "sample": [3.903167724609375, -0.90838623046875, -3.598541259765625, 7.951690673828125, -2.240570068359375, 4.6234130859375, 2.75823974609375, 2.395904541015625, 10.53515625, 1.7208709716796875, 2.64306640625, 3.92791748046875, 6.10430908203125, 3.588623046875, 1.3820343017578125, -4.1431732177734375, 9.9930419921875, 1.899139404296875, 0.455780029296875, -3.32843017578125, 2.0399627685546875, 1.3966064453125, -0.42523193359375, -2.402435302734375, 0.585113525390625, -6.112091064453125, 2.1552696228027344, 1.0465469360351562, 1.72259521484375, 4.962799072265625, 0.601226806640625, 2.21661376953125, 2.702606201171875, 0.8641433715820312, 2.20989990234375, 0.45627593994140625, 1.753021240234375, 2.846466064453125, 1.778717041015625, 0.2302703857421875, 3.8794097900390625, -0.20302581787109375, -1.0865020751953125, 2.1537246704101562, -1.0807647705078125, 9.0389404296875, 7.8890380859375, -0.7836494445800781, 6.96307373046875, 0.0933380126953125, 4.7213134765625, -1.983184814453125, 0.8052349090576172, 0.760711669921875, 0.49954986572265625, -1.398712158203125, -2.3943023681640625, 2.2823867797851562, 4.5538177490234375, 1.9060211181640625, 2.751953125, 3.0707130432128906, -0.548004150390625, 5.4305267333984375, 2.274017333984375, 0.114837646484375, -0.4040985107421875, -6.3260498046875, -1.27978515625, 2.81103515625, 0.7266845703125, 1.44573974609375, 10.9488525390625, 1.3719024658203125, 0.6400299072265625, 2.5203857421875, 1.597747802734375, -0.455596923828125, -0.10791397094726562, 1.434051513671875, -1.7438621520996094, 6.292327880859375, -0.3614616394042969, 0.026683807373046875, 4.76068115234375, 5.6004638671875, 1.6832733154296875, 1.1445999145507812, 0.7947120666503906, -1.49053955078125, 
1.860260009765625, 10.4197998046875, 5.21624755859375, 5.024658203125, 0.4260978698730469, 0.043487548828125, -6.175537109375, 1.1194000244140625, -1.5052490234375, 0.6812896728515625, 3.98797607421875, 6.516632080078125, 3.06982421875, 0.0, 0.19372940063476562, 0.36269569396972656, 0.189544677734375, 0.9278564453125, 3.7329483032226562, -2.200225830078125, 5.2120361328125, 1.3455123901367188, -0.43788909912109375, 7.115570068359375, 1.0693359375, -0.8079490661621094, 0.1699981689453125, 7.586822509765625, -5.3016357421875, 1.6956939697265625, 0.0, 2.947998046875, 6.048095703125, 2.2023696899414062, 4.6444091796875, 2.2362518310546875, -5.6666259765625, -0.620147705078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000064.npy"}
|
||||
{"epoch": 0.13403141361256546, "step": 65, "batch_size": 128, "mean": 1.144404649734497, "std": 2.658898115158081, "min": -7.4793701171875, "p10": -2.033770751953125, "median": 0.9893798828125, "p90": 4.617251586914062, "max": 6.6790924072265625, "pos_frac": 0.6953125, "sample": [-0.23725128173828125, -2.0556640625, 3.54864501953125, -0.10502433776855469, 6.6790924072265625, 1.653076171875, 0.9272918701171875, 1.876617431640625, 2.8273773193359375, -4.580963134765625, -2.278350830078125, -5.561676025390625, 0.450531005859375, 2.73712158203125, -0.9764328002929688, 3.6014404296875, 3.84112548828125, 0.090576171875, 1.3202552795410156, 0.154327392578125, -3.866851806640625, 0.3327827453613281, 2.0637359619140625, 4.875335693359375, 1.8782939910888672, 2.4674606323242188, 4.22796630859375, 1.1371536254882812, -3.73101806640625, 0.0122222900390625, 2.462371826171875, -1.616973876953125, -0.1969146728515625, 2.25506591796875, 6.52276611328125, 0.0431671142578125, 0.07325935363769531, 4.326385498046875, -0.5311813354492188, 0.11948776245117188, 1.651397705078125, -2.248748779296875, 0.99798583984375, 1.5696945190429688, 5.29290771484375, -1.0884552001953125, 3.9760589599609375, 3.11688232421875, 5.425445556640625, -0.24691009521484375, 2.4281768798828125, 1.9810791015625, -2.0372314453125, 2.667877197265625, 2.7866363525390625, -1.273406982421875, 5.30157470703125, 0.7681884765625, 0.458221435546875, 3.094329833984375, -0.73944091796875, 0.256134033203125, 0.98077392578125, 0.17502593994140625, 0.917236328125, 0.3597412109375, 2.061065673828125, 0.284820556640625, -0.934844970703125, -1.324554443359375, 1.8055038452148438, 4.51318359375, -2.223846435546875, 0.773040771484375, 2.14306640625, 0.2707672119140625, -1.22991943359375, 0.04165840148925781, -0.337066650390625, -1.0765838623046875, -5.350982666015625, -0.77117919921875, 0.0, 0.036224365234375, 4.0670928955078125, 2.0779800415039062, -1.4018402099609375, 1.581329345703125, 6.088714599609375, 3.7412261962890625, 
0.485504150390625, 1.8579864501953125, 5.72357177734375, 2.8388824462890625, 4.860076904296875, -0.99383544921875, 3.9606475830078125, 3.3232879638671875, -0.5956134796142578, 5.509857177734375, 1.076324462890625, 3.451629638671875, -7.4793701171875, 1.6708984375, 1.7898483276367188, 5.60333251953125, 1.1781005859375, 6.28192138671875, 2.839202880859375, 6.61419677734375, 1.102569580078125, -0.3745574951171875, 2.4703826904296875, 1.996124267578125, -4.46484375, -1.3041305541992188, 4.0530853271484375, -2.64288330078125, 0.2244415283203125, 1.576141357421875, -2.03228759765625, -0.854156494140625, 4.063873291015625, 0.34539794921875, 4.1002197265625, -0.17009544372558594, 0.5149688720703125, -0.2895660400390625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000065.npy"}
|
||||
{"epoch": 0.13612565445026178, "step": 66, "batch_size": 128, "mean": 1.3752074241638184, "std": 3.778156042098999, "min": -11.09375, "p10": -2.205971527099609, "median": 1.1742134094238281, "p90": 6.293780517578124, "max": 11.76678466796875, "pos_frac": 0.6796875, "sample": [0.22319793701171875, 1.278909683227539, 6.24029541015625, -1.84417724609375, -1.1612014770507812, 6.87127685546875, -0.46832275390625, 1.1144332885742188, 1.4583816528320312, 1.2413711547851562, -0.3432960510253906, -0.11828231811523438, 2.4147491455078125, 0.9746170043945312, 2.1421432495117188, 0.036346435546875, -0.015676498413085938, 1.2144775390625, 7.69683837890625, 2.4636764526367188, 1.212890625, -6.48846435546875, 4.1515655517578125, 4.537200927734375, 1.270660400390625, 1.0956268310546875, -0.19940185546875, 0.909698486328125, -0.4166679382324219, 6.4185791015625, 2.585845947265625, 0.07590484619140625, 11.116592407226562, -4.34893798828125, 0.8324394226074219, -0.3309211730957031, -1.06964111328125, 11.552398681640625, 2.156219482421875, -1.6790618896484375, -5.0417633056640625, 2.7178001403808594, 6.8414764404296875, 2.9656829833984375, 0.9839553833007812, 3.0215911865234375, -0.67169189453125, 8.94097900390625, 7.419158935546875, -1.566131591796875, -8.53778076171875, 0.49593353271484375, 1.2879180908203125, 11.76678466796875, 11.19085693359375, 1.374786376953125, 0.75018310546875, 3.6317291259765625, -11.09375, 1.1378173828125, -1.39349365234375, 0.022491455078125, 9.17333984375, 1.7070159912109375, 3.86102294921875, -8.035324096679688, 0.44384765625, -0.85272216796875, 2.4652252197265625, -1.3997879028320312, 0.06195068359375, 1.327545166015625, 4.974609375, -3.191741943359375, 2.63641357421875, 1.1495132446289062, 2.86737060546875, -0.7289981842041016, 3.392059326171875, -4.61993408203125, 2.7354736328125, 1.5777320861816406, -1.4519500732421875, 2.71221923828125, -1.2364578247070312, 1.4540023803710938, -0.0778045654296875, 0.7548828125, 4.4730224609375, 3.0948333740234375, 
-0.08225250244140625, 4.97320556640625, 1.861572265625, -8.42138671875, 0.229766845703125, -2.940826416015625, -3.4425506591796875, -2.428924560546875, 3.99761962890625, 1.133941650390625, 8.423492431640625, 2.696624755859375, 2.455322265625, 0.0, 2.7734642028808594, -1.4197998046875, -0.9737396240234375, 5.4919586181640625, 1.656494140625, 5.746612548828125, 0.570404052734375, 1.19891357421875, -0.30144500732421875, -3.83758544921875, 0.555694580078125, 2.639739990234375, -0.496246337890625, 4.386688232421875, 3.048553466796875, 3.4882659912109375, -1.110626220703125, 0.0330810546875, 1.8246307373046875, -2.1104202270507812, 3.17724609375, 1.099884033203125, 6.574493408203125, 3.240509033203125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000066.npy"}
|
||||
{"epoch": 0.1382198952879581, "step": 67, "batch_size": 128, "mean": 1.1830191612243652, "std": 3.3633925914764404, "min": -8.8818359375, "p10": -2.7384933471679687, "median": 1.068450927734375, "p90": 5.526597595214843, "max": 10.03314208984375, "pos_frac": 0.7109375, "sample": [-0.6406707763671875, 4.153900146484375, -0.676971435546875, 8.72686767578125, 0.38884544372558594, -5.31494140625, 0.6321182250976562, 5.673126220703125, 5.5970458984375, -1.0054550170898438, 2.0618438720703125, 10.03314208984375, 1.029266357421875, 2.05047607421875, 2.351470947265625, 4.67230224609375, 2.7979278564453125, 5.139495849609375, 0.17144775390625, -1.70794677734375, 1.3686370849609375, -1.824127197265625, 4.6500701904296875, -0.1345977783203125, 1.1289825439453125, 5.511505126953125, -2.786376953125, -2.485565185546875, 2.0175552368164062, -0.7035675048828125, -4.1170654296875, 0.0477294921875, -1.162872314453125, 1.8232269287109375, 1.277313232421875, -3.61004638671875, 0.0, 1.2657623291015625, 2.99420166015625, 1.6730175018310547, -2.6120147705078125, 3.6326904296875, 2.9286041259765625, 5.5618133544921875, 1.5402679443359375, -1.783193588256836, 3.88525390625, -1.0433349609375, -3.2052001953125, -0.12308311462402344, -4.93023681640625, 5.4487457275390625, 0.2163848876953125, 7.842559814453125, -0.191009521484375, 2.0939483642578125, 7.225494384765625, -4.944580078125, 0.3160858154296875, 0.9500141143798828, 1.00146484375, 2.8258590698242188, 1.7451171875, -2.7179718017578125, 2.13031005859375, 0.19654464721679688, 0.27982330322265625, -7.38885498046875, 0.70513916015625, 0.4177207946777344, 5.49566650390625, 1.081634521484375, 0.3503131866455078, 5.90606689453125, -6.38751220703125, 8.257720947265625, 7.718719482421875, 3.4114990234375, 0.0817108154296875, 1.8006362915039062, 1.919921875, -1.4680900573730469, 2.675048828125, 0.122100830078125, -3.047210693359375, 0.13962364196777344, 5.451362609863281, 0.4829254150390625, 0.9356231689453125, 4.0337371826171875, 
6.215972900390625, -8.8818359375, 6.752349853515625, 2.329986572265625, -7.416229248046875, 4.015838623046875, 1.250335693359375, 2.231781005859375, 3.0780105590820312, 3.540283203125, 3.081146240234375, 1.0221786499023438, -0.3456268310546875, 0.7663650512695312, 1.055267333984375, 0.837921142578125, -1.2014236450195312, -7.498199462890625, -1.2627716064453125, 1.5615119934082031, 2.843536376953125, 3.3726806640625, 0.7688751220703125, -2.015727996826172, -0.472381591796875, 5.8988037109375, 2.51214599609375, 0.5099678039550781, 3.7972412109375, 1.494110107421875, 3.676116943359375, 0.8934669494628906, 2.025390625, 1.1091766357421875, 0.5941085815429688, -0.6716156005859375, 1.4737548828125, -1.5489959716796875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000067.npy"}
|
||||
{"epoch": 0.14031413612565444, "step": 68, "batch_size": 128, "mean": 1.7497785091400146, "std": 4.063353538513184, "min": -8.81890869140625, "p10": -3.1130401611328122, "median": 1.3493156433105469, "p90": 6.686071777343749, "max": 16.8450927734375, "pos_frac": 0.6953125, "sample": [7.584709167480469, -0.12851333618164062, 1.0653076171875, -0.53662109375, -0.7547874450683594, -0.0134735107421875, 0.572235107421875, 4.723236083984375, -0.6475658416748047, 1.6762619018554688, 1.5248565673828125, -0.18663406372070312, 0.167449951171875, 5.300506591796875, 5.3365020751953125, -4.555145263671875, -0.6287384033203125, 0.5295944213867188, 2.304931640625, 2.620807647705078, 6.4150238037109375, 0.5521926879882812, -1.980712890625, 2.0030364990234375, 16.8450927734375, 1.196767807006836, 1.3291015625, 4.22100830078125, -0.15220069885253906, -5.63421630859375, 4.726898193359375, 3.359375, 10.78466796875, 7.464019775390625, 0.43801307678222656, 6.517303466796875, 0.5977630615234375, 1.4907379150390625, -3.463287353515625, -5.7564697265625, -2.323150634765625, -1.0797119140625, 3.5115966796875, 4.1759033203125, 0.8928451538085938, -4.888671875, 0.38983154296875, 1.8473663330078125, 6.26348876953125, 5.477691650390625, 2.9087066650390625, -2.8947982788085938, 4.59527587890625, 4.21331787109375, 11.425262451171875, -0.9160003662109375, 2.83502197265625, 8.652984619140625, -0.8530368804931641, 0.367218017578125, 4.480064392089844, 2.31109619140625, -0.7661209106445312, 1.9741058349609375, 0.35486412048339844, -0.12535476684570312, -7.185791015625, 4.3489227294921875, 1.092681884765625, 11.16064453125, 0.3003654479980469, -3.069580078125, 1.5381011962890625, 0.46443939208984375, -8.81890869140625, 2.5502395629882812, 1.030853271484375, -4.09527587890625, 8.3887939453125, -0.15192794799804688, -3.214447021484375, 2.2478256225585938, -0.356964111328125, 4.0013427734375, 1.5714111328125, 1.16937255859375, -1.58892822265625, 6.021514892578125, -7.8665771484375, 9.771133422851562, 
4.862548828125, 1.2336273193359375, 1.3695297241210938, 0.37261962890625, 1.750579833984375, 5.88629150390625, -5.30963134765625, -3.57177734375, 3.613616943359375, 2.1024017333984375, 4.810150146484375, 0.5939197540283203, 2.2110939025878906, 7.079864501953125, 7.3603515625, 1.39117431640625, 4.857818603515625, 7.352630615234375, 1.81640625, 3.91424560546875, 3.999847412109375, -3.2326202392578125, 1.6929473876953125, 1.6244182586669922, -0.46627044677734375, 1.5668182373046875, 4.47900390625, 0.37142181396484375, -2.5076141357421875, 1.0221023559570312, -0.5496826171875, -1.3760833740234375, 0.632904052734375, 12.41693115234375, 4.21014404296875, -2.3680572509765625, -1.0673141479492188, 0.7772254943847656], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000068.npy"}
|
||||
{"epoch": 0.1424083769633508, "step": 69, "batch_size": 128, "mean": 2.1886770725250244, "std": 3.9372689723968506, "min": -9.77362060546875, "p10": -1.8215276718139641, "median": 1.6730947494506836, "p90": 7.16746826171875, "max": 12.93865966796875, "pos_frac": 0.7265625, "sample": [5.0616607666015625, -1.234222412109375, 1.9997634887695312, 5.72198486328125, 2.59979248046875, -9.15380859375, 3.995361328125, 1.9943733215332031, -2.31341552734375, -0.1807537078857422, 4.852561950683594, 1.4947967529296875, 11.004119873046875, 0.6221733093261719, 6.77398681640625, -0.719696044921875, 5.8189544677734375, 0.18255615234375, 3.73114013671875, 1.457183837890625, -0.159637451171875, -0.24202346801757812, -2.9320144653320312, -5.107185363769531, 6.682098388671875, 12.1910400390625, 6.5145263671875, 5.500152587890625, 0.6387596130371094, 1.810028076171875, -9.77362060546875, 5.11578369140625, 0.24303436279296875, 0.995147705078125, 2.1387939453125, -0.7077713012695312, 0.9786758422851562, 0.7927703857421875, 7.213653564453125, 2.00115966796875, 10.38421630859375, -5.81353759765625, -0.03054046630859375, 2.7199249267578125, 1.0879287719726562, 1.6533660888671875, -2.9533233642578125, 7.917938232421875, 3.8047409057617188, 0.26738739013671875, 6.1871337890625, 1.490570068359375, 8.720733642578125, 5.40167236328125, 5.408203125, -5.456298828125, -5.048583984375, 0.2771453857421875, 2.556396484375, -0.6179752349853516, 1.2155914306640625, 5.10125732421875, 1.0900535583496094, 4.131156921386719, -0.14569091796875, 3.4726104736328125, 2.0394287109375, 5.25054931640625, 5.3677978515625, 5.1759033203125, 9.71661376953125, 2.700164794921875, 2.213348388671875, -0.1661529541015625, 0.86920166015625, -0.2168426513671875, 6.1669921875, 0.6971206665039062, 3.1254730224609375, 1.551300048828125, -1.6564884185791016, -0.5146484375, 1.6928234100341797, 1.5190925598144531, -1.230133056640625, 0.1180572509765625, 3.4268798828125, 1.5606842041015625, 3.84674072265625, 7.709808349609375, 
8.73101806640625, 7.7819366455078125, 1.2628250122070312, 5.0254058837890625, -2.2066192626953125, -2.8367919921875, 0.21263885498046875, 1.3674087524414062, 8.936309814453125, -1.0457916259765625, 3.26239013671875, 2.0048828125, 12.93865966796875, 1.921875, 5.3681488037109375, 3.043914794921875, -2.361572265625, 0.8248748779296875, -1.145477294921875, 1.8855743408203125, -1.0150909423828125, -1.47808837890625, 0.19812774658203125, 6.58856201171875, 6.015869140625, -5.919952392578125, -1.0319290161132812, -1.4432373046875, 3.2229232788085938, 9.441009521484375, 3.358856201171875, 3.9912872314453125, 0.6216583251953125, -0.19877052307128906, -0.2931022644042969, 3.377899169921875, 7.147674560546875, 1.2076759338378906], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000069.npy"}
|
||||
{"epoch": 0.14450261780104712, "step": 70, "batch_size": 128, "mean": 2.0857458114624023, "std": 3.816991090774536, "min": -10.02447509765625, "p10": -1.9381713867187498, "median": 1.6686782836914062, "p90": 7.075061035156249, "max": 14.4580078125, "pos_frac": 0.703125, "sample": [-1.024261474609375, 3.1401214599609375, 2.0227127075195312, -1.858062744140625, -0.660186767578125, 0.639495849609375, 4.31951904296875, 3.5889205932617188, -4.391357421875, 8.81134033203125, -2.1660919189453125, -0.5199012756347656, 0.23992919921875, 3.11529541015625, 6.55694580078125, 4.08477783203125, 3.061859130859375, 4.3070068359375, 1.3183364868164062, 3.715087890625, 0.8841094970703125, 7.6304931640625, -0.0718994140625, 2.1423377990722656, -4.98663330078125, 3.039642333984375, 1.3340911865234375, 1.6281890869140625, 1.9946136474609375, -1.4101791381835938, 8.3995361328125, 0.76416015625, 0.7475128173828125, 0.20367431640625, 3.75933837890625, 1.3939666748046875, 6.696685791015625, -3.22882080078125, -5.16888427734375, -0.63128662109375, 1.43682861328125, 0.41449546813964844, 10.22344970703125, 2.761077880859375, 1.81829833984375, -0.880615234375, -1.5374298095703125, 0.3338165283203125, -0.7322540283203125, -1.80682373046875, 0.1109619140625, -0.0169525146484375, -0.5705108642578125, 0.0616302490234375, -0.8833274841308594, 6.089324951171875, 9.191131591796875, 3.32196044921875, 4.925445556640625, 1.4883642196655273, 2.2228164672851562, 4.6072998046875, -0.23435211181640625, 1.3328704833984375, 2.204803466796875, 4.276702880859375, 0.5075759887695312, 4.471343994140625, 1.19097900390625, -0.414337158203125, -4.44769287109375, 0.7836532592773438, -0.029148101806640625, 1.77825927734375, 2.0187606811523438, 8.11187744140625, -2.125091552734375, 1.8453636169433594, 10.44671630859375, 1.892547607421875, 12.1453857421875, 6.733551025390625, 1.7365875244140625, 1.71112060546875, 5.8647308349609375, 1.9359130859375, -2.3193359375, 2.05474853515625, -0.006134033203125, 11.794921875, 
4.607627868652344, 6.961883544921875, 0.0, 0.3015708923339844, 2.95654296875, 8.49920654296875, 5.4615631103515625, 5.1787109375, 0.8878021240234375, 9.6944580078125, 4.97442626953125, 6.30255126953125, 1.5272216796875, -2.1889076232910156, 2.9834365844726562, -2.154083251953125, 7.339141845703125, 1.2006683349609375, 5.317657470703125, -2.5774993896484375, -0.3389129638671875, -1.1057586669921875, 1.140625, 2.907470703125, -6.62847900390625, -10.02447509765625, 2.465423583984375, -0.20813751220703125, 1.70916748046875, 6.950592041015625, 1.8954315185546875, 3.20355224609375, 2.4976806640625, -0.018613815307617188, 14.4580078125, 0.0, 0.2640380859375, -0.729583740234375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000070.npy"}
|
||||
{"epoch": 0.14659685863874344, "step": 71, "batch_size": 128, "mean": 1.9645743370056152, "std": 4.463421821594238, "min": -12.76519775390625, "p10": -3.2173141479492187, "median": 1.8052520751953125, "p90": 7.815908813476562, "max": 12.929962158203125, "pos_frac": 0.6875, "sample": [-0.5145034790039062, 10.105560302734375, -1.2739067077636719, -5.4444580078125, -1.15899658203125, -6.65753173828125, -0.5381317138671875, 10.290283203125, -7.005767822265625, 3.7708282470703125, 2.1656455993652344, 3.717082977294922, -3.33355712890625, 3.97003173828125, 1.8754806518554688, 6.5399017333984375, 4.550262451171875, -0.427032470703125, 4.0686187744140625, 9.4449462890625, 2.29296875, -1.165252685546875, 1.730255126953125, 12.929962158203125, 1.5940093994140625, -1.3094863891601562, 0.9259109497070312, 7.0116424560546875, 2.6408157348632812, 4.780609130859375, -3.1674957275390625, 3.941497802734375, 0.7581863403320312, 10.362091064453125, -2.53497314453125, 3.82073974609375, 8.5394287109375, 0.5343904495239258, 6.730621337890625, -3.644317626953125, -1.00714111328125, -0.9174652099609375, -3.6124267578125, 2.5793075561523438, 1.10040283203125, -0.3726806640625, 2.4090576171875, -12.76519775390625, -4.37249755859375, -0.3756828308105469, 2.26312255859375, 5.00164794921875, -0.8407249450683594, 4.256778717041016, 1.03045654296875, 5.1919708251953125, 10.57135009765625, 2.602935791015625, -0.3223533630371094, 0.58099365234375, 3.7636260986328125, -1.818634033203125, 7.900909423828125, 7.18878173828125, 0.3586711883544922, -3.147216796875, -0.49005699157714844, -2.792877197265625, 3.62762451171875, -0.0587615966796875, 2.3146514892578125, 10.62744140625, 3.42645263671875, -4.8782958984375, -3.8776702880859375, 5.265777587890625, 7.5592041015625, -6.446197509765625, -3.0487823486328125, 0.2723350524902344, -1.244140625, -0.60247802734375, 1.325714111328125, 1.7252578735351562, 6.70770263671875, 7.77947998046875, 0.7147064208984375, 6.69976806640625, 6.67388916015625, 
4.13922119140625, 1.5217666625976562, 3.65240478515625, 0.756256103515625, 0.24831771850585938, 0.7445068359375, 8.280044555664062, -0.8782958984375, 9.7691650390625, 3.3547019958496094, 5.5592041015625, 7.595947265625, -4.440864562988281, 0.3376007080078125, 4.2554931640625, 9.9063720703125, 0.5824432373046875, 2.5973358154296875, 0.6643104553222656, -3.112457275390625, 1.878448486328125, 2.7750091552734375, 1.3010234832763672, 3.780242919921875, 1.952117919921875, 2.0459136962890625, 2.777179718017578, 2.8198699951171875, 4.03271484375, 1.5427398681640625, 0.7241058349609375, 2.26715087890625, 1.7350234985351562, 3.521484375, -0.175537109375, -0.19522476196289062, 12.10601806640625, -12.22747802734375, 3.826141357421875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000071.npy"}
|
||||
{"epoch": 0.1486910994764398, "step": 72, "batch_size": 128, "mean": 2.0128540992736816, "std": 4.15703010559082, "min": -11.379547119140625, "p10": -3.1161880493164062, "median": 2.1838674545288086, "p90": 7.694392395019531, "max": 12.934051513671875, "pos_frac": 0.6484375, "sample": [2.721832275390625, 3.4379425048828125, -3.1062774658203125, 1.528167724609375, 11.419921875, -3.139312744140625, 2.3402099609375, 6.48431396484375, 3.0035858154296875, -0.6485824584960938, -1.2059326171875, 0.564208984375, 2.640289306640625, 2.872894287109375, 2.90594482421875, 8.61810302734375, -2.3485031127929688, 1.97467041015625, -4.281394958496094, 2.87078857421875, 3.1757278442382812, 7.3056640625, 3.118896484375, 10.54742431640625, 3.046722412109375, 0.30849456787109375, -1.9783859252929688, 3.1086578369140625, -3.9767990112304688, -0.29718780517578125, 3.531158447265625, -11.379547119140625, -1.35699462890625, -5.655448913574219, 2.459625244140625, 2.153900146484375, 1.1103057861328125, -0.6584377288818359, 1.74322509765625, 1.7684707641601562, 4.143463134765625, -2.158172607421875, 6.977691650390625, 5.72821044921875, 8.838272094726562, 2.884624481201172, 3.5443267822265625, -0.36058807373046875, -3.34710693359375, -1.735260009765625, -2.4940719604492188, 3.30224609375, -2.8914794921875, 0.0, 6.2781982421875, 6.431365966796875, 8.284591674804688, -0.6823959350585938, 5.09979248046875, 9.891357421875, -0.0729827880859375, -3.466339111328125, 3.332500457763672, 5.281074523925781, 2.2164306640625, 5.354461669921875, 1.49664306640625, 1.2091293334960938, -0.7412071228027344, 4.494384765625, -3.7907562255859375, 2.197599411010742, 0.3262176513671875, 3.55322265625, 7.672515869140625, 1.5696907043457031, 0.07972335815429688, 6.37353515625, -0.471649169921875, 9.679054260253906, 2.23907470703125, 2.170135498046875, -1.665283203125, -1.5350341796875, -1.755218505859375, -4.3822021484375, 1.06475830078125, -0.5201568603515625, -2.154153823852539, 9.41204833984375, -5.051177978515625, 
8.322174072265625, -0.601715087890625, 3.638458251953125, 1.5088958740234375, 6.1782379150390625, 6.362640380859375, -0.87774658203125, 2.5313262939453125, 3.15948486328125, 1.90411376953125, 7.7454376220703125, 0.0, 4.9701385498046875, -3.1547584533691406, 0.821380615234375, 8.594284057617188, 12.934051513671875, 6.1813507080078125, 6.955482482910156, 4.305366516113281, -1.925445556640625, -0.658966064453125, -6.495819091796875, 9.85919189453125, -1.0038986206054688, -2.4920921325683594, 0.0, 1.9520301818847656, 3.822998046875, 2.8332138061523438, 3.1736602783203125, -6.078094482421875, 3.45941162109375, 2.8467025756835938, -2.162994384765625, 3.1109619140625, 7.346405029296875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000072.npy"}
|
||||
{"epoch": 0.15078534031413612, "step": 73, "batch_size": 128, "mean": 1.9793339967727661, "std": 4.483782768249512, "min": -16.253326416015625, "p10": -3.1316497802734373, "median": 1.8157310485839844, "p90": 7.543063354492187, "max": 12.83416748046875, "pos_frac": 0.71875, "sample": [1.2238082885742188, 1.98785400390625, 8.130950927734375, 4.2180633544921875, -1.001129150390625, 2.9234390258789062, 0.474212646484375, -0.158843994140625, 0.280364990234375, 6.89013671875, 3.5499267578125, 3.237823486328125, 0.9839324951171875, 10.5325927734375, 0.4109668731689453, 0.38153076171875, -8.19891357421875, -8.570556640625, 1.01519775390625, 0.2505950927734375, 2.285858154296875, 0.28131866455078125, -0.06450271606445312, 4.116058349609375, 1.1846923828125, 0.6933021545410156, 2.33709716796875, 1.4219474792480469, 1.7579345703125, 3.7866973876953125, -6.17376708984375, -2.7282562255859375, 5.1392822265625, 8.829376220703125, -2.09564208984375, 1.4566268920898438, -2.8045196533203125, 10.84478759765625, 5.405509948730469, 3.6339263916015625, 5.442535400390625, 1.0064773559570312, 5.279815673828125, -3.0207290649414062, 12.75250244140625, -1.0887451171875, -1.557861328125, -0.4537353515625, 2.2389144897460938, 10.4730224609375, -5.7208251953125, 2.44744873046875, 9.866790771484375, -16.253326416015625, 4.68865966796875, 1.0487937927246094, 2.956817626953125, -4.255897521972656, 5.355987548828125, -2.936553955078125, 3.706329345703125, 2.2091064453125, 4.53729248046875, -2.670867919921875, -2.1852874755859375, 6.38983154296875, 4.667304992675781, -5.1673583984375, 8.853363037109375, 0.4308509826660156, -1.4614028930664062, -2.57568359375, 3.9771575927734375, 4.627777099609375, -0.67840576171875, 1.9508514404296875, -5.5045166015625, -4.565673828125, -2.1474952697753906, 1.0506973266601562, 6.39013671875, 4.8483734130859375, 3.68304443359375, 0.5020599365234375, -1.5135040283203125, 3.0671768188476562, 0.4103736877441406, 2.9152183532714844, -0.2131195068359375, 
4.488525390625, 1.6854782104492188, 0.9850082397460938, 7.376556396484375, 3.2854690551757812, 6.163482666015625, 4.85296630859375, 1.8655166625976562, 3.94842529296875, -6.4071044921875, 5.240203857421875, 0.03755950927734375, 5.9274139404296875, 4.146148681640625, 5.20880126953125, 7.362518310546875, 7.81536865234375, 1.12738037109375, -0.43505859375, 12.83416748046875, 5.449249267578125, 3.495849609375, -2.50653076171875, 2.3817367553710938, -0.012685775756835938, 8.850067138671875, -4.1815185546875, 0.6077346801757812, 8.469940185546875, -3.3904647827148438, 1.4825057983398438, 8.26666259765625, 1.7659454345703125, 7.426361083984375, 3.141246795654297, 1.4061698913574219, -0.0788116455078125, 5.396728515625, -3.79766845703125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000073.npy"}
|
||||
{"epoch": 0.15287958115183245, "step": 74, "batch_size": 128, "mean": 1.9341108798980713, "std": 4.610764980316162, "min": -9.94561767578125, "p10": -3.9054733276367184, "median": 1.7266845703125, "p90": 7.764585876464843, "max": 13.303680419921875, "pos_frac": 0.6875, "sample": [2.281494140625, 8.8843994140625, 4.7298583984375, 1.618896484375, 7.586517333984375, -1.55059814453125, -4.0791168212890625, 1.6944580078125, 7.3086700439453125, 7.398193359375, -1.1943359375, 7.953643798828125, 0.0, 1.596923828125, 0.4587249755859375, 9.7139892578125, 5.7303466796875, 10.16436767578125, -2.4575347900390625, 1.65228271484375, 2.33367919921875, 3.7741546630859375, 1.1556396484375, 1.8611984252929688, 9.455291748046875, -4.286376953125, -3.256561279296875, -1.860137939453125, -2.55523681640625, -3.5650482177734375, -1.75006103515625, -0.33203125, -4.722412109375, -0.49803924560546875, 5.40008544921875, 1.0690765380859375, 0.8549118041992188, 7.190376281738281, 0.21980857849121094, 7.8909759521484375, -0.6399497985839844, -2.841796875, -8.733489990234375, -7.562591552734375, 1.8124008178710938, 3.3049468994140625, 1.103729248046875, -2.54010009765625, -0.338531494140625, 4.607826232910156, -9.810882568359375, -6.55047607421875, 0.312713623046875, 3.177520751953125, 0.9792022705078125, -1.203908920288086, 1.0346031188964844, 2.112274169921875, -9.84478759765625, 5.810028076171875, 5.795257568359375, 3.2890167236328125, 0.0302734375, 8.40716552734375, 6.54913330078125, 2.945831298828125, 9.3216552734375, 4.7271728515625, 1.0191497802734375, 11.10797119140625, 1.9049072265625, 0.300140380859375, -0.9786758422851562, 4.956634521484375, -0.710540771484375, 2.30828857421875, 2.11590576171875, 4.00677490234375, -0.0944061279296875, 4.832794189453125, 4.4425506591796875, 0.9757919311523438, 3.1978912353515625, -9.94561767578125, 11.22320556640625, 0.7906570434570312, 1.16448974609375, 2.261749267578125, 1.0305557250976562, -1.3816757202148438, -0.558807373046875, -3.8310546875, 
-0.5809326171875, 5.65753173828125, 3.3508834838867188, 3.55242919921875, 5.083465576171875, -7.7620849609375, -6.6341552734375, 6.312652587890625, 5.09375, -4.6316986083984375, 1.7589111328125, 7.710418701171875, 1.54541015625, 5.8667449951171875, 4.70709228515625, -4.71710205078125, 6.929443359375, -0.110321044921875, -0.6268386840820312, 1.8976287841796875, 3.98712158203125, 3.151214599609375, 11.9066162109375, -1.1940155029296875, 0.941436767578125, 5.4307861328125, 5.233489990234375, 1.01470947265625, 4.185821533203125, 2.507049560546875, 0.5465469360351562, 8.089744567871094, 13.303680419921875, 4.3206634521484375, -0.1952972412109375, 6.670013427734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000074.npy"}
|
||||
{"epoch": 0.1549738219895288, "step": 75, "batch_size": 128, "mean": 2.736631155014038, "std": 4.426015377044678, "min": -9.2923583984375, "p10": -2.381307983398437, "median": 2.3471946716308594, "p90": 8.204229736328125, "max": 14.919677734375, "pos_frac": 0.78125, "sample": [2.594371795654297, 2.3940811157226562, 1.902862548828125, 2.3003082275390625, -0.493408203125, 9.350830078125, 6.1362457275390625, -2.735137939453125, 5.145233154296875, 3.2611923217773438, 0.6773014068603516, -3.809295654296875, 1.3306045532226562, -0.9702301025390625, 8.82037353515625, 0.22252655029296875, 14.919677734375, 1.87286376953125, 6.77337646484375, 4.3497467041015625, 8.20135498046875, 6.0770416259765625, 7.558746337890625, -4.23284912109375, 0.855987548828125, 13.02215576171875, -2.2047119140625, 5.701995849609375, 0.396484375, 1.3824005126953125, 2.765869140625, 7.3883056640625, 3.3797607421875, 2.7034149169921875, 1.0846405029296875, -1.4154987335205078, 3.608428955078125, 2.7698974609375, 3.5770645141601562, 1.6746826171875, 10.607330322265625, 5.752880096435547, 0.1336669921875, -9.2923583984375, 0.796966552734375, -0.50079345703125, 1.014364242553711, -0.07862091064453125, 8.72467041015625, 1.1267623901367188, 3.465545654296875, 3.859161376953125, 3.1868743896484375, 1.3279571533203125, 5.8111572265625, 4.845062255859375, -4.392578125, 7.907012939453125, -0.948577880859375, 6.870361328125, 2.11175537109375, 1.99969482421875, 1.27117919921875, 7.068389892578125, 1.3641853332519531, 1.970306396484375, 8.155853271484375, -1.9756240844726562, 2.4862899780273438, 0.5130615234375, 1.650665283203125, -2.2310791015625, 1.8480606079101562, 11.73394775390625, 8.2109375, 0.43515968322753906, -7.4805908203125, -7.131988525390625, -0.775482177734375, 4.179004669189453, 11.460906982421875, 5.127357482910156, 5.9193115234375, -4.611663818359375, 0.02557373046875, 9.277053833007812, 0.7207908630371094, 0.48040771484375, 3.7794418334960938, 7.0697784423828125, 4.143524169921875, 
-0.820068359375, 4.38311767578125, 2.746337890625, 6.946075439453125, 12.009033203125, -0.9262924194335938, 4.33795166015625, 0.828338623046875, 7.545536041259766, -8.2662353515625, 0.619873046875, 4.121612548828125, 11.18157958984375, 8.13873291015625, 2.720062255859375, -0.45653533935546875, 4.48028564453125, 1.5864524841308594, -4.392974853515625, 6.637481689453125, -0.3139972686767578, 1.334320068359375, 5.6637420654296875, 4.738149642944336, 6.33221435546875, 0.37640380859375, 1.3543663024902344, 4.92584228515625, -2.819488525390625, 2.683258056640625, 1.939239501953125, 4.715610504150391, 9.57952880859375, -2.731842041015625, -1.9866943359375, -6.38018798828125, 0.106201171875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000075.npy"}
|
||||
{"epoch": 0.15706806282722513, "step": 76, "batch_size": 128, "mean": 2.934190511703491, "std": 4.87100887298584, "min": -8.882568359375, "p10": -2.1267929077148438, "median": 2.0983657836914062, "p90": 9.916580200195312, "max": 21.820205688476562, "pos_frac": 0.765625, "sample": [6.824737548828125, -3.2357635498046875, 1.4991569519042969, 2.92279052734375, 1.982757568359375, 5.01177978515625, 2.305145263671875, 1.3907222747802734, 2.2903575897216797, 3.9193496704101562, 8.60675048828125, 0.06602859497070312, 1.2227020263671875, -1.268341064453125, 13.0013427734375, 4.666412353515625, -0.04955291748046875, -2.1252593994140625, 1.4596710205078125, -1.1640625, 12.230621337890625, 0.704071044921875, 4.2314453125, 1.8932037353515625, 3.3097381591796875, 14.1685791015625, 1.99395751953125, 3.3622169494628906, 1.6250457763671875, 0.555084228515625, 3.3453369140625, 0.2531566619873047, 4.9525299072265625, 8.63427734375, -8.882568359375, 13.8333740234375, 3.3175125122070312, 4.661285400390625, 2.81201171875, 2.5728759765625, 9.89117431640625, 6.171905517578125, 6.22552490234375, 5.941680908203125, -7.1717529296875, 1.1565685272216797, 3.2362136840820312, 3.6801300048828125, 0.2541961669921875, 9.44293212890625, 0.0, 3.5105438232421875, -0.025146484375, -0.28679656982421875, 7.428337097167969, -4.44256591796875, 11.124542236328125, -7.15032958984375, -0.688201904296875, -2.521331787109375, 5.1676177978515625, 11.502105712890625, -2.202972412109375, 1.5479736328125, 3.513774871826172, 1.135711669921875, 2.945709228515625, 3.3653182983398438, 5.599246978759766, -2.13037109375, -0.143218994140625, 9.975860595703125, -2.6258544921875, 1.0133895874023438, -3.9127044677734375, 0.0, 10.33563232421875, 3.7013397216796875, 5.7024993896484375, 11.459259033203125, 2.3736648559570312, -0.014917373657226562, 1.392181396484375, 1.3796310424804688, 0.1240081787109375, 1.7188720703125, 2.126312255859375, 8.1551513671875, 2.60345458984375, 3.3358917236328125, 4.0736541748046875, 
1.2988815307617188, 0.19025421142578125, 1.7735748291015625, 0.08701705932617188, -0.41797637939453125, -4.076103210449219, 0.72723388671875, 2.39874267578125, -0.32790374755859375, 5.416259765625, 2.0403594970703125, 6.74920654296875, 12.223358154296875, 9.1922607421875, 17.929962158203125, 11.229507446289062, 0.29083251953125, 2.98785400390625, -3.5619354248046875, 0.729095458984375, -1.6307601928710938, 21.820205688476562, -1.0801239013671875, -8.051666259765625, 0.1831512451171875, 2.07684326171875, 3.5419921875, 7.82965087890625, 1.395233154296875, 2.1946868896484375, 6.78961181640625, 2.345703125, 2.0853652954101562, -0.7372665405273438, 2.1113662719726562, -0.8972625732421875, 0.8228607177734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000076.npy"}
|
||||
{"epoch": 0.15916230366492146, "step": 77, "batch_size": 128, "mean": 2.4001669883728027, "std": 5.402097702026367, "min": -14.281524658203125, "p10": -3.7506256103515625, "median": 1.8151588439941406, "p90": 8.901626586914062, "max": 19.85064697265625, "pos_frac": 0.65625, "sample": [-1.2583847045898438, 3.188690185546875, 1.484405517578125, 4.3973388671875, 1.562744140625, 0.9288482666015625, 8.30316162109375, 5.053985595703125, 0.1180877685546875, 3.389495849609375, -3.4515533447265625, -0.02487945556640625, 10.283935546875, 6.376739501953125, 12.356475830078125, 1.7014656066894531, 0.5651702880859375, 3.198455810546875, -14.281524658203125, 7.8296356201171875, -1.1317062377929688, -1.172637939453125, -1.94866943359375, -0.20379638671875, 7.0576171875, 8.796783447265625, 5.0740966796875, 2.2991867065429688, 4.155029296875, 4.61663818359375, -4.26837158203125, -0.00054931640625, 0.6024017333984375, 7.66827392578125, -2.47564697265625, 1.117929458618164, 2.2977981567382812, 2.30352783203125, 8.8663330078125, -1.5684814453125, -0.0714111328125, 0.8801040649414062, 7.442352294921875, 0.5238037109375, 4.591819763183594, 1.623565673828125, 0.10017776489257812, 19.109954833984375, -0.8639678955078125, 0.903228759765625, 8.0048828125, -2.9925537109375, 4.6502685546875, 8.764678955078125, 8.471588134765625, 6.0320281982421875, 3.9691009521484375, -2.216217041015625, 1.790557861328125, 0.02611541748046875, -0.290679931640625, -3.2445907592773438, 8.79547119140625, 19.85064697265625, -1.8951416015625, 6.88250732421875, 3.42950439453125, -1.521209716796875, 11.327423095703125, 2.2807769775390625, 8.983978271484375, 2.6517791748046875, -8.32696533203125, -3.8438720703125, -3.754791259765625, -1.4302978515625, 4.818572998046875, 1.0684432983398438, 1.8064422607421875, 2.4148101806640625, -1.5518798828125, 10.572418212890625, -3.0353775024414062, -3.33978271484375, 1.9000015258789062, 3.601715087890625, 9.235809326171875, -9.6959228515625, -5.60858154296875, 3.725067138671875, 
2.93292236328125, 1.8238754272460938, -1.6429290771484375, 3.5832977294921875, 4.0669097900390625, -3.9254150390625, -0.927276611328125, -5.544952392578125, 11.5054931640625, 3.4860687255859375, -5.5130615234375, 4.6317901611328125, -0.6930618286132812, 1.8424072265625, 6.77569580078125, 1.028472900390625, -0.05810546875, 10.6400146484375, 12.77642822265625, -1.068023681640625, 1.7017440795898438, -5.841796875, -4.85284423828125, 15.09722900390625, -0.2683219909667969, 4.3681488037109375, -4.782684326171875, 2.932525634765625, -2.3926620483398438, -1.135498046875, 8.77825927734375, 2.1396636962890625, 4.31634521484375, 5.02215576171875, 5.04864501953125, 14.328460693359375, -3.74884033203125, 0.43589019775390625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000077.npy"}
{"epoch": 0.1612565445026178, "step": 78, "batch_size": 128, "mean": 2.9869937896728516, "std": 5.3209547996521, "min": -13.539810180664062, "p10": -2.8663612365722653, "median": 2.5424156188964844, "p90": 9.513827514648437, "max": 23.33837890625, "pos_frac": 0.71875, "sample": [-0.1767730712890625, 6.367034912109375, -3.549957275390625, 5.0404052734375, 8.21942138671875, 2.0076904296875, 3.2044525146484375, 3.067474365234375, -3.4796218872070312, 5.616081237792969, 0.1388397216796875, 1.8976058959960938, 4.661525726318359, 4.44287109375, -1.1468353271484375, 2.1126708984375, 3.8023834228515625, 5.4023590087890625, 10.564163208007812, -0.6126556396484375, 0.0, 10.7821044921875, -4.4801788330078125, 2.854522705078125, -0.18675994873046875, 1.6725234985351562, 4.79217529296875, 2.897674560546875, -3.0801849365234375, 5.73291015625, -2.9985885620117188, 4.0511474609375, 8.794281005859375, 0.3480682373046875, 7.277008056640625, -2.8096923828125, 0.4125823974609375, 14.7913818359375, 0.63232421875, -4.1720733642578125, 4.4896087646484375, 5.48236083984375, -0.36273193359375, 23.33837890625, 11.609130859375, -1.41693115234375, 3.913818359375, 3.9009246826171875, 1.392364501953125, 3.013763427734375, 7.1007080078125, -13.539810180664062, 1.8783798217773438, 3.9623031616210938, -0.2817840576171875, -0.71929931640625, 5.0891876220703125, 3.25848388671875, 3.4576416015625, -1.0997257232666016, 9.356201171875, -0.672027587890625, 4.7339630126953125, 5.188873291015625, 2.6935348510742188, -2.7716751098632812, 1.44866943359375, 3.35040283203125, 4.2957763671875, -10.183868408203125, 13.660491943359375, 2.668853759765625, 1.3101806640625, 1.3392486572265625, 12.043701171875, 6.511383056640625, -0.8426113128662109, 2.5177459716796875, -1.2350578308105469, 1.0447463989257812, -4.761016845703125, -1.2812175750732422, 9.343307495117188, 16.04266357421875, 0.8739471435546875, 2.5670852661132812, -0.48858642578125, 4.258880615234375, -5.9556732177734375, 6.00634765625, 
3.2524566650390625, -2.300750732421875, 7.871124267578125, -1.1840057373046875, -7.04547119140625, 1.6255340576171875, 3.523406982421875, -3.4194793701171875, 11.3018798828125, 0.1374664306640625, 0.6679649353027344, 1.2804489135742188, -1.1828994750976562, 9.881622314453125, 7.203125, 3.2922821044921875, 7.289764404296875, -1.8750457763671875, 1.5710372924804688, 2.3843154907226562, 13.476470947265625, 1.079345703125, 3.625518798828125, 7.759246826171875, 3.08306884765625, 2.1473770141601562, 7.475921630859375, 11.849212646484375, 9.32452392578125, -3.2118072509765625, -2.31036376953125, 0.2098541259765625, 18.577056884765625, 0.8582344055175781, 2.058441162109375, 2.1941146850585938, -0.508270263671875, 7.9510498046875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000078.npy"}
{"epoch": 0.16335078534031414, "step": 79, "batch_size": 128, "mean": 3.104513168334961, "std": 6.995034694671631, "min": -43.1119384765625, "p10": -2.7342254638671872, "median": 2.293262481689453, "p90": 10.198342895507812, "max": 24.42572021484375, "pos_frac": 0.703125, "sample": [0.33893585205078125, 5.8233184814453125, 9.213485717773438, -2.61029052734375, 5.5819091796875, -1.70391845703125, 1.7544403076171875, -0.7331085205078125, 3.4883575439453125, -2.535400390625, 13.012664794921875, -1.0923309326171875, 0.0, 9.00555419921875, -0.16004562377929688, -0.6186981201171875, -0.635711669921875, -1.6320724487304688, -6.137748718261719, 0.097442626953125, 3.022918701171875, -1.766693115234375, -3.5765533447265625, -2.042203903198242, -2.23883056640625, 4.682945251464844, 1.3290252685546875, 0.0, 10.2564697265625, 0.5355358123779297, -5.61468505859375, 11.7137451171875, 5.120452880859375, 4.693328857421875, 3.7748794555664062, 1.2733001708984375, 3.1949462890625, 1.4530487060546875, 3.7828826904296875, 0.638458251953125, 5.7170257568359375, -7.499214172363281, 10.173431396484375, 0.15128135681152344, 0.11539459228515625, 5.5237884521484375, 6.81169319152832, 15.49273681640625, 1.8456878662109375, 3.1839141845703125, -0.756591796875, -3.086212158203125, -43.1119384765625, -6.593292236328125, 1.4800262451171875, 7.907440185546875, 0.699188232421875, 16.65399169921875, 9.305221557617188, 9.980438232421875, -0.7767181396484375, 4.15521240234375, 2.907684326171875, 6.9138946533203125, 9.635025024414062, 4.010650634765625, 3.8010330200195312, 2.816925048828125, -1.4163055419921875, 3.198394775390625, -0.75054931640625, 5.199165344238281, 0.7536849975585938, 1.400970458984375, -4.52850341796875, -5.45538330078125, 8.7996826171875, 0.579559326171875, 9.977447509765625, 2.700775146484375, -3.18975830078125, 17.820343017578125, 8.7811279296875, 6.062042236328125, 5.417816162109375, 7.6270751953125, 4.200187683105469, 0.36388397216796875, 0.8143119812011719, 
0.8712406158447266, 0.1444110870361328, 5.919761657714844, 0.04798126220703125, 0.37115478515625, 1.8857498168945312, 24.42572021484375, -0.2391338348388672, 5.236419677734375, -4.5090789794921875, 10.806488037109375, 15.490325927734375, 9.470016479492188, 3.59442138671875, 5.0408172607421875, -4.0302734375, 8.916107177734375, 9.057159423828125, -0.7064094543457031, 9.148284912109375, 0.2596435546875, -3.023406982421875, 16.792572021484375, -0.148895263671875, -0.7746505737304688, 5.058351516723633, 6.7246856689453125, 6.9113006591796875, 5.00201416015625, 0.8996810913085938, 4.2792510986328125, -0.8626251220703125, 9.441253662109375, 10.30523681640625, 18.82208251953125, -0.2198028564453125, -2.5621337890625, 0.28804779052734375, 12.740509033203125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000079.npy"}
{"epoch": 0.16544502617801046, "step": 80, "batch_size": 128, "mean": 2.4767165184020996, "std": 6.078868865966797, "min": -20.7186279296875, "p10": -3.4849517822265623, "median": 1.7136154174804688, "p90": 10.623580932617188, "max": 22.02978515625, "pos_frac": 0.6875, "sample": [1.8324050903320312, 4.259971618652344, -2.2868499755859375, 1.6540374755859375, 0.9007568359375, -2.430572509765625, 1.2622222900390625, -6.2459716796875, 7.11602783203125, -2.7739715576171875, 0.23876571655273438, -2.439544677734375, 5.140899658203125, 4.2921142578125, 5.305763244628906, -2.3639602661132812, -1.1530914306640625, 10.024627685546875, 3.3922348022460938, 4.4561767578125, 16.34442138671875, -0.7776565551757812, 2.9498138427734375, 4.65472412109375, 2.058624267578125, 10.676239013671875, 7.5072021484375, -3.1050567626953125, 3.4538841247558594, 8.9354248046875, 2.356658935546875, 0.3754863739013672, 1.773193359375, -1.6386737823486328, -11.2548828125, 0.4947967529296875, -20.7186279296875, 6.2733154296875, 0.0, 12.459716796875, -1.4806976318359375, 9.4019775390625, -2.03167724609375, -1.447845458984375, 1.789215087890625, -8.87469482421875, 0.453369140625, 5.9659271240234375, 0.326690673828125, -1.1348114013671875, 1.284576416015625, 1.028076171875, 1.5756378173828125, 5.8721771240234375, 8.508544921875, 0.45484352111816406, 0.3599815368652344, 0.10610198974609375, -2.8336639404296875, 11.071136474609375, 3.877532958984375, 1.0162734985351562, 5.913982391357422, 4.80523681640625, -5.213653564453125, 1.387939453125, 4.84814453125, 0.5640029907226562, 6.120536804199219, 2.6952171325683594, -2.28765869140625, 4.9471893310546875, 3.3287038803100586, -3.8959274291992188, 19.707977294921875, -0.79656982421875, -2.2895965576171875, -0.48805999755859375, 5.5447998046875, -9.63690185546875, 11.4281005859375, 3.1672515869140625, -0.673187255859375, 10.60101318359375, 4.3631591796875, 2.8006629943847656, -5.9396209716796875, -1.017822265625, -1.8143768310546875, 16.09918212890625, 
5.82830810546875, 2.708343505859375, 12.173446655273438, 4.2193603515625, 4.0308685302734375, 12.351654052734375, 8.206832885742188, 1.0534896850585938, -9.454757690429688, 0.49066162109375, -7.49029541015625, 10.45660400390625, -3.455902099609375, 4.65032958984375, 2.79876708984375, 11.355224609375, 1.3023529052734375, -1.0056838989257812, 3.7848663330078125, 8.184814453125, -4.291015625, 0.0531768798828125, -0.3324127197265625, -0.9949569702148438, 1.378326416015625, -5.488655090332031, 3.7498512268066406, -1.4239768981933594, 0.8939647674560547, -3.552734375, 5.9908599853515625, 12.161651611328125, 10.58612060546875, 4.10736083984375, 22.02978515625, 0.08331108093261719, 3.245086669921875, 14.07562255859375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000080.npy"}
{"epoch": 0.16753926701570682, "step": 81, "batch_size": 128, "mean": 2.9438183307647705, "std": 6.5395917892456055, "min": -18.36456298828125, "p10": -4.08016357421875, "median": 2.532867431640625, "p90": 12.29260559082031, "max": 15.21783447265625, "pos_frac": 0.71875, "sample": [0.16341590881347656, -0.24854087829589844, 9.469184875488281, -11.038467407226562, 4.9401397705078125, -1.0431861877441406, -7.932586669921875, 3.5177688598632812, 13.74298095703125, 7.812530517578125, 15.21783447265625, 13.23944091796875, -2.0671844482421875, -4.0637664794921875, 7.941131591796875, 0.6828403472900391, 3.5847015380859375, 8.154983520507812, 5.155181884765625, 4.6705780029296875, 1.3982467651367188, 10.442779541015625, 0.39228057861328125, -3.7656097412109375, 4.741813659667969, 3.6800804138183594, 2.4897308349609375, 0.327667236328125, 9.499847412109375, 7.190887451171875, 13.4459228515625, 3.2138214111328125, 8.543609619140625, 4.19793701171875, -1.4180908203125, 2.6881484985351562, -4.71435546875, 14.91943359375, 14.30389404296875, 2.953216552734375, -4.792816162109375, 1.31683349609375, 10.81494140625, 0.21396636962890625, 0.014312744140625, 1.8752593994140625, -14.757568359375, -1.58978271484375, 6.4696044921875, 11.1126708984375, 1.175537109375, 3.2156219482421875, -6.555015563964844, 9.203521728515625, -4.9582061767578125, -0.6373748779296875, 6.300262451171875, 4.715087890625, 3.20452880859375, -14.288711547851562, -4.794769287109375, 3.347198486328125, 3.494384765625, 0.9580535888671875, 0.46846961975097656, 4.379127502441406, 0.56689453125, 0.7708816528320312, 2.3321685791015625, 4.15289306640625, -0.8902740478515625, -1.6021461486816406, -1.3948974609375, 11.963043212890625, 13.896209716796875, 9.906707763671875, -1.612396240234375, 4.47900390625, 2.1706066131591797, 7.381072998046875, -12.0623779296875, 6.1162261962890625, 9.30584716796875, -4.1184234619140625, 3.02471923828125, 2.0198097229003906, -1.1258506774902344, -3.5989990234375, 0.64544677734375, 
2.5760040283203125, -18.36456298828125, -3.108001708984375, -13.965667724609375, 0.110107421875, 2.40960693359375, 1.370208740234375, -0.3450775146484375, 0.06358718872070312, 13.66229248046875, 13.547882080078125, 13.4097900390625, 14.5721435546875, 4.8555450439453125, -1.120452880859375, 9.88800048828125, 2.3325328826904297, -1.3006591796875, 1.4879150390625, 0.18869781494140625, 5.727203369140625, -2.5006942749023438, 4.369781494140625, -0.016653060913085938, -1.21240234375, 3.0381317138671875, 10.172698974609375, 8.431289672851562, -1.3568572998046875, 11.394866943359375, 8.09637451171875, 5.89471435546875, 13.06158447265625, 14.280517578125, 9.913070678710938, 0.90472412109375, 0.8033447265625, 6.5137786865234375, -1.638153076171875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000081.npy"}
{"epoch": 0.16963350785340314, "step": 82, "batch_size": 128, "mean": 2.9991281032562256, "std": 6.034842014312744, "min": -15.50439453125, "p10": -3.5153759002685545, "median": 2.957378387451172, "p90": 10.68841552734375, "max": 22.775390625, "pos_frac": 0.703125, "sample": [1.559783935546875, 5.335906982421875, 0.29946136474609375, -7.37432861328125, -0.4969635009765625, 5.59576416015625, 2.4005126953125, 12.381179809570312, 7.9011688232421875, -0.017292022705078125, 1.498809814453125, 6.078857421875, -5.069305419921875, -3.067047119140625, -3.30426025390625, 6.1359100341796875, 5.312896728515625, 3.266448974609375, 13.451675415039062, 5.962066650390625, 0.8315315246582031, -2.972991943359375, 6.904060363769531, -0.44197845458984375, 0.7860679626464844, 10.795867919921875, -9.42352294921875, 4.2866363525390625, 4.8924102783203125, 4.77728271484375, 18.24249267578125, 2.4072723388671875, 9.895172119140625, 3.845062255859375, 10.3997802734375, 12.666748046875, -1.322540283203125, 5.0173797607421875, 3.727325439453125, -15.50439453125, -3.4888916015625, -0.07330322265625, 3.50274658203125, 3.1823501586914062, -2.99517822265625, 12.222320556640625, -1.369964599609375, 7.2740478515625, 20.22137451171875, -2.38775634765625, -0.7396392822265625, 16.520156860351562, 4.731040954589844, 5.2570648193359375, 3.062103271484375, 0.5284366607666016, 10.642364501953125, -4.028106689453125, -0.5961761474609375, 2.9144439697265625, 3.0003128051757812, -5.2858734130859375, -13.619781494140625, 3.653627395629883, -3.5653076171875, 1.979705810546875, -4.10552978515625, 1.06683349609375, 10.897613525390625, 3.251007080078125, -0.9961929321289062, 7.134918212890625, 2.54119873046875, 3.5060806274414062, -0.13970947265625, 2.4434051513671875, -0.91827392578125, 1.7701663970947266, 4.966766357421875, 1.5235824584960938, 1.1561508178710938, 22.775390625, -2.8653717041015625, 5.5640869140625, -0.12139892578125, 2.5477676391601562, 5.7022705078125, 10.231231689453125, 2.0846939086914062, 
3.5846405029296875, 3.4363250732421875, -2.4065093994140625, 2.853191375732422, 2.87615966796875, 6.695840835571289, 4.9168701171875, -5.52093505859375, 2.2990264892578125, 4.805084228515625, 10.30609130859375, 6.10076904296875, -0.17464637756347656, 1.2286529541015625, 5.0706939697265625, 7.920501708984375, 3.591930389404297, -0.41234779357910156, -1.72589111328125, -7.4180908203125, 2.6639556884765625, -3.493976593017578, 4.315155029296875, 6.736164093017578, 7.718505859375, -4.058868408203125, 12.263031005859375, 13.6258544921875, 2.7509994506835938, 4.356109619140625, 4.8583831787109375, 4.47357177734375, 5.87139892578125, 12.067169189453125, 6.46136474609375, 1.0006694793701172, 1.0044002532958984, -11.822265625, -3.150299072265625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000082.npy"}
{"epoch": 0.17172774869109947, "step": 83, "batch_size": 128, "mean": 3.492799758911133, "std": 6.965437412261963, "min": -14.415313720703125, "p10": -3.074468994140625, "median": 2.562093734741211, "p90": 12.265689086914062, "max": 29.42120361328125, "pos_frac": 0.6875, "sample": [7.608489990234375, 0.0, 11.64697265625, 2.2939453125, 0.6463165283203125, -0.08221435546875, -1.0906982421875, 9.3377685546875, 2.8021240234375, 1.1616897583007812, -1.36859130859375, -14.415313720703125, -0.13983154296875, -12.898223876953125, 0.3092041015625, 1.611083984375, 4.2639007568359375, 9.95477294921875, 8.018798828125, 2.400310516357422, 7.83648681640625, 6.087368011474609, -1.0913848876953125, 6.0206146240234375, 10.492919921875, -4.92364501953125, -2.983154296875, -0.35334014892578125, -2.135345458984375, 11.49871826171875, 7.204193115234375, -0.23162841796875, 5.130126953125, 1.7634658813476562, -6.647705078125, 13.987060546875, 3.0175323486328125, 1.93927001953125, 19.484024047851562, 1.9163360595703125, 4.768054962158203, -2.0321502685546875, 2.792572021484375, 3.008148193359375, -2.27960205078125, 11.4517822265625, 4.5308685302734375, 4.217079162597656, -2.24676513671875, -2.7613067626953125, 5.8785400390625, 5.554718017578125, 19.294647216796875, 9.1259765625, -12.120330810546875, 1.3076934814453125, 12.23797607421875, 5.05792236328125, 29.42120361328125, 1.144134521484375, -3.1493759155273438, 12.330352783203125, 5.287200927734375, 6.2344970703125, 1.6526031494140625, -2.684356689453125, 1.08697509765625, 0.3255157470703125, -3.1415252685546875, 8.346817016601562, -8.023773193359375, 5.095245361328125, 0.22015380859375, -1.527801513671875, -1.27178955078125, 3.4630813598632812, 15.4886474609375, -2.9455718994140625, 2.723876953125, 9.524337768554688, -2.6418685913085938, 12.814453125, 5.2866363525390625, 3.41302490234375, -13.8665771484375, -6.157196044921875, 6.732574462890625, 13.718994140625, -8.488800048828125, 12.222930908203125, -0.07606697082519531, 
-1.2400436401367188, 0.8495025634765625, -3.561279296875, 4.902805328369141, 18.5531005859375, 15.6080322265625, 1.062469482421875, 2.894561767578125, 1.43402099609375, 15.896636962890625, -1.6124267578125, 7.551025390625, 3.22198486328125, -2.3151397705078125, 0.59375, 7.7024688720703125, 5.6219024658203125, 1.9381866455078125, -0.53936767578125, 8.90496826171875, -4.080078125, 9.356201171875, 8.995346069335938, 1.215576171875, 3.8189239501953125, 2.1981201171875, 16.140350341796875, 19.406005859375, -3.0457305908203125, 1.6985816955566406, -1.4400787353515625, 8.945114135742188, -1.2056427001953125, 5.7628936767578125, 5.954765319824219, 3.21240234375, 2.2616424560546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000083.npy"}
{"epoch": 0.17382198952879582, "step": 84, "batch_size": 128, "mean": 2.9974305629730225, "std": 7.044206619262695, "min": -21.426605224609375, "p10": -3.7002685546875, "median": 3.141693115234375, "p90": 12.376913452148436, "max": 20.472381591796875, "pos_frac": 0.71875, "sample": [-1.4882583618164062, 7.038818359375, 15.81646728515625, 0.036865234375, -2.7734375, -4.143463134765625, 8.78271484375, 4.391502380371094, 5.5158233642578125, -5.26690673828125, 1.1688308715820312, 9.1201171875, 2.0123214721679688, -3.1038970947265625, 5.5089111328125, 3.856536865234375, 6.470367431640625, 1.189605712890625, -0.19462203979492188, 9.795944213867188, -2.7300186157226562, 8.441925048828125, 1.664337158203125, 4.9061279296875, 7.88037109375, 2.48992919921875, -8.344688415527344, 5.622589111328125, 7.52435302734375, 2.735931396484375, -2.501251220703125, -16.1278076171875, -21.426605224609375, 12.546875, 3.4355926513671875, -0.8929901123046875, 7.87530517578125, 1.5840682983398438, 1.4854278564453125, -0.978515625, 6.2042236328125, 3.510986328125, 13.417572021484375, -3.4795074462890625, 12.30633544921875, -2.366424560546875, 1.6508941650390625, -3.212646484375, 12.752777099609375, 4.9971160888671875, 20.472381591796875, -17.532958984375, 9.40704345703125, 2.582427978515625, 3.7844619750976562, -3.282958984375, 3.0360488891601562, -3.837158203125, 6.3279571533203125, 7.583099365234375, 5.070037841796875, 3.62957763671875, 0.3359375, 5.620002746582031, 11.884002685546875, 0.58941650390625, 3.97216796875, 0.34203338623046875, 0.44696044921875, 3.1452178955078125, -2.518585205078125, 4.355133056640625, 2.149749755859375, 0.14825439453125, 2.242382049560547, 14.035430908203125, 7.45391845703125, 9.354682922363281, -11.656341552734375, 4.455474853515625, 3.4723358154296875, 0.7624359130859375, -3.5387344360351562, 2.2493629455566406, -3.6416015625, -0.446044921875, -1.678741455078125, 9.305374145507812, 16.01007080078125, 0.0782470703125, 12.541595458984375, 4.072288513183594, 
-5.42767333984375, 5.203033447265625, -0.618072509765625, -1.1366424560546875, 4.2601776123046875, 3.004901885986328, 4.3694915771484375, 5.4665679931640625, 4.1899261474609375, -0.5097732543945312, 2.687957763671875, 2.3018798828125, 8.920387268066406, 6.052459716796875, -5.469390869140625, 19.582275390625, -12.66351318359375, 10.495147705078125, 18.17230224609375, 2.0948562622070312, 6.449317932128906, 4.504302978515625, 17.3958740234375, 9.859130859375, -5.43572998046875, -16.078521728515625, 4.20721435546875, -1.6873779296875, 0.4791145324707031, 12.830551147460938, 15.23992919921875, -2.2021636962890625, 3.766632080078125, 5.5174407958984375, 3.1381683349609375, -2.7779998779296875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000084.npy"}
{"epoch": 0.17591623036649215, "step": 85, "batch_size": 128, "mean": 1.9258768558502197, "std": 5.801815986633301, "min": -15.51007080078125, "p10": -6.012281799316406, "median": 2.4007949829101562, "p90": 8.562979125976563, "max": 17.24163818359375, "pos_frac": 0.65625, "sample": [3.4272308349609375, -4.0518035888671875, -1.6525611877441406, 17.24163818359375, 7.138423919677734, 5.3910064697265625, -0.9497833251953125, -1.352020263671875, 5.783538818359375, 2.183319091796875, 2.0651073455810547, -2.484821319580078, 2.4184188842773438, 8.5325927734375, 7.0234375, 2.9185791015625, 7.34954833984375, 6.363128662109375, 5.4781036376953125, 2.8944854736328125, 0.397552490234375, -0.461456298828125, 8.012451171875, 12.53179931640625, 1.7176170349121094, -1.41552734375, -1.0803375244140625, -4.4178466796875, -8.94537353515625, 6.5565185546875, 5.865447998046875, 4.162200927734375, -1.750518798828125, 3.1207275390625, 8.49407958984375, 11.587921142578125, -3.3009414672851562, 4.90228271484375, 1.190277099609375, -2.57464599609375, 10.79656982421875, 4.8761444091796875, -2.367919921875, 5.930908203125, 11.124755859375, -15.51007080078125, 2.3969650268554688, 3.1300048828125, 3.4743785858154297, 15.3616943359375, 0.3033447265625, -7.19317626953125, 5.041143417358398, 2.334686279296875, 2.3979339599609375, -5.2661590576171875, 2.403656005859375, 1.9768333435058594, 5.2855987548828125, 2.0800247192382812, -3.596588134765625, 5.757965087890625, 3.4152069091796875, -2.9437255859375, 8.053115844726562, 5.854949951171875, 4.984375, 2.76165771484375, -5.607307434082031, -9.15716552734375, -3.33599853515625, 6.114986419677734, -9.40313720703125, -1.7095947265625, -0.05922698974609375, -5.97161865234375, 10.963287353515625, 9.05023193359375, 0.49105072021484375, 4.580451965332031, 2.64532470703125, 8.633880615234375, 0.2158966064453125, 5.781341552734375, 14.219970703125, -11.47564697265625, -0.460174560546875, 9.329986572265625, 3.8498802185058594, -6.1071624755859375, 
-9.561241149902344, 1.435089111328125, 3.286895751953125, 3.1796302795410156, 9.96832275390625, 2.569681167602539, -1.2212600708007812, 3.975860595703125, -7.263580322265625, 4.228355407714844, -1.1912612915039062, 4.910614013671875, 0.8064155578613281, -1.496307373046875, -0.2087860107421875, 0.206268310546875, -6.7130889892578125, 1.3322601318359375, 7.833251953125, 3.5001449584960938, -6.1839141845703125, -10.288330078125, -0.0162200927734375, 6.8147430419921875, -3.4066925048828125, 1.4208984375, 6.4066314697265625, 1.3295822143554688, 14.660598754882812, -8.124618530273438, -2.2493133544921875, 4.3171844482421875, -5.511360168457031, 1.79937744140625, 3.4754638671875, -2.3912010192871094, 3.9586334228515625, 5.160175323486328], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000085.npy"}
{"epoch": 0.17801047120418848, "step": 86, "batch_size": 128, "mean": 3.5604519844055176, "std": 7.5712103843688965, "min": -17.166595458984375, "p10": -4.059649658203124, "median": 3.4382009506225586, "p90": 12.649063110351562, "max": 25.397003173828125, "pos_frac": 0.6953125, "sample": [3.3772735595703125, -10.32843017578125, 1.684478759765625, -6.0391845703125, 4.64007568359375, 2.4740753173828125, 4.270263671875, 2.8448257446289062, 6.26025390625, 5.970672607421875, 25.397003173828125, -3.93341064453125, 1.4320869445800781, 1.54522705078125, 7.154869079589844, -10.226741790771484, -11.9827880859375, -3.7382049560546875, 0.9591560363769531, -1.2173843383789062, 2.4172439575195312, 6.296329498291016, -2.3562088012695312, -1.5872344970703125, 14.655517578125, 0.119171142578125, -14.413330078125, 0.89337158203125, 16.9703369140625, 12.649169921875, -6.588470458984375, 11.0369873046875, -1.1538238525390625, 10.3341064453125, 20.03179931640625, 10.8636474609375, 11.577484130859375, -4.30792236328125, 3.44061279296875, 13.75421142578125, 3.9896011352539062, -1.44305419921875, 8.938835144042969, -2.1492691040039062, 2.9334335327148438, 0.880126953125, 7.959602355957031, 11.571563720703125, 8.91729736328125, 5.6018829345703125, -17.166595458984375, 12.29840087890625, 0.914031982421875, 7.8387451171875, 4.4792327880859375, -2.382110595703125, 11.83404541015625, -15.8271484375, 1.0959587097167969, 8.547576904296875, 21.111160278320312, 5.01776123046875, 3.7713623046875, 11.180633544921875, 7.940673828125, -1.4901123046875, 0.0, -2.358306884765625, 3.1646881103515625, 6.125732421875, 4.3202667236328125, 14.83868408203125, -12.3287353515625, 4.306709289550781, 2.3269901275634766, 1.724151611328125, -0.5613365173339844, -3.065399169921875, 11.166488647460938, 9.593414306640625, -1.8641357421875, 7.0903472900390625, 11.025177001953125, -0.06817626953125, 5.440704345703125, -8.907135009765625, -13.249603271484375, 9.481430053710938, -3.821746826171875, 1.1334075927734375, 
-0.7750396728515625, 5.929229736328125, 0.9698638916015625, 1.9325408935546875, 2.321044921875, -5.33099365234375, 4.81512451171875, 4.70111083984375, 4.23675537109375, 3.435789108276367, 1.758087158203125, -3.9532470703125, -3.1617202758789062, 12.649017333984375, 14.9808349609375, 8.837692260742188, -1.0118446350097656, 22.847564697265625, 2.5763473510742188, 4.05975341796875, -2.5875816345214844, 6.53887939453125, 13.294204711914062, 4.83843994140625, 12.319427490234375, -0.9022216796875, 6.7995452880859375, 14.6378173828125, -0.31185340881347656, 5.101348876953125, 3.546630859375, 9.799823760986328, 8.687530517578125, 13.148162841796875, -1.8360595703125, -2.6668701171875, 2.48199462890625, 3.976348876953125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000086.npy"}
{"epoch": 0.18010471204188483, "step": 87, "batch_size": 128, "mean": 4.029636383056641, "std": 6.543172836303711, "min": -17.904205322265625, "p10": -2.7022716522216794, "median": 3.1044845581054688, "p90": 13.503430175781249, "max": 23.557708740234375, "pos_frac": 0.734375, "sample": [4.674781799316406, 8.901168823242188, 7.36376953125, 7.870880126953125, 16.843475341796875, 0.0, -0.34932899475097656, -2.658588409423828, 8.283279418945312, 15.471099853515625, -4.462615966796875, 7.7448577880859375, -3.079132080078125, 9.421079635620117, 23.557708740234375, -0.16963768005371094, 0.7777862548828125, 6.33001708984375, 15.021575927734375, 4.216819763183594, -1.237060546875, -8.514633178710938, 1.421112060546875, 5.1747589111328125, 2.88031005859375, -2.80419921875, -0.9488677978515625, 6.064453125, 15.182708740234375, 0.5939521789550781, 11.191375732421875, -8.22711181640625, 5.6722412109375, 15.4088134765625, 6.8781890869140625, 4.567573547363281, 4.9715118408203125, 3.1302490234375, 4.298095703125, 3.674560546875, -0.6420478820800781, 0.133819580078125, 8.6922607421875, -3.3553466796875, -2.2188491821289062, -0.81329345703125, 9.688232421875, -5.103546142578125, 10.419342041015625, 4.7747650146484375, 7.1988525390625, 0.3460693359375, 13.6842041015625, 14.2506103515625, 10.33154296875, 8.340972900390625, 6.177581787109375, 8.05914306640625, 11.117584228515625, 18.802490234375, 0.717041015625, 1.13262939453125, 2.039234161376953, 11.071640014648438, 3.491851806640625, 21.50543212890625, -0.7416229248046875, 0.0, 3.6222686767578125, 4.22503662109375, 3.0787200927734375, 4.98321533203125, 13.42864990234375, 4.985198974609375, 12.707427978515625, 2.958526611328125, 4.123931884765625, -1.550628662109375, 8.6385498046875, -0.5439529418945312, -7.171630859375, -3.928741455078125, -1.5118255615234375, 1.804290771484375, -0.3226165771484375, 11.30255126953125, -1.895599365234375, 6.379909515380859, 3.499481201171875, 7.6263427734375, 3.2780914306640625, 0.9746017456054688, 
3.0187606811523438, -1.4609375, 1.9298095703125, -10.22064208984375, 1.3817367553710938, 1.2088871002197266, -7.798187255859375, 12.810302734375, -0.13768386840820312, 2.781097412109375, -0.6482887268066406, 1.100128173828125, 12.861602783203125, -17.904205322265625, 4.2702484130859375, -5.2922515869140625, 0.28516387939453125, 2.3622207641601562, 14.207763671875, 13.67791748046875, -0.22625732421875, 6.96185302734375, 0.68316650390625, 0.1386871337890625, 8.78369140625, 2.5238876342773438, 2.02264404296875, 0.6893196105957031, 4.557861328125, 0.9915924072265625, -2.4266586303710938, 2.7086868286132812, 7.4502410888671875, 1.6453399658203125, 14.709060668945312, 1.2174701690673828], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000087.npy"}
{"epoch": 0.18219895287958116, "step": 88, "batch_size": 128, "mean": 3.2125608921051025, "std": 7.404752254486084, "min": -26.30609130859375, "p10": -4.211262512207031, "median": 2.432281494140625, "p90": 13.007223510742184, "max": 26.793914794921875, "pos_frac": 0.671875, "sample": [8.696296691894531, 14.93731689453125, 13.85430908203125, 0.3510093688964844, 17.5989990234375, 2.588592529296875, -3.791534423828125, 0.8380508422851562, 8.248283386230469, 3.3714141845703125, 1.8086700439453125, -3.5628509521484375, 2.459197998046875, 20.636962890625, -0.0466766357421875, 0.81243896484375, -0.2627277374267578, 7.528167724609375, 0.7144699096679688, -3.110595703125, 12.691802978515625, -4.9278411865234375, 11.638763427734375, -4.718788146972656, -1.4573974609375, 6.9521026611328125, -5.6223907470703125, 1.1792068481445312, 14.09661865234375, -6.992462158203125, -7.03973388671875, 10.819290161132812, 4.1739349365234375, 5.4644775390625, 3.0795440673828125, -1.2503395080566406, 19.43267822265625, 0.0, 0.4241294860839844, 2.868499755859375, 6.73297119140625, 4.9990692138671875, 10.180999755859375, -0.818359375, 14.3546142578125, 13.736724853515625, 1.965850830078125, -7.25555419921875, 2.106781005859375, 3.114349365234375, 0.0, 3.813629150390625, 3.518951416015625, 5.48480224609375, 14.22576904296875, 0.6834564208984375, -0.7731361389160156, 1.4120407104492188, -1.0431365966796875, 6.146514892578125, 10.750694274902344, -0.9268760681152344, 0.646392822265625, -6.837364196777344, 5.851585388183594, 2.807708740234375, -3.5918121337890625, 11.122589111328125, -0.3140411376953125, 12.694580078125, 3.275226593017578, -3.37481689453125, 8.9481201171875, 0.13273239135742188, -1.889617919921875, -22.05596923828125, 4.9101409912109375, 1.1878738403320312, -0.0068817138671875, -2.8687744140625, 6.8497161865234375, 7.408660888671875, 7.21087646484375, -1.6154899597167969, -0.557708740234375, -4.558349609375, -3.3488540649414062, -6.8688812255859375, 6.3162994384765625, 
8.080123901367188, 0.14228057861328125, 5.45513916015625, 2.6990509033203125, 4.185882568359375, 5.5966796875, 14.836944580078125, 2.15338134765625, -4.2795257568359375, -2.3806915283203125, 17.77288818359375, 2.60760498046875, 3.9483642578125, -1.1278400421142578, 8.461822509765625, -1.189605712890625, 26.793914794921875, 12.2652587890625, -0.4135284423828125, 10.57928466796875, -7.691986083984375, 8.375991821289062, 0.799346923828125, 0.18218994140625, 2.904205322265625, 21.3392333984375, 4.311990737915039, -1.3260936737060547, 0.5077037811279297, 2.405364990234375, 0.3350830078125, -26.30609130859375, 5.3199462890625, -4.1820068359375, 7.4327545166015625, 1.0811996459960938, 8.018768310546875, 4.173492431640625, -1.0267105102539062], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000088.npy"}
{"epoch": 0.18429319371727748, "step": 89, "batch_size": 128, "mean": 3.7864387035369873, "std": 6.749569892883301, "min": -19.3720703125, "p10": -3.1967441558837892, "median": 3.108264923095703, "p90": 12.14931640625, "max": 26.47589111328125, "pos_frac": 0.78125, "sample": [0.5815277099609375, -11.154067993164062, 12.9532470703125, 2.18023681640625, 5.4105224609375, 15.49786376953125, -1.488687515258789, 1.0132293701171875, 1.490325927734375, 8.950775146484375, 2.211669921875, 3.900115966796875, 2.4433135986328125, 10.9698486328125, 2.2479248046875, 11.361587524414062, 0.9225215911865234, -2.586944580078125, 4.5784912109375, 1.927459716796875, 5.179290771484375, 13.368072509765625, -2.6971359252929688, 0.4800567626953125, 0.6598739624023438, 1.3782501220703125, 4.179718017578125, -1.21954345703125, -0.19732666015625, 3.2445755004882812, 2.3716583251953125, 0.5625686645507812, -0.449554443359375, 12.0286865234375, 13.72442626953125, 26.47589111328125, 12.4307861328125, -6.740264892578125, 0.6131439208984375, 4.191314697265625, -19.3720703125, -3.486968994140625, 4.19342041015625, -10.2733154296875, -0.65625, 3.938995361328125, 7.58160400390625, -11.768600463867188, 0.695556640625, -3.4344482421875, -0.3861656188964844, 8.087417602539062, 3.962249755859375, 0.352020263671875, 9.991470336914062, 8.760711669921875, 6.4480133056640625, -3.1857566833496094, 6.8264617919921875, 10.300384521484375, 1.9442787170410156, 17.071441650390625, 9.647369384765625, 8.852798461914062, 7.1668701171875, 1.5363006591796875, -6.10980224609375, 1.170074462890625, 1.0471534729003906, -3.532989501953125, 4.404029846191406, 6.140289306640625, 4.754035949707031, 7.8288421630859375, -5.2951507568359375, -0.9673995971679688, 3.077526092529297, 7.70635986328125, 20.69775390625, -6.296295166015625, 1.431640625, 7.0627593994140625, 2.8138275146484375, 7.923675537109375, 0.2445068359375, 11.4940185546875, 2.895355224609375, 14.1954345703125, 9.7772216796875, 20.41021728515625, 5.122528076171875, 
-0.166259765625, 13.232154846191406, 2.1723365783691406, 2.7451438903808594, 2.8026885986328125, 23.80584716796875, -1.6047515869140625, 1.8398513793945312, 4.105491638183594, 5.933135986328125, 1.395751953125, 3.836761474609375, -0.616424560546875, 3.7649993896484375, 4.319007873535156, -1.7619781494140625, -0.5185089111328125, 0.460662841796875, 4.308135986328125, 3.9800567626953125, 3.1390037536621094, 9.595893859863281, 13.26251220703125, 6.0855712890625, 3.8916091918945312, 4.6525421142578125, 4.49053955078125, 7.8820648193359375, -13.0621337890625, 1.4060611724853516, 11.7750244140625, 5.57000732421875, 1.48419189453125, 2.2627315521240234, 3.506378173828125, 2.1255950927734375, -3.222381591796875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000089.npy"}
{"epoch": 0.18638743455497384, "step": 90, "batch_size": 128, "mean": 3.375781774520874, "std": 7.27635383605957, "min": -20.5035400390625, "p10": -4.954065322875976, "median": 2.4962196350097656, "p90": 11.920956420898436, "max": 24.589141845703125, "pos_frac": 0.71875, "sample": [8.564125061035156, 5.158233642578125, 0.0452423095703125, 16.91064453125, -10.385711669921875, -0.8720703125, -4.190887451171875, 1.0906791687011719, -2.403350830078125, 11.61077880859375, 1.899139404296875, 2.9426193237304688, -7.625732421875, -0.7335891723632812, -1.293792724609375, 5.769989013671875, 21.539520263671875, 18.842559814453125, 11.44024658203125, 1.8155441284179688, 5.9395599365234375, -1.27618408203125, 0.6475067138671875, 2.9916954040527344, -12.09539794921875, -4.6545562744140625, 11.23638916015625, 7.335662841796875, 6.539031982421875, 1.5148773193359375, -1.4440765380859375, 5.549613952636719, 8.7998046875, 3.08544921875, 8.9373779296875, 2.3941192626953125, -5.994354248046875, -4.075164794921875, 2.15899658203125, 5.820793151855469, -5.652919769287109, 1.725311279296875, -0.8839492797851562, -10.2431640625, 7.683624267578125, 9.7674560546875, 5.270362854003906, 9.008224487304688, 6.668373107910156, 2.16607666015625, 8.316696166992188, -3.701812744140625, 1.1149139404296875, 1.6912422180175781, 0.107147216796875, 1.844390869140625, -9.746002197265625, 0.0279998779296875, 0.161773681640625, 6.0642852783203125, 0.3158721923828125, 0.0, 11.741668701171875, 7.4434051513671875, 10.200172424316406, -1.666015625, 9.5360107421875, 3.2612457275390625, -3.27325439453125, 10.59478759765625, 2.2048110961914062, 1.7824249267578125, 5.74127197265625, 15.684814453125, 11.7115478515625, 0.615264892578125, 0.38320350646972656, 3.5994033813476562, 2.7100753784179688, 9.26873779296875, 12.0184326171875, -20.5035400390625, 12.7449951171875, -10.063201904296875, 1.148651123046875, 2.94329833984375, 4.629173278808594, 2.5344009399414062, -0.7040252685546875, -3.125, -0.9081268310546875, 
-5.7099609375, 11.07366943359375, 11.879180908203125, -2.9679412841796875, 3.152191162109375, -4.608062744140625, 6.631744384765625, 9.992034912109375, 11.709732055664062, -2.09796142578125, 11.81003189086914, 8.237884521484375, 0.44952392578125, -1.9051971435546875, 24.589141845703125, -2.19586181640625, 4.21783447265625, -6.015869140625, -11.53704833984375, 3.8175811767578125, 1.3970184326171875, -6.896392822265625, 12.2896728515625, 2.458038330078125, 0.22322463989257812, 12.86529541015625, 3.597320556640625, 14.188812255859375, 2.53515625, 0.07379913330078125, -2.396270751953125, 21.03948974609375, 13.747039794921875, 3.6056976318359375, 9.962661743164062, 1.633209228515625, 13.757781982421875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000090.npy"}
{"epoch": 0.18848167539267016, "step": 91, "batch_size": 128, "mean": 3.454530715942383, "std": 7.750298023223877, "min": -12.83526611328125, "p10": -5.945297241210937, "median": 3.26739501953125, "p90": 12.751821899414063, "max": 26.5426025390625, "pos_frac": 0.6328125, "sample": [4.933258056640625, 1.4214973449707031, -1.4462165832519531, 17.672592163085938, 9.002044677734375, 14.458724975585938, 0.11323738098144531, 4.650665283203125, 5.0319671630859375, -3.0296707153320312, 1.2393798828125, 5.3291168212890625, 7.2479095458984375, 7.175117492675781, 6.370660781860352, 8.810134887695312, 1.8975982666015625, -12.83526611328125, -3.93511962890625, -0.08967971801757812, 7.640083312988281, 4.25665283203125, 5.307090759277344, 6.997467041015625, -2.0053882598876953, 5.902252197265625, 0.5472049713134766, -1.19281005859375, -0.6383132934570312, -7.94921875, 12.236572265625, -0.5119171142578125, -0.7635955810546875, -5.81231689453125, -3.510467529296875, 8.879959106445312, -5.303375244140625, 7.62225341796875, 5.756317138671875, 6.6418609619140625, -8.978240966796875, -0.5660018920898438, 2.8353271484375, -2.356689453125, 3.8663330078125, 19.5667724609375, -5.690155029296875, -3.7222900390625, 0.0, -12.220458984375, -6.255584716796875, -3.76983642578125, -10.7445068359375, 2.792816162109375, 7.01080322265625, 5.794891357421875, -0.1213531494140625, -6.259063720703125, 12.519256591796875, 4.87969970703125, 11.17449951171875, -7.7244873046875, 4.2816162109375, 0.036197662353515625, -2.6123123168945312, -0.775360107421875, 24.35638427734375, -0.860015869140625, -1.248779296875, 10.993576049804688, -8.670700073242188, -3.068359375, 26.5426025390625, 3.6118011474609375, 15.830780029296875, 1.0307140350341797, 4.6580047607421875, 0.4455413818359375, 3.802581787109375, 12.831268310546875, -9.104888916015625, -4.936614990234375, 22.688568115234375, 10.070358276367188, 13.57427978515625, 0.0, 9.053108215332031, 5.50457763671875, 7.89862060546875, 12.7177734375, 3.6461181640625, 
9.82244873046875, 4.397857666015625, -8.08795166015625, -0.51806640625, 12.6923828125, -8.454658508300781, 2.9229888916015625, 3.87640380859375, 20.34783935546875, 7.24462890625, 8.76824951171875, 7.641265869140625, 4.556610107421875, 4.96185302734375, -9.990409851074219, -1.8768157958984375, 11.290077209472656, -1.595489501953125, 15.902587890625, -3.404205322265625, 18.966964721679688, 20.949371337890625, -0.09230804443359375, -1.8390045166015625, 10.520553588867188, 2.4489822387695312, -5.300537109375, 7.126739501953125, 1.3060150146484375, 9.386703491210938, -4.544952392578125, 2.7357635498046875, 1.8160018920898438, 4.06024169921875, 2.35638427734375, 8.084197998046875, 1.253753662109375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000091.npy"}
{"epoch": 0.1905759162303665, "step": 92, "batch_size": 128, "mean": 3.811319351196289, "std": 7.869380474090576, "min": -28.19232177734375, "p10": -4.96787109375, "median": 3.309925079345703, "p90": 14.376701354980467, "max": 21.133087158203125, "pos_frac": 0.7109375, "sample": [-4.95904541015625, 0.18373870849609375, -4.48907470703125, 3.4145126342773438, 3.2330703735351562, 6.6394805908203125, 4.3227996826171875, 5.043609619140625, 13.458328247070312, 6.961944580078125, 1.658782958984375, 2.348552703857422, 2.850341796875, 1.185699462890625, 21.133087158203125, 20.193359375, 18.930023193359375, -6.498046875, 7.2721405029296875, 1.4177703857421875, -6.0824432373046875, 7.515380859375, 14.644424438476562, 12.447113037109375, 1.63525390625, -1.68438720703125, 3.1581573486328125, 13.58843994140625, 1.213043212890625, -1.2590789794921875, 3.38677978515625, 1.10748291015625, 11.1812744140625, 6.9264984130859375, 1.12554931640625, -0.527099609375, 0.385772705078125, 5.0488128662109375, 14.261962890625, 8.1829833984375, 1.4545822143554688, 1.84564208984375, 0.22933387756347656, -28.19232177734375, 12.0267333984375, 2.0110015869140625, 4.986297607421875, 3.046600341796875, 1.14239501953125, 1.4247665405273438, -4.98846435546875, -7.5811767578125, -8.525192260742188, -0.838470458984375, 17.5460205078125, -5.953033447265625, -9.439285278320312, -0.3152923583984375, -5.783599853515625, 7.54345703125, 5.378288269042969, 5.44488525390625, 7.58843994140625, 5.0811614990234375, 6.613800048828125, 4.136894226074219, 9.67862319946289, 0.0, -2.136260986328125, 13.53228759765625, -2.571380615234375, 2.8079833984375, 1.690582275390625, -1.0260181427001953, -1.7188339233398438, 8.0906982421875, -0.6076698303222656, 0.9849472045898438, 11.762054443359375, -0.963801383972168, -7.359039306640625, 14.64520263671875, 5.9013671875, -21.34771728515625, -2.1998138427734375, 0.549346923828125, 5.769195556640625, 18.915313720703125, 19.211517333984375, 5.74359130859375, 8.096149444580078, 
7.249359130859375, -0.5281581878662109, -0.339447021484375, -0.2521333694458008, 21.06317138671875, -1.988494873046875, -4.903594970703125, 19.5968017578125, 4.292716979980469, 10.97760009765625, 9.989089965820312, -11.9744873046875, 14.7113037109375, 9.002777099609375, 5.0433807373046875, 2.592193603515625, 3.4052734375, 3.7542190551757812, 4.962493896484375, 8.39794921875, 4.189567565917969, 9.529682159423828, -11.460357666015625, 5.468719482421875, 9.18890380859375, -0.009937286376953125, 0.6185302734375, -1.518157958984375, 9.196685791015625, -2.16265869140625, 8.926532745361328, 3.6890106201171875, 1.8320865631103516, 15.85601806640625, 17.0953369140625, 9.763992309570312, -1.2958755493164062], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000092.npy"}
{"epoch": 0.19267015706806281, "step": 93, "batch_size": 128, "mean": 3.60282564163208, "std": 7.486203193664551, "min": -18.538604736328125, "p10": -4.109667968749999, "median": 2.5268936157226562, "p90": 14.29962692260742, "max": 24.136138916015625, "pos_frac": 0.671875, "sample": [1.3284492492675781, 13.8853759765625, 1.8984222412109375, 22.47625732421875, -5.8633575439453125, 7.607940673828125, -1.6063690185546875, 11.65362548828125, 3.07025146484375, 5.265657424926758, 4.503936767578125, -7.8106842041015625, 1.260040283203125, 3.028900146484375, 4.674896240234375, 12.84423828125, 2.310089111328125, 17.72650146484375, -2.9948654174804688, 5.1836090087890625, 5.8710784912109375, 5.513702392578125, 3.191944122314453, 4.4208984375, 14.812904357910156, 1.1920013427734375, -0.381866455078125, 2.5218963623046875, 11.616119384765625, 10.88470458984375, 2.428192138671875, -8.740219116210938, 0.3504180908203125, -0.961090087890625, 2.6702117919921875, 6.3765716552734375, 4.7957305908203125, 4.685394287109375, -3.835205078125, 13.890106201171875, 1.9076080322265625, 6.053871154785156, 3.282745361328125, -4.723876953125, -0.5403900146484375, 4.139404296875, -0.56317138671875, 14.85028076171875, 1.403564453125, -0.737640380859375, 1.6419906616210938, 6.9680938720703125, 17.086669921875, -5.845550537109375, 7.8662109375, 2.055511474609375, -0.7309646606445312, -1.1027030944824219, 8.614471435546875, 1.1368446350097656, 7.6006927490234375, 4.728424072265625, 16.5684814453125, -1.340911865234375, -2.1487884521484375, -5.606689453125, 9.517669677734375, 17.49169921875, -6.2547149658203125, 2.0483856201171875, 3.777200698852539, 2.7961959838867188, -3.7471237182617188, -1.44329833984375, 8.550643920898438, -2.7044525146484375, 13.965240478515625, -0.0518646240234375, 21.14837646484375, -18.538604736328125, 21.15283203125, 7.4805908203125, 15.366943359375, -6.97467041015625, 6.920501708984375, 1.4443511962890625, 0.0, -9.478363037109375, 2.5678462982177734, 4.663505554199219, 
16.98480224609375, 12.79571533203125, -3.846435546875, 2.1534271240234375, 5.5394287109375, -14.9613037109375, 6.17919921875, -0.8780689239501953, -3.088653564453125, -6.793296813964844, 10.064743041992188, 6.6186370849609375, 2.531890869140625, 2.456634521484375, 5.447265625, 0.0, 1.76849365234375, -2.5933761596679688, -3.784088134765625, -0.8631095886230469, -1.9768905639648438, 0.6161346435546875, 6.644878387451172, 2.78924560546875, -9.887054443359375, 21.96929931640625, 24.136138916015625, -2.0028419494628906, -1.334686279296875, -1.4066162109375, 4.987091064453125, 3.89971923828125, -0.33905029296875, 14.07965087890625, 0.21628952026367188, 2.078460693359375, 0.697967529296875, 6.2525634765625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000093.npy"}
{"epoch": 0.19476439790575917, "step": 94, "batch_size": 128, "mean": 4.082180976867676, "std": 9.159119606018066, "min": -19.00677490234375, "p10": -7.2871246337890625, "median": 3.3115997314453125, "p90": 14.583477783203124, "max": 30.96087646484375, "pos_frac": 0.703125, "sample": [-2.33203125, 10.7508544921875, -6.2551116943359375, 17.57708740234375, 4.7504425048828125, 10.684539794921875, 19.956710815429688, 2.5580368041992188, 3.942779541015625, 8.850082397460938, 0.4947509765625, -5.504295349121094, -15.426849365234375, 0.7083168029785156, 12.258331298828125, 0.998931884765625, 13.377655029296875, 0.8459701538085938, 0.46954345703125, 0.2880096435546875, 13.20556640625, 14.507553100585938, 3.939544677734375, -2.1378555297851562, 20.550079345703125, 13.585723876953125, 1.8754730224609375, 3.162750244140625, -11.047393798828125, -1.670318603515625, 11.64959716796875, 21.42620849609375, -12.216827392578125, 23.980499267578125, 7.376091003417969, 8.0855712890625, 8.30548095703125, 2.4043445587158203, -7.8008880615234375, 20.80474853515625, 5.005035400390625, 30.96087646484375, 14.14483642578125, -0.7745819091796875, 13.6759033203125, 4.3956451416015625, 14.494308471679688, -6.8133544921875, 7.3920440673828125, -2.555755615234375, 1.607452392578125, 0.18153762817382812, -1.1179542541503906, 7.4619140625, 3.46044921875, -5.7044677734375, 8.663360595703125, -1.04345703125, 9.908935546875, -8.35919189453125, 2.46343994140625, -5.27899169921875, 3.587799072265625, -4.419364929199219, 1.733673095703125, 12.946197509765625, 14.760635375976562, 1.1729812622070312, -6.8470001220703125, 14.224609375, -0.62030029296875, 0.956817626953125, 7.835845947265625, 13.068084716796875, 18.3544921875, -7.932891845703125, 4.658721923828125, -1.7813224792480469, 3.802215576171875, 5.871734619140625, 4.28228759765625, -0.3589019775390625, -5.323738098144531, -5.1968536376953125, -11.43719482421875, 22.59625244140625, 12.790863037109375, -7.2840576171875, 15.839935302734375, 
-13.931304931640625, -19.00677490234375, 4.64312744140625, 2.4068222045898438, 8.910659790039062, 4.70654296875, 13.68682861328125, 11.155731201171875, 7.4433135986328125, 17.21856689453125, -7.856536865234375, 7.0792083740234375, -7.294281005859375, 11.80694580078125, 1.6029167175292969, 3.915008544921875, 6.9080810546875, 2.8405303955078125, 9.101097106933594, -0.43365478515625, 1.9287567138671875, 5.635284423828125, 5.9709014892578125, 0.8326873779296875, -17.89959716796875, -0.0391845703125, 0.959075927734375, 8.260711669921875, 1.1105194091796875, 0.972442626953125, -6.9703369140625, 8.188644409179688, 0.6808013916015625, 26.57080078125, 12.199485778808594, -10.46307373046875, 1.845458984375, -0.1774148941040039, -0.418853759765625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000094.npy"}
{"epoch": 0.1968586387434555, "step": 95, "batch_size": 128, "mean": 4.592145919799805, "std": 9.137064933776855, "min": -25.822021484375, "p10": -6.589212036132812, "median": 4.298107147216797, "p90": 16.71834259033203, "max": 28.2232666015625, "pos_frac": 0.7265625, "sample": [-2.7416229248046875, -1.17669677734375, 4.318016052246094, 24.645904541015625, 6.2916259765625, 4.128631591796875, 1.6262359619140625, 5.88348388671875, 25.4605712890625, 8.700927734375, 0.832550048828125, -0.2908477783203125, 2.569732666015625, 2.961109161376953, -7.9664459228515625, 4.802032470703125, 3.8738861083984375, 6.5194549560546875, 12.25006103515625, 2.3668289184570312, 14.169586181640625, 6.5192108154296875, -9.8863525390625, 8.366241455078125, -0.615692138671875, -2.037200927734375, 27.090301513671875, -4.26629638671875, 15.01666259765625, -25.822021484375, 21.36199951171875, -2.037139892578125, 9.377195358276367, -6.8277435302734375, 25.40576171875, 10.94268798828125, 1.802032470703125, 5.7491912841796875, 2.54473876953125, 10.42938232421875, 3.4793014526367188, 5.43572998046875, -1.717193603515625, -0.05650138854980469, 0.7110443115234375, 9.75238037109375, 4.350006103515625, 10.163070678710938, 20.68951416015625, 2.741485595703125, -0.9967041015625, 9.5374755859375, 5.09259033203125, 0.13299179077148438, 7.297882080078125, 6.69952392578125, 4.5397491455078125, 4.2781982421875, 6.317901611328125, 10.62103271484375, 7.82403564453125, -1.85174560546875, 1.170257568359375, 0.9322509765625, 3.3265609741210938, 6.211212158203125, -4.872528076171875, 12.23858642578125, 6.50836181640625, 3.6321372985839844, -6.1636199951171875, 19.12213134765625, 8.36944580078125, 17.755813598632812, 1.6783447265625, -1.6573562622070312, 0.002887725830078125, 16.273712158203125, 3.3918838500976562, 6.020294189453125, -9.34912109375, 9.997943878173828, 4.5773162841796875, 1.4600467681884766, 5.709381103515625, 24.00927734375, 7.050117492675781, 8.445663452148438, -1.1556396484375, -14.28887939453125, 
5.2136993408203125, 7.306854248046875, -2.438079833984375, 1.9217987060546875, -14.26202392578125, 2.171875, 8.70843505859375, 6.528289794921875, -8.0419921875, 4.54559326171875, -3.146942138671875, -2.931406021118164, -6.4869842529296875, 2.1495895385742188, 11.059432983398438, -7.983695983886719, 5.490421295166016, -0.699981689453125, 23.646148681640625, -10.02984619140625, 3.84228515625, 5.082763671875, -1.1076126098632812, 11.198486328125, 3.094879150390625, 14.383453369140625, 2.642913818359375, 9.908126831054688, 6.422454833984375, -13.9476318359375, -7.5675048828125, -0.4161529541015625, 6.3763275146484375, 21.26513671875, 3.95208740234375, 25.84796142578125, -11.9039306640625, 28.2232666015625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000095.npy"}
{"epoch": 0.19895287958115182, "step": 96, "batch_size": 128, "mean": 4.4437994956970215, "std": 8.322668075561523, "min": -19.29931640625, "p10": -4.08038330078125, "median": 3.7973947525024414, "p90": 15.774822998046874, "max": 28.892822265625, "pos_frac": 0.7109375, "sample": [-3.831756591796875, -10.256591796875, -7.44537353515625, 1.670989990234375, 15.8363037109375, 7.8950653076171875, -13.218135833740234, -3.9725341796875, 1.34161376953125, 1.80694580078125, 3.811616897583008, -4.33203125, -4.449676513671875, -2.4871826171875, 9.084434509277344, 1.703207015991211, 4.129058837890625, -6.477020263671875, 1.6033782958984375, 5.5769500732421875, 10.378997802734375, 2.3348388671875, 8.646331787109375, 6.4181976318359375, -1.9721527099609375, 4.983795166015625, 18.652008056640625, 12.364715576171875, 2.7921218872070312, 5.72760009765625, 26.17681884765625, 5.42449951171875, 10.22930908203125, 11.204200744628906, 6.717254638671875, 3.5826950073242188, 3.71197509765625, 4.456329345703125, 2.0670318603515625, 0.0968170166015625, 7.217620849609375, 4.0728912353515625, -1.698577880859375, 21.49322509765625, -2.737579345703125, 5.363853454589844, -8.103683471679688, -0.5400238037109375, 4.1330413818359375, -1.6700439453125, 5.405853271484375, 0.180633544921875, 9.87103271484375, -3.11376953125, 4.06256103515625, -3.10882568359375, 4.888359069824219, 14.742721557617188, 4.754302978515625, 12.92236328125, -1.4665069580078125, 3.4083251953125, 13.172637939453125, 9.936065673828125, 6.502471923828125, 20.7996826171875, 27.31793212890625, 3.783172607421875, -1.5924835205078125, 15.05010986328125, 3.53790283203125, 15.875099182128906, 2.5649795532226562, -9.092330932617188, 24.145050048828125, 6.198333740234375, 10.106033325195312, -2.89788818359375, 18.5499267578125, 0.70147705078125, 9.707717895507812, 0.812408447265625, 4.842742919921875, -7.15960693359375, 0.59613037109375, -1.6199989318847656, 0.5290985107421875, 11.08660888671875, 15.74847412109375, -0.7896270751953125, 
-0.8738059997558594, -0.45220947265625, 4.812582015991211, 18.326080322265625, 20.4530029296875, -6.779327392578125, 4.693115234375, 7.635589599609375, 10.343088150024414, 12.91650390625, -14.380279541015625, 20.13067626953125, 2.0858154296875, -1.8584327697753906, -0.32566070556640625, 9.83026123046875, 0.260955810546875, 28.892822265625, 0.47503662109375, 4.019287109375, 9.369110107421875, 0.0, 1.313629150390625, 2.9770145416259766, 0.6492462158203125, 4.378501892089844, -3.910614013671875, 7.82269287109375, 5.33514404296875, 11.07342529296875, -4.435251235961914, 8.054624557495117, -2.750438690185547, -3.1682968139648438, 3.2156906127929688, 8.778533935546875, -1.2750091552734375, -19.29931640625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000096.npy"}
{"epoch": 0.20104712041884817, "step": 97, "batch_size": 128, "mean": 4.689986228942871, "std": 9.83456039428711, "min": -17.955841064453125, "p10": -7.252804565429687, "median": 4.1945648193359375, "p90": 17.036441040039062, "max": 31.3763427734375, "pos_frac": 0.6796875, "sample": [5.516714096069336, 6.552978515625, 6.821533203125, -6.975608825683594, 6.89044189453125, 2.73876953125, -1.571746826171875, 6.252658843994141, 9.3739013671875, 8.069244384765625, -1.232034683227539, 31.3763427734375, 11.29754638671875, 15.424072265625, 0.32653045654296875, 4.4097747802734375, 16.787994384765625, -1.3453369140625, -5.05072021484375, 12.62158203125, 2.8390350341796875, 2.0448532104492188, -2.383453369140625, -0.620819091796875, 9.777069091796875, -7.294952392578125, 4.06683349609375, -5.564765930175781, 8.552120208740234, 8.998222351074219, -8.143692016601562, 0.500274658203125, -3.3240509033203125, -7.2347412109375, 7.866241455078125, 26.819000244140625, 10.250335693359375, -4.55889892578125, -5.709442138671875, -8.787109375, 2.416290283203125, 3.64959716796875, 2.5144805908203125, 23.88128662109375, -8.239715576171875, 7.198974609375, 5.269561767578125, 9.391456604003906, 1.431182861328125, 9.067008972167969, 12.762496948242188, -6.880035400390625, 23.178466796875, -0.5380859375, 8.146392822265625, 11.304183959960938, -6.8858489990234375, 7.722930908203125, 1.705230712890625, 0.8228759765625, 9.918800354003906, 6.6602020263671875, -1.874542236328125, 7.81568717956543, 4.9857635498046875, 0.0, 27.9852294921875, 8.730117797851562, 8.368988037109375, 30.441864013671875, 1.4344024658203125, 12.312118530273438, 9.537689208984375, 6.55487060546875, -0.4996223449707031, 27.866455078125, 3.2784423828125, -14.98199462890625, -7.349853515625, 18.543701171875, 7.53973388671875, 1.37103271484375, -4.250457763671875, 4.422630310058594, 12.01953125, 6.160594940185547, -0.4075355529785156, 29.07232666015625, 4.927490234375, -3.1688232421875, 7.664272308349609, -8.79571533203125, 
8.573333740234375, 1.3538970947265625, 5.948486328125, -17.955841064453125, 0.0, 7.448394775390625, 12.746414184570312, 0.7350311279296875, 1.3695831298828125, 28.36932373046875, -9.397048950195312, 2.4018402099609375, 3.3577499389648438, 3.15338134765625, -1.6438446044921875, -5.66473388671875, 26.185150146484375, 5.177490234375, -4.22637939453125, 25.072174072265625, -8.20675277709961, -10.3282470703125, 1.160003662109375, 11.851470947265625, 2.955657958984375, 7.06829833984375, -0.6117019653320312, -12.803237915039062, 9.279205322265625, -5.0438385009765625, -0.1273040771484375, 12.91424560546875, 17.61614990234375, -12.94036865234375, 4.322296142578125, 9.629150390625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000097.npy"}
{"epoch": 0.2031413612565445, "step": 98, "batch_size": 128, "mean": 4.7765116691589355, "std": 10.216471672058105, "min": -25.47723388671875, "p10": -5.472464942932128, "median": 3.61761474609375, "p90": 17.739796447753907, "max": 33.9921875, "pos_frac": 0.6640625, "sample": [12.439949035644531, 11.613601684570312, 21.7891845703125, 3.0177955627441406, 32.5693359375, 17.835189819335938, 14.971328735351562, -5.162506103515625, 5.149383544921875, -1.4178085327148438, 15.6595458984375, -12.827957153320312, -1.6356201171875, 3.45343017578125, 7.24835205078125, 9.877899169921875, 0.73944091796875, -11.6085205078125, -5.965278625488281, 1.7909393310546875, 11.9403076171875, -2.947235107421875, -6.155242919921875, 3.7073516845703125, 2.6047210693359375, 3.9600448608398438, -0.1305389404296875, -5.4178314208984375, -4.56842041015625, -18.0345458984375, 2.3201961517333984, -25.47723388671875, -4.0814208984375, -7.97393798828125, 3.966766357421875, -1.1403045654296875, -5.599943161010742, 14.591583251953125, 2.122833251953125, 2.3469696044921875, 2.821441650390625, 6.69482421875, 0.38128662109375, 6.7960052490234375, -0.7649383544921875, 17.69891357421875, 0.965850830078125, 18.0321044921875, -5.260162353515625, 13.40707778930664, 31.73968505859375, 14.42486572265625, -1.410888671875, 3.218780517578125, -0.1531829833984375, -1.204132080078125, 31.7720947265625, 13.429443359375, -2.8095703125, 24.14898681640625, -15.40631103515625, -3.963775634765625, -2.294647216796875, -5.621307373046875, 22.580535888671875, -2.100006103515625, 12.135467529296875, 4.263214111328125, 6.1357879638671875, -1.781402587890625, 8.186897277832031, 15.458221435546875, -12.7122802734375, 1.113616943359375, 21.879547119140625, 14.188705444335938, 7.466217041015625, 13.3211669921875, 0.071807861328125, 15.047637939453125, 5.921630859375, 19.373611450195312, -9.25048828125, -1.8152236938476562, 15.872467041015625, -4.533302307128906, -0.23801422119140625, 1.8566932678222656, 3.7144927978515625, 
-2.060546875, 21.779754638671875, 6.456085205078125, 6.065895080566406, -1.8780975341796875, 7.582542419433594, 8.897003173828125, 1.0659103393554688, 7.433624267578125, 3.3132247924804688, 5.136474609375, 2.3016357421875, 16.163589477539062, 33.9921875, 3.604339599609375, 6.290435791015625, -5.2458038330078125, -5.0347747802734375, 6.58770751953125, 5.7916259765625, 5.4894561767578125, -0.53924560546875, 2.765380859375, 3.630889892578125, 7.110200881958008, -1.0179710388183594, 8.6807861328125, 3.20123291015625, -1.927459716796875, 7.081897735595703, 12.089385986328125, 12.9735107421875, -4.4234771728515625, 15.02703857421875, 6.46563720703125, 20.81463623046875, 3.9476776123046875, -17.7646484375, 11.17449951171875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000098.npy"}
{"epoch": 0.20523560209424083, "step": 99, "batch_size": 128, "mean": 4.993054389953613, "std": 10.769243240356445, "min": -25.3370361328125, "p10": -7.6095336914062495, "median": 4.894968032836914, "p90": 18.446112060546874, "max": 26.943679809570312, "pos_frac": 0.71875, "sample": [-18.5965576171875, 0.473236083984375, -12.07073974609375, 26.943679809570312, 4.33966064453125, 11.961090087890625, -0.6114959716796875, 17.724029541015625, -0.7589111328125, -4.38311767578125, 5.0673980712890625, -12.384811401367188, 0.068023681640625, 5.266426086425781, -7.4388427734375, 16.058258056640625, 17.748931884765625, 8.971061706542969, 8.489410400390625, -2.1384124755859375, 5.134158134460449, 3.3426742553710938, 9.802001953125, -18.52288818359375, 25.40032958984375, 7.3989715576171875, 4.859067916870117, -9.75970458984375, 6.5897064208984375, 16.56597900390625, 13.958740234375, -3.9404296875, 21.11865234375, 2.2133255004882812, 7.1253662109375, 0.37469482421875, 3.79290771484375, 8.940994262695312, 25.6962890625, 5.0099639892578125, 11.088729858398438, 16.173675537109375, 6.244720458984375, 16.38897705078125, 6.8021392822265625, -4.3616485595703125, 13.416023254394531, 2.143871307373047, -0.4160919189453125, -0.82904052734375, -13.490333557128906, -22.343017578125, -19.48681640625, 11.260452270507812, 10.649185180664062, -4.5658111572265625, 13.991430282592773, -18.759780883789062, 15.179351806640625, 9.005523681640625, 5.31024169921875, 1.6147689819335938, 20.41595458984375, 1.22503662109375, -4.4116668701171875, 17.885543823242188, -25.3370361328125, 5.4700927734375, 3.3545074462890625, 19.106292724609375, 2.235748291015625, 16.0400390625, -2.615966796875, 0.7062664031982422, -1.95159912109375, 7.5213623046875, 4.930868148803711, 19.7459716796875, 5.4141693115234375, 3.225048065185547, 8.262054443359375, 0.7600326538085938, -0.745147705078125, 4.0196533203125, 17.210845947265625, 4.2026519775390625, 12.479400634765625, -3.676116943359375, 7.289764404296875, 
3.762847900390625, -8.0078125, 2.767803192138672, 1.1277923583984375, 7.594184875488281, 18.34564208984375, 0.7575607299804688, 16.1912841796875, -13.30828857421875, 1.5415058135986328, -6.15679931640625, 22.0777587890625, -0.841705322265625, 0.0, 4.992420196533203, -0.020538330078125, 18.6805419921875, 1.6411361694335938, -3.58575439453125, 16.546783447265625, 23.242507934570312, 22.616897583007812, 13.558944702148438, -23.935455322265625, 15.68328857421875, 2.3928585052490234, 8.140274047851562, 12.85687255859375, 22.904327392578125, -1.077117919921875, 2.1369361877441406, 2.7873077392578125, -2.0206298828125, 14.013397216796875, -0.9090118408203125, 14.425933837890625, 6.402618408203125, 21.385498046875, 2.789703369140625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000099.npy"}
{"epoch": 0.20732984293193718, "step": 100, "batch_size": 128, "mean": 4.450150012969971, "std": 9.689053535461426, "min": -17.4171142578125, "p10": -6.708909606933593, "median": 3.206845283508301, "p90": 18.870234680175777, "max": 27.71453857421875, "pos_frac": 0.703125, "sample": [7.38134765625, 2.469451904296875, 12.282119750976562, 4.00250244140625, 16.16009521484375, 5.10699462890625, -9.523757934570312, 18.30316162109375, 0.45025634765625, 9.40313720703125, -3.6358642578125, 0.30426025390625, 0.7701416015625, 5.032928466796875, 14.00274658203125, 26.763153076171875, 0.9723167419433594, -0.422760009765625, -8.530731201171875, 8.5618896484375, 0.813690185546875, 4.94818115234375, -8.74407958984375, -1.5805892944335938, 0.8509368896484375, 9.74945068359375, 3.0760040283203125, -7.6198272705078125, 0.3278541564941406, 14.904937744140625, -12.752044677734375, 2.07916259765625, -0.9057769775390625, 3.97894287109375, 2.9725341796875, 4.421684265136719, 0.5037841796875, 20.193405151367188, 2.9870223999023438, 14.49371337890625, -3.49822998046875, 11.979217529296875, 6.5040283203125, 1.58154296875, -1.3470649719238281, 6.6439208984375, 7.83050537109375, 21.86810302734375, 23.030029296875, 13.347625732421875, -1.8893814086914062, -6.7548065185546875, -5.445125579833984, 5.226352691650391, 3.337686538696289, 11.05029296875, -13.1024169921875, -3.4296875, -17.4171142578125, 1.8497848510742188, -0.13585662841796875, 26.46881103515625, -5.639434814453125, 5.0619354248046875, 0.18609619140625, 5.8814697265625, -4.4733123779296875, -14.192794799804688, 4.946624755859375, 6.62603759765625, 27.71453857421875, 9.499542236328125, 3.898468017578125, 6.058481216430664, -0.5405197143554688, -6.309600830078125, -6.689239501953125, 6.729888916015625, 0.7886276245117188, 11.677934646606445, -12.089012145996094, 4.424919128417969, 4.8660888671875, 2.574188232421875, 6.34027099609375, 3.3447513580322266, -3.5264892578125, -1.0800628662109375, 17.03363037109375, 26.61651611328125, 
0.592803955078125, 3.6128387451171875, -5.1872711181640625, 0.67169189453125, 7.230712890625, 0.6606369018554688, 3.966846466064453, 2.3505897521972656, -0.013671875, 12.559860229492188, 23.581512451171875, -0.34967041015625, 24.80413818359375, 16.104949951171875, 4.76470947265625, -4.788818359375, 16.194854736328125, 25.117355346679688, 0.6149349212646484, 1.3242645263671875, 26.797271728515625, 5.38970947265625, -0.34534454345703125, -3.5074234008789062, -11.115142822265625, 14.672393798828125, 0.010776519775390625, 4.26287841796875, 21.00653076171875, -11.05426025390625, 0.4872550964355469, 9.23077392578125, 14.70977783203125, -3.9196205139160156, 9.456729888916016, -1.15106201171875, 21.467193603515625, -12.572601318359375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000100.npy"}
|
||||
{"epoch": 0.2094240837696335, "step": 101, "batch_size": 128, "mean": 4.6193437576293945, "std": 10.441849708557129, "min": -23.737060546875, "p10": -5.862303924560545, "median": 4.66778564453125, "p90": 16.63836669921875, "max": 34.04327392578125, "pos_frac": 0.734375, "sample": [6.1195220947265625, -23.737060546875, 6.1769561767578125, 13.954193115234375, 3.4356918334960938, 2.7196044921875, 3.628887176513672, 4.792724609375, -4.3167724609375, -0.715484619140625, 17.340301513671875, 11.896636962890625, -4.68798828125, 21.284515380859375, 31.1314697265625, 27.14935302734375, 2.601686477661133, 4.1103363037109375, -1.3188438415527344, 12.309768676757812, 7.1689453125, 5.067436218261719, 11.2344970703125, 8.390106201171875, 1.603342056274414, 21.158584594726562, 7.38067626953125, 2.025543212890625, 0.854949951171875, 30.08294677734375, -2.550537109375, 6.92645263671875, 4.317737579345703, 6.2618408203125, 14.466064453125, 9.277847290039062, 22.869461059570312, 16.62774658203125, 12.408500671386719, -0.05450439453125, -9.24072265625, 3.210693359375, 12.29364013671875, 9.700393676757812, 19.066932678222656, 16.66314697265625, 5.36126708984375, -4.5485992431640625, 0.052242279052734375, -12.71356201171875, -1.69305419921875, 6.121612548828125, -16.780853271484375, 11.255996704101562, 22.2103271484375, 7.7456817626953125, 7.9643707275390625, -18.035888671875, -1.588815689086914, 3.6889190673828125, -4.7899627685546875, 8.240570068359375, 5.510711669921875, 11.522193908691406, 1.690887451171875, -3.7194881439208984, 5.922767639160156, 11.44305419921875, -1.4494781494140625, 13.55645751953125, 1.441131591796875, 3.8468456268310547, 1.88226318359375, 4.584625244140625, 1.135711669921875, 2.2273788452148438, 20.492889404296875, 1.6844921112060547, 3.3992767333984375, -21.77520751953125, -21.948333740234375, 9.445953369140625, -9.765769958496094, 13.192138671875, 4.10235595703125, 3.6577301025390625, 13.2513427734375, 14.591182708740234, 1.1869773864746094, 
-2.564666748046875, 9.997459411621094, 5.662609100341797, 7.700202941894531, 4.750946044921875, 8.597747802734375, 2.1869964599609375, -2.924713134765625, 29.496795654296875, 14.49462890625, 5.0863494873046875, -18.19818115234375, -21.49169921875, 9.545377731323242, 1.29998779296875, 5.3197021484375, -2.1558837890625, 7.8565673828125, 5.0880889892578125, 34.04327392578125, -6.780738830566406, 9.14422607421875, -0.06329345703125, 12.94207763671875, -3.2801513671875, -0.23883819580078125, 6.199005126953125, -2.0847625732421875, -5.1394805908203125, -5.46868896484375, 6.8844146728515625, -8.547653198242188, 0.05206298828125, 2.7991943359375, 0.5234451293945312, -14.065696716308594, 1.3228845596313477, 8.096923828125, 10.501911163330078], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000101.npy"}
|
||||
{"epoch": 0.21151832460732983, "step": 102, "batch_size": 128, "mean": 6.333121299743652, "std": 9.580653190612793, "min": -18.486785888671875, "p10": -6.0606552124023425, "median": 6.121896743774414, "p90": 18.143101501464844, "max": 32.1053466796875, "pos_frac": 0.7265625, "sample": [-5.4217376708984375, -1.8483123779296875, 12.670402526855469, 6.146892547607422, 2.132171630859375, 18.809814453125, 7.682952880859375, -0.6190719604492188, 2.5343017578125, 13.888946533203125, 6.0539398193359375, 6.4510040283203125, 11.369186401367188, 7.328369140625, -0.1874847412109375, 15.585594177246094, 10.753814697265625, -9.791534423828125, -2.4332733154296875, -1.6002120971679688, -3.0173187255859375, 11.06439208984375, 13.11920166015625, 23.12982177734375, 17.702865600585938, -2.995758056640625, 6.495075225830078, -8.04473876953125, 5.1005859375, -7.82867431640625, -6.709114074707031, 7.021575927734375, 3.4707565307617188, 7.7935943603515625, -6.6752166748046875, 21.70001220703125, 14.432037353515625, -1.864410400390625, 1.547821044921875, 0.7720241546630859, -1.46014404296875, 1.0959014892578125, -1.638814926147461, 16.98101043701172, 8.44805908203125, 4.60333251953125, 2.8039321899414062, -8.29632568359375, -8.93597412109375, -7.43878173828125, 18.846466064453125, -5.7329254150390625, 6.152626037597656, 5.6632080078125, 4.1136016845703125, 15.717620849609375, 7.7571258544921875, 32.1053466796875, 9.3765869140625, 15.26416015625, 1.5530014038085938, -3.00592041015625, 18.087158203125, 11.384857177734375, 8.08795166015625, -5.797271728515625, -9.232589721679688, 3.170257568359375, -3.8781471252441406, 6.267230987548828, 11.05303955078125, 16.594749450683594, 21.83647918701172, 15.6072998046875, 5.565471649169922, 18.822952270507812, 18.784252166748047, 5.955902099609375, 8.14422607421875, -14.898681640625, 0.8267726898193359, 10.409332275390625, 0.0, 6.1497802734375, -1.0808563232421875, 18.4306640625, 0.0, 6.096900939941406, 13.4176025390625, -0.1640148162841797, 
13.41180419921875, 13.989532470703125, -2.3279876708984375, 2.501434326171875, 16.949798583984375, 4.91583251953125, 12.293792724609375, 1.6598434448242188, 10.6722412109375, 16.27368927001953, 17.833038330078125, 3.9337081909179688, 3.1534271240234375, 24.64288330078125, 6.608001708984375, 10.674407958984375, 2.831451416015625, 18.273635864257812, 1.121307373046875, 18.055038452148438, -18.486785888671875, 10.146240234375, -2.402679443359375, 3.3019256591796875, 13.249542236328125, -14.373687744140625, 16.04918670654297, 2.4543609619140625, 30.7047119140625, 3.67926025390625, 17.654754638671875, 5.27911376953125, 11.565093994140625, 9.765411376953125, 9.08990478515625, 29.171661376953125, -0.827056884765625, -12.18304443359375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000102.npy"}
|
||||
{"epoch": 0.2136125654450262, "step": 103, "batch_size": 128, "mean": 4.7686767578125, "std": 10.844796180725098, "min": -23.13238525390625, "p10": -6.495713043212888, "median": 2.2309799194335938, "p90": 18.885723876953122, "max": 47.12469482421875, "pos_frac": 0.703125, "sample": [-20.0303955078125, -8.13995361328125, 6.651458740234375, 10.840255737304688, 17.017578125, 16.088863372802734, -0.27972412109375, -2.898426055908203, 25.397674560546875, 1.227935791015625, -1.70013427734375, 16.67724609375, 21.50079345703125, 1.820831298828125, 0.0, -1.81304931640625, -15.439697265625, -10.87908935546875, -4.7587890625, 6.6544189453125, -0.8026351928710938, 1.6211624145507812, 9.78253173828125, -23.13238525390625, 1.9304733276367188, 5.396728515625, 1.0385818481445312, 2.4234771728515625, 5.020294189453125, 0.4856414794921875, 11.301727294921875, 9.200729370117188, 0.13942718505859375, 5.93402099609375, -5.12152099609375, 7.6665496826171875, -2.1359634399414062, 16.970687866210938, 9.173751831054688, 7.5244140625, -1.8675918579101562, 13.927169799804688, 2.038482666015625, 29.862213134765625, -1.8136062622070312, 12.814407348632812, 2.5568695068359375, 8.713165283203125, 13.53515625, 47.12469482421875, 17.748245239257812, 14.539154052734375, 8.501251220703125, -10.18585205078125, 1.4478912353515625, 0.462738037109375, -9.917732238769531, 9.15460205078125, 9.11865234375, 10.408447265625, 18.105987548828125, 8.072967529296875, 1.2639694213867188, 11.762908935546875, 4.8849334716796875, 21.459503173828125, 11.084640502929688, 12.651588439941406, 3.4242630004882812, 6.90374755859375, 0.4466705322265625, -0.6501312255859375, 0.4621124267578125, 3.6583251953125, 0.5282440185546875, 0.5518512725830078, -8.345340728759766, -18.63330078125, 0.82012939453125, 1.9125213623046875, 22.568023681640625, 14.239204406738281, 0.2564220428466797, 12.85089111328125, -1.6497764587402344, 10.12396240234375, 5.662834167480469, 1.8818283081054688, 26.344314575195312, -2.02197265625, 
2.4776458740234375, 1.3154296875, 15.652374267578125, 6.835723876953125, -3.2939300537109375, 1.6779594421386719, 24.386566162109375, -2.2254791259765625, -5.747161865234375, -13.26177978515625, 0.0, 3.3387222290039062, 3.1544265747070312, 18.39630126953125, -2.641510009765625, 21.337646484375, -11.7138671875, 1.79718017578125, 1.2374420166015625, 12.059810638427734, 12.03228759765625, -5.1175689697265625, -2.9828224182128906, 20.0277099609375, 1.4199066162109375, 0.23433876037597656, 22.097930908203125, 24.3038330078125, 9.423187255859375, 21.074798583984375, -2.2266845703125, 6.732673645019531, -2.363067626953125, -15.2437744140625, 1.8208160400390625, -20.09613037109375, -5.791038513183594, -0.88446044921875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000103.npy"}
|
||||
{"epoch": 0.2157068062827225, "step": 104, "batch_size": 128, "mean": 4.4510817527771, "std": 11.64864730834961, "min": -26.803985595703125, "p10": -8.477449798583985, "median": 3.3085803985595703, "p90": 20.088406372070306, "max": 48.58636474609375, "pos_frac": 0.6796875, "sample": [4.447296142578125, -4.236907958984375, 8.015426635742188, 16.31378173828125, -11.3182373046875, 6.545440673828125, 4.942176818847656, -0.5566864013671875, 21.259765625, -5.85888671875, 16.61669921875, -8.263240814208984, 4.79022216796875, -17.026519775390625, 7.080169677734375, 7.869915008544922, 26.82470703125, 19.586395263671875, -1.6829833984375, -3.6700897216796875, 2.8134613037109375, -2.5112361907958984, 7.001708984375, -23.1212158203125, 25.983917236328125, -4.4630126953125, 3.0574798583984375, -1.4787006378173828, 2.7867603302001953, 25.84466552734375, 7.224365234375, 15.260498046875, -13.96435546875, -0.6749114990234375, 2.78375244140625, -8.594573974609375, 1.0785446166992188, -8.427253723144531, 23.1334228515625, -3.5760498046875, 12.48907470703125, 2.103240966796875, 1.011749267578125, -10.8570556640625, -8.78900146484375, 12.342803955078125, 1.0225982666015625, 9.1273193359375, -2.7416229248046875, -1.853363037109375, -4.628173828125, 9.272811889648438, 8.885940551757812, 4.067413330078125, 26.883148193359375, 5.528163909912109, 21.9598388671875, -5.709648132324219, -3.1597900390625, 2.006378173828125, 4.745792388916016, 9.910888671875, 5.380279541015625, 7.0966796875, 5.796783447265625, 3.556415557861328, -1.2484970092773438, 15.83538818359375, 4.7496337890625, -14.904571533203125, 11.30078125, 17.867645263671875, 0.693634033203125, -26.803985595703125, 48.58636474609375, 5.628013610839844, 22.537445068359375, 10.007164001464844, 12.018218994140625, 23.652679443359375, 5.9664154052734375, -26.537017822265625, -10.4517822265625, 12.037094116210938, 1.9626007080078125, -0.7763595581054688, 1.1789436340332031, 7.5755615234375, -8.322021484375, 18.344955444335938, 
23.65728759765625, 0.1141510009765625, 4.7740631103515625, 10.386322021484375, 1.0317306518554688, 0.27022552490234375, -7.66278076171875, 0.3404083251953125, 24.258636474609375, 12.151565551757812, 14.967987060546875, 29.2227783203125, -5.3497161865234375, 15.96124267578125, -18.302581787109375, 1.5409393310546875, -5.775665283203125, 2.9961700439453125, 4.795684814453125, -15.57952880859375, 0.0, 9.783355712890625, -1.4702835083007812, 14.227363586425781, 3.0607452392578125, 6.519866943359375, 11.8790283203125, -5.727088928222656, 7.8162994384765625, 1.127777099609375, -0.37603759765625, 14.714859008789062, 2.5794105529785156, 6.5234375, 16.71820068359375, -2.242095947265625, 0.3599395751953125, 2.2620773315429688], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000104.npy"}
|
||||
{"epoch": 0.21780104712041884, "step": 105, "batch_size": 128, "mean": 8.074159622192383, "std": 11.077836990356445, "min": -22.187835693359375, "p10": -4.81409912109375, "median": 7.531219482421875, "p90": 22.755194091796874, "max": 32.9569091796875, "pos_frac": 0.796875, "sample": [5.5986480712890625, 3.7276763916015625, 14.539688110351562, 1.7966327667236328, -7.6396484375, 19.0576171875, 8.970207214355469, 15.5469970703125, 11.0120849609375, 2.939495086669922, 14.782867431640625, 6.0399322509765625, 21.98394775390625, 18.906478881835938, -10.2940673828125, 22.725311279296875, 6.271385192871094, -1.35687255859375, 26.395980834960938, 10.905853271484375, -0.6300773620605469, -2.571868896484375, 7.54473876953125, 7.20892333984375, 22.824920654296875, 28.9464111328125, -22.000213623046875, 1.0184783935546875, -4.57928466796875, 27.7894287109375, 5.455047607421875, 13.632339477539062, 0.6659317016601562, 17.7178955078125, -2.84454345703125, 20.56732177734375, 24.2943115234375, 3.7131805419921875, 18.82159423828125, -10.873260498046875, 4.36968994140625, 21.02825927734375, 13.66815185546875, 15.808746337890625, 1.4382781982421875, 19.381805419921875, 9.23638916015625, -2.178131103515625, 24.113800048828125, 19.77301025390625, 2.8378219604492188, 12.410430908203125, 0.353271484375, 13.4285888671875, -2.82952880859375, 13.413116455078125, 10.4620361328125, 25.025604248046875, 8.301948547363281, -10.44940185546875, 3.243377685546875, -9.996063232421875, 0.1844482421875, 3.7922821044921875, 5.107940673828125, 8.15533447265625, 16.287384033203125, -0.8773193359375, 2.8041133880615234, 3.265960693359375, 20.680572509765625, -20.3519287109375, 17.598209381103516, 21.705474853515625, 7.5177001953125, -4.691436767578125, 12.517578125, 17.849365234375, 25.377685546875, 2.3099441528320312, 7.88299560546875, 3.27630615234375, 7.775665283203125, 0.7655982971191406, 6.63604736328125, 19.051788330078125, -11.37554931640625, 22.917205810546875, 7.649791717529297, 
-22.187835693359375, 3.128826141357422, 10.81414794921875, 23.998046875, -11.4451904296875, 7.9425506591796875, 1.250091552734375, 9.231363296508789, -7.679718017578125, 11.8446044921875, 0.0, 4.582582473754883, 7.06781005859375, 0.8002185821533203, -0.16582489013671875, 14.758598327636719, -2.2701568603515625, 32.10675048828125, 27.06561279296875, 6.5301361083984375, 5.132560729980469, 8.644317626953125, 21.032745361328125, 0.3926200866699219, 4.2234649658203125, 17.900054931640625, 12.23675537109375, -5.100311279296875, 11.92340087890625, 2.6720809936523438, -8.0740966796875, 8.807907104492188, 21.71263885498047, 3.9836273193359375, 4.176250457763672, -0.532623291015625, 32.9569091796875, 19.796409606933594, 8.96917724609375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000105.npy"}
|
||||
{"epoch": 0.2198952879581152, "step": 106, "batch_size": 128, "mean": 5.517203330993652, "std": 11.730107307434082, "min": -21.132965087890625, "p10": -8.719464111328124, "median": 5.165912628173828, "p90": 20.869711303710936, "max": 36.572235107421875, "pos_frac": 0.703125, "sample": [-6.2998199462890625, -16.030487060546875, 14.206329345703125, -2.71856689453125, -3.647247314453125, 19.66845703125, -6.6441802978515625, 7.69903564453125, -13.740570068359375, 15.117080688476562, -14.53485107421875, 0.0103912353515625, 5.52398681640625, 19.765716552734375, 12.601806640625, 31.57562255859375, 26.40911865234375, -8.41607666015625, -4.6279754638671875, -2.3754730224609375, 14.365726470947266, 4.17950439453125, 3.0318756103515625, 4.942741394042969, 12.30462646484375, 0.9922943115234375, 3.5543041229248047, 6.516754150390625, 12.214408874511719, 25.908233642578125, 1.612030029296875, 9.080146789550781, 16.961273193359375, 1.85968017578125, -0.02544403076171875, -5.735984802246094, 10.842620849609375, 2.5037384033203125, 6.426116943359375, 7.811164855957031, 20.28802490234375, 1.4704742431640625, 8.168724060058594, -18.0556640625, -3.951904296875, 14.52801513671875, -7.27484130859375, 0.416961669921875, 13.307655334472656, 5.249977111816406, 8.26220703125, 8.431625366210938, -5.088981628417969, 17.749786376953125, -18.614059448242188, -21.132965087890625, 10.111297607421875, 32.8375244140625, 28.617156982421875, 20.74237060546875, 5.08184814453125, 8.293914794921875, -9.4273681640625, -7.020263671875, 13.2445068359375, 0.06690216064453125, -6.846771240234375, -4.929435729980469, 23.2406005859375, -11.11688232421875, 12.4725341796875, 6.474365234375, -1.5622329711914062, 6.49078369140625, -12.58404541015625, 1.2332229614257812, -2.546661376953125, 2.49774169921875, 18.30499267578125, 2.3839263916015625, 1.0118560791015625, 36.572235107421875, 7.59893798828125, 1.7615890502929688, -4.6778564453125, 2.87109375, 21.892181396484375, -0.4035186767578125, 16.06500244140625, 
-10.075469970703125, 9.789306640625, 4.848419189453125, 16.850982666015625, 0.6441497802734375, 2.0794906616210938, 5.307525634765625, 12.388702392578125, 20.408233642578125, 5.45111083984375, 7.6132354736328125, 28.751129150390625, -15.51495361328125, 0.6170806884765625, -2.4884414672851562, 6.817230224609375, 12.747554779052734, 13.385627746582031, -11.900054931640625, 3.480804443359375, -6.552581787109375, -9.926116943359375, -5.41204833984375, 32.49072265625, 7.076210021972656, -7.170806884765625, 5.60491943359375, 12.326858520507812, -5.291450500488281, 21.166839599609375, 2.454345703125, 21.95330810546875, 6.221221923828125, 10.121795654296875, 24.246551513671875, 19.963851928710938, -0.7328567504882812, 10.119705200195312, 0.9451751708984375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000106.npy"}
|
||||
{"epoch": 0.22198952879581152, "step": 107, "batch_size": 128, "mean": 7.409460544586182, "std": 12.461726188659668, "min": -58.18084716796875, "p10": -5.492109680175781, "median": 6.225532531738281, "p90": 23.571269226074214, "max": 35.061981201171875, "pos_frac": 0.7421875, "sample": [1.4282073974609375, -3.9549102783203125, 4.5140380859375, 6.13525390625, 10.8836669921875, -5.6683349609375, 13.313995361328125, 26.442230224609375, 4.6387939453125, 4.21942138671875, -5.422264099121094, 5.4924774169921875, 1.2936553955078125, 13.037429809570312, 13.146846771240234, -1.74530029296875, 12.705291748046875, -7.486572265625, 10.57464599609375, 19.39447021484375, -2.959209442138672, 21.847640991210938, 7.5062255859375, 12.533660888671875, -12.149017333984375, 28.7130126953125, 7.005157470703125, -12.124542236328125, 17.7353515625, -6.5926361083984375, 19.208282470703125, 1.2482147216796875, 2.04083251953125, 19.000404357910156, 21.05804443359375, 21.9825439453125, 11.451416015625, 6.3158111572265625, 32.7967529296875, 0.81060791015625, 26.884811401367188, 6.808258056640625, 7.1176605224609375, 13.014678955078125, 3.7965927124023438, 1.4804668426513672, -3.2528076171875, 6.9359893798828125, 14.072835922241211, 6.8884124755859375, 8.3004150390625, -0.45909881591796875, 4.174568176269531, 4.07855224609375, 22.621826171875, -5.655082702636719, 3.93927001953125, 1.5365791320800781, 19.9095458984375, 10.171493530273438, 7.918243408203125, 2.4513397216796875, -3.15216064453125, 33.2354736328125, -0.417236328125, -6.5238800048828125, 4.886985778808594, 32.85205078125, 22.75922393798828, 17.48736572265625, -11.687896728515625, 24.576248168945312, -4.073333740234375, 14.960037231445312, 23.14056396484375, 2.088939666748047, 2.71258544921875, 19.64593505859375, 12.2728271484375, 11.88918685913086, 27.70404052734375, 14.926055908203125, -1.834136962890625, -11.013885498046875, 0.61260986328125, 5.0258636474609375, 6.640922546386719, -1.1596908569335938, 35.061981201171875, 
-8.690406799316406, 30.025970458984375, 12.384063720703125, -1.0987548828125, 7.478118896484375, 4.865966796875, 1.292776107788086, 14.9068603515625, 1.29296875, -0.4833221435546875, 0.34305763244628906, -58.18084716796875, -9.69073486328125, 12.04803466796875, 26.68988037109375, 16.51318359375, -2.9048309326171875, 27.2523193359375, -2.662872314453125, 7.2132415771484375, 1.5338516235351562, 19.507781982421875, -0.618865966796875, 10.286605834960938, 3.4237823486328125, 7.31365966796875, 17.55499267578125, -3.450531005859375, -0.91839599609375, -6.4249267578125, 0.01019287109375, 0.56048583984375, 8.596832275390625, -5.04229736328125, 7.29278564453125, -0.19154739379882812, 31.2615966796875, 5.025947570800781, 18.3714599609375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000107.npy"}
|
||||
{"epoch": 0.22408376963350785, "step": 108, "batch_size": 128, "mean": 8.221390724182129, "std": 10.990814208984375, "min": -19.942657470703125, "p10": -4.836509704589844, "median": 7.301525115966797, "p90": 24.868849182128905, "max": 34.252044677734375, "pos_frac": 0.765625, "sample": [24.642196655273438, 15.010025024414062, -2.4071121215820312, 6.6925506591796875, 13.809661865234375, -7.139251708984375, 19.594806671142578, 9.093639373779297, 3.682168960571289, 11.482864379882812, 1.8068695068359375, -4.83502197265625, -4.8399810791015625, 11.41107177734375, 4.4915313720703125, -7.561149597167969, 27.67230224609375, 6.05828857421875, -2.5386886596679688, 9.71075439453125, 15.77197265625, 5.700164794921875, 11.665924072265625, -9.2232666015625, -0.7093658447265625, 9.14813232421875, 26.700927734375, -2.5537109375, 0.0, -19.942657470703125, 14.466548919677734, 3.205902099609375, 9.98291015625, 5.042900085449219, 16.53021240234375, 2.832794189453125, 28.3609619140625, 7.35992431640625, 17.44232177734375, -0.10532569885253906, 3.4656829833984375, 9.811737060546875, 2.98834228515625, -4.081085205078125, 3.795623779296875, 27.954330444335938, 10.67767333984375, -14.4501953125, 4.8343505859375, 25.397705078125, 14.231857299804688, 4.096923828125, 9.400421142578125, 34.252044677734375, 6.286174774169922, 1.3461151123046875, 0.0, -4.92828369140625, 1.7134876251220703, 0.013650894165039062, 2.142091751098633, 26.3104248046875, 14.944305419921875, 9.26812744140625, -16.880218505859375, 8.566253662109375, 19.300994873046875, 11.555320739746094, -0.6160354614257812, 11.371368408203125, 10.593963623046875, 5.175102233886719, -0.994110107421875, 1.660003662109375, -3.6713714599609375, 24.31207275390625, 21.57916259765625, 2.3032302856445312, 13.3961181640625, 18.531585693359375, 11.65936279296875, -6.279029846191406, 27.948974609375, 12.3779296875, 4.161048889160156, -0.001617431640625, 22.91961669921875, 10.995002746582031, 8.910934448242188, 28.797637939453125, 
7.1478271484375, 20.993865966796875, 2.546173095703125, 10.574310302734375, 8.127086639404297, -1.607452392578125, 31.2044677734375, 15.969696044921875, 27.781829833984375, 19.38702392578125, 15.54306411743164, 5.1047821044921875, 8.417144775390625, 0.0, -1.9934539794921875, 9.728599548339844, 7.243125915527344, 32.108123779296875, 5.522457122802734, 23.87066650390625, 5.69439697265625, -8.500213623046875, 3.97796630859375, 1.2406806945800781, 16.806602478027344, 27.621917724609375, 21.9603271484375, 3.13714599609375, 2.8305511474609375, 11.0408935546875, -11.204925537109375, -12.102394104003906, -10.73598861694336, 7.79669189453125, -0.2097320556640625, 10.62353515625, 2.841339111328125, 17.190277099609375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000108.npy"}
|
||||
{"epoch": 0.2261780104712042, "step": 109, "batch_size": 128, "mean": 8.469661712646484, "std": 13.746404647827148, "min": -29.982666015625, "p10": -6.592902183532714, "median": 6.727773666381836, "p90": 24.178425598144532, "max": 60.34912109375, "pos_frac": 0.7578125, "sample": [4.069999694824219, 13.112716674804688, -10.317352294921875, 24.150131225585938, 4.5319976806640625, 7.0681915283203125, 15.337020874023438, 16.180450439453125, -10.360382080078125, 6.1694183349609375, 6.7427978515625, 20.410400390625, 0.2717571258544922, -3.0284881591796875, -7.6067047119140625, -3.853759765625, -0.901336669921875, 30.44940185546875, 42.9053955078125, 1.2443351745605469, 0.8267860412597656, 0.991973876953125, -2.91583251953125, -13.828216552734375, -6.347177505493164, 12.520606994628906, 1.3515625, -1.2710456848144531, -0.4778251647949219, 10.7213134765625, 6.265235900878906, -2.0394439697265625, 16.106781005859375, 36.66363525390625, 25.054718017578125, 9.624610900878906, 13.801895141601562, 21.93914794921875, 1.8802413940429688, 20.571929931640625, 12.278900146484375, 19.289947509765625, 5.606315612792969, -0.4395427703857422, 14.80145263671875, 33.036651611328125, 0.4633941650390625, 28.6583251953125, 7.6790008544921875, 6.712749481201172, 12.813735961914062, 6.341094970703125, 8.9354248046875, 5.2347412109375, -22.040863037109375, 6.405548095703125, 7.157806396484375, 7.075580596923828, 13.029182434082031, 8.654693603515625, 16.44134521484375, 3.0215225219726562, -0.83099365234375, 23.080841064453125, 6.603759765625, 11.496192932128906, -4.761810302734375, 9.4942626953125, 9.558937072753906, 12.235282897949219, 17.577423095703125, 3.05926513671875, -7.4366607666015625, 17.169647216796875, 10.962043762207031, 24.24444580078125, 48.55853271484375, -3.826995849609375, 1.3809890747070312, 29.89959716796875, 4.80078125, 5.256866455078125, 19.786163330078125, 13.781265258789062, 0.851165771484375, 1.051025390625, 14.377105712890625, 1.160858154296875, -9.037139892578125, 
40.6146240234375, 23.04095458984375, 19.784332275390625, 6.522983551025391, 60.34912109375, 8.503662109375, -0.654205322265625, 1.8864517211914062, 35.3790283203125, -10.490875244140625, 3.6947402954101562, -8.053863525390625, 17.976608276367188, 0.0, 0.1284027099609375, 30.240936279296875, 8.528640747070312, 14.08294677734375, 10.71759033203125, -1.1986083984375, 11.0557861328125, 0.55511474609375, 6.448829650878906, 1.277496337890625, -2.4484176635742188, 20.704147338867188, -0.920684814453125, 12.7332763671875, 18.60540771484375, -12.58447265625, -29.982666015625, -28.399436950683594, 23.10302734375, 4.98974609375, 17.444183349609375, 7.2099609375, -7.166259765625, -4.592376708984375, 15.363800048828125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000109.npy"}
|
||||
{"epoch": 0.22827225130890053, "step": 110, "batch_size": 128, "mean": 8.108083724975586, "std": 13.666289329528809, "min": -39.945709228515625, "p10": -6.39829864501953, "median": 5.7715911865234375, "p90": 26.716795349121092, "max": 43.822021484375, "pos_frac": 0.7265625, "sample": [-0.5173492431640625, -39.945709228515625, -11.204872131347656, 28.843597412109375, -2.5214385986328125, 1.5060863494873047, -3.4248733520507812, 1.52734375, -0.80816650390625, 27.7860107421875, 5.18780517578125, 3.2053985595703125, 2.8590660095214844, 23.19476318359375, -5.0381927490234375, -0.7126388549804688, 12.28704833984375, 5.488960266113281, 17.701431274414062, 3.66668701171875, -8.180809020996094, 17.80255126953125, 2.5883445739746094, 25.55712890625, 7.82977294921875, 3.1616058349609375, -5.9369049072265625, 13.540046691894531, 1.67193603515625, -3.6711578369140625, 28.752685546875, 14.262252807617188, 25.835205078125, 2.5670204162597656, 35.8668212890625, -2.63751220703125, 17.493453979492188, 17.187301635742188, 11.697845458984375, -4.117645263671875, 6.7503509521484375, 6.78961181640625, 11.959434509277344, 27.50830078125, -10.956451416015625, 0.5804481506347656, 15.881759643554688, -8.419586181640625, 13.006507873535156, 0.7633438110351562, 21.430206298828125, -20.864273071289062, 10.915283203125, 1.0087738037109375, 32.256378173828125, 12.437896728515625, 4.295921325683594, -31.282272338867188, 15.618499755859375, 20.73779296875, 4.493488311767578, 5.0268402099609375, 2.219482421875, 1.1055908203125, 1.531463623046875, 27.791595458984375, -7.597076416015625, 27.278106689453125, 19.48760986328125, -15.773529052734375, -0.6890239715576172, 18.182861328125, 5.2172088623046875, 3.5064163208007812, 11.736572265625, 7.6434478759765625, -0.0037078857421875, -3.4209518432617188, -1.771881103515625, 11.226043701171875, 0.0, 4.695037841796875, 24.5909423828125, 15.159530639648438, -10.394020080566406, 16.477813720703125, 0.6029872894287109, -7.474884033203125, 5.499725341796875, 
27.276626586914062, 16.573745727539062, 36.87689208984375, 16.036331176757812, 25.925811767578125, 6.04345703125, 17.7965087890625, 24.94866943359375, -1.873382568359375, -2.1965713500976562, 18.2552490234375, 26.47686767578125, 7.181896209716797, 1.4190998077392578, 25.36773681640625, -17.01030731201172, 9.2982177734375, 3.9247512817382812, -5.319366455078125, 24.985870361328125, 20.28692626953125, -1.60589599609375, 30.719772338867188, -3.7587356567382812, 8.970382690429688, 11.47564697265625, 27.85986328125, 43.822021484375, -3.6473388671875, 19.15936279296875, 23.854827880859375, 2.0455322265625, 19.2052001953125, 6.6805572509765625, 7.175594329833984, -12.534423828125, 2.817626953125, -0.7222213745117188, 6.925445556640625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000110.npy"}
|
||||
{"epoch": 0.23036649214659685, "step": 111, "batch_size": 128, "mean": 8.716733932495117, "std": 13.76922607421875, "min": -21.463043212890625, "p10": -8.201092529296874, "median": 7.614921569824219, "p90": 26.585539245605467, "max": 49.45721435546875, "pos_frac": 0.734375, "sample": [3.7746353149414062, -10.40521240234375, 12.89093017578125, -21.463043212890625, 46.88604736328125, -7.607063293457031, 11.095550537109375, 26.381378173828125, 1.0184326171875, -1.558837890625, 28.40374755859375, 6.334991455078125, 7.8170013427734375, 5.730133056640625, 5.25347900390625, 14.472972869873047, 7.9318695068359375, 35.552032470703125, -18.05645751953125, -11.578948974609375, 10.354476928710938, 5.9751434326171875, 15.051834106445312, 6.26544189453125, 17.3314208984375, 2.26348876953125, 4.636695861816406, 26.659805297851562, 0.9648590087890625, 11.983642578125, 15.446304321289062, 19.652496337890625, -2.7563133239746094, 33.82025146484375, -3.901153564453125, 6.0604400634765625, -7.70947265625, 17.63568115234375, -0.5354461669921875, 10.66748046875, 1.4217720031738281, 11.488334655761719, -7.517250061035156, 21.802093505859375, 12.634246826171875, 3.26641845703125, 16.163055419921875, -1.47540283203125, 31.21392822265625, 8.203166961669922, 2.2847137451171875, 49.45721435546875, -11.404449462890625, -0.97796630859375, -0.05279541015625, 17.91455078125, 13.12677001953125, 23.69207763671875, 4.197761535644531, 30.91412353515625, 5.2598876953125, 10.884445190429688, 5.517324447631836, 11.57611083984375, 3.29669189453125, -6.3782958984375, 3.2768707275390625, 14.005905151367188, 40.588623046875, 11.582672119140625, 6.54638671875, -0.813232421875, 7.8851165771484375, -1.192901611328125, 4.1920166015625, 0.1872730255126953, -2.6610107421875, 26.5537109375, 8.17913818359375, -17.76702880859375, 0.08912277221679688, 10.858642578125, -9.34820556640625, 4.10345458984375, -4.68438720703125, -6.1520843505859375, -0.2569732666015625, 1.423675537109375, 18.277740478515625, 
29.010986328125, 16.442047119140625, 30.959716796875, 18.042068481445312, -2.7504329681396484, 15.84393310546875, 9.895828247070312, 16.022735595703125, -2.848115921020508, 43.436492919921875, 3.3297882080078125, 21.246009826660156, 9.966590881347656, -14.96051025390625, -9.5294189453125, -2.063201904296875, 26.309539794921875, -12.155303955078125, 19.332916259765625, -2.0932159423828125, 20.060577392578125, 14.590133666992188, 7.412841796875, 17.8734130859375, 36.346649169921875, 21.508392333984375, 18.5504150390625, 25.76220703125, -17.236297607421875, 11.8118896484375, 14.335311889648438, 8.287155151367188, -11.5267333984375, 12.546417236328125, 5.585052490234375, 20.263946533203125, 1.1580734252929688, -11.171142578125, 2.02142333984375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000111.npy"}
{"epoch": 0.2324607329842932, "step": 112, "batch_size": 128, "mean": 4.3942551612854, "std": 15.429938316345215, "min": -50.05865478515625, "p10": -12.938385009765623, "median": 3.2840652465820312, "p90": 25.346237182617184, "max": 47.53822326660156, "pos_frac": 0.59375, "sample": [15.565528869628906, 3.2542648315429688, -3.949188232421875, 11.237884521484375, 0.3421630859375, 8.543777465820312, -0.8731842041015625, -29.82470703125, -3.783367156982422, 1.7996978759765625, -2.68621826171875, 6.379508972167969, -9.53521728515625, 20.03802490234375, -0.824493408203125, 3.3138656616210938, 13.2230224609375, 15.893707275390625, 6.563163757324219, 24.2734375, -8.10223388671875, 18.437957763671875, 14.401397705078125, 14.64093017578125, -50.05865478515625, -15.89459228515625, -9.686286926269531, 13.52978515625, 9.910308837890625, 8.970390319824219, -10.233184814453125, 4.802490234375, -28.093841552734375, 6.1736907958984375, 4.8617095947265625, -17.992523193359375, 5.1903076171875, 5.652339935302734, -0.8274688720703125, 3.4442367553710938, 17.6650390625, -1.33184814453125, 5.2605438232421875, 15.352066040039062, 13.320075988769531, -1.27972412109375, 15.426788330078125, -0.906097412109375, 18.279251098632812, -5.391632080078125, 10.31378173828125, 5.1537017822265625, 0.8582077026367188, 16.307861328125, -18.483474731445312, -3.1132736206054688, -14.4903564453125, -9.732635498046875, -35.346923828125, -8.44451904296875, -1.7515430450439453, -0.7902679443359375, 17.128921508789062, 36.55401611328125, -5.779693603515625, 3.06951904296875, 31.2994384765625, 16.553314208984375, -0.1089324951171875, 7.429412841796875, 7.381147384643555, -5.716840744018555, 18.400115966796875, 14.57281494140625, 7.74609375, 36.45611572265625, -4.325313568115234, -0.4435272216796875, 30.0205078125, -18.455284118652344, -1.93035888671875, 10.358474731445312, 0.5912094116210938, -0.32410621643066406, 21.37384033203125, 23.38043212890625, 13.102828979492188, -1.415374755859375, -5.616241455078125, 
2.1579437255859375, 26.879974365234375, -12.27325439453125, 28.160797119140625, 4.443828582763672, -6.1617431640625, 16.864532470703125, -2.6622390747070312, 0.5986785888671875, -36.83709716796875, -7.763885498046875, -8.40887451171875, 25.91180419921875, -3.147430419921875, -1.48114013671875, 26.949172973632812, 16.436447143554688, -15.521697998046875, 25.103851318359375, 5.54583740234375, 19.548858642578125, 35.7459716796875, -10.43292236328125, 2.4705657958984375, 27.578369140625, 6.415916442871094, 1.2057037353515625, -18.3994140625, -2.0689544677734375, 2.029052734375, 0.500762939453125, 7.128173828125, -3.3804931640625, 26.61871337890625, 28.315200805664062, 47.53822326660156, -15.6744384765625, -4.020500183105469, 10.294403076171875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000112.npy"}
{"epoch": 0.23455497382198953, "step": 113, "batch_size": 128, "mean": 10.159167289733887, "std": 15.033726692199707, "min": -30.24285888671875, "p10": -8.183148193359374, "median": 8.525882720947266, "p90": 31.34105987548828, "max": 57.95867919921875, "pos_frac": 0.7421875, "sample": [5.2086029052734375, 16.3514404296875, 10.8787841796875, 0.34911346435546875, 14.24102783203125, 0.7774391174316406, -5.94744873046875, 15.39764404296875, 26.77435302734375, 5.289520263671875, 6.445831298828125, 7.63916015625, 20.09735107421875, -11.01420783996582, -1.949615478515625, 3.6683197021484375, 3.559999465942383, 2.1340789794921875, 4.764923095703125, 13.429061889648438, 57.95867919921875, -5.17401123046875, -2.433258056640625, 41.970611572265625, 12.31915283203125, -30.24285888671875, 5.317169189453125, 23.81610107421875, 13.821258544921875, -2.5556869506835938, -2.58099365234375, -11.66143798828125, 24.461883544921875, 31.610870361328125, 40.12939453125, -0.748809814453125, 15.118896484375, -6.343719482421875, 29.704193115234375, -9.739120483398438, 12.167083740234375, 31.114990234375, -1.43780517578125, 13.949066162109375, 9.634185791015625, 4.851284027099609, -10.37457275390625, 1.2770881652832031, 0.5301551818847656, 31.433013916015625, 4.34716796875, 18.923118591308594, -24.875579833984375, 26.661880493164062, 33.9239501953125, 24.810768127441406, 11.683822631835938, 33.127716064453125, 6.43975830078125, -12.402023315429688, 8.861518859863281, -0.8928680419921875, -4.8995819091796875, 25.365951538085938, -0.20673751831054688, 8.19024658203125, 6.068111419677734, 7.297569274902344, -13.00213623046875, 18.95770263671875, 6.926666259765625, 32.63677978515625, -1.8939151763916016, 11.360313415527344, -8.721542358398438, 4.788429260253906, 12.21209716796875, 31.301651000976562, 15.593902587890625, 15.605072021484375, 15.289764404296875, -12.8670654296875, 17.50286865234375, 28.541473388671875, 34.7659912109375, 13.942634582519531, 13.348968505859375, -3.3900527954101562, 
-2.8115081787109375, 5.55023193359375, 38.623748779296875, 26.485397338867188, 19.04071044921875, 13.670135498046875, -7.9524078369140625, 25.785888671875, 0.6943473815917969, 12.66064453125, 15.04443359375, 7.59722900390625, 5.702545166015625, -2.15576171875, 11.979034423828125, 2.8267135620117188, 24.0482177734375, 36.2718505859375, 23.683197021484375, 39.395355224609375, -11.10614013671875, 3.02288818359375, 22.521896362304688, 18.3883056640625, 11.83880615234375, 43.05780029296875, -6.0015869140625, -8.84259033203125, 2.1779937744140625, 2.8659286499023438, 21.640426635742188, 16.474441528320312, 0.194488525390625, 17.353729248046875, -17.621200561523438, 11.10546875, 4.0841522216796875, -1.772125244140625, 18.253936767578125, -4.7157440185546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000113.npy"}
{"epoch": 0.23664921465968586, "step": 114, "batch_size": 128, "mean": 5.272119045257568, "std": 13.741157531738281, "min": -27.5694580078125, "p10": -12.285770225524901, "median": 4.549455642700195, "p90": 22.442424011230464, "max": 41.68963623046875, "pos_frac": 0.6875, "sample": [3.9004955291748047, 21.37005615234375, 4.12908935546875, 26.516265869140625, 1.4794120788574219, 6.48541259765625, 1.9785823822021484, 4.907859802246094, -1.3836669921875, -6.1539459228515625, -3.2604522705078125, 23.683242797851562, -1.730316162109375, 29.5794677734375, 15.006378173828125, 37.16229248046875, -11.480979919433594, 1.931488037109375, -22.06903076171875, 2.0238800048828125, 1.5059814453125, -16.49249267578125, 4.9544219970703125, 15.109611511230469, 1.4738235473632812, 17.44683837890625, 9.525833129882812, 11.846893310546875, 13.703460693359375, 14.30462646484375, 3.2570343017578125, 13.514572143554688, 4.649753570556641, 8.097457885742188, 14.75146484375, -11.7728271484375, 10.145484924316406, -24.9432373046875, 26.695465087890625, 5.3116455078125, -6.236747741699219, -3.62939453125, 2.841787338256836, -27.5694580078125, 6.42132568359375, 1.8029861450195312, 0.67681884765625, 1.74176025390625, 7.394500732421875, -8.810577392578125, 0.21795654296875, -13.482637405395508, 10.328323364257812, -8.990463256835938, -16.64404296875, -18.710922241210938, 12.605743408203125, 41.68963623046875, 21.750701904296875, -8.366180419921875, 26.73016357421875, -23.899566650390625, 19.2353515625, -6.886444091796875, -1.868621826171875, 20.991790771484375, 37.66748046875, -9.5545654296875, -0.5032806396484375, 14.83050537109375, -25.530609130859375, 0.287353515625, -14.88372802734375, 10.275527954101562, 4.44915771484375, 34.19293212890625, 0.650299072265625, -0.889129638671875, -2.59088134765625, -17.08648681640625, 2.1619873046875, 9.586593627929688, 8.3487548828125, -17.0269775390625, 1.3171367645263672, -1.2426338195800781, -15.3436279296875, -3.7827491760253906, 0.0, 25.350189208984375, 
10.74847412109375, -2.8531265258789062, 13.010498046875, 6.125701904296875, -0.969879150390625, 21.91064453125, -2.997802734375, 18.363189697265625, 3.49249267578125, 5.743766784667969, 10.642890930175781, 31.590835571289062, 12.567523956298828, 11.16400146484375, 37.790283203125, 16.085426330566406, 7.435333251953125, 5.006782531738281, -5.71630859375, 0.0, 34.534454345703125, -2.426910400390625, -6.21746826171875, 13.1905517578125, 7.944580078125, 20.15697479248047, 6.496650695800781, 4.178466796875, 6.34039306640625, 6.345680236816406, 2.2904205322265625, 9.678466796875, 5.730449676513672, 0.09576416015625, 14.82574462890625, 7.685302734375, 3.2507266998291016, 14.411865234375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000114.npy"}
{"epoch": 0.2387434554973822, "step": 115, "batch_size": 128, "mean": 5.797318458557129, "std": 15.437718391418457, "min": -32.701751708984375, "p10": -11.773861694335936, "median": 5.1712541580200195, "p90": 28.472790527343747, "max": 58.991943359375, "pos_frac": 0.671875, "sample": [31.94073486328125, -15.57958984375, 12.921127319335938, -11.1766357421875, -0.44512939453125, -12.942466735839844, -10.34490966796875, 2.619781494140625, 2.6718673706054688, 20.104162216186523, -5.004415512084961, 9.557205200195312, 7.54345703125, 8.698966979980469, 3.3304595947265625, -10.040306091308594, 9.603652954101562, -5.085784912109375, 14.332611083984375, 2.3630218505859375, -12.001007080078125, 2.589630126953125, 5.821197509765625, 7.564666748046875, 4.578601837158203, -3.767791748046875, 2.771240234375, 14.670768737792969, 13.559112548828125, 14.65374755859375, -3.590850830078125, 13.231430053710938, 11.269012451171875, 2.6981582641601562, 3.6634654998779297, 8.992767333984375, -6.581268310546875, -20.660858154296875, -0.7791900634765625, 34.26409912109375, 6.427978515625, 28.134521484375, 38.204833984375, 8.045204162597656, -20.7066650390625, -20.107559204101562, 24.424644470214844, -12.431533813476562, 26.556961059570312, 43.525787353515625, 5.948295593261719, 58.991943359375, -9.670166015625, -1.304779052734375, -11.1175537109375, 14.77935791015625, 31.191375732421875, 23.86920166015625, 11.873458862304688, -24.17059326171875, -16.99493408203125, -9.547698974609375, 29.2620849609375, 2.21136474609375, -5.3424530029296875, 11.88177490234375, 5.734832763671875, 0.264404296875, 6.148460388183594, 32.115478515625, 0.692779541015625, 8.9150390625, 14.95522689819336, 8.708412170410156, -2.9230880737304688, 6.715871810913086, 45.3634033203125, 2.408721923828125, -17.803497314453125, -7.8865203857421875, -0.3392333984375, 2.527587890625, -14.41290283203125, 34.894012451171875, 21.169822692871094, 4.642181396484375, -26.28857421875, 10.820716857910156, 2.051189422607422, 
30.94183349609375, -4.578517913818359, 4.163047790527344, -5.669219970703125, 34.012237548828125, 16.829681396484375, -3.8843231201171875, 1.4992141723632812, 6.987152099609375, -9.5164794921875, 5.7970123291015625, 9.716339111328125, 6.1001434326171875, 9.097389221191406, 8.111480712890625, 8.2099609375, 3.5674591064453125, 42.6456298828125, 5.4100494384765625, -7.301727294921875, -10.943809509277344, 11.91595458984375, 4.932458877563477, 16.959564208984375, 6.895484924316406, 12.918952941894531, -11.676513671875, -6.3239898681640625, 13.246585845947266, -1.1806411743164062, 22.7227783203125, -32.701751708984375, -2.6477394104003906, 9.456985473632812, 0.83172607421875, 13.26708984375, 4.022705078125, -2.3721466064453125, 10.1707763671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000115.npy"}
{"epoch": 0.24083769633507854, "step": 116, "batch_size": 128, "mean": 8.194032669067383, "std": 14.800284385681152, "min": -41.684478759765625, "p10": -8.18265151977539, "median": 7.189064025878906, "p90": 25.47784271240234, "max": 56.8408203125, "pos_frac": 0.71875, "sample": [4.954376220703125, -4.038387298583984, 5.7346954345703125, 4.1651611328125, 7.06329345703125, 10.882232666015625, 11.480621337890625, 26.625030517578125, 43.859283447265625, 6.9853515625, 18.4420166015625, 8.9990234375, 21.448814392089844, 3.335773468017578, 16.651351928710938, 11.13128662109375, -7.0186004638671875, 13.444869995117188, 3.636211395263672, -9.074920654296875, -21.69091796875, -6.17634391784668, 22.992507934570312, 24.5235595703125, 0.7938823699951172, -9.09991455078125, 11.766448974609375, -1.0508270263671875, 5.505279541015625, -0.8382568359375, 2.434345245361328, 22.94049072265625, -9.378662109375, 17.353118896484375, -6.694953918457031, 1.6231842041015625, 21.525741577148438, 14.07049560546875, -7.8887939453125, -5.3051300048828125, -2.8755874633789062, 19.2679443359375, 15.970306396484375, -9.601524353027344, 7.8634185791015625, 4.5338897705078125, -1.718017578125, 7.0858154296875, 20.8641357421875, 31.33282470703125, -41.684478759765625, 4.014923095703125, 14.79034423828125, -18.8985595703125, -1.8350143432617188, 28.117919921875, 11.58026123046875, 24.78858184814453, -2.891693115234375, 0.5959320068359375, 10.159332275390625, 12.147384643554688, 6.2454833984375, 3.4742584228515625, 9.802772521972656, 0.0, 8.050697326660156, 33.7613525390625, -2.2169876098632812, -9.139450073242188, 19.210174560546875, 32.892974853515625, -0.055450439453125, -6.0701141357421875, -8.388557434082031, 1.8519821166992188, 1.269378662109375, 1.7652091979980469, 6.842041015625, 23.430816650390625, 3.9450149536132812, 39.30804443359375, 18.047576904296875, 0.0, 12.173561096191406, 7.666954040527344, 19.7786865234375, 4.566986083984375, 36.282867431640625, -2.98187255859375, -1.40863037109375, 
56.8408203125, 13.933135986328125, -36.727783203125, 11.04205322265625, 31.58258056640625, 19.256702423095703, 4.628242492675781, 10.512672424316406, 9.841766357421875, 11.07619857788086, -0.999847412109375, 8.161361694335938, 28.267974853515625, 21.22540283203125, -21.166587829589844, 8.120223999023438, 14.00634765625, 1.8843307495117188, -12.981742858886719, 10.1749267578125, 7.2923126220703125, 46.04693603515625, -0.3826904296875, 24.986190795898438, 17.064804077148438, 6.068328857421875, 16.14324951171875, 21.950714111328125, 12.979255676269531, -8.094406127929688, 8.061454772949219, 1.1037139892578125, 29.65289306640625, -12.584381103515625, 6.107120513916016, -3.572601318359375, 17.5078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000116.npy"}
{"epoch": 0.24293193717277486, "step": 117, "batch_size": 128, "mean": 8.424925804138184, "std": 13.876792907714844, "min": -24.322509765625, "p10": -7.684767150878906, "median": 7.4943084716796875, "p90": 28.347093200683595, "max": 43.088531494140625, "pos_frac": 0.71875, "sample": [3.9791526794433594, 7.817256927490234, 11.956954956054688, 16.814224243164062, 4.9683074951171875, 14.1484375, 2.279296875, -1.7402057647705078, 11.216522216796875, -4.372283935546875, 13.08172607421875, -6.7969207763671875, -0.349761962890625, 14.187973022460938, 18.532020568847656, 4.378173828125, 10.07159423828125, 10.53680419921875, 24.172393798828125, 4.718505859375, 19.322601318359375, -8.294807434082031, 7.171360015869141, 2.2312164306640625, 14.193918228149414, 27.02178955078125, 1.3702239990234375, -18.799461364746094, 13.0491943359375, 43.088531494140625, 16.794742584228516, 28.343170166015625, 33.98016357421875, 14.08122444152832, 9.633478164672852, -11.042160034179688, 24.1502685546875, -0.4684906005859375, 0.81964111328125, -11.00970458984375, 35.057525634765625, 8.178665161132812, 3.739288330078125, 29.147216796875, 0.0, 23.9185791015625, -5.674560546875, 6.181365966796875, 38.7744140625, 24.180877685546875, 8.584487915039062, 34.46539306640625, -23.590179443359375, 18.583831787109375, 26.45330810546875, 22.774246215820312, -7.7530517578125, 3.2414608001708984, 4.74737548828125, 8.459609985351562, 17.487091064453125, 16.78466796875, 13.268707275390625, 11.43423843383789, 5.6781768798828125, 12.879287719726562, 12.956192016601562, -0.954376220703125, -7.6555023193359375, -8.45263671875, 9.324798583984375, -21.84478759765625, -0.34020042419433594, 9.531185150146484, 28.356246948242188, -13.8262939453125, 10.606658935546875, 10.94488525390625, 22.44677734375, 2.3103713989257812, -2.2169876098632812, -1.682373046875, 6.7076416015625, 31.991119384765625, 4.9596710205078125, -5.976348876953125, -6.18389892578125, -12.285308837890625, 0.64697265625, 3.9902381896972656, 
-24.322509765625, 18.27960205078125, 28.720474243164062, 28.430084228515625, 8.89156723022461, 4.5135650634765625, -10.315826416015625, -3.7751312255859375, -1.0615692138671875, 31.092132568359375, 2.8813323974609375, -3.00152587890625, 2.252025604248047, -4.7701416015625, -5.757598876953125, 7.8181304931640625, 27.379348754882812, 6.188606262207031, 25.13970947265625, -5.039630889892578, 15.447677612304688, 17.376922607421875, 21.731689453125, 0.8016510009765625, 0.10752105712890625, -4.1688385009765625, 4.362560272216797, -0.01892852783203125, 5.48199462890625, -3.456523895263672, -20.595260620117188, 38.522247314453125, 11.1695556640625, 14.24652099609375, 1.8311538696289062, 17.515541076660156, 37.772918701171875, 11.12628173828125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000117.npy"}
{"epoch": 0.2450261780104712, "step": 118, "batch_size": 128, "mean": 8.984794616699219, "std": 15.190743446350098, "min": -25.795120239257812, "p10": -9.640924072265625, "median": 7.1595611572265625, "p90": 27.41181335449219, "max": 51.818756103515625, "pos_frac": 0.7421875, "sample": [6.8416595458984375, 8.928192138671875, 18.74517822265625, 3.37744140625, 29.327377319335938, 25.16705322265625, -21.635589599609375, -13.23129653930664, 19.100112915039062, 27.673583984375, 15.973541259765625, 6.3084564208984375, 5.831661224365234, 5.971649169921875, -15.44061279296875, 32.91015625, 16.830413818359375, 20.722702026367188, 22.52288818359375, 6.0365142822265625, 9.308807373046875, 18.6016845703125, 32.97180938720703, 1.992340087890625, -25.02239990234375, -25.795120239257812, 17.259765625, 1.8199462890625, -19.820343017578125, 21.8802490234375, 19.1859130859375, 24.728294372558594, 44.6038818359375, 13.9732666015625, 3.605743408203125, 5.898101806640625, 2.79449462890625, 6.7082061767578125, 17.629165649414062, 21.658599853515625, -4.060516357421875, 21.903900146484375, 31.955490112304688, -3.601442337036133, 1.6650238037109375, -5.57806396484375, 19.76751708984375, 16.57757568359375, 12.390625, 0.672515869140625, 0.5577392578125, 3.4989013671875, 27.4083251953125, 19.07989501953125, 27.419952392578125, 40.3873291015625, -8.38934326171875, 10.360931396484375, -0.13763046264648438, 21.818115234375, -5.059574127197266, 0.0, -9.58905029296875, 9.025787353515625, -10.7935791015625, -7.350126266479492, -18.875579833984375, 6.703865051269531, 15.027389526367188, -10.536163330078125, -0.22673797607421875, 31.29742431640625, 0.73907470703125, 16.451919555664062, 7.193695068359375, 23.83978271484375, 23.080841064453125, 16.748489379882812, -9.761962890625, 25.906494140625, 4.730224609375, 23.7269287109375, 8.426910400390625, -4.9541473388671875, 13.752983093261719, 25.14654541015625, -20.635284423828125, 1.3738555908203125, -7.359588623046875, -8.820404052734375, 
50.050811767578125, -4.633514404296875, 23.352645874023438, 9.146408081054688, -1.5248336791992188, 9.035842895507812, 1.6980323791503906, -14.359977722167969, 12.10760498046875, -8.171127319335938, 5.217548370361328, 35.363677978515625, 7.12542724609375, 13.22332763671875, -1.5083465576171875, 10.886077880859375, 4.91070556640625, 4.8579864501953125, 10.440736770629883, 33.57847595214844, -6.248973846435547, -7.293609619140625, 18.129505157470703, 14.657028198242188, 6.9339599609375, 2.92291259765625, -20.87042236328125, 10.97216796875, 6.0052337646484375, 21.15203857421875, 2.621856689453125, 13.422454833984375, 4.326349258422852, 23.106719970703125, 0.0, 3.606475830078125, 15.143402099609375, 51.818756103515625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000118.npy"}
{"epoch": 0.24712041884816754, "step": 119, "batch_size": 128, "mean": 7.655119895935059, "std": 18.520307540893555, "min": -40.298797607421875, "p10": -15.084844970703125, "median": 5.235752105712891, "p90": 31.689273071289062, "max": 69.171630859375, "pos_frac": 0.6875, "sample": [-40.298797607421875, -19.350540161132812, 26.195465087890625, 3.6215553283691406, 2.2783050537109375, 3.6649627685546875, -9.009117126464844, -7.2773895263671875, 6.514404296875, -4.4335174560546875, 22.60302734375, -11.509040832519531, -11.64111328125, 32.93829345703125, -18.26806640625, 10.862335205078125, -14.49700927734375, 25.9122314453125, 27.838890075683594, -2.776763916015625, 6.8743896484375, 18.19677734375, -1.3848133087158203, -9.292411804199219, 7.1976318359375, 15.992088317871094, 21.30988311767578, 6.3507080078125, 33.64765930175781, 4.1861724853515625, 4.615550994873047, 1.605255126953125, -2.6027984619140625, 1.2649688720703125, 13.433883666992188, 10.593170166015625, 11.222320556640625, 2.996246337890625, 7.0263824462890625, 2.8904495239257812, 22.683197021484375, -9.247795104980469, -1.304901123046875, -0.56536865234375, 32.090606689453125, 1.3209114074707031, 4.5175323486328125, -34.1485595703125, -15.1163330078125, -19.40509033203125, 50.951995849609375, 12.238739013671875, 4.1725311279296875, 30.962188720703125, -5.146888732910156, -1.8660736083984375, 13.45513916015625, 1.6543140411376953, 15.806854248046875, 22.782562255859375, 6.4739990234375, 0.872314453125, 24.611923217773438, 15.793304443359375, 17.11474609375, -10.594146728515625, 40.82427978515625, 12.406808853149414, 31.51727294921875, 18.0242919921875, 0.2556190490722656, 7.68902587890625, 13.313934326171875, 0.0, 38.19647216796875, -3.2581787109375, -1.6051254272460938, -24.856842041015625, 20.63348388671875, -23.029556274414062, 21.705718994140625, 12.06536865234375, 4.871490478515625, -18.05419921875, 14.13714599609375, 0.465789794921875, 32.430908203125, -7.402700424194336, 16.643463134765625, 
5.79522705078125, 24.711181640625, -16.683837890625, -2.650665283203125, 60.50628662109375, -4.750007629394531, 69.171630859375, 6.9835205078125, 0.48760986328125, 8.655487060546875, -26.00408935546875, 22.30572509765625, 32.858489990234375, 3.61065673828125, 19.762710571289062, -7.8038482666015625, 24.71820068359375, -7.466894149780273, 2.602874755859375, 5.600013732910156, -15.07135009765625, 2.5132904052734375, -23.072479248046875, 49.0350341796875, 14.694084167480469, 57.562225341796875, 5.956230163574219, -1.899200439453125, 13.858978271484375, 0.661529541015625, 18.803436279296875, 7.89886474609375, 4.66180419921875, 28.51654052734375, -6.509119033813477, 40.295562744140625, 4.6509246826171875, 6.197174072265625, -17.88226318359375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000119.npy"}
{"epoch": 0.24921465968586387, "step": 120, "batch_size": 128, "mean": 9.01522445678711, "std": 17.212888717651367, "min": -31.51983642578125, "p10": -13.024307250976559, "median": 9.64971923828125, "p90": 30.5302490234375, "max": 57.710906982421875, "pos_frac": 0.65625, "sample": [-12.26116943359375, 25.18017578125, -5.12664794921875, 11.807388305664062, -3.5221633911132812, 11.440567016601562, 23.09808349609375, -11.408218383789062, 7.1256866455078125, 16.11236572265625, -2.9641342163085938, -3.7745513916015625, 21.492271423339844, 4.4275665283203125, -16.85919189453125, -2.1556396484375, 30.312591552734375, -3.4540634155273438, 10.026275634765625, 18.57494354248047, -6.541351318359375, 33.241973876953125, 29.403778076171875, 25.95489501953125, 26.2137451171875, 11.907974243164062, 6.858306884765625, -1.870880126953125, -5.150638580322266, 31.22454833984375, 26.167236328125, -0.252197265625, 5.7147216796875, 3.870330810546875, -25.716888427734375, 8.6009521484375, -0.458465576171875, 15.343658447265625, 22.187103271484375, 22.440277099609375, 1.3658409118652344, -2.528614044189453, 11.93951416015625, -27.189300537109375, 0.0, -1.9137802124023438, -20.810943603515625, -1.524200439453125, -7.54034423828125, 6.001556396484375, 11.31048583984375, 12.54666519165039, 51.8179931640625, 20.8505859375, 13.79974365234375, 6.5189208984375, -14.804962158203125, 14.91729736328125, 9.273162841796875, 3.749542236328125, -3.2362594604492188, 0.0, 11.4383544921875, 18.629180908203125, 57.710906982421875, 15.422836303710938, 17.295074462890625, 40.170806884765625, -17.4813232421875, 20.656600952148438, 21.2186279296875, 7.5552825927734375, 4.382499694824219, 18.57049560546875, 14.388412475585938, -7.4612884521484375, 7.5762939453125, 32.98405456542969, 19.526611328125, 19.747314453125, 10.70135498046875, -15.6656494140625, -18.097381591796875, -8.091018676757812, -10.835243225097656, 37.30914306640625, 10.375335693359375, 37.3765869140625, 43.484649658203125, 15.639862060546875, 
-1.0245819091796875, 27.3670654296875, -29.156021118164062, 31.038116455078125, 38.02821350097656, 6.1883392333984375, 0.6323089599609375, 16.0694580078125, 12.224105834960938, 19.016563415527344, -22.758285522460938, 2.9755802154541016, -0.11812973022460938, 18.408958435058594, 27.416458129882812, -1.1841373443603516, 12.02099609375, -2.2761993408203125, 39.958984375, -31.51983642578125, 21.05828857421875, 22.686737060546875, 21.056640625, -17.332977294921875, -24.179962158203125, 15.2039794921875, 13.358268737792969, 19.689483642578125, 6.261100769042969, -1.220489501953125, -4.353473663330078, 15.623008728027344, -2.72161865234375, 51.947296142578125, 2.8202362060546875, 4.76397705078125, -5.6280059814453125, 15.325698852539062], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000120.npy"}
{"epoch": 0.2513089005235602, "step": 121, "batch_size": 128, "mean": 5.389993190765381, "std": 15.275097846984863, "min": -34.44195556640625, "p10": -12.122837829589841, "median": 3.9838905334472656, "p90": 25.20324401855468, "max": 46.607513427734375, "pos_frac": 0.640625, "sample": [-1.3316192626953125, -1.86517333984375, 2.715198516845703, 6.217041015625, 0.1350250244140625, 3.3057708740234375, -2.86572265625, 6.691883087158203, 9.235626220703125, 27.577789306640625, 3.2574539184570312, -1.2894134521484375, 32.994354248046875, 4.9473876953125, 16.425090789794922, 0.184814453125, -3.420623779296875, 3.834442138671875, 0.2494182586669922, 24.46722412109375, 19.05419921875, -17.491226196289062, 19.971649169921875, 16.0123291015625, -24.544342041015625, 0.0, 8.249580383300781, 2.5678863525390625, 1.4297409057617188, 17.062469482421875, 10.560117721557617, 8.774589538574219, -34.44195556640625, 27.48046875, 5.399009704589844, -33.897003173828125, -2.5159835815429688, 5.984809875488281, 22.145904541015625, 13.264663696289062, -3.128448486328125, -10.259490966796875, 7.2307891845703125, 2.599698066711426, 6.156494140625, 1.9178619384765625, 16.336105346679688, -11.568389892578125, -8.87847900390625, 0.0, 7.007549285888672, -3.997450828552246, -1.1650543212890625, -1.38446044921875, 16.434219360351562, -7.421607971191406, -0.893218994140625, 4.7254638671875, 18.98358154296875, -13.416549682617188, -2.308349609375, -17.577957153320312, -2.2282562255859375, 46.607513427734375, 13.514915466308594, 9.063961029052734, 43.28974914550781, -7.51763916015625, -4.00225830078125, -2.1963043212890625, -5.666007995605469, 13.78594970703125, -3.4366455078125, 14.514812469482422, -8.804084777832031, 0.8155517578125, 3.8973312377929688, 8.967155456542969, -0.5565338134765625, 15.584667205810547, 15.427383422851562, -1.037881851196289, -6.35302734375, 28.879776000976562, 45.84368896484375, 6.47845458984375, -24.2962646484375, -1.4490966796875, -19.6466064453125, 45.926300048828125, 
16.85308837890625, 19.57843017578125, 16.588592529296875, 2.07568359375, 32.900054931640625, 3.3728179931640625, -21.86614990234375, 4.0704498291015625, 13.407272338867188, 0.50860595703125, 3.606660842895508, 8.199737548828125, 14.007709503173828, 7.689208984375, 4.79107666015625, 15.025390625, -0.61553955078125, 4.452369689941406, -4.1115875244140625, -3.4244117736816406, 6.349151611328125, 35.2003173828125, 9.296615600585938, 4.30401611328125, 8.22760009765625, 8.759849548339844, 30.206802368164062, -25.333847045898438, 26.920623779296875, 13.738067626953125, -0.8301906585693359, -14.770347595214844, 41.05274963378906, 14.843704223632812, 0.8517742156982422, -19.82257080078125, -24.186248779296875, 16.669784545898438], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000121.npy"}
{"epoch": 0.2534031413612565, "step": 122, "batch_size": 128, "mean": 7.431892395019531, "std": 18.363971710205078, "min": -39.58372497558594, "p10": -16.129454040527342, "median": 6.851226806640625, "p90": 31.31502227783203, "max": 61.14599609375, "pos_frac": 0.6328125, "sample": [21.64208984375, -15.847686767578125, 26.418594360351562, -14.42919921875, 13.701202392578125, 23.3226318359375, 7.267669677734375, 2.902801513671875, -33.751861572265625, 31.187484741210938, 23.866012573242188, 0.0, 10.449920654296875, -26.8446044921875, -0.6942367553710938, -39.58372497558594, 10.831817626953125, 15.857025146484375, -11.763191223144531, 13.477333068847656, -4.321197509765625, 23.607208251953125, -22.891677856445312, -31.674652099609375, 6.94097900390625, 35.01641845703125, -12.292083740234375, 10.447664260864258, 44.798370361328125, 2.5179367065429688, 10.852447509765625, -2.03253173828125, 8.409713745117188, 8.54119873046875, 18.091583251953125, -5.848602294921875, 4.655902862548828, -22.470001220703125, 0.006561279296875, 9.913078308105469, -28.886306762695312, 11.557342529296875, 6.761474609375, 47.6614990234375, -3.041135787963867, -22.53631591796875, -10.147506713867188, 8.89602279663086, 40.653900146484375, -7.59527587890625, -6.7317962646484375, 8.813735961914062, 10.939300537109375, -20.662094116210938, 12.13671875, 14.890754699707031, -0.7214317321777344, 23.788360595703125, -28.91632080078125, 10.37762451171875, 13.907257080078125, 21.163665771484375, 18.212173461914062, -0.176239013671875, -7.850433349609375, 33.0833740234375, 14.45501708984375, 25.570556640625, 21.74114990234375, 29.58135986328125, 45.413726806640625, 12.941593170166016, 1.64190673828125, 37.00091552734375, 61.14599609375, 7.2879180908203125, 6.38433837890625, -2.040771484375, 31.614486694335938, 24.40155029296875, -3.2128448486328125, -16.786911010742188, 4.692718505859375, -9.212936401367188, -4.95416259765625, -2.55133056640625, -1.2524452209472656, 12.182723999023438, 7.05438232421875, 
-6.394134521484375, 16.43829345703125, -0.334503173828125, 3.189451217651367, -18.675338745117188, -2.9876861572265625, -18.837249755859375, -3.2456512451171875, 26.854293823242188, 24.14752197265625, -1.220062255859375, 5.449943542480469, -4.714935302734375, 22.747894287109375, 18.69622802734375, 2.7726402282714844, 24.38525390625, 20.12298583984375, 29.18584442138672, 4.734626770019531, 2.2127113342285156, -1.1556930541992188, -1.5144500732421875, -7.675148010253906, 16.018577575683594, 51.43548583984375, 0.980224609375, 32.4793701171875, -3.9006805419921875, 2.59906005859375, 25.869430541992188, 14.05035400390625, 21.4017333984375, -3.9872207641601562, 35.92132568359375, -4.978767395019531, 2.13330078125, 31.61260986328125, 6.50897216796875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000122.npy"}
{"epoch": 0.2554973821989529, "step": 123, "batch_size": 128, "mean": 10.556944847106934, "std": 18.66426658630371, "min": -48.11004638671875, "p10": -11.692977905273438, "median": 9.459892272949219, "p90": 33.31840362548828, "max": 67.41265869140625, "pos_frac": 0.734375, "sample": [24.649459838867188, -8.432769775390625, 30.864303588867188, -11.647705078125, 7.187255859375, 5.816951751708984, 9.985954284667969, 2.626220703125, 13.1455078125, 11.742462158203125, -0.6131782531738281, 1.71478271484375, -2.9207916259765625, 2.277801513671875, 4.458341598510742, -7.730316162109375, -11.798614501953125, 17.441253662109375, 43.6884765625, 8.450439453125, -23.6795654296875, -23.608245849609375, 7.6728668212890625, 17.372526168823242, 52.03143310546875, 3.0013885498046875, 19.096435546875, 29.7733154296875, 28.77032470703125, 1.4076385498046875, 27.800872802734375, 24.21478271484375, 19.994415283203125, 23.990249633789062, 8.854293823242188, 23.73828125, 24.389495849609375, -24.70172119140625, -1.213714599609375, 28.817413330078125, 17.03253173828125, 4.769073486328125, 0.7798004150390625, 31.7850341796875, 9.6673583984375, 56.39019775390625, 1.276845932006836, 2.7709217071533203, 18.968353271484375, 28.9725341796875, 3.3392257690429688, 18.537078857421875, 21.469818115234375, 67.41265869140625, 47.2227783203125, 12.615409851074219, -4.956094741821289, 0.361297607421875, 3.37786865234375, 46.91009521484375, -2.8870391845703125, 5.834085464477539, -48.11004638671875, 33.06602478027344, 10.476333618164062, -19.625732421875, 22.291976928710938, 5.945343017578125, 36.224456787109375, 7.408632278442383, 12.1534423828125, 0.463897705078125, 7.941011428833008, -18.914695739746094, 11.443626403808594, 25.6630859375, 21.8526611328125, -5.23638916015625, -17.653045654296875, 42.41363525390625, -2.90911865234375, 10.97674560546875, 20.627864837646484, -3.99029541015625, 7.765472412109375, 18.46380615234375, -16.99212646484375, 17.826904296875, 19.065261840820312, -7.941253662109375, 
0.0, -3.6869735717773438, 38.4609375, 12.076171875, 10.4466552734375, 0.0, 14.660858154296875, 9.252426147460938, 0.8104686737060547, 22.14678955078125, -6.627159118652344, -15.358978271484375, 6.041412353515625, 26.718170166015625, -1.5489654541015625, 12.241706848144531, 10.240875244140625, 35.157440185546875, 15.98162841796875, -16.737625122070312, -3.859771728515625, -2.4267578125, 32.676666259765625, 8.5389404296875, -30.523208618164062, 48.49676513671875, 15.85345458984375, -6.651641845703125, 13.8604736328125, -2.3057403564453125, 0.10516357421875, 11.369293212890625, 6.63482666015625, 33.90728759765625, 46.57054138183594, -20.14892578125, 17.09344482421875, 20.846649169921875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000123.npy"}
|
||||
{"epoch": 0.25759162303664923, "step": 124, "batch_size": 128, "mean": 9.563024520874023, "std": 17.422840118408203, "min": -50.70025634765625, "p10": -8.313067436218262, "median": 6.95587158203125, "p90": 34.11058807373047, "max": 58.0831298828125, "pos_frac": 0.765625, "sample": [2.14251708984375, 29.44072723388672, -50.70025634765625, 15.773895263671875, -3.4459495544433594, 1.5500640869140625, 25.061325073242188, 8.081329345703125, 34.991058349609375, 46.5137939453125, 5.187744140625, -28.40252685546875, 6.6477508544921875, -8.194000244140625, 24.029541015625, 4.952919006347656, 5.493804931640625, 20.137237548828125, -6.290468215942383, 14.436767578125, 5.4858245849609375, 5.657707214355469, -0.5837860107421875, 9.633636474609375, 20.81951904296875, 3.199859619140625, -1.26922607421875, -5.1975860595703125, 4.012016296386719, 8.1656494140625, 28.98443603515625, -15.198699951171875, 4.651130676269531, 25.521270751953125, 43.036865234375, 38.8006591796875, 19.269493103027344, 11.609878540039062, 4.5809326171875, 23.10260009765625, 21.3492431640625, 18.643035888671875, 38.88470458984375, 10.5428466796875, 3.486907958984375, -2.3952484130859375, 0.0398712158203125, 13.523239135742188, 45.62640380859375, -8.590890884399414, 7.380706787109375, -17.772808074951172, 6.4675140380859375, 0.0, 2.372865676879883, -3.1612701416015625, 25.369720458984375, 6.99945068359375, 3.6953125, -10.985923767089844, 34.12174987792969, -36.92816162109375, -27.625823974609375, 39.834716796875, 13.789398193359375, 24.120941162109375, 8.613494873046875, 31.502593994140625, 25.025360107421875, -1.190521240234375, 11.74224853515625, 5.456451416015625, -19.175537109375, -29.151824951171875, 22.059524536132812, 1.184112548828125, 4.201541900634766, 13.922409057617188, 38.848602294921875, 20.874481201171875, 10.03289794921875, 31.52496337890625, 6.91229248046875, -15.283355712890625, -6.146697998046875, 6.8600006103515625, 3.2663726806640625, 14.816329956054688, 35.068145751953125, 
16.237110137939453, 2.5063629150390625, 13.923080444335938, -2.7671966552734375, 12.72723388671875, 11.023910522460938, 8.324422836303711, -0.69281005859375, 58.0831298828125, 11.17308235168457, 20.944168090820312, -3.8851470947265625, -0.308685302734375, 13.459228515625, 2.06097412109375, 0.290008544921875, 53.38836669921875, -10.600265502929688, 0.1880950927734375, 6.66650390625, 16.419921875, -1.5687255859375, 0.22430419921875, 4.500030517578125, 10.954666137695312, 8.255203247070312, 21.329696655273438, 9.296112060546875, 34.105804443359375, 0.09253883361816406, -1.8912353515625, 5.225685119628906, 15.820648193359375, 23.6253662109375, 43.828765869140625, 5.7088623046875, 12.956207275390625, 5.063404083251953, -14.063556671142578], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000124.npy"}
|
||||
{"epoch": 0.25968586387434556, "step": 125, "batch_size": 128, "mean": 9.278959274291992, "std": 19.126379013061523, "min": -33.81390380859375, "p10": -12.09876651763916, "median": 6.583515167236328, "p90": 35.482452392578125, "max": 64.21337890625, "pos_frac": 0.6484375, "sample": [4.8711395263671875, -8.436368942260742, -33.7857666015625, 39.658203125, 1.5018024444580078, 14.522811889648438, 16.048095703125, -10.75836181640625, 3.2800426483154297, -16.602264404296875, 3.377044677734375, -1.06439208984375, 33.1585693359375, -2.51251220703125, 9.67449951171875, 24.46002197265625, -28.81951904296875, 14.884994506835938, 13.764175415039062, 14.716995239257812, -7.4837646484375, -3.582805633544922, 17.2900390625, -4.542549133300781, 15.068328857421875, 13.974899291992188, 11.013114929199219, 14.106658935546875, -4.7670135498046875, 45.9580078125, 4.963493347167969, -5.851997375488281, 10.100509643554688, -1.231842041015625, 6.702293395996094, 32.4356689453125, 21.7216796875, 37.09881591796875, 13.891632080078125, -3.6367340087890625, 0.17855072021484375, -23.476531982421875, -12.534881591796875, -7.247745513916016, -5.1941986083984375, 44.1334228515625, 1.256011962890625, -7.58758544921875, -33.81390380859375, 15.084815979003906, 15.033203125, -16.26153564453125, 24.116851806640625, 64.21337890625, 6.9249267578125, 12.770774841308594, 8.666648864746094, 20.982940673828125, 19.350006103515625, 26.455780029296875, -2.5511474609375, 35.508087158203125, 13.714828491210938, -1.0030174255371094, 0.99041748046875, 13.78668212890625, 2.883636474609375, 1.0757331848144531, 5.4853668212890625, 5.047698974609375, 56.14111328125, 10.908058166503906, 30.320594787597656, 43.83795166015625, -11.971166610717773, -19.935394287109375, 7.78021240234375, 23.842193603515625, 35.471466064453125, 25.03693389892578, 31.8365478515625, 4.1927337646484375, -1.098907470703125, 42.3477783203125, -0.54144287109375, 12.629058837890625, -14.955276489257812, -8.170150756835938, 21.299297332763672, 
32.137664794921875, 27.563064575195312, 4.77069091796875, -5.071807861328125, 12.22369384765625, -8.717010498046875, -0.7762413024902344, -9.82843017578125, 27.89990234375, 27.17437744140625, -0.3864173889160156, -5.800018310546875, 0.0, 3.1381778717041016, -4.9754486083984375, 41.38970947265625, -9.038162231445312, 1.553314208984375, 4.3235015869140625, 6.4647369384765625, 44.275726318359375, -7.399288177490234, -12.396499633789062, 21.30401611328125, 31.375961303710938, 0.9168548583984375, 26.35760498046875, 52.680419921875, 11.078369140625, -15.60321044921875, 18.399658203125, 27.774169921875, -32.82032775878906, -6.8665771484375, 49.46661376953125, 13.546859741210938, 18.784103393554688, -2.9822769165039062, -16.359130859375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000125.npy"}
|
||||
{"epoch": 0.2617801047120419, "step": 126, "batch_size": 128, "mean": 10.700855255126953, "std": 19.362314224243164, "min": -45.029022216796875, "p10": -10.566967773437499, "median": 9.793403625488281, "p90": 34.57593841552734, "max": 53.678558349609375, "pos_frac": 0.703125, "sample": [23.79638671875, 51.30242919921875, 13.739791870117188, 18.27630615234375, 29.676025390625, 42.93170166015625, 11.998779296875, 40.414337158203125, 3.661001205444336, -1.7885169982910156, -42.491363525390625, 25.36695098876953, 2.4484481811523438, 19.38763427734375, 27.60393524169922, 4.7974395751953125, 25.2791748046875, -5.317657470703125, -22.894775390625, -6.88140869140625, -14.50510025024414, 30.381271362304688, -10.06585693359375, -6.881380081176758, 4.53253173828125, -9.455368041992188, -5.855499267578125, -2.5750503540039062, 5.98443603515625, -7.4385986328125, 8.1705322265625, -3.0844268798828125, 17.13928985595703, 14.004135131835938, 32.2247314453125, -11.401763916015625, -8.502403259277344, 0.347412109375, -2.188323974609375, 5.040498733520508, 9.764678955078125, 23.36281967163086, 33.247833251953125, -1.0335693359375, 15.62225341796875, 25.118438720703125, 8.416061401367188, 16.57684326171875, -14.333831787109375, -3.4790191650390625, 33.443023681640625, -0.823394775390625, 12.475357055664062, 50.5596923828125, 3.807708740234375, 37.142433166503906, 16.36888885498047, 7.2329254150390625, 2.980682373046875, 21.136943817138672, 11.436927795410156, 29.39501953125, 20.767822265625, 25.947174072265625, -14.780853271484375, 15.918365478515625, 5.6746826171875, 24.147628784179688, 21.04058837890625, 48.36798095703125, 39.217010498046875, -5.140899658203125, 27.241134643554688, 28.35321807861328, -4.41595458984375, 7.2708740234375, 10.60516357421875, -8.2410888671875, 0.16652488708496094, -25.07763671875, 27.522750854492188, 11.0968017578125, -2.8995189666748047, 45.36395263671875, 21.906768798828125, 2.5703048706054688, -30.89276123046875, 6.1016845703125, 19.9342041015625, 
-35.6571044921875, 13.16766357421875, -10.169502258300781, 26.08770751953125, 31.853515625, -45.029022216796875, -9.921737670898438, 4.370147705078125, 9.822128295898438, 5.3936767578125, -16.612457275390625, 24.136817932128906, 26.927734375, 35.238372802734375, 43.8145751953125, 6.926361083984375, 20.599777221679688, -7.6982421875, 47.657073974609375, 21.579864501953125, 39.32672119140625, -10.24517822265625, 20.248687744140625, 17.847915649414062, 34.29203796386719, 20.661907196044922, 7.052734375, 4.23651123046875, -1.015289306640625, 10.03350830078125, -0.9751434326171875, 15.068267822265625, -11.31781005859375, 2.7042236328125, 6.819915771484375, 29.82403564453125, 53.678558349609375, 4.102043151855469, -18.483856201171875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000126.npy"}
|
||||
{"epoch": 0.2638743455497382, "step": 127, "batch_size": 128, "mean": 12.398256301879883, "std": 19.38547134399414, "min": -42.43994140625, "p10": -10.058306884765624, "median": 8.999370574951172, "p90": 37.76280288696289, "max": 56.8660888671875, "pos_frac": 0.734375, "sample": [-0.8503913879394531, -1.004425048828125, -9.13330078125, 26.372467041015625, 9.153778076171875, 2.8715362548828125, 19.315673828125, 11.968414306640625, -11.379974365234375, -42.43994140625, 28.61688232421875, 2.6957473754882812, 19.939422607421875, 17.328950881958008, 9.8975830078125, -33.40692138671875, -8.642730712890625, 13.315780639648438, -18.140846252441406, 5.3916015625, 7.056304931640625, 6.6817626953125, 22.234161376953125, 4.490753173828125, 19.829116821289062, 46.2978515625, 23.3709716796875, 36.533843994140625, 12.67144775390625, 25.499862670898438, 37.361907958984375, 14.82705307006836, 31.335205078125, 4.142967224121094, 11.16571044921875, 46.04833984375, -9.289321899414062, -20.339630126953125, -0.9011611938476562, 9.662662506103516, 47.85546875, 7.186496734619141, 7.1726226806640625, 6.675018310546875, -16.068817138671875, -25.11327362060547, 35.543212890625, 37.55571746826172, 11.43475341796875, -21.80279541015625, 17.770767211914062, 34.796173095703125, 21.675949096679688, -7.5690460205078125, 30.210479736328125, 38.416961669921875, 26.29248046875, 14.93731689453125, -2.68096923828125, 2.405975341796875, 1.68707275390625, 4.731597900390625, 36.87403869628906, 23.54833984375, 8.235877990722656, 34.47271728515625, -7.5552825927734375, 5.3037109375, 30.425918579101562, 36.36041259765625, 25.914093017578125, -2.0240097045898438, -10.930252075195312, -2.3500728607177734, 9.020309448242188, -10.942901611328125, -0.56011962890625, -15.157455444335938, 7.6449127197265625, -3.3305397033691406, 2.53546142578125, 22.849685668945312, 56.73699951171875, -1.3888740539550781, 41.68450927734375, 18.814727783203125, 4.647491455078125, 56.8660888671875, 31.81695556640625, 
21.22637939453125, 46.6451416015625, -8.329116821289062, 7.68353271484375, 9.0528564453125, 3.949481964111328, -1.4668426513671875, -20.979211807250977, 7.6629638671875, 8.978431701660156, 32.24147033691406, -6.4069671630859375, 8.646682739257812, -9.684616088867188, 30.873611450195312, 22.854705810546875, 4.644676208496094, 39.8765869140625, 47.732879638671875, 2.5103759765625, -4.8704986572265625, 6.7192230224609375, 3.303192138671875, 24.702972412109375, 36.244659423828125, -16.70934295654297, 4.439823150634766, 38.246002197265625, 40.37457275390625, -1.5536079406738281, 10.39453125, 29.99053955078125, 23.37939453125, 4.6475830078125, 31.010406494140625, 1.460968017578125, -7.500244140625, 24.14703369140625, 47.661590576171875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000127.npy"}
|
||||
{"epoch": 0.26596858638743454, "step": 128, "batch_size": 128, "mean": 13.767068862915039, "std": 20.977201461791992, "min": -43.428955078125, "p10": -10.877230834960937, "median": 12.76089096069336, "p90": 37.592682647705075, "max": 74.41357421875, "pos_frac": 0.7265625, "sample": [27.465606689453125, 4.9968414306640625, 19.761566162109375, 25.19085693359375, -34.39569091796875, -34.303924560546875, -20.795654296875, 8.033615112304688, 25.602020263671875, 29.643417358398438, 23.80066680908203, 2.54815673828125, 35.840972900390625, 0.8653030395507812, 38.57276916503906, 0.5268840789794922, 3.0501327514648438, 23.723861694335938, 27.573013305664062, 24.852813720703125, -3.4350128173828125, -0.47527313232421875, 36.05170440673828, 67.29937744140625, 4.30426025390625, 8.562812805175781, -1.245025634765625, 5.179473876953125, 29.169769287109375, 22.9652099609375, 9.699951171875, 11.829216003417969, -19.08709716796875, 36.22499084472656, -8.356595993041992, 31.919052124023438, -16.266983032226562, 7.2847442626953125, 0.3414764404296875, -21.778396606445312, 25.268898010253906, 3.024688720703125, -2.5364303588867188, 17.94195556640625, 33.5584716796875, -4.0966796875, 0.3352699279785156, 24.996883392333984, 20.428680419921875, 37.172645568847656, 35.966705322265625, 26.9935302734375, -3.3577423095703125, 34.815185546875, -1.74761962890625, -6.0119171142578125, 3.66192626953125, 25.549789428710938, 18.799224853515625, 41.57830810546875, 27.384754180908203, 4.4370880126953125, 14.161504745483398, -0.423431396484375, 1.3157196044921875, 7.971038818359375, 40.324188232421875, 29.75830078125, 28.86041259765625, 14.294200897216797, 15.23541259765625, 29.324249267578125, 50.85520935058594, 15.8826904296875, 47.907989501953125, 35.404327392578125, 11.506370544433594, -5.785308837890625, -1.25018310546875, -1.9684257507324219, 41.764373779296875, 30.79083251953125, -13.716552734375, 20.495223999023438, 29.518280029296875, -43.428955078125, 21.371795654296875, -21.9521484375, 
18.59821319580078, 74.41357421875, 9.805770874023438, -11.168548583984375, 35.39495849609375, 15.56292724609375, -7.99749755859375, 1.1547222137451172, -14.759658813476562, -8.305191040039062, 6.98492431640625, 11.552970886230469, 36.805755615234375, -4.697227478027344, 6.2032470703125, 47.27013397216797, 23.035598754882812, 33.620635986328125, -7.009098052978516, -6.1265716552734375, 58.638580322265625, 0.8722267150878906, 9.92596435546875, 0.0, -21.11114501953125, -7.64324951171875, 26.392242431640625, -11.72601318359375, -10.75238037109375, 13.69256591796875, 24.772979736328125, 45.09771728515625, 5.067138671875, 19.21771240234375, 11.113861083984375, 40.376861572265625, 60.270904541015625, 25.840286254882812, 18.57452392578125, -1.8972015380859375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000128.npy"}
|
||||
{"epoch": 0.2680628272251309, "step": 129, "batch_size": 128, "mean": 12.615655899047852, "std": 21.785476684570312, "min": -48.72332763671875, "p10": -12.492751312255859, "median": 11.357309341430664, "p90": 42.510546874999996, "max": 79.31832885742188, "pos_frac": 0.7578125, "sample": [-48.72332763671875, 28.68609619140625, -24.69329833984375, -14.57989501953125, 11.390449523925781, 45.25998306274414, 26.668212890625, 37.81451416015625, 30.554428100585938, 2.9282989501953125, 32.28851318359375, 3.051708221435547, 12.880828857421875, -3.189239501953125, -7.0075836181640625, -11.069778442382812, 27.768646240234375, 35.604644775390625, 12.271621704101562, -12.273475646972656, 12.309432983398438, 12.550849914550781, 1.0881271362304688, 18.87042236328125, 42.943603515625, -6.292022705078125, 17.603057861328125, -2.655426025390625, 79.31832885742188, 62.79736328125, 4.89385986328125, 22.88995361328125, 53.4693603515625, 15.978668212890625, 1.2667999267578125, -45.53936767578125, 34.2044677734375, 5.11834716796875, 63.0914306640625, 11.324169158935547, -4.638843536376953, 16.130157470703125, 40.949920654296875, 43.0611572265625, 9.089935302734375, 43.443511962890625, -25.76177978515625, 11.134502410888672, 0.7739105224609375, 3.9658889770507812, 27.729934692382812, 17.273033142089844, 9.75909423828125, 25.610443115234375, -11.80364990234375, 8.172159194946289, 5.68011474609375, 44.77801513671875, 1.6386566162109375, 28.59307098388672, 1.8649978637695312, 2.1617813110351562, 42.324951171875, 22.339920043945312, 62.71575927734375, -6.89117431640625, 14.9517822265625, 17.862335205078125, 28.398757934570312, 4.2666015625, 4.41082763671875, 6.540493011474609, 2.639556884765625, -4.489952087402344, 3.16552734375, -13.00439453125, 28.036720275878906, 20.97303009033203, -3.198505401611328, 59.99462890625, 27.738311767578125, 8.769363403320312, 19.08441162109375, 3.8438339233398438, 2.8372039794921875, 3.506103515625, 29.29736328125, 3.766387939453125, 19.15936279296875, 
39.608489990234375, -10.8037109375, -24.93719482421875, -1.7853202819824219, -0.6898193359375, -7.698822021484375, 14.8753662109375, -9.56085205078125, -31.409393310546875, 15.694961547851562, 11.844482421875, 46.788665771484375, -15.618911743164062, 13.038970947265625, -1.9874897003173828, -21.431350708007812, -22.65234375, 13.124588012695312, 3.13751220703125, 15.830268859863281, 25.235382080078125, 28.609970092773438, 6.534183502197266, 18.77001953125, 10.498199462890625, 9.740943908691406, 45.68536376953125, 10.577178955078125, 31.886093139648438, -19.3758544921875, 19.627059936523438, 11.873527526855469, -0.5838699340820312, 4.5559234619140625, 30.641555786132812, 13.740707397460938, -14.819061279296875, 24.30078125, 32.42986297607422], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000129.npy"}
|
||||
{"epoch": 0.27015706806282724, "step": 130, "batch_size": 128, "mean": 11.585528373718262, "std": 19.587093353271484, "min": -38.62939453125, "p10": -11.095370483398433, "median": 9.57354736328125, "p90": 36.687744140625, "max": 57.01704406738281, "pos_frac": 0.7265625, "sample": [9.60931396484375, -2.928253173828125, 29.313690185546875, 4.6649017333984375, 5.4530792236328125, 41.29936218261719, 9.53778076171875, 28.987457275390625, 41.446678161621094, -6.1961669921875, -3.3485794067382812, 10.129150390625, 23.175086975097656, 37.559112548828125, 0.022491455078125, 5.9080810546875, -3.982940673828125, -18.977996826171875, 8.722030639648438, 19.0843505859375, 16.536026000976562, 38.171661376953125, -1.3104095458984375, 9.06085205078125, -2.1440868377685547, 57.01704406738281, 1.62567138671875, 20.18017578125, 46.997344970703125, 50.254913330078125, -31.07867431640625, -0.523468017578125, 0.0, -10.017227172851562, 25.4351806640625, 11.915695190429688, 23.78485107421875, -38.00592041015625, 9.771480560302734, 23.8692626953125, 25.522998809814453, 15.277145385742188, 48.17034912109375, -9.273406982421875, -38.62939453125, -36.81974792480469, -3.937744140625, 34.92131042480469, 2.558258056640625, -0.5633316040039062, 33.4281005859375, 1.3664093017578125, 10.327423095703125, 33.732635498046875, -17.5418701171875, -0.28778076171875, 12.83563232421875, 18.165313720703125, 36.314300537109375, -3.0791015625, 8.837127685546875, 1.6269111633300781, 23.221832275390625, 16.234817504882812, 4.494415283203125, 44.59706115722656, 27.235595703125, -17.838043212890625, -13.611038208007812, -1.5022735595703125, 38.43231201171875, 7.296661376953125, 29.029083251953125, -23.380218505859375, 5.9098663330078125, 17.5609130859375, -14.395614624023438, 30.5511474609375, 9.052810668945312, 15.780084609985352, 28.3941650390625, 17.60577392578125, 32.20281982421875, 19.30560302734375, 32.529266357421875, 0.8433418273925781, 31.96893310546875, 8.176589965820312, 27.3948974609375, 
-3.798095703125, 11.72589111328125, 8.047119140625, 7.494110107421875, 3.5811805725097656, 23.638412475585938, -27.377395629882812, -7.7349700927734375, 4.6819000244140625, 3.7344970703125, 47.56964111328125, 44.1246337890625, 35.18304443359375, 20.907699584960938, 1.038604736328125, -23.406494140625, 27.933563232421875, 4.2748260498046875, 4.8232421875, 19.82501220703125, -5.4827117919921875, 35.682159423828125, 6.507228851318359, 26.577606201171875, -5.874137878417969, 19.648574829101562, 19.59127426147461, 38.412750244140625, 21.712646484375, -0.5204696655273438, 17.219070434570312, 15.336097717285156, -9.4364013671875, -1.9317779541015625, 2.9886627197265625, 28.123291015625, -26.2218017578125, 5.603639602661133, 27.68408203125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000130.npy"}
|
||||
{"epoch": 0.27225130890052357, "step": 131, "batch_size": 128, "mean": 9.715350151062012, "std": 22.214693069458008, "min": -46.43424987792969, "p10": -18.86392211914062, "median": 8.401138305664062, "p90": 40.52735290527344, "max": 72.32675170898438, "pos_frac": 0.640625, "sample": [31.139312744140625, 10.167892456054688, 36.948455810546875, -21.47802734375, 12.457244873046875, 2.782684326171875, 2.9701766967773438, 23.465576171875, -18.119522094726562, 40.51141357421875, 30.901519775390625, 30.78729248046875, 40.771240234375, 17.876922607421875, -18.15594482421875, 0.9504852294921875, 1.97369384765625, -16.21649169921875, 9.862472534179688, 49.688262939453125, 35.67633056640625, 10.326263427734375, -6.0169677734375, -5.710414886474609, 30.251426696777344, -6.3542633056640625, -9.20135498046875, 42.604217529296875, 11.659698486328125, 13.496688842773438, -1.4574356079101562, 62.46977233886719, -5.787811279296875, 0.986663818359375, 19.969757080078125, -3.160430908203125, -23.371307373046875, 41.73297119140625, -3.001220703125, 30.18408203125, 8.436553955078125, 3.0210113525390625, -2.1106643676757812, -14.821533203125, 25.394775390625, 6.796630859375, 57.423553466796875, -17.44586181640625, 20.8680419921875, 24.3624267578125, -1.49188232421875, -3.05499267578125, -17.5958251953125, 14.8203125, 13.464920043945312, 41.493438720703125, -22.764999389648438, -5.23321533203125, 26.032283782958984, 35.983551025390625, -8.361236572265625, -21.187408447265625, -20.515869140625, 31.30559539794922, 6.6732177734375, 6.621349334716797, 17.697189331054688, 28.61138916015625, 72.32675170898438, 7.8107757568359375, -46.43424987792969, 4.1197052001953125, 10.81884765625, -8.16975212097168, -4.038585662841797, 18.11669921875, -28.23114013671875, 61.67620849609375, -6.835660934448242, 34.81602478027344, 10.787811279296875, -23.79852294921875, 1.463134765625, -13.600540161132812, 12.11865234375, 19.5223388671875, -12.80029296875, 38.518890380859375, 43.887786865234375, 
-23.52764892578125, 0.28629302978515625, -3.1171798706054688, 42.8023681640625, -22.525436401367188, 25.009765625, 6.2778167724609375, 1.794820785522461, 8.284408569335938, -7.9677886962890625, 16.883705139160156, 11.896133422851562, -21.171173095703125, 28.40960693359375, -3.9834671020507812, 9.921051025390625, 34.37511444091797, -2.786712646484375, -34.018310546875, 53.84393310546875, 8.36572265625, 32.66387939453125, 40.564544677734375, 13.97119140625, 20.543365478515625, -14.48968505859375, -3.191925048828125, 21.6246337890625, -9.051025390625, 28.134130477905273, 10.861785888671875, 20.02203369140625, 14.736083984375, 24.099609375, 17.350845336914062, -6.29766845703125, -20.61517333984375, -9.3123779296875, 1.8186607360839844], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000131.npy"}
|
||||
{"epoch": 0.2743455497382199, "step": 132, "batch_size": 128, "mean": 13.061962127685547, "std": 24.88197898864746, "min": -45.1776123046875, "p10": -17.441993713378906, "median": 10.632667541503906, "p90": 44.901463317871084, "max": 95.7044677734375, "pos_frac": 0.703125, "sample": [-7.76617431640625, 24.58941650390625, 36.88002014160156, 13.396316528320312, 0.13060760498046875, -8.209136962890625, 26.983497619628906, 10.791183471679688, -4.337127685546875, 37.406951904296875, 3.841594696044922, -11.414695739746094, -20.275787353515625, 34.400390625, 21.516708374023438, 9.300796508789062, 3.064605712890625, 46.49638366699219, 42.02488708496094, 60.263336181640625, 34.727935791015625, 32.0716552734375, 69.89865112304688, 49.8653564453125, 14.510818481445312, 1.2674407958984375, -0.9200820922851562, -32.61790466308594, 13.531829833984375, 19.945159912109375, -1.630645751953125, 47.966796875, 2.9058837890625, 20.3236083984375, -23.6246337890625, -17.638259887695312, -17.12274932861328, 9.31976318359375, 3.6457061767578125, 2.89678955078125, 42.915313720703125, 16.682266235351562, -10.847900390625, -14.865081787109375, -23.09265899658203, -0.37154388427734375, 6.346405029296875, -18.432098388671875, 12.6572265625, -12.464202880859375, 17.770065307617188, -11.034151077270508, 16.964845657348633, 30.762290954589844, 4.36712646484375, 0.16815185546875, 4.871246337890625, 17.690078735351562, 95.7044677734375, 0.11492156982421875, 26.4708251953125, 29.0543212890625, 22.9271240234375, -24.39874267578125, 36.717933654785156, -0.82470703125, -2.55364990234375, 12.814239501953125, -19.450347900390625, 1.2101593017578125, 16.500732421875, 3.803558349609375, 14.391754150390625, 37.62437438964844, -7.45611572265625, -1.6761627197265625, 0.0, 36.4561767578125, 13.819168090820312, -18.743743896484375, 9.014022827148438, 36.02813720703125, 50.7032470703125, -31.760177612304688, -45.1776123046875, 57.1971435546875, 12.41326904296875, 17.100128173828125, 2.894927978515625, 
9.72650146484375, 21.7491455078125, 13.918685913085938, 33.849517822265625, 25.26587677001953, 52.44087219238281, 92.5870361328125, -15.12591552734375, 25.96759033203125, -0.504364013671875, 14.824859619140625, 23.684814453125, 41.543182373046875, 2.845062255859375, 0.9941368103027344, 10.474151611328125, 52.576812744140625, 8.035919189453125, 6.836517333984375, 0.0, -17.357879638671875, 32.60331726074219, 41.444305419921875, -9.878013610839844, 1.3657054901123047, -9.374053955078125, 11.017333984375, -3.39923095703125, 57.376251220703125, -28.106414794921875, 44.217926025390625, 34.57981872558594, 28.24639892578125, -23.559280395507812, 15.119735717773438, -16.5675048828125, 17.919769287109375, 0.82525634765625, 64.35354614257812], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000132.npy"}
|
||||
{"epoch": 0.2764397905759162, "step": 133, "batch_size": 128, "mean": 13.641063690185547, "std": 22.517520904541016, "min": -65.51861572265625, "p10": -12.535035705566402, "median": 12.296638488769531, "p90": 42.0278823852539, "max": 71.8687744140625, "pos_frac": 0.765625, "sample": [22.15032958984375, 1.5713615417480469, -34.793609619140625, 0.0, 1.728057861328125, 32.96461486816406, -1.2023029327392578, 13.764404296875, 4.865272521972656, 29.737518310546875, 16.698516845703125, 18.010833740234375, 5.38995361328125, 25.86554718017578, 40.30303955078125, 50.73626708984375, -8.787826538085938, -2.13311767578125, -1.456939697265625, -14.884750366210938, 33.304840087890625, 48.92535400390625, 71.8687744140625, 5.174736022949219, 24.366058349609375, 43.746002197265625, 22.665267944335938, 53.864288330078125, 0.0, -0.0705718994140625, 21.323760986328125, 22.56512451171875, -3.4343795776367188, 1.304534912109375, 27.670684814453125, 32.46638488769531, 23.8494873046875, 27.11114501953125, -26.576751708984375, 33.28509521484375, 40.57476806640625, 6.391143798828125, -6.064483642578125, -7.777618408203125, 10.43865966796875, 0.553619384765625, -15.196174621582031, 8.114532470703125, 3.877593994140625, 9.2012939453125, 16.2490234375, 41.6484375, 3.97100830078125, 29.745033264160156, 49.143218994140625, 8.16156005859375, 5.300933837890625, 42.305816650390625, -4.2454986572265625, 31.58856201171875, -21.786376953125, 12.156524658203125, 36.1754150390625, 1.717315673828125, 17.485885620117188, 7.875846862792969, 26.464584350585938, 43.434814453125, 30.422119140625, 28.32585906982422, -36.36328125, 1.5760040283203125, 12.436752319335938, 7.86968994140625, 52.011383056640625, 9.382547378540039, -6.752616882324219, -21.3050537109375, 4.135471343994141, 45.7559814453125, -6.833202362060547, 1.549072265625, 9.586196899414062, -9.96087646484375, 20.58148193359375, -11.52801513671875, 8.571258544921875, 26.191322326660156, 24.76373291015625, 41.90876770019531, -5.079872131347656, 
47.5107421875, 27.622344970703125, 14.466552734375, 15.824951171875, 5.4359130859375, 7.261383056640625, 40.26513671875, 25.429473876953125, 44.122314453125, 8.054939270019531, 29.9937744140625, 21.449092864990234, 26.783611297607422, -27.840591430664062, 21.3504638671875, 20.223907470703125, 29.398391723632812, -27.268943786621094, 49.536590576171875, 23.43060302734375, 8.253379821777344, 4.82891845703125, -15.939453125, -51.15924072265625, 19.110321044921875, -65.51861572265625, 4.593051910400391, 2.8988723754882812, 39.0882568359375, 6.348052978515625, 25.409820556640625, 36.228763580322266, -6.2680816650390625, -24.8924560546875, 2.134967803955078, 28.8006591796875, 40.40111541748047], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000133.npy"}
|
||||
{"epoch": 0.27853403141361255, "step": 134, "batch_size": 128, "mean": 9.47242259979248, "std": 21.27786636352539, "min": -41.75518798828125, "p10": -14.638009643554687, "median": 7.232063293457031, "p90": 33.91058654785156, "max": 92.0439453125, "pos_frac": 0.6796875, "sample": [39.42121887207031, 26.926300048828125, -17.217041015625, -7.034912109375, -25.1339111328125, -14.3433837890625, 8.45947265625, -16.641860961914062, -4.87200927734375, 27.651611328125, -37.9097900390625, -7.3552703857421875, 8.97235107421875, 15.619384765625, 27.641571044921875, -0.8084526062011719, -6.136943817138672, 8.008808135986328, 22.994476318359375, 6.914833068847656, 10.57049560546875, 62.73223876953125, 0.0, 18.63470458984375, 21.377197265625, 22.6951904296875, 31.826416015625, -3.0787582397460938, 4.8670654296875, 46.897369384765625, -29.43195152282715, -4.5621185302734375, 17.1173095703125, 2.6014022827148438, 29.093460083007812, 2.3570480346679688, 27.96063232421875, 16.43476104736328, 14.671630859375, 6.752960205078125, -10.8995361328125, 25.18830108642578, 14.10589599609375, 26.18292236328125, 92.0439453125, 35.33782958984375, 2.00506591796875, -12.75390625, 69.7391357421875, 28.156402587890625, -24.92559814453125, 32.249114990234375, 11.463104248046875, -13.84433364868164, 0.25829315185546875, 3.247833251953125, 31.9771728515625, 21.761978149414062, 39.03428649902344, 8.496223449707031, -11.550106048583984, -1.64306640625, 8.254180908203125, 7.146270751953125, -28.43975830078125, 41.06477355957031, -3.1833648681640625, 14.95050048828125, 25.056365966796875, 10.957290649414062, 2.6825790405273438, 17.0020751953125, -18.177757263183594, -4.447319030761719, 0.83428955078125, 42.52752685546875, 12.045013427734375, 25.2689208984375, 6.483121871948242, -34.856727600097656, 15.1927490234375, 33.508056640625, 28.990585327148438, -10.86700439453125, 0.23358154296875, -3.3125381469726562, 9.920372009277344, 14.981367111206055, 44.36944580078125, 1.2585296630859375, 
-0.03209686279296875, -5.1273956298828125, 10.407855987548828, -0.196380615234375, -41.75518798828125, 27.55030059814453, 22.71710205078125, 12.45281982421875, 6.717185974121094, -9.248306274414062, 5.731967926025391, 15.54315185546875, -13.876251220703125, -41.670135498046875, -15.753555297851562, 22.3382568359375, -15.325469970703125, 3.19903564453125, 35.349365234375, 3.13909912109375, 7.3178558349609375, 32.065399169921875, 0.5210800170898438, 22.09326171875, 38.1217041015625, 26.377002716064453, 33.317291259765625, 4.4034271240234375, 6.185455322265625, -7.670021057128906, -1.711181640625, 1.14434814453125, -3.8494491577148438, -6.1259765625, 34.849822998046875, 19.330039978027344, 4.191154479980469, 0.0], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000134.npy"}
{"epoch": 0.2806282722513089, "step": 135, "batch_size": 128, "mean": 13.25109577178955, "std": 20.959461212158203, "min": -51.265167236328125, "p10": -9.726668167114255, "median": 10.761837005615234, "p90": 44.69334106445312, "max": 60.7301025390625, "pos_frac": 0.7734375, "sample": [0.4699516296386719, -26.125320434570312, 11.477088928222656, 22.000823974609375, -9.067913055419922, 41.561248779296875, 0.8683204650878906, 13.52728271484375, 27.99462890625, 3.571474075317383, 54.1602783203125, 16.5267333984375, -5.630950927734375, -3.5463409423828125, 12.73004150390625, 7.01922607421875, 8.493278503417969, 25.29498291015625, 24.120361328125, 3.5404434204101562, 2.529632568359375, 7.15252685546875, -0.27312469482421875, 2.5953521728515625, -30.045806884765625, 0.1580333709716797, 33.474456787109375, 26.65125274658203, 18.666404724121094, 47.17840576171875, 32.83622741699219, 0.030155181884765625, 31.392059326171875, 5.79669189453125, -17.236915588378906, 38.0189208984375, 3.9976806640625, 18.512725830078125, 27.43115234375, 8.74822998046875, 4.080230712890625, 13.1422119140625, 13.86163330078125, 35.60699462890625, 58.6207275390625, 18.831844329833984, 44.14532470703125, 10.785232543945312, 2.63751220703125, -2.70599365234375, 19.976104736328125, 23.28924560546875, 2.4916152954101562, -11.263763427734375, -17.766082763671875, -8.802627563476562, 44.27239990234375, 5.89569091796875, 41.6624755859375, 16.779998779296875, 23.560546875, 16.273406982421875, 32.960784912109375, -5.58404541015625, -2.7982940673828125, 11.301071166992188, -2.715179443359375, 20.32281494140625, 17.687835693359375, 3.986297607421875, 16.902297973632812, 5.6757659912109375, 50.83380126953125, 16.552459716796875, 34.577415466308594, 30.893577575683594, 16.478057861328125, 4.8184814453125, -6.20977783203125, 51.199554443359375, 20.4603271484375, -17.103897094726562, -1.3651123046875, 0.881591796875, 7.726715087890625, 49.97230529785156, 1.033294677734375, 51.519287109375, 35.267364501953125, 
48.45098876953125, 20.50048828125, 41.92840576171875, 4.163330078125, -41.2078857421875, 45.675537109375, 5.818695068359375, 5.895454406738281, 7.1307830810546875, 2.08331298828125, 24.23541259765625, 8.493385314941406, 11.6993408203125, 29.563278198242188, 60.7301025390625, 2.4410552978515625, -0.23248291015625, 4.042045593261719, 14.384468078613281, 46.917205810546875, 58.09588623046875, -13.98980712890625, -15.788726806640625, 3.5240478515625, 11.010421752929688, 0.0, 22.1737060546875, -20.75726318359375, -51.265167236328125, -0.5839309692382812, 10.738441467285156, -7.13311767578125, 20.426284790039062, 6.4885711669921875, 48.593170166015625, 29.1060791015625, -13.717384338378906, -20.24627685546875, -0.49676513671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000135.npy"}
{"epoch": 0.28272251308900526, "step": 136, "batch_size": 128, "mean": 12.70842170715332, "std": 23.051292419433594, "min": -56.447540283203125, "p10": -9.89458999633789, "median": 10.471498489379883, "p90": 42.282884216308595, "max": 81.56802368164062, "pos_frac": 0.765625, "sample": [44.219757080078125, 16.6962890625, 1.459197998046875, 13.946929931640625, 34.421958923339844, 38.311248779296875, 46.671539306640625, 11.243545532226562, 32.727210998535156, 60.355804443359375, 6.3260345458984375, 3.7393455505371094, 43.675567626953125, 5.951560974121094, 26.47616958618164, 39.79512023925781, 14.0914306640625, 39.829010009765625, 12.260467529296875, 0.08275604248046875, -43.678802490234375, 35.569149017333984, 8.64202880859375, 25.5751953125, -19.777618408203125, 4.271148681640625, -7.315673828125, -42.05828857421875, 6.667366027832031, 12.748443603515625, 4.8050689697265625, -0.326690673828125, -5.9066619873046875, 8.914794921875, 25.303443908691406, -26.058944702148438, -9.32135009765625, 29.30389404296875, 13.812118530273438, -4.0145416259765625, 26.940948486328125, -21.571853637695312, 58.3756103515625, -56.447540283203125, 12.59405517578125, -6.490264892578125, 4.068603515625, 5.665473937988281, 18.856414794921875, -4.859413146972656, 17.40533447265625, -34.986083984375, 42.34266662597656, 19.940505981445312, 34.67724609375, 71.3326416015625, -9.358795166015625, 6.6009063720703125, 9.39300537109375, 32.64988708496094, 43.74267578125, 19.5821533203125, 37.161773681640625, -11.672943115234375, 10.28641128540039, 0.53411865234375, -1.849212646484375, 9.4766845703125, 34.212677001953125, -3.8538818359375, 3.4190826416015625, 16.0733642578125, 6.6695404052734375, 10.656585693359375, 3.9144134521484375, 52.3087158203125, -0.875, 14.9713134765625, 11.782180786132812, -9.704986572265625, 1.566314697265625, -25.207962036132812, 22.53857421875, 58.385894775390625, 35.1898193359375, 25.153656005859375, 1.829824447631836, 17.476776123046875, -6.625225067138672, 
47.17529296875, 16.57016372680664, 9.404296875, -10.336997985839844, 13.246963500976562, 6.204742431640625, 1.2440261840820312, 55.7845458984375, 14.483589172363281, 29.188400268554688, 22.541900634765625, -0.3543052673339844, 5.859683990478516, 32.7548828125, 1.4559707641601562, 31.11236572265625, -13.265350341796875, -5.99273681640625, 3.7427520751953125, 35.298583984375, 42.25726318359375, 36.407958984375, 9.98672103881836, -21.729339599609375, -4.763031005859375, 4.827239990234375, 11.61126708984375, 81.56802368164062, 4.991790771484375, -0.42083740234375, 9.145645141601562, 11.967880249023438, 14.435501098632812, 11.811767578125, 2.638671875, 15.760826110839844, -50.176116943359375, 32.341949462890625, 4.188350677490234], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000136.npy"}
{"epoch": 0.2848167539267016, "step": 137, "batch_size": 128, "mean": 11.359804153442383, "std": 22.143428802490234, "min": -48.84820556640625, "p10": -15.548623275756833, "median": 8.430084228515625, "p90": 36.744422912597656, "max": 77.32183837890625, "pos_frac": 0.7578125, "sample": [-29.65234375, 2.251495361328125, 37.06854248046875, -5.949462890625, 11.672500610351562, 20.914535522460938, 39.60700988769531, 13.159637451171875, 4.1426544189453125, 2.661865234375, 18.491188049316406, 15.34039306640625, 26.64666748046875, 56.509765625, 17.719512939453125, 3.200958251953125, 0.8789749145507812, 31.5821533203125, 0.937255859375, -13.7432861328125, 29.667694091796875, 20.640899658203125, 5.89862060546875, 10.519195556640625, -44.27392578125, 21.819580078125, 4.871826171875, -17.04150390625, 29.348068237304688, 7.516357421875, 48.72393798828125, 15.476837158203125, -2.430419921875, 39.98260498046875, -18.134490966796875, -1.978515625, 3.8163909912109375, 7.8647003173828125, 0.8359375, -9.411445617675781, 9.03271484375, -4.449026107788086, -20.575584411621094, 1.488922119140625, -48.84820556640625, 6.581390380859375, 7.347908020019531, 15.081161499023438, -44.76756286621094, 17.100624084472656, 6.8426513671875, 46.676025390625, 8.371002197265625, 9.86236572265625, 15.316009521484375, 34.24885559082031, -18.260894775390625, 41.23028564453125, -17.09918212890625, 1.9092864990234375, 7.2047119140625, -0.8299636840820312, 27.2490234375, -0.371826171875, 11.4619140625, 57.552459716796875, -27.554290771484375, 25.738855361938477, 8.045318603515625, -2.289276123046875, 2.525726318359375, 24.051971435546875, 29.106689453125, 11.178359985351562, 34.8192024230957, 1.5449256896972656, 1.2047901153564453, 77.32183837890625, 27.714080810546875, 14.865493774414062, 7.8114776611328125, 59.05877685546875, -1.1371994018554688, 36.122955322265625, 17.9974365234375, 18.972900390625, 56.8521728515625, 17.774673461914062, -4.393348693847656, 35.44903564453125, 1.3671035766601562, 
-14.761878967285156, -0.6638412475585938, 7.4893798828125, 1.848428726196289, 65.06134033203125, 29.10784912109375, 3.3466453552246094, 19.62506103515625, 17.024612426757812, -37.0841064453125, -29.711883544921875, 8.489166259765625, -33.013916015625, -9.239814758300781, 53.129547119140625, -14.908817291259766, 29.5869140625, 36.60551452636719, 34.17723083496094, 25.458404541015625, 35.035003662109375, -1.752410888671875, 19.541900634765625, 16.1558837890625, 23.301395416259766, -1.2034149169921875, 6.7449493408203125, 0.052642822265625, 6.6450042724609375, 7.632942199707031, 20.76641082763672, 7.19268798828125, 13.36431884765625, 14.741165161132812, 20.12896728515625, 28.547027587890625, -5.0626678466796875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000137.npy"}
{"epoch": 0.2869109947643979, "step": 138, "batch_size": 128, "mean": 11.083659172058105, "std": 24.886966705322266, "min": -47.73773193359375, "p10": -17.98569641113281, "median": 9.180717468261719, "p90": 42.73408813476562, "max": 83.4508056640625, "pos_frac": 0.6796875, "sample": [69.44522094726562, 42.38818359375, 43.54119873046875, 16.73163604736328, -12.559150695800781, 6.765350341796875, -3.719390869140625, 68.28543090820312, 6.9364013671875, 23.212005615234375, 10.703857421875, 10.19500732421875, 2.79852294921875, 13.856185913085938, 12.173065185546875, -30.5657958984375, -18.399505615234375, 31.083984375, -12.77728271484375, -29.741729736328125, 21.253143310546875, 5.7176513671875, -13.533447265625, -8.352554321289062, -47.73773193359375, 25.930099487304688, -4.2445526123046875, 35.256988525390625, 1.8777847290039062, 83.4508056640625, -7.824859619140625, 6.7362060546875, 2.884368896484375, 43.81194305419922, 1.4173698425292969, 25.808914184570312, -26.82135009765625, 13.177757263183594, 41.521331787109375, -1.4935760498046875, 11.823944091796875, -2.326913833618164, 30.380355834960938, 36.779815673828125, 16.26959228515625, 33.26799011230469, 6.5059356689453125, 3.1879501342773438, 13.638198852539062, -17.420700073242188, 12.276138305664062, 2.0780792236328125, 38.0560302734375, -17.808349609375, 45.6578369140625, 11.470474243164062, 34.10400390625, -2.98016357421875, 20.369667053222656, -20.20916748046875, -18.831130981445312, 2.020111083984375, -1.3065185546875, -1.6286773681640625, 22.503021240234375, -29.935699462890625, 64.95489501953125, 31.148818969726562, 29.325653076171875, -1.955810546875, -30.259750366210938, 15.97393798828125, -6.192901611328125, 53.73040771484375, 14.50384521484375, -12.18280029296875, -4.229804992675781, 20.039840698242188, 11.62799072265625, 1.0941162109375, 14.126419067382812, 35.02001953125, 27.973953247070312, 33.90618896484375, -41.041015625, -13.217117309570312, 13.64068603515625, 0.5107955932617188, 
-13.985580444335938, -17.5526123046875, 27.396530151367188, 57.079254150390625, 45.91316223144531, 2.73956298828125, -2.606874465942383, 29.27301025390625, 26.450393676757812, 26.8388671875, 1.6352691650390625, -4.09674072265625, 28.061370849609375, -15.841842651367188, 74.01254272460938, -5.5063934326171875, 9.016738891601562, 12.78668212890625, -8.464683532714844, -14.803741455078125, 4.4916839599609375, 40.725135803222656, -22.4573974609375, 3.064056396484375, 2.6307907104492188, 18.731658935546875, 27.942138671875, 1.9956741333007812, 57.573944091796875, -33.57366943359375, 33.7049560546875, -37.675048828125, 4.646354675292969, 22.630477905273438, 5.20489501953125, 20.088134765625, 50.6082763671875, 28.467330932617188, 9.344696044921875, -13.410247802734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000138.npy"}
{"epoch": 0.28900523560209423, "step": 139, "batch_size": 128, "mean": 8.285666465759277, "std": 24.31345558166504, "min": -46.779022216796875, "p10": -21.53638458251953, "median": 5.957028388977051, "p90": 40.49977416992187, "max": 68.17453002929688, "pos_frac": 0.6484375, "sample": [-37.40299987792969, -4.1383819580078125, -31.048797607421875, 7.9262542724609375, -32.024505615234375, 15.092987060546875, 31.57989501953125, 18.9200439453125, -4.641387939453125, -11.323001861572266, 51.7408447265625, 4.4565277099609375, 21.438629150390625, 11.1300048828125, -28.508285522460938, 46.13523864746094, 53.122344970703125, 10.535430908203125, -13.766807556152344, 9.8525390625, 5.463802337646484, 25.460205078125, 32.559906005859375, -14.65985107421875, 14.28997802734375, -12.988845825195312, 6.350931167602539, 36.065704345703125, -6.2276458740234375, -46.779022216796875, 0.5783500671386719, -7.2423248291015625, 36.13017272949219, 40.39813232421875, 27.051254272460938, 0.19417667388916016, 12.37908935546875, -8.13916015625, 4.22723388671875, 29.23895263671875, -2.7621536254882812, -20.541702270507812, 1.49041748046875, 21.4136962890625, 13.925529479980469, -5.34503173828125, -41.5260009765625, 0.54559326171875, -12.450908660888672, 39.27789306640625, -9.8734130859375, 1.537322998046875, -22.074234008789062, -21.305877685546875, 7.409416198730469, 41.86273193359375, -1.03302001953125, 0.1285858154296875, -23.70111083984375, -18.7933349609375, 39.35516357421875, -13.699092864990234, 43.9715576171875, 24.266538619995117, 27.002105712890625, 15.97869873046875, 0.33708763122558594, 0.0982818603515625, 40.7369384765625, 21.97003936767578, -16.140487670898438, 18.84552764892578, 15.76898193359375, 43.408355712890625, -4.533973693847656, -12.25738525390625, 5.083251953125, 68.17453002929688, -11.1380615234375, 40.04052734375, 25.237998962402344, 21.017486572265625, 16.229904174804688, 52.57963562011719, 5.5137939453125, -14.919807434082031, 53.7950439453125, -37.838470458984375, 
18.374786376953125, -4.759899139404297, 29.023712158203125, 47.54547119140625, 19.05926513671875, 16.19122314453125, -18.7410888671875, 16.005783081054688, 1.444061279296875, -40.016204833984375, -6.623504638671875, 2.6002025604248047, 24.975204467773438, 8.204925537109375, -9.627777099609375, 2.2249069213867188, 11.754486083984375, -9.079681396484375, 26.628402709960938, 67.16629028320312, 9.907005310058594, 58.739990234375, 39.41021728515625, 18.451019287109375, 3.526275634765625, -34.842437744140625, 6.6038055419921875, -33.353271484375, 0.0, 5.5631256103515625, 14.524345397949219, 33.416378021240234, -8.237686157226562, -32.762939453125, 20.4486083984375, -5.756034851074219, 23.517181396484375, -12.41058349609375, 36.452789306640625, 4.520759582519531], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000139.npy"}
{"epoch": 0.29109947643979056, "step": 140, "batch_size": 128, "mean": 15.175578117370605, "std": 22.78795051574707, "min": -42.58770751953125, "p10": -9.812145996093747, "median": 10.315616607666016, "p90": 46.90585784912109, "max": 79.92645263671875, "pos_frac": 0.6953125, "sample": [-7.37713623046875, 24.63433837890625, 33.352020263671875, -9.042694091796875, -11.352470397949219, 50.00408935546875, -5.574798583984375, 42.514404296875, -2.241973876953125, -0.27393341064453125, -15.571891784667969, 11.161041259765625, 30.48260498046875, 20.88817596435547, 5.9449462890625, 8.596172332763672, 1.01953125, 22.197067260742188, 46.66131591796875, 10.2342529296875, 8.324111938476562, -16.574508666992188, 28.466873168945312, -28.65869140625, 21.667388916015625, 41.678192138671875, -4.063140869140625, 8.045486450195312, 44.51124572753906, 42.80536651611328, 31.13995361328125, 32.67832946777344, 65.41400146484375, 38.32818603515625, -9.18695068359375, 56.287109375, -0.831298828125, 4.951934814453125, 49.56884765625, 48.9912109375, -2.210338592529297, -26.585220336914062, 10.60699462890625, -0.21439361572265625, -3.58087158203125, 37.22283935546875, 50.78997802734375, 10.24853515625, -9.0777587890625, 79.92645263671875, 9.794677734375, 22.306228637695312, 32.262908935546875, 11.762710571289062, 2.6477279663085938, 21.623779296875, 22.57476806640625, 6.093944549560547, -7.950103759765625, 22.837127685546875, 26.842147827148438, -0.495391845703125, 4.0474853515625, 21.643585205078125, 7.5143280029296875, 0.41937255859375, -3.9007797241210938, -11.27093505859375, 0.0, -42.58770751953125, -0.86346435546875, -30.02099609375, 66.2978515625, 0.7128753662109375, 5.7198638916015625, 14.415306091308594, 25.566070556640625, 33.071624755859375, 3.2962112426757812, 47.47645568847656, 10.382698059082031, 4.5725860595703125, 0.3400306701660156, 43.08013916015625, 52.6485595703125, 35.48558044433594, 28.785186767578125, -2.5296249389648438, 35.48847961425781, 38.459197998046875, 
6.440126419067383, 6.270469665527344, 19.549346923828125, -2.468475341796875, 28.960304260253906, 31.91766357421875, 0.0, -5.934471130371094, 16.8746337890625, -1.0869140625, -8.671295166015625, 49.58416748046875, -15.20709228515625, 45.30694580078125, 9.503768920898438, -18.049270629882812, 45.250732421875, 22.690658569335938, 23.346107482910156, 24.0498046875, 44.479400634765625, 22.310791015625, 50.6845703125, 20.239742279052734, 27.774917602539062, -8.807281494140625, -1.884490966796875, -15.3006591796875, 8.497222900390625, -25.9512939453125, 12.64398193359375, 56.778167724609375, 8.951263427734375, -2.8543624877929688, 7.4700927734375, -16.0675048828125, 27.407562255859375, 22.29913330078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000140.npy"}
{"epoch": 0.2931937172774869, "step": 141, "batch_size": 128, "mean": 14.42753791809082, "std": 25.611278533935547, "min": -75.89410400390625, "p10": -11.007505989074705, "median": 12.16494369506836, "p90": 46.28346862792968, "max": 89.49847412109375, "pos_frac": 0.7265625, "sample": [41.91265869140625, 18.342498779296875, -26.16680908203125, -8.768569946289062, 6.87664794921875, 13.749618530273438, -7.4266357421875, 21.391380310058594, -46.53448486328125, 89.49847412109375, 5.8758544921875, 11.751251220703125, 30.694480895996094, 28.97991943359375, 12.837844848632812, 44.250640869140625, 26.175979614257812, 17.864837646484375, 18.401765823364258, 23.67822265625, 57.367340087890625, -5.479736328125, 1.2903518676757812, 45.285003662109375, -0.13689804077148438, -15.784912109375, 39.60579299926758, 14.192573547363281, 1.9186859130859375, 51.62139892578125, 34.17901611328125, -17.580718994140625, -10.05596923828125, 68.93994140625, 2.8111572265625, 17.398284912109375, -25.310142517089844, -10.39130973815918, 11.67767333984375, -18.6812744140625, 44.48736572265625, 36.89520263671875, 14.040830612182617, 19.735061645507812, -6.839954376220703, 7.31951904296875, 2.688018798828125, -7.849700927734375, 61.432769775390625, 7.562469482421875, 11.804931640625, 23.436920166015625, 3.5324268341064453, -8.00579833984375, 28.285369873046875, 24.13775634765625, 0.8590621948242188, 1.9233551025390625, 24.96795654296875, 2.082366943359375, 36.431640625, 34.03215789794922, -14.19891357421875, -44.78733825683594, -8.502670288085938, 70.47198486328125, -1.5243949890136719, 27.96338653564453, -5.071922302246094, 38.22700500488281, 3.81048583984375, 12.187126159667969, 7.221904754638672, 23.074180603027344, 29.6695556640625, 35.424888610839844, 31.4271240234375, -7.46270751953125, 0.0, -15.773588180541992, 42.38279724121094, -3.5091400146484375, -1.49456787109375, 58.98748779296875, 6.420997619628906, 55.2921142578125, -75.89410400390625, 18.4117431640625, 19.834014892578125, 
-2.3490753173828125, 25.163650512695312, 17.384620666503906, 33.395912170410156, -44.568359375, -6.8646392822265625, 15.66265869140625, -0.20619964599609375, 57.053619384765625, -0.1902008056640625, 32.37886047363281, -7.9624176025390625, 11.07281494140625, 19.01971435546875, 43.0068359375, 16.3824462890625, 4.26630973815918, 25.0579833984375, 49.478424072265625, 42.91404724121094, -12.445297241210938, 8.597040176391602, 12.14276123046875, 62.5579833984375, -36.10389709472656, 0.3535614013671875, 10.359054565429688, 11.7674560546875, 12.073394775390625, 20.9080810546875, 20.59465789794922, -9.068283081054688, 48.61322021484375, 43.2811279296875, 6.3206939697265625, 10.39971923828125, 57.53265380859375, 16.637420654296875, 0.30755615234375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000141.npy"}
{"epoch": 0.29528795811518327, "step": 142, "batch_size": 128, "mean": 14.144156455993652, "std": 25.61022186279297, "min": -66.4852294921875, "p10": -15.201208496093747, "median": 9.420644760131836, "p90": 49.51287841796874, "max": 77.53146362304688, "pos_frac": 0.6640625, "sample": [-3.950042724609375, -1.6450042724609375, 9.039775848388672, 27.253448486328125, 1.9617919921875, -17.3717041015625, 11.18865966796875, -0.92059326171875, 73.57803344726562, 40.01811218261719, 45.031982421875, -3.49456787109375, -6.45098876953125, 34.577362060546875, 18.414230346679688, 27.182525634765625, 0.06746292114257812, -5.33978271484375, 53.0855712890625, 26.51470184326172, -1.9107208251953125, 25.85346221923828, 0.0, 19.891128540039062, 20.09625244140625, 51.063140869140625, 9.801513671875, 4.6665496826171875, -2.868865966796875, -0.341705322265625, -1.461395263671875, 28.116607666015625, -2.874225616455078, -26.435272216796875, 1.3667831420898438, 2.3624420166015625, 25.74462890625, 14.351203918457031, -22.006851196289062, 29.896514892578125, 19.30926513671875, 51.81951904296875, 44.4857177734375, 16.874252319335938, 18.20159912109375, -10.102951049804688, -1.249359130859375, 45.6781005859375, 18.92993927001953, 69.69754028320312, 14.476318359375, -6.481851577758789, 4.4461212158203125, 13.297561645507812, -14.27099609375, 3.0889110565185547, -0.4725341796875, 34.62511444091797, -18.5330810546875, -19.401092529296875, 20.9228515625, 33.0008544921875, -11.1939697265625, -9.501678466796875, 48.848480224609375, 37.920005798339844, 17.84912109375, 0.7818603515625, 46.808753967285156, -3.804920196533203, 9.00909423828125, 20.591888427734375, 2.5704345703125, 6.835153579711914, -23.171615600585938, 34.850067138671875, -38.320831298828125, 28.147811889648438, 42.5135498046875, -28.054153442382812, -4.87969970703125, 71.71426391601562, 77.53146362304688, -0.37286376953125, -1.5681991577148438, 14.68975830078125, 10.252761840820312, 51.30963134765625, 8.670196533203125, 
5.934333801269531, -2.8045997619628906, -3.532337188720703, 32.8043212890625, 40.82585144042969, 58.23968505859375, 5.57623291015625, 36.8817138671875, 26.14324951171875, 22.603057861328125, 54.523162841796875, -66.4852294921875, 53.41021728515625, -30.60333251953125, 0.9991722106933594, -0.6589088439941406, -28.168975830078125, 35.60844421386719, 16.736892700195312, 30.122894287109375, -3.20794677734375, 37.76190185546875, -37.963226318359375, 3.2061614990234375, 7.14630126953125, 22.62750244140625, -6.891212463378906, 29.2491455078125, -1.2111225128173828, 6.37872314453125, 0.50628662109375, -21.145835876464844, 62.39007568359375, 0.07210159301757812, -1.0708160400390625, 32.586456298828125, 28.85657501220703, 68.45350646972656, 42.131248474121094], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000142.npy"}
{"epoch": 0.2973821989528796, "step": 143, "batch_size": 128, "mean": 16.465831756591797, "std": 27.87979507446289, "min": -77.88665771484375, "p10": -11.153259277343746, "median": 13.984779357910156, "p90": 49.29505310058594, "max": 99.4974365234375, "pos_frac": 0.7421875, "sample": [28.11328887939453, 49.221893310546875, 58.21966552734375, 27.631134033203125, 40.91642761230469, 15.37847900390625, 31.408035278320312, 9.342475891113281, -18.2127685546875, -8.161130905151367, 28.583892822265625, -0.28464508056640625, 21.611968994140625, 14.611434936523438, 2.2664527893066406, -13.34844970703125, 49.46575927734375, 3.38385009765625, -19.49200439453125, 99.4974365234375, -17.85638427734375, -8.466995239257812, 43.78135681152344, 46.56494140625, 23.801597595214844, 45.025299072265625, 36.7523193359375, 66.97979736328125, 91.58212280273438, 2.51776123046875, -2.29351806640625, 42.70143127441406, 27.32952880859375, 31.35201644897461, 25.07080078125, -6.98974609375, -77.88665771484375, -2.6108551025390625, 27.478546142578125, 9.63092041015625, 9.39154052734375, -32.566871643066406, 9.571670532226562, 41.96429443359375, 6.321815490722656, 7.45849609375, 11.670757293701172, 20.996688842773438, 17.188007354736328, -68.806640625, 6.60784912109375, 28.1400146484375, 3.006256103515625, 39.67359161376953, 21.87841796875, 24.8333740234375, -0.341156005859375, 23.27704620361328, -14.677001953125, -0.2496490478515625, -10.21246337890625, 1.0357704162597656, 70.173095703125, 20.400482177734375, 5.356273651123047, -2.2967605590820312, 58.19354248046875, 30.50238037109375, 41.813079833984375, 2.35662841796875, 31.514862060546875, 35.44537353515625, -9.56771469116211, 3.823270797729492, 33.42292785644531, -6.77294921875, -39.64251708984375, 0.26198577880859375, 11.256744384765625, 12.553680419921875, 4.6649169921875, -1.141876220703125, 7.784393310546875, 19.83001708984375, 13.358123779296875, -3.03289794921875, -16.928955078125, 3.313720703125, -1.4811248779296875, 15.156600952148438, 
35.2548828125, 15.5635986328125, -2.1978225708007812, 22.578567504882812, 19.7176513671875, 24.86199951171875, 0.2677879333496094, 18.47314453125, -4.280021667480469, 3.1247711181640625, 82.83160400390625, 31.76287841796875, -3.893218994140625, 12.442649841308594, -32.106964111328125, 75.77069091796875, 36.786712646484375, 3.2727890014648438, 3.648395538330078, -1.2535552978515625, 27.81402587890625, 57.290069580078125, 27.669525146484375, 67.49664306640625, 27.173828125, 62.51690673828125, 43.039886474609375, 45.57452392578125, -40.33758544921875, 26.222187042236328, -16.421478271484375, 51.076171875, -8.01904296875, 10.725830078125, 47.701446533203125, 7.04388427734375, 16.04010009765625, 1.2965850830078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000143.npy"}
{"epoch": 0.2994764397905759, "step": 144, "batch_size": 128, "mean": 13.705360412597656, "std": 25.4195499420166, "min": -66.8997802734375, "p10": -14.318862915039062, "median": 7.846439361572266, "p90": 46.23644409179687, "max": 91.24114990234375, "pos_frac": 0.734375, "sample": [-2.223299026489258, 16.534194946289062, 45.98004150390625, 5.419456481933594, 5.3733367919921875, 11.7362060546875, -16.706504821777344, -11.038101196289062, 1.2932205200195312, 91.24114990234375, 2.644775390625, -14.51123046875, 7.524421691894531, 58.81500244140625, 5.3963165283203125, 33.627777099609375, 3.1285171508789062, 44.467437744140625, 8.906455993652344, -0.2646484375, -0.92742919921875, 20.98284912109375, -14.236419677734375, 3.1385879516601562, 3.27606201171875, -5.44232177734375, -21.34326934814453, -2.7938232421875, 27.980804443359375, 5.282909393310547, -39.127685546875, 6.186521530151367, 2.1233367919921875, 24.294158935546875, 51.04414367675781, 40.98588562011719, 34.075836181640625, 74.5498046875, 10.470115661621094, 59.68531799316406, -0.9018440246582031, -8.029617309570312, 5.808837890625, 20.968292236328125, -0.6419162750244141, -17.87928009033203, 7.8505859375, 35.685577392578125, -17.16021728515625, 21.328125, 12.83221435546875, 56.623779296875, 38.296173095703125, -19.19671630859375, -66.8997802734375, -27.022430419921875, 16.403228759765625, -13.749031066894531, -10.438385009765625, 4.11505126953125, 34.126007080078125, 0.1527099609375, -8.036529541015625, 5.4974822998046875, 26.68860626220703, 13.40869140625, 67.58319091796875, 66.03439331054688, 37.110076904296875, 39.92608642578125, 1.3001327514648438, 20.090232849121094, 47.3564453125, -26.92474365234375, 22.87384033203125, 31.5107421875, 4.059844970703125, 10.370136260986328, 45.909210205078125, 43.579864501953125, 45.774932861328125, 37.75776290893555, 4.9903411865234375, -9.335372924804688, 15.005264282226562, 24.629905700683594, -0.45318603515625, 8.8892822265625, 1.3760986328125, 33.5345458984375, 
70.97882080078125, 33.144195556640625, 12.678382873535156, 0.6078605651855469, 11.283548355102539, 48.3880615234375, 3.517974853515625, -2.84307861328125, 7.88702392578125, 35.964324951171875, 9.632049560546875, 0.3334503173828125, 0.0, 3.3022918701171875, -13.69976806640625, -11.837158203125, 2.32135009765625, -36.507965087890625, 25.742645263671875, 25.649139404296875, 18.08502197265625, -2.041717529296875, 42.807098388671875, -7.88336181640625, 30.436309814453125, 2.196746826171875, 17.906097412109375, 5.6244964599609375, 18.660614013671875, -22.3231201171875, 14.8814697265625, 38.597808837890625, 7.842292785644531, 46.834716796875, -20.024261474609375, 0.777252197265625, 70.79428100585938, 4.242645263671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000144.npy"}
{"epoch": 0.30157068062827225, "step": 145, "batch_size": 128, "mean": 17.609050750732422, "std": 26.442712783813477, "min": -54.63179016113281, "p10": -12.312240600585934, "median": 15.162483215332031, "p90": 53.17515563964842, "max": 84.32553100585938, "pos_frac": 0.7109375, "sample": [-14.216583251953125, 38.93694305419922, -43.394134521484375, 36.27252197265625, 13.828052520751953, -27.582366943359375, 41.77423095703125, 19.47930908203125, 79.59274291992188, 22.116653442382812, 17.899993896484375, 11.530059814453125, 23.99799346923828, 34.08728790283203, 30.70940399169922, 64.79547119140625, 2.041839599609375, 30.785308837890625, 7.439453125, 7.6293487548828125, 5.2073211669921875, -6.271240234375, 52.048828125, -9.15109634399414, -7.749053955078125, 68.36843872070312, -28.11224365234375, 49.21527099609375, 0.0645904541015625, 22.24273681640625, 0.774627685546875, 30.964393615722656, 9.332923889160156, 36.48371887207031, 55.803253173828125, 23.16845703125, 76.17379760742188, -54.63179016113281, 9.2412109375, 1.48736572265625, 22.91863250732422, 20.67059326171875, -15.730087280273438, 64.285888671875, 70.06997680664062, 14.2496337890625, 37.925018310546875, 33.254974365234375, 11.18597412109375, 22.04974365234375, 17.137046813964844, -7.1695556640625, -7.090484619140625, 33.40168762207031, 46.70097351074219, 46.91279602050781, -0.942657470703125, -4.569091796875, -14.927093505859375, 3.735820770263672, 13.611358642578125, 7.2414093017578125, -5.046051025390625, 43.3631591796875, 63.39373779296875, -5.3858489990234375, 57.2735595703125, -9.804092407226562, 55.894073486328125, 31.102651596069336, 57.822021484375, -10.54461669921875, 6.033718109130859, 35.03211975097656, 49.68115234375, 12.146881103515625, 1.8243064880371094, 10.977935791015625, 28.102676391601562, -20.982559204101562, -21.0333251953125, 5.213409423828125, 10.188983917236328, 29.94146728515625, 84.32553100585938, 22.75017547607422, 5.9557342529296875, 24.059600830078125, -22.6446533203125, 
-11.369720458984375, -2.6735458374023438, 0.0, 6.792518615722656, 20.578094482421875, 9.323532104492188, 3.5445404052734375, 40.90769958496094, 70.51690673828125, 42.85504150390625, 25.515838623046875, -2.5155029296875, -8.978759765625, 27.37042236328125, 25.066795349121094, 16.075332641601562, -14.7327880859375, 25.365074157714844, 28.5438232421875, -2.777374267578125, -1.0573949813842773, -16.35870361328125, 49.880279541015625, 27.807952880859375, 45.4498291015625, -2.7966651916503906, -2.17138671875, 3.6112899780273438, 31.189990997314453, 44.047760009765625, 50.524139404296875, 34.621673583984375, -11.49609375, -1.3946857452392578, -2.2169723510742188, 27.7672119140625, -22.3004150390625, -8.406295776367188, 20.899856567382812], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000145.npy"}
{"epoch": 0.3036649214659686, "step": 146, "batch_size": 128, "mean": 14.052300453186035, "std": 25.403305053710938, "min": -60.28033447265625, "p10": -15.58114776611328, "median": 8.937691688537598, "p90": 45.03756484985351, "max": 79.3323974609375, "pos_frac": 0.7421875, "sample": [5.027539253234863, 2.6971969604492188, 18.124248504638672, -34.7303466796875, 12.4029541015625, 38.548553466796875, -27.629501342773438, 13.518356323242188, 34.06941223144531, -24.77423095703125, 79.3323974609375, 74.61181640625, 10.135894775390625, -8.3302001953125, 18.30816650390625, 3.0179061889648438, 43.6737060546875, 25.511444091796875, 3.4350814819335938, 2.744873046875, 29.504058837890625, 2.6836318969726562, 8.20798110961914, 11.91339111328125, -17.4432373046875, -9.2200927734375, 4.099786758422852, 15.312942504882812, 16.73651123046875, 20.62469482421875, 9.05499267578125, 0.9610042572021484, 40.13800048828125, 0.212890625, 5.92254638671875, 19.5885009765625, 14.688735961914062, -23.069793701171875, 13.420249938964844, 3.2427139282226562, 6.9253997802734375, 49.05162048339844, 0.5190925598144531, 36.8990478515625, 0.0, 46.82798767089844, -3.7010498046875, -1.6285877227783203, 26.0521240234375, 26.186309814453125, 10.50152587890625, -17.074554443359375, 8.905603408813477, -1.37384033203125, 51.1761474609375, 62.265289306640625, -59.79364013671875, 29.821632385253906, 6.0949859619140625, 35.43775939941406, 35.38616943359375, 4.47796630859375, 3.2136306762695312, 48.70281982421875, 37.5860595703125, -14.941116333007812, -7.1820068359375, -12.390731811523438, 6.893829345703125, -19.200332641601562, 51.90754699707031, -20.950485229492188, 3.30670166015625, 41.742279052734375, -21.157272338867188, 63.261138916015625, 10.101165771484375, 8.969779968261719, -10.63958740234375, 73.109619140625, 7.96002197265625, 33.57855224609375, 2.477367401123047, -7.882080078125, 35.20654296875, 0.5059127807617188, -7.6432647705078125, 2.8846588134765625, 0.0, 38.45867919921875, 
-3.0891876220703125, -5.033729553222656, 27.364547729492188, -20.10308837890625, 5.7767333984375, 58.205780029296875, 22.75732421875, 40.55461120605469, 7.5144500732421875, 53.57881164550781, -21.627639770507812, -14.92242431640625, 4.2955780029296875, 15.017852783203125, 41.586456298828125, -9.30462646484375, 33.71923828125, 5.054443359375, 52.92010498046875, 44.1304931640625, 44.213226318359375, -60.28033447265625, 30.263477325439453, 44.270240783691406, 15.997802734375, 36.37091064453125, -7.514068603515625, 43.58354949951172, 3.15643310546875, 38.483154296875, 0.06317138671875, 6.6754913330078125, 24.016876220703125, 43.89860534667969, 30.687347412109375, -9.643989562988281, 29.95635986328125, -11.010711669921875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000146.npy"}
|
||||
{"epoch": 0.3057591623036649, "step": 147, "batch_size": 128, "mean": 11.822027206420898, "std": 28.83446502685547, "min": -77.1812744140625, "p10": -19.790420532226562, "median": 8.700490951538086, "p90": 50.03663711547851, "max": 90.163330078125, "pos_frac": 0.6640625, "sample": [0.843353271484375, -3.7713623046875, 7.903816223144531, 30.245361328125, -9.882545471191406, -2.949005126953125, 85.05401611328125, -14.104949951171875, 0.0, 70.1209716796875, -17.9962158203125, 19.988174438476562, 13.31787109375, 4.7602386474609375, 25.436983108520508, 4.789314270019531, -19.78570556640625, 47.211822509765625, 50.8731689453125, 3.6899871826171875, -3.98681640625, 19.603240966796875, -0.5777664184570312, 35.351531982421875, -1.948577880859375, 51.25599670410156, 1.1160469055175781, 45.22270202636719, 9.776031494140625, -0.2081298828125, 68.01177978515625, 37.59344482421875, 20.040302276611328, 23.165130615234375, 10.701385498046875, -30.164276123046875, -43.45452117919922, -5.302642822265625, -17.4913330078125, 17.77869415283203, 11.2860107421875, 34.62984085083008, 32.082889556884766, 6.7711334228515625, -31.289226531982422, 29.1444091796875, -30.090621948242188, -3.8818588256835938, -0.20716285705566406, 1.6520843505859375, -3.0449371337890625, 52.06658935546875, 9.769332885742188, 13.200393676757812, 27.406333923339844, 29.290931701660156, 79.744384765625, 12.217758178710938, 4.635711669921875, -4.3776397705078125, 18.0093994140625, 22.989013671875, 60.46788024902344, 8.672321319580078, 56.121551513671875, 0.3145751953125, 1.26513671875, 63.371917724609375, 40.308265686035156, 3.37371826171875, -44.21929931640625, 13.915252685546875, 1.534332275390625, -1.1593017578125, 30.2198486328125, 4.8182373046875, 13.613943099975586, 4.7660369873046875, 0.1397418975830078, -0.5128707885742188, 28.688400268554688, -10.274330139160156, 0.052490234375, 36.217437744140625, 16.722244262695312, 39.416748046875, 8.292236328125, 32.101318359375, -13.01348876953125, 32.35765075683594, 
14.04986572265625, 46.1673583984375, -24.416046142578125, -77.1812744140625, -38.02452087402344, -1.4078140258789062, 9.7738037109375, 31.018997192382812, -18.896286010742188, -41.62763977050781, -9.973472595214844, 4.3877105712890625, 55.418060302734375, 18.00079345703125, -19.801422119140625, -8.065746307373047, 45.22813415527344, -4.1275787353515625, -40.01518249511719, -14.356292724609375, -65.23738861083984, 9.500274658203125, 6.4918060302734375, 39.93009948730469, 49.678123474121094, 35.937713623046875, 8.728660583496094, 56.0220947265625, -3.5084075927734375, -7.398681640625, 15.863861083984375, 43.54920196533203, 43.50401306152344, 14.777435302734375, 90.163330078125, 10.742515563964844, -4.456708908081055, -29.024200439453125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000147.npy"}
|
||||
{"epoch": 0.3078534031413613, "step": 148, "batch_size": 128, "mean": 12.458778381347656, "std": 27.19798469543457, "min": -58.896484375, "p10": -25.402847290039062, "median": 11.167142868041992, "p90": 50.03211517333984, "max": 92.94143676757812, "pos_frac": 0.6484375, "sample": [17.8699951171875, 12.497062683105469, 18.97020721435547, -15.690887451171875, -1.5633926391601562, 69.93511962890625, -0.2123851776123047, 21.36383056640625, 74.53167724609375, 26.393173217773438, 5.728170394897461, 21.472000122070312, -4.8090972900390625, 55.48677062988281, 13.0255126953125, 38.02262878417969, 11.88751220703125, -20.28973388671875, 16.732452392578125, 25.985027313232422, -22.514984130859375, 43.51214599609375, -47.69451904296875, 0.2674751281738281, 9.388397216796875, -25.62298583984375, 14.85064697265625, -5.5473480224609375, 16.010101318359375, -9.915159225463867, 20.24506378173828, 16.1964111328125, -9.060157775878906, -58.896484375, -9.2120361328125, -17.913330078125, 61.227874755859375, 7.310302734375, -26.66680908203125, 21.674896240234375, 59.26910400390625, 1.4097099304199219, -16.60272216796875, -31.771697998046875, 1.6495819091796875, 6.380544662475586, 2.1195755004882812, -15.5228271484375, 23.47454833984375, 31.430221557617188, 11.500736236572266, 7.9367218017578125, 6.6176605224609375, 18.347137451171875, -31.393569946289062, 70.2939453125, 9.389808654785156, 37.88572692871094, 10.833549499511719, 49.8272705078125, 22.926605224609375, -7.9926300048828125, -0.2332916259765625, -1.3123321533203125, 0.0, 8.285991668701172, 54.61529541015625, 21.825210571289062, 36.70314025878906, 39.43548583984375, -3.6614513397216797, -29.76239013671875, -5.8794708251953125, -35.81822204589844, 40.86572265625, 9.7476806640625, -28.38117218017578, 32.6650390625, -2.0461273193359375, 29.614608764648438, -0.6646633148193359, 19.380142211914062, -25.308502197265625, 28.746063232421875, 57.372833251953125, 16.140060424804688, 2.593708038330078, 17.24456787109375, -9.724853515625, 
-13.865234375, 27.96398162841797, 21.139026641845703, -0.5553264617919922, 27.5035400390625, 23.802703857421875, 47.64399719238281, -26.6649169921875, 5.995185852050781, -13.104507446289062, 62.27574157714844, 2.86822509765625, 26.523544311523438, 92.94143676757812, 37.534088134765625, 2.1358184814453125, -7.349618911743164, 29.86336898803711, -27.953948974609375, 32.981781005859375, 22.64427947998047, 50.51008605957031, 44.501670837402344, 17.51080322265625, 29.3671875, -28.30132293701172, -1.6385269165039062, -26.31317138671875, 16.296218872070312, 62.59002685546875, 0.0, -8.40631103515625, 12.79791259765625, 34.97107696533203, -0.680419921875, 8.794754028320312, 45.302215576171875, 61.82582092285156, -4.1547393798828125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000148.npy"}
|
||||
{"epoch": 0.3099476439790576, "step": 149, "batch_size": 128, "mean": 15.52808952331543, "std": 26.375457763671875, "min": -42.14356231689453, "p10": -21.10261535644531, "median": 11.989669799804688, "p90": 47.17594909667968, "max": 98.0191650390625, "pos_frac": 0.7578125, "sample": [48.263397216796875, -24.3919677734375, 25.44134521484375, 35.04503631591797, 41.570281982421875, 46.70989990234375, 23.475570678710938, 5.422527313232422, 7.351806640625, 7.61956787109375, 20.679443359375, 5.566650390625, 43.950439453125, 17.518226623535156, 28.95697021484375, -2.203277587890625, 42.218994140625, 11.11251449584961, -1.360809326171875, -9.946434020996094, 15.910652160644531, 4.232200622558594, 20.00726318359375, 4.021953582763672, 25.559772491455078, 11.791534423828125, -40.4146728515625, 39.10015869140625, 48.61627197265625, 15.734619140625, 7.7379150390625, 18.922515869140625, 33.313995361328125, 13.247894287109375, 1.8436803817749023, -0.003032684326171875, 49.439910888671875, 1.8625469207763672, -19.901214599609375, 1.0368728637695312, 34.16386413574219, 70.665283203125, -9.324417114257812, 1.528228759765625, 77.84397888183594, 44.5679931640625, 10.514602661132812, -27.467300415039062, 39.661651611328125, 49.469970703125, 20.701744079589844, -6.575897216796875, 9.434722900390625, 23.365280151367188, 5.18804931640625, 77.0911865234375, -29.38427734375, 15.91278076171875, 17.339859008789062, 14.833984375, -8.015945434570312, 23.22644805908203, -25.8431396484375, 37.04029846191406, -27.006317138671875, 3.879119873046875, 8.688812255859375, -5.792205810546875, 24.591339111328125, 22.724864959716797, 40.9825325012207, 19.926162719726562, 32.69603729248047, 11.74468994140625, 42.202117919921875, 14.755470275878906, 50.56866455078125, 7.793449401855469, 57.4473876953125, -0.02959442138671875, 41.5418701171875, 0.4695472717285156, 32.90582275390625, 27.5372314453125, 12.18780517578125, -9.8533935546875, 55.96405029296875, 41.083282470703125, -40.20855712890625, 
3.2916946411132812, 10.580108642578125, 35.4111328125, -6.1721038818359375, -26.973876953125, -2.515106201171875, -3.618438720703125, 66.513671875, 5.59356689453125, 44.238250732421875, 6.3121337890625, -4.57916259765625, 2.5023040771484375, 33.186790466308594, 0.4066619873046875, 28.550323486328125, 0.892578125, -28.98345947265625, -23.9058837890625, 2.2056427001953125, 98.0191650390625, -2.457733154296875, -4.793182373046875, 8.34844970703125, 29.464935302734375, 7.06787109375, -13.895217895507812, 42.49164581298828, 28.712448120117188, -42.14356231689453, -42.07733154296875, 21.7091064453125, 36.23651123046875, -25.169204711914062, 75.70639038085938, 40.59205627441406, 19.420989990234375, 4.010528564453125, 5.6146697998046875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000149.npy"}
|
||||
{"epoch": 0.31204188481675393, "step": 150, "batch_size": 128, "mean": 14.847814559936523, "std": 25.41783332824707, "min": -48.83544921875, "p10": -13.296282958984373, "median": 13.114465713500977, "p90": 46.48525161743164, "max": 85.76544189453125, "pos_frac": 0.7890625, "sample": [22.76434326171875, 32.08991241455078, -4.2623748779296875, 37.089599609375, 10.764633178710938, -27.161865234375, -4.988594055175781, 21.31097412109375, -30.232513427734375, 33.3077392578125, 26.26416015625, 24.7967529296875, 47.564544677734375, 15.2052001953125, -14.60711669921875, 2.3348388671875, 8.266128540039062, 31.114173889160156, 6.1282958984375, 39.470611572265625, 25.647171020507812, -16.212921142578125, 0.42822265625, 3.1298675537109375, -5.992368698120117, 14.80657958984375, 8.739044189453125, 7.6957550048828125, 65.8875732421875, 33.776939392089844, 67.77529907226562, 0.6468238830566406, 17.96697998046875, 9.542085647583008, 17.677352905273438, 26.399612426757812, 22.457733154296875, -41.64263916015625, 38.696128845214844, 10.029983520507812, 22.171173095703125, 43.562164306640625, 16.865447998046875, 40.14939880371094, -7.308219909667969, 1.842559814453125, 2.3518600463867188, 6.1861572265625, 14.09033203125, 63.0826416015625, 14.946922302246094, 77.19459533691406, 1.293701171875, 32.06695556640625, 46.4876708984375, 60.660247802734375, 50.03126525878906, 5.8177337646484375, -0.0526123046875, 1.174041748046875, 1.90869140625, -1.1327171325683594, -31.390243530273438, 46.484214782714844, 3.3489990234375, 3.4368133544921875, 5.300445556640625, 0.1676025390625, 43.12718963623047, 45.170379638671875, 21.446746826171875, 38.2783203125, 2.78997802734375, 45.71870422363281, 19.910995483398438, -9.517532348632812, -42.0921630859375, 8.54315185546875, 2.6223087310791016, 24.933151245117188, 12.696582794189453, -48.83544921875, -3.0335693359375, -20.110946655273438, -5.772529602050781, 18.2418212890625, -22.03350830078125, 85.76544189453125, 35.7025146484375, 13.5323486328125, 
15.118896484375, -12.7344970703125, 6.803741455078125, 27.24978256225586, -46.8106689453125, 8.078372955322266, 41.45335388183594, -3.8563232421875, -27.075592041015625, 1.3683147430419922, 19.86431121826172, 33.17321014404297, 8.831268310546875, 52.72016906738281, 9.735157012939453, 15.576812744140625, 37.55938720703125, 11.732254028320312, 41.417755126953125, 30.9112548828125, 0.6421051025390625, 14.478851318359375, 3.903036117553711, 53.347808837890625, 9.051727294921875, 0.8981857299804688, 15.333152770996094, -11.3260498046875, 55.25750732421875, -12.2943115234375, 0.0, 27.306365966796875, 78.7913818359375, 1.42755126953125, 13.721851348876953, 17.860641479492188, 20.522262573242188, -32.015228271484375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000150.npy"}
|
||||
{"epoch": 0.31413612565445026, "step": 151, "batch_size": 128, "mean": 22.345552444458008, "std": 29.21235466003418, "min": -55.356048583984375, "p10": -14.748475646972654, "median": 26.612220764160156, "p90": 56.90081481933593, "max": 83.314697265625, "pos_frac": 0.75, "sample": [37.487213134765625, 20.792510986328125, 1.2170639038085938, -23.94061279296875, 0.108184814453125, 83.2581787109375, 53.800079345703125, -24.27324676513672, 1.703094482421875, 42.949798583984375, -1.300811767578125, -8.8392333984375, 29.554122924804688, 37.0194091796875, -7.3029327392578125, -16.0115966796875, -4.6390380859375, 37.97528076171875, 37.078155517578125, 25.79498291015625, -55.356048583984375, 1.2495574951171875, -11.6199951171875, 23.87505340576172, 3.9521484375, 37.7520751953125, 1.72137451171875, -4.554351806640625, 29.58056640625, 44.288543701171875, 52.439605712890625, 2.2582855224609375, -8.792587280273438, 19.028831481933594, 38.27606964111328, 14.549163818359375, 15.5570068359375, 17.16551971435547, -5.208984375, 80.94854736328125, 27.429458618164062, 32.8175048828125, 13.9940185546875, 18.230621337890625, 45.001373291015625, -23.650421142578125, 31.999664306640625, 9.0191650390625, 46.655548095703125, 46.08538055419922, 44.51853942871094, -8.364166259765625, 3.224599838256836, -0.07747650146484375, 37.950164794921875, 46.02421569824219, 15.2724609375, 29.12335205078125, -28.726959228515625, 20.704063415527344, -2.5128173828125, 83.314697265625, 48.40289306640625, -6.498699188232422, 29.630844116210938, 40.95066833496094, 31.23260498046875, -25.08978271484375, 38.927825927734375, 60.19281005859375, -37.43034362792969, -25.86871337890625, 44.8194580078125, 36.39086151123047, 35.42694854736328, 22.02025604248047, 55.489959716796875, 40.111602783203125, 76.0987548828125, -19.529541015625, -25.460693359375, 44.60821533203125, 72.49356079101562, 46.8604736328125, 78.65753173828125, 7.486839294433594, 43.288818359375, -0.5869636535644531, 0.0, 14.455032348632812, 
33.611480712890625, 40.2451171875, 32.60688018798828, -11.576141357421875, 34.705230712890625, 32.10430908203125, 52.3394775390625, 70.59646606445312, -14.021217346191406, 80.07583618164062, -37.423675537109375, 34.767547607421875, 2.874298095703125, 35.11540222167969, 63.64247131347656, 48.261444091796875, 11.197097778320312, -14.207138061523438, 30.941543579101562, 32.499267578125, 21.207672119140625, 66.93634033203125, 14.122909545898438, -12.612361907958984, 75.65457153320312, 20.74591064453125, 52.787078857421875, 37.37751770019531, 67.88137817382812, 23.7069091796875, 36.56886291503906, 6.470123291015625, 12.7713623046875, 29.993011474609375, 36.38227462768555, -0.247894287109375, 11.283843994140625, -33.80964660644531], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000151.npy"}
|
||||
{"epoch": 0.3162303664921466, "step": 152, "batch_size": 128, "mean": 16.900421142578125, "std": 29.232891082763672, "min": -57.13616943359375, "p10": -14.179257202148436, "median": 15.094715118408203, "p90": 57.10955810546874, "max": 100.65151977539062, "pos_frac": 0.703125, "sample": [34.83697509765625, 48.38214111328125, -4.1393890380859375, 18.260665893554688, 7.251922607421875, 34.66900634765625, 12.30926513671875, 34.384765625, 59.12530517578125, 28.813674926757812, 15.125694274902344, 39.202545166015625, 70.19569396972656, 6.065155029296875, 37.40869140625, -14.72235107421875, 1.416748046875, 4.118499755859375, 1.309600830078125, -24.23138427734375, -57.13616943359375, 28.99139404296875, -55.5750732421875, 22.222549438476562, -10.820381164550781, 37.421417236328125, 19.02379608154297, -0.7726478576660156, 55.598846435546875, -2.77783203125, -0.8103141784667969, 47.07386779785156, 56.478851318359375, 31.1075439453125, 33.23179626464844, 36.69090270996094, 50.064544677734375, 41.248809814453125, 62.077392578125, 30.10858154296875, -41.60091781616211, 100.65151977539062, -29.12689208984375, 4.1905517578125, 34.28639221191406, -1.6106910705566406, -11.032554626464844, 23.57769775390625, -9.337005615234375, 1.2877044677734375, 7.157379150390625, -7.569366455078125, 15.063735961914062, -9.15887451171875, -7.4104156494140625, 25.75164794921875, 10.96966552734375, -29.48297119140625, -9.36822509765625, 0.13780975341796875, 62.9141845703125, 17.49761962890625, -19.0496826171875, 19.032516479492188, 69.43756103515625, -23.28057861328125, 18.83349609375, -11.863227844238281, -6.3636627197265625, 46.8580322265625, 6.964305877685547, -10.480453491210938, -5.233661651611328, 9.196319580078125, 1.3998794555664062, 65.143310546875, 40.9395751953125, 20.67755126953125, 70.57501220703125, 32.757774353027344, 58.581207275390625, -30.390625, 1.8772335052490234, -7.455352783203125, -1.036041259765625, 4.649208068847656, 39.202606201171875, 35.45599365234375, 69.9864501953125, 
13.364572525024414, 7.232444763183594, -52.86529541015625, -10.132354736328125, 64.38522338867188, 35.92486572265625, 26.455856323242188, -15.6026611328125, 6.246429443359375, 24.543357849121094, -13.946502685546875, 18.69512939453125, 47.1373291015625, 27.454925537109375, 21.6932373046875, -13.15704345703125, 61.8331298828125, 4.6798858642578125, 12.695262908935547, -7.8529052734375, -6.014076232910156, 43.45479202270508, 10.680191040039062, -42.237548828125, 24.37603759765625, -2.216217041015625, 10.647796630859375, 31.24610137939453, 1.5595016479492188, 42.97840881347656, 61.33258056640625, 46.40887451171875, 53.467559814453125, 47.608116149902344, 38.22032165527344, 5.986930847167969, 25.9818115234375, -9.87347412109375, 43.42725372314453], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000152.npy"}
|
||||
{"epoch": 0.3183246073298429, "step": 153, "batch_size": 128, "mean": 24.71523666381836, "std": 27.823795318603516, "min": -59.232086181640625, "p10": -2.8497283935546855, "median": 21.27437973022461, "p90": 63.956304931640624, "max": 92.1790771484375, "pos_frac": 0.859375, "sample": [16.080047607421875, 0.8796806335449219, 75.80023193359375, 16.832183837890625, 61.76885986328125, 47.184410095214844, 47.213104248046875, 2.2196197509765625, 92.1790771484375, 8.543487548828125, 47.30525207519531, 3.9354782104492188, -17.83441162109375, 23.915969848632812, 35.80952453613281, 5.1792755126953125, -13.763290405273438, 1.4709625244140625, 31.535064697265625, 69.01866149902344, 64.22555541992188, 73.48664855957031, 35.84524154663086, 2.30615234375, 4.863739013671875, -4.2391357421875, 37.203208923339844, -2.303863525390625, 1.3808822631835938, 3.027099609375, -15.64520263671875, 5.390571594238281, 50.3831787109375, 5.2674102783203125, 2.421478271484375, 14.734573364257812, 53.717041015625, 45.732513427734375, 30.172956466674805, 0.0, 61.25115966796875, 41.04106140136719, 57.56455993652344, 23.0184326171875, 47.676513671875, 43.680511474609375, 47.691162109375, -1.747161865234375, 19.133758544921875, 20.707313537597656, 18.23468017578125, 52.50299072265625, -59.232086181640625, 5.1570587158203125, -4.1234130859375, 37.149169921875, 53.284210205078125, 37.7996826171875, -1.7268829345703125, 72.58718872070312, 17.992523193359375, 6.49591064453125, 22.145706176757812, 65.35107421875, -5.6090240478515625, 52.061798095703125, 8.7425537109375, 11.927520751953125, 18.658096313476562, 20.505706787109375, -6.0615234375, 22.479324340820312, 5.481903076171875, 40.39619445800781, 0.9742279052734375, 0.11200714111328125, 8.09991455078125, 32.620758056640625, 14.93853759765625, 38.69873046875, 36.7427978515625, -11.74676513671875, 49.35546875, 72.09860229492188, 27.146392822265625, 22.484451293945312, 6.729393005371094, 14.383560180664062, 40.965728759765625, -0.3667755126953125, 
2.8059539794921875, 21.841445922851562, 67.778076171875, 47.54803466796875, 56.68226623535156, 11.904102325439453, 6.173524856567383, 19.11322021484375, 7.06085205078125, 9.106903076171875, 48.793975830078125, 45.931182861328125, 40.42039489746094, 8.625679016113281, 3.1045074462890625, 1.644561767578125, 30.777618408203125, 80.11203002929688, 29.949317932128906, 74.58721923828125, 66.48641967773438, 24.250762939453125, 6.001434326171875, 63.840911865234375, -22.191064834594727, 29.175384521484375, 11.118614196777344, 62.74542236328125, 33.22047424316406, 6.607147216796875, -18.876708984375, 70.37580871582031, 60.248321533203125, 12.587165832519531, -48.819427490234375, -41.085723876953125, 33.70793151855469, 31.498565673828125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000153.npy"}
|
||||
{"epoch": 0.3204188481675393, "step": 154, "batch_size": 128, "mean": 17.784772872924805, "std": 32.62965774536133, "min": -80.38320922851562, "p10": -24.20823974609375, "median": 18.75176239013672, "p90": 61.30625762939453, "max": 86.71380615234375, "pos_frac": 0.7109375, "sample": [25.714675903320312, 50.81541442871094, -26.97174072265625, 65.32867431640625, 33.63702392578125, 6.627174377441406, -7.349082946777344, 60.367034912109375, 6.507164001464844, 16.980804443359375, -51.44352722167969, 38.653587341308594, 22.252159118652344, 15.504974365234375, 34.84893798828125, -80.38320922851562, 36.81694030761719, -27.510879516601562, 35.483360290527344, 4.258056640625, 13.097549438476562, -0.5292129516601562, 8.591339111328125, -25.095840454101562, -5.06817626953125, 4.8479461669921875, 34.3626708984375, 7.8025665283203125, 4.13134765625, 75.84921264648438, 12.3079833984375, -6.370344161987305, 59.853424072265625, 69.39450073242188, 5.628814697265625, 50.44279479980469, 32.4390869140625, 71.0606689453125, 42.19464111328125, 65.5447998046875, 3.84112548828125, 57.57171630859375, 66.58622741699219, 31.148956298828125, -53.25016784667969, 19.630508422851562, 83.80419921875, -0.9061660766601562, -9.795501708984375, -26.810089111328125, -5.421424865722656, -11.9730224609375, 18.429840087890625, -24.49688720703125, 27.33331298828125, 42.855194091796875, 32.088958740234375, -6.556365966796875, 3.5572052001953125, -23.503692626953125, -2.3413619995117188, 13.525924682617188, 48.152587890625, 24.186721801757812, 25.474868774414062, -53.5145263671875, -18.070524215698242, -0.079071044921875, -4.030967712402344, -7.873992919921875, -10.280590057373047, -47.748138427734375, 54.57026672363281, 69.77481079101562, -4.125968933105469, -8.684944152832031, 10.309074401855469, -5.479736328125, 31.637725830078125, 8.917556762695312, -37.56025695800781, 12.026634216308594, -1.766510009765625, 21.8359375, 35.568878173828125, 9.49786376953125, 61.406585693359375, 32.762664794921875, 
-6.154266357421875, 33.85712432861328, 20.269256591796875, 20.873443603515625, 61.26325988769531, 49.45136260986328, 39.002197265625, -16.825653076171875, 21.506134033203125, 7.217620849609375, 86.71380615234375, 15.679107666015625, 21.97796630859375, 49.578460693359375, -24.08453369140625, -66.2022705078125, 41.99170684814453, 28.567306518554688, 36.11164855957031, 32.686309814453125, 17.015533447265625, 24.670822143554688, 68.30165100097656, 35.01322937011719, 34.60711669921875, 55.87385559082031, 1.57177734375, 5.366830825805664, 86.47662353515625, 19.073684692382812, 17.992328643798828, 24.50030517578125, -43.17987060546875, -12.968009948730469, 59.499908447265625, 32.97901916503906, 3.352996826171875, 25.8192138671875, 38.82701110839844, 65.3281021118164], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000154.npy"}
|
||||
{"epoch": 0.3225130890052356, "step": 155, "batch_size": 128, "mean": 13.135330200195312, "std": 29.77726936340332, "min": -61.6258544921875, "p10": -22.23429946899414, "median": 11.651369094848633, "p90": 50.29101257324218, "max": 85.0806884765625, "pos_frac": 0.671875, "sample": [67.52911376953125, 26.647659301757812, 13.6627197265625, -5.619140625, 37.95301818847656, -0.1656055450439453, 10.715194702148438, -1.78643798828125, 7.141815185546875, 49.09214782714844, 44.89609146118164, 3.892223358154297, 30.96673583984375, -21.577194213867188, 22.644378662109375, 31.41082763671875, -17.76708984375, 74.598388671875, 85.0806884765625, -6.26495361328125, 13.548309326171875, 28.873046875, 47.001678466796875, 83.1846923828125, 5.609031677246094, 41.4833984375, -44.219879150390625, 0.0, -36.10137939453125, -9.167510986328125, -22.409774780273438, 57.40093994140625, 19.797073364257812, 48.881004333496094, 5.2369384765625, -16.496734619140625, -22.159095764160156, 42.756534576416016, -8.63116455078125, 0.9405670166015625, 8.8392333984375, 31.65325927734375, 18.71435546875, 1.9461669921875, 39.38165283203125, 55.198394775390625, 37.171417236328125, 14.80267333984375, 4.639617919921875, 36.89820098876953, -9.390220642089844, -54.22509765625, 8.361227035522461, 7.898284912109375, 3.3723487854003906, -35.488861083984375, -0.9071044921875, 38.949462890625, 75.69970703125, 53.08836364746094, 3.336017608642578, 4.549938201904297, 17.690093994140625, -7.6632537841796875, -45.01396179199219, -7.58067512512207, 45.730072021484375, 59.981201171875, 13.591747283935547, 16.8033447265625, -2.94970703125, -31.606361389160156, 38.310272216796875, 46.25262451171875, -13.80517578125, 37.073394775390625, 6.981964111328125, -7.267005920410156, 44.169708251953125, -27.74371337890625, 35.29258728027344, 55.59208679199219, -28.313079833984375, -15.565628051757812, 0.2064971923828125, 7.902305603027344, 43.057220458984375, -42.424896240234375, 1.729095458984375, 44.24896240234375, 
17.799591064453125, 25.456405639648438, 23.91241455078125, -0.303070068359375, -43.751373291015625, 13.138336181640625, 13.552001953125, 65.95353698730469, 11.591949462890625, -15.341346740722656, -16.910858154296875, -16.40984344482422, -4.392219543457031, 11.71078872680664, 18.818695068359375, 40.394615173339844, 4.007133483886719, -5.3936920166015625, -61.6258544921875, 0.08492279052734375, -43.9356689453125, 12.048622131347656, 37.858619689941406, -11.282867431640625, 56.556610107421875, 32.8394775390625, 38.211463928222656, 27.243255615234375, 15.232879638671875, 61.395660400390625, -21.06939697265625, 0.3041534423828125, -0.6997756958007812, 13.832061767578125, 20.36041259765625, 27.17694091796875, 15.980300903320312, -0.7693939208984375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000155.npy"}
|
||||
{"epoch": 0.32460732984293195, "step": 156, "batch_size": 128, "mean": 22.226245880126953, "std": 34.074127197265625, "min": -80.063232421875, "p10": -22.13450927734375, "median": 23.72943115234375, "p90": 61.586849975585935, "max": 101.627197265625, "pos_frac": 0.7578125, "sample": [39.67436981201172, 40.990211486816406, 30.8499755859375, 22.478179931640625, -1.425537109375, 5.710929870605469, 49.36412048339844, 48.500579833984375, 32.964019775390625, 29.947294235229492, 44.776763916015625, 28.592132568359375, 54.77734375, 33.129486083984375, 3.0925827026367188, -22.437042236328125, -6.012397766113281, 35.99627685546875, 9.63031005859375, 43.01502227783203, 2.12994384765625, 45.6817626953125, -12.059783935546875, -33.46434783935547, 29.693740844726562, 11.16162109375, -3.3456954956054688, 22.332717895507812, 0.8725547790527344, 46.40191650390625, 70.62435913085938, -6.3588409423828125, 73.86065673828125, -6.9459686279296875, 12.17469596862793, -25.4051513671875, -40.167999267578125, 5.860115051269531, 52.31591796875, 3.2676544189453125, 23.23912811279297, 60.271453857421875, 26.62042236328125, 12.346405029296875, 6.8531951904296875, 3.527923583984375, 45.028533935546875, -38.585968017578125, 47.76627731323242, 33.98638916015625, 3.6541824340820312, 37.45191955566406, 23.71075439453125, 23.74810791015625, 87.39788818359375, -26.6317138671875, -39.06329345703125, 91.4560546875, -1.52191162109375, 10.494026184082031, -8.121307373046875, 86.27740478515625, 36.318115234375, 52.39152526855469, 34.823890686035156, 57.732879638671875, 52.05877685546875, 14.403892517089844, 61.166229248046875, 6.2104949951171875, 21.80902099609375, 0.876495361328125, -6.950298309326172, 66.07302856445312, 41.35792541503906, 33.550994873046875, 101.627197265625, 19.1568603515625, 0.3681640625, -4.105678558349609, 85.7137451171875, 33.34199523925781, 73.6239013671875, 1.941162109375, 9.580780029296875, 24.890670776367188, 2.931163787841797, 35.47052001953125, 31.988197326660156, 
-28.838394165039062, 45.46197509765625, 52.4415283203125, 37.63568115234375, 47.17620849609375, 26.0941162109375, 13.38433837890625, -19.82122802734375, 60.48234558105469, 26.876251220703125, -22.004852294921875, 62.56829833984375, 9.721588134765625, 22.671478271484375, -28.45196533203125, -80.063232421875, 55.650115966796875, -4.869056701660156, 88.74554443359375, -1.8150177001953125, -39.991546630859375, 6.29083251953125, 31.46502685546875, 89.88031005859375, 12.847549438476562, -77.489990234375, -39.09109878540039, -0.44677734375, 47.056129455566406, 53.119140625, 58.4259033203125, -1.0068931579589844, 25.217147827148438, 43.870452880859375, 63.900146484375, -18.74920654296875, 61.00408935546875, -7.834236145019531, 2.944915771484375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000156.npy"}
|
||||
{"epoch": 0.3267015706806283, "step": 157, "batch_size": 128, "mean": 15.250144958496094, "std": 30.67500114440918, "min": -65.13735961914062, "p10": -22.11654815673828, "median": 11.462268829345703, "p90": 55.71862106323242, "max": 99.77426147460938, "pos_frac": 0.703125, "sample": [31.385284423828125, 0.4615478515625, 25.40167236328125, 65.29934692382812, -1.383331298828125, -24.532005310058594, 12.734161376953125, 18.0771484375, 56.0615234375, -7.33819580078125, 24.39068603515625, 0.123992919921875, -46.0482177734375, 26.92217254638672, 0.6785316467285156, 35.3046875, 12.384742736816406, -1.6470870971679688, 3.4684391021728516, -11.791854858398438, -16.039413452148438, 66.46826171875, 29.82470703125, -10.773651123046875, 49.73275375366211, -0.1400928497314453, -35.38783264160156, -1.0460586547851562, 5.2667388916015625, 16.31781005859375, 8.039154052734375, -6.95648193359375, 9.914688110351562, -53.36029052734375, 34.07030487060547, 55.71363830566406, 21.191604614257812, 32.96327209472656, -46.69126892089844, -65.13735961914062, 84.96905517578125, 15.8516845703125, 32.157928466796875, 18.808746337890625, 82.71954345703125, 40.91676330566406, 14.151885986328125, -10.757743835449219, 22.477386474609375, 0.0, 18.970291137695312, 51.46980285644531, 4.8519134521484375, -6.778411865234375, 68.61370849609375, 12.99493408203125, 41.225982666015625, -8.549896240234375, -5.2696533203125, 1.0552940368652344, 54.5474853515625, 24.357452392578125, 55.730247497558594, 10.1710205078125, 33.739036560058594, -3.1886024475097656, 51.709564208984375, -4.897186279296875, -51.513671875, 39.668670654296875, 0.95379638671875, -17.751434326171875, -46.270660400390625, 7.237457275390625, -12.404296875, 17.120330810546875, -36.11029052734375, 46.95672607421875, 7.560874938964844, 56.84503173828125, 20.2694091796875, 40.019683837890625, -0.5856170654296875, -6.7194366455078125, 2.64068603515625, 6.547393798828125, 6.6228790283203125, 14.415573120117188, 25.02151107788086, 0.0, 
-28.391876220703125, 62.97332763671875, 21.044326782226562, 14.71502685546875, 5.3458709716796875, -2.10992431640625, 29.924100875854492, 57.98456573486328, 99.77426147460938, 54.45747375488281, 10.539794921875, -21.2542724609375, 39.2427978515625, 7.541440963745117, 3.409912109375, 3.95428466796875, 36.31901550292969, 37.1328239440918, 30.9608154296875, -10.055450439453125, -24.128524780273438, 40.733673095703125, 4.78887939453125, 39.763702392578125, 31.577529907226562, 42.98919677734375, -4.9512939453125, 1.4143314361572266, -27.98004150390625, 48.75499725341797, -28.965240478515625, 8.038619995117188, 6.799558639526367, 35.51298522949219, 8.345474243164062, 20.49370574951172, 98.47796630859375, 60.348114013671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000157.npy"}
{"epoch": 0.3287958115183246, "step": 158, "batch_size": 128, "mean": 16.95669174194336, "std": 30.42383575439453, "min": -60.0465087890625, "p10": -25.479987335205077, "median": 17.193416595458984, "p90": 55.662550354003905, "max": 84.374267578125, "pos_frac": 0.71875, "sample": [5.153573989868164, -2.1939239501953125, 22.70489501953125, 55.54132080078125, -33.474822998046875, 4.958221435546875, -2.69403076171875, 23.719879150390625, -16.722320556640625, 56.26434326171875, -18.8648681640625, 0.34368896484375, -42.9755859375, -37.997772216796875, 24.15655517578125, 21.41192626953125, 35.40177536010742, 30.754981994628906, 11.281234741210938, 30.133743286132812, 22.245567321777344, 62.25103759765625, 41.371856689453125, 4.502998352050781, 17.22875213623047, 41.62422180175781, 12.533866882324219, -60.0465087890625, 47.39501953125, 22.204360961914062, -9.7908935546875, 34.67729187011719, -13.782684326171875, 56.64411926269531, 22.20367431640625, 6.8437957763671875, 40.036895751953125, -40.326908111572266, 19.45580291748047, -42.83038330078125, 36.48323059082031, 8.5887451171875, 15.191226959228516, -27.682941436767578, 15.31005859375, -1.19024658203125, -6.5496978759765625, 17.1580810546875, 33.65071105957031, 24.84939956665039, 67.20684814453125, -4.046205520629883, 2.010396957397461, 10.461769104003906, 36.09912109375, 49.15263366699219, -38.77405548095703, -19.545166015625, 65.77011108398438, 16.663909912109375, 64.0794677734375, 7.596107482910156, -3.0758056640625, 12.78912353515625, 37.11279296875, 12.923309326171875, -22.218505859375, 0.0, 32.152732849121094, 14.798309326171875, 70.23876953125, -6.7615509033203125, -3.9632415771484375, 49.47119140625, 23.896514892578125, 57.0740966796875, 5.5606842041015625, -26.37220001220703, -12.611557006835938, -2.122190475463867, 51.75190734863281, 11.153411865234375, -49.646697998046875, 14.001968383789062, 33.256378173828125, 37.8123779296875, 34.18804931640625, 0.8248386383056641, 43.945770263671875, 40.003753662109375, 
49.487823486328125, 30.517547607421875, 11.634910583496094, 19.611724853515625, 69.07379150390625, 43.85089111328125, 82.1248779296875, -6.384006500244141, -28.53973388671875, -25.097610473632812, 11.998565673828125, 84.140625, -19.16632080078125, 34.4833984375, 50.14472198486328, 47.9884033203125, 55.94541931152344, 35.67909240722656, 41.403961181640625, -1.9362449645996094, 18.5838623046875, 33.90179443359375, 37.68115997314453, 5.668304443359375, 28.828598022460938, -16.279815673828125, 54.460479736328125, 4.680103302001953, -34.5791015625, -7.21075439453125, 32.8109130859375, 49.939544677734375, -49.17144775390625, 13.556121826171875, 84.374267578125, 17.404281616210938, 27.747337341308594, 1.0868377685546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000158.npy"}
{"epoch": 0.3308900523560209, "step": 159, "batch_size": 128, "mean": 13.69869327545166, "std": 36.21403884887695, "min": -93.53118896484375, "p10": -32.44916687011718, "median": 10.236343383789062, "p90": 61.592655944824216, "max": 101.06227111816406, "pos_frac": 0.6171875, "sample": [-19.812744140625, 58.31526184082031, 25.8763427734375, 36.4154052734375, -4.84735107421875, -17.703460693359375, -39.90911865234375, 50.85968017578125, -3.78564453125, 66.573486328125, -56.13874816894531, 8.734256744384766, -93.53118896484375, -15.215980529785156, 8.97072982788086, 41.052146911621094, 13.146530151367188, -5.4609375, -36.66778564453125, 47.194374084472656, 30.787948608398438, 5.290008544921875, -13.912857055664062, -28.857269287109375, 23.15118408203125, 62.363525390625, 25.141189575195312, 61.26228332519531, 34.403106689453125, -6.5898895263671875, 23.625080108642578, 1.8177490234375, 43.99310302734375, 17.738006591796875, 21.188396453857422, -18.938064575195312, -3.30401611328125, -30.02679443359375, -4.028919219970703, -39.10382080078125, 47.99354553222656, 15.879806518554688, 28.787399291992188, 43.86742401123047, -12.2314453125, 22.8033447265625, -31.335235595703125, 3.952188491821289, -42.254119873046875, -17.922386169433594, 1.1539230346679688, -7.3661346435546875, 52.494140625, 11.584823608398438, 17.4937744140625, 52.36064147949219, 2.1524276733398438, 50.84745788574219, -10.2435302734375, -3.71044921875, 101.06227111816406, 10.538055419921875, -29.415321350097656, 13.12652587890625, 79.57476806640625, 13.76324462890625, 54.15484619140625, 3.2752685546875, -9.586456298828125, 12.120681762695312, 3.0973358154296875, 35.674468994140625, -4.407260894775391, 49.45684814453125, 38.95330810546875, -35.04833984375, 3.1631317138671875, 64.53097534179688, -18.6942138671875, 18.95794677734375, -21.25958251953125, -41.51318359375, 31.6715087890625, 74.4757080078125, 83.18692016601562, -12.04532241821289, -15.633285522460938, -37.72053527832031, -9.482330322265625, 
61.25079345703125, 28.24020004272461, 19.9080810546875, -23.942596435546875, 24.409210205078125, 12.295867919921875, 33.84185791015625, -37.211181640625, 66.46044921875, -13.202590942382812, 42.99233627319336, 35.909141540527344, -28.7293758392334, -35.8028564453125, -2.9750823974609375, 99.71310424804688, 80.59661865234375, 24.00518798828125, -56.46238708496094, 48.333763122558594, 8.50515365600586, -5.00341796875, -11.9962158203125, 66.974853515625, -49.901641845703125, 79.14093017578125, -8.41754150390625, 53.04638671875, 38.771514892578125, 54.86932373046875, 52.088531494140625, 8.5948486328125, 8.883617401123047, -3.974578857421875, 78.82394409179688, 9.93463134765625, 7.719573974609375, -11.82293701171875, 45.21445846557617], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000159.npy"}
{"epoch": 0.33298429319371725, "step": 160, "batch_size": 128, "mean": 20.639022827148438, "std": 31.429197311401367, "min": -51.63702392578125, "p10": -14.083749389648435, "median": 15.943557739257812, "p90": 59.43289947509765, "max": 114.4869384765625, "pos_frac": 0.7578125, "sample": [-6.513336181640625, 48.157470703125, 0.0, 90.79849243164062, 29.531524658203125, 26.5924072265625, 32.02923583984375, 23.391326904296875, 83.01959228515625, -4.034271240234375, 56.619842529296875, 69.6102294921875, 31.881759643554688, 72.77880859375, 18.138259887695312, -3.96490478515625, 74.73065185546875, 27.4151611328125, 8.782081604003906, 16.158218383789062, 2.9687957763671875, 11.608154296875, 31.621925354003906, 37.1842041015625, 0.2221527099609375, -4.6319580078125, -5.8231201171875, 81.95843505859375, 1.9356269836425781, -49.073822021484375, 15.105560302734375, 108.45574951171875, -38.64125061035156, 52.83690643310547, 19.369125366210938, 3.108013153076172, 9.524505615234375, 3.7650012969970703, 29.122833251953125, 12.398260116577148, 37.210853576660156, 52.69270324707031, 9.9058837890625, 0.0, 42.828125, -13.72430419921875, 36.633155822753906, -26.479171752929688, -16.621231079101562, 6.396942138671875, 19.9923095703125, 37.06877136230469, 114.4869384765625, 72.15126037597656, -20.52252197265625, 12.93359375, 28.54205322265625, -14.922454833984375, 11.089698791503906, -13.11407470703125, -7.58843994140625, 55.52684020996094, 8.277130126953125, -34.9837646484375, -10.185012817382812, 58.584259033203125, 55.20001220703125, 58.63975524902344, 5.254907608032227, 13.634124755859375, 34.27643585205078, 1.7552337646484375, -0.40714263916015625, 11.834197998046875, 26.481842041015625, 34.3905029296875, 72.8248291015625, 51.25634765625, 32.873748779296875, 19.6505126953125, -28.209548950195312, -6.1891021728515625, 44.14039611816406, -38.39513397216797, -13.470169067382812, 66.4942626953125, 35.7137451171875, 6.39306640625, 42.68681335449219, -12.012481689453125, 8.31866455078125, 
7.728782653808594, 2.896881103515625, -29.851226806640625, 26.34039306640625, -51.63702392578125, -1.3499603271484375, 29.961212158203125, 64.30636596679688, 0.0259246826171875, 14.085865020751953, 14.455398559570312, -20.003461837768555, 25.171112060546875, 19.6973876953125, 9.788810729980469, 9.396053314208984, -7.9871978759765625, 46.461334228515625, 26.645828247070312, 56.971282958984375, 0.4777679443359375, 16.963470458984375, 45.39777374267578, -6.58856201171875, 28.16742706298828, 57.175689697265625, 10.538726806640625, 31.431640625, 58.207183837890625, 61.2835693359375, 30.7203369140625, -36.916900634765625, 34.8880615234375, 15.728897094726562, 57.124786376953125, 1.855560302734375, 4.78248405456543], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000160.npy"}
{"epoch": 0.33507853403141363, "step": 161, "batch_size": 128, "mean": 22.093276977539062, "std": 34.35165786743164, "min": -71.69671630859375, "p10": -16.272779846191405, "median": 18.774826049804688, "p90": 70.64963989257812, "max": 89.4310302734375, "pos_frac": 0.7109375, "sample": [-3.9514541625976562, 1.8329277038574219, -2.448516845703125, -70.3853759765625, 1.543588638305664, 30.645660400390625, 3.802703857421875, 17.358245849609375, 6.874725341796875, 7.46551513671875, 4.424541473388672, 81.19970703125, 34.932525634765625, -36.636993408203125, 2.1052703857421875, 2.3757858276367188, 78.3009033203125, 23.061126708984375, 40.93076705932617, -5.081939697265625, 48.11566162109375, 15.999649047851562, 60.32585144042969, 80.903564453125, -23.878868103027344, 4.67840576171875, 5.514995574951172, 80.84689331054688, 9.187156677246094, 34.50139617919922, 6.308929443359375, 30.206636428833008, -6.5829315185546875, -16.036895751953125, 28.547996520996094, 25.943405151367188, 35.181304931640625, 55.884857177734375, -0.8652687072753906, 13.955863952636719, -5.1844482421875, -44.122467041015625, -35.14691162109375, -8.612907409667969, 12.736915588378906, 45.993743896484375, -16.464111328125, -1.798858642578125, 0.0, 41.92924499511719, 18.200531005859375, 41.104736328125, 7.6586151123046875, -0.005115509033203125, 14.57861328125, -1.13385009765625, 54.650901794433594, 59.21709442138672, 43.33000946044922, 31.358993530273438, 60.001312255859375, 67.81661987304688, -20.8046875, 19.34912109375, -12.20404052734375, 70.4727783203125, 59.08282470703125, 76.64324951171875, -16.190780639648438, 14.62298583984375, 82.00494384765625, 27.540401458740234, 40.28887939453125, 70.64434814453125, -42.98681640625, 72.2889404296875, 32.59880828857422, -2.50115966796875, 1.371124267578125, 40.22123718261719, -15.637222290039062, -0.17963409423828125, 89.4310302734375, 70.30569458007812, 28.621688842773438, 77.13812255859375, 2.6417236328125, 53.509613037109375, 84.38113403320312, 
-1.5894718170166016, 1.08880615234375, 35.881683349609375, -17.4541015625, 47.40667724609375, -13.540069580078125, 29.549560546875, 70.6619873046875, -9.021820068359375, 39.20257568359375, -8.091400146484375, 27.847030639648438, 63.0430908203125, 25.056793212890625, -17.22198486328125, 15.119705200195312, 46.06449890136719, -15.186004638671875, 30.995086669921875, 42.386024475097656, 52.846710205078125, 60.34991455078125, 69.748291015625, -17.068992614746094, 40.934417724609375, 28.829626083374023, -54.053558349609375, 74.04450988769531, 10.9947509765625, 11.242912292480469, 0.0, -8.119140625, 75.15631103515625, 60.85772705078125, -71.69671630859375, 28.36676025390625, 20.243118286132812, 2.2310028076171875, 54.9818115234375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000161.npy"}
{"epoch": 0.33717277486910996, "step": 162, "batch_size": 128, "mean": 19.94671630859375, "std": 33.15103530883789, "min": -79.14884948730469, "p10": -19.753155517578122, "median": 18.008331298828125, "p90": 65.37911682128906, "max": 91.5325927734375, "pos_frac": 0.75, "sample": [-6.465728759765625, 6.6275634765625, 68.2301025390625, 40.29224395751953, 3.8594131469726562, 17.8624267578125, -16.77447509765625, 29.567901611328125, 6.548942565917969, 25.430030822753906, 55.421424865722656, 87.44888305664062, 14.849754333496094, 28.093852996826172, -5.84716796875, -15.272476196289062, 24.92523193359375, 30.168548583984375, 1.9746322631835938, 4.176166534423828, -33.032501220703125, 41.21437072753906, -36.4796142578125, 33.26318359375, 18.25592041015625, 35.493377685546875, 67.0238037109375, 11.1837158203125, -43.74810791015625, 7.536201477050781, -1.8619117736816406, 20.5662841796875, 38.08549499511719, -31.152122497558594, 81.63983154296875, 8.875518798828125, 18.15423583984375, -65.14410400390625, -10.909343719482422, 39.984527587890625, 4.096702575683594, 57.8782958984375, 3.2031631469726562, 67.36016845703125, 66.6278076171875, -79.14884948730469, 62.332275390625, 35.68205261230469, 29.87353515625, 66.80023193359375, -3.600492477416992, 43.09710693359375, 41.735748291015625, 2.58209228515625, 29.202774047851562, -3.2375526428222656, -10.939765930175781, 43.94694519042969, 73.42974853515625, 28.37158203125, 76.74737548828125, 89.47021484375, -2.44500732421875, 0.98883056640625, 40.64044189453125, 57.74803161621094, 59.0018310546875, 72.04106140136719, -19.148834228515625, 51.806732177734375, -1.1934070587158203, 8.909496307373047, -31.85186767578125, 18.318359375, 91.5325927734375, -25.585845947265625, -27.467010498046875, 2.377838134765625, 19.546478271484375, 59.0888671875, -6.534854888916016, 15.21826171875, 45.21149826049805, 33.554046630859375, 64.84396362304688, 61.31201171875, 7.041963577270508, -1.3883056640625, 32.33460998535156, 23.208251953125, 
11.19244384765625, 15.866790771484375, 8.521978378295898, -62.1468505859375, 5.113006591796875, 25.824661254882812, 57.612083435058594, 37.4757080078125, 17.43011474609375, 62.21781921386719, 45.98414611816406, -32.64483642578125, 52.38018798828125, 8.719963073730469, 1.8512420654296875, 68.8209228515625, -18.429916381835938, 6.4893646240234375, -3.8227920532226562, 49.99445343017578, -35.26484680175781, 4.649528503417969, -9.785003662109375, 40.124847412109375, 29.725677490234375, 38.5150146484375, 13.284587860107422, -2.808074951171875, 25.81292724609375, 21.53900146484375, 7.085268020629883, -21.163238525390625, 1.2613067626953125, -0.430267333984375, 5.697151184082031, 63.02024841308594, 28.204833984375, 10.572860717773438], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000162.npy"}
{"epoch": 0.3392670157068063, "step": 163, "batch_size": 128, "mean": 23.132179260253906, "std": 28.996917724609375, "min": -50.33282470703125, "p10": -12.3890998840332, "median": 24.34624481201172, "p90": 61.40043334960937, "max": 89.06167602539062, "pos_frac": 0.7578125, "sample": [24.9832763671875, -2.772308349609375, 44.11907196044922, -14.174423217773438, 28.7025146484375, -11.796440124511719, 28.0538330078125, 23.417343139648438, 36.04974365234375, 5.5303497314453125, 2.2100067138671875, 66.50241088867188, 45.186767578125, 16.153900146484375, 72.75543212890625, 17.41864013671875, 12.525833129882812, -13.77197265625, 10.576629638671875, -9.826705932617188, -19.054840087890625, 38.834346771240234, 28.15765380859375, -1.2082672119140625, 17.83795166015625, 54.38520812988281, -6.378931045532227, 16.25128173828125, 50.14385986328125, -2.999908447265625, -6.093719482421875, 61.2830810546875, -1.500457763671875, 46.668365478515625, 35.21588134765625, 28.4847412109375, 10.760589599609375, 50.485321044921875, -11.50152587890625, -5.060577392578125, 38.564697265625, 35.1947021484375, -26.031784057617188, 11.497276306152344, 12.866348266601562, 26.264633178710938, -45.665557861328125, 64.06961059570312, 46.86608123779297, -36.6484375, 37.728206634521484, 7.754638671875, 40.62835693359375, 34.653778076171875, -2.8683853149414062, 59.57261657714844, 19.328338623046875, 66.16485595703125, 66.54031372070312, 40.729339599609375, 3.4034423828125, -5.086578369140625, -11.6207275390625, 38.11867904663086, -9.41717529296875, -11.24847412109375, 9.370973587036133, 50.69776916503906, 5.047119140625, 35.11024475097656, -50.33282470703125, 61.521087646484375, 26.46660614013672, 23.709213256835938, 14.11199951171875, -16.785125732421875, 76.5831298828125, -14.642242431640625, 67.60140991210938, 68.05828857421875, 34.027435302734375, 46.171630859375, 22.3779296875, 15.969585418701172, 44.62805938720703, 41.36371612548828, 37.76573181152344, 0.39312744140625, 65.53971862792969, 
51.02326965332031, 5.588628768920898, 6.0391845703125, 26.222679138183594, 89.06167602539062, 36.516937255859375, -34.48443603515625, -28.28327178955078, 28.083261489868164, 79.28582763671875, 32.99333190917969, 55.7921142578125, 51.65496826171875, -19.0697021484375, 54.35064697265625, -3.6986312866210938, 15.9249267578125, 61.348724365234375, 19.428878784179688, 22.6983642578125, 30.62706756591797, 60.170166015625, 37.473419189453125, 18.49481201171875, 0.007946014404296875, 16.51313018798828, 12.419281005859375, 36.507354736328125, 57.28800964355469, 27.218719482421875, 65.31298065185547, -3.1865081787109375, 4.042266845703125, -33.38739013671875, 59.993438720703125, 51.524871826171875, 33.533226013183594, 9.136030197143555, -5.914466857910156], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000163.npy"}
{"epoch": 0.3413612565445026, "step": 164, "batch_size": 128, "mean": 19.8585262298584, "std": 31.77775001525879, "min": -55.79986572265625, "p10": -14.006382751464843, "median": 17.153215408325195, "p90": 62.72550354003906, "max": 101.06640625, "pos_frac": 0.7265625, "sample": [-10.5599365234375, 22.597366333007812, -2.5800552368164062, 56.1627197265625, -0.45665740966796875, -2.8665924072265625, -7.324300765991211, -3.5358428955078125, 27.39093017578125, 31.481201171875, 34.8623046875, -13.920806884765625, 45.65849304199219, 45.376556396484375, 25.412139892578125, -47.79473876953125, -1.85760498046875, 54.847320556640625, 15.63327407836914, 68.50210571289062, -0.205963134765625, -47.44012451171875, 49.59736633300781, 13.447784423828125, 11.565414428710938, -26.33639907836914, 28.815032958984375, 7.15234375, -14.19512939453125, 4.1024169921875, 37.82139587402344, 0.165496826171875, 3.13232421875, 62.32151794433594, -6.652099609375, 51.740447998046875, -37.785858154296875, 3.9062347412109375, 1.4758529663085938, -45.8038330078125, -3.0674591064453125, 71.31954956054688, 46.54351806640625, 33.6934814453125, 67.03561401367188, 7.9955291748046875, 54.508270263671875, -1.9925537109375, 0.560882568359375, 4.139739990234375, -31.013275146484375, 12.157051086425781, 18.67315673828125, 29.717111587524414, -14.414840698242188, 64.4716796875, -1.5008087158203125, 43.670875549316406, 32.795013427734375, 32.90081787109375, 0.631805419921875, 9.152069091796875, 65.1094970703125, 67.05799865722656, -13.925491333007812, 65.99247741699219, 19.541419982910156, -6.8061676025390625, 45.595123291015625, 14.77166748046875, 82.22576904296875, 30.376197814941406, 70.08663940429688, 5.45654296875, -9.218500137329102, -0.6343002319335938, 23.869613647460938, 5.753442764282227, -50.037322998046875, 23.423583984375, 32.66375732421875, 56.968231201171875, -18.180187225341797, 3.5025177001953125, 0.170135498046875, 40.63319396972656, 54.590423583984375, 62.26116943359375, 41.036773681640625, 
22.03948974609375, -5.639360427856445, 48.87506103515625, 0.4476470947265625, 56.727691650390625, 11.0704345703125, 53.501922607421875, 51.14227294921875, 66.54144287109375, -13.073577880859375, 1.1133880615234375, -44.092681884765625, 29.995437622070312, 0.5197372436523438, 30.079193115234375, 10.21844482421875, -8.475128173828125, 58.173248291015625, -11.104766845703125, 48.397361755371094, 7.1380767822265625, 35.23974609375, 46.833709716796875, 8.72918701171875, 18.79266357421875, -55.79986572265625, -22.966156005859375, 7.3905181884765625, -9.015472412109375, 101.06640625, 19.39300537109375, 30.266849517822266, 45.67889404296875, 15.565963745117188, 34.33306884765625, 47.127777099609375, 63.66813659667969, 99.19229125976562, 18.685842514038086], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000164.npy"}
{"epoch": 0.34345549738219894, "step": 165, "batch_size": 128, "mean": 16.75387954711914, "std": 30.281906127929688, "min": -63.97505187988281, "p10": -16.347946166992188, "median": 9.46612548828125, "p90": 56.8911865234375, "max": 93.5550537109375, "pos_frac": 0.6640625, "sample": [5.7909088134765625, 57.28057861328125, -10.177696228027344, 48.93701171875, -9.64727783203125, 4.367362976074219, 35.06132507324219, 3.140045166015625, 50.69281005859375, 43.887664794921875, 68.77166748046875, 3.800426483154297, -61.10400390625, 45.90122985839844, 65.49179077148438, 14.868438720703125, -7.98297119140625, 71.81930541992188, -6.324737548828125, 45.88421630859375, 42.64518737792969, -48.53143310546875, 55.667724609375, 42.625457763671875, 8.045730590820312, -22.14986801147461, -24.9200439453125, -13.004653930664062, 18.02423095703125, 35.573028564453125, 5.682586669921875, -8.55902099609375, -2.949005126953125, 1.1291046142578125, 1.361083984375, 13.609382629394531, 29.6424560546875, -4.8748779296875, -6.852756500244141, -10.245437622070312, 15.75408935546875, 15.089897155761719, 57.348785400390625, -18.381546020507812, 14.15606689453125, 56.72430419921875, -1.3011112213134766, -1.265106201171875, -16.305877685546875, 10.366775512695312, 88.422607421875, 3.6370391845703125, 4.937568664550781, 1.318939208984375, 62.210540771484375, 31.185592651367188, -10.221710205078125, 5.368841171264648, -5.180286407470703, 10.40069580078125, -3.994720458984375, 66.64617919921875, -9.664901733398438, 44.168701171875, 27.974578857421875, -16.4586181640625, 25.128005981445312, -4.1191864013671875, -4.13360595703125, -7.2411956787109375, -0.213592529296875, -2.5748291015625, 25.96966552734375, 18.769723892211914, 30.886093139648438, 4.6380157470703125, 40.06437683105469, -25.45489501953125, 44.58439636230469, 17.29217529296875, -31.628387451171875, 28.515213012695312, 69.24484252929688, -16.44610595703125, 8.179557800292969, 42.35699462890625, -3.8615570068359375, 65.4539794921875, 
55.685546875, 20.53533935546875, 13.06884765625, 1.685211181640625, 40.795166015625, 45.044193267822266, 54.22267150878906, -17.4830322265625, -14.305419921875, 22.679962158203125, 45.264678955078125, 5.9835052490234375, 93.5550537109375, -2.9412002563476562, 36.28228759765625, 43.02227020263672, 62.17250061035156, -35.764251708984375, -8.537544250488281, 15.536590576171875, 55.47993469238281, 1.95684814453125, 0.9124813079833984, 6.138710021972656, 61.425018310546875, 32.294708251953125, 52.48286437988281, 49.5433349609375, 6.117340087890625, 15.442787170410156, 8.565475463867188, 49.74853515625, -1.2789154052734375, 46.9990234375, -63.97505187988281, -30.379791259765625, -3.4293289184570312, 54.100242614746094, -8.012344360351562, -6.86181640625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000165.npy"}
{"epoch": 0.34554973821989526, "step": 166, "batch_size": 128, "mean": 15.34017276763916, "std": 36.540122985839844, "min": -79.95416259765625, "p10": -27.148390197753905, "median": 11.33251953125, "p90": 64.21768646240233, "max": 106.51730346679688, "pos_frac": 0.6640625, "sample": [13.7596435546875, 4.08538818359375, -27.851882934570312, 16.850326538085938, -2.009185791015625, 11.538253784179688, 66.96905517578125, 60.414947509765625, -2.907857894897461, 3.140298843383789, 11.872726440429688, -4.480995178222656, 106.31109619140625, -18.551551818847656, 3.05914306640625, -8.601776123046875, 23.487884521484375, 24.691436767578125, 3.9770965576171875, 2.1057357788085938, -1.7837638854980469, -31.90179443359375, 26.521804809570312, 68.89799499511719, -58.5517578125, -26.846893310546875, 1.7537879943847656, 33.14176940917969, -58.01007080078125, -20.923309326171875, -10.27508544921875, 18.840087890625, 27.846282958984375, 77.220458984375, 63.03852844238281, 80.96923828125, 3.9964828491210938, -10.392059326171875, -15.867088317871094, -75.92521667480469, 6.031219482421875, -31.944862365722656, 5.041221618652344, -2.943634033203125, -18.971847534179688, 69.916748046875, 31.044055938720703, 19.861465454101562, -14.850372314453125, 38.19647216796875, -2.0679931640625, 92.15185546875, -47.17724609375, 72.99169921875, 98.08517456054688, -53.63189697265625, -34.50994873046875, 94.76336669921875, 36.99835205078125, 44.114593505859375, -28.150634765625, -9.731719970703125, -20.71832275390625, 39.028717041015625, 50.755584716796875, -25.129302978515625, 54.72735595703125, 36.7210693359375, 6.484222412109375, -17.238494873046875, 24.185028076171875, 26.17315673828125, 74.62442016601562, -8.475069046020508, -15.592697143554688, 22.952682495117188, 15.156387329101562, -0.9434566497802734, 13.081962585449219, 11.45904541015625, -5.264070510864258, 45.9302978515625, 17.85589599609375, 10.130264282226562, 11.170196533203125, 74.41127014160156, 0.99896240234375, 43.265892028808594, 
18.245635986328125, -40.7215576171875, 5.5201416015625, -7.45501708984375, 12.510704040527344, 57.4332275390625, 51.466217041015625, 28.107131958007812, 17.54132080078125, 52.52776336669922, 29.42400360107422, 7.4921875, 23.575241088867188, 1.3095951080322266, -3.0922012329101562, 51.48863220214844, 1.9631919860839844, 33.0191650390625, 52.9725341796875, 48.60400390625, -11.442535400390625, 62.29969024658203, -47.566131591796875, -79.95416259765625, 17.59161376953125, 43.905792236328125, 58.983917236328125, -20.56824493408203, 11.20599365234375, -0.22528076171875, 19.34613037109375, -0.6754302978515625, 62.3990478515625, -1.5612716674804688, 106.51730346679688, 13.597919464111328, 2.027587890625, 33.93431091308594, 10.673141479492188, 10.544601440429688], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000166.npy"}
{"epoch": 0.34764397905759165, "step": 167, "batch_size": 128, "mean": 18.537555694580078, "std": 33.40691375732422, "min": -64.0133056640625, "p10": -17.06414337158203, "median": 12.99951171875, "p90": 65.3848648071289, "max": 102.20587158203125, "pos_frac": 0.71875, "sample": [47.180030822753906, 5.1165924072265625, 15.0877685546875, 44.845184326171875, 14.98193359375, 55.3729248046875, 46.7694091796875, 75.68939208984375, 34.035552978515625, 15.267684936523438, 61.05802917480469, -50.795013427734375, 66.83682250976562, 1.2301025390625, 13.488922119140625, 78.662841796875, 15.0555419921875, 70.49215698242188, 0.77935791015625, 12.467315673828125, 3.472564697265625, 33.60784912109375, -57.186065673828125, -64.0133056640625, -2.7063827514648438, 0.8333053588867188, -29.432479858398438, 54.267852783203125, 33.9404296875, 57.9378662109375, 29.92742919921875, 65.36653137207031, 18.538108825683594, 65.42764282226562, 42.56986999511719, 8.559280395507812, 16.57525634765625, -34.833404541015625, 20.3328857421875, -8.398796081542969, 70.29290771484375, -4.206535339355469, 10.461639404296875, 3.469696044921875, -0.0649871826171875, -62.439300537109375, 0.0, 1.346923828125, 0.659271240234375, 35.39100646972656, 12.510101318359375, 75.75616455078125, -32.55906677246094, 66.58416748046875, -13.963760375976562, 15.5206298828125, 47.009361267089844, 11.517913818359375, 34.331756591796875, 1.900390625, -33.48114013671875, 1.66290283203125, 84.6685791015625, 0.20918846130371094, 28.83953857421875, 11.749771118164062, 19.001007080078125, -16.747787475585938, 38.39311218261719, 3.9432945251464844, -0.7536468505859375, 1.2341995239257812, 40.563819885253906, -38.00309753417969, 49.30570983886719, 45.358184814453125, 58.78187561035156, -1.292236328125, 29.2464599609375, 2.8833160400390625, -2.64849853515625, 47.16337585449219, 48.73722839355469, -28.164093017578125, 56.7283935546875, -6.106819152832031, 17.983489990234375, -4.737348556518555, 34.071258544921875, 40.464385986328125, 
36.9818115234375, 9.503303527832031, 42.03952407836914, 18.802886962890625, 39.188140869140625, 86.1708984375, -5.178352355957031, -17.80230712890625, -13.324295043945312, -11.093124389648438, -4.978630065917969, 6.4429473876953125, -11.119842529296875, 71.02078247070312, -9.169281005859375, 9.394287109375, 102.20587158203125, 21.60284423828125, 45.567138671875, -12.810760498046875, 9.43292236328125, 44.5064697265625, -33.77949523925781, -4.529930114746094, -2.3002471923828125, 2.7601470947265625, 46.081298828125, 4.74407958984375, -3.75146484375, 53.61846923828125, 47.13554382324219, 31.61455535888672, 9.341094970703125, -2.81634521484375, -56.03961181640625, 56.214569091796875, 6.1075439453125, 80.04185485839844], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000167.npy"}
{"epoch": 0.34973821989528797, "step": 168, "batch_size": 128, "mean": 23.75702667236328, "std": 33.21804428100586, "min": -69.58514404296875, "p10": -11.270620346069336, "median": 19.981605529785156, "p90": 66.21952972412109, "max": 98.6707763671875, "pos_frac": 0.75, "sample": [32.53715515136719, 18.14617919921875, 77.30770874023438, 67.32061767578125, 47.837188720703125, 26.08506965637207, 45.65815734863281, 23.571575164794922, 51.6363525390625, -8.072181701660156, 52.52093505859375, -12.354232788085938, 5.2264404296875, 13.868888854980469, 0.9801101684570312, 4.4579315185546875, 26.057083129882812, 31.97821044921875, -5.772335052490234, 54.18074035644531, 17.058486938476562, 61.928802490234375, 19.152725219726562, 25.537567138671875, 60.50006103515625, -1.5465641021728516, 17.503372192382812, 48.553951263427734, -2.1728591918945312, 0.5981216430664062, 0.0, 13.932708740234375, 77.84600830078125, 17.841522216796875, 66.10523986816406, 33.03887939453125, 17.910125732421875, 10.218353271484375, 32.032684326171875, 2.08154296875, 66.4862060546875, 16.51824951171875, -0.50958251953125, 44.851348876953125, -5.880706787109375, 95.47772216796875, 18.391250610351562, 35.98638916015625, 82.03875732421875, -13.6337890625, 55.8863525390625, -4.8936614990234375, 35.518306732177734, 15.806640625, -46.130767822265625, 37.83860778808594, 51.842926025390625, 50.289794921875, 50.39374542236328, 4.5138092041015625, -25.041168212890625, -24.234756469726562, 2.0860595703125, 0.6355133056640625, 45.62030029296875, -0.4025764465332031, 59.968505859375, 54.207550048828125, 98.6707763671875, 26.7115478515625, 79.053955078125, 8.608293533325195, -69.58514404296875, 44.725616455078125, 67.91275024414062, 0.4581012725830078, 4.4808349609375, 52.06498718261719, 6.1936187744140625, -5.656280517578125, -3.973968505859375, 6.694244384765625, 39.23930358886719, 20.81048583984375, 64.28338623046875, 55.562774658203125, 66.05731201171875, 41.49200439453125, 79.78045654296875, 55.24945068359375, 
-24.361526489257812, -8.199615478515625, 62.935791015625, -11.787200927734375, 7.2057342529296875, 45.071929931640625, -11.04922866821289, 22.325408935546875, 33.022735595703125, -57.6451416015625, 16.514511108398438, 10.055343627929688, 54.24376678466797, 35.432037353515625, 28.597091674804688, 76.59832763671875, -3.7053604125976562, -3.92144775390625, 4.8700714111328125, -1.9119644165039062, 7.92388916015625, -0.110809326171875, -29.848388671875, -36.716827392578125, 56.443519592285156, 24.07257080078125, 48.99029541015625, 68.33920288085938, 64.69073486328125, 46.63153076171875, -3.422088623046875, 34.40814208984375, 5.796783447265625, -9.33648681640625, -38.344268798828125, 10.777999877929688, 67.77191162109375, -65.21762084960938], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000168.npy"}
{"epoch": 0.3518324607329843, "step": 169, "batch_size": 128, "mean": 27.657325744628906, "std": 34.7620735168457, "min": -59.22251892089844, "p10": -8.127868652343748, "median": 20.23773956298828, "p90": 76.30848693847656, "max": 112.009765625, "pos_frac": 0.7734375, "sample": [-1.7052078247070312, 47.69425964355469, 90.09234619140625, 36.68878173828125, -11.841583251953125, 11.921836853027344, 2.416322708129883, 6.28424072265625, 1.937652587890625, 23.209625244140625, 23.303253173828125, 58.59364318847656, 59.114009857177734, 58.130462646484375, -32.816162109375, -1.9636917114257812, 53.318511962890625, 10.31817626953125, 5.391912460327148, -3.6209716796875, -2.80157470703125, 11.116241455078125, 52.964813232421875, 15.2857666015625, 54.830291748046875, 67.2474365234375, 27.848297119140625, -14.541839599609375, 41.01641845703125, 50.303314208984375, 65.2269287109375, 67.9688720703125, 56.5032958984375, -7.6844024658203125, 3.952667236328125, 9.719207763671875, -14.518434524536133, -59.22251892089844, 28.240631103515625, -1.401123046875, -9.005905151367188, 67.69482421875, -8.869247436523438, 15.042411804199219, 0.0, -45.754920959472656, 8.868274688720703, 33.82353210449219, 19.952102661132812, 20.656570434570312, -2.7232666015625, 9.996978759765625, 112.009765625, 63.606536865234375, 43.40088653564453, 53.65873718261719, -51.12073516845703, 54.460243225097656, 0.5366535186767578, -2.0580596923828125, 76.349609375, 50.28497314453125, 30.77667236328125, -10.435897827148438, 41.943359375, 77.6082763671875, 65.87379455566406, 89.7322998046875, 17.81427001953125, 71.15029907226562, 1.394683837890625, 91.453857421875, 7.2316741943359375, 2.469268798828125, 40.03118896484375, -35.156768798828125, 8.301513671875, 79.51193237304688, 45.889068603515625, -6.5166015625, 52.64697265625, 8.827415466308594, 11.270479202270508, 51.886993408203125, 24.238052368164062, 55.37156677246094, 67.41473388671875, -7.8101348876953125, 41.117401123046875, 37.00840759277344, 
0.5136032104492188, 39.18426513671875, 6.6714019775390625, 102.76969909667969, 33.30584716796875, 10.438282012939453, 14.384033203125, 5.35321044921875, 85.46006774902344, 40.650177001953125, 7.339677810668945, 52.43269348144531, 94.72024536132812, 16.923797607421875, -30.899688720703125, -29.91900634765625, 9.2830810546875, -1.806640625, 95.1416015625, 20.52337646484375, 47.37933349609375, -0.0049724578857421875, 69.70867919921875, 53.50250244140625, 55.050689697265625, 0.0, 14.409912109375, 3.2582550048828125, 24.752197265625, 45.56022644042969, 18.3275146484375, -0.071197509765625, 10.860076904296875, 15.019763946533203, 90.05119323730469, 88.97454833984375, 76.29086303710938, -5.7779693603515625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000169.npy"}
|
||||
{"epoch": 0.3539267015706806, "step": 170, "batch_size": 128, "mean": 24.099132537841797, "std": 32.65823745727539, "min": -58.938262939453125, "p10": -9.31449737548828, "median": 20.89642333984375, "p90": 67.2322296142578, "max": 114.15716552734375, "pos_frac": 0.765625, "sample": [4.990478515625, 14.936126708984375, 53.83561706542969, -7.8446197509765625, 51.648956298828125, 11.045166015625, 0.32941436767578125, 103.83843994140625, 16.93628692626953, 9.459136962890625, 30.263427734375, 91.4266357421875, 44.39833068847656, 26.006446838378906, -39.7723388671875, 104.688232421875, -25.8184814453125, 32.68719482421875, -1.1203231811523438, 15.233037948608398, -33.85064697265625, 4.9285888671875, 35.49884796142578, 43.868804931640625, 5.937736511230469, -28.297622680664062, -18.775447845458984, 3.6981916427612305, 69.21966552734375, -8.093536376953125, 29.351089477539062, 21.63298988342285, 36.48982238769531, 12.958883285522461, 15.374267578125, -0.783111572265625, 1.85589599609375, 3.8233795166015625, 62.94770812988281, 57.0689697265625, 1.1685638427734375, 36.994873046875, 6.2058258056640625, 66.91790771484375, 53.858734130859375, 36.74250793457031, 32.675811767578125, -1.9022216796875, -22.595138549804688, 67.91238403320312, 68.23573303222656, 55.510292053222656, 52.975830078125, -47.7265625, 28.613739013671875, -7.6554107666015625, 72.07455444335938, -11.9119873046875, 66.80281066894531, 47.687828063964844, 20.509765625, 76.98281860351562, 59.382843017578125, -9.7320556640625, 44.967193603515625, 44.539039611816406, -2.86236572265625, 14.3363037109375, 60.99320983886719, 2.96533203125, 33.888458251953125, 56.6558837890625, 7.712785720825195, 21.2830810546875, 46.731895446777344, 38.203399658203125, 0.0, 69.27914428710938, 19.653419494628906, 46.52079772949219, -32.70953369140625, 48.610015869140625, 4.5963134765625, 43.2760009765625, 58.611907958984375, 33.59870910644531, -31.8345947265625, 11.7625732421875, 39.83203125, -58.938262939453125, 29.642364501953125, 
25.879852294921875, -5.39581298828125, 37.090911865234375, 0.8661651611328125, -2.5828495025634766, 70.83697509765625, -6.919189453125, 28.1517333984375, 114.15716552734375, -0.61846923828125, -9.135543823242188, 34.36406707763672, -2.04071044921875, -7.1434326171875, 14.470367431640625, 26.0361328125, 11.328567504882812, 2.0461273193359375, 13.916107177734375, -3.0585174560546875, 7.189056396484375, 86.6077880859375, 16.710479736328125, -25.164833068847656, 23.058547973632812, 8.399978637695312, 63.3740234375, 23.23705291748047, 1.717916488647461, 1.384796142578125, 40.479400634765625, 73.22755432128906, 61.710105895996094, 5.667842864990234, 34.94098663330078, -0.1081390380859375, 66.94073486328125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000170.npy"}
|
||||
{"epoch": 0.35602094240837695, "step": 171, "batch_size": 128, "mean": 24.56855583190918, "std": 33.56206512451172, "min": -37.090511322021484, "p10": -18.2413330078125, "median": 21.135398864746094, "p90": 73.10483093261718, "max": 95.71255493164062, "pos_frac": 0.75, "sample": [91.77023315429688, 14.38800048828125, 32.28228759765625, 13.42095947265625, 62.209136962890625, 67.4764404296875, 64.53875732421875, -29.9627685546875, -8.13459587097168, 20.93389892578125, -21.496627807617188, 0.26464080810546875, 78.35342407226562, -19.442214965820312, 1.6009368896484375, 3.1608734130859375, -13.480720520019531, 19.525144577026367, 30.19000244140625, 78.57965087890625, 54.68855285644531, -5.8921051025390625, 18.792144775390625, 83.71710205078125, 66.28240966796875, 29.401748657226562, -25.149383544921875, 10.851898193359375, -25.7137451171875, 22.80780029296875, 10.925201416015625, 11.486038208007812, -10.917938232421875, 57.28910446166992, 55.97918701171875, 75.76779174804688, 55.50920104980469, 2.669921875, 24.573646545410156, -5.7774200439453125, 14.376129150390625, -14.231903076171875, -6.663211822509766, -6.3862457275390625, -17.41297149658203, 85.49215698242188, 84.037841796875, 11.239070892333984, -35.895660400390625, 4.77569580078125, 68.82705688476562, 28.296875, 8.207427978515625, 65.92276000976562, 33.772674560546875, 31.852127075195312, 40.410125732421875, 76.28009033203125, 81.49053955078125, -14.689666748046875, 47.75262451171875, 28.294570922851562, -8.4290771484375, 58.01697540283203, 37.575355529785156, 62.97486877441406, 31.0909423828125, 11.17279052734375, 55.96759033203125, 42.48400115966797, 27.260589599609375, 76.971923828125, -5.3440704345703125, 2.7801513671875, -32.31378173828125, -17.726669311523438, 75.94453430175781, 2.7809715270996094, -28.588714599609375, 10.585845947265625, -0.06029319763183594, 2.38470458984375, -10.033416748046875, -25.580459594726562, 16.303802490234375, -31.5833740234375, 52.50767135620117, 47.098480224609375, 
-2.4459228515625, 26.158050537109375, 49.67436218261719, 94.65492248535156, -37.090511322021484, 13.868858337402344, 30.642364501953125, 61.50836181640625, 54.693359375, 2.3278675079345703, 70.29293823242188, 24.752960205078125, 21.336898803710938, 48.7305908203125, -24.584716796875, 14.169021606445312, 23.46160888671875, 17.396240234375, 24.880035400390625, 60.81243896484375, 22.533058166503906, 1.7670059204101562, 19.30023193359375, -20.428970336914062, 60.742584228515625, 57.210418701171875, 53.93328857421875, 16.829986572265625, 25.87713623046875, 5.799896240234375, 22.269393920898438, 12.520950317382812, 95.71255493164062, 50.81885528564453, -10.199689865112305, 1.0955677032470703, -1.0033721923828125, 71.96356201171875, 27.404815673828125, -8.0699462890625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000171.npy"}
|
||||
{"epoch": 0.3581151832460733, "step": 172, "batch_size": 128, "mean": 16.30849266052246, "std": 34.938011169433594, "min": -87.54302978515625, "p10": -17.00957336425781, "median": 14.393241882324219, "p90": 64.19376525878906, "max": 87.060791015625, "pos_frac": 0.6796875, "sample": [16.611831665039062, -11.926742553710938, 2.1303558349609375, 23.500244140625, 35.67897033691406, 14.747947692871094, -2.0078125, -9.747291564941406, 5.8185577392578125, -0.2064208984375, -3.5985565185546875, -1.5771121978759766, -64.82711791992188, 6.175742149353027, 46.21507263183594, 12.699264526367188, 33.98590087890625, 75.50064086914062, 36.132293701171875, 11.165321350097656, -6.57769775390625, 7.59075927734375, 25.25634765625, 77.62747192382812, 4.8770751953125, 44.20977783203125, 40.27494812011719, -8.8314208984375, -12.928192138671875, 61.2027587890625, 5.67010498046875, 26.407196044921875, -69.200927734375, 7.1385345458984375, 17.18670654296875, 48.99052429199219, -17.325286865234375, 75.23358154296875, -19.207168579101562, -5.032295227050781, -6.1382293701171875, 15.617431640625, -4.18658447265625, -1.2710533142089844, 76.27043151855469, -87.54302978515625, -6.50341796875, 37.324737548828125, 74.85943603515625, 55.42938232421875, -3.6895751953125, -25.658477783203125, 20.453948974609375, -0.346160888671875, 32.5234375, 14.268936157226562, 61.73480224609375, 68.20744323730469, -16.874267578125, -4.5898590087890625, -6.72723388671875, 19.2960205078125, 23.851669311523438, -2.4189300537109375, 60.2825927734375, 45.92010498046875, -1.490234375, 7.6677703857421875, 20.804489135742188, 8.5238037109375, 87.060791015625, -62.2574462890625, 2.8844833374023438, 40.470611572265625, 72.27825927734375, 57.802490234375, 2.870147705078125, 1.3092727661132812, 20.6856689453125, -70.0069580078125, 28.125640869140625, 13.804679870605469, 21.95987319946289, -59.226348876953125, 16.374744415283203, 50.33599853515625, 6.527427673339844, 65.73294067382812, 14.015472412109375, -46.504150390625, 
74.76556396484375, -5.407716751098633, 64.78082275390625, -36.7215576171875, -33.90155029296875, 22.474777221679688, 27.1810302734375, 17.66900634765625, -16.502939224243164, 14.517547607421875, -4.931427001953125, 86.8060302734375, 54.130889892578125, 44.517120361328125, 24.438247680664062, 26.49359130859375, -7.63934326171875, 20.5904541015625, 37.76884841918945, 8.27227783203125, 15.65960693359375, 53.157379150390625, 37.10184860229492, 63.942169189453125, 59.53668212890625, 41.1912841796875, -7.242584228515625, 9.47625732421875, 7.541784286499023, 51.46160888671875, -76.30804443359375, 67.56218719482422, 1.4697723388671875, 10.312088012695312, 0.0, 38.5302734375, 35.50331115722656, -9.58697509765625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000172.npy"}
|
||||
{"epoch": 0.36020942408376966, "step": 173, "batch_size": 128, "mean": 20.524009704589844, "std": 34.069915771484375, "min": -69.87393188476562, "p10": -18.758273315429683, "median": 18.394508361816406, "p90": 63.20142517089843, "max": 121.10137939453125, "pos_frac": 0.7109375, "sample": [-2.192535400390625, -69.87393188476562, -47.518959045410156, 20.377456665039062, 27.26348876953125, -13.197921752929688, 50.7613525390625, -40.693206787109375, 48.1419677734375, 113.10260009765625, -11.317005157470703, 26.015350341796875, -17.1275634765625, 39.57598876953125, 50.202301025390625, 62.63426208496094, 12.7952880859375, 4.3448638916015625, 21.56927490234375, 80.49172973632812, 65.98387145996094, 14.6781005859375, 7.176544189453125, 36.76544189453125, 22.329444885253906, 14.711669921875, 0.3060302734375, 8.277130126953125, 39.465728759765625, 52.65202331542969, -3.53216552734375, 47.86680603027344, -3.1363525390625, 6.063564300537109, 7.5125885009765625, 34.34893798828125, -36.6475830078125, 26.44158935546875, 23.386707305908203, 39.352081298828125, -9.567169189453125, 25.6512451171875, 38.732940673828125, 43.41585922241211, 55.10614013671875, 35.7735595703125, -4.044769287109375, -8.516159057617188, 3.7487564086914062, 10.70245361328125, -12.477470397949219, -42.233802795410156, -28.989334106445312, 121.10137939453125, 15.243438720703125, -27.432891845703125, 10.133441925048828, -22.563262939453125, 29.076148986816406, 98.71292114257812, 60.30390930175781, 64.25439453125, 2.6659812927246094, 56.01580810546875, 60.199676513671875, 67.99057006835938, -50.033416748046875, 34.063262939453125, 18.880081176757812, 19.905288696289062, 27.758991241455078, 2.1802749633789062, 70.68365478515625, 49.24360656738281, 3.5843963623046875, 47.521484375, 16.891876220703125, 70.63607788085938, -3.182231903076172, 17.10199737548828, 27.55548095703125, -28.562408447265625, 3.7191009521484375, 4.67523193359375, 55.2109375, 6.142974853515625, 36.630409240722656, 64.55909729003906, 
-3.30224609375, 8.28179931640625, 21.308135986328125, -4.973419189453125, 53.051483154296875, 27.800445556640625, 68.45352172851562, 27.7764892578125, 60.770294189453125, 34.56915283203125, 45.248443603515625, 56.586090087890625, 72.12626647949219, 15.6064453125, -2.9041271209716797, -12.451705932617188, -4.196430206298828, 61.712928771972656, -31.979461669921875, 53.41676330566406, 39.339378356933594, 17.908935546875, -5.8474884033203125, 2.99090576171875, -13.41192626953125, 7.972991943359375, 0.1856670379638672, -6.94891357421875, -5.244110107421875, -35.63079833984375, 80.41372680664062, -4.505928039550781, 62.750152587890625, -9.06158447265625, 0.0, 22.93590545654297, 38.912841796875, -36.430908203125, -5.9847869873046875, 32.30149841308594], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000173.npy"}
|
||||
{"epoch": 0.362303664921466, "step": 174, "batch_size": 128, "mean": 25.719093322753906, "std": 33.74830627441406, "min": -53.034820556640625, "p10": -14.5519775390625, "median": 20.638534545898438, "p90": 75.03247680664063, "max": 116.14385986328125, "pos_frac": 0.7734375, "sample": [54.525428771972656, 1.6988449096679688, 6.796173095703125, 1.689056396484375, 89.3946533203125, 62.940948486328125, 75.742431640625, -42.747955322265625, 8.868682861328125, -0.48857879638671875, 69.70547485351562, 53.83118438720703, 0.885711669921875, 50.898983001708984, 13.96343994140625, 30.60009765625, 68.79315185546875, 100.51473999023438, 9.877716064453125, 15.010284423828125, -8.224746704101562, 74.9854736328125, 1.8987884521484375, 82.06216430664062, 78.23046875, -5.4232177734375, 46.5758056640625, 9.240005493164062, 79.64185333251953, 50.30158996582031, -4.4138946533203125, 53.490142822265625, 116.14385986328125, -31.470123291015625, -27.36565399169922, -12.235671997070312, 60.882904052734375, 75.14215087890625, 89.58953857421875, 3.3675537109375, 16.90993309020996, 35.8741455078125, 40.003150939941406, 30.22284698486328, -2.289459228515625, 76.1553955078125, 50.366485595703125, 40.450927734375, -6.766448974609375, 106.28799438476562, -9.7611083984375, -18.153106689453125, 58.92455291748047, 39.1337890625, -13.285194396972656, 4.15087890625, 11.02804946899414, -1.847137451171875, 23.430419921875, 20.51641845703125, 22.28680419921875, 5.4168243408203125, 75.49609375, 63.50575256347656, 27.781463623046875, -17.785598754882812, -6.8971405029296875, 43.218292236328125, 11.274406433105469, 35.92842102050781, 24.68655014038086, 50.41047668457031, 61.14324188232422, 57.74371337890625, 47.17540740966797, 27.90692901611328, 14.450897216796875, 15.625946044921875, 58.674896240234375, -14.4521484375, -37.67173767089844, 45.74269104003906, 0.6978912353515625, 40.1951904296875, 20.760650634765625, 2.7854347229003906, 66.391357421875, 8.017782211303711, -20.077239990234375, 
12.87057876586914, 9.34002685546875, 17.24947738647461, 31.465667724609375, 39.49287414550781, 46.001373291015625, 49.20953369140625, 3.441497802734375, 85.12472534179688, -0.6227874755859375, 19.05621337890625, 56.59367370605469, -17.744964599609375, 46.0301513671875, 22.486083984375, 9.63824462890625, 14.097213745117188, -18.637428283691406, 0.8230781555175781, -1.966064453125, -27.44586181640625, 45.384063720703125, -5.044464111328125, 3.1638031005859375, 35.587684631347656, 32.574249267578125, -18.75439453125, 67.80767822265625, -14.784912109375, 7.89862060546875, -53.034820556640625, 7.77764892578125, 15.866363525390625, 47.19451904296875, 22.58087158203125, 49.639923095703125, 0.9166946411132812, 20.86822509765625, -6.8066558837890625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000174.npy"}
|
||||
{"epoch": 0.3643979057591623, "step": 175, "batch_size": 128, "mean": 20.460651397705078, "std": 36.655029296875, "min": -97.91098022460938, "p10": -23.188264465332026, "median": 19.86650848388672, "p90": 65.88800506591795, "max": 121.4405517578125, "pos_frac": 0.7734375, "sample": [57.3265380859375, 26.651519775390625, 41.285125732421875, -2.393636703491211, 1.0105552673339844, -43.738006591796875, 24.349090576171875, 7.827919006347656, 91.27581787109375, -58.9864501953125, 20.403717041015625, 8.225624084472656, 23.785171508789062, 25.282737731933594, 49.75332260131836, 121.4405517578125, 7.540632247924805, 54.4423828125, 23.315780639648438, 72.99986267089844, -32.15338134765625, 58.68400573730469, 59.56254577636719, 47.855865478515625, 35.16758728027344, -7.6334228515625, -2.7993392944335938, -35.71966552734375, 19.935409545898438, -42.216651916503906, -5.1817169189453125, 1.4002304077148438, 23.0499267578125, 53.614501953125, -4.787933349609375, 55.13652038574219, 9.19598388671875, 2.072479248046875, 75.94259643554688, 41.78143310546875, 4.384124755859375, 10.449859619140625, 13.91510009765625, 19.797607421875, 9.431884765625, 69.58540344238281, 53.932586669921875, 52.886474609375, -15.34466552734375, 83.66769409179688, 20.631927490234375, 14.772472381591797, 1.1636962890625, 10.2283935546875, -17.40185546875, -32.4678955078125, 18.5203857421875, -3.6350021362304688, -0.37530517578125, 73.56539916992188, 31.01736831665039, 69.91091918945312, 8.8809814453125, 39.05023956298828, 51.815399169921875, 8.033401489257812, 2.725189208984375, 4.355688095092773, 0.30255126953125, 7.556854248046875, 60.68341064453125, -3.1044235229492188, 21.997718811035156, 58.677490234375, 49.94440460205078, -30.467987060546875, 1.54315185546875, 64.30340576171875, 0.0, -97.91098022460938, 23.150711059570312, 46.01031494140625, 80.18600463867188, 5.74078369140625, 22.16339111328125, -2.13134765625, 24.411712646484375, -16.9310302734375, -35.71661376953125, 84.24099731445312, 
43.525177001953125, 4.0647430419921875, 46.576690673828125, 58.719329833984375, 45.96311950683594, 30.201812744140625, -22.08154296875, 13.501983642578125, -25.770614624023438, 6.99908447265625, 51.49302673339844, 58.04130935668945, 10.113166809082031, 37.67950439453125, 2.76507568359375, -77.03411865234375, 19.45086669921875, 26.969955444335938, -8.346710205078125, 25.09283447265625, 28.554882049560547, 18.146682739257812, 26.577133178710938, 33.479034423828125, 78.46514892578125, 7.5565185546875, 45.375465393066406, 85.59130859375, 57.10858154296875, 0.934722900390625, 17.88104248046875, 22.21917724609375, 22.242660522460938, -72.32525634765625, -67.62771606445312, 0.0, 4.982475280761719, 89.02365112304688], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000175.npy"}
|
||||
{"epoch": 0.36649214659685864, "step": 176, "batch_size": 128, "mean": 16.19698143005371, "std": 38.34989929199219, "min": -115.68109130859375, "p10": -29.049730682373045, "median": 13.502315521240234, "p90": 62.68857803344726, "max": 106.90536499023438, "pos_frac": 0.734375, "sample": [-32.08409881591797, 13.491935729980469, 1.69439697265625, -29.318740844726562, 19.47686767578125, 53.20233917236328, -42.27937316894531, -54.92437744140625, 9.037223815917969, 17.87884521484375, -80.5877685546875, 5.5205841064453125, 23.328426361083984, 77.81423950195312, 9.862930297851562, 52.956329345703125, 47.166656494140625, 5.584651947021484, 2.964996337890625, -8.906082153320312, -40.554443359375, 7.2939453125, 15.870019912719727, -70.97772216796875, 21.229507446289062, 74.41351318359375, 31.545143127441406, 99.4617919921875, 47.3431396484375, 1.3788604736328125, 94.66218566894531, -32.290679931640625, -3.7066268920898438, 19.912765502929688, 16.355854034423828, 0.079345703125, -11.77142333984375, 23.672714233398438, 15.881240844726562, 9.256916046142578, -49.796836853027344, 13.5126953125, 48.640045166015625, -3.140350341796875, 7.775108337402344, 61.76055145263672, 6.699069976806641, 2.992889404296875, 31.50238037109375, 82.7862548828125, -28.93444061279297, 53.4471549987793, 48.753997802734375, -21.883819580078125, 15.172607421875, 53.153564453125, 2.4838409423828125, 47.39839172363281, 30.367950439453125, 4.786418914794922, 3.7801513671875, 1.39276123046875, -5.87396240234375, -19.32757568359375, 45.503028869628906, 61.137237548828125, 37.627105712890625, -15.594482421875, 70.33544921875, 4.06134033203125, -21.57489776611328, 75.80543518066406, 22.437225341796875, 21.9691162109375, 106.90536499023438, 69.7396240234375, 1.646066665649414, -15.931488037109375, 16.6627197265625, 27.032241821289062, 24.060867309570312, 44.111572265625, 54.18949508666992, 46.86817169189453, 44.8985595703125, 57.74654006958008, -4.5078887939453125, 16.792343139648438, 2.02392578125, 
82.34619140625, 21.410049438476562, -2.3211517333984375, -32.63487243652344, 28.055450439453125, -15.54937744140625, 7.936252593994141, 46.194915771484375, -71.08157348632812, -25.09661865234375, 49.120635986328125, -27.4984130859375, 52.71440124511719, 4.010185241699219, -13.84649658203125, 7.7664031982421875, -68.59384155273438, 30.639923095703125, 12.561676025390625, -17.812286376953125, 6.150363922119141, 59.82231140136719, -7.863410949707031, 83.93710327148438, 58.147125244140625, 85.08779907226562, 31.95116424560547, 4.76007080078125, 28.722816467285156, -9.703216552734375, -115.68109130859375, 64.85397338867188, 6.682159423828125, 0.24004554748535156, 32.212684631347656, 37.50860595703125, 9.461669921875, 39.597511291503906, -3.3251724243164062], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000176.npy"}
|
||||
{"epoch": 0.36858638743455496, "step": 177, "batch_size": 128, "mean": 14.613381385803223, "std": 42.227508544921875, "min": -110.7755126953125, "p10": -36.59889755249023, "median": 10.801187515258789, "p90": 70.1549301147461, "max": 107.5994873046875, "pos_frac": 0.65625, "sample": [51.29322814941406, -1.4648761749267578, 2.4183731079101562, 84.82716369628906, -28.067276000976562, 70.09585571289062, 10.486469268798828, -24.51593017578125, 39.59130859375, 6.57232666015625, 67.31100463867188, -47.10485076904297, -92.5374755859375, -14.984451293945312, 3.83740234375, -31.71234130859375, 45.64316177368164, 4.45953369140625, 49.70628356933594, 9.09332275390625, 49.401397705078125, 23.93096923828125, 59.32768249511719, 57.9482421875, -30.697906494140625, 26.580108642578125, -29.899932861328125, 84.91824340820312, 52.178131103515625, 97.1650390625, 7.6850128173828125, 28.695091247558594, -76.597900390625, 92.85488891601562, 6.7201080322265625, 26.867843627929688, 74.43948364257812, 3.083070755004883, -9.904905319213867, 5.85986328125, 50.05181884765625, -24.16845703125, -3.5342159271240234, 2.4152984619140625, -11.4775390625, -34.704078674316406, 4.837902069091797, 14.912002563476562, 50.764381408691406, 53.29144287109375, -11.925323486328125, -4.575103759765625, 90.81820678710938, 26.107467651367188, -3.057159423828125, 62.4495849609375, 23.2279052734375, -4.795135498046875, 80.40867614746094, 7.267578125, -43.2149658203125, 79.96173095703125, -0.224853515625, -41.0201416015625, 74.12396240234375, -4.2423858642578125, -47.152740478515625, -10.049001693725586, 6.605129241943359, 66.67507934570312, -14.75830078125, 95.9376220703125, -7.635772705078125, 7.266353607177734, -5.53924560546875, 53.00703430175781, 50.959991455078125, -87.59738159179688, 22.296966552734375, 11.11590576171875, 21.687255859375, -61.166046142578125, -48.631858825683594, 31.212982177734375, -72.48941040039062, 11.97314453125, 35.815589904785156, 1.57489013671875, -27.561431884765625, 
12.451019287109375, 17.79058837890625, 66.2415771484375, 28.336524963378906, 48.41914367675781, -10.8262939453125, -18.8663330078125, 12.966888427734375, 11.22265625, 59.842315673828125, 5.769996643066406, -23.411239624023438, -4.72308349609375, 14.823516845703125, 44.08282470703125, 10.410404205322266, -79.57974243164062, 7.3640899658203125, -1.0141143798828125, -49.67010498046875, -110.7755126953125, 28.718246459960938, 48.64598083496094, 70.29277038574219, 26.327850341796875, 17.829193115234375, 73.7030029296875, 37.4268798828125, 40.349365234375, 107.5994873046875, 11.533523559570312, 33.560455322265625, -0.5183181762695312, -2.799896240234375, -6.3025360107421875, 19.495071411132812, 5.4210357666015625, 34.93499755859375, 60.692543029785156], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000177.npy"}
|
||||
{"epoch": 0.3706806282722513, "step": 178, "batch_size": 128, "mean": 24.778339385986328, "std": 39.77253723144531, "min": -96.39889526367188, "p10": -19.031663894653317, "median": 21.15249252319336, "p90": 78.65997619628907, "max": 145.8251953125, "pos_frac": 0.71875, "sample": [61.53472900390625, 17.334104537963867, 6.986328125, -9.179840087890625, -7.49871826171875, 7.311956405639648, 43.101715087890625, 61.170921325683594, 36.990272521972656, 40.947181701660156, 145.8251953125, 62.20939636230469, 13.991279602050781, 37.40504455566406, -13.81494140625, 86.25399780273438, 0.0, 62.09663391113281, 59.76008605957031, 28.25763702392578, 13.074493408203125, 92.02822875976562, 7.05152702331543, 78.513427734375, 52.694671630859375, -13.859695434570312, -8.41693115234375, 100.17996215820312, -23.262069702148438, 62.863739013671875, 6.712516784667969, -6.2609100341796875, -0.31658935546875, 32.20671844482422, -65.86750793457031, 13.2161865234375, 41.525634765625, 70.21501159667969, -30.33721923828125, 57.441925048828125, 78.71630859375, 29.809844970703125, 50.872283935546875, 48.395751953125, 78.63583374023438, 59.53094482421875, 11.664154052734375, -49.19975280761719, -8.955902099609375, 92.08818054199219, 61.22230529785156, 90.48007202148438, -57.51066589355469, -96.39889526367188, -7.0748748779296875, 10.710067749023438, 15.64434814453125, 36.397300720214844, -35.91255187988281, 24.80303955078125, 0.6555385589599609, 98.92825317382812, 10.425750732421875, 43.796478271484375, 65.03228759765625, 21.51555633544922, 27.94921875, 34.4041748046875, 35.6051025390625, 2.3775558471679688, 94.51266479492188, 26.32097625732422, -13.388423919677734, -3.2861328125, 12.921438217163086, 84.40576171875, -28.06304931640625, 11.879135131835938, -8.88510513305664, 2.2432861328125, 42.53759765625, 16.54022216796875, 57.77020263671875, -28.092254638671875, -35.96588134765625, -5.678005218505859, 71.5396728515625, -7.3604583740234375, 35.77852249145508, 31.423828125, 15.118621826171875, 
5.794588088989258, -0.1285400390625, 19.167892456054688, 26.455108642578125, 98.08389282226562, 14.819442749023438, 10.247856140136719, 0.47979736328125, 68.24334716796875, 9.614189147949219, -15.172607421875, -30.355430603027344, 96.734619140625, 21.14287567138672, -17.631210327148438, 2.956695556640625, 66.409423828125, 70.5347900390625, -14.692398071289062, -4.257564544677734, 98.29296875, -39.17585754394531, -2.792266845703125, 55.084716796875, 21.162109375, -22.299388885498047, 22.847740173339844, 46.698646545410156, 4.58282470703125, 21.41900634765625, 31.484893798828125, 25.0972900390625, -5.79669189453125, -7.175018310546875, 29.602996826171875, 55.98739242553711, 41.19261169433594], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000178.npy"}
|
||||
{"epoch": 0.37277486910994767, "step": 179, "batch_size": 128, "mean": 19.510334014892578, "std": 37.894493103027344, "min": -67.929931640625, "p10": -30.858518218994135, "median": 15.742584228515625, "p90": 71.08028564453124, "max": 102.07501220703125, "pos_frac": 0.6953125, "sample": [45.84812927246094, 4.263980865478516, 62.908447265625, 8.64471435546875, 14.2777099609375, 67.02322387695312, 8.151893615722656, 10.533744812011719, -59.5245361328125, 1.556549072265625, 15.7908935546875, -1.1467742919921875, 24.7303466796875, 1.6032562255859375, 71.38803100585938, -17.648483276367188, 45.35223388671875, 30.52386474609375, 5.41131591796875, -4.823230743408203, 26.193603515625, 44.73573303222656, 3.4071121215820312, 46.15289306640625, 22.01678466796875, -52.00497817993164, -42.08647155761719, -7.807149887084961, -15.535659790039062, -54.440216064453125, 52.81159210205078, 12.472770690917969, -48.209877014160156, 28.651885986328125, -17.994171142578125, 40.58665466308594, 38.884857177734375, 76.60716247558594, 22.5028076171875, -5.807891845703125, -13.129180908203125, -45.830810546875, -2.418792724609375, 28.73822021484375, -19.551116943359375, 53.70860290527344, 5.4926605224609375, -11.458047866821289, 68.70611572265625, -40.173675537109375, -28.982101440429688, -28.880752563476562, 34.53510284423828, 16.01299285888672, 9.994865417480469, 80.46127319335938, 72.42646026611328, 64.35786437988281, 9.4154052734375, -4.441162109375, -67.929931640625, -12.235061645507812, 95.263427734375, 11.089752197265625, -16.328369140625, 36.671051025390625, 70.94839477539062, -8.838241577148438, -0.02215576171875, 14.23663330078125, 23.57546615600586, 42.56011962890625, 56.915374755859375, 34.340484619140625, 1.001220703125, 9.670312881469727, 80.34243774414062, 11.629531860351562, 17.36614990234375, 59.52861785888672, 14.648130416870117, 8.121994018554688, 91.4921875, -19.676010131835938, 13.79949951171875, -41.48773193359375, 82.93485260009766, 70.03900146484375, 0.6545257568359375, 
-41.63026428222656, 50.96027374267578, 17.98571014404297, 80.9427490234375, 40.70709228515625, 34.925445556640625, 24.570098876953125, 48.19474792480469, 41.63645935058594, 15.69427490234375, 47.65019226074219, 86.76995849609375, 27.961593627929688, 67.03851318359375, 18.997108459472656, -40.55084228515625, 57.730621337890625, -54.452301025390625, 3.535919189453125, 40.49224853515625, 65.15972137451172, 27.8719482421875, 42.138427734375, -35.23682403564453, 102.07501220703125, -16.2611083984375, -18.96282958984375, 44.817893981933594, 52.46549987792969, -1.63934326171875, -11.7706298828125, 81.6456298828125, 52.69731140136719, -3.40740966796875, 14.887462615966797, -12.40460205078125, -19.38104248046875, 73.7147216796875, 54.45478820800781], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000179.npy"}
|
||||
{"epoch": 0.374869109947644, "step": 180, "batch_size": 128, "mean": 24.790103912353516, "std": 36.76823425292969, "min": -63.5194091796875, "p10": -17.461449432373044, "median": 18.75881004333496, "p90": 77.74557189941406, "max": 108.05087280273438, "pos_frac": 0.7421875, "sample": [-0.7587318420410156, -4.180080413818359, -25.045120239257812, -11.5623779296875, 1.835845947265625, 19.782989501953125, 68.027099609375, 8.86212158203125, -24.653472900390625, 67.8839111328125, -3.177886962890625, 55.1942138671875, 100.88461303710938, -59.00129699707031, 7.9731903076171875, 68.21194458007812, 13.960540771484375, 40.595611572265625, 10.042640686035156, -63.5194091796875, 59.728790283203125, 108.05087280273438, 38.97731018066406, 38.104248046875, 24.673492431640625, 51.82135009765625, 18.946258544921875, 67.06336975097656, 99.22201538085938, -8.879999160766602, -33.46235656738281, 3.4531402587890625, 77.76058959960938, 102.98822021484375, 77.7391357421875, 31.606979370117188, 15.67803955078125, 14.499313354492188, 7.468040466308594, 23.842071533203125, 69.12076568603516, -14.9644775390625, 47.717315673828125, 35.112579345703125, 91.65802001953125, 22.8040771484375, -32.084716796875, -2.3046951293945312, 10.1239013671875, 1.0699920654296875, -37.93721008300781, -5.2283935546875, 5.120147705078125, 29.892120361328125, 85.36260986328125, 1.53125, 18.571361541748047, 54.42052459716797, 13.359405517578125, 9.382698059082031, 87.7471923828125, 44.6383056640625, -3.220672607421875, 94.29537963867188, 50.81761932373047, -15.22442626953125, 43.69500732421875, 34.3863525390625, -43.007598876953125, 32.87312316894531, 26.77203369140625, 3.5748291015625, 37.23426055908203, 90.21755981445312, 74.24365234375, 73.0965576171875, 0.23537445068359375, 0.140045166015625, 77.3199462890625, 45.0543212890625, 100.59982299804688, 10.900009155273438, -15.143310546875, 49.07904052734375, 57.209564208984375, 48.11480712890625, 44.66900634765625, -18.92058563232422, 3.1656494140625, 
23.429351806640625, -1.880218505859375, 14.11187744140625, 10.943899154663086, 31.62835693359375, 54.020172119140625, 14.227188110351562, 28.463836669921875, 29.70166015625, -1.1075592041015625, -2.1645965576171875, 4.9089202880859375, 24.674896240234375, 68.01840209960938, -0.8670806884765625, 5.032514572143555, -21.756927490234375, 5.396488189697266, -20.036407470703125, 79.25173950195312, 9.340255737304688, -16.123809814453125, 34.869422912597656, -6.395328521728516, -8.89068603515625, 52.79931640625, 32.751708984375, 36.22712707519531, -8.8839111328125, -19.430038452148438, 53.99102783203125, 0.4659900665283203, 43.9619140625, 3.3767776489257812, 4.926593780517578, -22.045211791992188, 93.0072021484375, -16.836105346679688, 32.0970458984375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000180.npy"}
{"epoch": 0.3769633507853403, "step": 181, "batch_size": 128, "mean": 22.817195892333984, "std": 41.498291015625, "min": -80.3817138671875, "p10": -30.40199279785156, "median": 22.702041625976562, "p90": 75.92751159667968, "max": 131.147705078125, "pos_frac": 0.7265625, "sample": [0.8871479034423828, 7.675018310546875, 2.0920028686523438, 44.70355224609375, 0.0, 26.939308166503906, 79.65780639648438, 42.7645263671875, 9.736259460449219, 28.50574493408203, 64.4984130859375, -11.643802642822266, 52.03953552246094, -8.40826416015625, 91.56005859375, 42.553497314453125, 0.0, 74.18550109863281, 31.66476058959961, -14.936019897460938, 35.65669250488281, 0.8172607421875, 80.5107421875, 24.8460693359375, -73.94798278808594, 95.60975646972656, -43.351837158203125, 39.108882904052734, 8.143836975097656, 29.756439208984375, 62.61586380004883, 2.2358779907226562, 14.568656921386719, 63.23516845703125, 25.20135498046875, 49.64044189453125, 40.94691467285156, 20.558013916015625, 28.108978271484375, 38.424102783203125, 17.82122802734375, 63.5113525390625, -24.57183837890625, -5.881860733032227, 11.95013427734375, 14.4744873046875, 10.425689697265625, 16.223388671875, 96.53047943115234, -23.278907775878906, 32.8516845703125, -8.468650817871094, 56.206024169921875, 63.04095458984375, -48.397247314453125, -63.798187255859375, -2.9779319763183594, 16.0145263671875, -2.8160400390625, -10.086578369140625, 35.83489990234375, -9.393333435058594, -25.6505126953125, 79.53109741210938, 48.20103454589844, 45.045780181884766, 6.4144287109375, 66.41697692871094, 14.37432861328125, 53.64361572265625, 18.551116943359375, 25.73099136352539, 12.002250671386719, 85.78233337402344, 113.3653564453125, 131.147705078125, -76.55535888671875, 13.144538879394531, -59.83294677734375, 69.82583618164062, -35.38883972167969, 29.6781005859375, -6.362581253051758, -61.3062744140625, -12.12744140625, 62.62611389160156, 39.702362060546875, -12.037925720214844, 20.118896484375, 30.127395629882812, 
-22.554244995117188, 33.483306884765625, 74.38311767578125, -50.29595947265625, 13.910812377929688, 48.732276916503906, 9.971923828125, 46.19609832763672, 30.382171630859375, 88.50732421875, 80.366455078125, -18.384857177734375, 46.46795654296875, 13.526824951171875, -29.814544677734375, -14.123207092285156, 11.941009521484375, 89.16741943359375, -13.49237060546875, -80.3817138671875, 58.786468505859375, 32.49371337890625, 71.01933288574219, -31.772705078125, 55.487518310546875, 82.99745178222656, -38.98930358886719, 14.16741943359375, 13.930252075195312, 1.808746337890625, 8.948974609375, 28.789352416992188, 62.496856689453125, -47.218299865722656, 64.87999725341797, 56.676422119140625, 46.10405731201172, 63.46209716796875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000181.npy"}
{"epoch": 0.37905759162303665, "step": 182, "batch_size": 128, "mean": 29.262874603271484, "std": 34.7807502746582, "min": -59.994140625, "p10": -13.035675048828121, "median": 28.56451416015625, "p90": 73.04407958984375, "max": 108.06643676757812, "pos_frac": 0.8203125, "sample": [41.255035400390625, 64.18861389160156, 14.250885009765625, -2.100770950317383, 4.4037017822265625, 34.74650573730469, 70.59675598144531, 57.646240234375, 23.053939819335938, 6.5624847412109375, 62.87548828125, 29.17596435546875, 27.3431396484375, 50.9644775390625, 62.98095703125, 4.158782958984375, 39.63825988769531, 53.34539794921875, 54.634185791015625, 47.374305725097656, 41.48796844482422, 17.16818618774414, 50.177154541015625, 83.61209106445312, -15.288543701171875, 76.37673950195312, -6.885993957519531, 52.902801513671875, 97.35922241210938, 60.555419921875, 53.58153533935547, 44.00645446777344, 55.23811340332031, 61.82635498046875, 42.77074432373047, -8.418781280517578, -9.533920288085938, 4.47625732421875, 0.6069183349609375, 14.19696044921875, -56.14166259765625, 24.977035522460938, -59.994140625, 47.279869079589844, 36.34381103515625, -12.070159912109375, 77.73028564453125, -28.622161865234375, 65.05513000488281, 108.06643676757812, 39.099212646484375, 60.1533203125, 69.34619140625, 76.04232788085938, 76.06521606445312, -6.166351318359375, 26.0631046295166, 7.518516540527344, 6.69549560546875, 42.78045654296875, 39.05450439453125, 72.1759033203125, 22.50238037109375, 18.338119506835938, 65.57075500488281, 32.22389221191406, 30.31725311279297, 68.2310791015625, 97.97346496582031, 39.38693618774414, -28.018798828125, 9.44534683227539, 27.95306396484375, 14.654281616210938, 0.396575927734375, 4.314605712890625, 23.89385986328125, 49.439788818359375, 6.7837066650390625, 12.1937255859375, 8.895675659179688, -2.7032928466796875, 4.940643310546875, 56.04295349121094, -28.988494873046875, -0.8078155517578125, 13.371566772460938, 8.887176513671875, 9.38818359375, -16.373458862304688, 
84.42422485351562, 54.74853515625, 81.83566284179688, 73.75540161132812, -36.03782653808594, -10.68437385559082, 80.47274780273438, 31.69739532470703, 50.02685546875, 24.925277709960938, -33.07701110839844, -2.68707275390625, 25.10144805908203, 21.209239959716797, 5.88763427734375, 49.753482818603516, 23.502685546875, 32.11592102050781, 82.64764404296875, 7.60821533203125, 0.614501953125, 59.27430725097656, 71.158203125, 2.147144317626953, 2.04852294921875, 6.07769775390625, 35.91685485839844, 65.11332702636719, 6.364509582519531, -27.593673706054688, -20.19366455078125, 64.39138793945312, 65.9886474609375, 0.43017578125, 72.73922729492188, -29.135879516601562, 45.53314208984375, -45.46809387207031], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000182.npy"}
{"epoch": 0.381151832460733, "step": 183, "batch_size": 128, "mean": 24.389286041259766, "std": 40.9427604675293, "min": -68.40286254882812, "p10": -21.079071426391597, "median": 18.6749267578125, "p90": 76.13107299804688, "max": 111.3660888671875, "pos_frac": 0.71875, "sample": [-24.83074951171875, 63.250518798828125, -16.651840209960938, 54.350799560546875, -47.162384033203125, 75.7425537109375, -50.46110534667969, 11.364730834960938, 55.30096435546875, 5.1999053955078125, -12.3465576171875, 30.29351806640625, 67.95458984375, 2.20501708984375, 3.47149658203125, -63.256439208984375, -4.08796501159668, 70.95953369140625, 15.743011474609375, -2.532978057861328, 0.641998291015625, 32.60609436035156, 56.7861328125, -16.137771606445312, 13.417396545410156, -16.501617431640625, 70.0738525390625, 36.217559814453125, 10.464706420898438, 1.085784912109375, -7.23602294921875, -66.8978271484375, 17.0599365234375, 34.718231201171875, -1.30999755859375, 46.219818115234375, 18.7423095703125, 71.28518676757812, 30.47052001953125, -8.61309814453125, 95.9451904296875, -9.439811706542969, 45.596946716308594, -24.2135009765625, 68.5794677734375, 17.006797790527344, 60.19879150390625, -11.603424072265625, 82.7906494140625, 35.17734909057617, 0.83416748046875, 76.41049194335938, 60.63395690917969, -49.57220458984375, 0.0, 0.3349609375, -0.0843505859375, 7.71588134765625, -19.73574447631836, 83.32453918457031, 11.675086975097656, 59.452392578125, 1.8011760711669922, 90.77230834960938, 75.30364990234375, -38.113037109375, 35.595977783203125, 75.22885131835938, 56.63450622558594, -17.085693359375, 44.62647247314453, 93.398193359375, 111.3660888671875, 45.276634216308594, -15.453201293945312, 42.27520751953125, -60.11199951171875, 25.1107177734375, -3.985443115234375, 29.182769775390625, 37.013763427734375, 33.22084045410156, 35.668128967285156, 9.152982711791992, -2.1394271850585938, 7.784332275390625, 67.85302734375, 71.49592590332031, 13.90570068359375, 12.150360107421875, 
-13.273658752441406, 18.6075439453125, 5.52520751953125, 3.800579071044922, -34.489105224609375, 59.78446960449219, 80.01750183105469, 85.26519775390625, 66.90939331054688, -0.326202392578125, -68.11074829101562, 0.60186767578125, 2.0171051025390625, 17.58135986328125, 3.531810760498047, 58.606231689453125, 1.6504364013671875, 70.50790405273438, -5.315460205078125, 91.66793823242188, 37.08338928222656, 87.97119140625, 76.01132202148438, 32.94847106933594, 50.210662841796875, -43.748817443847656, 21.584228515625, 46.727874755859375, -68.40286254882812, 108.86282348632812, -10.1529541015625, 59.57593536376953, 71.26608276367188, 35.25335693359375, -8.41143798828125, 25.704490661621094, 48.29063415527344, 79.93853759765625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000183.npy"}
{"epoch": 0.3832460732984293, "step": 184, "batch_size": 128, "mean": 36.472145080566406, "std": 44.751033782958984, "min": -83.778076171875, "p10": -23.52108154296875, "median": 41.18895721435547, "p90": 90.05303344726562, "max": 116.06784057617188, "pos_frac": 0.8046875, "sample": [62.94670104980469, 116.06784057617188, 81.81503295898438, 4.94244384765625, 19.1192626953125, -16.433074951171875, 46.296783447265625, -7.188350677490234, 49.1773681640625, 32.81744384765625, 25.82904052734375, 6.5791015625, -58.16455078125, 41.21159362792969, -38.179534912109375, -0.4075469970703125, 101.16952514648438, 78.56124877929688, 60.00518035888672, 55.550201416015625, 62.032318115234375, 88.967529296875, -83.778076171875, 87.12393188476562, -5.471954345703125, 84.80300903320312, 76.17703247070312, 9.85540771484375, -50.611183166503906, 100.0078125, 41.16632080078125, 94.1636962890625, 57.90779113769531, -50.492088317871094, 23.82904052734375, 83.9215087890625, 5.24725341796875, 86.84931945800781, 62.119476318359375, 102.13702392578125, 28.147308349609375, 97.72662353515625, -38.7205810546875, 42.601444244384766, 37.0689697265625, 105.692626953125, -73.72607421875, 106.63006591796875, -11.7265625, 41.71685791015625, 36.93012619018555, -23.485321044921875, 24.3162841796875, 4.654449462890625, -53.397674560546875, 64.0894775390625, 69.72506713867188, 43.353363037109375, -3.447998046875, 105.91030883789062, 8.77532958984375, 4.508758544921875, 65.41839599609375, -50.40806579589844, 76.7855224609375, 69.57171630859375, 69.99240112304688, 84.38835144042969, 23.68194580078125, 70.18450927734375, 9.287384033203125, 41.77015686035156, 51.2967529296875, 36.20221710205078, 40.72637939453125, 33.757911682128906, 28.51806640625, 41.32037353515625, 36.07066345214844, 85.00155639648438, 110.3096923828125, 83.3004150390625, 3.60101318359375, 77.30545043945312, 26.879844665527344, -47.82884216308594, 7.119476318359375, 82.6724853515625, 4.8521728515625, 11.873260498046875, 45.08172607421875, 
86.43804931640625, 50.15766906738281, 34.64887237548828, 68.06857299804688, -6.509918212890625, -17.28285789489746, 21.219818115234375, 1.5609397888183594, 86.31463623046875, 57.37550354003906, 53.599273681640625, 50.35455322265625, 73.57015991210938, 0.5991668701171875, 3.9565391540527344, 71.48538208007812, 48.87042236328125, -23.604522705078125, 92.58587646484375, 26.95602035522461, -43.1705322265625, 38.764984130859375, 13.868606567382812, 73.2053451538086, -8.908554077148438, 93.29312133789062, 10.444595336914062, -8.144973754882812, 68.6329345703125, 10.3128662109375, -30.838531494140625, 81.08395385742188, 18.27093505859375, 107.08718872070312, 52.170013427734375, -15.947921752929688, 54.199981689453125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000184.npy"}
{"epoch": 0.38534031413612563, "step": 185, "batch_size": 128, "mean": 23.34644317626953, "std": 40.67813491821289, "min": -70.51446533203125, "p10": -25.560336303710937, "median": 19.194580078125, "p90": 78.22069702148436, "max": 113.60055541992188, "pos_frac": 0.703125, "sample": [18.57866668701172, -58.0699462890625, 56.792724609375, -17.01171875, -34.63868713378906, 69.93301391601562, 28.272140502929688, 83.94297790527344, 88.196533203125, -7.186805725097656, 19.63421630859375, 46.96070861816406, 74.84854125976562, 54.88804244995117, 41.98491668701172, 95.49862670898438, 46.257568359375, -14.72076416015625, 9.589057922363281, 54.22216796875, 39.52082824707031, 6.47821044921875, 50.63943099975586, 20.638687133789062, 27.65411376953125, 18.9488525390625, 81.4542236328125, 9.638832092285156, 1.5141468048095703, 87.39639282226562, -70.51446533203125, 57.37403106689453, 2.8147354125976562, -8.607879638671875, 20.948707580566406, 67.72859191894531, 27.24786376953125, 46.69365692138672, 32.799041748046875, 105.96685791015625, 3.592803955078125, 53.289093017578125, -5.9736175537109375, 21.355499267578125, 7.263641357421875, 84.00750732421875, -4.0525665283203125, 31.212942123413086, 30.1265869140625, 63.805419921875, -36.46833801269531, -25.835739135742188, -29.402130126953125, -17.045013427734375, 16.63458251953125, 0.11228370666503906, 12.826553344726562, 0.0, 16.331787109375, 63.754852294921875, -34.82640075683594, -25.442306518554688, 11.671913146972656, 2.1172561645507812, 34.2080078125, 19.7718505859375, 5.295188903808594, 19.4403076171875, 13.23309326171875, -5.266653060913086, -24.068023681640625, 34.93377685546875, 53.89013671875, 75.09025573730469, -17.338253021240234, 3.81134033203125, -5.330657958984375, 15.46783447265625, 26.93327522277832, 69.3011245727539, 113.60055541992188, 8.271469116210938, 11.333648681640625, 0.0, -24.136199951171875, 61.98753356933594, 70.35537719726562, -23.58636474609375, 72.9327392578125, -3.469573974609375, 49.7894287109375, 
50.09564208984375, -18.405296325683594, -5.347930908203125, 70.21615600585938, 76.83489990234375, -49.82899475097656, 69.35502624511719, 14.670257568359375, 17.9129638671875, -20.39128875732422, 45.47157287597656, 54.472015380859375, -37.8851318359375, 94.88485717773438, -55.855682373046875, -12.888740539550781, 15.334701538085938, 106.11907958984375, 111.34503173828125, 61.69189453125, 36.757408142089844, 13.018577575683594, -0.981964111328125, 84.57171630859375, -66.8375244140625, 44.6505126953125, 84.40606689453125, 53.70243835449219, -34.28314208984375, 29.673667907714844, 5.014923095703125, -10.229034423828125, -18.933189392089844, -41.65448760986328, -23.92821502685547, 29.805789947509766, 35.975738525390625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000185.npy"}
{"epoch": 0.387434554973822, "step": 186, "batch_size": 128, "mean": 22.583147048950195, "std": 38.42554473876953, "min": -79.41961669921875, "p10": -17.493110656738278, "median": 20.230140686035156, "p90": 73.9227523803711, "max": 129.57379150390625, "pos_frac": 0.6953125, "sample": [27.930740356445312, -2.7189559936523438, 12.50787353515625, 51.88128662109375, 19.177825927734375, 59.65228271484375, 27.99267578125, -1.9239501953125, 13.77825927734375, 14.605892181396484, -2.3512649536132812, 8.05615234375, 20.833847045898438, 50.22496032714844, -9.8583984375, 71.428955078125, 57.08867645263672, 2.8246498107910156, 71.91708374023438, 6.211151123046875, 10.682037353515625, -11.512840270996094, -29.108688354492188, 3.5040435791015625, 0.6928558349609375, 57.88493347167969, 44.52449035644531, 28.4085693359375, 25.477020263671875, -1.34124755859375, 7.785381317138672, 35.25335693359375, 94.34530639648438, 29.34686279296875, -6.7340087890625, -51.73566436767578, 36.77056884765625, 91.5677490234375, 28.79998779296875, 30.381378173828125, 2.0633773803710938, 67.24652099609375, 62.06785583496094, 49.69426345825195, 59.65938186645508, 30.10546875, -53.6790771484375, 4.973857879638672, 51.97123718261719, 22.646209716796875, 35.711822509765625, 25.314422607421875, 76.76119995117188, -4.936492919921875, -16.873001098632812, -14.538543701171875, 75.52444458007812, 25.9443359375, -18.940032958984375, 5.3172607421875, -9.05758285522461, -10.51507568359375, 83.41455078125, -34.67962646484375, 73.23631286621094, 12.1634521484375, 39.1241455078125, -3.642486572265625, 18.036834716796875, 78.59956359863281, -6.853126525878906, 20.92889404296875, 100.44866943359375, 27.175491333007812, -11.384735107421875, 19.626434326171875, 60.90217590332031, -73.90875244140625, -0.6628856658935547, 82.44482421875, 57.54411315917969, 7.782196044921875, 34.5748291015625, 124.160400390625, 63.84234619140625, -63.41658020019531, 22.47454833984375, -24.56103515625, 5.407745361328125, -49.4700927734375, 
-2.80059814453125, 17.17193603515625, 36.83367156982422, -8.736503601074219, 96.91192626953125, 93.18777465820312, 6.915138244628906, 12.577072143554688, -79.41961669921875, 32.66233444213867, 31.91006088256836, 67.32012176513672, 7.079063415527344, -5.888336181640625, 37.076171875, 51.57586669921875, 24.749282836914062, -0.22674560546875, -3.8507919311523438, 47.859676361083984, 52.274208068847656, -13.774528503417969, 32.63592529296875, 4.178619384765625, 129.57379150390625, 0.0, 77.593505859375, 59.418704986572266, -20.109405517578125, 31.58660888671875, -22.268775939941406, -28.20755386352539, -0.2649688720703125, 1.8368911743164062, 30.653839111328125, -13.520469665527344, 59.51776123046875, -7.41082763671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000186.npy"}
{"epoch": 0.38952879581151834, "step": 187, "batch_size": 128, "mean": 22.822046279907227, "std": 45.66327667236328, "min": -87.58621215820312, "p10": -39.056454467773435, "median": 16.775527954101562, "p90": 87.95274658203125, "max": 117.3154296875, "pos_frac": 0.6953125, "sample": [49.8536376953125, 75.52288818359375, 25.404052734375, 40.385772705078125, -56.830810546875, 8.45074462890625, 95.32797241210938, 0.6684589385986328, -39.37071228027344, 117.3154296875, 79.60125732421875, 1.1805267333984375, -30.15234375, 15.326828002929688, 67.56700134277344, 113.43621826171875, 49.29771423339844, 46.958717346191406, 87.66378784179688, 13.77301025390625, 68.31365966796875, -33.28266906738281, 40.192901611328125, -5.32684326171875, -23.416610717773438, -17.037078857421875, 69.33091735839844, -70.5980224609375, -50.1973876953125, -68.52569580078125, -9.234756469726562, 79.15422058105469, 52.399658203125, 0.540496826171875, -25.535934448242188, 38.754852294921875, -39.165252685546875, -10.8524169921875, 43.83154296875, 40.1279296875, 13.113433837890625, -8.159515380859375, 101.3560791015625, 21.80633544921875, -11.26251220703125, 49.4349365234375, 88.62698364257812, 4.2470703125, -24.164794921875, 45.511474609375, 0.30517578125, 32.57923889160156, 27.294906616210938, 64.77053833007812, -87.58621215820312, -0.46759033203125, 57.43658447265625, -34.541290283203125, 60.04770278930664, 91.66316223144531, 62.709503173828125, -8.332733154296875, 46.0562744140625, -42.52555847167969, 110.44161987304688, 72.89164733886719, -60.58063507080078, -58.43402099609375, 16.061279296875, 18.932830810546875, 89.50469970703125, -47.7990837097168, -5.6775970458984375, -39.00982666015625, -9.349395751953125, -14.154937744140625, 71.1826171875, 18.875404357910156, 79.80844116210938, 14.394805908203125, 2.57421875, 10.726089477539062, -50.24208068847656, 3.009368896484375, 9.016250610351562, -3.9878005981445312, 22.8533935546875, 9.5946044921875, -62.81671142578125, 77.91287231445312, 
56.57536315917969, 74.61038208007812, 73.12712097167969, 22.827392578125, 24.39788055419922, 92.92938232421875, 1.866729736328125, 59.72564697265625, 59.967315673828125, 14.881034851074219, -26.39068603515625, 26.760040283203125, 103.70849609375, 10.322296142578125, -0.17354202270507812, 93.51828002929688, 1.269134521484375, 25.504547119140625, -19.29083251953125, 89.30563354492188, 53.66876220703125, 37.685546875, 7.9398651123046875, 17.489776611328125, 8.832073211669922, 80.06167602539062, 90.24652099609375, 0.9727401733398438, 4.305814743041992, 11.53045654296875, -10.849838256835938, 47.540714263916016, 32.36103057861328, -6.986167907714844, 0.0, 78.47381591796875, -12.7791748046875, 28.785781860351562], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000187.npy"}
{"epoch": 0.39162303664921466, "step": 188, "batch_size": 128, "mean": 24.251953125, "std": 45.01414489746094, "min": -93.94293212890625, "p10": -30.569750976562496, "median": 17.150588989257812, "p90": 84.38539733886718, "max": 109.10565185546875, "pos_frac": 0.71875, "sample": [8.506074905395508, 47.8319091796875, 82.83969116210938, 0.0, 38.14507293701172, 17.1080322265625, 2.77783203125, 90.2708740234375, -5.778598785400391, 52.147132873535156, 56.8109130859375, 2.7132644653320312, 99.68853759765625, -8.906719207763672, 27.147705078125, -1.2596588134765625, 96.4091796875, 57.738983154296875, 15.759689331054688, 109.10565185546875, 74.4801025390625, 63.277069091796875, -18.19549560546875, 1.0378341674804688, -14.21942138671875, -22.8592529296875, -29.237548828125, 16.957138061523438, -26.2923583984375, 63.84556579589844, 76.21176147460938, -11.912216186523438, 88.6756591796875, 47.845237731933594, 0.5281982421875, 27.997055053710938, 83.75204467773438, 64.46981811523438, 68.09564208984375, 9.652984619140625, 18.24017333984375, 37.543701171875, 7.546539306640625, 37.239471435546875, 98.8480224609375, -81.03436279296875, 42.668975830078125, 59.6710205078125, 60.308685302734375, 11.943984985351562, 10.298187255859375, 67.35806274414062, -21.340499877929688, -5.0760345458984375, 39.386260986328125, 87.140869140625, 105.2188720703125, 9.72494125366211, 59.61073303222656, 96.85076904296875, 47.56329345703125, 31.036949157714844, 52.45473098754883, 73.96701049804688, 4.876644134521484, 7.214263916015625, 8.793930053710938, 77.36888122558594, 51.8988037109375, 13.906417846679688, 1.65216064453125, 85.86322021484375, 75.02069091796875, -1.371145248413086, -33.67822265625, 3.4900970458984375, 70.52621459960938, 92.8404541015625, -7.4490203857421875, 5.018943786621094, -64.2509765625, 61.95263671875, 20.04668426513672, 76.62762451171875, -1.0752716064453125, -6.275796890258789, 39.62086486816406, 66.76161193847656, 0.2027587890625, -2.367431640625, -93.94293212890625, 
30.843093872070312, -73.28451538085938, -64.7464599609375, 40.38905334472656, -8.15557861328125, -12.161415100097656, -34.80389404296875, 29.635536193847656, 66.2427749633789, 23.066680908203125, 2.1624984741210938, -53.45286560058594, -92.92807006835938, 49.03558349609375, 13.298851013183594, 41.102195739746094, 72.73218536376953, 89.69229125976562, -10.366161346435547, -57.152618408203125, 1.592294692993164, 53.18812561035156, 47.523223876953125, 82.19009399414062, -15.572021484375, 90.122314453125, -5.036235809326172, 7.20440673828125, -41.856597900390625, -39.194244384765625, 6.883392333984375, 0.0, 17.193145751953125, 7.026458740234375, 37.33933639526367, 5.3220672607421875, -54.43292236328125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000188.npy"}
{"epoch": 0.393717277486911, "step": 189, "batch_size": 128, "mean": 29.149169921875, "std": 37.90156555175781, "min": -71.67486572265625, "p10": -16.078123474121092, "median": 26.491512298583984, "p90": 79.16663055419922, "max": 111.26556396484375, "pos_frac": 0.7890625, "sample": [78.17330932617188, 18.813751220703125, -18.8328857421875, 11.675071716308594, 11.88409423828125, 39.76618957519531, 57.344200134277344, 102.31707763671875, 53.43313217163086, 24.520050048828125, 46.62549591064453, 59.549835205078125, 99.13360595703125, 41.98846435546875, 64.34829711914062, 37.41168212890625, 75.3289794921875, -1.0979156494140625, -9.80975341796875, 6.7794189453125, 51.065399169921875, 87.420166015625, -15.588760375976562, 96.19461059570312, 37.0556640625, 21.765960693359375, 35.898651123046875, 72.1302490234375, 5.29620361328125, -5.669036865234375, -27.12640380859375, 5.993598937988281, 21.92481231689453, 26.954116821289062, 3.5945205688476562, 26.09813690185547, 22.033058166503906, 3.018789291381836, 14.729476928710938, 20.151901245117188, -0.02281951904296875, 85.60650634765625, 69.44412231445312, 31.096466064453125, 70.13720703125, 49.28535461425781, 27.9744873046875, -17.219970703125, 75.97528076171875, -71.67486572265625, -31.17816162109375, 13.9232177734375, -13.70465087890625, 63.756591796875, 3.682098388671875, 26.8848876953125, 32.733489990234375, 4.021026611328125, 12.241390228271484, 14.02130126953125, 77.52642822265625, 51.447906494140625, 75.17947387695312, -4.451202392578125, -18.2677001953125, 18.355113983154297, 8.531713485717773, 3.5890579223632812, 4.6082000732421875, 7.427314758300781, 48.05536651611328, 4.623710632324219, -29.822345733642578, 52.84490966796875, 5.466339111328125, 1.9076423645019531, 32.845916748046875, 6.168773651123047, 3.5140380859375, 18.142547607421875, 9.813003540039062, 70.19363403320312, 30.264892578125, -27.27227783203125, 47.27857208251953, 49.933349609375, 106.99044799804688, -28.099639892578125, -0.6788482666015625, 
-0.483154296875, 11.0467529296875, 85.81292724609375, 28.59381103515625, -57.27197265625, 1.3799285888671875, 0.0, 111.10748291015625, 29.133758544921875, 35.56315612792969, 88.12454223632812, 71.53515625, 36.49714660644531, 78.48451232910156, 33.1387939453125, 25.39244842529297, 0.0, 30.914154052734375, 104.2171630859375, -24.0975341796875, 68.99554443359375, 73.09114074707031, 15.641220092773438, 39.2122802734375, 10.406488418579102, 27.305145263671875, -53.7738037109375, 49.484161376953125, 80.75823974609375, 63.47010803222656, 48.3619384765625, 36.78289794921875, 97.30343627929688, -31.374298095703125, 111.26556396484375, 60.8853759765625, -13.643280029296875, -5.800933837890625, -11.758934020996094], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000189.npy"}
{"epoch": 0.3958115183246073, "step": 190, "batch_size": 128, "mean": 28.408491134643555, "std": 42.93360137939453, "min": -85.286865234375, "p10": -18.288146972656246, "median": 29.490325927734375, "p90": 80.54369964599609, "max": 129.1224365234375, "pos_frac": 0.7578125, "sample": [-32.175682067871094, 15.00384521484375, 99.6708984375, 46.070343017578125, -0.8526611328125, 28.0626163482666, -65.49403381347656, -4.170019149780273, -21.6114501953125, 28.898880004882812, 54.07991027832031, -34.171974182128906, -20.426193237304688, 82.28155517578125, 47.41059875488281, 6.928367614746094, 7.638153076171875, -10.993820190429688, 37.893951416015625, -85.286865234375, -9.011354446411133, -15.456329345703125, -42.61145782470703, 72.11158752441406, 73.86886596679688, 6.7069091796875, 40.93353271484375, -0.88116455078125, 49.59672546386719, 93.05691528320312, 118.35031127929688, -17.371841430664062, 13.496841430664062, 45.38111114501953, 59.199127197265625, -4.54888916015625, 72.70709228515625, 5.634494781494141, 61.856224060058594, 29.81707763671875, 78.7261962890625, -82.07382202148438, 5.91741943359375, 100.11317443847656, 68.08631134033203, 0.20111083984375, 64.7978286743164, 26.225669860839844, 72.36923217773438, 100.95574951171875, 15.484848022460938, 7.229286193847656, 55.451087951660156, -0.708587646484375, 42.11866760253906, 39.45210266113281, 5.5970458984375, -12.340438842773438, 23.74127197265625, -8.275726318359375, -2.97955322265625, 72.01138305664062, 33.767242431640625, 30.341445922851562, 46.91258239746094, 32.82164001464844, 28.187057495117188, -15.626251220703125, 33.87327575683594, 59.906524658203125, 60.51093292236328, 41.68952560424805, 59.006988525390625, 0.0, 72.92303466796875, 79.79890441894531, 46.3171501159668, 50.233909606933594, 110.78060913085938, 18.522705078125, 2.908294677734375, 35.08116149902344, 30.6590576171875, 51.124961853027344, 75.2620849609375, 17.42095947265625, 20.07915496826172, 10.219863891601562, 60.338592529296875, 
0.85791015625, 18.476242065429688, 129.1224365234375, 57.01988983154297, 93.05279541015625, -1.3323440551757812, -85.11871337890625, 14.768447875976562, -6.094512939453125, 78.64279174804688, -33.023101806640625, 51.1951904296875, 20.79510498046875, 62.8641357421875, 84.02639770507812, 42.3533935546875, 48.060638427734375, 15.552322387695312, 6.9422607421875, -3.3052215576171875, 29.16357421875, 3.8610191345214844, 113.3450927734375, 37.11320495605469, 44.31268310546875, 2.5996742248535156, 10.318878173828125, -10.96466064453125, 55.43545150756836, -67.52432250976562, 52.561126708984375, 38.878509521484375, 0.9670543670654297, 6.0745086669921875, 55.92778015136719, -72.45571899414062, 88.7791748046875, -33.294525146484375, 89.57826232910156], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000190.npy"}
{"epoch": 0.39790575916230364, "step": 191, "batch_size": 128, "mean": 26.103538513183594, "std": 45.53445816040039, "min": -113.91622924804688, "p10": -26.809817123413083, "median": 16.366653442382812, "p90": 85.75855712890625, "max": 120.5797119140625, "pos_frac": 0.7578125, "sample": [97.35263061523438, 64.63595581054688, -10.761138916015625, -4.1885223388671875, 52.662452697753906, 99.2210693359375, 12.463958740234375, 15.335800170898438, 0.08867645263671875, -59.735992431640625, 26.464805603027344, 51.96263885498047, 6.431673049926758, 5.759040832519531, 1.650787353515625, -16.360992431640625, 49.871803283691406, -0.79949951171875, 1.2031364440917969, 37.28565979003906, 44.68495178222656, 86.86102294921875, 14.174629211425781, 1.6746673583984375, -0.14554405212402344, 17.397506713867188, -7.0853729248046875, -79.92642211914062, 63.87005615234375, 53.00148010253906, 12.04620361328125, 3.5023555755615234, -16.411544799804688, 59.87664794921875, 7.56353759765625, -39.700439453125, 51.37244415283203, -8.049484252929688, -113.91622924804688, -6.1259765625, 38.22765350341797, 78.44650268554688, 104.52871704101562, 12.342170715332031, 3.9949913024902344, 105.35002136230469, 14.5916748046875, 0.8072509765625, 8.087417602539062, 13.369293212890625, 0.881103515625, 72.92831420898438, 106.8785400390625, 84.73788452148438, 9.818466186523438, -3.177886962890625, 53.92250061035156, 89.12179565429688, 20.4124755859375, 22.50922393798828, 8.049732208251953, 3.020416259765625, 46.97077941894531, -13.708518981933594, 76.55328369140625, 6.160194396972656, 109.31381225585938, 63.20386505126953, 24.5655517578125, 3.41143798828125, 1.73968505859375, 62.04326629638672, 7.509727478027344, -37.543365478515625, 18.98766326904297, -65.32369995117188, 56.9168701171875, 24.74737548828125, 69.18942260742188, -12.981956481933594, 3.973114013671875, 43.065826416015625, 2.028057098388672, 84.399169921875, -50.02044677734375, 120.5797119140625, 85.28607177734375, -60.4041748046875, 
78.79043579101562, 90.46548461914062, 27.107177734375, 74.41604614257812, 13.26123046875, 0.35518455505371094, -1.608062744140625, -37.94898986816406, 64.15188598632812, 33.69659423828125, 104.26870727539062, 34.173980712890625, 30.393768310546875, 9.563858032226562, 72.33169555664062, 55.138946533203125, 117.51287841796875, 84.0689697265625, 4.0630645751953125, 2.3770751953125, -61.08708190917969, 18.580596923828125, -0.127471923828125, -69.80908203125, 53.945613861083984, 59.082489013671875, 62.35650634765625, 55.784423828125, -5.05517578125, -3.5700302124023438, -28.327072143554688, 74.6346435546875, 44.56549072265625, 43.46321105957031, 25.159408569335938, -39.715576171875, 0.0, 76.7047119140625, -26.159564971923828, 109.52780151367188], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000191.npy"}
{"epoch": 0.4, "step": 192, "batch_size": 128, "mean": 20.802993774414062, "std": 47.16136932373047, "min": -113.91513061523438, "p10": -45.11392364501952, "median": 16.304733276367188, "p90": 84.50582580566406, "max": 109.6773681640625, "pos_frac": 0.6796875, "sample": [7.2406463623046875, 33.2445068359375, 87.48454284667969, -73.47442626953125, 77.8941650390625, 19.090972900390625, -60.376922607421875, 16.165725708007812, -2.5717086791992188, 95.61468505859375, 38.282257080078125, 23.74285888671875, 63.660179138183594, 83.916748046875, 86.39265441894531, -6.889984130859375, -51.96775817871094, 53.91064453125, -113.91513061523438, 109.6773681640625, -82.0404052734375, 103.07061767578125, -11.931732177734375, -70.10209655761719, 8.381805419921875, 83.2891845703125, 7.796531677246094, 10.589645385742188, -13.496025085449219, 64.30865478515625, -93.59637451171875, 34.51415252685547, 58.23533630371094, 61.671173095703125, 0.77276611328125, 2.505840301513672, -26.824493408203125, 62.92621612548828, 15.788970947265625, 16.6549072265625, -0.66253662109375, -12.514892578125, -12.970291137695312, 65.27999877929688, 86.77658081054688, -51.41416931152344, -4.3441162109375, -67.23614501953125, 0.0, 0.0, 1.6750259399414062, -3.894927978515625, -69.26982879638672, 32.3990478515625, -52.90980529785156, 61.73004913330078, 3.364227294921875, 49.905853271484375, -4.878936767578125, -20.499481201171875, 39.74913787841797, 42.27574157714844, -13.924530029296875, 2.874755859375, 73.8975830078125, 84.7752685546875, 69.56800842285156, 83.2548828125, -1.1505241394042969, 70.22042846679688, -42.413818359375, 5.125244140625, 17.175506591796875, -0.4005126953125, 25.212608337402344, 55.05078887939453, 42.47906494140625, 87.43255615234375, 85.75082397460938, 9.193565368652344, 7.013172149658203, 79.91049194335938, -11.146644592285156, -35.946319580078125, -7.0142364501953125, 18.595367431640625, 16.51358413696289, 12.663665771484375, 66.03816986083984, 69.13557434082031, 1.209197998046875, 
20.681640625, -75.23152160644531, 107.12847900390625, 8.596315383911133, 7.530548095703125, -1.9960556030273438, -8.717376708984375, 92.74154663085938, 48.1002197265625, 11.27996826171875, 19.4849853515625, 19.62286376953125, -19.860450744628906, 9.280731201171875, 31.0439453125, 42.46759033203125, 16.786590576171875, 84.39035034179688, 49.69023132324219, 66.69912719726562, -9.899505615234375, 60.57233428955078, 11.346359252929688, 45.01054382324219, 10.408363342285156, 73.99761962890625, -8.356639862060547, -73.549072265625, -1.06280517578125, 5.2789459228515625, 87.53338623046875, 53.357505798339844, 24.679107666015625, 16.443740844726562, 89.49256896972656, -9.656604766845703, 84.17680358886719], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000192.npy"}
|
||||
{"epoch": 0.40209424083769635, "step": 193, "batch_size": 128, "mean": 23.61876678466797, "std": 42.810237884521484, "min": -72.3720703125, "p10": -29.258647155761718, "median": 18.527456283569336, "p90": 83.84706268310546, "max": 108.070068359375, "pos_frac": 0.703125, "sample": [78.06624603271484, -2.6806259155273438, -4.27374267578125, -8.189445495605469, -7.858245849609375, 60.657470703125, 18.69939422607422, 56.027984619140625, 25.052085876464844, 33.30938720703125, 22.74603271484375, 3.660533905029297, 32.76641845703125, 77.43115234375, 3.868896484375, 64.510009765625, -23.76220703125, 33.32970428466797, 0.0665283203125, 3.6934356689453125, 71.47500610351562, -18.09197998046875, 64.0162353515625, 9.68511962890625, 29.1204833984375, 35.300689697265625, -2.735443115234375, -60.02348327636719, 62.85443115234375, -13.93359375, -43.87037658691406, 2.3678436279296875, 29.45135498046875, 23.41082763671875, 60.08890914916992, -19.83929443359375, 90.06146240234375, -46.352447509765625, 4.273477554321289, 34.448448181152344, -71.15118408203125, -10.0518798828125, 0.691162109375, 41.854248046875, 10.507659912109375, 43.6925048828125, 83.70022583007812, 35.0428466796875, 84.42977905273438, 93.40336608886719, -0.760223388671875, 9.057220458984375, 68.7523422241211, 5.473758697509766, 53.36674499511719, 5.250732421875, 97.44478607177734, 28.814239501953125, -15.88653564453125, 36.69696044921875, -9.298828125, 39.68724060058594, 68.8357925415039, 6.446990966796875, 99.68682861328125, -27.355224609375, 50.558101654052734, 41.01995849609375, 22.51727294921875, -5.9730224609375, -45.598388671875, -30.508438110351562, 11.490509033203125, 103.05010986328125, -39.64569091796875, 57.026611328125, 78.54690551757812, 88.49435424804688, 13.486907958984375, 94.87939453125, 73.48959350585938, 36.80145263671875, 73.02883911132812, -43.22357177734375, 96.230712890625, 94.470947265625, 9.653343200683594, 1.6862411499023438, -6.7159576416015625, 20.25555419921875, 58.10943603515625, 
1.3224411010742188, 108.070068359375, 69.42648315429688, 12.985206604003906, 18.164703369140625, 27.841846466064453, 4.926750183105469, 27.197967529296875, 66.20713806152344, 57.17626953125, 84.18968200683594, 15.13929557800293, -42.70121765136719, 50.85650634765625, -15.732093811035156, 100.9361572265625, -26.39404296875, -69.00430297851562, -4.25628662109375, -3.6048736572265625, -3.3675537109375, -72.3720703125, 54.6666259765625, 47.84246826171875, 18.355518341064453, 77.6495361328125, -28.7230224609375, 11.73126220703125, -5.217620849609375, -0.2080078125, -4.420501708984375, -66.4295654296875, 3.298828125, 69.31732177734375, -40.194732666015625, 4.507568359375, 63.72685241699219], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000193.npy"}
|
||||
{"epoch": 0.4041884816753927, "step": 194, "batch_size": 128, "mean": 17.72484588623047, "std": 42.783443450927734, "min": -93.21426391601562, "p10": -23.422811889648436, "median": 8.958740234375, "p90": 77.1368797302246, "max": 119.04330444335938, "pos_frac": 0.6484375, "sample": [114.14743041992188, 22.1090087890625, 20.352279663085938, -50.2381591796875, 24.03607177734375, 110.38739013671875, 45.75837707519531, -9.022247314453125, 0.74444580078125, -2.1291732788085938, 73.89739227294922, -22.834182739257812, -2.789133071899414, 5.4191741943359375, 12.444549560546875, 2.7492923736572266, -10.100730895996094, 60.54564666748047, 89.16586303710938, -1.882293701171875, 18.124481201171875, -3.7627315521240234, 26.662059783935547, 5.6321868896484375, 13.637115478515625, -1.30108642578125, 1.5393218994140625, 27.7603759765625, 1.2015914916992188, -83.85202026367188, -10.570999145507812, 24.978248596191406, -10.530445098876953, 10.586883544921875, 26.197601318359375, 14.578338623046875, 119.04330444335938, 56.856842041015625, 13.613372802734375, 65.76191711425781, -19.964523315429688, 104.62408447265625, -7.7625732421875, -10.917488098144531, -13.932479858398438, 87.86831665039062, -20.652542114257812, 58.500221252441406, 60.32525634765625, 9.608779907226562, -11.005325317382812, 31.577392578125, 28.035171508789062, 34.32623291015625, 110.9547119140625, 39.91069030761719, 35.963409423828125, 31.16845703125, 52.22175598144531, -13.298538208007812, 15.29217529296875, 12.10028076171875, -0.490081787109375, 62.22259521484375, -56.0030517578125, -35.03875732421875, 8.308700561523438, -93.21426391601562, 31.22088623046875, -17.539306640625, 107.5001220703125, -6.15838623046875, -27.534332275390625, 86.41018676757812, 2.74169921875, 80.83447265625, -3.5110702514648438, -17.829856872558594, 35.10479736328125, 70.31626892089844, -20.494140625, 3.88299560546875, 67.72869873046875, 78.1130142211914, -4.252025604248047, -62.210357666015625, 55.183197021484375, -10.849250793457031, 
66.51644897460938, 76.71853637695312, 39.881591796875, -66.38063049316406, 16.03668212890625, -45.90803909301758, 4.799772262573242, -0.090606689453125, 7.312919616699219, -76.26329040527344, 109.43890380859375, 6.429443359375, -4.821319580078125, 49.70123291015625, 0.4554443359375, 7.090095520019531, -57.72663879394531, 11.686431884765625, -13.8543701171875, 30.69134521484375, -4.917022705078125, -12.94195556640625, 0.12640380859375, 91.70892333984375, 63.57647705078125, 10.014358520507812, 38.269378662109375, 31.67169189453125, 68.25920104980469, 1.8329811096191406, -32.10823059082031, 4.242805480957031, 2.3253173828125, -7.165191650390625, 59.588348388671875, -22.958740234375, 2.9173431396484375, -24.505645751953125, 36.0885009765625, 52.737945556640625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000194.npy"}
|
||||
{"epoch": 0.406282722513089, "step": 195, "batch_size": 128, "mean": 17.322223663330078, "std": 44.658843994140625, "min": -87.326904296875, "p10": -28.709797668457032, "median": 9.79661750793457, "p90": 75.08939666748047, "max": 127.03399658203125, "pos_frac": 0.609375, "sample": [-9.101394653320312, -2.7698593139648438, -2.617279052734375, 67.4033203125, 75.07008361816406, 69.33740234375, -71.62124633789062, 5.232574462890625, 59.890350341796875, 9.628276824951172, 40.44049072265625, 9.607032775878906, 3.6431503295898438, -9.065704345703125, -5.2073822021484375, 20.4935302734375, -9.96246337890625, -13.7147216796875, 20.711944580078125, 30.596923828125, -18.240516662597656, 14.600372314453125, 74.64613342285156, -9.875473022460938, -16.449462890625, -4.5702667236328125, 1.4698410034179688, -0.508544921875, -29.2188720703125, 10.8939208984375, -11.61322021484375, 62.4141845703125, -24.942794799804688, -12.861770629882812, 4.41912841796875, 1.4161834716796875, 89.36323547363281, 93.65518188476562, -4.646209716796875, -4.6480712890625, -23.28643798828125, -9.366851806640625, 14.264595031738281, 6.3112030029296875, 75.6031494140625, 5.5190582275390625, 75.15325927734375, -70.18252563476562, -21.32489013671875, 40.160003662109375, -41.26417541503906, 42.038482666015625, 33.63356018066406, 34.914154052734375, -84.23419189453125, 26.77618408203125, -5.5128173828125, -71.38626098632812, 76.01902770996094, -6.810203552246094, 28.931900024414062, -87.326904296875, 5.2157135009765625, 59.29047393798828, -18.076187133789062, -75.465087890625, -71.873291015625, -11.7984619140625, -1.9094085693359375, 51.64984130859375, 95.08123779296875, 39.789794921875, 65.45138549804688, -24.3970947265625, 32.72100830078125, 88.51153564453125, 14.61798095703125, 1.67230224609375, 37.16960906982422, 16.46795654296875, 51.53770446777344, 56.69189453125, 34.138671875, 46.120025634765625, -1.7529449462890625, -2.727294921875, -0.8221282958984375, -57.1151123046875, -0.8654079437255859, 
29.62128448486328, -29.903823852539062, 120.41522216796875, 68.41207885742188, -0.454193115234375, 66.27062225341797, 53.605926513671875, 127.03399658203125, 7.096832275390625, 65.93763732910156, 95.21295166015625, -27.068618774414062, 51.99028015136719, 9.964958190917969, 67.92002868652344, 6.256072998046875, 10.312088012695312, -12.48992919921875, 60.3809814453125, 52.277984619140625, 75.13446044921875, 104.87771606445312, 10.292999267578125, -82.9808349609375, 21.925079345703125, -28.491622924804688, 7.205162048339844, 88.82672119140625, -64.24636840820312, 11.610153198242188, 63.8961181640625, 11.00567626953125, -5.9456634521484375, -3.29736328125, 50.59156799316406, 74.34996032714844, 47.2459716796875, 59.11474609375, -17.914154052734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000195.npy"}
|
||||
{"epoch": 0.4083769633507853, "step": 196, "batch_size": 128, "mean": 29.476726531982422, "std": 39.85453796386719, "min": -66.49310302734375, "p10": -13.458691406249999, "median": 18.757835388183594, "p90": 79.20834884643554, "max": 124.94619750976562, "pos_frac": 0.7578125, "sample": [0.050815582275390625, 10.032470703125, 16.92192840576172, 1.4591217041015625, 14.398292541503906, 9.656585693359375, -20.33282470703125, 2.0676193237304688, 68.29035949707031, -0.1870269775390625, 4.7568511962890625, 83.70021057128906, 45.029205322265625, 46.08587646484375, 102.21481323242188, 1.76568603515625, 2.484283447265625, 4.16461181640625, 12.673614501953125, 22.82476806640625, 50.37809753417969, 6.373264312744141, 54.437530517578125, 79.32060241699219, -1.1554183959960938, 60.18400573730469, -11.04892349243164, -0.6407718658447266, -55.08263397216797, 71.63958740234375, 106.61245727539062, 84.73160552978516, 55.456298828125, 57.6226806640625, -0.110595703125, 54.26751708984375, 8.7056884765625, -37.294586181640625, 68.85018920898438, 50.616065979003906, -23.802871704101562, 94.9647216796875, 36.98577880859375, -9.681076049804688, 13.53139877319336, 10.427993774414062, 16.028900146484375, 97.4073486328125, 19.473297119140625, 52.664398193359375, -29.681991577148438, 0.2588462829589844, 1.5474853515625, 5.458995819091797, -2.0674667358398438, 81.4573974609375, 75.89389038085938, 3.6236696243286133, 6.78826904296875, 47.8515625, 60.265533447265625, 10.4674072265625, 79.16024017333984, -3.585174560546875, 20.757659912109375, 86.30122375488281, -19.934776306152344, 35.581817626953125, -12.09112548828125, -16.00469970703125, 18.089767456054688, 19.4259033203125, 0.4386444091796875, 62.70745849609375, 45.739837646484375, 73.38825988769531, 0.716278076171875, 20.339141845703125, 53.60655212402344, 38.7322998046875, 19.50279998779297, -48.57542419433594, 61.30439758300781, -3.7913856506347656, -13.06585693359375, 65.0618896484375, 62.42523193359375, 74.18331909179688, 
73.04873657226562, 3.6380462646484375, 73.83489990234375, -66.49310302734375, 23.74786376953125, 107.989013671875, 53.46112060546875, -5.114982604980469, -19.416793823242188, -13.857162475585938, 70.382568359375, 77.06031799316406, 63.169281005859375, -13.287918090820312, 8.16461181640625, 124.94619750976562, 89.51095581054688, 14.2347412109375, -26.180644989013672, 68.17472076416016, 76.81207275390625, 78.98065185546875, 68.10198974609375, 104.92877197265625, 76.392333984375, 74.74685668945312, -7.0390625, -5.7051544189453125, -6.878686904907227, 64.50541687011719, 60.01075744628906, 38.273956298828125, 2.74664306640625, -6.034431457519531, -42.215911865234375, 56.18827819824219, 9.600784301757812, 4.89276123046875, -8.45855712890625, 3.9615478515625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000196.npy"}
|
||||
{"epoch": 0.41047120418848165, "step": 197, "batch_size": 128, "mean": 37.930110931396484, "std": 43.53870391845703, "min": -77.62179565429688, "p10": -13.71423645019531, "median": 41.58791732788086, "p90": 90.98295974731445, "max": 143.15313720703125, "pos_frac": 0.796875, "sample": [-67.45481872558594, 5.5957183837890625, -1.418121337890625, 81.57563781738281, 56.90150451660156, -8.882522583007812, 66.0201416015625, 23.29364013671875, 23.73971176147461, 86.6676025390625, 89.2412109375, 94.93605041503906, -10.51580810546875, 38.146087646484375, 51.11573791503906, -29.75364875793457, 125.40802001953125, 98.05310821533203, 61.901947021484375, 102.52899169921875, 1.6443233489990234, 62.779029846191406, 91.0760726928711, 23.963531494140625, 9.097549438476562, 31.16302490234375, 66.212890625, 90.94305419921875, 79.74603271484375, 81.48673248291016, 69.552978515625, 52.564613342285156, -52.00419616699219, 54.30120849609375, 29.7354736328125, 34.756187438964844, 66.42884063720703, -0.47472381591796875, 35.382362365722656, 92.4365234375, 46.529327392578125, 56.506103515625, -77.62179565429688, 88.88824462890625, -50.7913818359375, 19.353240966796875, 98.56906127929688, 0.5226287841796875, -17.75579833984375, -11.851593017578125, 57.57176208496094, 56.44476318359375, 29.425933837890625, -15.852752685546875, 4.839958190917969, -1.9302902221679688, 68.63125610351562, 7.37078857421875, 82.53549194335938, 69.26181030273438, 61.93583679199219, 143.15313720703125, 3.6874847412109375, 93.34912109375, -32.79766845703125, 0.9544525146484375, 62.6014404296875, 76.58061981201172, 42.55986785888672, 39.298065185546875, 21.591346740722656, 40.615966796875, -2.2280807495117188, 72.48289489746094, 1.1063690185546875, 80.90599060058594, -8.723602294921875, 43.7899169921875, -36.40772247314453, 4.64190673828125, -12.7977294921875, 8.875167846679688, 42.7100830078125, 52.29107666015625, -28.27642822265625, 75.07958984375, 3.7344970703125, 51.37183380126953, 67.364013671875, 
40.481903076171875, 33.63330078125, -0.191802978515625, 84.21173095703125, 42.6417236328125, 62.2064208984375, 44.789215087890625, 27.232940673828125, -40.34674072265625, 17.603134155273438, -57.118194580078125, 12.2969970703125, 54.87396240234375, -5.022327423095703, 71.918212890625, 79.19378662109375, 55.854888916015625, 35.65350341796875, 36.54298400878906, 85.8197021484375, 40.20263671875, 109.19296264648438, 102.08500671386719, 16.248046875, 6.4017486572265625, 68.38835144042969, 17.34661865234375, 97.0419921875, 1.1373748779296875, 71.98263549804688, 16.590293884277344, 0.0, 128.28350830078125, 66.6253662109375, 59.572479248046875, 67.83296966552734, -37.85491943359375, 56.903106689453125, -3.1893386840820312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000197.npy"}
|
||||
{"epoch": 0.41256544502617803, "step": 198, "batch_size": 128, "mean": 29.04886245727539, "std": 49.1840705871582, "min": -106.69319152832031, "p10": -21.78999977111816, "median": 21.6397705078125, "p90": 91.32843017578125, "max": 123.93356323242188, "pos_frac": 0.75, "sample": [74.26007080078125, 21.48382568359375, 10.021484375, -20.880126953125, 60.748321533203125, -0.1429290771484375, 75.33905029296875, 92.55868530273438, 33.4803466796875, 14.0029296875, -12.91351318359375, -7.763042449951172, 43.00701904296875, 35.361480712890625, 9.893718719482422, -7.597747802734375, 101.32737731933594, 77.6209716796875, 63.96624755859375, 9.466461181640625, 89.97317504882812, -87.19610595703125, -77.46240234375, 90.80117797851562, 123.93356323242188, 116.21478271484375, -34.8826904296875, 13.721221923828125, 18.849212646484375, 21.79571533203125, 81.1270751953125, 74.92471313476562, 76.94935607910156, 35.94964599609375, 1.854888916015625, 56.24476623535156, -33.086578369140625, 123.0206298828125, -46.56364440917969, 116.92666625976562, -19.842132568359375, 3.776397705078125, 88.10299682617188, 73.28050231933594, -17.403831481933594, 0.8322849273681641, 107.68939208984375, -23.913036346435547, -36.760154724121094, 4.864398956298828, 50.169464111328125, 46.39344787597656, -2.9545936584472656, 78.92689514160156, 29.101608276367188, 72.02996826171875, 100.68707275390625, 3.0616531372070312, 24.9593505859375, 46.862548828125, 13.615219116210938, 28.617523193359375, -6.6999053955078125, 90.26290893554688, 57.599853515625, 19.3031005859375, 1.4459266662597656, 99.540283203125, 29.949203491210938, 33.465660095214844, 15.505142211914062, -78.24078369140625, -18.86322021484375, 45.211883544921875, -106.69319152832031, -19.14813232421875, 18.724395751953125, 82.45199584960938, -15.547866821289062, 53.153076171875, 37.66035461425781, 79.07601928710938, 64.79170227050781, -12.573211669921875, 73.3013687133789, 1.0435638427734375, 53.50984191894531, 89.5865478515625, -69.84474182128906, 
13.029052734375, 0.0, 88.3797607421875, 11.660209655761719, -18.504302978515625, -56.04083251953125, 11.46746826171875, -2.6386642456054688, 63.14110565185547, 98.1817626953125, 37.50872802734375, 102.83343505859375, 2.8939170837402344, 75.79179382324219, 54.70745849609375, 26.8394775390625, 55.345123291015625, 11.7998046875, 3.997509002685547, 1.52490234375, -105.1602783203125, 103.40032958984375, 7.591583251953125, 2.3601112365722656, 1.1035385131835938, -4.424427032470703, -4.13397216796875, 106.7213134765625, 26.7684326171875, 85.45431518554688, 12.759002685546875, 30.3070068359375, -27.67742919921875, 7.3372802734375, -9.880016326904297, 17.082237243652344, 83.55828857421875, 75.29254150390625, 1.4725341796875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000198.npy"}
|
||||
{"epoch": 0.41465968586387436, "step": 199, "batch_size": 128, "mean": 27.167959213256836, "std": 45.38579559326172, "min": -86.14987182617188, "p10": -31.518524169921875, "median": 23.094348907470703, "p90": 87.65329513549804, "max": 122.580078125, "pos_frac": 0.7265625, "sample": [17.221054077148438, 39.4063720703125, 82.0308837890625, 70.45712280273438, 94.76577758789062, 68.4985580444336, 68.35226440429688, 8.9979248046875, 87.087890625, 28.72222900390625, 114.93206787109375, 76.96984100341797, 37.296356201171875, 62.29205322265625, 107.92626953125, -54.472808837890625, 70.1922607421875, 58.98699951171875, 7.700691223144531, -2.8326492309570312, -3.9124374389648438, 2.0986709594726562, -1.5423145294189453, 2.514190673828125, -3.377716064453125, 37.82173156738281, 54.585243225097656, 2.2356643676757812, 8.98828125, 3.451751708984375, 37.97787094116211, 48.629051208496094, 62.06829071044922, 41.223907470703125, 27.641563415527344, 57.61961364746094, -40.113243103027344, -14.467521667480469, 71.6407470703125, 11.774078369140625, 11.5009765625, 75.57830810546875, 12.065887451171875, 4.775388717651367, 28.954116821289062, 74.57025146484375, -12.07080078125, -6.1047210693359375, -13.607864379882812, -65.75393676757812, 66.75211334228516, 105.336181640625, 15.39874267578125, 92.9755859375, -12.269210815429688, 28.917739868164062, 22.068023681640625, 22.77635955810547, 93.54191589355469, -40.30419158935547, -82.38916015625, 71.11151123046875, 81.13006591796875, 25.898391723632812, -22.996978759765625, -45.45051956176758, 55.27893829345703, 12.76898193359375, -17.877593994140625, 30.260772705078125, 105.16378784179688, -0.978271484375, 4.54914665222168, 110.267578125, 22.199432373046875, -2.32879638671875, 48.685546875, 25.079299926757812, -1.3927173614501953, 47.68701171875, 97.73245239257812, 3.468536376953125, 68.66265869140625, -10.573013305664062, 73.63320922851562, 89.60702514648438, -68.66498565673828, 64.5531005859375, 76.22433471679688, 34.726531982421875, 
-31.58367919921875, -17.374740600585938, 56.393463134765625, 92.49371337890625, -6.4925537109375, 51.422607421875, -0.29918479919433594, 10.30206298828125, 23.412338256835938, 37.40391540527344, 24.1890869140625, -39.91071319580078, 16.90728759765625, 11.103240966796875, 65.041259765625, -86.14987182617188, -63.71050262451172, 0.9654674530029297, 12.3795166015625, 52.480804443359375, 13.090232849121094, 11.8392333984375, 77.02096557617188, -3.5465850830078125, 68.01869201660156, 15.8424072265625, 88.97257232666016, 3.0002593994140625, -5.703071594238281, 55.51588439941406, 9.439422607421875, -70.872802734375, -18.336898803710938, -40.564491271972656, 122.580078125, 80.28424072265625, 40.91004943847656, -31.4906005859375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000199.npy"}
|
||||
{"epoch": 0.4167539267015707, "step": 200, "batch_size": 128, "mean": 23.25149917602539, "std": 52.8889274597168, "min": -101.50465393066406, "p10": -50.91798706054688, "median": 13.580425262451172, "p90": 93.71079101562499, "max": 141.0789794921875, "pos_frac": 0.6640625, "sample": [69.44342041015625, -64.248779296875, 113.72604370117188, 16.80854034423828, -50.793731689453125, 13.693244934082031, -45.82878875732422, 63.35687255859375, -6.342254638671875, 7.048826217651367, -3.3698272705078125, 78.29541015625, 23.147674560546875, 3.1155014038085938, -86.71710205078125, 80.2562255859375, 66.58209991455078, -13.172409057617188, 29.438613891601562, 121.09017944335938, -84.22808837890625, -39.82574462890625, 106.43045043945312, 45.160804748535156, 97.077880859375, 68.55169677734375, -20.705604553222656, -9.8359375, 32.863250732421875, -6.7232666015625, -36.261138916015625, -91.73263549804688, 12.386682510375977, -56.153076171875, 13.467605590820312, 46.9720458984375, -56.16399383544922, 75.2298583984375, 69.82839965820312, -10.45074462890625, 0.0, 52.664878845214844, 90.72955322265625, 10.824310302734375, -6.66644287109375, -81.61709594726562, 84.10595703125, 8.191650390625, 116.92425537109375, -7.259246826171875, 17.66790771484375, 12.83782958984375, 92.76776123046875, 4.406288146972656, -9.346633911132812, -20.18670654296875, 17.366676330566406, -74.7248764038086, 101.81732177734375, -1.0682525634765625, 53.050750732421875, 4.82806396484375, 1.5903301239013672, 69.07064819335938, 75.29244995117188, 67.02920532226562, 10.754203796386719, 95.91119384765625, -51.207916259765625, 60.1658935546875, 102.70195007324219, 50.41668701171875, -5.996429443359375, 44.702186584472656, 44.716033935546875, 24.2001953125, -7.436614990234375, -23.315956115722656, 8.532501220703125, 64.837158203125, -80.28433227539062, 99.15325927734375, 141.0789794921875, 0.0254669189453125, 13.01531982421875, 48.87092590332031, -7.9070892333984375, 67.8419189453125, -101.50465393066406, 
87.98260498046875, 10.132369995117188, 120.00958251953125, -57.59233093261719, 88.42160034179688, 59.4515380859375, 85.03488159179688, 5.246856689453125, 102.28598022460938, -27.46795654296875, -49.7142333984375, 15.39471435546875, 45.70068359375, 74.771484375, 37.678871154785156, -14.95904541015625, 81.18955993652344, 11.065086364746094, -56.416038513183594, -4.942413330078125, -0.40093994140625, 10.848030090332031, 69.25572967529297, 60.12322998046875, 116.96499633789062, 50.16661071777344, 40.778533935546875, -6.5938720703125, -16.072120666503906, 62.1458740234375, 41.470272064208984, 7.8157501220703125, -1.9670324325561523, 45.382835388183594, 4.822242736816406, -26.824920654296875, 14.35601806640625, 37.712371826171875, 1.9491119384765625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000200.npy"}
|
||||
{"epoch": 0.418848167539267, "step": 201, "batch_size": 128, "mean": 28.5308780670166, "std": 43.81116485595703, "min": -78.85836791992188, "p10": -21.005171203613273, "median": 22.04343032836914, "p90": 86.31376342773437, "max": 128.62338256835938, "pos_frac": 0.71875, "sample": [-41.514739990234375, -14.36456298828125, 56.97437286376953, 76.71905517578125, 82.19758605957031, 46.8724365234375, 27.605316162109375, 19.219024658203125, 23.8773193359375, 67.21946716308594, 80.38479614257812, 1.9348993301391602, -5.398284912109375, 24.444168090820312, -47.995880126953125, -34.173187255859375, 77.5303955078125, 95.521484375, 63.046226501464844, 43.49540710449219, 16.02496337890625, 85.82221984863281, 87.94368743896484, 0.0, 66.5644760131836, 39.19279479980469, 74.762451171875, 54.202667236328125, 1.83734130859375, 87.64202880859375, 98.27182006835938, 50.60400390625, 19.67041778564453, -7.800361633300781, 0.0, -59.935791015625, 8.64300537109375, -2.5699996948242188, 8.579418182373047, 6.4081573486328125, 34.69596862792969, 40.321868896484375, 79.64234924316406, 75.76677703857422, -44.478973388671875, 14.2381591796875, 60.9573974609375, 61.32185363769531, -17.327056884765625, 50.438392639160156, -2.3655853271484375, 21.69879913330078, 22.3880615234375, 63.70947265625, -2.3701553344726562, 16.443099975585938, 18.75908660888672, 83.15364074707031, 65.24839782714844, 79.27700805664062, -4.812652587890625, 50.28082275390625, 100.272705078125, 13.54248046875, 91.27273559570312, 69.47122192382812, 32.39923095703125, 0.0, 9.971519470214844, -78.85836791992188, 16.8348388671875, 9.225830078125, -0.599853515625, 109.66778564453125, 42.049560546875, -0.10479736328125, 0.8599205017089844, 8.5390625, -9.7159423828125, 37.723419189453125, 22.909530639648438, 25.671661376953125, -14.230611801147461, -12.711784362792969, -0.09490966796875, 91.177978515625, -7.483055114746094, 86.59506225585938, -26.548828125, 128.62338256835938, 54.946044921875, 61.39044189453125, 58.416290283203125, 
-6.3214111328125, -31.68310546875, -14.840179443359375, 9.960807800292969, 61.401641845703125, 6.322582244873047, 88.05780029296875, 76.24162292480469, 70.53330993652344, -61.754425048828125, 72.89114379882812, 17.922943115234375, -7.5558013916015625, -59.24766540527344, 22.547119140625, 84.33779907226562, -50.5040283203125, 19.91094970703125, 30.699310302734375, 14.4278564453125, 0.514312744140625, -3.2944183349609375, 104.455810546875, 12.159896850585938, 50.348793029785156, 51.15624237060547, 86.19320678710938, 8.774089813232422, -55.268646240234375, 10.46124267578125, 95.05636596679688, 1.1966629028320312, 65.58241271972656, -18.629318237304688, -67.76022338867188], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000201.npy"}
|
||||
{"epoch": 0.42094240837696334, "step": 202, "batch_size": 128, "mean": 28.487892150878906, "std": 50.47046661376953, "min": -106.47433471679688, "p10": -27.877239990234372, "median": 16.652694702148438, "p90": 99.39573974609375, "max": 113.32586669921875, "pos_frac": 0.71875, "sample": [104.1484375, 9.093215942382812, 5.4737091064453125, 15.246002197265625, 90.91925048828125, 0.5455703735351562, -20.567047119140625, 102.25981140136719, -61.8177490234375, 0.50885009765625, -5.55352783203125, 99.9437255859375, 23.070022583007812, 66.34365844726562, 68.64608764648438, -2.5489940643310547, -19.260269165039062, 70.61174011230469, 6.078178405761719, 109.25726318359375, 22.208904266357422, 77.68499755859375, -10.437591552734375, 16.3240966796875, -20.93470001220703, 7.74102783203125, 105.242431640625, 1.0288753509521484, 58.64237976074219, 9.194267272949219, -50.57500457763672, 28.841400146484375, -18.086288452148438, 100.303955078125, 42.41438293457031, 76.80270385742188, 66.37301635742188, 108.55355834960938, 10.017494201660156, -68.42449951171875, -0.38362884521484375, 107.4638671875, 63.744346618652344, 88.51300048828125, -2.7522430419921875, 49.293609619140625, 45.82954406738281, 39.68714904785156, 84.4107666015625, -3.8786468505859375, -80.47862243652344, 65.43667602539062, 85.27374267578125, 1.2958908081054688, -75.38296508789062, 58.2303466796875, 55.1339111328125, -8.116424560546875, 27.14398193359375, -30.78612518310547, 62.64723205566406, 69.43971252441406, 100.30520629882812, -25.640045166015625, -8.08929443359375, -15.71197509765625, 104.58843994140625, 0.49311065673828125, 15.008880615234375, 14.5177001953125, 1.8028373718261719, 85.35882568359375, 111.2169189453125, 1.7851619720458984, 67.404541015625, 95.76838684082031, 70.32172393798828, 42.292205810546875, 90.88258361816406, 80.49456787109375, 0.0, 113.32586669921875, 1.4970664978027344, 15.542633056640625, 16.981292724609375, 4.013828277587891, 1.377471923828125, -29.77032470703125, 3.3460235595703125, 
-31.560333251953125, 60.246002197265625, 75.8970947265625, 3.247711181640625, 17.241287231445312, 96.71356201171875, 57.63719177246094, -33.92994689941406, -16.42840576171875, -14.75518798828125, -27.06591796875, -106.47433471679688, 94.999755859375, 36.501609802246094, 85.10737609863281, 34.421630859375, 24.926177978515625, -13.379180908203125, 15.296478271484375, 30.146093368530273, 0.3412322998046875, 62.65386962890625, 76.24803161621094, -54.113197326660156, 12.347900390625, 60.97662353515625, -100.32986450195312, 31.2022705078125, 70.85940551757812, -71.29083251953125, -2.4428634643554688, 4.071908950805664, 85.24334716796875, 0.8895645141601562, -2.6284847259521484, -1.2314414978027344, 99.160888671875, -12.934585571289062, 112.445556640625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000202.npy"}
|
||||
{"epoch": 0.42303664921465967, "step": 203, "batch_size": 128, "mean": 21.826416015625, "std": 38.516876220703125, "min": -93.14555358886719, "p10": -24.559359741210937, "median": 13.347005844116211, "p90": 76.29086303710938, "max": 120.10549926757812, "pos_frac": 0.71875, "sample": [2.4247798919677734, 10.866950988769531, 5.791168212890625, 70.3890609741211, 71.57852172851562, 22.7867431640625, 10.062671661376953, 10.951950073242188, 43.228973388671875, 69.29922485351562, 70.03955078125, 26.714111328125, 4.379072189331055, 38.22027587890625, 26.349136352539062, 37.4332275390625, -12.593338012695312, -49.70665740966797, -9.84942626953125, 10.491073608398438, 16.326675415039062, 0.0, -41.689727783203125, 55.26325225830078, -25.523056030273438, 11.336265563964844, -21.811065673828125, 64.6356201171875, 0.8651123046875, -3.323650360107422, 24.306854248046875, 42.12419128417969, 28.058135986328125, 77.56085205078125, 55.59033966064453, -6.338489532470703, 29.010467529296875, 109.963134765625, 15.61834716796875, -13.6500244140625, 40.00852966308594, 84.9522705078125, 26.98699951171875, 65.99752807617188, 1.8271751403808594, -4.6553192138671875, 14.95281982421875, 65.95243835449219, -5.5116729736328125, 60.19440460205078, 0.25654029846191406, 43.48186492919922, 25.663360595703125, 38.87957000732422, 120.10549926757812, -29.634307861328125, 47.65472412109375, -1.51751708984375, 79.48408508300781, 113.79727172851562, 5.275199890136719, 40.845428466796875, 46.518585205078125, -2.7605514526367188, -56.27459716796875, 84.47931671142578, -49.160736083984375, 78.7862548828125, 10.52670669555664, -3.0001220703125, 31.107513427734375, 3.3479080200195312, -93.14555358886719, 66.217041015625, -6.605987548828125, -2.538421630859375, -26.639068603515625, 1.206451416015625, -27.97381591796875, 4.029594421386719, 31.414703369140625, 11.503677368164062, -48.433837890625, 47.93693542480469, 77.97743225097656, 20.063385009765625, 60.393585205078125, 24.98016357421875, 92.801513671875, 
75.74658203125, 11.4019775390625, 8.99493408203125, -6.9264373779296875, 54.54205322265625, 7.660654067993164, -10.22967529296875, 36.8115234375, -14.733505249023438, 74.02721405029297, 11.894683837890625, 70.53379821777344, 2.7315826416015625, -2.42584228515625, 9.722564697265625, 11.509944915771484, -53.369354248046875, -3.20867919921875, -32.489784240722656, 16.972503662109375, 100.78097534179688, 0.0, 45.43109130859375, 34.95574951171875, 27.6451416015625, 0.24984359741210938, -3.6336669921875, 25.985313415527344, 83.33285522460938, 9.186965942382812, -1.494781494140625, 32.646453857421875, -34.555419921875, 7.147491455078125, 82.68450927734375, 14.799327850341797, -24.146347045898438, 10.193960189819336, 24.46991729736328], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000203.npy"}
{"epoch": 0.42513089005235605, "step": 204, "batch_size": 128, "mean": 31.34152603149414, "std": 44.93619155883789, "min": -85.33782958984375, "p10": -21.67203826904296, "median": 21.989166259765625, "p90": 89.77039947509765, "max": 138.96987915039062, "pos_frac": 0.765625, "sample": [110.118408203125, 71.06529235839844, 108.75830078125, 4.29931640625, 13.337684631347656, 89.01187133789062, 5.040493011474609, 56.715476989746094, 5.109148025512695, 102.63726806640625, 12.78643798828125, 8.53546142578125, 69.67449951171875, -19.104217529296875, -35.0899658203125, -85.33782958984375, 15.93402099609375, -31.75262451171875, 5.301506042480469, 21.37896728515625, 87.01129150390625, 3.8453598022460938, 36.16888427734375, 68.52323913574219, 25.097869873046875, 9.46728515625, 13.006690979003906, -48.04005432128906, 11.279220581054688, 21.9744873046875, 31.907379150390625, 138.96987915039062, 0.024078369140625, -29.755294799804688, 42.122528076171875, -14.01654052734375, 7.4852142333984375, 117.394287109375, -3.6929168701171875, 77.94046020507812, 99.90034484863281, 9.846298217773438, 41.928070068359375, -14.354400634765625, 74.218017578125, 74.13226318359375, 14.423828125, 113.63836669921875, 85.16659545898438, 105.53506469726562, 75.09378814697266, 57.3460693359375, -1.22906494140625, 63.040443420410156, 22.00384521484375, -2.7249755859375, 72.611083984375, 57.34697723388672, -1.6110687255859375, -27.663619995117188, 29.305938720703125, 72.24191284179688, 29.600555419921875, -45.86839294433594, -8.735946655273438, 104.01007080078125, 38.38897705078125, 73.9554443359375, -70.96710205078125, 81.08551025390625, -8.264291763305664, 76.07131958007812, -1.0508880615234375, -12.10205078125, 6.092044830322266, 5.769233703613281, 30.37781524658203, 82.58585357666016, -3.3123931884765625, 19.4896240234375, 8.978462219238281, -6.40478515625, -39.53949737548828, 10.5347900390625, 68.58447265625, -6.55511474609375, 62.999420166015625, 15.964279174804688, 74.8125, 69.99468994140625, 
89.597900390625, 18.589065551757812, -37.9508056640625, 95.58154296875, 5.97239875793457, 3.45648193359375, 80.60829162597656, -39.614990234375, 22.233932495117188, 88.1405029296875, 9.137542724609375, -1.3968505859375, 7.451995849609375, 41.273162841796875, 67.68880462646484, 99.9581298828125, 26.110191345214844, 31.89319610595703, 63.601959228515625, 32.027015686035156, 61.0401611328125, 0.3031158447265625, 51.490753173828125, 80.64187622070312, 12.919815063476562, 27.886688232421875, 68.38502502441406, 3.51666259765625, -5.728485107421875, 23.233856201171875, -35.55400085449219, 8.228729248046875, 90.17289733886719, 52.18199157714844, -55.72918701171875, 113.02728271484375, 10.879615783691406, -1.364013671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000204.npy"}
{"epoch": 0.4272251308900524, "step": 205, "batch_size": 128, "mean": 30.071590423583984, "std": 46.63984680175781, "min": -76.13778686523438, "p10": -25.47197113037109, "median": 27.515167236328125, "p90": 88.8828857421875, "max": 132.373291015625, "pos_frac": 0.7890625, "sample": [61.781280517578125, 0.0, -2.4736785888671875, 2.849262237548828, 8.891733169555664, 78.29296875, -24.755859375, -20.59637451171875, -27.142898559570312, 1.1993789672851562, 132.373291015625, 65.93133544921875, 5.8958740234375, 11.711990356445312, -15.3355712890625, 4.874843597412109, 108.9188232421875, 100.584228515625, 74.72216796875, 11.120574951171875, 6.641998291015625, 31.318748474121094, -12.848876953125, 93.76863098144531, 55.912017822265625, 13.29473876953125, 61.0804443359375, 15.992790222167969, 101.73104858398438, 45.03123474121094, 50.74220275878906, 52.04933166503906, 2.85772705078125, 6.162322998046875, 26.223297119140625, 6.34174919128418, 64.53527069091797, 11.395263671875, 53.881439208984375, 6.579795837402344, 27.978424072265625, 0.2048492431640625, -45.554656982421875, 81.81350708007812, 12.5440673828125, 47.80145263671875, 0.0, 29.32086181640625, -64.55299377441406, 31.14093017578125, 2.1821556091308594, 21.316986083984375, 86.94784545898438, 91.18182373046875, -58.21870422363281, 11.994110107421875, 80.88361358642578, 59.975860595703125, 2.517578125, 55.081878662109375, -73.63744354248047, -1.185455322265625, 7.756317138671875, 45.52781677246094, -27.830322265625, 5.14886474609375, 90.533935546875, -5.8635711669921875, -64.08721923828125, 126.30364990234375, 78.75225830078125, 72.47137451171875, -59.194000244140625, 43.9136962890625, 0.9519271850585938, 74.14247131347656, 5.35028076171875, 71.82767486572266, 68.42911529541016, -7.4290771484375, 28.347763061523438, 107.65715026855469, 64.40670776367188, 84.9478759765625, 29.777557373046875, 9.702787399291992, -74.36611938476562, 2.3076171875, 70.08003997802734, 69.832275390625, 111.52383422851562, 70.4716796875, 
113.3612060546875, -22.525787353515625, 31.605743408203125, 95.69473266601562, 81.56282043457031, 3.854351043701172, 14.887077331542969, 1.655059814453125, 111.81005859375, 68.17245483398438, -4.97833251953125, -41.785675048828125, 42.3497314453125, 57.343017578125, 1.5434494018554688, 70.69818115234375, 64.27445983886719, 66.16461181640625, 27.051910400390625, 10.92901611328125, -76.13778686523438, 7.109010696411133, 38.6058349609375, -48.21821594238281, -67.46304321289062, 84.42411804199219, -10.327598571777344, 67.42549133300781, 47.86541748046875, 57.6282958984375, 0.7174072265625, 50.44482421875, 7.6644134521484375, 88.17529296875, 0.0, 34.91626739501953], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000205.npy"}
{"epoch": 0.4293193717277487, "step": 206, "batch_size": 128, "mean": 28.0247859954834, "std": 42.493003845214844, "min": -111.47283935546875, "p10": -23.95146255493164, "median": 20.49190902709961, "p90": 87.35801239013672, "max": 109.80703735351562, "pos_frac": 0.75, "sample": [91.299072265625, 32.30765151977539, -46.936222076416016, 75.01644897460938, 107.28244018554688, -0.6139144897460938, -18.925537109375, 37.589263916015625, 12.538238525390625, 85.14900207519531, 18.95006561279297, 85.768310546875, 68.58786010742188, -18.913639068603516, -24.60748291015625, 25.065216064453125, 33.0494384765625, 5.617900848388672, -3.3133087158203125, 87.515869140625, 0.4881877899169922, 76.48983764648438, -4.92205810546875, -24.848052978515625, -5.232818603515625, 5.200439453125, -44.91522216796875, 38.459930419921875, 58.025794982910156, 7.028953552246094, 3.16668701171875, -11.32421875, -41.071990966796875, 72.14662170410156, 72.0931396484375, -11.662166595458984, 104.31101989746094, 86.93867492675781, 89.59112548828125, 62.88587951660156, 10.2784423828125, 22.91736602783203, -2.7688446044921875, 27.039947509765625, 43.388824462890625, 28.051605224609375, -26.9952392578125, 87.95706176757812, 74.20486450195312, 79.64903259277344, 32.813232421875, 60.87639617919922, 65.72281646728516, 40.57733154296875, 3.201162338256836, 44.63160705566406, 59.21073913574219, 102.35604858398438, -111.47283935546875, 22.03375244140625, 30.634552001953125, -24.571434020996094, -38.95635986328125, 70.33644104003906, 27.02032470703125, 12.083526611328125, 14.7869873046875, 6.77398681640625, -5.337688446044922, 46.0025634765625, 11.100906372070312, 73.64480590820312, 39.23956298828125, -49.06024169921875, 100.59796142578125, 4.2482452392578125, -11.659317016601562, 41.260658264160156, 8.453277587890625, 1.6400794982910156, -15.233139038085938, -30.626205444335938, 95.46670532226562, 75.03286743164062, 70.82740783691406, 11.01751708984375, 34.74699401855469, -35.21333312988281, 
11.240074157714844, -0.062267303466796875, 63.976837158203125, 70.07145690917969, 65.94778442382812, 6.015781402587891, -14.90606689453125, 43.09814453125, 17.50189208984375, -10.651290893554688, 1.388671875, -6.392030715942383, 71.94134521484375, 30.02142333984375, 98.477294921875, 89.56028747558594, 9.265411376953125, 74.507080078125, -37.502044677734375, -5.3749847412109375, -23.685760498046875, 16.762786865234375, 3.218780517578125, 109.80703735351562, 72.17466735839844, 5.98150634765625, 1.5958175659179688, 96.514892578125, 63.60536193847656, 5.4528045654296875, 87.29035949707031, 36.106201171875, 3.98931884765625, 11.0203857421875, 1.033538818359375, 73.23016357421875, 32.7998046875, 4.0465240478515625, -16.12384033203125, 11.02001953125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000206.npy"}
{"epoch": 0.431413612565445, "step": 207, "batch_size": 128, "mean": 23.215198516845703, "std": 52.559452056884766, "min": -106.1834716796875, "p10": -31.645494842529292, "median": 12.880470275878906, "p90": 94.96940460205077, "max": 140.7144775390625, "pos_frac": 0.6796875, "sample": [65.42730712890625, 48.3626708984375, 15.43414306640625, -9.29684066772461, 17.8607177734375, 22.178680419921875, 34.721435546875, 121.34347534179688, 52.10481262207031, -4.525669097900391, 24.755125045776367, 106.81890869140625, -65.767333984375, 4.512115478515625, -4.3900146484375, 83.52440643310547, 1.0729446411132812, 26.487457275390625, 131.36837768554688, -14.47479248046875, 85.49041748046875, -11.70806884765625, -101.653076171875, -14.7086181640625, 69.82264709472656, 30.758682250976562, -8.82110595703125, 10.410781860351562, -66.68109130859375, 64.179931640625, 77.5146484375, 15.579292297363281, 12.22747802734375, -29.732177734375, 73.1192626953125, 75.08838653564453, 17.2554931640625, -2.243499755859375, 20.94046401977539, 98.52059936523438, -12.369171142578125, 7.2652740478515625, 69.38845825195312, 5.3228759765625, 10.1917724609375, -18.9595947265625, 3.349973678588867, 4.2233734130859375, -90.43869018554688, 13.533462524414062, -106.1834716796875, -16.23291015625, -44.884857177734375, 26.667724609375, 87.41693115234375, 0.0, 8.626178741455078, -30.24786376953125, 52.43414306640625, -15.06256103515625, 110.638427734375, -10.538848876953125, -62.682350158691406, 26.16229248046875, 13.987350463867188, -91.98309326171875, 97.86354064941406, -65.93305969238281, 6.040496826171875, -2.3565673828125, 44.7330322265625, -23.2611083984375, 17.804458618164062, 86.55157470703125, 101.21463012695312, 92.63522338867188, 65.75702667236328, -33.855323791503906, 4.81585693359375, 92.81861877441406, -0.3293609619140625, -68.21597290039062, -0.57244873046875, 33.17105484008789, 9.225805282592773, 140.7144775390625, 14.400520324707031, 61.520233154296875, -22.49383544921875, 
2.2493515014648438, 5.85980224609375, 11.164260864257812, -26.431365966796875, 104.11293029785156, 19.18011474609375, 69.78207397460938, 50.211151123046875, -4.8454742431640625, 45.23248291015625, -85.05020141601562, 103.87750244140625, 89.482177734375, -15.60040283203125, 82.667724609375, 6.4604949951171875, 111.2333984375, 10.387603759765625, 5.73638916015625, 11.370752334594727, 80.57504272460938, 9.71356201171875, 40.2615966796875, -3.0438461303710938, 96.82859802246094, 94.172607421875, -7.849525451660156, 63.105560302734375, 84.56263732910156, 110.97430419921875, -17.591033935546875, 10.564338684082031, -70.52827453613281, 81.5421142578125, 82.76567077636719, -30.69842529296875, 22.299530029296875, 53.43248748779297, 8.62567138671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000207.npy"}
{"epoch": 0.43350785340314135, "step": 208, "batch_size": 128, "mean": 27.356691360473633, "std": 46.49909210205078, "min": -114.4967041015625, "p10": -14.30262565612793, "median": 17.021663665771484, "p90": 87.46936264038085, "max": 143.73248291015625, "pos_frac": 0.734375, "sample": [16.017791748046875, 7.7704620361328125, -23.465652465820312, 17.505615234375, -88.62887573242188, 4.67315673828125, -4.229469299316406, 24.356781005859375, 4.634429931640625, 79.10026550292969, -10.328458786010742, 9.6370849609375, 59.556365966796875, 51.72096252441406, 55.199466705322266, 122.2220458984375, 16.25946044921875, 11.869522094726562, 20.40625, 131.62197875976562, -104.26885986328125, 79.46755981445312, -6.333038330078125, 34.40574645996094, 16.58154296875, 72.270263671875, 114.48846435546875, 65.38374328613281, 16.124374389648438, -7.61846923828125, 143.73248291015625, 63.59203338623047, 56.7908935546875, 40.576171875, -14.351051330566406, 102.8851318359375, -114.4967041015625, 14.47430419921875, 11.78314208984375, 3.081878662109375, 13.302297592163086, -90.81942749023438, 78.66947937011719, -14.281871795654297, 83.1417236328125, 30.3319091796875, 31.369308471679688, 106.07464599609375, 0.0, 129.9222412109375, -34.034881591796875, 1.6257801055908203, 12.399154663085938, -49.78376770019531, 2.5213699340820312, 124.99493408203125, 14.298431396484375, 44.34541320800781, 11.34341049194336, 94.42501831054688, 47.06562423706055, 36.674224853515625, -9.372444152832031, 60.1129150390625, -4.864471435546875, 5.2707061767578125, -0.289794921875, -1.6405105590820312, 21.611045837402344, 61.35646057128906, 128.7763671875, 0.0, 24.014114379882812, 52.13983154296875, 9.449214935302734, 84.64251708984375, -7.022148132324219, 86.30215454101562, 19.22900390625, 64.3152084350586, -5.4884490966796875, 39.66436767578125, 77.92620849609375, 91.89205932617188, -4.561431884765625, 57.385589599609375, -9.766143798828125, -1.6411247253417969, 0.4354095458984375, 39.0380859375, -13.3382568359375, 
20.837158203125, 39.60107421875, 12.090576171875, 75.87249755859375, 32.792633056640625, 74.17363739013672, 11.70819091796875, 94.18240356445312, -15.024360656738281, -0.7653350830078125, 2.3872241973876953, 5.725311279296875, 14.755931854248047, 18.131858825683594, 10.264122009277344, -1.492279052734375, 78.8453369140625, 17.46178436279297, 23.089614868164062, 19.198272705078125, -23.50457763671875, 65.09998321533203, 43.3455810546875, 8.2685546875, -39.571319580078125, 16.255218505859375, 0.6534938812255859, -27.433578491210938, 51.047637939453125, 60.424896240234375, 54.079864501953125, -9.955986022949219, 90.1928482055664, 33.7073974609375, -39.6197509765625, -3.619140625, 54.8192138671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000208.npy"}
{"epoch": 0.4356020942408377, "step": 209, "batch_size": 128, "mean": 22.819921493530273, "std": 46.28190994262695, "min": -103.81719970703125, "p10": -29.612200927734374, "median": 11.28524398803711, "p90": 84.30317687988281, "max": 123.36477661132812, "pos_frac": 0.625, "sample": [116.87001037597656, -9.70135498046875, 74.47593688964844, 74.69610595703125, -14.0369873046875, -4.70208740234375, 27.0968017578125, -29.3287353515625, -34.985076904296875, 19.738555908203125, 79.7666015625, -12.273780822753906, -0.78021240234375, 7.058254241943359, 112.277587890625, 77.33061981201172, 43.41297912597656, 5.497173309326172, 7.546356201171875, 72.89363098144531, 90.49371337890625, -20.41156005859375, 6.764404296875, -69.20838928222656, 22.3519287109375, 7.9460906982421875, 6.744808197021484, 8.6258544921875, -15.305633544921875, 81.19920349121094, 83.6480712890625, -47.4818115234375, -0.3472137451171875, -56.20283508300781, -5.0630340576171875, -32.126319885253906, 116.41372680664062, 3.350433349609375, 0.0, 10.463607788085938, -29.5517578125, -45.80326843261719, 55.67176055908203, -31.58343505859375, 33.5087890625, 56.306732177734375, 22.869586944580078, 4.125375747680664, -5.066650390625, 100.65083312988281, 39.717437744140625, 30.057907104492188, 17.92645263671875, -17.92523193359375, 55.70587158203125, 0.6377544403076172, -11.55218505859375, 12.106880187988281, 69.97941589355469, -14.480615615844727, 26.47296142578125, -14.507003784179688, -7.49114990234375, 85.83175659179688, 86.91195678710938, 9.229583740234375, -11.791534423828125, -24.570556640625, 100.74411010742188, 2.720489501953125, 21.87353515625, 67.63189697265625, 18.51361083984375, 13.372613906860352, -0.1482391357421875, 0.0, 60.7115478515625, 73.46932983398438, -22.129005432128906, 67.56060791015625, -17.59112548828125, -5.60113525390625, 54.24175262451172, -1.40069580078125, 8.632827758789062, 86.92471313476562, 75.8922119140625, -59.167022705078125, 24.4256591796875, 8.283782958984375, 
23.001449584960938, -61.4871826171875, -20.091033935546875, 59.81670379638672, -2.1307220458984375, -0.04290771484375, 28.44293212890625, 48.82878875732422, 123.36477661132812, -103.81719970703125, -11.435806274414062, 4.872047424316406, -38.747772216796875, -7.4457855224609375, -2.3818359375, -77.02134704589844, -5.257659912109375, 15.82427978515625, 75.37054443359375, 75.2860107421875, 55.48625183105469, 80.51097106933594, 72.09834289550781, 17.539474487304688, 59.4595947265625, 52.56390380859375, 38.27479553222656, 80.399169921875, -2.9930419921875, 115.7698974609375, 33.27301025390625, -29.75323486328125, 58.123260498046875, 48.0396728515625, 95.39166259765625, 81.66531372070312, -23.209861755371094, 86.305908203125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000209.npy"}
{"epoch": 0.437696335078534, "step": 210, "batch_size": 128, "mean": 27.690841674804688, "std": 47.27061080932617, "min": -103.51992797851562, "p10": -17.462974548339844, "median": 20.11066436767578, "p90": 86.51348876953125, "max": 172.53173828125, "pos_frac": 0.7109375, "sample": [-4.247047424316406, -4.5650634765625, 37.2799072265625, -5.13555908203125, 67.031494140625, 9.727554321289062, 107.08447265625, 111.45904541015625, 40.533905029296875, 36.72857666015625, 83.37809753417969, 55.580562591552734, 6.38751220703125, 65.59149169921875, 69.214111328125, -2.9075469970703125, 63.251190185546875, -5.602123260498047, 62.19641876220703, 4.9630279541015625, 13.435449600219727, -1.9759674072265625, -63.808258056640625, 4.4493865966796875, 63.98686218261719, 54.61955261230469, 8.5318603515625, 172.53173828125, -50.7340087890625, -7.5413360595703125, 45.37628173828125, 27.62060546875, 15.3277587890625, 1.081512451171875, 76.40481567382812, -14.759063720703125, 5.739398956298828, 53.6680908203125, 0.6851558685302734, 27.47247314453125, 2.33587646484375, 65.70541381835938, 18.415103912353516, -6.648353576660156, 13.879112243652344, 66.18844604492188, 34.8216552734375, 0.21868896484375, 75.25419616699219, 13.53022575378418, 25.998382568359375, -2.48358154296875, 44.718597412109375, 94.96209716796875, 38.104591369628906, 64.02877044677734, 41.074127197265625, -34.450958251953125, 95.21063232421875, -6.009429931640625, 54.281829833984375, 62.322235107421875, 82.1844482421875, 69.73728942871094, 21.01934814453125, 35.212982177734375, -90.2060546875, -6.9743194580078125, 10.43829345703125, 77.45391845703125, -74.16267395019531, -30.618057250976562, 85.79306030273438, 71.08245849609375, 94.90805053710938, 88.19448852539062, -0.3103752136230469, 9.354080200195312, -7.410980224609375, -103.51992797851562, 72.6791763305664, 10.863201141357422, -1.646341323852539, 103.48834228515625, 19.201980590820312, 83.62957763671875, 124.83258056640625, 32.40802001953125, -1.658447265625, 
-13.068374633789062, -19.186416625976562, 13.163799285888672, 3.112060546875, -27.195266723632812, 22.622583389282227, 45.014892578125, -7.656829833984375, 112.84297180175781, -15.759796142578125, 53.89287567138672, 60.706939697265625, 120.42874145507812, 7.205474853515625, -17.423080444335938, 100.66522216796875, 23.356056213378906, 93.7663803100586, 0.8553543090820312, 12.749481201171875, -17.556060791015625, 53.693992614746094, 22.423858642578125, -2.8032989501953125, 17.281982421875, 16.598587036132812, -99.80325317382812, 14.805694580078125, 76.35897827148438, 28.0645751953125, -81.51307678222656, 72.9500961303711, -9.701141357421875, 73.69096374511719, 44.899169921875, -26.453338623046875, -8.62054443359375, -5.22998046875, 37.753143310546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000210.npy"}
{"epoch": 0.4397905759162304, "step": 211, "batch_size": 128, "mean": 37.473602294921875, "std": 44.42290115356445, "min": -75.00930786132812, "p10": -10.780265808105469, "median": 37.69696044921875, "p90": 92.90028686523436, "max": 138.60507202148438, "pos_frac": 0.7734375, "sample": [59.895164489746094, 17.960067749023438, 111.41583251953125, 61.8260498046875, 75.41285705566406, 91.81939697265625, 14.567832946777344, 6.794979095458984, -12.610458374023438, 64.16647338867188, 85.54074096679688, -75.00930786132812, 40.53131103515625, 43.967132568359375, -11.621726989746094, 85.85089111328125, 0.9790115356445312, -0.14064598083496094, -1.862762451171875, -17.171966552734375, 8.383407592773438, 68.39125061035156, 15.468917846679688, 15.49945068359375, 64.00028991699219, 27.49920654296875, 8.486328125, -56.78179931640625, 72.41030883789062, -10.61846923828125, 6.881664276123047, 74.50201416015625, 0.1808624267578125, 67.46205139160156, -3.7843551635742188, 37.974517822265625, 47.901817321777344, -70.95195770263672, 34.535980224609375, 87.81106567382812, 8.0567626953125, 56.461395263671875, 9.384307861328125, -1.250152587890625, -2.944854736328125, 123.3779296875, 91.03242492675781, 52.23077392578125, 3.70989990234375, 10.607791900634766, 67.69036865234375, -11.157791137695312, 2.489166259765625, -1.610137939453125, 138.60507202148438, 75.99542236328125, 95.42236328125, -19.94379425048828, 22.5181884765625, 62.54840087890625, -8.26300048828125, -14.397979736328125, -1.2611846923828125, 128.90411376953125, 1.147918701171875, 4.2085418701171875, -4.63494873046875, 75.62733459472656, 96.609619140625, 15.824920654296875, -22.653839111328125, -1.2025146484375, -2.115297317504883, 66.95498657226562, -2.0034656524658203, 11.451661109924316, 75.972900390625, 65.8790283203125, 74.37621307373047, 72.69488525390625, 54.088165283203125, 57.433746337890625, -1.3450851440429688, 63.5140380859375, 69.37603759765625, 125.48773193359375, 14.06494140625, 0.32244873046875, 
69.614501953125, 6.923084259033203, -12.415390014648438, 123.01272583007812, 100.78729248046875, -31.876312255859375, 101.08331298828125, 19.616989135742188, 88.2899169921875, 61.49334716796875, 12.357501983642578, -8.936843872070312, 73.76220703125, 91.26885986328125, 42.68011474609375, 6.900417327880859, -69.85530090332031, 4.3577117919921875, 45.3697509765625, 41.6668701171875, 102.11932373046875, 59.00201416015625, 52.28570556640625, 12.20814323425293, 87.73200988769531, 37.419403076171875, 3.8167457580566406, 32.551055908203125, 106.78884887695312, 96.68350219726562, 66.61871337890625, 61.41419982910156, 25.585693359375, 58.55574035644531, -0.18654251098632812, 86.53324890136719, 53.420936584472656, 63.13165283203125, 1.0871124267578125, 86.90377807617188], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000211.npy"}
{"epoch": 0.4418848167539267, "step": 212, "batch_size": 128, "mean": 26.609333038330078, "std": 47.729942321777344, "min": -127.13018798828125, "p10": -28.79040184020996, "median": 18.35062026977539, "p90": 89.73107604980468, "max": 126.28680419921875, "pos_frac": 0.734375, "sample": [3.527313232421875, -7.714080810546875, 7.303924560546875, 40.338722229003906, -12.8841552734375, -50.05030822753906, 1.1876068115234375, 16.970260620117188, 3.4761199951171875, -31.78387451171875, 100.04944610595703, 0.44443511962890625, 121.37149047851562, 3.37276554107666, 67.51702880859375, -54.37547302246094, 6.147600173950195, 72.79153442382812, 77.32118225097656, 33.00665283203125, 90.53201293945312, 8.95574951171875, -22.079376220703125, 18.49908447265625, 72.90133666992188, 11.5870361328125, -9.588760375976562, -75.64212799072266, 16.78229331970215, 50.30592346191406, -28.06625747680664, -25.290786743164062, 3.4800872802734375, 17.98895263671875, 18.20215606689453, 59.7742919921875, 24.669158935546875, 95.22360229492188, 47.273590087890625, -4.8451690673828125, 74.65509033203125, -92.0821533203125, -1.911041259765625, -3.663330078125, -11.135650634765625, 77.16866302490234, 73.25579833984375, 74.84024047851562, -2.56719970703125, -56.21465301513672, 66.85970306396484, 19.79132080078125, -30.849273681640625, 102.684326171875, 38.94696044921875, -6.14801025390625, -14.302497863769531, 72.40478515625, 8.676177978515625, -3.1827392578125, -7.4305877685546875, 6.554725646972656, 76.97366333007812, 19.512725830078125, -9.145957946777344, 75.24781799316406, -11.465206146240234, 58.85508728027344, 49.27745056152344, 92.80807495117188, 38.6253662109375, 77.94192504882812, -59.696136474609375, 26.22723388671875, 82.06141662597656, 13.472991943359375, 56.53850173950195, 73.133056640625, -1.6977081298828125, 102.51416015625, 54.81072998046875, 103.35150146484375, 20.069183349609375, 57.199951171875, -101.47624206542969, 23.263092041015625, -2.892974853515625, -38.43121337890625, 
29.104965209960938, 7.739910125732422, 79.6514892578125, 46.95166015625, 11.693389892578125, 6.652500152587891, 119.51129150390625, 67.69657897949219, 23.480712890625, 91.98628997802734, 60.795433044433594, 66.82081604003906, 29.57275390625, 126.28680419921875, 66.38551330566406, 1.4300670623779297, 17.96380615234375, -127.13018798828125, 49.20074462890625, 108.14019775390625, -35.962646484375, -12.350906372070312, 6.3096466064453125, -30.480072021484375, 32.7236328125, 6.381250381469727, 38.340057373046875, -0.4687652587890625, 33.44194030761719, 11.373703002929688, 66.80410766601562, 1.6221847534179688, 89.3878173828125, 10.532867431640625, 61.46562957763672, 85.12456512451172, 6.34991455078125, 0.7541351318359375, 107.45352172851562, 11.147125244140625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000212.npy"}
{"epoch": 0.44397905759162304, "step": 213, "batch_size": 128, "mean": 27.415203094482422, "std": 47.004024505615234, "min": -82.45283508300781, "p10": -23.995367431640624, "median": 13.249324798583984, "p90": 98.01583557128906, "max": 141.150146484375, "pos_frac": 0.6796875, "sample": [106.31692504882812, 73.60191345214844, -19.3228759765625, 85.78692626953125, -33.42218017578125, 68.55126190185547, 76.41455078125, 71.61526489257812, -4.74127197265625, 32.12353515625, 26.581634521484375, 11.98956298828125, 21.622894287109375, -21.382461547851562, 9.439022064208984, 125.14019775390625, -3.0684814453125, 19.8741455078125, 24.89727783203125, 50.1158447265625, 18.48040771484375, -7.305908203125, 133.39492797851562, -5.9141845703125, 96.02314758300781, 4.0006103515625, -10.144760131835938, 15.4742431640625, 19.92058563232422, 14.509086608886719, 6.1336212158203125, -29.864013671875, -11.366546630859375, 3.953521728515625, 89.90689086914062, 38.766387939453125, 79.18830871582031, -0.697174072265625, 7.8018646240234375, -23.984619140625, 87.19352722167969, 46.36408996582031, 8.77227783203125, 76.36404418945312, -21.8211669921875, 77.22393798828125, 59.8372802734375, 8.379669189453125, -43.33221435546875, 18.15972900390625, 3.6995697021484375, 48.882904052734375, 97.995849609375, 3.988819122314453, 119.9527587890625, -29.302818298339844, -3.8319473266601562, 71.20193481445312, -57.45716857910156, 101.37004089355469, 17.1490478515625, 17.1640625, 27.76799774169922, -29.020675659179688, 8.87872314453125, 46.3741455078125, 106.71621704101562, -25.7191162109375, -16.795150756835938, -9.351959228515625, 75.81997680664062, -15.292724609375, 26.549774169921875, -79.79429626464844, 28.309356689453125, 16.052757263183594, 52.797576904296875, 93.9530029296875, 9.254692077636719, 40.30035400390625, -18.91059112548828, 8.342945098876953, -2.690673828125, -10.97161865234375, -2.2058944702148438, 68.68927001953125, -36.004486083984375, -5.53021240234375, 49.492156982421875, 
122.69580078125, 69.09668731689453, 99.39828491210938, 5.22802734375, 6.06939697265625, 36.87274169921875, -5.464635848999023, 101.82022094726562, -17.166717529296875, -24.02044677734375, 141.150146484375, 1.5755424499511719, 11.526969909667969, -6.198394775390625, 94.32794189453125, -32.084808349609375, 111.6541748046875, 82.54563903808594, -6.830543518066406, 52.593994140625, 0.0, 9.876422882080078, 61.97767639160156, 11.984222412109375, 98.06246948242188, 11.27271556854248, -82.45283508300781, 73.25421142578125, 39.3662223815918, -5.139747619628906, 19.7689208984375, 3.118976593017578, 77.5458984375, -30.05926513671875, 7.271171569824219, 112.14093017578125, -0.45947265625, -20.176055908203125, 1.6274490356445312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000213.npy"}
{"epoch": 0.44607329842931936, "step": 214, "batch_size": 128, "mean": 25.056903839111328, "std": 48.220619201660156, "min": -93.7806396484375, "p10": -29.58405380249023, "median": 19.81585693359375, "p90": 87.75529479980469, "max": 156.92520141601562, "pos_frac": 0.7109375, "sample": [56.68116760253906, 69.80291748046875, -24.49848175048828, 6.0145263671875, 13.308067321777344, 87.76907348632812, 20.599594116210938, -12.493484497070312, -56.52337646484375, -17.204345703125, 28.87384033203125, 63.90025329589844, -31.076950073242188, 34.838714599609375, 49.95654296875, 68.51626586914062, -68.36026000976562, 75.7236557006836, 21.6103515625, 24.38031005859375, -6.078086853027344, -17.191253662109375, 19.032119750976562, 64.9764633178711, 34.34637451171875, 17.8780517578125, 67.15125274658203, -23.02923583984375, 8.40771484375, 7.580131530761719, 6.9602203369140625, 14.650543212890625, 11.17325210571289, 2.9908599853515625, 69.82640075683594, 21.786468505859375, 156.92520141601562, 72.48310852050781, 100.2518310546875, -78.76716613769531, 29.933944702148438, -25.615631103515625, 3.748483657836914, 30.052490234375, 3.0651321411132812, 3.1244277954101562, -6.97802734375, 87.7493896484375, 14.548309326171875, -8.006324768066406, 67.04264831542969, 23.1573486328125, -30.8204345703125, 47.062889099121094, 3.8479690551757812, 82.04853820800781, 28.251495361328125, 47.03826904296875, -11.425262451171875, 85.59884643554688, 69.32595825195312, 106.9755859375, 61.00127410888672, 10.512924194335938, 7.2774810791015625, -23.107620239257812, -6.4350738525390625, -40.219459533691406, 0.14715576171875, 0.0, -45.15728759765625, 1.9955291748046875, -12.75592041015625, 16.7703857421875, 69.43222045898438, -1.179962158203125, 52.590354919433594, -47.954044342041016, 6.186859130859375, 13.637275695800781, -18.830413818359375, -93.7806396484375, 96.18829345703125, 2.082061767578125, 71.36355590820312, -31.044586181640625, 93.72712707519531, 49.422847747802734, 22.0838623046875, 
15.767166137695312, -90.07662963867188, 32.272552490234375, 28.026824951171875, 107.3892822265625, 86.9085693359375, 10.916473388671875, -10.708740234375, 51.53533935546875, -3.4896240234375, 69.45384216308594, 85.00346374511719, 114.17059326171875, 18.470733642578125, 52.315277099609375, -4.75897216796875, 98.65927124023438, 108.2659912109375, 28.448760986328125, -25.695785522460938, 53.841278076171875, 138.7633056640625, 24.17120361328125, -21.83673095703125, -42.04730224609375, -26.261932373046875, 90.23637390136719, 24.994855880737305, -77.86869812011719, 12.443603515625, -1.5740966796875, 20.9130859375, 61.263206481933594, 24.5908203125, 52.78263854980469, -29.054176330566406, 149.2247314453125, 51.51377868652344, 61.46044921875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000214.npy"}
{"epoch": 0.4481675392670157, "step": 215, "batch_size": 128, "mean": 26.271713256835938, "std": 52.137603759765625, "min": -108.99458312988281, "p10": -36.86981201171875, "median": 20.60501766204834, "p90": 94.97093353271484, "max": 154.7841796875, "pos_frac": 0.6953125, "sample": [19.627676010131836, 29.631866455078125, 31.013595581054688, 80.07882690429688, 115.17062377929688, 73.31338500976562, 83.992919921875, 55.48145294189453, 68.24932861328125, -8.038200378417969, 107.73260498046875, 98.42996215820312, 102.29656982421875, -26.832305908203125, 10.852462768554688, 26.240325927734375, -4.517486572265625, 133.3785400390625, 137.90771484375, 74.6236572265625, 77.9144287109375, 1.075927734375, 0.0, -27.96954345703125, -4.661949157714844, -22.967926025390625, 42.94964599609375, 72.42202758789062, 130.43499755859375, -10.56597900390625, 83.71237182617188, 1.09222412109375, -16.544036865234375, -47.04908752441406, 2.541597366333008, 9.428695678710938, 84.1094741821289, -1.9555816650390625, 21.596694946289062, 127.79232788085938, -31.581207275390625, 24.941680908203125, -26.24224853515625, -33.17864990234375, 0.4463844299316406, -11.17718505859375, -67.32144165039062, 42.64990997314453, 78.50927734375, 59.137290954589844, 34.337425231933594, 154.7841796875, 17.142791748046875, 22.3204345703125, -54.197296142578125, 0.8648910522460938, 7.2795867919921875, -43.403839111328125, 0.0, 79.15451049804688, -4.0636749267578125, -2.7069664001464844, 49.099334716796875, -27.109207153320312, 52.3699951171875, 50.09136962890625, -69.09205627441406, 4.8663787841796875, 95.83436584472656, 29.82818603515625, 49.22442626953125, 8.17015266418457, -4.550483703613281, 84.55133056640625, -60.4952392578125, 75.13131713867188, 4.83819580078125, -108.99458312988281, 60.249725341796875, 44.242828369140625, 12.1226806640625, 55.47759246826172, 18.770416259765625, -17.375152587890625, -59.93400573730469, -2.9810123443603516, 77.25595092773438, 15.049560546875, 94.60089111328125, 
-36.010772705078125, -12.781059265136719, 24.18011474609375, 21.582359313964844, -7.161102294921875, 5.778083801269531, 17.04803466796875, 0.2657337188720703, 26.367950439453125, 112.21249389648438, 149.52883911132812, -32.57977294921875, 26.225440979003906, -71.67704772949219, 0.38961029052734375, -70.25042724609375, -38.874237060546875, 62.563682556152344, 87.4250717163086, -56.858612060546875, 91.283447265625, 4.4996337890625, -8.089134216308594, 39.24942398071289, 17.75958251953125, 11.870758056640625, 29.365371704101562, 27.30303955078125, 85.71060180664062, -50.9521484375, 26.923583984375, 46.29588317871094, 6.48565673828125, 13.885498046875, 99.07232666015625, 26.900634765625, 62.61183166503906, 28.08966064453125, 86.210693359375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000215.npy"}
{"epoch": 0.450261780104712, "step": 216, "batch_size": 128, "mean": 30.26645278930664, "std": 54.58770751953125, "min": -109.0765380859375, "p10": -40.533529663085936, "median": 25.519065856933594, "p90": 103.58431243896484, "max": 138.3812255859375, "pos_frac": 0.71875, "sample": [21.844894409179688, 54.31126403808594, 65.9510726928711, 32.278221130371094, 13.5118408203125, -38.582794189453125, -103.3155517578125, 21.559112548828125, 69.56863403320312, -1.7092361450195312, 77.61119079589844, 89.466796875, 71.35693359375, -12.061737060546875, 26.590255737304688, 57.21173095703125, 138.3812255859375, 8.887653350830078, 104.62908935546875, 117.62867736816406, 24.162261962890625, -87.685302734375, 52.94816589355469, -0.57537841796875, -9.835124969482422, 20.96502685546875, -93.59979248046875, 12.266891479492188, 57.55591583251953, 80.82415771484375, -53.14628601074219, -40.25164794921875, -92.9091796875, 2.7840805053710938, 71.27452087402344, 46.20147705078125, 84.01614379882812, -47.33599853515625, 110.86587524414062, 78.03665161132812, -11.23775863647461, 87.15756225585938, 122.228759765625, 79.58641815185547, 82.41447448730469, 59.494232177734375, 36.05816650390625, 37.472320556640625, 5.384613037109375, -0.7551193237304688, -49.12548828125, -5.12298583984375, 130.39306640625, 45.312225341796875, 7.9399566650390625, 46.971588134765625, 9.201873779296875, 106.681396484375, -13.970184326171875, 5.396320343017578, 0.0, 68.00643920898438, -23.3890380859375, 68.23526000976562, 105.14358520507812, 90.1636962890625, 97.74856567382812, -11.880706787109375, -3.6794891357421875, 14.38006591796875, 63.92921447753906, 42.76055908203125, -109.0765380859375, 15.892303466796875, 112.96092224121094, 103.13655090332031, 10.257781982421875, 103.01535034179688, 73.26531982421875, 9.656341552734375, 24.4478759765625, 11.205352783203125, 88.32235717773438, 27.774078369140625, 111.567138671875, 73.59809875488281, -44.77638244628906, -9.07080078125, 109.23812866210938, 107.588623046875, 
-41.191253662109375, -72.29727172851562, 3.745361328125, 51.5205078125, -5.33209228515625, -16.4884033203125, 13.95599365234375, -18.097869873046875, 35.472286224365234, 23.709091186523438, 120.83233642578125, 12.388702392578125, 0.0, 67.41626739501953, 79.47105407714844, -76.57733154296875, 1.780364990234375, 66.5886001586914, 68.062744140625, 58.25701141357422, 0.3427295684814453, 51.56768798828125, 83.34124755859375, -9.62030029296875, 69.66423797607422, -6.66375732421875, 32.59295654296875, 51.37615966796875, 4.60601806640625, 52.7469482421875, -4.05908203125, -103.89816284179688, 63.4700927734375, 83.64878845214844, 7.64141845703125, 0.7684326171875, -23.80145263671875, 3.581878662109375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000216.npy"}
{"epoch": 0.4523560209424084, "step": 217, "batch_size": 128, "mean": 39.12125015258789, "std": 53.94680404663086, "min": -93.41433715820312, "p10": -22.78886566162109, "median": 34.118377685546875, "p90": 106.5537368774414, "max": 203.22259521484375, "pos_frac": 0.7734375, "sample": [10.745613098144531, -2.0788040161132812, -4.15863037109375, 79.39926147460938, 105.01368713378906, -77.77801513671875, 2.8895111083984375, 95.0711669921875, 3.201129913330078, 40.77313232421875, 99.05348205566406, -4.781494140625, 0.1987457275390625, 37.71467590332031, -22.2606201171875, 100.35673522949219, 76.69076538085938, 85.4100112915039, 75.55918884277344, 27.5418701171875, 50.22767639160156, 73.84225463867188, 5.301952362060547, 90.39242553710938, 162.178466796875, 67.44377136230469, 86.62277221679688, 89.62267303466797, 88.92391967773438, 98.05951690673828, 33.45916748046875, 74.54238891601562, -12.854904174804688, 125.430908203125, 51.886383056640625, 26.061126708984375, 203.22259521484375, 79.90277099609375, 91.8896484375, 36.83746337890625, -93.41433715820312, 92.50582885742188, 39.602569580078125, 148.07125854492188, -80.76495361328125, 34.267669677734375, 12.20220947265625, -24.78557586669922, 41.9833984375, 22.85943603515625, 3.7942237854003906, 1.3547210693359375, 10.510467529296875, 39.53767395019531, 54.713409423828125, -48.412750244140625, -2.11199951171875, 89.53889465332031, 62.82288360595703, -17.580780029296875, -4.305213928222656, 42.80657958984375, -24.021438598632812, 122.26535034179688, 4.0662689208984375, 40.43225860595703, 14.059911727905273, 97.64482116699219, 75.3486099243164, 5.2256927490234375, 70.57272338867188, 80.6823501586914, 39.587974548339844, -4.156036376953125, 100.940185546875, -41.2381591796875, -30.033859252929688, -21.334571838378906, 49.844268798828125, 157.6275634765625, 9.473587036132812, 110.33258056640625, 23.3504638671875, -7.010562896728516, -64.21517944335938, 123.73651123046875, 110.14718627929688, 115.38958740234375, 
15.5159912109375, -70.48321533203125, 16.11041259765625, 89.63296508789062, -12.680145263671875, 82.22686767578125, 1.63592529296875, 62.0194091796875, 1.8027267456054688, 21.578323364257812, 9.790092468261719, 72.36387634277344, -4.073211669921875, 0.0, 16.1346435546875, 33.969085693359375, -28.771759033203125, 125.89755249023438, 79.61729431152344, -34.037567138671875, 45.75445556640625, 118.48609924316406, 59.95826721191406, 23.770858764648438, 31.6114501953125, 131.1640625, 83.03912353515625, 14.245925903320312, 5.6819000244140625, 23.971923828125, -28.30255126953125, 8.435348510742188, 61.9188232421875, 33.38465881347656, 51.994110107421875, 18.090347290039062, 35.19598388671875, -22.0325927734375, -6.22650146484375, 3.6612892150878906], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000217.npy"}
{"epoch": 0.4544502617801047, "step": 218, "batch_size": 128, "mean": 31.516695022583008, "std": 47.469635009765625, "min": -91.02893829345703, "p10": -23.28836898803711, "median": 24.391311645507812, "p90": 91.33649597167968, "max": 146.38177490234375, "pos_frac": 0.7421875, "sample": [39.60333251953125, 14.72686767578125, 2.6513671875, 97.11929321289062, -5.064300537109375, -5.8638763427734375, -12.117706298828125, 108.158935546875, 29.076129913330078, 16.06252670288086, -66.00531005859375, 12.289840698242188, 64.53573608398438, 54.375457763671875, 49.09332275390625, -63.071502685546875, -2.292144775390625, 79.67143249511719, 73.97035217285156, 63.273406982421875, -19.404441833496094, 48.16375732421875, 58.772216796875, 25.70928955078125, 57.83056640625, 26.587127685546875, 11.8868408203125, -12.74310302734375, -23.084640502929688, 64.19461059570312, 69.79012298583984, 117.518798828125, 6.3030242919921875, 88.46805572509766, -24.5247802734375, 10.10284423828125, 76.63800811767578, -27.97149658203125, 78.56535339355469, 64.28140258789062, 142.95880126953125, 54.18829345703125, 21.608428955078125, 3.306619644165039, -26.321304321289062, 93.02702331542969, 11.680198669433594, -63.759674072265625, -39.131866455078125, 86.8486328125, 38.42643737792969, -23.45220184326172, 78.03241729736328, 54.87579345703125, 91.78564453125, -2.2280845642089844, 91.39291381835938, 19.008697509765625, 87.88232421875, -21.4765625, 17.712493896484375, -8.959869384765625, -23.218154907226562, 12.217697143554688, 0.0, 63.1416015625, 19.291275024414062, -0.482421875, -70.18254089355469, 86.88616943359375, 15.415237426757812, 38.51051330566406, -55.78273391723633, 89.38737487792969, 32.61529541015625, 90.81365966796875, -0.02884674072265625, 12.33660888671875, 26.14599609375, 4.3174591064453125, 146.38177490234375, 91.31231689453125, 114.69085693359375, 6.32061767578125, -5.3082275390625, 20.05731201171875, -4.68011474609375, 0.9833297729492188, 110.36080932617188, 31.74267578125, 
20.08861541748047, 66.19741821289062, 18.79339599609375, -71.04885864257812, 82.16986846923828, 74.8696060180664, 92.13941955566406, 87.980712890625, -41.1287841796875, 44.4871826171875, 6.28704833984375, 79.7071533203125, -8.161670684814453, 8.838592529296875, 51.762786865234375, -11.5068359375, -91.02893829345703, 55.85353088378906, 23.073333740234375, 73.3721923828125, 11.904632568359375, 22.36871337890625, 40.867095947265625, 66.88729858398438, 0.0213623046875, 22.34783935546875, 112.1275634765625, 3.403961181640625, -16.271141052246094, -15.67242431640625, 92.723388671875, 65.74838256835938, 62.502288818359375, 28.880401611328125, 44.22967529296875, 83.97662353515625, 27.770584106445312, 9.647598266601562], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000218.npy"}
{"epoch": 0.45654450261780105, "step": 219, "batch_size": 128, "mean": 30.53944969177246, "std": 50.801910400390625, "min": -116.01287841796875, "p10": -23.198271179199217, "median": 25.21436309814453, "p90": 98.0311553955078, "max": 130.92269897460938, "pos_frac": 0.765625, "sample": [-116.01287841796875, 15.210418701171875, 47.37152099609375, 10.748565673828125, 86.28964233398438, 34.274452209472656, -13.966033935546875, 39.6654052734375, 86.44741821289062, 26.6402587890625, 51.987060546875, -1.3344535827636719, 41.32843017578125, -59.02630615234375, -0.6643791198730469, 58.26457977294922, 61.37481689453125, -10.930023193359375, 20.420303344726562, 75.8304443359375, 41.401397705078125, 8.016487121582031, 45.21966552734375, 112.0972900390625, -3.5124740600585938, 23.10076904296875, 79.82431030273438, 6.2628173828125, 75.98257446289062, -25.311569213867188, -17.51947784423828, -17.924030303955078, -77.66189575195312, 42.65733337402344, 11.511077880859375, 93.19754791259766, -4.7755126953125, 33.75497055053711, 21.186203002929688, 87.74224853515625, 107.51593017578125, 130.46978759765625, 7.498012542724609, 58.873504638671875, 13.865005493164062, -80.0479736328125, 10.779449462890625, 74.621337890625, 8.679840087890625, 102.50080871582031, 0.0, 26.071914672851562, 18.398040771484375, 27.319244384765625, -60.41864013671875, 104.93218994140625, -22.292572021484375, 101.119140625, 110.62991333007812, 16.731658935546875, 49.67314147949219, 30.301025390625, 41.40922546386719, 4.97271728515625, 2.0995492935180664, -10.9959716796875, 3.99700927734375, 13.607481002807617, 0.8488616943359375, 67.59193420410156, -14.7686767578125, 22.96588134765625, 10.272789001464844, 114.34933471679688, 19.714813232421875, 0.005889892578125, -84.80560302734375, 3.835845947265625, 62.131011962890625, 96.70773315429688, -0.7467021942138672, 96.28166198730469, 23.64990234375, 26.96417236328125, -6.818208694458008, -39.99334716796875, 112.5361328125, 7.394401550292969, 112.41993713378906, 
-4.39080810546875, 96.16046142578125, 65.04315185546875, 48.87373352050781, 70.30609130859375, 77.63333892822266, 88.10444641113281, 46.565608978271484, 52.3670654296875, 30.286376953125, 90.45648193359375, -40.85919189453125, 87.46310424804688, 78.5377197265625, -52.2044677734375, 94.36027526855469, 60.70802307128906, 55.213958740234375, 1.0323333740234375, 23.493408203125, 24.3568115234375, 51.16646194458008, -3.1611175537109375, 40.685821533203125, -109.9644775390625, 130.56491088867188, 105.2376708984375, 50.79649353027344, -53.50682067871094, 13.544113159179688, 13.977096557617188, 26.67059326171875, 16.024246215820312, 56.537841796875, 11.9520263671875, 130.92269897460938, -10.179834365844727, 2.5148162841796875, -70.25830078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000219.npy"}
{"epoch": 0.4586387434554974, "step": 220, "batch_size": 128, "mean": 34.437522888183594, "std": 52.207130432128906, "min": -124.34921264648438, "p10": -24.44508056640625, "median": 36.33866882324219, "p90": 104.46058502197265, "max": 140.98370361328125, "pos_frac": 0.7265625, "sample": [97.60903930664062, 112.35699462890625, 70.74264526367188, 63.70067596435547, 22.527637481689453, 66.640869140625, 6.410980224609375, 83.38238525390625, 21.170654296875, 72.47149658203125, -9.050018310546875, 0.849365234375, 75.60579681396484, -57.720947265625, 9.9300537109375, 31.308868408203125, 28.078887939453125, 105.07296752929688, -24.313720703125, 12.335861206054688, 58.344879150390625, -8.912200927734375, 25.354644775390625, 2.4894580841064453, 0.08660888671875, 140.98370361328125, -6.057464599609375, 101.95368957519531, -70.64419555664062, 17.418136596679688, 64.26893615722656, 4.2333526611328125, -7.71453857421875, -41.47737121582031, 31.11077880859375, 9.14093017578125, -24.7515869140625, 57.01995849609375, 1.1387500762939453, 69.10992431640625, 57.90088653564453, 27.829833984375, 90.74603271484375, 44.279052734375, 113.07295227050781, 63.757286071777344, -7.7149505615234375, -124.34921264648438, 68.72723388671875, 42.8104248046875, 51.624755859375, 81.52861022949219, -4.05950927734375, 110.97393798828125, -24.21733856201172, 15.553794860839844, -6.7523956298828125, -21.426170349121094, 7.1497802734375, 42.142784118652344, -12.399139404296875, 63.954872131347656, 55.317169189453125, 74.01274108886719, 9.555213928222656, -14.422904968261719, -11.655380249023438, 54.16131591796875, 0.38702392578125, 86.44110107421875, 102.71430969238281, 113.91802978515625, -6.4986572265625, 24.266403198242188, 74.72964477539062, 71.2263412475586, -22.382965087890625, 57.16960144042969, 66.79613494873047, -9.097412109375, 9.062530517578125, 20.116966247558594, 82.86447143554688, 59.049346923828125, -7.5676422119140625, 115.55937957763672, -76.79513549804688, 9.630279541015625, 
126.69772338867188, 120.3160400390625, -37.46270751953125, 35.634796142578125, -86.2481689453125, -1.6362457275390625, 63.660308837890625, 119.61239624023438, 105.24688720703125, 16.835952758789062, 75.9197998046875, -40.10565948486328, 75.00775146484375, 43.84545135498047, -36.402374267578125, -22.249359130859375, 5.254554748535156, -16.591720581054688, 78.76577758789062, -28.805984497070312, 112.52679443359375, 104.19813537597656, 76.75306701660156, 37.04254150390625, 51.625, -25.444778442382812, 124.70849609375, -22.84710693359375, 94.4954833984375, 71.81956481933594, 14.544349670410156, 82.4666748046875, -93.10838317871094, 53.567138671875, 60.824195861816406, 44.35784912109375, 80.081298828125, -14.044883728027344, 87.75297546386719, 37.490936279296875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000220.npy"}
{"epoch": 0.4607329842931937, "step": 221, "batch_size": 128, "mean": 25.550880432128906, "std": 50.38564682006836, "min": -116.18356323242188, "p10": -27.820307159423823, "median": 17.29010009765625, "p90": 93.55956726074218, "max": 126.78079223632812, "pos_frac": 0.71875, "sample": [5.8675537109375, 71.84722900390625, -17.89080810546875, 17.611541748046875, 39.541282653808594, 1.093048095703125, 20.692054748535156, 35.167694091796875, -13.49658203125, -10.77340316772461, 68.11275482177734, 49.38935852050781, 4.0550537109375, 59.3314208984375, -3.2679977416992188, 61.978851318359375, 13.7816162109375, -69.37001037597656, 7.699893951416016, -20.202590942382812, 69.73284912109375, 35.869964599609375, 126.78079223632812, -17.99163818359375, 28.586669921875, 70.10222625732422, 96.9154052734375, 67.97592163085938, 86.56500244140625, -1.0208740234375, 15.17095947265625, 12.603195190429688, 116.1864013671875, -116.18356323242188, -55.049713134765625, 16.968658447265625, 45.538482666015625, 79.972412109375, -11.103012084960938, 113.2255859375, -26.23552703857422, 9.292022705078125, 11.46112060546875, 30.05743408203125, -71.41288757324219, -3.4913330078125, 91.43685913085938, -58.61114501953125, 79.49591064453125, 32.4814453125, -0.15945053100585938, 80.53944396972656, 24.27490234375, -18.25494384765625, -8.110271453857422, 59.968292236328125, 55.1910400390625, -60.801116943359375, 1.88934326171875, 52.3446044921875, 10.05975341796875, 6.1647796630859375, 59.865753173828125, 11.673805236816406, 48.810943603515625, 1.2626724243164062, 50.57981872558594, 73.31275939941406, -2.6423873901367188, 77.95503234863281, 53.491790771484375, -18.607376098632812, 65.00674438476562, 69.40167999267578, 101.35002136230469, 60.76171875, 104.146728515625, 28.90887451171875, -60.94988250732422, -76.875244140625, 74.5670166015625, 94.71282958984375, -13.04827880859375, 10.396163940429688, -10.182647705078125, 91.36210632324219, 9.188690185546875, -20.084503173828125, 31.657257080078125, 
70.67050170898438, 112.69390869140625, 100.65853881835938, 1.48199462890625, 7.363531112670898, -37.6065673828125, 5.891944885253906, -13.974441528320312, 117.20268249511719, -99.92608642578125, 48.100555419921875, 5.0219879150390625, -78.73663330078125, -19.650390625, 112.76947021484375, 16.772247314453125, 5.2845458984375, 46.263938903808594, 93.98185729980469, 93.37858581542969, 11.236640930175781, 54.407012939453125, 41.5654296875, -31.51812744140625, -18.969661712646484, 6.296943664550781, 6.2581024169921875, 44.3056640625, 31.057952880859375, 73.77432250976562, -7.84124755859375, 40.96818542480469, 81.11174011230469, -25.03564453125, 94.53695678710938, 35.33837890625, 14.230173110961914, -95.77767944335938, 11.3092041015625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000221.npy"}
{"epoch": 0.46282722513089003, "step": 222, "batch_size": 128, "mean": 33.904117584228516, "std": 49.056209564208984, "min": -154.068603515625, "p10": -19.16722412109375, "median": 24.672439575195312, "p90": 101.03751220703126, "max": 127.0517578125, "pos_frac": 0.7578125, "sample": [64.53485107421875, 17.414337158203125, 77.49720764160156, 75.68853759765625, -24.061874389648438, -25.714279174804688, 37.877960205078125, 50.56428527832031, 16.606719970703125, 8.452163696289062, -22.932159423828125, 19.7626953125, 85.45730590820312, 37.97789001464844, 24.259246826171875, 56.215049743652344, 75.72447967529297, 120.48443603515625, 63.220458984375, 42.12639617919922, 7.6090545654296875, 2.162139892578125, -24.930755615234375, -46.4266357421875, 58.453216552734375, 123.75253295898438, -15.0640869140625, 2.671661376953125, -70.52864074707031, 8.457826614379883, 7.91802978515625, 43.03235626220703, 126.45770263671875, 5.214347839355469, 17.891483306884766, 11.336071014404297, 7.24658203125, 62.41259765625, 77.47828674316406, 35.8782958984375, -2.227783203125, -66.4019775390625, 78.41696166992188, 0.89788818359375, 38.312225341796875, 8.138957977294922, -63.239707946777344, 64.23887634277344, -154.068603515625, 50.75830078125, 97.755126953125, 70.04803466796875, -0.5207443237304688, 100.20811462402344, -16.844425201416016, 127.0517578125, 33.510093688964844, 101.0078125, 1.7391395568847656, 3.856414794921875, 29.50494384765625, 23.569984436035156, 23.04681396484375, 107.0352783203125, -6.089263916015625, 19.22454833984375, 86.51241302490234, 126.73397827148438, 28.22918701171875, 64.47970581054688, 90.03440856933594, 84.31686401367188, 2.062347412109375, 84.8590087890625, -22.70672607421875, 13.50640869140625, 18.87792205810547, 0.8407402038574219, 101.1068115234375, -7.3304443359375, 6.58013916015625, 27.3571834564209, -5.939416885375977, 70.405517578125, -17.542984008789062, -2.485443115234375, 107.24058532714844, -26.275436401367188, -7.009407043457031, 
60.11543273925781, 84.22111511230469, 86.20660400390625, 32.890960693359375, 42.693748474121094, 6.795099258422852, -19.7972412109375, 4.650238037109375, 55.50379943847656, 25.08563232421875, 110.07958984375, 115.34506225585938, -8.021598815917969, -1.8338623046875, 38.21049499511719, 107.30903625488281, -4.47662353515625, 92.80117797851562, 79.19674682617188, 16.815765380859375, 85.84600830078125, 21.2220458984375, 42.68400573730469, -31.417510986328125, 10.67987060546875, -4.97406005859375, -13.166275024414062, -14.512237548828125, 97.62677001953125, -18.897216796875, 78.1663818359375, 54.804443359375, 8.4805908203125, 115.79425048828125, 77.80717468261719, 41.414215087890625, -0.3768444061279297, 12.125259399414062, 117.66896057128906], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000222.npy"}
{"epoch": 0.4649214659685864, "step": 223, "batch_size": 128, "mean": 38.059417724609375, "std": 53.79189682006836, "min": -95.93470764160156, "p10": -33.83981475830078, "median": 41.575050354003906, "p90": 107.18503265380859, "max": 148.7000732421875, "pos_frac": 0.765625, "sample": [10.636024475097656, -17.941314697265625, 9.272239685058594, 81.4058837890625, 11.052024841308594, 29.864547729492188, 2.1034088134765625, -12.562294006347656, 47.829341888427734, 98.30815124511719, 26.670654296875, 96.45040893554688, 39.59674072265625, 9.13543701171875, 60.894256591796875, 140.57666015625, 102.26051330566406, -95.93470764160156, 77.03340148925781, -36.31549072265625, 30.091461181640625, 121.66748046875, -3.0032501220703125, 45.284423828125, 54.58714294433594, -36.786712646484375, 15.4298095703125, 12.914764404296875, 115.61993408203125, -17.3702392578125, 25.891571044921875, 65.73419189453125, 107.99638366699219, 96.35836791992188, 47.71575927734375, 92.80599975585938, 139.02294921875, -84.22552490234375, 62.75950622558594, -82.97749328613281, 19.913818359375, 62.405029296875, 9.11962890625, 69.83287048339844, 102.60372924804688, 88.72967529296875, 112.68466186523438, 18.386474609375, 119.8841552734375, 24.136383056640625, -3.24285888671875, 23.86944580078125, 124.02685546875, 129.021240234375, 80.45695495605469, 68.53814697265625, 1.139617919921875, 106.83731079101562, 33.23649597167969, 44.33734130859375, 41.611480712890625, 20.35919189453125, 56.079345703125, 148.7000732421875, 47.87461853027344, 41.53861999511719, 26.07958984375, 33.632568359375, 77.30357360839844, -4.8082733154296875, -14.2078857421875, 63.36485290527344, 4.56494140625, 97.13665771484375, -44.33869171142578, 36.026954650878906, -24.751121520996094, -34.14622497558594, 94.78598022460938, -14.600425720214844, 61.96055603027344, -6.745391845703125, 0.00571441650390625, 36.11402130126953, 89.98320007324219, 62.472412109375, -64.48641967773438, -12.045433044433594, 71.87664794921875, 
-13.275829315185547, -84.01642608642578, 24.68169403076172, 72.22483825683594, 45.3607177734375, 79.99432373046875, 121.89199829101562, 79.43502807617188, 80.203369140625, 70.04811096191406, 6.3023681640625, -1.9507102966308594, 56.122650146484375, 55.66400146484375, 71.7662353515625, 56.00054931640625, -33.70849609375, 92.67828369140625, 134.42999267578125, 0.13756179809570312, 23.683349609375, 110.52236938476562, -52.16368103027344, -2.10546875, 8.156761169433594, 94.1802978515625, 59.78947448730469, -68.08747100830078, -3.2366485595703125, 68.55828857421875, 7.057403564453125, -68.41756439208984, 86.12229919433594, 44.64817810058594, 42.03515625, 48.48615264892578, 19.248779296875, -32.93890380859375, -45.033538818359375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000223.npy"}
{"epoch": 0.46701570680628274, "step": 224, "batch_size": 128, "mean": 38.58637237548828, "std": 49.31826400756836, "min": -99.4764404296875, "p10": -19.334217834472657, "median": 33.96812057495117, "p90": 102.86169128417967, "max": 145.7061767578125, "pos_frac": 0.8125, "sample": [123.61569213867188, 145.7061767578125, 81.00729370117188, 49.004798889160156, 3.5890274047851562, 5.249137878417969, -26.12109375, 42.921417236328125, 79.53610229492188, 13.881782531738281, 4.85101318359375, 74.71292877197266, -2.8162994384765625, 72.92738342285156, 26.146400451660156, 33.64862060546875, 13.1004638671875, 82.62847900390625, 33.738792419433594, -56.62733459472656, 75.990478515625, 6.60986328125, 114.01432800292969, 117.87338256835938, 15.713043212890625, 65.92282104492188, 74.43089294433594, 29.731491088867188, 17.122894287109375, 34.19744873046875, 122.20101928710938, 86.036376953125, -99.4764404296875, 49.41209411621094, 35.67402648925781, 20.441329956054688, 34.622283935546875, 70.86393737792969, 41.7366943359375, -4.25225830078125, 22.122604370117188, -19.265411376953125, 81.80345153808594, 68.62982177734375, 60.916351318359375, 37.400794982910156, 61.631866455078125, -48.12162780761719, 92.846435546875, 0.3700733184814453, 71.42323303222656, 45.2242431640625, 0.0, 106.2757568359375, 88.39663696289062, 11.35739517211914, 7.867095947265625, -19.494766235351562, 6.1263275146484375, 29.803558349609375, 65.02227783203125, 13.98101806640625, 3.2217025756835938, 83.5511474609375, -1.962188720703125, 109.00323486328125, -20.283721923828125, -32.256561279296875, 56.890350341796875, -31.968734741210938, 17.2313232421875, 126.45672607421875, 30.982559204101562, 105.9434814453125, 17.87054443359375, -10.81689453125, -36.72234344482422, 66.2183837890625, 21.97113037109375, 109.88323974609375, -11.306137084960938, -13.144523620605469, -92.89617919921875, 10.942108154296875, 22.34429168701172, 95.2513427734375, 48.31671142578125, -18.55902099609375, 98.66070556640625, 
-82.96337127685547, 91.70187377929688, 79.56283569335938, 55.45893859863281, 99.65744018554688, 30.58245849609375, 66.899169921875, 101.54092407226562, 53.19053649902344, 35.439697265625, 6.004974365234375, -31.29143524169922, 23.3011474609375, 4.18817138671875, 17.328094482421875, 46.122032165527344, 60.484336853027344, 144.69024658203125, -19.223037719726562, 108.23208618164062, 76.47189331054688, 17.6793212890625, -42.381011962890625, 11.61065673828125, 3.27294921875, 6.90521240234375, 31.612258911132812, 86.3572998046875, 10.011001586914062, 70.38583374023438, -17.145111083984375, 6.940731048583984, 84.77786254882812, 89.35971069335938, 73.24676513671875, 97.22715759277344, 73.93743133544922, 0.6482810974121094, 118.524169921875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000224.npy"}
{"epoch": 0.46910994764397906, "step": 225, "batch_size": 128, "mean": 26.29075813293457, "std": 59.607513427734375, "min": -122.489013671875, "p10": -53.995687866210936, "median": 21.15899658203125, "p90": 109.81603088378905, "max": 144.85177612304688, "pos_frac": 0.640625, "sample": [109.62139892578125, 0.560089111328125, 33.5631103515625, 110.27017211914062, 100.40029907226562, 5.664581298828125, -69.99928283691406, -71.3546142578125, 70.66567993164062, 108.77919006347656, 27.477828979492188, 66.48007202148438, -2.04571533203125, -8.400238037109375, 118.874755859375, 38.58233642578125, 127.48684692382812, 77.07427978515625, 6.879356384277344, 19.0465087890625, -10.032257080078125, -14.843307495117188, 79.3555908203125, -43.04905700683594, 100.42465209960938, 6.012237548828125, 108.75613403320312, -10.136611938476562, 1.741455078125, -31.53936767578125, -29.285118103027344, -122.489013671875, 84.13118743896484, 70.0634765625, 14.38623046875, 9.249504089355469, 29.272369384765625, -2.205768585205078, -1.2818603515625, -53.98255920410156, 79.34075927734375, 119.57907104492188, -10.481109619140625, -61.982177734375, -89.75704956054688, -65.50225067138672, 37.321197509765625, 23.271484375, -10.058456420898438, 46.7423095703125, -78.42266845703125, 24.98126220703125, 72.127685546875, 89.84756469726562, -12.694061279296875, 15.28460693359375, -30.76580810546875, 33.420166015625, 103.040771484375, 85.62504577636719, 85.05364990234375, 118.20738220214844, 128.3072509765625, -21.040435791015625, 29.3504638671875, -10.815750122070312, -43.97113037109375, 68.71478271484375, 52.832183837890625, 133.52264404296875, 1.2014923095703125, 110.564453125, 29.39898681640625, 65.95023345947266, -6.860565185546875, 55.77391052246094, 99.57221221923828, 123.11709594726562, -24.7230224609375, 84.13563537597656, -36.20452880859375, -57.73504638671875, -43.89973449707031, 25.911285400390625, -78.3199462890625, -11.20166015625, 17.23552894592285, 0.4881858825683594, 42.45640563964844, 
-108.3309326171875, -9.481201171875, -31.48101806640625, -11.9613037109375, -14.281005859375, 66.59091186523438, -28.55718994140625, 1.015350341796875, 38.76165008544922, 144.85177612304688, 61.96257019042969, 4.623863220214844, 65.49170684814453, 119.66656494140625, 107.74237060546875, -29.892578125, 14.467636108398438, -68.33065032958984, 61.29132080078125, 10.95648193359375, 126.6953125, -15.491668701171875, 73.37176513671875, 59.58172607421875, 129.9215545654297, 7.93182373046875, 66.64202117919922, 57.564964294433594, -21.852981567382812, 63.76901626586914, -54.02632141113281, 23.965240478515625, 1.8739547729492188, -9.278038024902344, -7.71502685546875, -60.75746154785156, 42.53712463378906, 35.26692199707031, 87.995849609375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000225.npy"}
{"epoch": 0.4712041884816754, "step": 226, "batch_size": 128, "mean": 23.734424591064453, "std": 50.398563385009766, "min": -95.2554931640625, "p10": -38.38848037719726, "median": 13.792160034179688, "p90": 92.08967742919921, "max": 132.2955322265625, "pos_frac": 0.703125, "sample": [15.42230224609375, 132.2955322265625, -20.21282958984375, 48.35054016113281, 75.0078125, 107.30178833007812, 5.609466552734375, -0.8165817260742188, 68.06666564941406, 82.18960571289062, 109.97642517089844, 37.341678619384766, -4.7733154296875, 110.66778564453125, -4.331146240234375, 12.788311004638672, 3.8417205810546875, -67.41879272460938, 90.33053588867188, 101.17251586914062, -2.95086669921875, 0.4821815490722656, 78.30157470703125, 66.71763610839844, 5.231231689453125, 24.081130981445312, -9.400613784790039, -45.0462646484375, 77.33271789550781, 60.665557861328125, 1.3060989379882812, 32.814605712890625, 55.29547119140625, 102.93389892578125, 2.496307373046875, 15.84136962890625, 91.27143859863281, -37.54498291015625, -14.746063232421875, 24.45861053466797, -61.5567626953125, 62.459129333496094, -15.72210693359375, 0.27202606201171875, -17.928497314453125, 44.052947998046875, 13.32818603515625, -10.676025390625, -95.2554931640625, 65.53622436523438, 28.406005859375, -47.040008544921875, 93.9989013671875, 2.5695571899414062, 8.99692153930664, -69.80816650390625, 3.3311386108398438, 23.567626953125, 0.6828041076660156, -0.7247848510742188, -6.144073486328125, 11.326492309570312, 5.1130218505859375, -94.94076538085938, 18.649871826171875, -5.3740997314453125, 12.872940063476562, 85.18995666503906, 38.14862823486328, 11.211944580078125, -29.437301635742188, 6.465873718261719, 79.91433715820312, -24.245223999023438, 58.975318908691406, 4.11767578125, 21.079971313476562, 90.67111206054688, 7.089988708496094, 110.03253173828125, -27.186721801757812, 86.97449493408203, 118.74655151367188, -60.74109649658203, 106.81729125976562, 10.33441162109375, 22.49341583251953, 1.1950149536132812, 
35.15069580078125, 87.98080444335938, 65.92788696289062, 36.5369873046875, 83.0125961303711, 60.48968505859375, 85.918212890625, 14.256134033203125, -18.646240234375, 1.2081298828125, 68.68587493896484, 57.15773010253906, 114.25382995605469, 18.08544921875, 106.64529418945312, -16.89642333984375, 6.1011962890625, 20.936904907226562, 54.18132019042969, 34.607696533203125, 66.61909484863281, -24.06317138671875, 6.6740875244140625, -18.458236694335938, -2.7650070190429688, 67.17593383789062, 40.805580139160156, -59.42633819580078, 32.292755126953125, -39.804229736328125, -25.49725341796875, 117.5096435546875, 53.98199462890625, -37.78173065185547, -57.10179138183594, 14.786861419677734, -83.30471801757812, -51.40760803222656, 11.66436767578125, -9.680007934570312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000226.npy"}
{"epoch": 0.4732984293193717, "step": 227, "batch_size": 128, "mean": 37.52872085571289, "std": 54.14963150024414, "min": -118.624755859375, "p10": -27.893178939819332, "median": 30.914737701416016, "p90": 109.00221405029296, "max": 150.05169677734375, "pos_frac": 0.7578125, "sample": [-65.20426940917969, 33.952850341796875, 19.34771728515625, 28.9168701171875, 60.142669677734375, 13.736564636230469, 120.4146728515625, 14.459732055664062, 10.447052001953125, 69.940185546875, 103.37945556640625, 99.52445983886719, 11.768234252929688, -0.342529296875, 25.589553833007812, 66.65647888183594, 0.0, 114.457275390625, -102.92047119140625, -30.162296295166016, 67.79364013671875, -64.673095703125, 85.83714294433594, 102.89918518066406, 138.07952880859375, 150.05169677734375, 108.36117553710938, -91.10562133789062, 94.97045135498047, -18.2720947265625, -6.8864593505859375, 94.10833740234375, 13.295387268066406, 12.803924560546875, 34.584564208984375, -11.65386962890625, 30.44696807861328, 70.97380828857422, 53.601165771484375, 117.03717041015625, 119.67132568359375, 48.55078125, 14.425323486328125, 65.80328369140625, 34.918701171875, -65.588134765625, 18.14556884765625, 49.40924072265625, -18.149169921875, 29.881500244140625, 106.531982421875, 80.49469757080078, -7.420257568359375, -25.7998046875, 56.723541259765625, 135.6287841796875, 6.42327880859375, 73.00938415527344, 110.49797058105469, 25.198131561279297, 113.30101013183594, -42.00263977050781, 15.029388427734375, 81.70402526855469, -118.624755859375, 24.046875, 12.874168395996094, 127.51702880859375, -39.92218017578125, 41.881500244140625, 72.9503173828125, 31.820648193359375, 8.908830642700195, 72.65351867675781, 105.54620361328125, 56.217681884765625, 71.658935546875, 56.334442138671875, -37.684906005859375, 65.47654724121094, 121.37164306640625, 5.706268310546875, 16.64630126953125, 105.07429504394531, 48.437713623046875, 29.868179321289062, 55.930389404296875, 31.61895751953125, 30.211029052734375, 
81.75811767578125, -31.5670166015625, 26.058555603027344, 8.084686279296875, -59.54145812988281, 90.2711181640625, 42.96063232421875, 87.29679870605469, 13.194793701171875, -19.481307983398438, 99.86417388916016, -0.82305908203125, 123.500244140625, -2.3872318267822266, 111.27206420898438, -43.377052307128906, 98.03494262695312, -2.9100570678710938, -0.959259033203125, 3.23846435546875, -0.0702667236328125, 95.43133544921875, 24.771011352539062, 93.57626342773438, 90.53811645507812, 13.733123779296875, 30.381378173828125, 31.38250732421875, 63.086181640625, 92.02554321289062, -3.305755615234375, -15.8756103515625, -26.920700073242188, 23.1527099609375, 12.363815307617188, -16.657821655273438, 3.719846725463867, 34.81958770751953, 33.77217102050781], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000227.npy"}
{"epoch": 0.47539267015706804, "step": 228, "batch_size": 128, "mean": 36.70655059814453, "std": 52.51327133178711, "min": -74.64447021484375, "p10": -25.037321472167967, "median": 29.356250762939453, "p90": 106.38211669921874, "max": 137.58480834960938, "pos_frac": 0.71875, "sample": [-69.32662963867188, -11.671180725097656, 60.9517822265625, 95.15835571289062, 47.126983642578125, -5.59691047668457, 124.71575927734375, 30.5054931640625, 66.45150756835938, 93.42085266113281, 3.8232040405273438, -8.208869934082031, -10.923995971679688, 112.3328857421875, 39.247406005859375, 67.65762329101562, -4.4259796142578125, 80.2603530883789, -15.94921875, 5.616485595703125, 59.065460205078125, 28.33216094970703, -74.64447021484375, -14.932891845703125, 15.60418701171875, 68.06391906738281, 24.621429443359375, 2.1657867431640625, -21.8258056640625, 58.9947509765625, 13.320053100585938, 26.474273681640625, -51.12602233886719, 105.808349609375, 30.380340576171875, 96.90481567382812, 61.2056884765625, 79.13265991210938, 11.88287353515625, -2.58648681640625, 85.96954345703125, -8.5567626953125, -23.055694580078125, -24.37835693359375, 74.600341796875, 8.53387451171875, -8.036407470703125, 16.794837951660156, 47.621124267578125, 40.71260070800781, 43.51043701171875, 113.01315307617188, 61.47981262207031, 15.079666137695312, 99.86270141601562, 34.648101806640625, 74.8260498046875, 82.266357421875, 111.67913818359375, 116.16009521484375, -1.2149200439453125, -49.868995666503906, -65.46300506591797, -0.6975650787353516, 62.54753112792969, 46.358551025390625, 0.0, 22.30084228515625, 80.55319213867188, 16.95092010498047, 12.791580200195312, 137.58480834960938, 16.362571716308594, -6.64306640625, -35.78094482421875, 93.19195556640625, -16.44219970703125, 100.9093017578125, -14.124557495117188, 107.4093017578125, -17.626922607421875, -56.586761474609375, 18.2054443359375, 105.94189453125, 95.08213806152344, 113.9500732421875, -38.49456787109375, 4.16314697265625, 119.90403747558594, 
64.3626708984375, -21.40478515625, 59.03009033203125, 15.6187744140625, 88.17730712890625, 23.787635803222656, 76.27702331542969, -67.27212524414062, 4.355445861816406, 31.708423614501953, 104.82347106933594, 105.0723876953125, -26.574905395507812, -38.98755645751953, 16.096960067749023, 81.42080688476562, 59.62237548828125, 90.89852905273438, -50.30474853515625, -15.9603271484375, 6.88140869140625, 24.553539276123047, 116.35516357421875, -1.433990478515625, 86.23912048339844, 86.89044189453125, 120.09942626953125, 129.85107421875, 68.09719848632812, 36.64160919189453, -50.713623046875, 75.64578247070312, 76.36224365234375, 101.16757202148438, 24.855880737304688, 135.913330078125, 5.61785888671875, 1.4338531494140625, 21.231990814208984], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000228.npy"}
{"epoch": 0.4774869109947644, "step": 229, "batch_size": 128, "mean": 46.305206298828125, "std": 56.1221923828125, "min": -91.80177307128906, "p10": -18.911077499389645, "median": 45.281593322753906, "p90": 115.68245086669921, "max": 166.92498779296875, "pos_frac": 0.8046875, "sample": [77.64654541015625, -11.32763671875, 93.1490478515625, 72.37188720703125, -14.9244384765625, -89.74850463867188, 25.647750854492188, 83.95204162597656, 40.882080078125, -61.40693664550781, 150.76202392578125, 115.39979553222656, 89.82980346679688, 0.2818489074707031, 131.60946655273438, 101.69140625, 37.187469482421875, 34.780609130859375, 24.846874237060547, 46.8084716796875, 35.27281188964844, 21.816436767578125, 96.88995361328125, -11.520355224609375, 71.88054656982422, -68.3985595703125, 15.07568359375, 46.46333312988281, 108.758056640625, 77.92247009277344, 0.32958984375, 44.818206787109375, 117.4483642578125, 51.70782470703125, 17.369384765625, 116.34197998046875, -4.7259674072265625, 18.92828369140625, 140.31265258789062, -91.80177307128906, 6.47509765625, -57.982421875, 15.933486938476562, 95.53097534179688, 60.95225143432617, 133.68548583984375, 166.92498779296875, 107.6517333984375, 50.40901184082031, 94.05442810058594, 39.213897705078125, -0.5500736236572266, 19.533981323242188, 3.208740234375, 75.5942153930664, 109.78973388671875, 90.9705810546875, 31.354217529296875, 10.615447998046875, -20.603973388671875, 53.94337463378906, -53.32854080200195, 101.56719970703125, 60.876953125, -14.4637451171875, -6.683502197265625, 100.9122314453125, 14.882820129394531, 74.37596893310547, -1.308929443359375, -0.23957061767578125, 45.74497985839844, 1.753763198852539, -24.03326416015625, 100.72979736328125, 32.86427307128906, 27.67028045654297, 81.33141326904297, 111.1678466796875, 44.2403564453125, 109.11109924316406, 18.950698852539062, 38.256439208984375, 97.29032897949219, 52.722042083740234, 12.32733154296875, -32.762542724609375, 89.58843994140625, 124.0665283203125, 
70.30463409423828, 138.04136657714844, 99.32493591308594, 14.438261032104492, 27.5274658203125, 14.259429931640625, 104.8262939453125, 42.40272521972656, 113.8306884765625, 58.1041259765625, 25.05908203125, 88.73013305664062, 23.474647521972656, 43.703338623046875, 118.516845703125, -9.4920654296875, 14.0123291015625, 59.291954040527344, -14.59283447265625, 9.644420623779297, 114.39999389648438, 65.54733276367188, -18.185550689697266, 11.0333251953125, 49.1563720703125, 80.02691650390625, 87.87271118164062, 88.49221801757812, -60.341339111328125, 89.89433288574219, -73.43377685546875, 127.83146667480469, 50.329345703125, 76.58474731445312, 24.171722412109375, 116.8056640625, -53.185604095458984, 138.82418823242188, -78.80965423583984], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000229.npy"}
{"epoch": 0.47958115183246075, "step": 230, "batch_size": 128, "mean": 34.75718307495117, "std": 55.54924774169922, "min": -168.845458984375, "p10": -17.637118530273433, "median": 27.540847778320312, "p90": 107.52025451660157, "max": 151.39849853515625, "pos_frac": 0.7421875, "sample": [108.9317626953125, 30.58642578125, 32.211181640625, 89.74996185302734, 147.78872680664062, 63.71649169921875, 117.54594421386719, -1.4119873046875, 106.50588989257812, 29.018325805664062, 12.657211303710938, 28.41448974609375, 23.792991638183594, -73.6881103515625, -168.845458984375, 94.28475952148438, 98.19216918945312, -1.9932594299316406, -7.868682861328125, -27.8673095703125, 12.6917724609375, 11.259654998779297, 117.51416015625, -61.581787109375, 0.0, 103.43304443359375, -2.398834228515625, 49.08524703979492, 58.1143798828125, 98.11274719238281, 58.255775451660156, 88.71464538574219, 7.5894317626953125, 37.5, -0.73956298828125, -79.17974853515625, 52.2930908203125, 87.87471008300781, 64.3773193359375, -22.684646606445312, 0.5905265808105469, -81.26986694335938, 12.881072998046875, 28.595458984375, 44.63703155517578, 102.35186767578125, 68.08091735839844, -8.431549072265625, 90.58023071289062, -19.883407592773438, -7.330263137817383, 52.577301025390625, -34.3917236328125, 55.38706970214844, 100.70304870605469, -10.31671142578125, 32.36082458496094, 11.338714599609375, 102.46343994140625, 24.071762084960938, -73.8941650390625, 63.06744384765625, 127.62139892578125, 79.15997314453125, 15.064964294433594, 5.014495849609375, -16.674423217773438, 92.25454711914062, -21.717819213867188, 63.494293212890625, 44.30999755859375, 6.81378173828125, -11.958328247070312, -12.721931457519531, 56.519775390625, 26.667205810546875, 9.8651123046875, 53.82608413696289, 125.0877685546875, -9.338287353515625, 3.156686782836914, 54.291717529296875, 24.51275634765625, -15.223474502563477, 17.971710205078125, -122.97128295898438, 13.06964111328125, -20.27434539794922, 131.59280395507812, 
38.5718994140625, 138.00747680664062, 92.35565185546875, 13.5948486328125, -7.3495635986328125, 112.74575805664062, 109.13290405273438, 1.492788314819336, 31.707290649414062, 94.2003173828125, 64.65907287597656, 107.85807800292969, 5.589773178100586, 75.0076675415039, 26.345277786254883, -12.742897033691406, 84.76844787597656, -0.91473388671875, 99.938232421875, 76.09837341308594, 5.4503936767578125, 72.03971862792969, 107.37547302246094, 62.51414489746094, 2.602855682373047, 19.74676513671875, 32.471763610839844, 2.830718994140625, 127.58598327636719, 151.39849853515625, -4.746063232421875, 1.1199569702148438, 3.72857666015625, 3.0582427978515625, 9.518218994140625, 25.30047607421875, 32.00056457519531, -4.951416015625, -6.729095458984375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000230.npy"}
{"epoch": 0.4816753926701571, "step": 231, "batch_size": 128, "mean": 27.833770751953125, "std": 50.805973052978516, "min": -105.56521606445312, "p10": -38.01005554199219, "median": 24.022674560546875, "p90": 91.21240844726562, "max": 140.73284912109375, "pos_frac": 0.7578125, "sample": [83.59259033203125, 11.574798583984375, -38.862945556640625, -63.38654708862305, 78.71344757080078, 46.25639343261719, 64.175537109375, 1.0737762451171875, 67.11248779296875, -105.56521606445312, 99.80097961425781, -31.8572998046875, 63.868499755859375, 29.697021484375, 0.04315185546875, 90.53662109375, 9.209747314453125, 9.739559173583984, 75.13970947265625, 7.3205718994140625, -24.324310302734375, -68.02928161621094, -13.840744018554688, 50.96099853515625, 10.306533813476562, 31.910568237304688, 5.9249725341796875, 48.91057586669922, -31.385528564453125, -52.76043701171875, 26.19525146484375, 83.25149536132812, 92.96769714355469, 8.539506912231445, 44.921844482421875, -0.210693359375, 119.04946899414062, 43.486328125, 12.630111694335938, 42.523712158203125, -4.902069091796875, 38.38336181640625, -5.38275146484375, 1.1041641235351562, 105.57887268066406, 83.5638427734375, -5.99078369140625, 23.722793579101562, 54.44598388671875, 32.456634521484375, 87.656005859375, -56.65704345703125, 71.16059875488281, 15.62188720703125, 5.0599365234375, 16.738433837890625, 81.82681274414062, -19.96417236328125, 68.22364807128906, 140.73284912109375, -42.44419860839844, 4.086200714111328, 2.9831390380859375, 39.067169189453125, 2.6606483459472656, 72.67333984375, 87.65072631835938, 27.15557861328125, 20.58123779296875, 47.385162353515625, 24.322555541992188, 52.23486328125, 131.66119384765625, -22.92608642578125, -85.46723937988281, -0.6485595703125, 48.15138244628906, -9.049407958984375, 82.18780517578125, -2.879180908203125, 18.639404296875, 62.895660400390625, 35.13604736328125, 63.4051513671875, 1.9639434814453125, 23.422683715820312, 8.1094970703125, 101.67843627929688, 5.6384124755859375, 
8.594512939453125, 37.772003173828125, 58.406585693359375, 67.92697143554688, 16.1785888671875, -34.93766784667969, 72.893310546875, 2.822093963623047, 21.467041015625, 92.78924560546875, 0.6910514831542969, -93.5933837890625, 22.79443359375, 63.63850402832031, 134.92665100097656, 80.0333251953125, -51.7744140625, -1.7985458374023438, 31.414581298828125, 25.361740112304688, 71.01959991455078, -40.75349426269531, 53.91569519042969, 29.45269775390625, -87.15142822265625, 71.26287841796875, 130.92901611328125, 11.884429931640625, 105.15957641601562, 11.625534057617188, 57.48846435546875, -17.70836639404297, 4.7178955078125, 94.05989074707031, -37.64453125, 57.84788513183594, 0.0, -85.65885925292969, 113.80398559570312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000231.npy"}
{"epoch": 0.4837696335078534, "step": 232, "batch_size": 128, "mean": 35.16517639160156, "std": 53.355491638183594, "min": -81.32537841796875, "p10": -25.93144149780273, "median": 25.213260650634766, "p90": 105.99450988769532, "max": 164.15362548828125, "pos_frac": 0.6875, "sample": [-21.0560302734375, 103.79318237304688, -3.4027786254882812, 113.07244873046875, 30.229965209960938, 70.65481567382812, 87.35882568359375, 43.743316650390625, -3.90643310546875, -9.035354614257812, 61.630859375, 65.9837646484375, -2.886138916015625, -24.78601837158203, -45.97113037109375, 52.29254150390625, 24.702468872070312, 0.31304168701171875, 19.7034912109375, 19.962112426757812, 27.7371826171875, 10.815673828125, -61.21351623535156, 66.88186645507812, 92.85110473632812, 108.550048828125, 21.09619903564453, 70.00888061523438, -0.43303680419921875, 131.58392333984375, 56.987701416015625, -21.421592712402344, -39.53558349609375, 105.98556518554688, 68.70513916015625, 93.7783203125, -55.68634033203125, 35.8182373046875, 19.639892578125, 98.64031982421875, 60.375213623046875, -29.44365692138672, -0.147247314453125, 24.43152618408203, 12.101776123046875, 2.7451133728027344, 76.67034912109375, 15.27392578125, 101.98738098144531, 0.26520538330078125, 40.917755126953125, -24.69818115234375, 91.00861358642578, 25.532211303710938, 5.1290130615234375, 67.92083740234375, 17.897125244140625, 94.13866424560547, -1.4826812744140625, 84.77961730957031, 24.707721710205078, 132.3359375, 105.80958557128906, -24.44671630859375, 87.77658081054688, 88.99089050292969, -6.1424560546875, 4.09014892578125, -6.288818359375, -10.258003234863281, 43.07117462158203, 108.9732666015625, 11.2535400390625, -28.604095458984375, 124.57095336914062, 121.35121154785156, -3.351715087890625, -5.287078857421875, 52.23291015625, 12.45001220703125, 1.0449199676513672, 24.895248413085938, -19.108154296875, 83.56243896484375, 106.015380859375, -17.548236846923828, -5.794670104980469, 76.02606201171875, -5.464691162109375, 
-33.65728759765625, 89.22698211669922, 67.62908935546875, 0.0, 110.088134765625, 31.015586853027344, 5.31549072265625, 48.734161376953125, 25.531272888183594, 100.77342987060547, 68.29290771484375, 88.60262298583984, 12.582901000976562, 125.47354125976562, -81.32537841796875, 36.63972473144531, 164.15362548828125, 12.27783203125, -20.204071044921875, 94.02670288085938, 69.98674011230469, -5.95733642578125, 112.6376953125, -63.45465850830078, -13.400253295898438, -30.83250617980957, -50.334197998046875, 61.22045135498047, 0.7553825378417969, 90.21917724609375, -72.51577758789062, 94.05131530761719, -4.61029052734375, -76.96073913574219, 127.19879150390625, 60.00164794921875, 77.30097198486328, 47.665924072265625, -22.45977783203125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000232.npy"}
{"epoch": 0.48586387434554973, "step": 233, "batch_size": 128, "mean": 34.1455078125, "std": 56.67924880981445, "min": -119.57382202148438, "p10": -39.073681640625, "median": 29.405960083007812, "p90": 106.1844497680664, "max": 161.87744140625, "pos_frac": 0.7578125, "sample": [-39.946502685546875, 95.43903350830078, -0.266082763671875, -11.55718994140625, 7.518974304199219, 103.29817199707031, 23.2420654296875, 32.32415771484375, -22.188507080078125, 14.792747497558594, 18.762298583984375, 70.28970336914062, 118.28384399414062, 89.3773193359375, 41.044883728027344, -0.27239990234375, 22.003143310546875, 75.479248046875, 54.583404541015625, 155.49307250976562, 96.87667846679688, 89.8011245727539, 104.53445434570312, 93.56602478027344, 46.65190887451172, 108.04924011230469, 96.58456420898438, 4.4619598388671875, 88.09428405761719, -26.727699279785156, 29.11492919921875, -35.245994567871094, 48.96653747558594, 4.8926849365234375, 112.49787902832031, 53.0791015625, 21.910369873046875, -68.09880828857422, 76.14227294921875, 124.31222534179688, -50.31153106689453, -2.784740447998047, 7.142364501953125, 22.69921875, -48.51325988769531, 143.2974853515625, 52.58055114746094, 17.245147705078125, -0.085845947265625, 67.2125473022461, 77.111083984375, 132.720703125, 17.81915283203125, -6.3113861083984375, 34.45277404785156, -14.368255615234375, 145.9146728515625, 31.9576416015625, 98.37312316894531, 9.15081787109375, 68.2117919921875, 90.3304443359375, -45.6441650390625, 8.02969741821289, 9.996551513671875, -72.78790283203125, -67.30960083007812, 57.5114631652832, -119.57382202148438, -70.67715454101562, 40.53826904296875, 105.38525390625, 121.13482666015625, 52.67979431152344, 139.255615234375, 52.1920166015625, 21.231109619140625, 34.88948059082031, 15.32188606262207, 1.251312255859375, 2.344440460205078, 8.905166625976562, 17.052001953125, 48.78009033203125, -78.95262145996094, 4.959749221801758, 161.87744140625, 41.552001953125, -38.699615478515625, 65.27508544921875, 
78.94914245605469, 5.7912445068359375, -81.70965576171875, 131.67495727539062, 23.212425231933594, -22.921875, 1.1608390808105469, 85.44686889648438, -7.13665771484375, 99.71855163574219, -13.0184326171875, -0.34287261962890625, -11.019973754882812, -89.07807922363281, 30.372955322265625, 109.88916015625, 29.696990966796875, 76.07613372802734, 80.63446044921875, -74.35903930664062, 1.7319488525390625, 84.103759765625, 48.17437744140625, -9.300933837890625, 31.209228515625, 25.377731323242188, -12.383682250976562, 62.863250732421875, 56.225730895996094, 20.2225341796875, 60.53448486328125, 4.532917022705078, 2.44061279296875, 92.37576293945312, 46.487060546875, 20.741683959960938, 34.427024841308594, 20.29620361328125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000233.npy"}
{"epoch": 0.48795811518324606, "step": 234, "batch_size": 128, "mean": 38.59610366821289, "std": 63.23731994628906, "min": -144.88235473632812, "p10": -36.512979125976564, "median": 39.56597137451172, "p90": 112.92429351806639, "max": 183.73907470703125, "pos_frac": 0.7578125, "sample": [-82.16183471679688, -99.09735107421875, -23.935821533203125, 0.08286285400390625, 53.38665771484375, 92.39469909667969, 115.67025756835938, 47.737060546875, -28.053558349609375, 133.79544067382812, 103.46229553222656, -94.914794921875, 96.1461181640625, 85.12761688232422, -24.388336181640625, 99.0556640625, 35.20848846435547, 61.95892333984375, 81.68199157714844, 124.20289611816406, 61.51995849609375, 52.828704833984375, 82.95663452148438, 129.44033813476562, 31.879364013671875, 6.457557678222656, 85.16795349121094, 84.02581787109375, 90.69415283203125, 22.18121337890625, 43.63914489746094, 68.36737060546875, 104.526123046875, 132.47708129882812, 85.73703002929688, 13.920112609863281, 33.24481201171875, 26.422698974609375, 1.66448974609375, 46.54840087890625, 61.2562255859375, 21.69818115234375, 7.2896728515625, -74.34693908691406, 29.647247314453125, -27.341896057128906, 85.18960571289062, 73.16888427734375, -0.38556671142578125, 101.42545318603516, 43.70975112915039, 4.01727294921875, 132.27886962890625, 67.66488647460938, -21.662948608398438, 150.30108642578125, 90.84257507324219, 81.8417739868164, 97.47418212890625, 29.131500244140625, 19.02912139892578, 7.615419387817383, 94.25360107421875, 15.200836181640625, 7.952705383300781, -1.851806640625, -33.66841125488281, 0.0, 23.26702880859375, -27.123001098632812, 13.530105590820312, 89.94151306152344, -92.5865478515625, -31.52094268798828, 94.70892333984375, 141.04135131835938, 100.079345703125, 27.569534301757812, -79.65048217773438, 53.67437744140625, -1.0877704620361328, 90.15066528320312, 111.74745178222656, -22.8602294921875, 43.982688903808594, 50.41673278808594, 82.49884033203125, 47.23114013671875, -144.88235473632812, 
131.40283203125, -61.888824462890625, 160.16436767578125, 17.816070556640625, 22.101425170898438, 91.5471420288086, 2.6374969482421875, 90.76809692382812, 35.4927978515625, 2.1690292358398438, 9.194091796875, 83.63909912109375, 0.0, 10.701698303222656, 49.37953186035156, 183.73907470703125, 111.63992309570312, -96.76957702636719, -7.887603759765625, 116.60821533203125, 123.45556640625, 81.3449478149414, 5.602752685546875, -52.344390869140625, 28.36163330078125, 63.2279052734375, 105.22189331054688, -36.54376220703125, -36.499786376953125, -21.35577392578125, 2.1893310546875, 48.2178955078125, 3.317138671875, -66.7086181640625, 8.714149475097656, -85.72695922851562, 102.65704345703125, -4.1772918701171875, 105.975341796875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000234.npy"}
{"epoch": 0.4900523560209424, "step": 235, "batch_size": 128, "mean": 33.25033187866211, "std": 62.793426513671875, "min": -125.26953125, "p10": -47.98707275390623, "median": 26.027454376220703, "p90": 106.58979187011718, "max": 150.32989501953125, "pos_frac": 0.6953125, "sample": [22.6358642578125, 26.64118194580078, -7.61187744140625, -0.081085205078125, 8.894691467285156, -124.56271362304688, -71.016845703125, -1.307159423828125, 89.80201721191406, 82.78480529785156, 8.539863586425781, 142.73822021484375, 131.38006591796875, 53.6417236328125, 52.77207946777344, 57.29647445678711, 93.20307159423828, 44.51507568359375, -1.6025161743164062, -7.487335205078125, 55.05705261230469, 24.0704345703125, 28.8497314453125, 104.97286987304688, 131.56259155273438, 133.87841796875, 92.66189575195312, -18.340438842773438, 89.45994567871094, -11.394407272338867, 111.0433349609375, 0.0125732421875, 30.044342041015625, 98.62808990478516, 93.5357666015625, 90.17703247070312, 121.09725952148438, 101.25015258789062, 7.105110168457031, 141.84393310546875, -11.783905029296875, 107.1783447265625, -31.255958557128906, -3.84423828125, 46.30397033691406, 129.936767578125, -107.54788208007812, -27.10974884033203, -125.26953125, 23.40313720703125, 52.219024658203125, -17.945281982421875, -14.057071685791016, -13.991455078125, 29.05133056640625, 93.16934204101562, -63.429718017578125, 41.14064025878906, -8.975128173828125, -62.276512145996094, 11.644977569580078, 12.814483642578125, 104.60115051269531, -43.43646240234375, 140.12454223632812, 72.34111785888672, -9.7225341796875, -91.79925537109375, 4.969940185546875, 38.669281005859375, -70.64996337890625, 57.853668212890625, -65.43721008300781, 88.65034484863281, 6.955543518066406, 9.87353515625, 7.57501220703125, 105.43672180175781, 6.09173583984375, 51.084747314453125, 58.772918701171875, 16.21136474609375, -121.38229370117188, 24.28730010986328, 76.80599975585938, -31.48668670654297, 81.44481658935547, 93.80955505371094, 
106.56195068359375, 3.1490631103515625, 142.14419555664062, 100.26971435546875, 72.89457702636719, 150.32989501953125, 106.65475463867188, 46.3369140625, 48.80229187011719, 19.647705078125, 100.5863265991211, -9.963571548461914, -1.087310791015625, 102.05874633789062, -68.74594116210938, 59.75885009765625, -3.2655029296875, 86.46134185791016, -5.493011474609375, 25.147308349609375, 85.66571044921875, -11.91717529296875, -3.28741455078125, 37.55326843261719, 5.544044494628906, -35.16845703125, 22.88176727294922, 13.219181060791016, 99.4150619506836, -119.31134033203125, -58.60516357421875, 4.257026672363281, 82.48313903808594, 78.52591705322266, 25.413726806640625, 86.10899353027344, 73.04502868652344, 65.57481384277344, 25.20751953125, -0.5233402252197266], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000235.npy"}
{"epoch": 0.49214659685863876, "step": 236, "batch_size": 128, "mean": 39.3505859375, "std": 58.095619201660156, "min": -121.11920166015625, "p10": -43.429109191894526, "median": 35.44163131713867, "p90": 107.0010269165039, "max": 155.65802001953125, "pos_frac": 0.78125, "sample": [56.69659423828125, 85.3128662109375, -72.33047485351562, 61.6865234375, 64.4615478515625, 87.12680053710938, 9.794387817382812, 101.62283325195312, -14.17047119140625, -54.53309631347656, -69.4923095703125, 89.07058715820312, -64.650634765625, 29.490081787109375, 92.9864501953125, 53.591217041015625, -0.6728477478027344, 78.16574096679688, 20.960678100585938, 80.79681396484375, 24.5545654296875, 85.46810150146484, 41.363861083984375, -27.81118392944336, 55.59503936767578, 90.72407531738281, 94.50604248046875, 7.520114898681641, 65.23489379882812, 101.4432373046875, 60.55955505371094, 33.23876953125, 2.9696731567382812, 18.775535583496094, 10.466606140136719, 22.888015747070312, 14.837158203125, 130.67416381835938, 99.36186218261719, 8.148956298828125, 7.4562225341796875, 105.51837921142578, 103.53274536132812, 75.94865417480469, 128.78662109375, 1.6378889083862305, -73.61064147949219, 2.162506103515625, 22.438613891601562, 0.0, -5.917205810546875, 30.767501831054688, 94.93011474609375, 85.73661041259766, 10.206743240356445, -26.234130859375, -100.36880493164062, 100.77005004882812, 12.0760498046875, -50.60211181640625, 72.26629638671875, 113.65170288085938, 53.471923828125, 43.384620666503906, 0.94635009765625, -48.19297790527344, 65.39703369140625, -121.11920166015625, 69.76629638671875, 5.0287017822265625, -72.10008239746094, 155.65802001953125, 77.8744125366211, 137.0928955078125, -3.71234130859375, -5.650299072265625, -66.36199951171875, -14.067413330078125, 106.95628356933594, 74.30406951904297, 25.153717041015625, 64.69314575195312, -41.387451171875, 81.5640869140625, 97.21994018554688, 32.20208740234375, 114.30192565917969, 132.2421875, -8.105659484863281, 73.51739501953125, 
61.271575927734375, 106.96087646484375, 106.79510498046875, 21.58978271484375, 31.179885864257812, 107.09471130371094, 111.13134765625, 37.644493103027344, 93.45460510253906, 112.58366394042969, 101.69486999511719, 31.719482421875, -24.867538452148438, 50.534263610839844, 0.0, 44.24609375, 21.467910766601562, 28.449600219726562, -95.49462890625, 100.83973693847656, 12.099990844726562, 0.2605094909667969, -73.01134490966797, -19.382766723632812, 46.3817138671875, 101.9077377319336, 117.18531799316406, 8.665283203125, 6.92462158203125, -20.93553924560547, 113.44902038574219, 89.28446960449219, 114.98110961914062, 17.82745361328125, 23.638092041015625, 91.34811401367188, 8.141719818115234, 32.14971923828125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000236.npy"}
{"epoch": 0.4942408376963351, "step": 237, "batch_size": 128, "mean": 34.30473327636719, "std": 56.26811218261719, "min": -127.4256591796875, "p10": -21.29223937988281, "median": 24.311798095703125, "p90": 109.79963455200195, "max": 191.8909912109375, "pos_frac": 0.7421875, "sample": [32.49920654296875, 14.576263427734375, 40.19573211669922, 117.602294921875, -8.83935546875, 32.01513671875, 36.173377990722656, 85.31970977783203, 55.414764404296875, 7.591461181640625, -27.050994873046875, -5.8394317626953125, 139.4349365234375, -24.402099609375, -4.208854675292969, 20.978134155273438, -0.86328125, 65.81149291992188, 191.8909912109375, 89.09940338134766, 17.383773803710938, 34.04478454589844, 107.8211669921875, 59.991966247558594, 11.46432876586914, 11.1959228515625, 35.3765869140625, 0.0, 72.94910430908203, 88.8701171875, 3.2659378051757812, 74.8935546875, -12.54962158203125, 7.608070373535156, -53.285247802734375, 34.87115478515625, 56.533721923828125, 0.95220947265625, -127.4256591796875, -42.53539276123047, 2.6624069213867188, -1.51092529296875, 33.43354797363281, 89.16961669921875, 7.328895568847656, -1.518798828125, 102.46340942382812, 28.45465087890625, 31.825942993164062, 4.582599639892578, 0.623321533203125, 142.76776123046875, 62.13490295410156, 19.36205291748047, 7.9722747802734375, 81.4322738647461, 105.7264404296875, 105.37295532226562, 15.7882080078125, 20.0057373046875, 124.07733154296875, 35.931983947753906, 22.3887939453125, 12.02935791015625, 59.67900085449219, -8.647247314453125, 124.26119995117188, 102.72621154785156, -37.206146240234375, -70.39984130859375, 0.6870613098144531, 129.0067138671875, -86.45057678222656, 1.3837890625, 102.83296203613281, 18.980064392089844, -24.905792236328125, -13.455230712890625, 142.00819396972656, -19.957763671875, 69.97500610351562, 23.93408203125, 113.75619506835938, 63.960357666015625, 114.52719116210938, -69.19183349609375, -3.297882080078125, 4.504730224609375, 111.29595184326172, 28.566925048828125, 
10.403938293457031, 141.825927734375, -18.602325439453125, -3.63665771484375, -36.50994873046875, 125.37957763671875, -13.0916748046875, 2.89727783203125, 100.97430419921875, 61.434749603271484, 24.68951416015625, 12.347000122070312, 95.47357177734375, -2.7744674682617188, 3.0593490600585938, 7.280019760131836, 98.96868896484375, -85.08200073242188, -15.659881591796875, -13.292770385742188, 3.174957275390625, 109.15835571289062, -19.959442138671875, 60.72045135498047, 25.584732055664062, 70.77615356445312, -95.5284423828125, 25.866622924804688, 90.035888671875, 106.21682739257812, 93.59835815429688, -16.98260498046875, 33.32365417480469, 83.13790893554688, 43.495452880859375, 28.56170654296875, 19.1578369140625, 64.6795654296875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000237.npy"}
{"epoch": 0.4963350785340314, "step": 238, "batch_size": 128, "mean": 38.98161315917969, "std": 57.548583984375, "min": -90.98078918457031, "p10": -21.069408988952627, "median": 29.6798095703125, "p90": 116.03138885498045, "max": 161.29931640625, "pos_frac": 0.71875, "sample": [6.79278564453125, -0.03475189208984375, 6.1949310302734375, 13.130210876464844, -53.686363220214844, 145.47723388671875, 94.06915283203125, 37.9405517578125, 95.35211181640625, -49.15911865234375, -5.08966064453125, 13.974761962890625, -6.8664398193359375, 15.576995849609375, -33.43849182128906, -73.40003967285156, 114.0927734375, 103.7647705078125, -3.110870361328125, 72.95970916748047, 102.78619384765625, 79.8740234375, 42.62103271484375, 113.9850082397461, 144.89695739746094, 62.923683166503906, -13.05108642578125, 1.6647872924804688, 79.43533325195312, -90.98078918457031, -47.842529296875, -90.52326965332031, 127.97686767578125, -83.11672973632812, -7.403472900390625, 128.19442749023438, 31.857757568359375, 102.49053955078125, 8.747055053710938, 59.2236328125, 140.31536865234375, -3.29095458984375, 65.55410766601562, 161.29931640625, 95.04669189453125, 112.8553466796875, 130.76425170898438, 119.43170166015625, -11.26812744140625, 39.665557861328125, 112.84803771972656, 147.8529052734375, 22.684417724609375, -30.196571350097656, 18.193496704101562, 29.54449462890625, -0.3292388916015625, 1.997396469116211, -10.707504272460938, 80.41293334960938, 24.3046875, 0.1754150390625, -16.713470458984375, -70.08489990234375, 96.36125183105469, 30.7041015625, 91.16717529296875, 105.92560577392578, 74.55685424804688, 108.67064666748047, -46.98152160644531, 11.287841796875, 46.378334045410156, -4.7126007080078125, -3.6817398071289062, 23.251068115234375, 121.50579833984375, -13.119964599609375, 85.04843139648438, 99.96197509765625, 29.81512451171875, 121.51812744140625, -8.79604721069336, -9.377128601074219, 2.80230712890625, 26.74566650390625, 89.96778106689453, -18.729875564575195, 62.63517379760742, 
36.917144775390625, 0.7214374542236328, 41.26939392089844, -44.16917419433594, 58.51348876953125, 161.0067138671875, -9.132568359375, 38.500274658203125, 34.09031677246094, 19.333351135253906, 40.663509368896484, 24.577072143554688, 122.3140869140625, 114.57411193847656, 49.412353515625, 64.59016418457031, 0.17352294921875, -17.598342895507812, 95.92109680175781, 17.00469970703125, 15.583709716796875, 26.14031982421875, 89.34329223632812, 13.226909637451172, 14.519081115722656, 71.33624267578125, -12.167617797851562, 57.4090576171875, 64.5379638671875, -4.1153411865234375, -8.112213134765625, 104.44050598144531, 73.7742919921875, 37.06414794921875, -26.5283203125, 9.507240295410156, 6.933349609375, 82.1796875, -5.66802978515625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000238.npy"}
|
||||
{"epoch": 0.49842931937172774, "step": 239, "batch_size": 128, "mean": 32.368629455566406, "std": 60.73345184326172, "min": -144.64739990234375, "p10": -40.728903198242186, "median": 19.412994384765625, "p90": 117.04646911621093, "max": 162.41201782226562, "pos_frac": 0.7109375, "sample": [-60.00152587890625, -35.19915771484375, 130.68374633789062, 26.88226318359375, -16.500823974609375, -47.030731201171875, -144.64739990234375, 76.94175720214844, 131.83627319335938, 1.3654251098632812, 54.331939697265625, -140.814208984375, -59.59112548828125, -5.539642333984375, 87.85225677490234, 72.924560546875, 87.17877197265625, 121.59880828857422, 67.98733520507812, -10.094329833984375, 0.30115509033203125, 104.78021240234375, 12.116634368896484, 8.245101928710938, -12.060310363769531, 40.3948974609375, 0.9750194549560547, 46.15575408935547, 19.712799072265625, -11.633987426757812, -43.70294189453125, 94.04962158203125, 75.4010238647461, 1.7513198852539062, 141.33828735351562, -12.865875244140625, 28.724014282226562, 96.5447998046875, 162.41201782226562, 111.5634765625, 87.06742095947266, -6.99688720703125, 24.192108154296875, 0.460968017578125, 144.641845703125, -17.34009552001953, -12.5885009765625, 65.52349853515625, -68.84211730957031, 82.3502197265625, -73.5796127319336, 78.57975769042969, 17.963943481445312, 13.132537841796875, 72.76567840576172, -34.60809326171875, 48.438873291015625, 12.742698669433594, 0.2337493896484375, -38.12615966796875, 19.113189697265625, 115.50971984863281, 7.671060562133789, 110.6719970703125, 16.397193908691406, 35.465087890625, 74.9476089477539, -0.55767822265625, 6.838523864746094, 134.26693725585938, 1.484292984008789, 52.119117736816406, 10.728759765625, 38.8004150390625, 108.1812744140625, -7.408935546875, 73.74789428710938, 16.414794921875, -51.66522216796875, 7.577606201171875, -0.096221923828125, 83.74524688720703, 138.62753295898438, -20.505126953125, -1.33258056640625, 107.842529296875, -72.33343505859375, 29.34264373779297, 
9.446334838867188, -46.34771728515625, 108.91293334960938, -3.7481117248535156, -9.725357055664062, 105.0401382446289, 2.9927024841308594, 78.82707214355469, 2.3654403686523438, 5.7000732421875, 138.66522216796875, 64.44943237304688, 0.30499267578125, 136.98602294921875, 18.010299682617188, -57.34747314453125, 50.514892578125, -2.2774658203125, -39.454315185546875, 29.217071533203125, 43.640167236328125, 124.78598022460938, 5.4093780517578125, 104.63204956054688, 51.9925537109375, 137.47479248046875, 120.63221740722656, 0.0, 45.79010009765625, 0.5103092193603516, 106.52660369873047, 54.187713623046875, -9.23577880859375, 83.02556610107422, 49.97918701171875, 19.84039306640625, 35.22015380859375, -78.70176696777344, -4.828514099121094, 23.797958374023438], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000239.npy"}
|
||||
{"epoch": 0.5005235602094241, "step": 240, "batch_size": 128, "mean": 40.87913131713867, "std": 55.99492645263672, "min": -108.19914245605469, "p10": -19.680340576171876, "median": 36.182533264160156, "p90": 111.736279296875, "max": 159.38510131835938, "pos_frac": 0.7265625, "sample": [8.228752136230469, 108.68576049804688, 69.347900390625, 105.63301086425781, -12.756912231445312, -0.5110855102539062, 66.3057861328125, 49.721954345703125, 0.0, 131.2154083251953, 35.55035400390625, -26.13201904296875, -92.87406921386719, 48.1015625, -20.109451293945312, 96.63593292236328, 64.48575592041016, 71.64413452148438, 90.87290954589844, 41.595924377441406, 21.162689208984375, -1.477813720703125, 76.644287109375, 104.99533081054688, 17.25030517578125, 104.62059020996094, 69.08432006835938, -13.49212646484375, 114.9146728515625, 112.11968994140625, 38.683837890625, 56.7607421875, 111.16912841796875, 92.9706039428711, 15.962890625, -14.00286865234375, 120.19393920898438, 43.112327575683594, 92.44722747802734, 10.834953308105469, 25.002914428710938, 46.40911865234375, 116.05125427246094, -10.647010803222656, 77.6123046875, 61.91705322265625, 88.15286254882812, 88.52633666992188, 82.00540161132812, 99.56405639648438, 113.77569580078125, 36.858558654785156, 0.0, 49.07563018798828, -108.19914245605469, 24.233673095703125, -19.62689208984375, -8.493133544921875, -19.8050537109375, 36.04096984863281, 25.886734008789062, 94.70890808105469, -8.433441162109375, -11.776702880859375, 115.46832275390625, 103.57107543945312, 47.488555908203125, 84.95035552978516, 109.44746398925781, 82.77536010742188, 132.16131591796875, -29.338409423828125, 0.0, 20.524078369140625, -17.369400024414062, -11.494140625, 139.8785400390625, -1.962005615234375, 19.37329864501953, 26.349609375, -11.563804626464844, 36.3240966796875, 94.46510314941406, 32.20347595214844, 28.516891479492188, 159.38510131835938, -83.73101806640625, 154.78460693359375, -100.10882568359375, 74.18377685546875, 10.95501708984375, 
13.797027587890625, -6.75738525390625, 96.15379333496094, 94.80403137207031, -9.928619384765625, -24.405303955078125, 84.1369857788086, 15.588525772094727, -12.911035537719727, -14.85711669921875, 7.026699066162109, 5.801002502441406, 19.406082153320312, 99.12043762207031, 61.128082275390625, 79.15046691894531, 111.57196044921875, 30.561309814453125, 9.568603515625, 114.5438461303711, 56.82820129394531, 29.62493896484375, -88.32013702392578, 31.122421264648438, -53.42854690551758, -43.77099609375, 75.85186767578125, 28.91741943359375, 77.84268188476562, 0.0, 6.474205017089844, 63.39496612548828, 63.89111328125, 22.89495849609375, -68.01642608642578, 19.02276611328125, 131.02517700195312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000240.npy"}
|
||||
{"epoch": 0.5026178010471204, "step": 241, "batch_size": 128, "mean": 30.69418716430664, "std": 62.04478454589844, "min": -110.58248138427734, "p10": -48.39708251953125, "median": 17.228233337402344, "p90": 109.59050292968749, "max": 169.33401489257812, "pos_frac": 0.6953125, "sample": [-14.45587158203125, 115.939208984375, 83.00279235839844, 104.29238891601562, 22.004493713378906, -30.2396240234375, 24.536468505859375, 0.9512577056884766, 51.271575927734375, 97.29156494140625, 98.2685546875, 0.0, -9.75628662109375, 104.04766845703125, 89.77812194824219, 142.77719116210938, 16.685516357421875, 64.97781372070312, 15.5155029296875, 111.861572265625, 77.59428405761719, 99.89895629882812, -14.546585083007812, -12.71087646484375, 93.46171569824219, 16.864730834960938, -39.11016845703125, 45.336334228515625, 98.18202209472656, -8.723907470703125, 117.6637954711914, -36.3369140625, 100.17185974121094, 95.19671630859375, 15.31683349609375, 8.1875, -7.8387451171875, -9.559097290039062, 5.313867568969727, 104.30780029296875, 10.279762268066406, 20.172714233398438, 90.14690399169922, 140.3048095703125, 12.24688720703125, 87.65509033203125, 57.428192138671875, -110.06985473632812, 24.465866088867188, 25.050506591796875, 22.954002380371094, 2.252880096435547, 2.12420654296875, 87.73628234863281, -24.537643432617188, -6.990325927734375, 108.18734741210938, 7.3815765380859375, 17.841339111328125, -20.321823120117188, 67.02096557617188, 96.21011352539062, 41.04705810546875, 101.5870361328125, -5.8456878662109375, 13.535491943359375, -13.612838745117188, 15.65478515625, 1.6789169311523438, 34.56329345703125, -88.60420989990234, 42.14581298828125, 125.69078063964844, 17.59173583984375, -71.17938232421875, -50.021484375, 115.79605102539062, 76.58190155029297, 2.904510498046875, -1.13079833984375, 4.5964508056640625, 0.6361541748046875, 108.6171875, 27.200103759765625, 96.22196960449219, 32.39598083496094, -83.0657958984375, 103.43411254882812, -23.692733764648438, 112.19996643066406, 
6.02569580078125, -67.62429809570312, 1.9925460815429688, -2.5131072998046875, 102.40371704101562, 28.565948486328125, 3.3155746459960938, -9.979034423828125, 122.83270263671875, -79.58523559570312, 83.03450012207031, -90.569580078125, 20.80963134765625, -48.5750732421875, 107.30712890625, 114.06549072265625, -62.70344543457031, 6.6168975830078125, 127.49298095703125, 66.97730255126953, -48.32080078125, -110.58248138427734, -30.207977294921875, -7.6625213623046875, 86.9571533203125, -0.1939830780029297, 141.50967407226562, 53.763214111328125, -69.99482727050781, 59.57037353515625, 169.33401489257812, 4.306549072265625, 14.70733642578125, -95.63912963867188, 10.99664306640625, -22.046539306640625, 96.93331909179688, -46.350616455078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000241.npy"}
|
||||
{"epoch": 0.5047120418848168, "step": 242, "batch_size": 128, "mean": 37.26660919189453, "std": 53.80137634277344, "min": -116.91702270507812, "p10": -21.644346618652342, "median": 28.708322525024414, "p90": 111.49656524658204, "max": 144.00717163085938, "pos_frac": 0.7421875, "sample": [91.21987915039062, 7.921470642089844, 102.20343017578125, 10.42062759399414, 38.770660400390625, 12.509872436523438, 90.43975830078125, 67.59771728515625, -24.91375732421875, 10.057540893554688, 137.010498046875, 90.41204833984375, 53.464256286621094, -18.06866455078125, -93.1764907836914, 28.29999542236328, 19.642410278320312, 99.2021713256836, -116.91702270507812, 47.923824310302734, 12.145954132080078, 129.5617218017578, 119.99102783203125, 44.57605743408203, 52.59326171875, 133.88824462890625, -2.6699981689453125, 125.43609619140625, -1.2344818115234375, -8.847103118896484, 16.3990478515625, 98.1309814453125, -0.9642333984375, 123.47683715820312, 37.938568115234375, 139.10311889648438, 120.37345886230469, 4.18913459777832, -30.522178649902344, 1.6063385009765625, 90.71755981445312, -18.0933837890625, -1.341796875, 55.312103271484375, -21.629043579101562, -3.6990833282470703, 33.517303466796875, 50.1539306640625, 88.02763366699219, 2.3702239990234375, -8.98455810546875, -4.185939788818359, -30.115020751953125, 25.340133666992188, -7.549346923828125, -3.544656753540039, 100.67361450195312, 2.3689422607421875, -12.4385986328125, 87.92166137695312, 46.76800537109375, 0.289764404296875, -28.0792236328125, 118.03384399414062, 10.443634033203125, -3.0892562866210938, 125.1700439453125, 61.29229736328125, 23.313217163085938, 21.177108764648438, 90.7113037109375, 57.27496337890625, -21.6800537109375, 17.913063049316406, 82.114501953125, 52.12903594970703, 80.1547622680664, 21.824447631835938, 29.4825439453125, -49.624961853027344, 14.033004760742188, 80.76304626464844, 18.6058349609375, 65.32685852050781, -90.28695678710938, 14.568222045898438, 6.375585556030273, 45.072914123535156, 
40.055816650390625, 86.85552978515625, 144.00717163085938, 5.112701416015625, -13.17706298828125, 29.116649627685547, -2.950836181640625, 18.9296875, 98.07437133789062, 97.12451171875, 73.44200134277344, 26.374664306640625, 1.2763671875, 105.816162109375, -50.140865325927734, -15.075210571289062, 50.72456359863281, 12.091819763183594, 60.98179626464844, -54.532867431640625, 23.825042724609375, 65.25701904296875, 35.811248779296875, 8.701126098632812, 111.82472229003906, 123.51112365722656, 88.59475708007812, 80.4837646484375, 111.35592651367188, 40.173744201660156, -62.68504333496094, 89.57708740234375, -10.6947021484375, 55.37504577636719, 12.8321533203125, -46.827239990234375, 70.94373321533203, 94.19171142578125, -15.385711669921875, 95.06202697753906], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000242.npy"}
|
||||
{"epoch": 0.506806282722513, "step": 243, "batch_size": 128, "mean": 46.87254333496094, "std": 55.201385498046875, "min": -101.16911315917969, "p10": -15.342156982421873, "median": 46.892765045166016, "p90": 117.4966827392578, "max": 155.553466796875, "pos_frac": 0.7734375, "sample": [-72.71743774414062, -22.669281005859375, 32.182159423828125, 20.788421630859375, 77.14047241210938, 69.64138793945312, 4.88653564453125, 124.78472900390625, -69.71002197265625, 90.32152557373047, 25.81573486328125, 140.38800048828125, 20.326669692993164, -24.393592834472656, 55.3668212890625, -9.839645385742188, -68.76876831054688, 36.021728515625, 127.732177734375, -80.62825012207031, 42.22370147705078, 17.20953369140625, 7.2010498046875, 0.40192413330078125, 61.610450744628906, 110.3699951171875, 155.553466796875, 124.58377075195312, -3.9241943359375, 101.89129638671875, 10.832687377929688, 90.68266296386719, 63.57910919189453, 95.34829711914062, 101.0406494140625, 10.148284912109375, 116.59370422363281, 15.909515380859375, 85.16018676757812, 71.32150268554688, -13.264877319335938, 95.02371215820312, 53.58154296875, 73.02496337890625, 65.99732971191406, 93.81558227539062, 121.18896484375, -39.498779296875, 53.58807373046875, 65.18614196777344, 13.951263427734375, -101.16911315917969, -0.0225677490234375, 38.706390380859375, 16.70526123046875, 7.11578369140625, 74.58258819580078, 80.783447265625, 15.782562255859375, -5.79400634765625, 139.78878784179688, 101.95263671875, 86.17886352539062, -10.98553466796875, -7.41412353515625, 88.08750915527344, 31.569976806640625, 101.07644653320312, 85.98513793945312, -6.23565673828125, 80.13211059570312, -26.959320068359375, -11.69500732421875, 111.39527893066406, 82.74612426757812, 58.75274658203125, 28.353759765625, 62.182586669921875, -2.2772140502929688, 23.90277099609375, -9.903966903686523, 94.72931671142578, -25.775230407714844, 76.58999633789062, 8.3824462890625, 0.0, 3.4902305603027344, 84.19779968261719, 94.39751434326172, 
102.16043090820312, 53.06427764892578, 76.75448608398438, 94.31060791015625, -5.210731506347656, 110.66751098632812, 111.84799194335938, 119.05059814453125, 23.779815673828125, 4.2936248779296875, 35.535614013671875, 23.407958984375, 6.23309326171875, -13.730026245117188, 28.698028564453125, -16.7923583984375, 116.83071899414062, 131.0806884765625, 89.55741882324219, 2.4119644165039062, 30.322784423828125, 21.808494567871094, 10.970367431640625, 105.6289291381836, -14.72064208984375, -29.072723388671875, 150.5318603515625, 18.737194061279297, 108.62954711914062, -10.362548828125, 68.97471618652344, 38.863616943359375, 121.31971740722656, 145.63778686523438, 70.37085723876953, 132.3802032470703, 97.93853759765625, 51.56182861328125, -20.12359619140625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000243.npy"}
|
||||
{"epoch": 0.5089005235602094, "step": 244, "batch_size": 128, "mean": 46.347469329833984, "std": 60.193843841552734, "min": -128.14694213867188, "p10": -14.83706359863281, "median": 45.53034973144531, "p90": 120.29825134277344, "max": 201.32208251953125, "pos_frac": 0.7265625, "sample": [-128.14694213867188, -0.985321044921875, 100.15615844726562, -0.7109527587890625, 29.519302368164062, 12.993549346923828, 56.16943359375, -30.63238525390625, 135.08892822265625, 85.57788848876953, 99.8348388671875, 70.99600219726562, -4.3846282958984375, -5.8337554931640625, 10.035850524902344, 20.10467529296875, 81.462890625, 84.36653137207031, 84.30294799804688, 105.99542236328125, 20.08697509765625, 98.66504669189453, -2.6309890747070312, -4.694793701171875, 104.77886962890625, -1.68524169921875, -13.952911376953125, -59.996986389160156, 105.62930297851562, -23.974273681640625, 6.79754638671875, -16.90008544921875, -5.1969757080078125, -5.953224182128906, -96.42849731445312, 9.676399230957031, 133.96783447265625, 8.21588134765625, 91.165771484375, 104.92774963378906, 23.029541015625, 54.267608642578125, 124.56216430664062, 65.25630187988281, 130.997802734375, 113.01271057128906, -0.09130859375, 57.450592041015625, 32.60408020019531, 73.18544006347656, 110.9683837890625, 80.05413818359375, 0.0, 90.06634521484375, 22.46124267578125, -44.70745849609375, -52.56207275390625, 97.78482055664062, -8.918792724609375, -7.9006500244140625, 120.10089111328125, -36.30375671386719, 3.409576416015625, 78.72737121582031, 5.108970642089844, 119.82632446289062, 111.39248657226562, 127.43899536132812, 201.32208251953125, 18.9871826171875, 46.72444152832031, 46.69267272949219, -37.41387939453125, 111.02230834960938, 81.42251586914062, -0.32501220703125, 126.2421875, -7.811798095703125, 127.5491943359375, 82.63442993164062, 2.9197540283203125, 120.75875854492188, 10.509078979492188, -92.11827087402344, 28.170166015625, 80.2080078125, -3.51483154296875, 101.34950256347656, 96.20147705078125, 
9.2294921875, 4.8090667724609375, -1.39019775390625, 38.089263916015625, 67.3763427734375, 136.66107177734375, 119.18600463867188, 108.17619323730469, 4.158382415771484, 144.69613647460938, 40.08770751953125, 1.7519378662109375, 94.0802001953125, 44.36802673339844, 79.7935791015625, -11.97625732421875, 88.316650390625, 76.98518371582031, 18.994232177734375, 117.91930389404297, 61.60520935058594, -23.895660400390625, 52.2119140625, 3.3056716918945312, 123.29916381835938, 93.6453857421875, 145.64080810546875, -0.7017993927001953, 79.17156982421875, 83.8389892578125, -11.073974609375, 23.18444061279297, 14.9149169921875, 98.0677719116211, -104.29122924804688, 26.739349365234375, 82.31617736816406, 112.89303588867188, -0.867523193359375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000244.npy"}
|
||||
{"epoch": 0.5109947643979058, "step": 245, "batch_size": 128, "mean": 46.1944694519043, "std": 58.2772102355957, "min": -124.04991149902344, "p10": -21.954457855224604, "median": 49.287376403808594, "p90": 113.86020050048826, "max": 161.8094482421875, "pos_frac": 0.8046875, "sample": [17.531005859375, -14.374191284179688, 33.037078857421875, 24.843772888183594, 1.599822998046875, 25.11297607421875, 85.17581176757812, 4.6717529296875, -5.726715087890625, 51.151885986328125, 57.09471130371094, 27.81341552734375, 19.192703247070312, 134.05044555664062, 81.85948181152344, 15.59246826171875, 146.97109985351562, 138.046630859375, 56.42835998535156, 161.8094482421875, -124.04991149902344, -27.318206787109375, -61.26202392578125, -27.18121337890625, 30.421279907226562, 65.9268798828125, 0.0, 37.83673095703125, 76.30382537841797, -8.736663818359375, 89.40122985839844, 92.43991088867188, 90.83799743652344, 104.01263427734375, 3.427398681640625, 139.1412353515625, 3.1809539794921875, 29.792755126953125, 89.806640625, 72.85858154296875, 127.3494873046875, 14.3226318359375, 8.748397827148438, 97.56320190429688, 10.283935546875, 89.78280639648438, 28.370269775390625, 50.625457763671875, 40.202701568603516, -2.5233917236328125, -25.44622802734375, 53.626007080078125, 8.466278076171875, 108.15966796875, 59.497406005859375, -7.6062164306640625, 76.61650085449219, 94.33352661132812, 107.04690551757812, 111.13340759277344, 142.49200439453125, 7.114347457885742, 2.3794937133789062, 67.28894805908203, 47.94929504394531, 96.27845001220703, 41.680328369140625, -40.53462219238281, 56.062103271484375, -82.58209228515625, 62.76649475097656, 97.4496078491211, 69.22850036621094, 85.81082153320312, 101.4198226928711, 0.5308837890625, -64.18035888671875, 33.291500091552734, 157.32244873046875, 109.00607299804688, 5.473993301391602, 101.45438385009766, -53.773712158203125, 7.45953369140625, 12.495361328125, -54.21260070800781, 6.7764892578125, 110.63458251953125, 92.08233642578125, 
100.7440185546875, 82.79774475097656, 18.058380126953125, -6.332130432128906, 142.90060424804688, 53.837677001953125, 152.8709716796875, 13.951980590820312, 62.769622802734375, 52.18071746826172, 83.26688385009766, -88.54208374023438, 109.14195251464844, 73.75651550292969, -16.8458251953125, -4.25452995300293, 120.22271728515625, 31.88525390625, 78.24835205078125, 78.53392028808594, 99.02517700195312, 10.900543212890625, -15.120223999023438, 39.586944580078125, 65.96994018554688, 94.66146850585938, 18.86358642578125, 0.94366455078125, -20.457984924316406, 134.30908203125, 1.3811759948730469, 110.42776489257812, 43.636505126953125, 75.38580322265625, 159.4185791015625, -75.19052124023438, -43.60520935058594, 0.0, 65.12397766113281], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000245.npy"}
|
||||
{"epoch": 0.5130890052356021, "step": 246, "batch_size": 128, "mean": 28.698631286621094, "std": 62.05835723876953, "min": -119.92007446289062, "p10": -39.24865570068358, "median": 15.543556213378906, "p90": 119.73305053710938, "max": 156.40814208984375, "pos_frac": 0.671875, "sample": [22.9403076171875, 86.96148681640625, 80.08263397216797, -3.219207763671875, 77.0185546875, -7.311164855957031, 156.40814208984375, 9.850433349609375, 19.21106719970703, 52.08062744140625, -82.73104858398438, 74.612548828125, 12.564544677734375, 18.90892791748047, -8.993240356445312, -27.538406372070312, -12.782997131347656, 131.36880493164062, 131.99148559570312, -54.025543212890625, 111.31431579589844, 140.6103515625, 18.68829345703125, 92.36056518554688, 40.266387939453125, -7.2768402099609375, 8.614486694335938, -49.040992736816406, 25.350906372070312, -3.8446502685546875, 39.9852294921875, -32.217376708984375, 117.15740966796875, 56.07196807861328, 19.62921142578125, -7.340118408203125, -109.83442687988281, 117.3406982421875, 2.924957275390625, 1.2078495025634766, 13.37896728515625, -112.902587890625, -21.005462646484375, 119.59597778320312, -48.32499694824219, -20.8109130859375, -66.71197509765625, -20.239974975585938, 3.68243408203125, 81.39608764648438, 28.81549072265625, 138.8394775390625, -27.25677490234375, 35.125274658203125, 122.08615112304688, -15.6751708984375, 70.28656005859375, -22.973846435546875, -119.92007446289062, 11.461967468261719, 133.3857421875, 68.4205322265625, 63.523338317871094, 53.946380615234375, -31.473846435546875, 61.567962646484375, 133.17034912109375, 79.43017578125, -14.815567016601562, 136.16741943359375, 120.05288696289062, 28.0394287109375, -35.358795166015625, 3.03521728515625, 21.736328125, 4.475341796875, 102.12507629394531, 16.45855712890625, -28.746002197265625, -67.31858825683594, -69.32058715820312, 17.752777099609375, 92.04156494140625, 102.89755249023438, -27.759384155273438, 0.1222381591796875, -99.29586791992188, -65.4986572265625, 
12.60693359375, 7.608856201171875, 49.6513671875, 119.22430419921875, 70.80123901367188, -57.634063720703125, 19.113571166992188, 29.231163024902344, 0.2400054931640625, -10.710540771484375, 0.0, 111.30972290039062, 122.85890197753906, 99.81231689453125, -19.730010986328125, -9.474517822265625, 88.77743530273438, 0.695404052734375, 0.362457275390625, 85.57693481445312, 9.36953353881836, 65.84066009521484, 34.48268127441406, 5.827110290527344, 112.1708984375, 76.641357421875, 14.05316162109375, 122.67608642578125, -3.2214012145996094, -27.601150512695312, -21.66595458984375, -22.553848266601562, 126.7711181640625, 26.06536865234375, 0.4682464599609375, 91.43757629394531, 12.29766845703125, 14.628555297851562, -3.458587646484375, 111.89999389648438], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000246.npy"}
|
||||
{"epoch": 0.5151832460732985, "step": 247, "batch_size": 128, "mean": 39.65989685058594, "std": 56.6664924621582, "min": -124.20526123046875, "p10": -23.823731994628904, "median": 34.24006652832031, "p90": 117.09703826904297, "max": 176.06622314453125, "pos_frac": 0.7421875, "sample": [103.93705749511719, -34.2596435546875, 79.21907043457031, 64.83881378173828, -13.903549194335938, 46.867897033691406, 115.43023681640625, 117.77981567382812, 92.09208679199219, 67.18279266357422, 0.0, 91.13777923583984, 116.80857849121094, 17.61767578125, 8.277809143066406, 36.75628662109375, -23.19892120361328, 6.80126953125, -61.14300537109375, 80.71528625488281, -18.0477294921875, 117.77011108398438, -22.771682739257812, -30.63347625732422, 58.152130126953125, 89.41204833984375, 44.638519287109375, -0.4685516357421875, 94.50961303710938, 116.5045166015625, 94.18072509765625, 107.82357025146484, 19.944717407226562, 141.3260498046875, 73.0799560546875, -26.56451416015625, 3.77313232421875, 131.64508056640625, -6.3221435546875, 6.7510223388671875, -10.800529479980469, -21.31957244873047, 28.485137939453125, -7.519256591796875, 18.962554931640625, 73.49658203125, -2.120624542236328, 35.980316162109375, 3.4753341674804688, -72.23147583007812, 3.7148361206054688, 60.599212646484375, 58.92021942138672, 64.77603149414062, -35.7398681640625, 104.27742767333984, 147.74072265625, 17.39910888671875, 0.7004852294921875, -14.267974853515625, -16.36937713623047, -7.298606872558594, -51.829345703125, 127.780517578125, 1.4713134765625, -0.0280914306640625, 100.70906066894531, -83.96865844726562, 20.27777099609375, -24.887649536132812, 53.183685302734375, 158.80429077148438, -124.20526123046875, 0.6944427490234375, 81.77557373046875, 13.198478698730469, 77.63292694091797, 8.906097412109375, 32.816986083984375, 91.9422607421875, 102.17681884765625, 87.87022399902344, 176.06622314453125, 39.505767822265625, -23.367767333984375, -6.1753387451171875, 3.3994598388671875, 64.80265808105469, 
93.50047302246094, 25.649322509765625, 26.63189697265625, 85.41831970214844, 56.6136474609375, 23.510498046875, 16.1864013671875, 93.68812561035156, 125.16055297851562, 120.60820007324219, 135.23004150390625, -15.4146728515625, 41.85731506347656, 37.99677276611328, 62.93119812011719, 42.184539794921875, 88.42184448242188, 32.9449462890625, 27.76068115234375, -29.21435546875, 126.73052978515625, 8.106109619140625, -96.1119384765625, 92.93312072753906, 57.42607879638672, -25.49066162109375, 23.509057998657227, 30.652883529663086, 35.6134033203125, -9.122665405273438, 35.535186767578125, -6.48626708984375, 77.91244506835938, 39.415618896484375, 123.42221069335938, 2.956380844116211, 97.42007446289062, 9.250015258789062, 25.305999755859375, 68.74966430664062], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000247.npy"}
|
||||
{"epoch": 0.5172774869109947, "step": 248, "batch_size": 128, "mean": 30.958358764648438, "std": 61.01762771606445, "min": -167.70858764648438, "p10": -36.844837951660146, "median": 23.116867065429688, "p90": 115.22391204833984, "max": 172.2529296875, "pos_frac": 0.75, "sample": [12.272613525390625, 53.320465087890625, -14.242813110351562, 16.898277282714844, 137.64059448242188, -19.51007080078125, 22.157546997070312, 4.519012451171875, 74.90081787109375, 13.274810791015625, 26.040420532226562, 53.5054931640625, 74.0628662109375, 15.500534057617188, 0.6933441162109375, 33.90142822265625, 131.01620483398438, 25.570663452148438, 3.4803009033203125, 83.33013916015625, -19.527450561523438, 24.076187133789062, 85.90320587158203, 49.99852752685547, 128.3154296875, 35.416717529296875, 18.104999542236328, 21.045455932617188, 2.10247802734375, 71.84268188476562, 29.64385986328125, -81.02600860595703, 136.90386962890625, -16.205337524414062, 1.1322917938232422, 36.067047119140625, 118.59251403808594, -167.70858764648438, 38.87535095214844, -5.070695877075195, -27.2548828125, 63.056007385253906, 60.369117736816406, -25.636627197265625, -10.743213653564453, 32.251129150390625, -0.6340560913085938, -112.10061645507812, 60.37290954589844, 13.618072509765625, 81.15859985351562, 164.35267639160156, 33.418853759765625, 21.863765716552734, 152.8651123046875, -34.341461181640625, 112.6912841796875, 0.2983226776123047, 5.721343994140625, 127.75350952148438, 49.332366943359375, 93.51093292236328, -28.103424072265625, -76.25106811523438, 16.351898193359375, 8.802947998046875, 171.7120361328125, 111.22320556640625, 76.37165832519531, 35.12591552734375, 79.63821411132812, -52.26123046875, 0.22524452209472656, 85.5811767578125, 1.1102294921875, -92.50726318359375, 0.0, 30.2591552734375, 172.2529296875, 114.9500732421875, 102.156982421875, 70.664306640625, 74.54472351074219, 40.16802978515625, 8.992809295654297, 42.63694763183594, -24.53009033203125, -51.7542724609375, -8.629676818847656, 
10.354804992675781, -42.68605041503906, 17.50494384765625, 116.0859375, 96.02365112304688, 110.94674682617188, -83.36138916015625, 47.374664306640625, 29.076377868652344, 7.937232971191406, 100.44850158691406, 1.3331985473632812, -49.264923095703125, 35.01007080078125, -18.515380859375, 115.86286926269531, 27.80706787109375, 84.44253540039062, 7.8501129150390625, 10.6197509765625, -99.51612854003906, 17.573638916015625, 18.94659423828125, 18.160934448242188, -2.2222137451171875, 143.83322143554688, 79.518798828125, 67.9632568359375, 59.25401306152344, -2.716766357421875, -74.41615295410156, -14.814010620117188, -4.894584655761719, 24.122909545898438, 47.561248779296875, 83.08612060546875, -71.26483154296875, 1.560476303100586, 18.612884521484375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000248.npy"}
|
||||
{"epoch": 0.5193717277486911, "step": 249, "batch_size": 128, "mean": 36.0838508605957, "std": 59.41815185546875, "min": -150.06854248046875, "p10": -26.781445312499997, "median": 24.996932983398438, "p90": 116.39083862304688, "max": 164.164306640625, "pos_frac": 0.7109375, "sample": [-11.205810546875, 7.922267913818359, -69.8099365234375, -75.4471435546875, 24.6036376953125, 4.367607116699219, -16.742385864257812, -13.754409790039062, -2.34307861328125, 22.16314697265625, -3.1984329223632812, -18.52698516845703, 93.27615356445312, 116.70123291015625, -10.691390991210938, 61.031585693359375, 30.952728271484375, 119.14471435546875, 28.4017333984375, 79.16571044921875, 12.424301147460938, -126.67153930664062, 106.26053619384766, 17.3072509765625, -43.86871337890625, 141.52691650390625, 120.2474365234375, -14.77435302734375, 56.6458740234375, 80.67538452148438, 17.93128204345703, -9.933502197265625, 39.341400146484375, -14.753150939941406, 5.48577880859375, 65.20585632324219, 123.57350158691406, 55.269317626953125, 132.73968505859375, -28.152435302734375, 109.35603332519531, 99.9146728515625, -11.243408203125, 107.57203674316406, 145.88784790039062, 55.091796875, -90.99528503417969, 97.087158203125, 57.986968994140625, 4.586517333984375, -11.90679931640625, 42.709556579589844, 159.19610595703125, 75.93253326416016, 70.8265380859375, -2.175445556640625, 16.849655151367188, 64.45279693603516, -28.8416748046875, 86.005615234375, 36.51513671875, 0.63372802734375, -1.123260498046875, 3.8907470703125, 98.71319580078125, 9.318580627441406, 88.62734985351562, 2.183124542236328, 78.63153076171875, 90.21397399902344, -21.936843872070312, -68.93246459960938, 25.390228271484375, 54.741485595703125, 54.61079406738281, 81.91952514648438, 164.164306640625, 134.29898071289062, 104.94258117675781, 96.940673828125, 141.37680053710938, 75.4059829711914, 19.279930114746094, -34.656036376953125, 92.36968994140625, 131.76315307617188, -7.273609161376953, -22.5751953125, 9.108806610107422, 
89.3046875, 14.149673461914062, -49.9443359375, 78.6827392578125, 73.19927978515625, 24.246047973632812, 92.27694702148438, 37.12921142578125, 34.1414794921875, -5.7281341552734375, -41.176544189453125, 116.2578125, 1.053497314453125, 23.459545135498047, 75.08709716796875, 14.88446044921875, 5.26702880859375, -5.580848693847656, -150.06854248046875, -34.209197998046875, -18.52948760986328, -26.193878173828125, 9.675451278686523, 51.7576904296875, 1.5079803466796875, 19.0906982421875, -4.860870361328125, 79.289794921875, 43.09539794921875, 100.00523376464844, 4.093017578125, 106.81097412109375, 73.51382446289062, 32.097652435302734, -19.84832000732422, -3.9168243408203125, 46.820281982421875, 12.191320419311523, 130.3732147216797], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000249.npy"}
|
||||
{"epoch": 0.5214659685863874, "step": 250, "batch_size": 128, "mean": 36.55620574951172, "std": 64.95527648925781, "min": -111.924560546875, "p10": -37.32461853027343, "median": 28.74789047241211, "p90": 122.809700012207, "max": 171.22523498535156, "pos_frac": 0.703125, "sample": [78.02642822265625, 10.981193542480469, -54.722686767578125, 113.658203125, 90.16946411132812, 164.80462646484375, 147.72402954101562, 85.54606628417969, 40.47674560546875, -88.82089233398438, 85.22476196289062, 23.796113967895508, 99.71331787109375, 29.30389404296875, 21.9503173828125, 131.40252685546875, 89.33596801757812, 17.84814453125, 129.75762939453125, 98.873291015625, 120.23970031738281, 51.13629150390625, 7.57806396484375, 92.88812255859375, -30.264404296875, 3.136322021484375, -111.924560546875, 21.419647216796875, -5.813720703125, -1.18731689453125, 130.90289306640625, -98.821044921875, 45.77203369140625, 156.70672607421875, -11.197029113769531, 9.90789794921875, 5.375244140625, 1.922454833984375, 129.077392578125, -92.42788696289062, -2.1304759979248047, -1.572418212890625, 117.8460693359375, 28.19188690185547, 86.14376831054688, 72.55949401855469, -99.47378540039062, -78.9942626953125, -1.5915470123291016, 81.42179870605469, 88.80812072753906, 9.007736206054688, 101.6064453125, -35.5064697265625, 35.126495361328125, 15.29547119140625, -6.6800079345703125, -31.77862548828125, 76.5303726196289, -0.476959228515625, 109.13246154785156, 129.64556884765625, 92.10746765136719, -1.1756954193115234, 26.505287170410156, 40.32940673828125, 149.38711547851562, 96.67460632324219, -6.503150939941406, 41.813201904296875, 83.48748016357422, 94.77348327636719, -39.496673583984375, 114.97137451171875, 34.8887939453125, -14.653427124023438, 34.61114501953125, 107.30833435058594, 128.80636596679688, 21.267349243164062, 11.900123596191406, -22.8514404296875, 45.669677734375, -24.70623779296875, 109.2176513671875, 45.445804595947266, -0.9508514404296875, -106.408447265625, 54.917266845703125, 
130.71990966796875, -6.7841796875, 171.22523498535156, -66.74650573730469, -48.040306091308594, -102.84954833984375, 1.0249786376953125, 10.099105834960938, -36.39373779296875, -90.3585205078125, 98.20901489257812, 73.93148803710938, 110.58121490478516, 95.41363525390625, 78.86984252929688, 88.27598571777344, 59.800697326660156, -32.36041259765625, 39.816558837890625, 17.24346923828125, -3.5895328521728516, 6.00885009765625, 68.20298767089844, 145.1700897216797, -7.8561859130859375, 40.957061767578125, 35.493080139160156, 98.70637512207031, 6.8974151611328125, 1.279052734375, -0.7810211181640625, -25.087318420410156, 88.44692993164062, 43.307952880859375, 3.0039520263671875, 7.8048095703125, 23.728057861328125, 18.038406372070312, -12.137836456298828], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000250.npy"}
|
||||
{"epoch": 0.5235602094240838, "step": 251, "batch_size": 128, "mean": 36.841796875, "std": 59.545291900634766, "min": -106.74317932128906, "p10": -25.441058349609374, "median": 24.143394470214844, "p90": 115.49701843261718, "max": 223.479736328125, "pos_frac": 0.7578125, "sample": [83.30975341796875, 40.801666259765625, 21.622100830078125, -15.229217529296875, -31.302703857421875, -43.257904052734375, -52.0220947265625, 43.5093994140625, 10.2421875, 18.958526611328125, 4.116615295410156, 62.08258056640625, 106.91265869140625, 84.6782455444336, 4.526786804199219, -3.8736610412597656, 34.611541748046875, -15.773956298828125, 5.747407913208008, 142.741455078125, 0.5536956787109375, 2.747283935546875, -26.1661376953125, 110.34246826171875, 83.994384765625, 121.92294311523438, -57.04144287109375, 105.36894226074219, 125.01287841796875, 23.424911499023438, 107.70892333984375, 4.138404846191406, -21.967613220214844, 42.002685546875, 30.313720703125, -21.86044692993164, 54.453369140625, 0.021284103393554688, 12.555389404296875, -22.24707794189453, 88.69622802734375, 59.79510498046875, 101.97576904296875, 122.81918334960938, 120.85490417480469, 37.46697998046875, -2.376220703125, -54.09501647949219, 3.1228771209716797, 5.9705047607421875, -32.41254425048828, 14.50006103515625, 223.479736328125, 24.86187744140625, 77.16647338867188, 94.85358428955078, 26.75531005859375, 108.30386352539062, 3.374603271484375, 44.094810485839844, 90.087890625, -106.74317932128906, 18.894222259521484, 83.7613525390625, 63.42601013183594, -77.4395751953125, 8.123794555664062, -4.246620178222656, 23.2607421875, -3.3868236541748047, 104.32876586914062, 76.62857055664062, 12.439498901367188, 4.407733917236328, 51.542144775390625, -49.09126281738281, 4.646766662597656, 29.51446533203125, 19.355323791503906, 77.72166442871094, -0.004547119140625, 41.080535888671875, 2.15444278717041, -3.2265167236328125, 86.78233337402344, 100.8847427368164, 116.221923828125, -87.3135986328125, -1.669952392578125, 
2.567169189453125, 65.0177001953125, 115.42852783203125, 102.5594482421875, 16.419647216796875, 142.79129028320312, -5.552324295043945, 184.97552490234375, 115.65682983398438, 93.5940933227539, -4.905490875244141, 83.40072631835938, 32.432159423828125, 65.9771728515625, 50.216796875, 55.076385498046875, 31.39813232421875, 111.83966064453125, 3.2598800659179688, 85.06005859375, -104.20730590820312, 0.783477783203125, 28.11279296875, 11.6783447265625, 0.03897857666015625, 4.86273193359375, -79.27580261230469, -25.13031005859375, -20.570877075195312, 4.3657989501953125, 143.57275390625, 104.27597045898438, -13.660415649414062, 123.62312316894531, 133.39694213867188, -13.5341796875, 4.964580535888672, 75.72367858886719, 60.487060546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000251.npy"}
|
||||
{"epoch": 0.5256544502617801, "step": 252, "batch_size": 128, "mean": 38.995628356933594, "std": 67.57903289794922, "min": -139.06082153320312, "p10": -35.11401062011718, "median": 29.188194274902344, "p90": 126.4396743774414, "max": 200.1065673828125, "pos_frac": 0.703125, "sample": [193.3641357421875, 131.02236938476562, 129.64190673828125, 82.41865539550781, -69.80398559570312, 18.93142318725586, -68.90281677246094, 8.020095825195312, 2.4542770385742188, -15.56500244140625, -5.5807952880859375, -27.415084838867188, 13.5711669921875, 30.593658447265625, 200.1065673828125, 38.7069091796875, 119.91532897949219, 86.62467193603516, 107.13893127441406, 108.535400390625, 1.3720054626464844, 87.72344970703125, -18.76080322265625, 126.09101867675781, -139.06082153320312, 4.941650390625, 4.87774658203125, 59.12017059326172, -33.1041259765625, 48.018585205078125, 1.288909912109375, 120.897705078125, 112.05096435546875, 98.58474731445312, 78.9633560180664, 13.89337158203125, -52.5592041015625, -8.376300811767578, 7.531219482421875, 195.14459228515625, 48.6207275390625, 3.5562515258789062, 77.06465148925781, 27.592727661132812, 16.68829345703125, 16.82476806640625, 27.782730102539062, -6.943634033203125, -2.33892822265625, 12.396781921386719, 109.05340576171875, 5.15032958984375, 127.25320434570312, 85.08594512939453, -20.63916015625, -4.591796875, 52.30937194824219, -89.87322998046875, 37.312042236328125, 6.1691436767578125, 85.05133056640625, -28.109222412109375, 33.06512451171875, 138.16949462890625, 46.85795593261719, 77.57267761230469, 114.68304443359375, 36.3614501953125, 33.5184326171875, 116.36929321289062, -8.993927001953125, 100.64762878417969, 136.210205078125, -7.113990783691406, -57.64069366455078, 109.19070434570312, -39.803741455078125, 7.664482116699219, 69.42498779296875, 91.81291198730469, 81.95640563964844, 2.43719482421875, 36.20947265625, 95.84735870361328, 25.58349609375, -1.98724365234375, 0.0, 89.91600036621094, -99.46591186523438, 108.2804946899414, 
-0.003753662109375, 99.17288208007812, -12.836700439453125, -21.42840576171875, 36.8304443359375, 132.87307739257812, -1.6121063232421875, 169.02133178710938, 44.3668212890625, 114.31904602050781, -4.30902099609375, 64.96147918701172, 67.65292358398438, 5.4235992431640625, 127.286376953125, -0.995819091796875, 96.64712524414062, -115.60830688476562, 19.83392333984375, 1.3318328857421875, -20.48828125, -54.85528564453125, 37.37896728515625, -108.64642333984375, -102.00177001953125, -1.568572998046875, 97.0992431640625, 129.3929443359375, -50.18194580078125, 103.96365356445312, 18.89093017578125, 141.47994995117188, 77.06039428710938, -32.92425537109375, -2.7330322265625, 105.94993591308594, 14.632755279541016, 101.45918273925781], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000252.npy"}
|
||||
{"epoch": 0.5277486910994764, "step": 253, "batch_size": 128, "mean": 30.262083053588867, "std": 62.9678840637207, "min": -135.26431274414062, "p10": -48.84879760742187, "median": 16.83154296875, "p90": 118.94188385009765, "max": 150.97344970703125, "pos_frac": 0.7265625, "sample": [143.07577514648438, -5.8472900390625, 12.271224975585938, 47.819252014160156, 17.66375732421875, 20.605697631835938, 1.8210315704345703, 140.09519958496094, 148.9930419921875, 26.11669921875, 38.5546875, 118.94499206542969, 6.230316162109375, -47.823699951171875, 57.18890380859375, -98.66973876953125, 19.56487274169922, -63.376548767089844, 103.02423095703125, -36.721588134765625, 14.501007080078125, -16.185882568359375, 24.98225212097168, 12.38808822631836, 2.275390625, -31.80157470703125, 32.201141357421875, 124.39163208007812, 97.72608184814453, 85.7822265625, -0.7708854675292969, -12.979206085205078, -102.4908447265625, 31.556915283203125, 150.97344970703125, 6.570281982421875, -14.406448364257812, 108.61346435546875, -135.26431274414062, -70.2239990234375, 112.59710693359375, 42.151123046875, 120.02609252929688, 22.260269165039062, 17.742767333984375, -42.74794006347656, 106.41511535644531, 4.412200927734375, -9.0076904296875, 15.99932861328125, 116.23858642578125, 35.445281982421875, 116.14370727539062, 18.217422485351562, -66.24180603027344, 107.9384765625, 14.656723022460938, 9.875221252441406, -37.001800537109375, 83.20367431640625, 72.35830688476562, -26.438079833984375, -6.0033721923828125, 99.81412506103516, 19.23907470703125, 109.54794311523438, -53.1439208984375, -8.718944549560547, 56.43226623535156, -54.83062744140625, 19.08135986328125, -12.719390869140625, -103.1684341430664, 5.630615234375, -104.82308959960938, 10.960870742797852, -87.28875732421875, 4.126422882080078, -2.5334854125976562, -37.368255615234375, 65.44940185546875, -0.21875762939453125, 77.446533203125, 89.60798645019531, 37.14857482910156, -0.727264404296875, 27.369789123535156, 7.23333740234375, 
-51.240692138671875, 135.92868041992188, -54.55445861816406, 7.989341735839844, 26.27880859375, 2.081298828125, 7.57318115234375, 106.11192321777344, -18.578285217285156, 145.185302734375, 8.109092712402344, 85.74484252929688, 89.90162658691406, 12.454490661621094, -31.32830810546875, 3.079833984375, 34.569366455078125, 118.9405517578125, 12.671829223632812, 123.28170776367188, 13.041366577148438, -2.7917938232421875, 27.313720703125, 128.7249755859375, 93.73871612548828, 2.7269325256347656, 90.28396606445312, 12.946090698242188, 9.49615478515625, 12.403121948242188, 89.45025634765625, 147.86041259765625, 7.200614929199219, 41.31072998046875, 135.98373413085938, 93.78050231933594, 10.2139892578125, 90.03976440429688, 55.78533935546875, 100.68014526367188], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000253.npy"}
|
||||
{"epoch": 0.5298429319371728, "step": 254, "batch_size": 128, "mean": 39.02460479736328, "std": 57.6854133605957, "min": -134.685302734375, "p10": -22.64782257080078, "median": 30.978858947753906, "p90": 118.86912536621092, "max": 147.1891326904297, "pos_frac": 0.7578125, "sample": [139.24050903320312, 0.0232696533203125, -85.67071533203125, 29.904685974121094, -10.995307922363281, -3.5635414123535156, 17.10186767578125, 19.79790496826172, -0.43540191650390625, 19.932952880859375, -33.686744689941406, -18.579559326171875, -29.437255859375, -3.331268310546875, 7.871185302734375, 121.58309936523438, 3.54791259765625, 93.41610717773438, 46.12579345703125, -31.495208740234375, -35.762847900390625, 124.87396240234375, 45.35552978515625, 11.372314453125, -0.99957275390625, -1.5490875244140625, 100.20083618164062, -8.146453857421875, 93.81887817382812, 62.7652587890625, 99.0147705078125, 133.27496337890625, -79.3636474609375, 75.85159301757812, 0.9552021026611328, 1.0409088134765625, -17.718551635742188, 33.27338790893555, 24.319778442382812, 11.6168212890625, 60.836273193359375, 50.898040771484375, 83.67620849609375, 38.50416564941406, -49.486392974853516, 59.07615661621094, 35.72119140625, 44.41387939453125, 61.0458984375, 96.03990173339844, 56.81439208984375, 3.5796947479248047, 18.60158920288086, 111.66363525390625, 80.2275390625, 31.179168701171875, 100.7431640625, -22.401458740234375, 55.327423095703125, 63.214691162109375, 5.198066711425781, 10.326950073242188, 131.4970703125, 7.361106872558594, 36.87760925292969, 83.9746322631836, 94.5926513671875, 111.979248046875, 117.70599365234375, 31.003067016601562, 14.997093200683594, 47.831146240234375, 126.85693359375, 4.8321685791015625, 45.88580322265625, -23.222671508789062, -3.90655517578125, 30.95465087890625, 3.3895034790039062, 9.861610412597656, -4.48468017578125, -10.095085144042969, 126.62890625, 112.30703735351562, 81.1513671875, 135.17996215820312, 123.6488037109375, 100.94927978515625, 16.18487548828125, 
108.00204467773438, 21.456920623779297, 110.95768737792969, 111.30145263671875, 94.96485900878906, 104.15499114990234, 25.536346435546875, 0.3180408477783203, 111.63378143310547, -4.6858367919921875, -5.7275848388671875, -10.33087158203125, 45.309600830078125, 57.43928527832031, -2.402393341064453, 0.0, -30.140243530273438, 122.3009033203125, 69.65107727050781, 34.32305908203125, 23.276031494140625, 13.781845092773438, 22.513572692871094, 22.798057556152344, 114.12063598632812, 78.45437622070312, 1.9544677734375, -134.685302734375, -70.75537872314453, 137.88140869140625, 22.004547119140625, -103.540771484375, -86.43206787109375, 4.966030120849609, 147.1891326904297, 146.3888397216797, 34.961181640625, 110.90408325195312, 40.61991882324219], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000254.npy"}
|
||||
{"epoch": 0.5319371727748691, "step": 255, "batch_size": 128, "mean": 44.2780876159668, "std": 66.62220764160156, "min": -111.88505554199219, "p10": -26.232209777832026, "median": 32.84490203857422, "p90": 127.71985473632812, "max": 178.67410278320312, "pos_frac": 0.7734375, "sample": [7.4855499267578125, -110.57513427734375, 135.5032958984375, 47.613006591796875, 27.29339599609375, -88.28155517578125, 0.304412841796875, 120.21974182128906, -10.22576904296875, 104.81336212158203, -8.1593017578125, 9.707107543945312, 17.566757202148438, 8.3009033203125, 143.96865844726562, 123.267333984375, 31.202613830566406, 13.728424072265625, 140.09146118164062, 11.167724609375, 110.71633911132812, -82.69462585449219, 1.36962890625, 97.52885437011719, 40.156951904296875, 127.43760681152344, 34.46241760253906, 103.09970092773438, -20.82440185546875, 24.340240478515625, 52.9903564453125, 9.016387939453125, 22.27655792236328, 13.97198486328125, 66.08995056152344, 34.3309326171875, 119.7159423828125, 104.84072875976562, -24.899078369140625, -68.12681579589844, 178.67410278320312, 150.0281982421875, 128.37843322753906, 27.604248046875, -86.74649810791016, -13.327163696289062, -96.62442016601562, 108.2283935546875, 91.6035385131836, 91.13931274414062, 115.41903686523438, 120.71859741210938, 86.7362060546875, 86.66632080078125, 144.25733947753906, -4.679573059082031, 73.9351806640625, -86.76931762695312, -29.342849731445312, 110.69999694824219, 24.346481323242188, 114.90911865234375, 107.13381958007812, 97.3653564453125, 46.6473388671875, 106.73562622070312, 6.438667297363281, -100.07634735107422, 124.49064636230469, 17.1761474609375, 143.61495971679688, 41.955780029296875, 50.47822952270508, -4.209747314453125, 4.10614013671875, 5.155827522277832, 35.37030029296875, 14.822586059570312, 8.368499755859375, 98.22643280029297, -6.41741943359375, 92.444091796875, -111.88505554199219, 142.09918212890625, 123.75509643554688, 97.01631164550781, -8.70343017578125, 67.97046661376953, 
15.633224487304688, 17.84765625, 126.09640502929688, 54.0167236328125, 75.85848236083984, 124.03675079345703, -17.8369140625, 146.3472900390625, 110.95013427734375, 0.4569549560546875, 7.231597900390625, -13.205940246582031, 3.2812862396240234, -37.379058837890625, 4.6871490478515625, 4.6552886962890625, 135.4530029296875, 83.0045166015625, 7.154022216796875, 102.2965087890625, -15.667304992675781, 44.11029052734375, 10.5091552734375, 9.007286071777344, -12.121978759765625, 114.18728637695312, 17.441314697265625, 31.358871459960938, -11.902877807617188, -11.868316650390625, 105.31146240234375, 52.2913818359375, 35.19956970214844, 109.04210662841797, -5.7361602783203125, 142.80029296875, -54.3948974609375, 157.75677490234375, 24.52801513671875, -51.567848205566406], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000255.npy"}
|
||||
{"epoch": 0.5340314136125655, "step": 256, "batch_size": 128, "mean": 43.44810485839844, "std": 70.85692596435547, "min": -130.38409423828125, "p10": -58.931157684326166, "median": 53.605106353759766, "p90": 126.14222717285156, "max": 162.4661865234375, "pos_frac": 0.7109375, "sample": [32.91064453125, 103.06008911132812, 100.41888427734375, -36.90118408203125, 89.02397155761719, 86.49038696289062, 119.1175537109375, 97.50845336914062, 112.2427978515625, 2.70709228515625, 99.62397766113281, -28.161819458007812, -33.49041748046875, -96.95955657958984, 40.72389221191406, 101.32548522949219, 135.93606567382812, -18.922515869140625, 75.16767883300781, 58.002655029296875, 0.0216064453125, 108.47920227050781, -82.4530029296875, 162.4661865234375, 19.2171630859375, 123.09785461425781, 107.57083129882812, -61.1827392578125, 138.5635986328125, 20.526611328125, 133.998779296875, -11.208232879638672, -1.19244384765625, 52.585784912109375, 70.82098388671875, 10.4193115234375, -48.099456787109375, -63.63885498046875, 120.34237670898438, 53.58454132080078, 138.30084228515625, 17.981292724609375, 20.636520385742188, 102.04376220703125, 55.62998962402344, 50.75982666015625, -66.24205780029297, -21.71971893310547, -26.229232788085938, 151.3958740234375, 4.323661804199219, -57.96619415283203, 128.61215209960938, 135.79058837890625, -24.328582763671875, 2.3658523559570312, -130.38409423828125, -33.35198974609375, 110.38396453857422, 55.0147705078125, 96.62385559082031, 18.664031982421875, 103.05084228515625, 135.4482421875, -2.9705944061279297, 91.14773559570312, 108.833740234375, -102.88436126708984, 100.04669189453125, -6.39398193359375, -94.60494995117188, -92.45751953125, 126.0611572265625, 37.108612060546875, 42.953643798828125, 108.92413330078125, -17.959014892578125, 115.75390625, 26.95782470703125, 137.38430786132812, 145.59780883789062, 83.4652328491211, 95.66246795654297, -4.1454925537109375, 67.9361572265625, 64.14739990234375, -95.38059997558594, 46.267086029052734, 
66.331787109375, 126.33139038085938, -25.68243408203125, 118.03460693359375, 91.96598815917969, 124.669921875, 14.4674072265625, 100.9570541381836, 92.81275939941406, -62.2327880859375, 17.830650329589844, 26.8660888671875, 39.264434814453125, 56.33202362060547, 125.2930908203125, 77.72210693359375, 88.379638671875, 53.62567138671875, 8.754806518554688, 119.32794189453125, -44.842506408691406, -129.79086303710938, 4.181365966796875, 125.01719665527344, 128.56243896484375, 14.781631469726562, -10.27313232421875, -0.619873046875, 124.67428588867188, -36.58795166015625, -28.79627227783203, -100.8155517578125, 94.74790954589844, -6.82635498046875, -5.891548156738281, 62.384552001953125, 21.5377197265625, 77.847900390625, 65.62713623046875, 103.3870849609375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000256.npy"}
|
||||
{"epoch": 0.5361256544502618, "step": 257, "batch_size": 128, "mean": 42.80219268798828, "std": 63.005611419677734, "min": -118.75373840332031, "p10": -28.068722534179685, "median": 29.133075714111328, "p90": 125.3571533203125, "max": 183.98260498046875, "pos_frac": 0.6796875, "sample": [3.1602935791015625, 21.52764892578125, 172.1189422607422, -8.658809661865234, -1.571075439453125, 183.98260498046875, 24.469757080078125, 122.78929138183594, 100.28369140625, 106.41551971435547, 10.25885009765625, 140.7838134765625, 84.45547485351562, 86.1744384765625, -1.7702655792236328, 28.52332305908203, 57.13746643066406, -17.1981201171875, -25.57703399658203, 133.80545043945312, 122.369873046875, -39.18769073486328, 84.83548736572266, 157.95675659179688, -17.63714599609375, -46.1026611328125, 27.702880859375, 22.951889038085938, -4.8060302734375, -17.492002487182617, -58.67632293701172, -30.0533447265625, 104.87066650390625, 25.27581787109375, 88.6832275390625, 122.34454345703125, 178.04852294921875, 2.5264892578125, 134.3331298828125, 92.6276626586914, -20.22979736328125, 7.8373260498046875, -21.26024627685547, 101.93170166015625, 117.92379760742188, -38.302276611328125, -2.0022354125976562, 41.9376220703125, 70.95449829101562, -55.061737060546875, 23.573394775390625, 157.47503662109375, 73.41718292236328, 52.35877990722656, 79.8231201171875, 17.08355712890625, 87.96231079101562, -23.592605590820312, 72.70703125, 68.01654052734375, 61.94792175292969, 127.60720825195312, 99.49856567382812, 36.523658752441406, -52.205482482910156, -12.306198120117188, 104.57839965820312, 2.7863101959228516, 59.39060974121094, 112.71615600585938, -0.1735687255859375, 100.69033813476562, 74.10042572021484, 39.21728515625, 12.684514999389648, 39.2059326171875, -118.75373840332031, -0.455169677734375, 56.72956848144531, 122.9312744140625, 35.962646484375, 3.910646438598633, 79.56671142578125, 27.69158935546875, 22.186817169189453, -15.149200439453125, -13.74652099609375, 117.71893310546875, 
70.645263671875, -74.50323486328125, -16.291259765625, 156.07965087890625, -30.453155517578125, 125.2489013671875, 62.000301361083984, 105.67343139648438, 146.4049072265625, 46.651580810546875, -16.119430541992188, 75.0369644165039, 7.731964111328125, 12.772476196289062, 111.385986328125, 32.712646484375, -3.1662445068359375, -14.318878173828125, -5.117401123046875, -35.37908935546875, 23.683135986328125, 125.6097412109375, 24.976470947265625, 116.04220581054688, -19.625457763671875, 165.15402221679688, -41.07757568359375, -11.686553955078125, 69.180908203125, -27.218170166015625, 29.742828369140625, 34.728271484375, 25.843612670898438, -5.489173889160156, -11.106842041015625, 82.77630615234375, 6.653511047363281, -12.3592529296875, -70.88529968261719, 107.65116882324219], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000257.npy"}
|
||||
{"epoch": 0.5382198952879581, "step": 258, "batch_size": 128, "mean": 47.140220642089844, "std": 60.32489776611328, "min": -141.17303466796875, "p10": -23.87011260986328, "median": 49.28583526611328, "p90": 121.990185546875, "max": 155.14559936523438, "pos_frac": 0.7578125, "sample": [103.43911743164062, 28.512788772583008, 17.341636657714844, 7.058235168457031, 101.85469055175781, -32.349700927734375, 143.12393188476562, -23.70257568359375, 26.488990783691406, 113.37921142578125, -43.116851806640625, 35.71270751953125, 137.3193359375, 69.135009765625, 118.28463745117188, 74.99198913574219, -24.261032104492188, 68.76885986328125, 57.845947265625, 93.91798400878906, 124.2906494140625, 94.91351318359375, -35.02079772949219, 9.76263427734375, 154.902587890625, -39.70603942871094, -141.17303466796875, 113.68717956542969, 129.2257080078125, 28.000701904296875, 100.64129638671875, 56.11505126953125, -29.43060302734375, 72.2806396484375, 43.5550537109375, 36.875701904296875, 3.704700469970703, -25.956878662109375, 58.62309265136719, 101.76719665527344, 66.55035400390625, 84.44136047363281, 5.655059814453125, 55.88746643066406, 13.983352661132812, 87.43905639648438, 24.84185791015625, 14.94219970703125, 0.91424560546875, 14.186767578125, 46.11988067626953, 13.037117004394531, 108.06021118164062, -7.11187744140625, -122.42234802246094, 71.85567474365234, 52.984100341796875, -7.45556640625, -129.3564453125, 7.7880859375, -18.703079223632812, 27.93170166015625, 30.09234619140625, 41.519500732421875, 116.38311767578125, -0.3043060302734375, 147.90921020507812, 119.1522216796875, -28.627349853515625, 91.7210693359375, 117.82313537597656, 64.89404296875, -20.447769165039062, -12.667724609375, -0.8396968841552734, 22.7921142578125, 127.962158203125, -50.27644348144531, 89.61837005615234, 90.1285400390625, -8.958671569824219, -21.262786865234375, 96.2950210571289, 103.95785522460938, -46.55152893066406, -0.9763107299804688, 88.82673645019531, 72.7813720703125, 121.905517578125, 
5.85955810546875, 7.958919525146484, 51.25065612792969, 89.20916748046875, 20.263336181640625, -5.8832244873046875, -0.97845458984375, 97.44660949707031, 106.44000244140625, 74.56320190429688, -12.867584228515625, 56.883880615234375, -5.66046142578125, 101.03176879882812, -15.760345458984375, 155.14559936523438, 24.192672729492188, 107.91571044921875, 130.4361114501953, 122.187744140625, 114.07341003417969, 47.321014404296875, -15.578369140625, 148.0239715576172, 83.13214111328125, 138.72354125976562, 79.85260009765625, 54.30004119873047, 17.20098876953125, 71.06834411621094, 94.54203796386719, -11.1962890625, 35.284393310546875, 84.226318359375, 143.5965576171875, 18.668975830078125, 36.961181640625, 83.22708129882812, 5.66339111328125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000258.npy"}
|
||||
{"epoch": 0.5403141361256545, "step": 259, "batch_size": 128, "mean": 38.76637268066406, "std": 62.72801971435547, "min": -119.29904174804688, "p10": -27.939537048339837, "median": 30.795772552490234, "p90": 123.83252258300782, "max": 160.59466552734375, "pos_frac": 0.75, "sample": [96.16677856445312, -31.578231811523438, 125.15975952148438, 10.80461311340332, -26.380096435546875, 18.122207641601562, 69.83158874511719, 137.0709686279297, 94.25503540039062, -12.5887451171875, 104.45654296875, 13.85955810546875, 100.83584594726562, 100.50015258789062, 70.06484985351562, 20.596328735351562, 95.68670654296875, -111.360107421875, 82.38119506835938, 90.87980651855469, 36.61383056640625, 146.42144775390625, 5.35321044921875, 14.617961883544922, 110.96157836914062, 22.729965209960938, 60.716087341308594, 70.04644012451172, -100.86029052734375, 0.5182037353515625, 10.438400268554688, 0.7154617309570312, -22.625518798828125, 90.62384033203125, 105.24270629882812, 37.7557373046875, 28.79827880859375, 11.655693054199219, 18.043365478515625, 90.54190063476562, -40.35400390625, 20.670867919921875, 133.16012573242188, 17.716890335083008, 3.73968505859375, -119.29904174804688, 85.04496765136719, 129.30294799804688, -1.8473587036132812, 94.12933349609375, 88.14260864257812, 59.416046142578125, 44.981781005859375, 5.77325439453125, 32.79326629638672, 62.23571014404297, -2.1497802734375, 130.08935546875, 27.827011108398438, 15.189254760742188, 160.59466552734375, 28.4859619140625, 83.38644409179688, 2.6191749572753906, -7.4654998779296875, -93.4408950805664, 48.88299560546875, -37.76780700683594, -87.64146423339844, 61.5595703125, 39.15643310546875, -12.45416259765625, -9.551254272460938, 123.60403442382812, 125.3399658203125, 5.696258544921875, 56.18257141113281, 0.0, -10.16156005859375, 114.7784423828125, 39.274017333984375, 110.2637939453125, -65.27175903320312, 18.459228515625, 50.102783203125, 66.94895935058594, 124.396484375, -86.1872787475586, -1.0083160400390625, 
49.639892578125, -108.63861083984375, -22.635528564453125, 126.58280944824219, 132.98187255859375, 39.483428955078125, 25.3382568359375, 92.78559112548828, -19.8984375, 108.88890838623047, -16.95111083984375, 41.4586181640625, 71.4471435546875, -23.128509521484375, 105.15960693359375, 16.573486328125, -3.392730712890625, 107.61707305908203, -101.748046875, 124.36566162109375, 125.191650390625, 75.17535400390625, 102.25590515136719, 12.071792602539062, -11.9439697265625, 11.927661895751953, 13.0189208984375, 120.35105895996094, 20.75189208984375, 8.412544250488281, -26.3800048828125, 99.02732849121094, -43.531036376953125, -22.8905029296875, 112.59515380859375, 3.78167724609375, 121.67636108398438, 54.90980529785156, 17.346736907958984], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000259.npy"}
|
||||
{"epoch": 0.5424083769633508, "step": 260, "batch_size": 128, "mean": 18.112144470214844, "std": 66.74613952636719, "min": -148.6329345703125, "p10": -72.50072174072265, "median": 8.716064453125, "p90": 108.0616226196289, "max": 151.46035766601562, "pos_frac": 0.6015625, "sample": [121.60736083984375, -0.41878509521484375, 91.68487548828125, 83.67466735839844, -115.99786376953125, 98.17100524902344, 32.152801513671875, 126.91473388671875, 15.818267822265625, 94.94451904296875, -20.201919555664062, 25.57867431640625, 0.0, -7.314239501953125, 92.04774475097656, 4.919765472412109, 0.731719970703125, 151.46035766601562, 72.94865417480469, 53.761783599853516, 101.50190734863281, -148.6329345703125, -5.392208099365234, -25.9664306640625, 142.21682739257812, -66.73641967773438, 12.617862701416016, 93.47224426269531, 40.3658447265625, 86.56556701660156, 31.1748046875, -72.53309631347656, 15.41900634765625, -0.7542591094970703, 5.535022735595703, 142.19927978515625, -91.34022521972656, -122.15237426757812, -9.536666870117188, 75.60850524902344, -1.13067626953125, 9.2135009765625, 47.0386962890625, -87.87890625, -24.33587646484375, 94.95211791992188, -80.96709442138672, -43.646244049072266, -49.393035888671875, 12.72608757019043, -5.85272216796875, 6.61029052734375, 117.1607666015625, 36.57501220703125, -27.541400909423828, 12.130157470703125, 114.8218994140625, -125.53244018554688, 96.91778564453125, 84.39826965332031, -30.467559814453125, -3.5335464477539062, 41.159446716308594, -38.946624755859375, 7.341518402099609, -88.48384094238281, 64.515869140625, 109.02479553222656, -6.2237548828125, 118.63711547851562, 4.4912261962890625, 128.6207275390625, -68.55458068847656, 8.342010498046875, 72.12667846679688, -21.491241455078125, -6.385063171386719, 51.4049072265625, 14.33056640625, 2.85247802734375, -27.31695556640625, 19.76025390625, 136.61322021484375, 0.131500244140625, 1.75311279296875, 86.835693359375, 26.096405029296875, -16.495742797851562, -85.02214050292969, 
-38.23468017578125, -29.119354248046875, 2.47918701171875, -79.14173889160156, 134.86502075195312, 10.27392578125, -29.14154052734375, -10.565227508544922, 33.058250427246094, -28.17584228515625, 107.64883422851562, 101.25497436523438, 65.32656860351562, 18.89794921875, 73.03010559082031, -12.341705322265625, 5.272125244140625, -4.73699951171875, 52.56548309326172, -58.91655731201172, 135.4766845703125, -7.8153076171875, 99.4700927734375, 76.327392578125, 9.090118408203125, -53.344757080078125, 4.0368804931640625, -72.48684692382812, -112.96157836914062, -12.3502197265625, -17.665313720703125, 37.7230224609375, -96.39276123046875, 14.519500732421875, 15.08627700805664, 90.70784759521484, 51.37040710449219, 99.71125030517578, -69.94613647460938], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000260.npy"}
{"epoch": 0.5445026178010471, "step": 261, "batch_size": 128, "mean": 31.960647583007812, "std": 65.13914489746094, "min": -120.61508178710938, "p10": -49.73440093994141, "median": 17.11493492126465, "p90": 116.36744003295898, "max": 188.51107788085938, "pos_frac": 0.6875, "sample": [-80.03392028808594, -116.27877807617188, 10.068923950195312, -32.02626037597656, 98.76602172851562, 89.47811889648438, 77.48980712890625, -17.1414794921875, 68.52590942382812, 107.1414794921875, 106.05023193359375, 0.0, -20.25919532775879, -22.625030517578125, -9.920257568359375, 61.352264404296875, -16.712615966796875, 39.93975830078125, 114.45445251464844, 34.6356201171875, 56.013519287109375, -13.43636703491211, 107.89256286621094, -80.82421875, 72.24250793457031, -8.805366516113281, -49.66413879394531, 62.88795471191406, -83.52667999267578, -96.43456268310547, 81.982421875, 86.60104370117188, 119.9447021484375, 82.23701477050781, 125.62469482421875, 73.47122192382812, -27.96160125732422, 15.101970672607422, -49.898345947265625, 14.162811279296875, 14.147224426269531, 19.127899169921875, 127.87350463867188, -26.568511962890625, -6.2701873779296875, 20.284103393554688, -1.3731842041015625, 116.5321273803711, -87.49313354492188, 88.7665786743164, 79.81355285644531, -22.368072509765625, 80.0657958984375, 132.6729736328125, 113.12213134765625, 56.599609375, 30.38428497314453, -36.71412658691406, 57.38946533203125, 101.62237548828125, 112.712890625, 116.29685974121094, 41.166263580322266, 14.770156860351562, 86.96511840820312, -10.604511260986328, -7.698951721191406, -85.34005737304688, 31.58423614501953, -1.4959335327148438, 111.832763671875, 188.51107788085938, 2.781951904296875, 145.115966796875, 36.82807922363281, -20.406097412109375, 6.393035888671875, 2.374021530151367, 7.725372314453125, 11.538116455078125, 102.29296875, -7.4383544921875, 119.02760314941406, -65.8739013671875, 90.9896240234375, 84.7247314453125, 101.98112487792969, 50.00243377685547, 11.545768737792969, 
-91.71279907226562, 8.288261413574219, 11.28466796875, -93.20513916015625, 7.25762939453125, -51.87255859375, 127.75975036621094, 53.98486328125, 108.78498840332031, 93.80882263183594, -14.85577392578125, -34.83915710449219, 67.3351821899414, 1.1150665283203125, 106.93692016601562, 14.588455200195312, 47.01953125, -38.426841735839844, 141.90866088867188, 124.0491943359375, -28.380828857421875, 129.95059204101562, -6.355369567871094, 27.33855438232422, -120.61508178710938, 35.933746337890625, 9.203689575195312, 9.407073974609375, 12.5880126953125, 30.8768310546875, 9.41778564453125, 61.24266815185547, 147.64981079101562, 11.175491333007812, -44.69000244140625, 3.136474609375, 6.601898193359375, 3.0643768310546875, 87.77444458007812], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000261.npy"}
{"epoch": 0.5465968586387434, "step": 262, "batch_size": 128, "mean": 41.4820556640625, "std": 61.19712448120117, "min": -107.09587097167969, "p10": -33.25136260986328, "median": 32.990478515625, "p90": 124.95104675292968, "max": 157.726806640625, "pos_frac": 0.7421875, "sample": [64.80941772460938, 13.36370849609375, 126.222900390625, 88.83222961425781, 97.54205322265625, 3.5208969116210938, 110.923828125, 47.775238037109375, 23.526580810546875, 139.90216064453125, 142.4365234375, -20.85797119140625, 35.012046813964844, 18.142669677734375, 10.104217529296875, 4.650005340576172, 101.65436553955078, 30.68328857421875, 16.205543518066406, 85.26217651367188, 119.89457702636719, 49.35516357421875, -63.632381439208984, 122.48904418945312, 16.734130859375, 134.93133544921875, 32.95872497558594, 60.50616455078125, 103.6519775390625, 145.44573974609375, 5.64874267578125, -107.09587097167969, 54.82868957519531, -18.894271850585938, 4.094390869140625, -52.31636047363281, -91.98001098632812, 102.378662109375, -3.2252960205078125, 118.52215576171875, 29.515899658203125, -6.860517501831055, -7.889801025390625, -39.9921875, 33.02223205566406, 32.27631378173828, 5.1576385498046875, -0.2421875, 64.95285034179688, -38.61688232421875, 117.81790161132812, 157.726806640625, 70.06784057617188, 52.593223571777344, -85.57989501953125, -55.34254837036133, 5.705810546875, 53.01806640625, -22.9053955078125, 15.35748291015625, 19.11578369140625, -17.25469970703125, -6.1420745849609375, 75.07513427734375, -0.29357147216796875, 10.94903564453125, 137.15689086914062, 37.14463806152344, 134.02670288085938, 82.13699340820312, -34.35954284667969, -0.200469970703125, 125.60501098632812, 1.433074951171875, 58.275848388671875, -77.00698852539062, -52.32537078857422, 110.39480590820312, 1.235107421875, -12.171684265136719, -102.2681884765625, -12.786762237548828, 0.0, 98.23758697509766, 147.2164306640625, 37.9813232421875, 15.979034423828125, 92.38235473632812, 57.08867645263672, -0.0284271240234375, 
104.2938003540039, 106.39230346679688, 103.59005737304688, 17.146728515625, 124.93289184570312, 118.60128784179688, 26.04278564453125, 75.17756652832031, 35.213623046875, 126.76486206054688, 5.983388900756836, -15.552215576171875, 116.9571533203125, 44.62164306640625, -32.77642822265625, 14.481849670410156, 28.15301513671875, 26.37109375, 124.993408203125, 52.33514404296875, -2.230052947998047, 86.31795501708984, 139.96170043945312, 69.47361755371094, 116.88540649414062, 75.20294189453125, 117.72227478027344, 61.22325134277344, 83.82646179199219, 109.36546325683594, 19.38177490234375, -54.08533477783203, 37.300048828125, 26.07366943359375, 11.127731323242188, 82.98286437988281, -22.851219177246094, -32.0858154296875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000262.npy"}
{"epoch": 0.5486910994764398, "step": 263, "batch_size": 128, "mean": 33.25617218017578, "std": 67.708984375, "min": -142.61602783203125, "p10": -45.60750122070312, "median": 17.380355834960938, "p90": 131.91530151367186, "max": 188.7061767578125, "pos_frac": 0.671875, "sample": [122.83059692382812, 20.675201416015625, 130.7457275390625, 17.628570556640625, -6.287117004394531, 21.421066284179688, -16.863433837890625, 11.604705810546875, 70.25835418701172, 54.93658447265625, -9.731325149536133, 17.13214111328125, 24.848846435546875, 25.00555419921875, -44.31329345703125, 4.140556335449219, -0.5393867492675781, -72.10699462890625, 111.12014770507812, -6.73193359375, 54.09906005859375, -66.76301574707031, 14.137832641601562, 85.20957946777344, -10.35888671875, 117.75051879882812, 120.40467834472656, 25.13207244873047, -52.62890625, -86.15591430664062, -96.75780487060547, 110.859619140625, 133.9959716796875, 134.92987060546875, 139.54718017578125, -24.1571044921875, -10.375091552734375, -13.9573974609375, -69.28518676757812, 86.79881286621094, 8.148651123046875, -5.228546142578125, -10.319259643554688, 140.1790771484375, 55.9068603515625, 30.10712432861328, 90.84011840820312, -38.53692626953125, 110.14053344726562, 19.55291748046875, 131.68328857421875, -18.439910888671875, -16.964218139648438, 86.77095031738281, 137.50299072265625, 83.54910278320312, -23.460189819335938, 6.11126708984375, 109.1285400390625, 113.10737609863281, 0.0, 37.0570068359375, -142.61602783203125, 25.64846420288086, 8.36181640625, 7.91204833984375, 3.2740325927734375, 2.2840194702148438, 12.593025207519531, 22.46514892578125, 135.0902862548828, 4.2229156494140625, -84.32418823242188, -15.637006759643555, 10.410820007324219, -11.757568359375, 90.15191650390625, 120.19509887695312, 71.18206787109375, 5.837188720703125, -10.683624267578125, 8.574264526367188, -48.6273193359375, 8.589675903320312, 14.130020141601562, 64.89019775390625, -89.71218872070312, 138.7224578857422, 101.50177001953125, 
18.873512268066406, -20.6412353515625, 138.4019775390625, 10.698638916015625, -121.47137451171875, -67.49163818359375, -7.680450439453125, -82.97518920898438, 132.4566650390625, 19.320968627929688, -40.89642333984375, 98.59423828125, 7.6920166015625, 0.0, 11.120170593261719, 93.9327392578125, 61.821258544921875, 188.7061767578125, 33.909515380859375, 170.0648193359375, -9.562431335449219, 111.53665161132812, 1.9619522094726562, 124.29435729980469, 0.0, 133.71368408203125, 101.03406524658203, 19.167190551757812, 25.34265899658203, 105.98849487304688, 59.21345520019531, 73.8425521850586, 115.83433532714844, 7.9983367919921875, -3.4053955078125, 168.13037109375, -21.650821685791016, 50.644287109375, -19.446624755859375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000263.npy"}
{"epoch": 0.5507853403141362, "step": 264, "batch_size": 128, "mean": 45.82780456542969, "std": 58.33063507080078, "min": -118.25723266601562, "p10": -14.345800018310547, "median": 39.79205322265625, "p90": 130.0895233154297, "max": 186.8868408203125, "pos_frac": 0.7421875, "sample": [-3.767423629760742, -10.977066040039062, 154.83074951171875, 58.382476806640625, 136.20086669921875, 105.83507537841797, 93.82527923583984, 60.4027099609375, -0.6252574920654297, 9.7437744140625, 3.628185272216797, 68.86404418945312, 111.23861694335938, 25.50811767578125, 40.1796875, 100.74217224121094, 90.61143493652344, 110.30455017089844, -13.352716445922852, 0.8987960815429688, 61.261322021484375, 47.021240234375, -12.23828125, 41.246498107910156, -11.775718688964844, 152.24346923828125, 116.51605224609375, -8.218936920166016, 56.834716796875, 118.35464477539062, -28.928909301757812, 102.3587646484375, 123.64212036132812, 115.78810119628906, 145.14059448242188, 0.0, -11.519805908203125, 156.3023681640625, 21.96636199951172, 70.03668975830078, -18.1683349609375, 131.896240234375, -14.144210815429688, 61.59796142578125, 83.70906829833984, 109.91943359375, -34.263641357421875, 153.53118896484375, -60.52105712890625, 37.47996520996094, 1.94732666015625, 16.449066162109375, 51.544677734375, 5.5931854248046875, 106.49002075195312, -59.577392578125, 119.86497497558594, 96.2350082397461, 63.07014083862305, 29.32007598876953, 51.26336669921875, 1.6444969177246094, 16.264450073242188, 66.92603302001953, 49.494964599609375, -21.35418701171875, -7.01416015625, 21.56280517578125, 59.58647155761719, 116.12884521484375, -2.641357421875, 50.539306640625, -118.25723266601562, -14.203071594238281, 26.043888092041016, -5.074516296386719, 10.724529266357422, 12.417617797851562, 127.5344009399414, 130.72891235351562, 40.822662353515625, 2.1051559448242188, 0.17139053344726562, 71.0416259765625, -1.5479583740234375, 186.8868408203125, -6.367012023925781, 0.0, 100.70166015625, 72.27790832519531, 
-44.132843017578125, 92.96710205078125, 107.56754302978516, 129.93328857421875, 18.125, 34.68867492675781, -4.57086181640625, 39.4044189453125, 91.04068756103516, 23.748123168945312, 130.45407104492188, 70.36895751953125, -60.88763427734375, -15.668701171875, 24.17156982421875, 41.693084716796875, 31.42950439453125, 13.146209716796875, -14.6788330078125, 53.37126922607422, -1.4716644287109375, 42.598876953125, 43.587493896484375, 166.63038635253906, 35.0311279296875, 141.23715209960938, -6.84173583984375, 20.365699768066406, 136.90487670898438, -44.62498474121094, -33.30182647705078, 88.07246398925781, 88.99507141113281, 5.5162506103515625, 52.62945556640625, 2.157440185546875, 38.619598388671875, 8.7960205078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000264.npy"}
{"epoch": 0.5528795811518324, "step": 265, "batch_size": 128, "mean": 41.294857025146484, "std": 66.19124603271484, "min": -117.41168212890625, "p10": -30.529899597167965, "median": 31.7657470703125, "p90": 132.8032196044922, "max": 161.969482421875, "pos_frac": 0.71875, "sample": [1.4969253540039062, 147.06979370117188, 114.67373657226562, -67.21759033203125, 42.888641357421875, 8.73565673828125, 110.65323638916016, 34.68798828125, 91.79617309570312, -77.1314468383789, 18.822067260742188, 44.324493408203125, 64.35926055908203, -13.43841552734375, 131.86880493164062, 73.77580261230469, 34.621307373046875, 132.73773193359375, 42.414215087890625, 85.77257537841797, 142.8934326171875, 139.80130004882812, 140.10736083984375, 15.443740844726562, 161.969482421875, 100.93045043945312, 123.62673950195312, 135.66751098632812, 57.060760498046875, -7.1401824951171875, -27.74852752685547, 115.8839111328125, 148.40402221679688, 8.209823608398438, 128.83128356933594, 131.3211669921875, 76.58969116210938, 112.16902160644531, 37.604400634765625, 125.18704223632812, -2.616058349609375, 0.94793701171875, -23.6591796875, 58.887298583984375, -8.221748352050781, -11.70654296875, -17.7099609375, -40.705230712890625, -3.6514434814453125, 95.72592163085938, 0.091461181640625, -22.764541625976562, -33.3321533203125, 9.951583862304688, -94.52542114257812, 154.532958984375, 45.241058349609375, 51.498016357421875, 91.92012023925781, -111.76126098632812, 23.516971588134766, 102.26963806152344, 4.1226806640625, 102.24525451660156, 19.933563232421875, -95.8931884765625, 124.92076110839844, 22.6573486328125, 26.9326171875, -29.328933715820312, -42.82293701171875, 121.26400756835938, 142.2924041748047, 22.15432357788086, 0.0, -5.461435317993164, -14.240447998046875, -117.41168212890625, 153.2310791015625, 114.20500183105469, 32.95709228515625, 3.07147216796875, 26.596107482910156, 1.018798828125, 109.05221557617188, 108.31558227539062, 0.0, -9.7467041015625, -12.163238525390625, 88.93283081054688, 
102.86151123046875, -29.12274169921875, -11.9664306640625, 1.8783950805664062, 47.456573486328125, 3.936920166015625, 49.38818359375, 161.67889404296875, 8.847991943359375, -93.53274536132812, 15.946769714355469, 107.71749877929688, 34.9539794921875, 42.698123931884766, -9.54559326171875, 20.60308837890625, -87.20248413085938, 140.58966064453125, 107.06573486328125, -41.756805419921875, -4.871849060058594, 66.40509033203125, 5.1614990234375, 2.5681915283203125, 57.14496612548828, -34.368621826171875, 0.66217041015625, 132.95602416992188, 82.25704956054688, 90.06640625, 92.75552368164062, -10.49713134765625, 73.93795013427734, 30.57440185546875, 21.918149948120117, -20.57611083984375, 26.983535766601562, 49.680633544921875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000265.npy"}
{"epoch": 0.5549738219895288, "step": 266, "batch_size": 128, "mean": 44.346561431884766, "std": 61.40810012817383, "min": -130.609130859375, "p10": -13.525106811523438, "median": 32.636627197265625, "p90": 119.72423858642578, "max": 181.64080810546875, "pos_frac": 0.828125, "sample": [46.02130126953125, 92.22181701660156, 165.59432983398438, 16.667999267578125, -12.504180908203125, -13.560012817382812, 97.61483764648438, -7.86224365234375, -30.636688232421875, 8.4664306640625, 31.5859375, 8.562774658203125, -5.9686431884765625, 54.2921142578125, 83.28218078613281, -19.804107666015625, -11.361785888671875, 16.916152954101562, 42.47957229614258, -100.29501342773438, 115.13844299316406, 94.2270736694336, 4.432830810546875, 31.38671875, 126.06979370117188, 93.4254150390625, 0.250335693359375, 69.003173828125, 133.2103271484375, 129.9833221435547, 1.3917579650878906, 154.22625732421875, -18.44329833984375, 74.062255859375, 3.484569549560547, 104.1378173828125, -41.1376953125, 9.533618927001953, 5.6888427734375, 119.3646240234375, 36.104034423828125, 24.9053955078125, 13.076515197753906, 3.6327056884765625, 114.7817153930664, 97.9610595703125, 21.131103515625, 119.06011199951172, 39.536827087402344, 28.17669677734375, 78.81982421875, 96.91458129882812, -106.49783325195312, -110.46041870117188, 0.10238838195800781, -38.83613586425781, 5.439533233642578, 76.10319519042969, 59.150787353515625, 78.26779174804688, 115.11383056640625, 12.81781005859375, 14.370368957519531, 16.8021240234375, 20.607908248901367, 145.00369262695312, 69.90887451171875, 115.910888671875, 11.472412109375, 28.8336181640625, 33.474090576171875, 135.13458251953125, 66.2266845703125, 54.811798095703125, 111.75545501708984, 158.16928100585938, 40.41192626953125, -7.339263916015625, -77.38113403320312, -95.6214828491211, -130.609130859375, 120.56333923339844, 13.178375244140625, 115.50080871582031, -13.510147094726562, 66.51235961914062, 50.564056396484375, 28.644622802734375, 51.95147705078125, 
27.029205322265625, -56.552459716796875, 8.53125, -4.2913665771484375, 43.68743896484375, 106.557373046875, 68.76885986328125, 30.026351928710938, 38.7965087890625, 12.147705078125, 5.211456298828125, 12.779693603515625, 47.264678955078125, 28.57415771484375, 143.0517120361328, 14.966188430786133, 31.799163818359375, -11.764232635498047, 0.07949066162109375, 43.12910461425781, -6.697296142578125, 111.211669921875, 136.41110229492188, 143.84727478027344, 14.422935485839844, 116.49407958984375, 6.14654541015625, 115.555419921875, 75.68193817138672, 13.960594177246094, 37.232391357421875, 181.64080810546875, 13.407482147216797, 116.05015563964844, 15.536705017089844, 110.5838623046875, 78.25386047363281, 107.65579223632812, 117.37440490722656], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000266.npy"}
{"epoch": 0.5570680628272251, "step": 267, "batch_size": 128, "mean": 38.35746765136719, "std": 69.2497787475586, "min": -140.64112854003906, "p10": -42.157023620605464, "median": 27.43195343017578, "p90": 125.03415679931638, "max": 195.5914306640625, "pos_frac": 0.6953125, "sample": [3.8805675506591797, 108.38101196289062, 101.15594482421875, 90.2685546875, 157.56182861328125, 10.022171020507812, 45.48247528076172, 27.921676635742188, -9.650390625, -0.3094940185546875, -0.1466960906982422, 25.598236083984375, -11.30816650390625, 137.91339111328125, 26.578073501586914, -26.207183837890625, 8.979743957519531, 136.284423828125, 135.4068603515625, 32.990203857421875, -32.862762451171875, 102.58892822265625, 111.97494506835938, 3.1634521484375, 46.89054489135742, -26.851654052734375, 113.690673828125, 97.29927062988281, 143.29541015625, -120.68072509765625, 108.43377685546875, 40.41584014892578, 26.942230224609375, -18.58380126953125, 14.888931274414062, 87.83060455322266, -24.775312423706055, 140.63055419921875, 120.48855590820312, 78.45439910888672, -90.352783203125, 148.7774658203125, -14.312774658203125, 95.67449951171875, 12.442131042480469, 175.51309204101562, 1.50555419921875, -24.17498779296875, 63.264404296875, 122.59269714355469, -96.93875885009766, -33.615447998046875, 102.78363037109375, 102.01171112060547, 8.93109130859375, 169.6300048828125, 104.56490325927734, 78.23675537109375, 136.05487060546875, 11.087705612182617, 140.45233154296875, 73.45794677734375, -104.6424560546875, 12.23858642578125, 28.393417358398438, 16.914779663085938, 130.73089599609375, -65.48405456542969, 44.485076904296875, 84.15170288085938, 11.749847412109375, 92.27767944335938, 70.32568359375, -125.26068115234375, -8.995193481445312, 91.69293212890625, 25.880615234375, 78.47128295898438, -2.7388229370117188, -0.4079113006591797, -81.38899993896484, 96.2132568359375, -8.714706420898438, 42.38201904296875, -59.74962615966797, 87.89988708496094, -48.246734619140625, 48.441070556640625, 
19.98895263671875, 88.13111877441406, -45.22517395019531, -23.452346801757812, 31.669967651367188, 47.843231201171875, 3.6917877197265625, 103.42447662353516, 85.9596939086914, 9.589954376220703, 119.66705322265625, -13.046051025390625, 115.01040649414062, 33.97991943359375, -26.231170654296875, 8.84161376953125, -11.836700439453125, 26.93072509765625, 25.134765625, 102.47445678710938, -6.482757568359375, 195.5914306640625, -52.27802276611328, -38.05609130859375, 96.25045013427734, 30.057220458984375, 93.93608093261719, 1.572092056274414, 113.67169189453125, 101.43896484375, -8.690399169921875, -40.84210205078125, 115.59449768066406, -140.64112854003906, -7.6231689453125, 1.69097900390625, 78.35631561279297, -57.920562744140625, -36.95188903808594, 10.29315185546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000267.npy"}
{"epoch": 0.5591623036649215, "step": 268, "batch_size": 128, "mean": 39.27926254272461, "std": 61.61400604248047, "min": -108.87979125976562, "p10": -33.61709289550781, "median": 26.579917907714844, "p90": 125.51693267822264, "max": 170.63909912109375, "pos_frac": 0.7421875, "sample": [29.585601806640625, 12.246536254882812, 76.35340881347656, 42.43463134765625, 91.11846923828125, -3.261566162109375, -2.0163116455078125, -11.610572814941406, -48.01068878173828, -58.946624755859375, 9.073959350585938, 156.67843627929688, 19.7774658203125, 129.34567260742188, 1.21160888671875, 85.09144592285156, 170.63909912109375, -5.992330551147461, 8.479217529296875, 98.70950317382812, 108.97835540771484, 116.20381164550781, 80.1199951171875, -102.86122131347656, 53.9210205078125, 27.306625366210938, -8.231302261352539, 48.82196044921875, 16.210357666015625, 54.06926727294922, 8.516057968139648, 79.09294128417969, 99.35467529296875, 85.81072998046875, -33.126708984375, 1.4522857666015625, -108.87979125976562, -20.865478515625, -29.046356201171875, 157.8548583984375, 46.68341064453125, 32.67308044433594, 25.85321044921875, 126.92633056640625, -34.761322021484375, 15.282379150390625, 120.9547119140625, -96.0933837890625, -42.5394287109375, -6.650794982910156, 24.440261840820312, 14.583587646484375, 51.043975830078125, 120.40322875976562, 4.531257629394531, 39.59405517578125, 142.1260528564453, -59.56188201904297, 30.2305908203125, -41.296844482421875, 163.21548461914062, 27.55078125, 115.40593719482422, -6.6900634765625, -61.259613037109375, 40.276611328125, 125.03633117675781, 42.224609375, 3.9204864501953125, -8.97711181640625, 94.27713012695312, 17.841445922851562, 109.595947265625, 111.57864379882812, 16.06878662109375, -9.610382080078125, 44.50445556640625, 134.90377807617188, 123.70721435546875, 113.33306884765625, 92.92410278320312, 59.52894592285156, 65.7735595703125, -76.68018341064453, 41.29517364501953, 97.072509765625, -17.571426391601562, 55.091705322265625, 
73.84341430664062, 8.77398681640625, 69.59579467773438, 99.40934753417969, 147.789306640625, 81.84445190429688, 0.795654296875, 126.63833618164062, 17.143280029296875, 6.3473052978515625, 88.35733032226562, 43.6796875, -10.72479248046875, 18.927597045898438, 48.737548828125, 2.1141815185546875, 5.62877082824707, 50.96711730957031, 13.830596923828125, -48.72906494140625, 0.059417724609375, 29.77236557006836, 159.17062377929688, 10.392204284667969, 94.62481689453125, 121.22512817382812, -0.9105224609375, 140.21337890625, -13.113662719726562, -8.996200561523438, 164.70928955078125, 12.906402587890625, -0.2885284423828125, -37.22113037109375, 18.96697998046875, 13.595657348632812, 9.318115234375, 9.947227478027344, -5.96527099609375, 0.0], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000268.npy"}
{"epoch": 0.5612565445026177, "step": 269, "batch_size": 128, "mean": 38.11360168457031, "std": 68.92536163330078, "min": -176.94546508789062, "p10": -33.4046630859375, "median": 28.88690185546875, "p90": 125.57626647949218, "max": 168.28216552734375, "pos_frac": 0.7109375, "sample": [-96.90428161621094, -5.3978424072265625, 105.78448486328125, -60.59065246582031, 16.197372436523438, -21.449310302734375, 8.512580871582031, 118.294189453125, 9.69073486328125, -34.21600341796875, 94.47168731689453, -1.8066482543945312, -8.27337646484375, 74.21090698242188, -33.05694580078125, 111.01095581054688, 4.8919525146484375, -4.88250732421875, 116.84123992919922, 95.82843780517578, 168.28216552734375, 151.72750854492188, 27.140533447265625, 138.9842071533203, 152.04800415039062, -21.49993133544922, 31.73077392578125, 115.25947570800781, 18.719818115234375, 68.67648315429688, 3.7770137786865234, 143.0783233642578, 106.08551025390625, -86.00563049316406, 100.82540893554688, 0.0, 34.032989501953125, -69.65608215332031, 83.1163330078125, 101.13018798828125, -16.075828552246094, 18.9130859375, 132.27154541015625, 132.23440551757812, 5.25469970703125, 84.4471435546875, 138.03623962402344, 0.0, 88.15243530273438, 39.68780517578125, 76.93112182617188, -19.205810546875, 123.43928527832031, -114.23681640625, -67.33099365234375, 120.87919616699219, 95.26531219482422, -17.874488830566406, -103.9246826171875, -0.04924201965332031, 119.74612426757812, 28.682388305664062, 102.19711303710938, 5.052490234375, 99.67778015136719, -23.84796142578125, 13.507781982421875, 49.03472900390625, 115.24061584472656, 68.15459442138672, 48.309295654296875, 105.74459838867188, 0.0, 128.1208953857422, 14.712135314941406, 19.700347900390625, -127.00201416015625, 2.4033470153808594, 43.024658203125, 5.101961135864258, -8.13470458984375, 19.2496337890625, 128.78118896484375, 19.19708251953125, 29.091415405273438, 30.530517578125, 44.295379638671875, 54.18896484375, 93.24894714355469, 124.89775085449219, 
46.42034149169922, -32.127166748046875, 17.9434814453125, -20.18096923828125, 116.78643798828125, 41.92950439453125, -16.738677978515625, 99.16084289550781, 6.37530517578125, 95.92121887207031, 44.806121826171875, 85.94190979003906, 3.086404800415039, -18.68768310546875, 3.393747329711914, -108.27647399902344, 81.07230377197266, 129.5797119140625, 29.5941162109375, 88.69654083251953, -25.571624755859375, 10.576271057128906, 83.37973022460938, 1.039093017578125, 108.09544372558594, -88.579345703125, -27.471893310546875, -5.10546875, 9.518478393554688, -45.05397033691406, 93.4287109375, 127.15946960449219, 6.3467864990234375, -176.94546508789062, -14.11163330078125, 26.52179718017578, 124.87577819824219, 149.4095458984375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000269.npy"}
{"epoch": 0.5633507853403141, "step": 270, "batch_size": 128, "mean": 43.057884216308594, "std": 73.2115478515625, "min": -168.63619995117188, "p10": -39.765014648437486, "median": 44.48872756958008, "p90": 128.2059341430664, "max": 199.96856689453125, "pos_frac": 0.734375, "sample": [-139.229248046875, 49.86454772949219, 93.19049072265625, 86.1164321899414, -29.323883056640625, 95.83660888671875, -8.490043640136719, 28.37103271484375, 75.91352844238281, -48.090911865234375, 118.932861328125, 25.1923828125, 17.18756103515625, 43.42506408691406, 21.678970336914062, 109.17584228515625, -14.714752197265625, 132.73118591308594, 81.76325225830078, -127.71812438964844, -81.87490844726562, 56.83434295654297, 3.1320037841796875, 33.328216552734375, -6.808229446411133, 0.198089599609375, 112.21421813964844, 107.60717010498047, -91.9854736328125, 1.3272285461425781, 102.01361083984375, 120.08932495117188, 67.96743774414062, 2.739368438720703, 52.58494567871094, 9.262496948242188, -52.782958984375, 55.75651550292969, -140.51348876953125, 92.02766418457031, -28.937225341796875, 131.61285400390625, 28.870513916015625, 12.8280029296875, 120.77459716796875, 17.7930908203125, -4.545919418334961, 130.45684814453125, 84.27761840820312, -111.92559814453125, -120.82363891601562, 123.45339965820312, 127.10696411132812, 127.24125671386719, -30.39183807373047, 82.65385437011719, 142.21429443359375, -36.09552001953125, 121.62020874023438, 5.508514404296875, 26.552337646484375, -18.996414184570312, 10.092498779296875, 36.67472457885742, 124.7777099609375, 111.85316467285156, 135.17974853515625, 4.543571472167969, 120.6553955078125, 96.12220764160156, 89.8035888671875, -3.78228759765625, -126.01922607421875, 112.78490447998047, -54.76873779296875, 131.59112548828125, 60.503692626953125, 38.72581100463867, 104.4735107421875, -33.638092041015625, -14.33160400390625, 11.746826171875, 80.66180419921875, 113.2088623046875, 17.893108367919922, -5.725349426269531, 0.17950439453125, 
-14.50628662109375, 0.07558441162109375, -3.5852813720703125, -36.52349853515625, -47.32855224609375, 101.060302734375, 153.215087890625, -7.764068603515625, 35.5218620300293, 77.90036010742188, 133.04025268554688, 57.68927001953125, -10.5552978515625, 114.128662109375, 151.05035400390625, 38.3931884765625, 89.06195831298828, 51.02379608154297, -168.63619995117188, 44.639068603515625, -7.2506103515625, 142.76962280273438, 9.144073486328125, 79.71975708007812, 114.80291748046875, 9.904876708984375, 107.2515869140625, 65.90493774414062, 136.9873046875, 95.11529541015625, 44.33838653564453, 7.697509765625, 142.37234497070312, -11.928314208984375, 199.96856689453125, 61.979034423828125, 121.10188293457031, 98.78205108642578, 104.97674560546875, 107.68560791015625, -27.199951171875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000270.npy"}
{"epoch": 0.5654450261780105, "step": 271, "batch_size": 128, "mean": 44.892372131347656, "std": 61.21451187133789, "min": -95.98633575439453, "p10": -16.184201049804678, "median": 29.624366760253906, "p90": 124.12879180908203, "max": 200.84783935546875, "pos_frac": 0.7734375, "sample": [100.92877960205078, -95.98633575439453, -7.20391845703125, 31.8114013671875, 14.133819580078125, 3.59454345703125, 15.000358581542969, -10.017391204833984, 34.09228515625, -6.069122314453125, 55.453460693359375, 13.054031372070312, 116.91349792480469, 4.57928466796875, 13.976448059082031, 94.07443237304688, 81.61151123046875, 29.414306640625, 29.834426879882812, 73.26408386230469, 59.6931266784668, 9.875358581542969, 114.0484619140625, 96.69080352783203, 2.0642547607421875, 3.337696075439453, -35.7779541015625, 101.9892578125, 13.12124252319336, -11.4835205078125, 102.28898620605469, 130.1356201171875, -1.81231689453125, 40.76397705078125, 3.285064697265625, 24.761474609375, 28.992462158203125, 30.031707763671875, 88.66921997070312, -69.4091796875, 109.65951538085938, 6.03167724609375, -0.33489990234375, 115.58880615234375, 99.55546569824219, 31.58319091796875, 45.938720703125, 0.2841644287109375, 124.46540832519531, 10.032196044921875, -3.29254150390625, 47.011871337890625, 128.13804626464844, 62.4981689453125, 200.84783935546875, 147.23028564453125, 0.40834999084472656, 127.54730224609375, 28.2108154296875, 105.49827575683594, 93.41609191894531, 8.77642822265625, -5.479148864746094, 177.4324951171875, 105.92608642578125, -46.35101318359375, -21.642120361328125, 123.98452758789062, -5.825103759765625, 17.987548828125, 115.95966339111328, 163.38616943359375, -28.443145751953125, 84.95986938476562, -13.8450927734375, 19.759979248046875, -78.84434509277344, 27.750701904296875, 40.1400146484375, 85.52645111083984, -23.61328125, -48.53253173828125, 132.275390625, -7.796394348144531, 156.88262939453125, 87.95587158203125, 47.6046142578125, -37.364410400390625, 25.585613250732422, 
24.467742919921875, 6.092273712158203, 128.85235595703125, 104.68315124511719, -13.456039428710938, 4.13348388671875, 98.72044372558594, 9.402225494384766, 98.49079132080078, 115.67250061035156, 82.59473419189453, 17.743270874023438, 25.235122680664062, -4.284149169921875, 97.64112854003906, 146.37554931640625, -7.04510498046875, -90.6890869140625, 13.93560791015625, 18.0435791015625, 94.243408203125, 33.00830078125, 103.89849853515625, 106.35536193847656, 58.47023010253906, 37.189605712890625, 14.8193359375, 112.343017578125, 117.485107421875, -4.504302978515625, -8.0321044921875, 109.67327880859375, -88.01667785644531, 114.27545166015625, 147.021484375, 19.012985229492188, 1.141998291015625, 58.721588134765625, -65.69419860839844], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000271.npy"}
{"epoch": 0.5675392670157068, "step": 272, "batch_size": 128, "mean": 40.534446716308594, "std": 62.39738082885742, "min": -128.21295166015625, "p10": -24.03845977783203, "median": 28.340377807617188, "p90": 125.27801513671875, "max": 179.95928955078125, "pos_frac": 0.78125, "sample": [85.46080017089844, 0.0, -4.905670166015625, 21.595016479492188, -87.11653137207031, 56.00274658203125, 150.10992431640625, 80.54095458984375, 115.61224365234375, 10.899261474609375, -24.48797607421875, 23.175073623657227, 47.048828125, 109.56260681152344, 51.2950439453125, 144.32662963867188, 122.56549072265625, 103.55636596679688, 39.87811279296875, 20.71100616455078, 107.14317321777344, 10.49002456665039, -8.574371337890625, 1.03961181640625, 105.32351684570312, -21.214401245117188, 4.411567687988281, 155.1300048828125, 0.8847675323486328, 12.0714111328125, 108.73358154296875, 64.5951919555664, 67.78018188476562, 0.16137313842773438, 127.44995880126953, 98.452392578125, 146.70068359375, 125.687744140625, 12.410491943359375, -1.0450592041015625, 78.69656372070312, -42.88934326171875, -64.3839111328125, 66.81942749023438, -13.10089111328125, 29.29339599609375, 106.47756958007812, -46.888153076171875, -2.1302413940429688, 1.4944210052490234, 131.99728393554688, 104.38048553466797, -15.831136703491211, 35.89867401123047, 106.22007751464844, 143.68072509765625, 5.1362152099609375, 47.37982177734375, 93.43075561523438, 0.9098396301269531, 3.650400161743164, 9.18035888671875, 9.100006103515625, 15.003753662109375, 83.59687042236328, 42.87525939941406, 179.95928955078125, 53.103546142578125, 128.46044921875, 10.20745849609375, 117.35429382324219, -23.845809936523438, 68.51748657226562, 11.860023498535156, -46.9197998046875, -40.1893310546875, -6.910919189453125, 44.11833190917969, 4.473968505859375, 107.67288208007812, 90.89077758789062, 47.67059326171875, -10.033050537109375, 151.32591247558594, 3.21002197265625, -128.21295166015625, 24.64463996887207, 13.695926666259766, 
29.5677490234375, 2.814910888671875, 33.410743713378906, 82.86721801757812, 8.085281372070312, -124.18191528320312, 52.6881103515625, 42.2503662109375, 49.479736328125, 105.82329559326172, 13.044219970703125, 2.2090702056884766, 0.0, -25.082000732421875, 11.078893661499023, -100.1494140625, -19.54522705078125, 0.407470703125, 21.460845947265625, 1.1417922973632812, -5.9902801513671875, 27.387359619140625, 57.94074630737305, 178.41082763671875, -54.70831298828125, 58.843666076660156, 119.523681640625, -79.57821655273438, 22.59412384033203, 3.4225940704345703, 7.59766960144043, -4.410369873046875, 133.16046142578125, 120.6969985961914, 36.50263977050781, 59.92633056640625, 99.958251953125, 125.1024169921875, 97.24104309082031, 82.90267944335938], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000272.npy"}
|
||||
{"epoch": 0.5696335078534032, "step": 273, "batch_size": 128, "mean": 45.3183479309082, "std": 66.62500762939453, "min": -131.7032470703125, "p10": -33.53895797729492, "median": 38.06222152709961, "p90": 128.01270141601563, "max": 194.47747802734375, "pos_frac": 0.703125, "sample": [13.22222900390625, 150.85809326171875, 142.19329833984375, 8.749410629272461, 151.87548828125, -7.7114410400390625, 170.40118408203125, 21.2601318359375, 91.2088623046875, 8.16180419921875, -24.24774169921875, 55.22613525390625, 3.567760467529297, 87.86622619628906, 194.47747802734375, -122.67398071289062, 65.4375, 165.1002197265625, 15.03314208984375, 60.04412841796875, -33.376380920410156, 89.20906829833984, 114.49639892578125, 24.22303581237793, 133.77236938476562, 26.180908203125, -107.89974975585938, 31.748504638671875, 103.81497192382812, -18.142837524414062, 19.01042938232422, -36.5897216796875, -20.420982360839844, 113.76483154296875, -20.8311767578125, -0.858856201171875, 9.713836669921875, 2.230804443359375, -5.0098876953125, 120.552490234375, -6.5395965576171875, -11.667510986328125, 1.074737548828125, 45.524932861328125, 101.74208068847656, 22.0927734375, 103.29735565185547, -0.28345680236816406, -46.268341064453125, 6.37310791015625, 115.35137939453125, -23.67279052734375, 137.41122436523438, -21.278961181640625, 84.30545043945312, -8.03192138671875, 2.18902587890625, 67.43484497070312, 22.004348754882812, 37.910118103027344, -42.70166015625, 105.29305267333984, 0.9444427490234375, 96.98385620117188, -21.59332275390625, 142.5462646484375, 93.08329772949219, 0.4213714599609375, 93.85020446777344, 1.027984619140625, 110.06698608398438, -30.822494506835938, -45.00384521484375, -9.717254638671875, 60.479248046875, 129.42547607421875, 22.78704833984375, 119.55743408203125, 106.86846923828125, -57.389923095703125, 89.40733337402344, 106.2448959350586, 96.98118591308594, -35.01032257080078, 55.9166259765625, 105.73018646240234, 27.95306396484375, 68.97711944580078, 
131.33563232421875, 109.15419006347656, 84.68972778320312, -49.951629638671875, -4.6712493896484375, 106.69259643554688, -131.7032470703125, 38.214324951171875, -91.05584716796875, 113.48106384277344, 127.4072265625, -0.227081298828125, 142.23443603515625, 49.55510711669922, 124.14495849609375, 120.28834533691406, 124.1314697265625, -5.784348487854004, -11.8397216796875, 48.788330078125, 112.86264038085938, 97.38978576660156, 19.617767333984375, 46.72418212890625, -2.4019927978515625, 38.4580078125, 34.77409362792969, 58.106048583984375, -37.255828857421875, 137.80783081054688, 25.683807373046875, -6.10107421875, -10.519702911376953, 124.81866455078125, 46.19537353515625, 110.39386749267578, 99.37774658203125, -33.918304443359375, 115.8013916015625, -18.863510131835938], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000273.npy"}
|
||||
{"epoch": 0.5717277486910994, "step": 274, "batch_size": 128, "mean": 47.46599197387695, "std": 65.26335144042969, "min": -133.559326171875, "p10": -16.304017639160154, "median": 36.79961395263672, "p90": 125.97944030761718, "max": 195.34466552734375, "pos_frac": 0.78125, "sample": [115.10738372802734, 113.453857421875, -42.531494140625, 102.71113586425781, -1.7371444702148438, 62.5277099609375, 41.29998779296875, 53.53144836425781, 94.0811538696289, 10.127945899963379, 24.316741943359375, 103.74349212646484, 156.9792938232422, 84.92473602294922, 25.14185333251953, 100.34586334228516, 167.6400146484375, 141.5601043701172, -60.433135986328125, -15.70260238647461, 21.64996337890625, 140.6314697265625, 115.04507446289062, 41.265228271484375, -8.318161010742188, -3.75689697265625, -104.7381591796875, -31.002197265625, 12.738828659057617, 34.5614013671875, 17.82293701171875, 30.068206787109375, 99.63626098632812, 50.47210693359375, 104.99948120117188, 72.06243896484375, 43.1104736328125, 104.4053955078125, 6.2069854736328125, 124.9615478515625, 139.91024780273438, -9.37429428100586, -3.697357177734375, 5.263580322265625, -17.686553955078125, 117.30282592773438, 23.272232055664062, 10.857288360595703, 19.349090576171875, 144.9818878173828, -8.855453491210938, 9.2633056640625, 106.51779174804688, 21.995697021484375, -117.50225830078125, 27.003616333007812, -14.672210693359375, 43.502044677734375, 0.3743896484375, 37.255126953125, 19.908971786499023, 5.817781448364258, -9.313423156738281, 81.99913024902344, -63.334991455078125, -1.4276580810546875, 114.80555725097656, 0.3154296875, 72.50244140625, 111.50421142578125, 8.862716674804688, -133.559326171875, 137.4697265625, 94.29826354980469, 57.8695068359375, 50.8121337890625, 45.908721923828125, 36.34410095214844, 195.34466552734375, 8.829635620117188, 174.80087280273438, 97.45552062988281, 23.4146728515625, 115.52914428710938, -2.0167388916015625, 55.767120361328125, 107.22064208984375, 124.13385009765625, 72.30615234375, 
120.5422592163086, 109.84207153320312, 25.190231323242188, -18.833908081054688, 84.4290771484375, 122.98825073242188, 101.00437927246094, 9.2705078125, 128.35452270507812, 146.966552734375, -7.668182373046875, -83.23536682128906, 17.67413330078125, 7.483734130859375, -15.711502075195312, 21.193435668945312, 105.28057861328125, 8.253288269042969, 119.95462036132812, 7.87713623046875, 51.70086669921875, 129.32186889648438, 102.6149673461914, -35.4415283203125, 57.66914367675781, -0.9311981201171875, -18.64398193359375, 2.4401702880859375, 3.4063568115234375, 12.579177856445312, 73.99850463867188, 12.941364288330078, -131.72674560546875, 113.96978759765625, -1.47674560546875, 23.7047119140625, 102.25714111328125, 111.428955078125, 167.43765258789062], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000274.npy"}
|
||||
{"epoch": 0.5738219895287958, "step": 275, "batch_size": 128, "mean": 53.19717025756836, "std": 63.20420455932617, "min": -106.779296875, "p10": -12.92349395751953, "median": 37.303890228271484, "p90": 131.88404541015623, "max": 166.020263671875, "pos_frac": 0.8046875, "sample": [166.020263671875, -10.6470947265625, -2.706575393676758, -99.888427734375, 127.33978271484375, 77.93975830078125, 124.742431640625, 13.3541259765625, 40.88482666015625, -106.779296875, 10.478790283203125, 144.77496337890625, -35.7030029296875, 132.94613647460938, 0.0, 10.149810791015625, 12.108535766601562, 87.953857421875, 3.5230331420898438, 3.2210693359375, 0.0, 0.0, 107.02171325683594, 110.5793228149414, 89.20915222167969, 4.972808837890625, 139.03985595703125, 95.91258239746094, 21.02777099609375, 23.023635864257812, 63.7672119140625, 6.183555603027344, 114.3307113647461, 122.89593505859375, 144.32933044433594, 136.821533203125, 1.3066291809082031, 1.736083984375, -106.05291748046875, 42.74382019042969, 128.767578125, 32.11627197265625, 83.78240203857422, 27.464752197265625, 17.32586669921875, -4.851388931274414, -74.86286926269531, 120.92472839355469, 131.42886352539062, 17.143798828125, 125.478759765625, -57.43304443359375, 2.4643115997314453, 70.90167236328125, 88.23870849609375, 15.255096435546875, 89.66619873046875, 18.700927734375, 125.96127319335938, 0.0, 20.42291259765625, 24.289520263671875, 27.147369384765625, 126.63502502441406, -5.593666076660156, 150.84365844726562, 151.22210693359375, 59.784698486328125, 35.81990051269531, 0.0, 10.27134895324707, 135.37518310546875, 82.53974914550781, 103.01974487304688, -2.9314002990722656, 87.687255859375, 128.17190551757812, 161.85726928710938, 115.46601867675781, 8.342849731445312, 107.3105239868164, 119.28524780273438, 78.591796875, 104.34634399414062, 76.50102996826172, 1.9988574981689453, 23.268203735351562, 103.64738464355469, 85.6768569946289, 33.973907470703125, -15.92242431640625, 38.787879943847656, 8.131011962890625, 
136.8455810546875, 1.454681396484375, 119.7840576171875, -14.681686401367188, 3.5972518920898438, 159.225830078125, 33.73735046386719, 48.42161560058594, 27.43658447265625, 120.64608764648438, -1.3442153930664062, 125.15779113769531, 112.69175720214844, 12.661109924316406, -46.61944580078125, 0.808197021484375, -12.16998291015625, 129.03564453125, -56.314517974853516, 104.80752563476562, 1.8616943359375, 34.633209228515625, 7.267566680908203, 16.24969482421875, 113.98944091796875, 159.79867553710938, 108.8440933227539, -15.808273315429688, 98.902587890625, 66.96870422363281, -14.8978271484375, 92.86965942382812, -14.79241943359375, 114.60980224609375, 70.55618286132812], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000275.npy"}
|
||||
{"epoch": 0.5759162303664922, "step": 276, "batch_size": 128, "mean": 49.38157653808594, "std": 64.50350952148438, "min": -94.51394653320312, "p10": -24.93019409179687, "median": 29.83367919921875, "p90": 131.25009002685547, "max": 206.916015625, "pos_frac": 0.78125, "sample": [206.916015625, 17.843643188476562, 105.89287567138672, -4.7646331787109375, 23.512481689453125, -2.77447509765625, 11.657135009765625, 68.89100646972656, 26.6063232421875, 66.70388793945312, 99.49647521972656, 5.462127685546875, 12.50054931640625, 0.8408145904541016, -1.2892913818359375, 91.25342559814453, 44.924560546875, 26.9945068359375, -23.9088134765625, 44.712188720703125, 134.310791015625, 51.68962097167969, 8.193389892578125, 133.05279541015625, 85.36740112304688, -27.31341552734375, -28.93731689453125, -45.36376953125, -9.312744140625, 113.51123046875, 79.48443603515625, 5.9919586181640625, 112.613525390625, 114.09723663330078, 88.35964965820312, 11.83837890625, 26.03173828125, -1.720266342163086, 107.13581848144531, 40.84938049316406, 22.9150390625, 23.7308349609375, 121.2906494140625, 23.566192626953125, 5.168727874755859, 148.65292358398438, 90.49667358398438, 131.0589599609375, 17.08551025390625, -8.875762939453125, 8.239723205566406, 99.95425415039062, 100.202392578125, 31.212890625, 123.28057098388672, 12.163459777832031, 43.451568603515625, 22.400299072265625, 125.34185791015625, -40.203033447265625, 0.0, -35.524017333984375, 38.17655944824219, -92.21743774414062, 8.344863891601562, -42.335968017578125, 69.06878662109375, 156.16964721679688, -1.967132568359375, -22.93231201171875, 44.75435256958008, -17.0987548828125, 97.03900146484375, 122.43157958984375, 99.04911041259766, 129.51019287109375, 130.73907470703125, -21.506210327148438, 40.02935028076172, 117.77281188964844, 127.62956237792969, 107.75245666503906, 101.3072509765625, 38.84844970703125, 12.095123291015625, 79.49343872070312, -69.04193115234375, 103.73043060302734, -4.00262451171875, 28.4544677734375, 
1.2931747436523438, 180.01495361328125, 20.205078125, 19.88140869140625, 11.803924560546875, 8.534469604492188, 2.3166351318359375, 144.2900390625, 18.934478759765625, 125.83355712890625, 71.64651489257812, 66.00215148925781, 136.00872802734375, 2.19091796875, -1.919464111328125, 99.02713012695312, -4.456687927246094, -50.64223861694336, 27.9990234375, 131.1493682861328, 4.52642822265625, 112.28372192382812, 16.556381225585938, 140.17791748046875, 131.13296508789062, -92.70803833007812, 159.06710815429688, 18.84893798828125, 102.75447082519531, -36.8663330078125, 147.44027709960938, 9.239166259765625, 93.01329040527344, -94.51394653320312, -79.47186279296875, 131.485107421875, 162.738037109375, 116.77029418945312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000276.npy"}
|
||||
{"epoch": 0.5780104712041885, "step": 277, "batch_size": 128, "mean": 33.97996520996094, "std": 70.56504821777344, "min": -147.78587341308594, "p10": -44.802538299560545, "median": 25.489198684692383, "p90": 120.03590087890625, "max": 198.24688720703125, "pos_frac": 0.703125, "sample": [13.085739135742188, 5.86328125, -6.036968231201172, 119.97320556640625, 26.01611328125, -22.281463623046875, -77.02214050292969, -9.423095703125, 95.67518615722656, 4.010986328125, 21.795166015625, 83.46611022949219, 114.34373474121094, 108.26756286621094, -8.146560668945312, 113.76828002929688, 71.07984924316406, -138.0938720703125, 100.77279663085938, 1.75152587890625, 31.313812255859375, 9.226638793945312, 102.73815155029297, 48.5816650390625, 93.90447998046875, 132.96292114257812, -0.5899772644042969, -3.3157958984375, 116.01742553710938, -7.047233581542969, 80.32573699951172, -132.434326171875, -3.8781890869140625, -44.80628204345703, -18.918014526367188, 22.374893188476562, 112.64451599121094, 6.68743896484375, 85.75177001953125, 53.12042236328125, -147.63092041015625, 0.93853759765625, -30.654823303222656, -27.583099365234375, 9.631866455078125, 78.505615234375, 30.412628173828125, 123.4984130859375, 2.9316253662109375, 36.19964599609375, 63.34478759765625, -3.5757970809936523, 1.8162994384765625, 198.24688720703125, 103.39865112304688, 0.0, 54.413238525390625, 75.96367645263672, 95.96531677246094, 26.5048828125, 148.20889282226562, 85.440673828125, -106.59982299804688, -3.3245811462402344, 0.804412841796875, 2.5855712890625, 83.39936065673828, -91.94314575195312, 176.5091552734375, 169.5653076171875, -32.43735885620117, 37.38360595703125, 150.74887084960938, 44.80854797363281, 10.41558837890625, -147.78587341308594, 25.96710205078125, -66.08670043945312, -89.56661987304688, 19.411415100097656, 17.860107421875, 94.98096466064453, 52.654296875, -22.736236572265625, 73.478515625, 95.28550720214844, 99.21343231201172, 114.20808410644531, -70.35366821289062, 130.21112060546875, 
1.6550731658935547, 14.128921508789062, -12.7762451171875, 25.011295318603516, 56.371002197265625, -12.00201416015625, 22.339370727539062, 95.54736328125, 50.41619873046875, 112.74942016601562, 12.028121948242188, 15.127357482910156, 111.05117797851562, 122.45855712890625, 33.497230529785156, 115.12586975097656, 63.918487548828125, 127.4490966796875, -32.477020263671875, 0.0, 130.15438842773438, 164.52880859375, 0.41942596435546875, 54.75065612792969, -44.800933837890625, -10.574172973632812, -114.24288940429688, 120.18218994140625, 29.427658081054688, 91.85569763183594, 89.00425720214844, -109.4930419921875, -1.1258964538574219, -1.8977737426757812, 4.926486968994141, 15.57666015625, 34.444007873535156, -23.478437423706055], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000277.npy"}
|
||||
{"epoch": 0.5801047120418849, "step": 278, "batch_size": 128, "mean": 29.28695297241211, "std": 72.9912338256836, "min": -163.55487060546875, "p10": -82.76495666503905, "median": 14.235276222229004, "p90": 121.43143768310544, "max": 196.44143676757812, "pos_frac": 0.6953125, "sample": [113.09796142578125, 92.08595275878906, -87.78878784179688, 65.39385986328125, -3.503448486328125, 23.943862915039062, 56.189666748046875, 11.70672607421875, 0.9647655487060547, 126.75608825683594, 36.69708251953125, 111.82373809814453, 86.65463256835938, 34.449798583984375, 94.72270202636719, 102.36859130859375, -106.485107421875, -15.301155090332031, 24.722122192382812, -3.565185546875, -4.680976867675781, 79.25494384765625, -7.65460205078125, -22.790847778320312, 21.193756103515625, 165.19970703125, 141.12216186523438, 103.2693862915039, 10.5714111328125, -19.46484375, -162.71405029296875, 174.30816650390625, -92.04138946533203, 5.82391357421875, 49.568115234375, 80.44760131835938, 14.015981674194336, -55.68646240234375, 143.44515991210938, 22.741661071777344, 127.78617095947266, 146.71728515625, 90.64019775390625, 10.92291259765625, 5.534553527832031, 48.54058837890625, -1.8743133544921875, 143.54856872558594, 43.692100524902344, 60.708580017089844, 9.367803573608398, 1.77496337890625, 2.1637802124023438, -80.78305053710938, -17.289642333984375, -163.55487060546875, 2.7447471618652344, 150.33023071289062, -87.389404296875, -96.7925796508789, 17.33063507080078, 73.94293212890625, 25.744110107421875, 4.58251953125, 1.411590576171875, 114.18499755859375, -30.074752807617188, 111.01432800292969, 113.77706146240234, 30.279708862304688, 151.5433349609375, 196.44143676757812, 0.5455093383789062, -56.81388854980469, 73.5806884765625, -97.02813720703125, 93.90715789794922, 22.263671875, 80.91799926757812, 44.077728271484375, -126.30320739746094, 5.4050140380859375, 2.3884620666503906, 51.42376708984375, 99.12523651123047, -5.5451202392578125, -1.559295654296875, 103.06580352783203, 
29.68511962890625, 101.33158874511719, -0.417449951171875, -1.5328369140625, 107.32859802246094, 164.95001220703125, -21.060699462890625, 118.60836791992188, -92.81480407714844, 4.0164794921875, 77.89515686035156, -1.0690155029296875, -11.05718994140625, 7.4996490478515625, 21.265716552734375, -93.41130828857422, -17.451385498046875, 22.976974487304688, 130.43124389648438, -8.673629760742188, 6.158233642578125, 14.454570770263672, 106.4593734741211, 3.594308853149414, -16.27557373046875, 69.953369140625, 0.9238185882568359, -28.947288513183594, 7.2894287109375, -112.95233154296875, 79.08056640625, -37.42279052734375, 6.063934326171875, 13.470443725585938, 13.00567626953125, 119.14944458007812, -112.63491821289062, 67.54554748535156, -7.641471862792969, 111.67424011230469], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000278.npy"}
|
||||
{"epoch": 0.5821989528795811, "step": 279, "batch_size": 128, "mean": 49.876502990722656, "std": 65.6118392944336, "min": -115.77953338623047, "p10": -15.735875701904291, "median": 35.204315185546875, "p90": 134.42240142822266, "max": 208.23480224609375, "pos_frac": 0.796875, "sample": [41.48004150390625, 105.79158020019531, 50.32051086425781, -22.11235809326172, 15.808990478515625, -6.1155242919921875, 101.47512817382812, 123.36151123046875, 5.229774475097656, 114.826171875, 79.32176208496094, 154.302734375, 37.9683837890625, 0.98956298828125, -4.0296630859375, 32.450164794921875, 110.12570190429688, 100.81145477294922, 34.7724609375, 75.73480224609375, 18.554580688476562, 25.41144561767578, -5.3365478515625, 33.7095947265625, 28.511409759521484, 28.591583251953125, 19.688735961914062, 6.40753173828125, -108.00907897949219, 4.506858825683594, 88.08978271484375, 158.97604370117188, 120.72808837890625, 129.05056762695312, 79.49472045898438, -22.0621337890625, 6.6541290283203125, 93.23849487304688, -26.299331665039062, 11.5382080078125, 99.81998443603516, 117.16184997558594, 107.06033325195312, 142.82350158691406, 10.199195861816406, 134.09036254882812, -5.2432861328125, 10.537429809570312, -14.075981140136719, 83.286865234375, 23.148468017578125, 32.820404052734375, 10.61572265625, 28.803050994873047, 49.23292541503906, -8.616912841796875, 208.23480224609375, -115.77953338623047, 97.25558471679688, 154.33535766601562, -13.577949523925781, -0.6871337890625, 14.326950073242188, 22.674560546875, -77.74569702148438, -10.392074584960938, -24.577377319335938, 116.78302001953125, -2.4122467041015625, 177.6993408203125, -48.508056640625, 177.92547607421875, 35.63616943359375, 82.97064208984375, 9.14406967163086, -111.22209167480469, 12.18670654296875, 123.51971435546875, 163.3177490234375, -97.45529174804688, 6.70452880859375, 24.213912963867188, 69.99618530273438, 93.55853271484375, 97.23184204101562, 72.11764526367188, 40.91363525390625, 8.151805877685547, 
133.28382873535156, 3.7398757934570312, 107.47459411621094, -19.608963012695312, 103.55349731445312, 45.546600341796875, 21.98577880859375, 17.264572143554688, 131.39593505859375, 1.746358871459961, 47.611083984375, 131.8839874267578, 108.05862426757812, 135.63803100585938, -9.681640625, 120.0924072265625, 143.69989013671875, 130.36669921875, 16.934112548828125, 9.592254638671875, 13.023529052734375, 135.19715881347656, -3.21051025390625, 35.934906005859375, 60.83721923828125, 166.07440185546875, 107.74534606933594, 52.195709228515625, 27.3265380859375, 40.79669189453125, -87.04403686523438, -20.683921813964844, 148.71331787109375, -5.870269775390625, 85.26171875, 2.7558135986328125, 117.72882080078125, 61.18858337402344, 33.6571044921875, 59.824432373046875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000279.npy"}
|
||||
{"epoch": 0.5842931937172775, "step": 280, "batch_size": 128, "mean": 43.84587860107422, "std": 68.21540832519531, "min": -138.90415954589844, "p10": -25.240336608886718, "median": 27.354268074035645, "p90": 138.59579772949218, "max": 171.21746826171875, "pos_frac": 0.71875, "sample": [106.78021240234375, 107.21437072753906, 1.1642303466796875, 119.7508544921875, -4.125892639160156, -138.90415954589844, 140.08631896972656, 116.26190185546875, 150.47274780273438, 76.64247131347656, -20.377883911132812, 32.26123809814453, 4.440956115722656, 100.84442138671875, 17.138427734375, -24.575439453125, 146.744873046875, 94.86578369140625, 140.81094360351562, -3.7059783935546875, 2.024261474609375, 137.29452514648438, -63.25669860839844, 138.70339965820312, -23.9705810546875, 101.13320922851562, 26.64817237854004, 19.2705078125, -4.0239410400390625, 109.70123291015625, 0.0, 19.55010986328125, 77.04159545898438, -33.298797607421875, 43.10552978515625, 171.21746826171875, -10.796348571777344, 130.74862670898438, 10.654155731201172, -1.461090087890625, -5.63818359375, 84.73001098632812, 10.037792205810547, 22.250091552734375, 95.08710479736328, 76.56126403808594, 106.39581298828125, -94.29786682128906, 141.270263671875, 16.8016357421875, 29.1885986328125, 10.857231140136719, 4.86614990234375, 5.280672073364258, 99.86985778808594, 142.55325317382812, 66.62542724609375, -26.653335571289062, 3.4242095947265625, 111.41351318359375, 133.18731689453125, 26.399169921875, 20.456008911132812, -72.53264617919922, -12.955291748046875, -0.9028778076171875, -61.2779541015625, 127.630615234375, 8.183837890625, 85.54124450683594, 121.96075439453125, 31.573829650878906, 140.91799926757812, 138.5496826171875, 144.19439697265625, -24.634765625, 83.74601745605469, 76.8046875, 28.06036376953125, 52.30029296875, 97.77250671386719, 105.033935546875, -19.046417236328125, 108.37252807617188, 127.83161926269531, -5.493186950683594, 21.054412841796875, -6.735443115234375, -59.32212829589844, 
-0.2581787109375, 118.58349609375, 35.98260498046875, 60.891639709472656, -67.76902770996094, 18.064571380615234, 148.45309448242188, -7.34210205078125, 94.07478332519531, 97.99446105957031, 39.18341064453125, 126.13728332519531, 10.058074951171875, 115.28376770019531, -18.172821044921875, 21.6605224609375, 2.9931182861328125, -0.6363754272460938, -0.81683349609375, 144.830078125, 6.467254638671875, 69.18406677246094, 18.65892791748047, 5.17833137512207, 91.47404479980469, -56.199249267578125, 95.62136840820312, 110.09098815917969, 162.4432373046875, -13.062652587890625, -135.0152130126953, -8.56634521484375, 9.64413833618164, -98.67561340332031, 33.63107681274414, 44.582672119140625, 0.7981491088867188, 92.30917358398438, -86.85769653320312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000280.npy"}
|
||||
{"epoch": 0.5863874345549738, "step": 281, "batch_size": 128, "mean": 45.257423400878906, "std": 64.69198608398438, "min": -135.33279418945312, "p10": -16.652125549316402, "median": 29.99707794189453, "p90": 133.1213836669922, "max": 236.3983154296875, "pos_frac": 0.7265625, "sample": [99.06414794921875, 49.262596130371094, 0.965576171875, 5.284828186035156, 87.57891845703125, 10.408660888671875, 50.939849853515625, 50.360633850097656, 13.382492065429688, 132.7825927734375, 129.78875732421875, -96.2503662109375, -106.09124755859375, 148.496337890625, 17.348140716552734, 29.357147216796875, 95.74309539794922, 112.29159545898438, 5.1316375732421875, 15.489448547363281, 135.72386169433594, 9.511344909667969, -5.3587799072265625, -3.4451141357421875, 108.54672241210938, -7.7777557373046875, -29.48590087890625, 137.09814453125, 17.8699951171875, 58.480072021484375, 0.0, 45.726348876953125, -2.3223876953125, 78.16738891601562, 10.863166809082031, 126.08357238769531, 45.80743408203125, -11.2847900390625, 144.4232940673828, 90.07086181640625, -135.33279418945312, 69.41824340820312, 119.860107421875, -8.523515701293945, 0.0, 18.56781005859375, 39.90740966796875, 117.15017700195312, -21.94317626953125, 22.195663452148438, 79.891357421875, -53.800994873046875, 1.5448112487792969, 38.51780700683594, 2.015350341796875, 160.0382080078125, -9.51141357421875, 30.783065795898438, -0.499053955078125, 84.32449340820312, 15.272186279296875, 37.02824401855469, -15.70166015625, -1.9975605010986328, 80.7708740234375, 66.98403930664062, 24.84471893310547, -44.76173400878906, -1.2181396484375, 5.584991455078125, 16.56011962890625, 130.1011199951172, 2.493408203125, 131.71377563476562, 122.104736328125, 31.285552978515625, 128.81024169921875, 7.555572509765625, 43.634033203125, 45.85658264160156, 30.518997192382812, 106.71737670898438, 236.3983154296875, 135.25897216796875, 124.91732788085938, -59.63092803955078, -3.7679443359375, 98.88938903808594, 83.55943298339844, 3.6588211059570312, 
169.376220703125, 33.344085693359375, 95.63470458984375, -5.239223480224609, 23.75799560546875, -32.12933349609375, -6.59967041015625, 149.20404052734375, 115.1258544921875, 50.016357421875, -23.746063232421875, 38.4739990234375, 0.0, 146.4266357421875, -5.76019287109375, -18.608367919921875, 85.97279357910156, -41.3182373046875, 149.6019287109375, 142.86318969726562, 112.79715728759766, 128.91983032226562, 133.91189575195312, -0.8568115234375, 127.17141723632812, 10.78948974609375, -15.813735961914062, 77.55133056640625, 5.9748077392578125, 24.233963012695312, 26.576080322265625, -81.1239013671875, -6.713508605957031, 127.00912475585938, 15.503082275390625, 76.21438598632812, 29.47515869140625, -3.14276123046875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000281.npy"}
|
||||
{"epoch": 0.5884816753926702, "step": 282, "batch_size": 128, "mean": 39.88337326049805, "std": 71.73343658447266, "min": -153.17404174804688, "p10": -38.68498840332031, "median": 27.54138946533203, "p90": 135.16560974121094, "max": 195.44171142578125, "pos_frac": 0.7265625, "sample": [129.9159698486328, -46.99504089355469, 56.19586181640625, 194.45233154296875, 86.97100830078125, 5.031227111816406, -153.17404174804688, 134.6568603515625, 93.85105895996094, -3.2619247436523438, 11.864456176757812, 63.98847961425781, 23.357421875, -27.114105224609375, 12.2557373046875, 9.096282958984375, 145.48721313476562, -38.202606201171875, 116.49270629882812, -2.8017120361328125, 9.07940673828125, 51.776824951171875, -90.48712158203125, 40.385955810546875, 1.165130615234375, 188.00234985351562, 121.07601928710938, 60.965087890625, -4.13592529296875, 10.83587646484375, -36.32135009765625, 20.675918579101562, 117.6578369140625, 98.94633483886719, -5.75701904296875, 44.19615936279297, 91.05926513671875, 6.70338249206543, 45.56793212890625, 30.15167236328125, 43.48649597167969, 81.21890258789062, 16.557403564453125, -96.66531372070312, 119.72909545898438, -27.77557373046875, -100.80258178710938, 38.70191955566406, 144.19802856445312, 4.449214935302734, 25.4566650390625, 32.073150634765625, -25.03216552734375, 13.428459167480469, -35.271514892578125, 161.8740234375, 58.667266845703125, 4.105926513671875, 2.5474929809570312, -58.65777587890625, 4.816230773925781, 5.91943359375, -15.1104736328125, 111.83026123046875, 131.131591796875, 97.89210510253906, 86.53785705566406, -9.886329650878906, -5.034095764160156, 125.71533966064453, -104.11077880859375, -2.806610107421875, 122.1500473022461, 54.92164611816406, 7.69775390625, 157.77490234375, 113.40234375, -39.810546875, 83.27813720703125, -30.924285888671875, -94.42721557617188, 10.470802307128906, -5.9063720703125, 156.01173400878906, 3.861602783203125, -63.61427307128906, 19.044113159179688, 16.738601684570312, 137.5062255859375, 
123.5367431640625, 159.01583862304688, 81.8992919921875, 15.63616943359375, 76.98907470703125, -105.27728271484375, 32.19694519042969, 110.23936462402344, 136.7721405029297, 7.951200485229492, -5.71612548828125, 126.47518920898438, 195.44171142578125, 110.61860656738281, -6.742259979248047, 107.01507568359375, -1.795623779296875, -133.03237915039062, 22.61481475830078, 45.63604736328125, 35.68271255493164, 122.24278259277344, 29.626113891601562, 52.853057861328125, 95.43401336669922, 7.838203430175781, 126.9061279296875, 142.7789306640625, 16.521820068359375, 136.35269165039062, 46.478546142578125, 53.604705810546875, -34.1205940246582, -4.3517913818359375, -48.748046875, 93.08035278320312, -8.58135986328125, 6.14508056640625, 44.8841552734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000282.npy"}
|
||||
{"epoch": 0.5905759162303665, "step": 283, "batch_size": 128, "mean": 42.89732360839844, "std": 69.46410369873047, "min": -121.25617980957031, "p10": -54.12179260253905, "median": 31.934402465820312, "p90": 126.5989028930664, "max": 188.5101318359375, "pos_frac": 0.75, "sample": [-13.120681762695312, 28.857467651367188, 129.20950317382812, -25.2412109375, 4.747264862060547, -51.768585205078125, 122.28385925292969, -82.70260620117188, 60.85382080078125, 2.478302001953125, 9.47869873046875, 125.7056884765625, 92.4951171875, 85.00129699707031, -1.09423828125, -96.548095703125, -20.314605712890625, 106.68414306640625, 6.8595428466796875, -34.1285400390625, 183.96942138671875, 143.1588134765625, 32.150848388671875, -43.4326171875, 138.77008056640625, 110.00238037109375, 98.79136657714844, -24.393096923828125, 62.74919128417969, 14.66290283203125, 77.17115783691406, 56.56175231933594, 21.767257690429688, 113.11111450195312, 34.844383239746094, 96.2456283569336, 6.6003570556640625, 0.33921241760253906, 66.437744140625, 79.19889831542969, -4.19281005859375, 108.37469482421875, -59.61260986328125, -76.59304809570312, 102.75935363769531, 123.5233154296875, 18.1256103515625, 13.173828125, -89.45138549804688, 148.65313720703125, 93.2025146484375, 125.60090637207031, 103.42100524902344, 24.31983184814453, 146.37086486816406, 2.5095443725585938, 2.5809326171875, 43.29277038574219, 22.04376220703125, 100.94413757324219, 100.37725830078125, 82.8785400390625, 135.49517822265625, 160.6292724609375, 126.10466003417969, -7.59228515625, 127.75213623046875, 2.62384033203125, 46.41886901855469, 6.8941192626953125, -77.0009536743164, 115.88136291503906, -75.29006958007812, 136.3975830078125, -2.446319580078125, 66.85977172851562, 124.01852416992188, 95.43510437011719, 0.0, 2.60198974609375, -65.72000885009766, 15.303131103515625, 98.96063232421875, -102.39833068847656, 121.39129638671875, 50.3702392578125, 92.22967529296875, 12.844696044921875, -63.3321533203125, 28.36712646484375, 
-2.4749069213867188, 1.18841552734375, 12.611984252929688, -18.931396484375, 29.138702392578125, 107.94387817382812, 109.37176513671875, 51.535552978515625, -121.25617980957031, 108.24505615234375, 0.9077472686767578, 31.71795654296875, 188.5101318359375, -5.3930816650390625, 32.260009765625, 38.08973693847656, 106.60455322265625, -101.7060546875, 21.14581298828125, 2.2601776123046875, 147.16949462890625, 18.0400390625, -0.5157127380371094, 2.993497848510742, 0.0, 95.57562255859375, 121.00934600830078, 186.31198120117188, 13.099891662597656, 4.822734832763672, 64.3843994140625, 103.74337768554688, 58.72608947753906, 39.32975769042969, -10.53680419921875, 119.77244567871094, -60.300537109375, -28.08251190185547], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000283.npy"}
|
||||
{"epoch": 0.5926701570680628, "step": 284, "batch_size": 128, "mean": 41.707672119140625, "std": 78.41712951660156, "min": -145.68905639648438, "p10": -80.8250503540039, "median": 35.50697326660156, "p90": 138.15797424316406, "max": 234.1375732421875, "pos_frac": 0.7578125, "sample": [-90.73916625976562, 88.12310791015625, -94.08970642089844, -145.68905639648438, 110.00218200683594, 35.4722900390625, 137.74859619140625, 135.384521484375, -82.80827331542969, 26.391273498535156, 33.72064971923828, 87.98965454101562, 89.43999481201172, 20.67957305908203, 27.600555419921875, 100.09373474121094, 125.88569641113281, 72.49639892578125, 142.21221923828125, 7.33990478515625, 121.69425964355469, 200.22833251953125, -18.51287841796875, 90.02059936523438, -6.915517807006836, 14.552116394042969, 12.51507568359375, -79.97509765625, 15.91278076171875, -41.721160888671875, -92.32618713378906, 37.440673828125, 67.57598876953125, -23.275596618652344, 5.4118499755859375, 36.66131591796875, 126.60006713867188, 81.04074096679688, 4.553676605224609, 6.59368896484375, -111.10125732421875, 11.482070922851562, 57.38072204589844, 36.86285400390625, -10.181480407714844, 0.183990478515625, -38.49568176269531, 46.7950439453125, 14.44384765625, 180.67605590820312, 129.43234252929688, -114.85830688476562, 69.10043334960938, 24.64483642578125, 108.450927734375, 86.52658081054688, 9.172607421875, -14.247108459472656, 141.79583740234375, 99.38150024414062, 79.19625854492188, 153.22303771972656, 25.30902099609375, -6.408653259277344, 125.21876525878906, 47.714385986328125, 163.85415649414062, 29.65869140625, 82.79855346679688, 11.281824111938477, 113.11520385742188, 12.298988342285156, -4.497552871704102, 35.541656494140625, 2.3125076293945312, 234.1375732421875, 8.88409423828125, 10.108894348144531, 19.664047241210938, 0.992095947265625, 164.02235412597656, 124.99761962890625, 101.44587707519531, 145.50177001953125, -59.736419677734375, -126.3404541015625, 5.304841995239258, -17.313385009765625, 
47.46662902832031, -54.39031219482422, 104.73194885253906, 57.45367431640625, 101.09555053710938, 113.17897033691406, 1.0239219665527344, -2.222869873046875, 76.4537353515625, -111.10960388183594, 2.2609100341796875, 88.14549255371094, 139.11318969726562, 36.149993896484375, 101.06222534179688, 13.02728271484375, 0.7598361968994141, 126.87588500976562, -27.664657592773438, 78.93789672851562, 22.56854248046875, 113.30702209472656, 125.02295684814453, 185.72332763671875, 59.22686767578125, -22.38897705078125, -112.74563598632812, -103.83929443359375, 92.98123931884766, -98.80900573730469, 2.391489028930664, -16.40911865234375, 169.037353515625, 56.31208801269531, 102.14385986328125, 182.53982543945312, 107.36312103271484, -2.7618980407714844, -105.85498046875, 95.36306762695312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000284.npy"}
|
||||
{"epoch": 0.5947643979057592, "step": 285, "batch_size": 128, "mean": 31.10248565673828, "std": 64.68875122070312, "min": -127.72418212890625, "p10": -43.71647491455078, "median": 18.793716430664062, "p90": 122.46670074462891, "max": 168.2447052001953, "pos_frac": 0.6875, "sample": [15.306854248046875, 97.40573120117188, 103.383056640625, 26.577392578125, 120.7315673828125, 125.42678833007812, 109.6748046875, -63.63591003417969, 106.45077514648438, 19.032730102539062, 75.86213684082031, -15.134735107421875, 10.108497619628906, -77.36923217773438, -11.10931396484375, 26.608238220214844, 18.554702758789062, -31.47076416015625, -17.96014404296875, -11.441314697265625, -7.80169677734375, 44.27801513671875, -96.03296661376953, 14.26153564453125, 31.473350524902344, 36.52996826171875, 153.79544067382812, 44.543670654296875, 19.41193389892578, 37.998382568359375, 168.2447052001953, -101.61248779296875, 12.8343505859375, 43.208038330078125, 126.2388916015625, 126.04963684082031, 114.4002685546875, -43.07243347167969, -30.445571899414062, 55.80296325683594, -45.21923828125, -30.409912109375, -24.327468872070312, -3.658294677734375, -0.23346900939941406, 63.117919921875, 71.8707275390625, 115.28262329101562, -0.35735321044921875, 8.567031860351562, 126.58885192871094, 1.2279033660888672, 8.490234375, -60.91350555419922, 137.71359252929688, 1.7057647705078125, -5.4558868408203125, 0.6099205017089844, 109.32440185546875, 15.118675231933594, 4.4060516357421875, 93.17047119140625, 122.09788513183594, 43.075286865234375, 46.32379150390625, 10.69403076171875, -50.96636962890625, 5.524236679077148, -55.75690460205078, 41.56365966796875, -114.95556640625, 109.36151123046875, 29.228904724121094, -2.4761505126953125, 93.69256591796875, 0.904632568359375, 43.554046630859375, 20.760852813720703, 30.543487548828125, 15.712371826171875, 96.55867767333984, 109.11013793945312, 12.488128662109375, 2.754058837890625, -15.288619995117188, 4.8013916015625, 32.4193115234375, 123.3272705078125, 
141.45501708984375, 103.15139770507812, 97.12562561035156, 112.44960021972656, -3.0566024780273438, -26.1973876953125, -25.521575927734375, 4.470916748046875, 37.185546875, 14.792236328125, -36.402099609375, 19.84564208984375, 86.82600402832031, -32.72901916503906, 92.99580383300781, -35.921234130859375, -49.60369873046875, -28.2298583984375, 10.7900390625, -14.978181838989258, 134.34365844726562, 79.69088745117188, -127.72418212890625, -32.896026611328125, 28.849822998046875, -65.46435546875, -7.919593811035156, -114.23623657226562, 115.88875579833984, 5.9300384521484375, 117.03604125976562, 57.633941650390625, 17.88433837890625, 59.567047119140625, 85.82162475585938, 125.31175994873047, -41.736419677734375, 71.60223388671875, 148.43057250976562, 135.87661743164062], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000285.npy"}
|
||||
{"epoch": 0.5968586387434555, "step": 286, "batch_size": 128, "mean": 36.007080078125, "std": 70.49595642089844, "min": -127.69097900390625, "p10": -54.20704078674315, "median": 26.73963165283203, "p90": 124.33255310058593, "max": 185.39788818359375, "pos_frac": 0.671875, "sample": [-7.27435302734375, -23.19757080078125, 45.637725830078125, 17.859771728515625, 10.742561340332031, 31.332748413085938, 9.72430419921875, 7.760047912597656, 8.0965576171875, 72.81649780273438, 98.00721740722656, 16.5263671875, -3.5919952392578125, 87.33575439453125, -104.13705444335938, 148.1796112060547, 12.07147216796875, 140.32785034179688, 103.09613800048828, 139.16326904296875, -28.017303466796875, 121.91073608398438, 15.32257080078125, 26.573532104492188, 18.660316467285156, 25.353973388671875, -15.132095336914062, -6.254547119140625, 100.24238586425781, -18.701019287109375, -127.69097900390625, 13.701919555664062, 118.23812103271484, 114.20545959472656, -8.57171630859375, 59.445068359375, 5.98126220703125, 72.90625, 9.464534759521484, 88.17568969726562, -8.55999755859375, 119.12274169921875, 104.68553161621094, -88.6768569946289, 66.5260009765625, -44.32672119140625, 36.814208984375, 183.54046630859375, -12.971725463867188, -7.00927734375, 122.66458129882812, -125.18414306640625, 126.72036743164062, 91.88571166992188, 78.288330078125, 91.84344482421875, 3.930816650390625, 91.07135009765625, -24.87110137939453, 148.63446044921875, -9.176994323730469, 0.0, -3.4765625, -115.02676391601562, 21.095050811767578, 51.493896484375, -5.374080657958984, -23.073867797851562, -5.69171142578125, 3.4067306518554688, -50.147369384765625, 49.3017578125, 108.86994934082031, 40.134918212890625, -21.33907699584961, 26.905731201171875, 119.54476928710938, 71.95372009277344, -89.12332153320312, 48.6411018371582, 101.47705078125, 2.758312225341797, -0.6875514984130859, 185.39788818359375, 122.96755981445312, 34.56951904296875, 27.148895263671875, -74.73594665527344, 102.66142272949219, 
-98.80084228515625, 80.9246597290039, 130.22122192382812, 35.24273681640625, 104.81454467773438, 131.56686401367188, 58.525146484375, 82.27189636230469, 58.099769592285156, 91.6666488647461, 90.86434936523438, 152.647216796875, 30.385894775390625, 132.92367553710938, -16.781005859375, 72.70391845703125, -42.66204833984375, 147.00234985351562, 61.769248962402344, -2.349456787109375, -63.67960739135742, -82.60397338867188, 123.3092041015625, -6.7387237548828125, -106.14031219482422, -93.00589752197266, 173.75357055664062, -78.65524291992188, -4.86273193359375, 19.24394989013672, 11.786003112792969, 64.17112731933594, 116.35009765625, -0.2240772247314453, 16.689132690429688, 60.8909912109375, -23.314498901367188, 104.97831726074219, 5.0279998779296875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000286.npy"}
|
||||
{"epoch": 0.5989528795811518, "step": 287, "batch_size": 128, "mean": 38.455078125, "std": 68.6739730834961, "min": -152.75421142578125, "p10": -42.411434936523435, "median": 32.084524154663086, "p90": 127.90033111572265, "max": 162.5587158203125, "pos_frac": 0.7421875, "sample": [110.31663513183594, -26.65192413330078, -40.822509765625, -10.54371452331543, -23.534194946289062, -95.59376525878906, 122.66915893554688, 117.303466796875, 77.79010772705078, 153.4609375, 107.07846069335938, 17.920196533203125, 3.407672882080078, 117.84673309326172, 3.77398681640625, 16.822906494140625, -26.21209716796875, -67.9625244140625, -29.723175048828125, 12.750465393066406, -21.519012451171875, 162.5587158203125, 114.61666870117188, 12.808624267578125, 44.66473388671875, -22.177642822265625, -12.545124053955078, 68.53390502929688, 58.271484375, 39.337615966796875, 11.479782104492188, 127.70181274414062, 23.042579650878906, 65.73260498046875, -6.26165771484375, 4.420989990234375, 79.696044921875, 105.31961059570312, -1.9264488220214844, 109.46076202392578, 115.8937759399414, 103.98104858398438, 17.914459228515625, 47.79736328125, 128.36354064941406, 4.808145523071289, 2.5092620849609375, 118.29147338867188, 69.11544799804688, 149.3345947265625, 35.19419860839844, 126.40768432617188, 109.36895751953125, 32.3217887878418, 25.5482177734375, 42.931640625, 33.88568878173828, 0.092864990234375, 108.20327758789062, -76.7742919921875, 106.92115783691406, 86.31967163085938, 43.194793701171875, 15.780742645263672, 2.291677474975586, 31.847259521484375, 106.67538452148438, 128.689697265625, 96.97598266601562, 53.06219482421875, -118.8575439453125, 1.3221817016601562, 130.687255859375, 133.3310546875, 49.37786865234375, 98.30491638183594, 107.12814331054688, 131.54505920410156, 16.1396484375, 88.84353637695312, 157.1370391845703, -69.15496826171875, -91.23216247558594, -12.84161376953125, -35.67010498046875, 45.422142028808594, -31.297607421875, 89.00144958496094, -68.2340087890625, 
16.438705444335938, 101.45820617675781, 141.76809692382812, -152.75421142578125, -14.68701171875, 23.003868103027344, -14.103912353515625, 92.66761779785156, 4.24119758605957, -114.14773559570312, 44.223358154296875, 7.69232177734375, 14.8726806640625, 7.577003479003906, 8.5357666015625, -31.753204345703125, -23.689239501953125, 106.99877166748047, -114.050048828125, 142.56829833984375, -11.051078796386719, 134.16360473632812, -49.26994323730469, 102.37136840820312, 68.64485168457031, 136.93458557128906, 7.29522705078125, 15.803756713867188, 59.18470764160156, 60.62408447265625, 9.164752960205078, 18.656457901000977, 6.951423645019531, -79.990234375, -46.118927001953125, 50.813201904296875, 123.30380249023438, -19.320114135742188, 96.01504516601562], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000287.npy"}
|
||||
{"epoch": 0.6010471204188481, "step": 288, "batch_size": 128, "mean": 39.559120178222656, "std": 68.42576599121094, "min": -122.91476440429688, "p10": -33.35609436035156, "median": 24.77197265625, "p90": 123.15812072753906, "max": 193.46865844726562, "pos_frac": 0.734375, "sample": [48.0740966796875, 4.4322052001953125, -22.7607421875, -10.6875, 86.08462524414062, 3.380157470703125, -22.020950317382812, -19.692626953125, 27.58038330078125, 94.72283935546875, 24.574462890625, 153.09649658203125, -4.844696044921875, 116.20962524414062, 97.7684326171875, -11.656280517578125, 8.793403625488281, 46.79960632324219, 24.969482421875, 0.0, 18.94488525390625, -12.4661865234375, 30.593505859375, 9.79837417602539, 66.39486694335938, 2.01171875, -30.277740478515625, 6.55035400390625, -84.30023193359375, 19.903564453125, 159.19769287109375, 109.4652099609375, 23.892501831054688, 81.57870483398438, 7.010337829589844, 0.0, -10.513900756835938, 18.54486846923828, 11.99420166015625, 17.00537109375, -13.444686889648438, 106.8495864868164, -41.95648193359375, -103.94866943359375, 12.886123657226562, 108.23530578613281, 152.4014892578125, 143.14291381835938, -122.91476440429688, 0.4671306610107422, 120.83084106445312, -8.348808288574219, 117.41937255859375, 2.849761962890625, 11.4500732421875, -117.98443603515625, 4.959278106689453, 53.05438232421875, 111.96632385253906, -106.80557250976562, 193.46865844726562, -32.91819763183594, -49.666259765625, 66.90066528320312, 165.6671142578125, 16.61529541015625, -3.1187591552734375, -54.2301025390625, 116.44432067871094, 6.157569885253906, 120.13833618164062, 29.737945556640625, -1.9002532958984375, 115.49853515625, 12.06768798828125, 11.89324951171875, 27.1947021484375, 44.4334716796875, -30.56414794921875, 65.5008773803711, 87.63052368164062, 40.197021484375, -3.535358428955078, -91.019775390625, -32.77247619628906, -58.148277282714844, 27.906017303466797, 144.43502807617188, 122.81103515625, 11.115081787109375, 78.51522827148438, 
4.90032958984375, 97.64483642578125, -7.664546966552734, 82.8055419921875, 109.75529479980469, 87.98841857910156, -49.43524169921875, 131.662353515625, 25.65667724609375, 43.82341003417969, 72.85121154785156, -88.75308227539062, 185.21713256835938, 8.065765380859375, 28.9197998046875, 15.403846740722656, 91.84368896484375, 38.41943359375, -7.070343017578125, -34.37785339355469, 93.62828063964844, 49.04634094238281, 108.78875732421875, 94.2625732421875, 113.64266967773438, 29.3970947265625, 11.42279052734375, 17.79608154296875, 114.87867736816406, 14.886343002319336, 192.98944091796875, 111.18193054199219, 158.3580322265625, 123.96798706054688, 35.115692138671875, 174.3809814453125, 114.45213317871094], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000288.npy"}
|
||||
{"epoch": 0.6031413612565445, "step": 289, "batch_size": 128, "mean": 28.921913146972656, "std": 70.89383697509766, "min": -139.44039916992188, "p10": -60.76953124999999, "median": 13.7879638671875, "p90": 122.07459869384765, "max": 158.89161682128906, "pos_frac": 0.640625, "sample": [-49.34552001953125, -20.8297119140625, 13.9539794921875, -71.70978546142578, -4.02081298828125, 2.39849853515625, 12.0120849609375, -23.424072265625, 47.541748046875, -83.87197875976562, 117.92401123046875, -5.241947174072266, 0.8749532699584961, 46.0599365234375, -1.0394439697265625, 84.50814819335938, -33.6287841796875, 130.2796630859375, 15.25604248046875, -107.20635986328125, 158.89161682128906, 134.1679229736328, 140.3223876953125, 8.746284484863281, -1.33740234375, -18.781234741210938, 0.909698486328125, 3.5975723266601562, 8.860137939453125, 79.79270935058594, -5.6021270751953125, 110.97503662109375, -2.568450927734375, 56.32965087890625, 0.4219818115234375, 0.0, 7.654899597167969, 84.10342407226562, -26.8446044921875, 13.6219482421875, 11.206371307373047, -66.18017578125, -106.92138671875, -120.24111938476562, 74.3450927734375, -27.850250244140625, 49.8211669921875, 101.76409912109375, -3.420379638671875, -2.760589599609375, -9.85394287109375, 103.40560913085938, -126.388671875, -120.45034790039062, 16.058868408203125, 66.88416290283203, -3.907003402709961, -2.0150527954101562, 21.420515060424805, 153.48919677734375, -87.3138427734375, 110.44183349609375, 81.24609375, 103.35891723632812, 50.267822265625, 104.23014831542969, -41.741363525390625, -98.309326171875, -35.98375701904297, 2.903533935546875, 124.64505004882812, -23.788528442382812, 5.686553955078125, 103.42875671386719, 15.797683715820312, 17.955276489257812, 0.5371856689453125, 47.7296142578125, 99.27008056640625, 6.376434326171875, 92.56576538085938, -20.66943359375, -139.44039916992188, 128.30303955078125, -14.441879272460938, 90.38916778564453, -13.832412719726562, 120.97297668457031, -40.24839782714844, 
127.20632934570312, 33.027740478515625, 145.80267333984375, 111.96673583984375, 12.22393798828125, 35.10298156738281, 99.3516845703125, 82.8411865234375, 8.402618408203125, 84.7613525390625, 82.82623291015625, 126.41000366210938, 85.32483673095703, 146.66046142578125, 95.69973754882812, -20.102264404296875, 110.36506652832031, 14.7274169921875, -12.237319946289062, 103.85482788085938, 158.63775634765625, -105.51907348632812, 112.03746795654297, 118.5345458984375, -33.493804931640625, 118.63020324707031, -58.45068359375, -103.010986328125, 51.638023376464844, -12.568992614746094, 119.0573959350586, 25.581634521484375, 27.837493896484375, 143.96725463867188, 2.4947280883789062, 16.97296142578125, -7.0464935302734375, -2.463531494140625, 32.45794677734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000289.npy"}
|
||||
{"epoch": 0.6052356020942409, "step": 290, "batch_size": 128, "mean": 35.29863739013672, "std": 65.31219482421875, "min": -131.03720092773438, "p10": -36.05894317626953, "median": 22.96661376953125, "p90": 124.23167724609375, "max": 179.06231689453125, "pos_frac": 0.6640625, "sample": [135.71531677246094, 6.78668212890625, -6.171142578125, 106.97705078125, 54.38018798828125, -131.03720092773438, -16.537063598632812, 110.55673217773438, -13.79498291015625, -23.05303955078125, 153.8272705078125, 126.27886962890625, 46.813720703125, -11.880325317382812, 112.93983459472656, 102.72593688964844, 38.426429748535156, 33.16644287109375, 127.41864013671875, 105.58096313476562, 117.68450927734375, 3.7567367553710938, 0.78460693359375, 6.9694061279296875, 170.60873413085938, 2.517547607421875, -83.85995483398438, -103.9415283203125, -20.074249267578125, 1.200531005859375, 139.371337890625, -4.897350311279297, -5.362152099609375, 40.52728271484375, 138.21832275390625, 146.51153564453125, 107.89338684082031, 21.268369674682617, -8.4276123046875, -62.19960021972656, 99.35411834716797, 31.17591667175293, -0.09942626953125, 116.14053344726562, 37.346588134765625, -0.0396728515625, 1.023712158203125, -35.832794189453125, 14.037374496459961, 60.24090576171875, -17.604583740234375, -5.1594696044921875, 23.15380859375, 89.98352813720703, 25.201927185058594, -26.24432373046875, 32.98595428466797, 99.58139038085938, -7.8916015625, 22.7794189453125, 9.414459228515625, 12.528430938720703, -0.1910400390625, 106.933349609375, 30.518310546875, 21.232421875, 16.506210327148438, 73.57421875, 33.657379150390625, 109.4775390625, -28.654632568359375, 24.822463989257812, 1.6541748046875, 27.378639221191406, -19.70318603515625, 149.3280029296875, -7.462703704833984, -44.16461181640625, -3.534820556640625, 83.69451904296875, -4.2540283203125, -12.676475524902344, -35.04936218261719, 122.43356323242188, 103.71847534179688, 115.74896240234375, 116.911865234375, -9.726654052734375, -122.75381469726562, 
11.100662231445312, 97.69050598144531, -36.58662414550781, 57.068389892578125, 13.8367919921875, -42.06884765625, 5.531135559082031, -27.186553955078125, -10.387786865234375, -78.39262390136719, 100.06747436523438, 9.153327941894531, -80.72943878173828, 39.19891357421875, 179.06231689453125, 93.0498046875, 60.96990966796875, 123.93548583984375, 123.81292724609375, 127.15145874023438, 33.34601593017578, -47.72590637207031, 24.7838134765625, 99.86660766601562, 44.53546142578125, 80.93283081054688, -61.75428771972656, -14.637199401855469, 11.82672119140625, -37.896331787109375, 106.82652282714844, -9.773712158203125, 2.8639450073242188, 23.643203735351562, 100.53522491455078, -22.658660888671875, 62.126068115234375, 131.01974487304688, 124.92279052734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000290.npy"}
|
||||
{"epoch": 0.6073298429319371, "step": 291, "batch_size": 128, "mean": 49.555999755859375, "std": 69.85961151123047, "min": -154.94265747070312, "p10": -27.95686798095703, "median": 38.829254150390625, "p90": 139.5332061767578, "max": 180.6478271484375, "pos_frac": 0.7265625, "sample": [143.40615844726562, 0.0445556640625, 167.60855102539062, 13.525054931640625, 162.29855346679688, 64.035888671875, 144.963134765625, 118.46189880371094, 36.863189697265625, -23.96807861328125, -2.314727783203125, 99.109619140625, 34.49562072753906, -6.55059814453125, 137.838623046875, 12.569061279296875, 154.14358520507812, -15.816650390625, 105.98455810546875, 51.99260711669922, 70.80917358398438, 41.816375732421875, 112.2042236328125, 22.34991455078125, -10.253738403320312, 120.32284545898438, 48.33930969238281, 7.4279022216796875, 94.31680297851562, -27.31610107421875, -125.38191223144531, -0.9487953186035156, -35.83348083496094, 9.55935287475586, -154.94265747070312, 28.00799560546875, 102.58953857421875, 79.5755615234375, 40.91461181640625, 5.880126953125, 138.11068725585938, -26.815872192382812, -85.14083099365234, 40.795318603515625, 131.03273010253906, 32.188507080078125, 7.669898986816406, 85.90614318847656, 50.9373779296875, 11.6798095703125, 129.1912841796875, 104.13099670410156, 83.83158111572266, -18.2081298828125, 180.6478271484375, 103.95646667480469, 22.377838134765625, -33.251373291015625, -39.589508056640625, 109.64462280273438, 146.6319122314453, 17.354156494140625, -10.083953857421875, -19.96417236328125, 121.64242553710938, -45.705657958984375, 5.4262847900390625, -0.232879638671875, -46.40557861328125, 135.75657653808594, -55.026580810546875, 18.126617431640625, 62.56705856323242, 176.30865478515625, 33.2449951171875, 29.0438232421875, 46.967918395996094, 116.23532104492188, 132.9957275390625, -5.76507568359375, 74.50592041015625, 0.37708091735839844, -5.70062255859375, -18.0491943359375, 104.19810485839844, 144.71783447265625, 75.27049255371094, 
-53.371826171875, 0.0, -7.961704254150391, -29.076156616210938, 121.00933837890625, 13.371139526367188, -109.3790054321289, 173.28338623046875, 105.254638671875, 125.73056030273438, 32.3385009765625, 142.8524169921875, -27.4771728515625, 72.37234497070312, 124.88076782226562, 31.70458984375, 121.33670043945312, -1.06024169921875, 65.07037353515625, 16.183563232421875, 34.942718505859375, 166.24542236328125, 43.0262451171875, -74.80813598632812, 122.1641845703125, -7.096595764160156, 62.045997619628906, 98.47494506835938, 18.9713134765625, 113.1981201171875, -0.9577159881591797, 148.9093017578125, 127.3394546508789, 77.8359375, 12.93145751953125, 3.54437255859375, -10.940946578979492, 130.09686279296875, 136.702392578125, 100.36930847167969, 31.450881958007812], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000291.npy"}
|
||||
{"epoch": 0.6094240837696335, "step": 292, "batch_size": 128, "mean": 41.39234161376953, "std": 66.96598815917969, "min": -169.31427001953125, "p10": -27.533229064941406, "median": 28.48443603515625, "p90": 125.6664566040039, "max": 200.266845703125, "pos_frac": 0.7578125, "sample": [13.052764892578125, 3.7260284423828125, 101.40300750732422, -58.109893798828125, 84.99948120117188, 97.6766128540039, 119.56857299804688, 171.9443359375, 19.352615356445312, 132.02850341796875, 181.76141357421875, 89.72343444824219, -26.908203125, -9.524879455566406, 51.36212158203125, 25.95458984375, -8.945791244506836, -8.979795455932617, 11.9007568359375, -4.62152099609375, 9.498321533203125, 17.655479431152344, 38.116058349609375, 63.990570068359375, 121.4731674194336, 56.211700439453125, 124.76231384277344, 27.8724365234375, 25.286582946777344, -5.09068489074707, 137.42218017578125, 98.49383544921875, 1.5424461364746094, 39.116790771484375, 100.83444213867188, 23.322917938232422, 24.49102783203125, -41.648681640625, -166.60397338867188, 8.937217712402344, 116.26741027832031, 14.207389831542969, 88.57589721679688, 67.36105346679688, -57.33074951171875, 8.487457275390625, 15.368289947509766, -87.7154541015625, 0.15489959716796875, 121.69036865234375, -22.3359375, 142.69476318359375, -35.59789276123047, -13.7755126953125, -5.46636962890625, 47.89335632324219, 1.502197265625, -1.2880401611328125, 31.3717041015625, 14.168106079101562, 105.53311157226562, 42.264404296875, 118.21775817871094, -28.991622924804688, -11.31378173828125, 11.925830841064453, 15.18408203125, 32.3187255859375, 200.266845703125, 55.714599609375, 6.577484130859375, 70.44168090820312, 80.61184692382812, 35.51507568359375, 99.84603881835938, 157.69573974609375, 5.74627685546875, 129.79774475097656, 27.154651641845703, -13.049388885498047, -52.792572021484375, -169.31427001953125, 152.21026611328125, 100.46076965332031, 120.2783432006836, 83.08232116699219, 107.40386962890625, 29.096435546875, 7.047636032104492, 
127.776123046875, -35.55322265625, 12.643524169921875, 4.190036773681641, 17.80328369140625, 73.9882583618164, -17.013198852539062, -10.466764450073242, -24.140609741210938, 18.567798614501953, 137.52362060546875, 86.71380615234375, 88.01473236083984, 11.88287353515625, 16.34805679321289, 30.36803436279297, 77.15142822265625, 75.0350341796875, -0.5913543701171875, -3.5550994873046875, 29.796112060546875, 5.858245849609375, 63.66424560546875, 39.462677001953125, -0.14166259765625, 119.93887329101562, 116.58370971679688, 169.13607788085938, -84.6920166015625, 83.03506469726562, 71.51258850097656, 137.96759033203125, -122.06463623046875, -55.778526306152344, 101.7589111328125, 105.23878479003906, 118.79066467285156, 47.8236083984375, 5.431558609008789], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000292.npy"}
|
||||
{"epoch": 0.6115183246073298, "step": 293, "batch_size": 128, "mean": 41.075889587402344, "std": 67.4294662475586, "min": -130.50375366210938, "p10": -25.92674407958984, "median": 25.348983764648438, "p90": 133.27406768798826, "max": 183.52960205078125, "pos_frac": 0.7265625, "sample": [129.03338623046875, -11.679611206054688, 118.43511962890625, 149.64990234375, 23.14520263671875, -41.1175537109375, 14.798248291015625, 92.58184814453125, 5.0448760986328125, 28.412841796875, -4.9889678955078125, 121.9831314086914, 39.9688720703125, 22.499847412109375, 20.070058822631836, 11.929763793945312, -63.702720642089844, -0.7777099609375, -12.779769897460938, -19.404876708984375, 2.1422348022460938, -0.154083251953125, 141.94992065429688, 111.25094604492188, 16.94012451171875, 0.0, 143.7578125, 38.276519775390625, 143.6676025390625, 16.535049438476562, 106.84710693359375, -28.1873779296875, -10.681381225585938, 4.3457794189453125, 15.053955078125, -35.86723327636719, 17.87359619140625, 99.50321960449219, -0.1959228515625, 42.58921813964844, 131.95648193359375, -2.6904449462890625, 101.63836669921875, 120.27676391601562, 42.52142333984375, -126.9261474609375, 66.29043579101562, 125.96836853027344, 166.18569946289062, 13.003368377685547, 59.72535705566406, 38.66827392578125, 111.29537963867188, 90.74652099609375, 2.7855682373046875, 123.26893615722656, 26.0101318359375, -9.7535400390625, 8.81353759765625, 20.94744873046875, -7.011238098144531, 136.3484344482422, 75.53268432617188, -12.560724258422852, -114.27108001708984, 50.772857666015625, 14.417865753173828, -69.53750610351562, 69.80270385742188, -3.9711074829101562, -130.50375366210938, -36.64263916015625, 95.35243225097656, 183.52960205078125, -111.2421875, 11.49224853515625, -10.4373779296875, 109.83154296875, -10.453201293945312, 37.72003173828125, 18.986724853515625, 41.1463623046875, 47.1593017578125, 18.908401489257812, 124.74261474609375, 28.700531005859375, 48.41046142578125, 116.73388671875, 109.15646362304688, 
-24.957901000976562, -17.69024658203125, 146.8145751953125, 10.105438232421875, 117.7335205078125, 47.89678955078125, 85.48240661621094, 25.80615997314453, -88.57508850097656, 24.73956298828125, 9.67801284790039, 97.12521362304688, 148.22171020507812, 176.1221923828125, 152.462158203125, 68.93096923828125, 115.04620361328125, 12.509895324707031, 24.891807556152344, -45.10429382324219, 0.196868896484375, -0.9586334228515625, 9.590484619140625, -1.60992431640625, 75.36639404296875, 181.07632446289062, 176.5986328125, 16.29278564453125, 29.165939331054688, 101.37831115722656, -88.15980529785156, 94.58489990234375, -9.005287170410156, 6.6222991943359375, 97.34039306640625, 28.888071060180664, -4.267669677734375, 38.89434814453125, 26.88507080078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000293.npy"}
|
||||
{"epoch": 0.6136125654450262, "step": 294, "batch_size": 128, "mean": 38.8336296081543, "std": 67.34125518798828, "min": -122.90298461914062, "p10": -37.27083129882812, "median": 19.374862670898438, "p90": 128.50979309082032, "max": 192.48153686523438, "pos_frac": 0.7265625, "sample": [126.39141845703125, 102.8878173828125, 116.0169677734375, -0.00732421875, -7.8276519775390625, 7.164970397949219, 1.0894126892089844, 2.48504638671875, 16.27564239501953, 127.75003051757812, 45.03460693359375, 128.4244384765625, 108.37362670898438, 50.516845703125, -5.092071533203125, 6.837547302246094, 120.86231994628906, 151.10243225097656, 127.99730682373047, 23.21160888671875, 87.00152587890625, 122.44355773925781, 95.16307067871094, -45.26325225830078, 0.8934860229492188, 128.138427734375, 12.19329833984375, -25.36505126953125, 6.5892333984375, 16.329391479492188, 152.92184448242188, 12.946929931640625, 11.025665283203125, 87.30987548828125, -17.649208068847656, 18.1531982421875, -26.823486328125, -67.00212097167969, 53.82733154296875, 9.17889404296875, 70.4559326171875, 150.4107666015625, -36.267578125, -67.3257064819336, 133.92550659179688, 95.8014144897461, 94.45574188232422, -5.4133758544921875, -5.435249328613281, 36.713043212890625, 132.20004272460938, -4.636322021484375, 139.97744750976562, 124.77349090576172, 22.98685073852539, 131.6320037841797, 131.6066131591797, -24.572509765625, 18.360382080078125, 11.699073791503906, 93.9931640625, 24.54217529296875, 28.871490478515625, 118.39030456542969, -106.81405639648438, 102.06607818603516, 97.98391723632812, 38.888824462890625, 12.885345458984375, 7.79376220703125, 16.539642333984375, 151.2310333251953, 139.93585205078125, 0.0, 125.5384750366211, 192.48153686523438, -106.93263244628906, -9.412490844726562, -10.414749145507812, 29.51861572265625, -7.113563537597656, 128.70895385742188, 6.407924652099609, -115.87022399902344, -13.958404541015625, 43.30096435546875, -39.61175537109375, 4.061195373535156, 100.47941589355469, 
-44.76130676269531, 24.980148315429688, 9.052803039550781, -2.8408584594726562, -109.58563232421875, 28.30461883544922, 81.46568298339844, 19.86968994140625, 0.99456787109375, 9.804901123046875, -14.793838500976562, 18.880035400390625, 109.6200942993164, 12.7708740234375, 4.686349868774414, 56.12732696533203, 115.54310607910156, -67.92486572265625, -71.78167724609375, 33.129638671875, -0.14150428771972656, -19.200302124023438, 111.02229309082031, 78.81468200683594, 32.635589599609375, 79.83984375, 90.70025634765625, 9.075439453125, -25.524879455566406, 90.24468994140625, -16.417190551757812, 16.766357421875, -47.381195068359375, 30.086395263671875, 2.3184280395507812, 132.72134399414062, -122.90298461914062, 100.41527557373047, 127.74434661865234], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000294.npy"}
{"epoch": 0.6157068062827226, "step": 295, "batch_size": 128, "mean": 42.648765563964844, "std": 72.4398422241211, "min": -121.83062744140625, "p10": -31.199935913085934, "median": 21.925960540771484, "p90": 133.16734161376954, "max": 209.303955078125, "pos_frac": 0.7265625, "sample": [145.32464599609375, 21.414398193359375, -33.37237548828125, -104.31526184082031, -10.042121887207031, 33.12846374511719, -15.673294067382812, 116.9354019165039, -97.16238403320312, -12.974334716796875, -15.095870971679688, 7.278587341308594, 129.72189331054688, 20.980377197265625, -11.753143310546875, -16.456130981445312, -109.6407470703125, 103.08828735351562, -100.21957397460938, 25.961891174316406, 106.18574523925781, 166.10198974609375, -16.24652099609375, -121.83062744140625, -13.983062744140625, 97.7906265258789, 15.707748413085938, -91.60421752929688, 94.93325805664062, -2.7496795654296875, 7.2532958984375, 22.860031127929688, 38.54461669921875, 112.26174926757812, 117.85247802734375, 9.037261962890625, 98.56256103515625, 9.707855224609375, 104.94158935546875, 77.8865966796875, 171.00718688964844, -10.2576904296875, 84.8345947265625, 120.8333740234375, -0.928497314453125, 89.33206176757812, 30.702064514160156, 156.16824340820312, 125.20858764648438, 110.56307983398438, -108.1728515625, -86.49392700195312, 129.32362365722656, 209.303955078125, 125.57142639160156, 173.71258544921875, -22.8935546875, 91.16204833984375, 51.29693603515625, 9.904094696044922, -17.181121826171875, 3.0325927734375, 14.798599243164062, 2.17633056640625, 108.164306640625, 64.04630279541016, 19.838401794433594, 63.34832763671875, 91.05455017089844, 18.344009399414062, 5.0535430908203125, 6.43963623046875, 1.442413330078125, 131.59780883789062, -30.268890380859375, 100.78794860839844, 129.47637939453125, 111.96188354492188, 14.935256958007812, 88.67660522460938, 8.630096435546875, 74.64006042480469, 32.881675720214844, 22.825332641601562, 6.4882659912109375, 18.73126220703125, 0.0, 100.19305419921875, 
7.696746826171875, 14.843231201171875, 130.8050994873047, 160.23822021484375, 28.21490478515625, 40.75408935546875, 136.36880493164062, 0.6553230285644531, -58.21307373046875, 151.2989501953125, 1.7601776123046875, 132.63270568847656, 93.05833435058594, -19.835296630859375, 9.805618286132812, 20.0511474609375, 26.75439453125, -105.07369995117188, 111.93970489501953, 15.69224739074707, -1.466522216796875, 90.16631317138672, -14.306110382080078, 124.10682678222656, 160.3958740234375, 132.54776000976562, -2.151947021484375, 157.23440551757812, 0.21286392211914062, -7.0467376708984375, 22.437522888183594, -48.0135498046875, -38.6329345703125, 5.20361328125, 134.41482543945312, -16.04998779296875, 110.97067260742188, 139.81483459472656, 80.80081176757812, -23.67822265625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000295.npy"}
{"epoch": 0.6178010471204188, "step": 296, "batch_size": 128, "mean": 57.38672637939453, "std": 72.43463134765625, "min": -139.92959594726562, "p10": -29.889566040039053, "median": 57.22616195678711, "p90": 142.75140228271485, "max": 172.0714111328125, "pos_frac": 0.78125, "sample": [93.09148406982422, 14.884931564331055, 16.49530029296875, 108.20988464355469, 110.00898742675781, -7.80523681640625, 1.8282012939453125, -1.727935791015625, 86.1289291381836, 5.15313720703125, 106.277099609375, -69.85220336914062, 45.228668212890625, 35.585845947265625, 7.829803466796875, 120.48626708984375, 126.88363647460938, 81.36862182617188, 107.13270568847656, -56.60382080078125, 120.78282165527344, 92.57666015625, -6.79296875, 113.36538696289062, 62.096221923828125, 139.4853515625, -7.903858184814453, 15.930496215820312, 133.20574951171875, -35.612335205078125, 17.071578979492188, 80.41586303710938, 165.278564453125, 6.661476135253906, -116.85835266113281, 154.79571533203125, 47.5081787109375, 118.22483825683594, 128.447509765625, 38.233970642089844, -60.48457336425781, 27.563262939453125, 151.94207763671875, 120.87516021728516, 30.796096801757812, -27.43695068359375, 31.7027587890625, -26.268478393554688, 167.821533203125, 15.44085693359375, 151.0, 70.2967529296875, 25.230682373046875, 138.11407470703125, 147.7491455078125, 167.61300659179688, 11.415420532226562, 155.15176391601562, 43.911956787109375, 113.28169250488281, 3.92767333984375, -7.164695739746094, 109.08872985839844, -49.66864013671875, 49.06036376953125, 89.15708923339844, 29.4556884765625, 135.5172119140625, -111.56651306152344, 138.90252685546875, 127.68888854980469, 76.60292053222656, -86.67739868164062, 118.62557220458984, 86.42605590820312, 40.109527587890625, 157.49215698242188, 171.10360717773438, -3.1190719604492188, 105.29934692382812, 94.28375244140625, 121.42218017578125, 7.2012786865234375, -139.92959594726562, 143.31069946289062, 101.59767150878906, 2.1398468017578125, -93.068603515625, 
141.15786743164062, -6.138275146484375, 112.4991455078125, 9.626800537109375, 92.48751831054688, -42.148101806640625, 39.42620849609375, 0.0, 118.4207763671875, 114.30657958984375, 115.78732299804688, 97.41989135742188, 23.8900146484375, 132.18466186523438, 66.17218017578125, 113.56791687011719, 8.101593017578125, 39.763824462890625, 10.396331787109375, 11.971435546875, 42.33843994140625, 117.74493408203125, 155.08883666992188, 81.19860076904297, 5.900646209716797, -63.584197998046875, -3.0061111450195312, 128.312744140625, 120.99725341796875, -2.694976806640625, -0.957000732421875, 142.51170349121094, -1.4595680236816406, 92.10614013671875, 22.6907958984375, 141.436767578125, 52.356101989746094, 172.0714111328125, -121.32711791992188, -5.5701141357421875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000296.npy"}
{"epoch": 0.6198952879581152, "step": 297, "batch_size": 128, "mean": 45.34703826904297, "std": 68.00128173828125, "min": -134.86968994140625, "p10": -23.74381103515625, "median": 31.545318603515625, "p90": 137.46063690185548, "max": 169.14117431640625, "pos_frac": 0.71875, "sample": [162.57382202148438, 57.05974578857422, 121.82762145996094, 134.7825927734375, 9.04779052734375, 2.131031036376953, 122.26751708984375, 52.58879089355469, 137.16290283203125, -21.2777099609375, -2.0544281005859375, -2.50537109375, -20.592666625976562, 3.8311309814453125, 146.3094482421875, -76.04743957519531, -9.223336219787598, 31.640777587890625, 138.1553497314453, 88.94239807128906, 11.679771423339844, 34.016448974609375, -3.0606460571289062, 47.186767578125, -57.79010009765625, -34.13897705078125, -21.91705322265625, -134.86968994140625, 14.423828125, -18.98361587524414, -59.21009826660156, 81.15370178222656, 27.513442993164062, 52.66758728027344, 44.92240905761719, 9.09521484375, 131.33453369140625, 0.296356201171875, 168.71463012695312, 19.655426025390625, 10.8214111328125, 7.9150390625, 10.102737426757812, -66.83212280273438, 4.048126220703125, -16.9044189453125, 0.09100341796875, 30.45001220703125, -99.3265380859375, 127.97735595703125, 35.63636779785156, -20.369781494140625, -0.5357666015625, -34.45501708984375, 1.93096923828125, 169.14117431640625, 121.6151351928711, 56.108158111572266, -4.210416793823242, -22.984359741210938, 68.34013366699219, 120.00759887695312, -119.09426879882812, 25.186767578125, 154.84939575195312, 26.79888916015625, 16.752044677734375, -23.675750732421875, 146.6434326171875, -9.54071044921875, 31.449859619140625, 100.90057373046875, 147.6358642578125, 140.0088653564453, 163.5096435546875, -57.8131103515625, 23.80720329284668, 5.287300109863281, 95.3726806640625, -7.197517395019531, -62.32806396484375, 115.82807922363281, 65.12628173828125, -5.687873840332031, 2.0527191162109375, 34.63801574707031, 33.49493408203125, -6.702880859375, 
133.21055603027344, 7.2025146484375, 113.38116455078125, -3.8951416015625, 80.79196166992188, 110.81108856201172, 84.056640625, 87.96028137207031, 7.1563568115234375, 133.64378356933594, -3.50634765625, 150.4969482421875, 104.929931640625, 112.81080627441406, 87.62791442871094, 126.19699096679688, -7.422901153564453, 93.45892333984375, 84.902099609375, 11.65130615234375, 105.96558380126953, 102.65106201171875, 150.994873046875, 115.62567138671875, 3.039215087890625, 79.54925537109375, -0.81829833984375, -23.902618408203125, 10.311614990234375, 63.4034423828125, 111.46524047851562, 32.8077392578125, 78.44542694091797, 119.55154418945312, 152.94564819335938, -41.3074951171875, -4.820556640625, 60.3363037109375, 110.78618621826172, 136.74749755859375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000297.npy"}
{"epoch": 0.6219895287958115, "step": 298, "batch_size": 128, "mean": 50.278289794921875, "std": 74.47785949707031, "min": -147.92132568359375, "p10": -34.54883728027344, "median": 32.34827423095703, "p90": 144.26207275390624, "max": 262.1134033203125, "pos_frac": 0.7265625, "sample": [185.53672790527344, 122.06521606445312, -18.924713134765625, 41.84117126464844, -41.5982666015625, -53.0594482421875, -30.96710205078125, -11.896697998046875, -23.7508544921875, 146.35760498046875, 81.97589111328125, -10.94580078125, 8.616058349609375, 16.8848876953125, 106.85340881347656, -84.08706665039062, 127.50592041015625, 81.3102798461914, 18.019569396972656, 100.63983917236328, 131.6917724609375, 120.78363037109375, 37.437896728515625, 185.2398681640625, 6.5419921875, -35.1685791015625, 118.94203186035156, -95.35665893554688, 122.1971435546875, 63.44793701171875, 127.4556884765625, -38.47564697265625, 114.58946228027344, 68.05215454101562, 29.4124755859375, -52.8653564453125, 5.850250244140625, 68.83441162109375, 126.79134368896484, -21.129653930664062, 110.99539184570312, -2.386077880859375, 115.59042358398438, -25.030853271484375, 37.6025390625, 152.29525756835938, 40.34332275390625, 21.060562133789062, -61.625030517578125, -10.048049926757812, 26.933319091796875, 124.08978271484375, 19.75946044921875, 135.05947875976562, -59.461883544921875, 31.203414916992188, 115.9969482421875, 144.8271484375, 12.790069580078125, 114.24577331542969, 173.1243896484375, -147.92132568359375, 120.38058471679688, -15.694564819335938, -16.5167236328125, 181.73663330078125, 144.0198974609375, 49.167327880859375, -58.0355224609375, 0.0, 5.116455078125, 7.709693908691406, 168.46240234375, 65.54425048828125, 24.391006469726562, 4.388599395751953, 111.55929565429688, 70.02621459960938, 12.967094421386719, 126.23690795898438, -31.229049682617188, -9.760574340820312, 14.768390655517578, 82.54368591308594, 117.0096206665039, 5.829246520996094, 136.33168029785156, 125.30738830566406, 
122.56890869140625, 9.589111328125, 190.932861328125, 36.3408203125, -20.61249542236328, -34.283233642578125, 16.929946899414062, 30.628662109375, 39.6959228515625, 14.56591796875, -4.949733734130859, -6.0439453125, -5.2594146728515625, 125.526123046875, 46.37593078613281, 128.33969116210938, 180.26513671875, 160.55752563476562, -18.26904296875, 96.68304443359375, 262.1134033203125, 12.559280395507812, 129.64877319335938, 33.493133544921875, 3.5640792846679688, 27.257308959960938, 142.73968505859375, 22.36822509765625, -14.28765869140625, 121.61312866210938, 55.35614013671875, 158.84164428710938, 7.4248504638671875, 75.15411376953125, 16.624893188476562, 22.6173095703125, -60.68963623046875, -26.78228759765625, 52.31100845336914, -74.24098205566406], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000298.npy"}
{"epoch": 0.6240837696335079, "step": 299, "batch_size": 128, "mean": 38.73846435546875, "std": 63.319210052490234, "min": -119.39753723144531, "p10": -28.157701110839838, "median": 22.342498779296875, "p90": 126.57821731567383, "max": 168.66473388671875, "pos_frac": 0.703125, "sample": [99.30177307128906, 126.82333374023438, 17.26769256591797, 90.05884552001953, 21.99249267578125, 3.0802001953125, 130.23049926757812, -96.7027587890625, 104.3402099609375, -117.03866577148438, 147.95669555664062, 41.62664794921875, 18.72217559814453, 99.18276977539062, -27.0111083984375, 122.57891845703125, -3.710235595703125, 38.58979034423828, 10.490997314453125, -119.39753723144531, 117.14591217041016, -2.586273193359375, 93.17678833007812, -80.45152282714844, 132.18548583984375, -4.581512451171875, -49.326080322265625, -30.833084106445312, 107.156005859375, 13.186262130737305, 87.8311767578125, -5.0980224609375, 129.43592834472656, -3.054779052734375, -94.78909301757812, 41.10918045043945, 103.51019287109375, -12.097076416015625, -46.97013854980469, 103.40621948242188, 11.70050048828125, 153.1129608154297, 21.610008239746094, 10.731985092163086, -9.448272705078125, -1.14337158203125, 135.80419921875, -3.9326725006103516, -16.451797485351562, -92.02928161621094, -48.47406005859375, 12.90423583984375, 8.694231033325195, 18.547653198242188, 104.87939453125, -99.5380859375, 108.82855224609375, -5.70068359375, 161.0802001953125, 47.452857971191406, 91.72557067871094, 48.882049560546875, 17.678237915039062, 63.994293212890625, -3.2089691162109375, 168.66473388671875, -2.683929443359375, 10.446756362915039, 81.45916748046875, 27.812332153320312, 103.49403381347656, 8.656394958496094, 60.25775146484375, 8.156402587890625, 106.95597839355469, 107.17388916015625, -6.960441589355469, -10.247455596923828, -7.9359283447265625, 118.3511962890625, -2.524688720703125, 72.55740356445312, 40.82904052734375, 15.79998779296875, 37.02433776855469, -4.5908355712890625, 127.54119873046875, 
41.9097900390625, 54.205078125, 92.03610229492188, 14.342193603515625, 93.05557250976562, 128.70892333984375, -13.096710205078125, 13.078622817993164, -8.424468994140625, 37.312049865722656, 5.236701965332031, 110.10747528076172, -13.586372375488281, -31.5875244140625, 55.75068664550781, 27.972946166992188, 10.62396240234375, 105.81745910644531, 166.24871826171875, 13.419601440429688, 126.4731674194336, 27.66363525390625, -4.562255859375, 12.63739013671875, 2.2691802978515625, 111.04106903076172, -32.70298767089844, 6.9278564453125, 10.054656982421875, -4.5711669921875, 33.624176025390625, 57.96839904785156, 45.72003173828125, -13.498985290527344, 95.7679443359375, 138.3970184326172, 60.79241943359375, 46.72174072265625, 118.824462890625, 22.6925048828125, 88.47683715820312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000299.npy"}
{"epoch": 0.6261780104712041, "step": 300, "batch_size": 128, "mean": 42.984806060791016, "std": 73.60923767089844, "min": -122.05523681640625, "p10": -52.66046295166016, "median": 33.12698936462402, "p90": 134.20425720214843, "max": 216.41876220703125, "pos_frac": 0.7109375, "sample": [132.45852661132812, 129.9923095703125, 21.17547607421875, -63.91917419433594, 12.752388000488281, -25.546142578125, 112.58660888671875, 83.83815002441406, 9.817020416259766, -45.804443359375, 146.9986572265625, 125.76544189453125, 136.89584350585938, 11.481487274169922, 1.7855606079101562, 128.06903076171875, -99.2886962890625, 24.084197998046875, 66.22480773925781, -6.568115234375, 142.72262573242188, 133.05072021484375, 54.285064697265625, 95.99090576171875, 66.33380126953125, 6.0115966796875, 39.090789794921875, -16.3485107421875, 73.57183837890625, -38.817344665527344, 69.889892578125, 88.82754516601562, 123.65518951416016, 110.34722900390625, 148.65496826171875, -27.901458740234375, -119.3216552734375, 19.618682861328125, 3.650275230407715, 175.89535522460938, 58.813751220703125, -18.6417236328125, -36.666351318359375, 0.5611572265625, 57.602752685546875, 124.0396728515625, -4.7696990966796875, 126.53551483154297, 74.31687927246094, 161.53396606445312, 8.530941009521484, -110.74607849121094, -89.82040405273438, 21.709625244140625, 32.5980339050293, -2.7345199584960938, 49.163238525390625, 48.32402038574219, 132.24105834960938, -8.748016357421875, 116.46778106689453, 118.57440185546875, -53.50531005859375, 147.0858154296875, 47.344024658203125, -11.9088134765625, 3.873870849609375, 94.55345153808594, 95.93638610839844, 120.28102111816406, 57.21588134765625, 95.75091552734375, 0.296661376953125, 16.373443603515625, -17.75074005126953, -76.42341613769531, 156.40464782714844, 56.883056640625, -77.20524597167969, 126.03802490234375, 131.50588989257812, 12.287250518798828, 45.795440673828125, -52.29838562011719, 109.89624786376953, 19.25006103515625, 1.3563232421875, 0.0, 
-15.047744750976562, 33.65594482421875, 160.7608642578125, -57.77947998046875, -122.05523681640625, 100.9320068359375, 5.8688201904296875, 128.71231079101562, 93.44776916503906, 7.72296142578125, 165.84039306640625, 116.28167724609375, 17.96759033203125, 216.41876220703125, 6.3253021240234375, 68.09738159179688, 0.763458251953125, 93.52377319335938, 113.3260498046875, -56.01690673828125, -35.9217529296875, -16.128021240234375, 78.42132568359375, -4.144708633422852, 22.572906494140625, -24.694671630859375, 54.29925537109375, 107.86053466796875, 45.992767333984375, 4.0515594482421875, -90.42308044433594, -2.32354736328125, 166.8504638671875, -0.2799072265625, -71.88327026367188, -36.8056640625, 123.20294189453125, 143.97161865234375, -6.197620391845703, 4.9595947265625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000300.npy"}
{"epoch": 0.6282722513089005, "step": 301, "batch_size": 128, "mean": 39.303504943847656, "std": 70.5409164428711, "min": -167.56961059570312, "p10": -34.84185943603515, "median": 24.41314697265625, "p90": 135.76199340820312, "max": 202.8848876953125, "pos_frac": 0.703125, "sample": [12.725784301757812, 160.34771728515625, 121.7081298828125, 26.71893310546875, 19.080307006835938, 126.25357055664062, 25.781417846679688, -34.05589294433594, 78.75942993164062, 8.991146087646484, 4.283786773681641, -0.02227783203125, 125.158935546875, 27.0797119140625, 1.517333984375, 4.395820617675781, -17.302047729492188, -43.07928466796875, 110.00492858886719, 153.10604858398438, 13.697784423828125, 14.086311340332031, 108.71368408203125, 23.588592529296875, 114.21681213378906, 16.55633544921875, -0.35614013671875, 56.0985107421875, 8.681488037109375, 127.934814453125, 115.25909423828125, -5.4144287109375, 0.0, 149.77993774414062, 121.00527954101562, 46.007713317871094, -15.260650634765625, 32.32781982421875, -20.847213745117188, 113.82264709472656, -15.409112930297852, 106.98455810546875, 148.47811889648438, 30.387374877929688, -138.44012451171875, 8.497802734375, -51.68609619140625, 107.15951538085938, -29.8126220703125, 23.162109375, 66.8699722290039, -15.9461669921875, -37.256988525390625, 124.46067810058594, 161.935546875, 77.2258529663086, 35.24870300292969, 152.35223388671875, -4.1495208740234375, 202.8848876953125, 26.048057556152344, 109.27365112304688, 20.663848876953125, 29.803497314453125, -58.143096923828125, -48.1146240234375, 10.657928466796875, -167.56961059570312, -118.89443969726562, 44.28955078125, 61.287254333496094, 15.593231201171875, -19.72576904296875, 19.00025749206543, 9.723495483398438, 27.307861328125, 131.02392578125, -12.192413330078125, 139.10775756835938, -71.1795654296875, 5.1249847412109375, -15.216537475585938, 163.30235290527344, 48.1568603515625, 136.15570068359375, 26.97509765625, -5.334623336791992, -25.677902221679688, 107.00479125976562, 
-111.96748352050781, 124.22955322265625, 34.771728515625, -15.072662353515625, 3.8801727294921875, 114.66162109375, -43.55548095703125, 39.705810546875, 24.87664794921875, 90.2866439819336, 104.62139892578125, 111.83175659179688, 128.15512084960938, 144.71463012695312, -8.609481811523438, 29.08074951171875, 135.59326171875, 23.94964599609375, -17.757389068603516, 151.3136444091797, -15.088874816894531, 20.82220458984375, -3.5573883056640625, -14.40472412109375, 6.6479034423828125, 3.7254257202148438, 96.98958587646484, -99.08535766601562, 30.51822853088379, 3.046009063720703, 125.51010131835938, 160.0242919921875, -36.67578125, -17.820831298828125, 51.03520202636719, 22.994552612304688, 70.97994232177734, 99.59280395507812, -11.86419677734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000301.npy"}
{"epoch": 0.6303664921465969, "step": 302, "batch_size": 128, "mean": 48.272743225097656, "std": 73.87347412109375, "min": -141.18997192382812, "p10": -53.03250961303711, "median": 44.22471618652344, "p90": 136.10097961425782, "max": 175.85452270507812, "pos_frac": 0.75, "sample": [141.13075256347656, 56.644134521484375, 111.69463348388672, 125.83175659179688, 21.819839477539062, 98.21576690673828, 0.0, -8.096572875976562, -3.09759521484375, -53.21177673339844, -126.32293701171875, 127.24765014648438, 34.451988220214844, 127.25726318359375, -8.240509033203125, 62.244041442871094, 63.91473388671875, 71.69244384765625, -15.893692016601562, -78.59911346435547, 147.86141967773438, 127.27215576171875, 159.174072265625, -11.5694580078125, 100.1944580078125, 114.61764526367188, 169.791259765625, 16.440887451171875, -97.3424072265625, 101.13091278076172, 39.85923767089844, 108.63046264648438, 113.56593322753906, 86.03219604492188, 47.91209030151367, 45.75140380859375, 159.18954467773438, 102.02485656738281, 42.840118408203125, 44.4786376953125, 113.71701049804688, -2.2024307250976562, -141.18997192382812, -2.716796875, 84.2432632446289, 50.43205261230469, 168.96334838867188, 22.654067993164062, 151.950439453125, 134.5381317138672, -16.231719970703125, 125.73768615722656, 11.735710144042969, 4.3927764892578125, -9.760368347167969, -78.66397094726562, 94.12034606933594, 4.304962158203125, 109.69430541992188, 38.88957214355469, 108.00038146972656, 86.63003540039062, 43.94366455078125, 9.30743408203125, 65.69345092773438, 10.7069091796875, 18.009918212890625, 12.466201782226562, -5.558015823364258, -82.38870239257812, 33.774085998535156, 119.90692138671875, -17.266159057617188, -103.25286865234375, 126.6961669921875, 107.34515380859375, -40.89307403564453, 146.718017578125, 112.21833801269531, 94.63378143310547, 22.925582885742188, 50.42767333984375, 124.96340942382812, -103.11807250976562, 136.085205078125, -6.740509033203125, -111.3616943359375, 70.44837188720703, 
-66.01669311523438, -97.49845886230469, 16.893293380737305, -84.4464111328125, 2.129608154296875, 108.19085693359375, 3.9816513061523438, 43.970794677734375, 92.39022064208984, 134.82315063476562, 136.13778686523438, -52.95568084716797, 0.47015380859375, 26.37860107421875, 24.499839782714844, -13.970306396484375, 117.42186737060547, 0.3947601318359375, 30.145904541015625, 175.85452270507812, 36.64231491088867, 101.51689147949219, 0.378265380859375, 10.124061584472656, 47.7677001953125, -13.926681518554688, 121.47238159179688, 94.016845703125, 169.392578125, 3.2036399841308594, 117.06509399414062, 43.440673828125, 21.956375122070312, 171.315185546875, 85.29228210449219, 130.38973999023438, 91.38070678710938, -5.180419921875, 136.80975341796875, -44.41007995605469], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000302.npy"}
{"epoch": 0.6324607329842932, "step": 303, "batch_size": 128, "mean": 45.40017318725586, "std": 73.44808959960938, "min": -152.29522705078125, "p10": -45.336740112304675, "median": 48.619895935058594, "p90": 129.51590423583983, "max": 189.40582275390625, "pos_frac": 0.7421875, "sample": [14.89605712890625, 50.45874786376953, -66.36331176757812, -109.2103271484375, 106.88644409179688, -10.600936889648438, 90.191162109375, 51.455230712890625, 121.16897583007812, 103.43714904785156, -54.33135986328125, 118.52845764160156, 134.32916259765625, 127.48199462890625, 42.620208740234375, -1.056243896484375, -63.554779052734375, 107.97659301757812, -5.610280990600586, 145.96923828125, 101.09996032714844, 33.588226318359375, 109.5809555053711, 61.7938232421875, 95.322021484375, -41.481903076171875, 95.44551086425781, -10.751861572265625, 148.90658569335938, 12.309951782226562, 137.24984741210938, 125.84233093261719, -40.65568542480469, 124.64772033691406, -10.120559692382812, 13.22845458984375, -0.3845062255859375, 48.649383544921875, 0.0, 16.110443115234375, -97.80243682861328, 59.46741485595703, -10.985084533691406, 35.862091064453125, 9.515872955322266, 0.199676513671875, 123.12748718261719, 44.51025390625, 85.07447052001953, 107.03305053710938, 60.399200439453125, -91.43617248535156, 113.34097290039062, -152.29522705078125, 69.85809326171875, 128.45191955566406, 131.99853515625, 81.287353515625, 45.1219482421875, 107.85952758789062, -0.094970703125, -2.5073699951171875, 26.758560180664062, -7.501312255859375, -65.71273803710938, 9.258819580078125, 75.11126708984375, 108.5621337890625, 125.21649169921875, 84.0409927368164, 145.89794921875, -118.09051513671875, 20.841400146484375, -1.0257568359375, 122.51154327392578, 103.062744140625, -1.623748779296875, 4.371429443359375, 124.56072235107422, 181.79925537109375, 83.38131713867188, -145.71075439453125, 99.45968627929688, -116.88485717773438, 6.157537460327148, 65.31790161132812, 108.56927490234375, 6.28387451171875, 
52.82232666015625, 81.80712890625, 9.830657958984375, -1.2515869140625, 136.9222869873047, 3.8513946533203125, 141.7569580078125, 1.8446197509765625, 10.690017700195312, 75.65974426269531, -28.7431640625, 0.96624755859375, -27.785125732421875, -138.56582641601562, 51.19842529296875, -3.72052001953125, 7.133872985839844, 94.97235870361328, 108.51730346679688, 21.656158447265625, 123.30049133300781, 97.23858642578125, 95.2591552734375, 95.3134765625, 71.83463287353516, 112.67526245117188, -12.668243408203125, 189.40582275390625, 110.08291625976562, -95.42025756835938, 150.20022583007812, 12.62249755859375, 146.6583251953125, 19.05804443359375, 164.6727294921875, 16.87420654296875, 30.666748046875, 6.965240478515625, 10.676544189453125, 48.59040832519531], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000303.npy"}
{"epoch": 0.6345549738219896, "step": 304, "batch_size": 128, "mean": 39.34748840332031, "std": 71.61701965332031, "min": -139.334716796875, "p10": -59.98331909179687, "median": 31.184814453125, "p90": 131.20396728515624, "max": 175.71453857421875, "pos_frac": 0.6796875, "sample": [0.0, 31.95269775390625, -13.142333984375, 105.64810180664062, -58.829132080078125, -0.48577880859375, -104.36709594726562, 127.94696044921875, 123.09378051757812, -24.29315185546875, 74.10832977294922, 129.4246826171875, 34.15985107421875, 74.39683532714844, -62.676422119140625, 39.406494140625, 15.718505859375, 137.79000854492188, 44.13788604736328, 13.8306884765625, 11.844528198242188, 21.73802947998047, 12.218582153320312, -10.851654052734375, 94.50711059570312, 124.94869995117188, 108.2840576171875, 55.62761306762695, 9.288482666015625, 3.038604736328125, -13.462921142578125, -1.0023269653320312, -2.4817657470703125, -5.2796783447265625, 34.914581298828125, 23.137176513671875, 37.84235382080078, -1.1952896118164062, 33.0703125, 109.919189453125, 48.20343017578125, 89.34483337402344, 107.44041442871094, -56.82057189941406, 159.0028076171875, 21.46368408203125, -9.205574035644531, 117.0128173828125, -82.62045288085938, 40.36480712890625, 38.048492431640625, -26.620086669921875, 89.70787048339844, -139.334716796875, 0.0, 132.54444885253906, -74.2071533203125, 89.44791412353516, 8.112518310546875, -15.04754638671875, -84.47042846679688, 22.726547241210938, -105.21440124511719, 89.19584655761719, 21.164459228515625, 58.825469970703125, -36.407012939453125, 46.764404296875, 166.55731201171875, 113.64324951171875, -72.41403198242188, 19.805755615234375, 38.74827575683594, -94.90728759765625, -49.43310546875, 148.80093383789062, 139.13958740234375, -0.026123046875, 134.88064575195312, -100.58950805664062, 123.95660400390625, 104.05133056640625, -24.918922424316406, -4.769420623779297, 175.71453857421875, 16.280441284179688, 132.37380981445312, -5.435943603515625, 124.4725341796875, 
-110.22540283203125, -13.259445190429688, -0.19364356994628906, 24.810333251953125, 133.76071166992188, 141.04151916503906, 9.84793472290039, -32.141448974609375, 113.58650207519531, 124.45591735839844, 104.52691650390625, 30.41693115234375, 29.06170654296875, 92.10383605957031, 10.37750244140625, 94.18851470947266, 124.29702758789062, 50.24925231933594, 151.90509033203125, 119.76283264160156, 130.70260620117188, 90.17152404785156, 47.432342529296875, -86.67060852050781, 2.9866485595703125, 112.75166320800781, 34.038177490234375, -67.3448486328125, -11.296188354492188, 114.0191650390625, -10.439407348632812, 19.083297729492188, 98.7151870727539, 0.517486572265625, 115.009765625, -17.825836181640625, 124.45095825195312, 20.920806884765625, 151.40325927734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000304.npy"}
{"epoch": 0.6366492146596858, "step": 305, "batch_size": 128, "mean": 57.34446716308594, "std": 72.61917114257812, "min": -175.32199096679688, "p10": -14.328898620605454, "median": 50.094512939453125, "p90": 144.08741455078123, "max": 259.27734375, "pos_frac": 0.8203125, "sample": [8.54129409790039, 11.575939178466797, -10.193740844726562, 187.79696655273438, 35.67803955078125, 4.918487548828125, 102.68409729003906, -55.55210876464844, 75.35505676269531, 18.97760772705078, 39.362098693847656, 36.734466552734375, -9.3861083984375, 107.14593505859375, 18.399932861328125, -47.25639343261719, -23.97760009765625, 26.68927001953125, 63.2857666015625, 0.4558677673339844, 95.83558654785156, 132.89651489257812, 0.9528274536132812, 96.53074645996094, 6.951812744140625, 17.598876953125, -84.686279296875, -4.5001983642578125, 31.892684936523438, 113.20303344726562, 24.56866455078125, 79.7349853515625, 116.33154296875, 87.91561889648438, 129.30242919921875, 18.047958374023438, -0.399658203125, -3.5034637451171875, 114.62305450439453, 0.389923095703125, -121.39457702636719, 114.18951416015625, -5.0513458251953125, 103.24324798583984, 139.08233642578125, 114.52009582519531, 49.73455810546875, 151.34683227539062, 130.35171508789062, 110.73587036132812, 1.0939922332763672, -94.52079772949219, 156.010986328125, 144.92904663085938, 35.72987365722656, 70.51712036132812, 259.27734375, 98.212646484375, 169.4398193359375, 153.4384765625, 24.307083129882812, -0.4174766540527344, -9.914794921875, 138.77627563476562, 37.3377685546875, 98.73319244384766, 142.82284545898438, 117.0882568359375, 106.00689697265625, 62.508270263671875, -49.82921600341797, 108.9556655883789, -64.66881561279297, 34.238555908203125, 94.90771484375, -175.32199096679688, 78.42573547363281, 117.97982788085938, 10.960845947265625, 46.9991455078125, 2.891998291015625, 139.8768310546875, 146.99078369140625, 9.868247985839844, 107.17379760742188, 98.27432250976562, -27.098846435546875, 181.19345092773438, 
3.0693016052246094, 10.558547973632812, 147.25112915039062, 56.9537353515625, 124.24652099609375, 45.67987060546875, 91.68661499023438, 115.91033935546875, 54.425201416015625, 118.4864501953125, 166.41354370117188, 50.4544677734375, 32.846221923828125, -3.72869873046875, 117.1058349609375, 118.58209228515625, 181.31890869140625, 22.76617431640625, 29.90777587890625, 38.751220703125, 58.38609313964844, 20.87005615234375, 142.40020751953125, 2.131837844848633, 14.608871459960938, 4.097900390625, 5.956428527832031, -78.91417694091797, 143.72671508789062, 3.1101303100585938, 10.136856079101562, -51.065582275390625, 107.23397827148438, 51.48681640625, 156.63189697265625, 125.98321533203125, 130.36273193359375, -85.98226928710938, -4.368988037109375, 135.73919677734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000305.npy"}
{"epoch": 0.6387434554973822, "step": 306, "batch_size": 128, "mean": 48.40129852294922, "std": 71.30281066894531, "min": -141.13916015625, "p10": -24.798302268981924, "median": 44.75984191894531, "p90": 139.13483276367188, "max": 175.27691650390625, "pos_frac": 0.734375, "sample": [86.80917358398438, -99.61651611328125, -87.31847381591797, 173.82415771484375, -141.13916015625, 167.41400146484375, -14.076799392700195, 21.03704833984375, -12.038116455078125, 128.4691162109375, 4.446216583251953, 24.442718505859375, 130.0885467529297, 99.66514587402344, 2.317138671875, 10.072357177734375, -37.792236328125, 115.15928649902344, -3.17340087890625, 27.16448974609375, 72.18524169921875, 81.49156951904297, -44.751312255859375, 46.625732421875, -126.08709716796875, 156.94952392578125, 147.7332763671875, 116.47647094726562, 99.19268798828125, -0.49546051025390625, 56.6142578125, 58.27850341796875, 92.81361389160156, 118.56832885742188, 139.00657653808594, 113.0419692993164, 158.00640869140625, 117.993896484375, 110.77325439453125, 92.91213989257812, 6.959892272949219, 104.56803131103516, -6.22015380859375, 8.049545288085938, 103.12881469726562, 89.04421997070312, 52.616119384765625, 110.28022766113281, 57.07659912109375, -98.89898681640625, -7.446319580078125, 130.2860870361328, -11.160125732421875, 171.41058349609375, 45.50701904296875, -128.89996337890625, 80.85417938232422, 64.8046875, -39.22821044921875, 165.38754272460938, -3.3757171630859375, 33.3438720703125, 82.17495727539062, 25.266098022460938, 112.40435028076172, 3.5356063842773438, 5.99127197265625, 4.2386627197265625, 99.80438232421875, -77.54745483398438, 94.4964599609375, -109.15969848632812, 100.90396118164062, -19.33758544921875, 168.75823974609375, -1.9842109680175781, -10.31585693359375, 34.78520202636719, 18.270416259765625, 23.33154296875, 25.4080810546875, 52.0150146484375, 106.85714721679688, 107.47515869140625, 140.2840576171875, -21.430206298828125, -30.3826904296875, 83.30087280273438, 
66.7069091796875, 175.27691650390625, 64.019775390625, 37.78080749511719, 20.456527709960938, -20.47723388671875, 21.61663818359375, 108.79086303710938, 106.525634765625, -10.7850341796875, 139.43409729003906, -22.404993057250977, 133.12017822265625, 44.012664794921875, 125.14363098144531, 92.08087158203125, 25.512420654296875, 40.629188537597656, 6.6029205322265625, -6.889373779296875, -5.281158447265625, -5.06402587890625, -8.04843521118164, 4.06610107421875, 123.25212097167969, 67.90339660644531, 15.570068359375, 163.72784423828125, 47.819801330566406, -2.4792327880859375, -6.689788818359375, 4.079345703125, 33.3720703125, 107.79342651367188, 0.8381195068359375, -60.010009765625, 111.33207702636719, 31.816375732421875, 133.4830322265625, 140.41580200195312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000306.npy"}
{"epoch": 0.6408376963350786, "step": 307, "batch_size": 128, "mean": 51.361385345458984, "std": 68.09608459472656, "min": -165.41787719726562, "p10": -16.940411376953122, "median": 43.45945739746094, "p90": 128.87836914062498, "max": 184.9169921875, "pos_frac": 0.7734375, "sample": [98.71929931640625, 124.32997131347656, 33.67671203613281, 141.5594482421875, -118.936279296875, -5.542579650878906, 69.86469268798828, 98.76177978515625, 113.39837646484375, 126.101806640625, -154.21932983398438, -5.165159225463867, 78.14593505859375, -4.1833343505859375, -11.112060546875, 107.12339782714844, 127.30474853515625, 110.40179443359375, 12.239395141601562, 19.08245849609375, 55.522918701171875, -51.135650634765625, 116.737060546875, 8.241350173950195, 165.51678466796875, -25.32781982421875, 101.826171875, 10.234375, 111.29255676269531, 151.75677490234375, 75.92961120605469, 24.663421630859375, 28.576568603515625, 88.34188079833984, 108.48622131347656, -16.071731567382812, 127.76588439941406, 67.61325073242188, 21.589996337890625, -8.164100646972656, 23.55999755859375, 87.16048431396484, 110.5960922241211, -12.87548828125, 125.67337036132812, 91.96746826171875, 131.4741668701172, -10.666030883789062, 5.36175537109375, 23.841094970703125, 68.16581726074219, 8.458612442016602, 131.62367248535156, 8.604774475097656, 65.30645751953125, 117.55799865722656, 104.53083801269531, 138.0838623046875, 26.4443359375, 3.5438079833984375, 15.25897216796875, 34.10528564453125, -82.55755615234375, 98.98089599609375, -7.075019836425781, -3.0170440673828125, 120.04454040527344, 31.453330993652344, -6.301445007324219, 6.254207611083984, -3.2569828033447266, 125.01764678955078, -116.79730224609375, -15.80389404296875, 31.48187255859375, -38.49261474609375, 84.3580322265625, 123.25511169433594, 32.85418701171875, 31.371002197265625, 135.7823028564453, -18.967330932617188, 37.20548629760742, 89.99530029296875, 154.5306396484375, 112.28347778320312, -37.0540657043457, 113.07301330566406, 
8.138568878173828, 41.99859619140625, 18.93341827392578, -38.93559265136719, 46.320465087890625, 158.07061767578125, 94.64291381835938, 17.15093994140625, 123.86973571777344, -60.42546844482422, -9.387405395507812, 117.5166015625, 3.805389404296875, 166.67337036132812, 31.918975830078125, 122.69476318359375, -165.41787719726562, 8.832794189453125, 1.109375, 41.94267272949219, 44.920318603515625, 85.63026428222656, -40.7679443359375, 25.9490966796875, 184.9169921875, 95.19878387451172, 70.33433532714844, 94.94570922851562, -0.9320068359375, -6.51165771484375, 50.11260986328125, 122.8458251953125, 48.521209716796875, 26.974205017089844, 110.53298950195312, 127.41098022460938, 138.92221069335938, 5.381984710693359, 134.58920288085938, 104.48733520507812], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000307.npy"}
{"epoch": 0.6429319371727749, "step": 308, "batch_size": 128, "mean": 38.033206939697266, "std": 71.61894989013672, "min": -165.2470703125, "p10": -42.41674194335937, "median": 24.069549560546875, "p90": 134.00708923339843, "max": 189.56460571289062, "pos_frac": 0.71875, "sample": [138.01319885253906, 41.000709533691406, -46.1142578125, 138.79824829101562, 51.57061767578125, -82.27070617675781, -47.74853515625, 49.40557861328125, 9.205474853515625, 72.98918151855469, 5.17913818359375, 150.00604248046875, -1.5141830444335938, -71.1219482421875, -2.4853515625, 44.02337646484375, 111.01869201660156, -4.4188232421875, 113.22064208984375, 126.1734619140625, 120.9028091430664, 36.18896484375, 0.697906494140625, -66.2359390258789, 12.83502197265625, 89.17205810546875, -142.90695190429688, -9.631401062011719, 22.769073486328125, 134.79153442382812, 110.08940124511719, 49.708953857421875, 76.69915771484375, 84.79930114746094, -16.19622802734375, -6.243408203125, 120.23037719726562, 0.2611083984375, 98.48202514648438, 25.370025634765625, 6.9405975341796875, 40.69904327392578, 174.08074951171875, -0.52215576171875, -34.86448669433594, -12.6776123046875, 1.52667236328125, 129.31170654296875, -5.580619812011719, 111.48135375976562, 5.592403411865234, 50.124847412109375, 2.9382858276367188, 99.7414321899414, -8.300086975097656, 29.140869140625, 107.506591796875, 22.175125122070312, -165.2470703125, 162.5755615234375, 74.64842224121094, 7.5889892578125, -7.5631103515625, -16.3892822265625, 189.56460571289062, 106.00276184082031, -2.8752613067626953, -115.37078857421875, 14.998733520507812, -0.5784225463867188, 58.847900390625, -83.72476959228516, 133.6708984375, 69.95916748046875, 10.266571044921875, 34.5352783203125, 149.38607788085938, 175.46954345703125, 137.14144897460938, -1.1521759033203125, 35.11064147949219, 1.6759033203125, 2.5062522888183594, 13.6126708984375, 68.02740478515625, -115.5162353515625, 55.65142059326172, 8.4091796875, -40.83209228515625, 99.76766967773438, 
19.65533447265625, 16.362945556640625, 33.54107666015625, 9.448684692382812, 111.18266296386719, 108.64427947998047, 145.11309814453125, 124.5866928100586, 31.864486694335938, 119.32257080078125, 27.114776611328125, 72.1309814453125, 167.88165283203125, -33.335289001464844, 42.20623016357422, 11.534339904785156, 13.98052978515625, -46.70595932006836, 100.18890380859375, 100.05026245117188, -115.770751953125, 14.5357666015625, -24.552642822265625, 2.140216827392578, 2.6470565795898438, 91.60687255859375, -28.384353637695312, 11.043716430664062, -5.115325927734375, 6.196197509765625, -37.9920654296875, 89.59541320800781, 123.75918579101562, 124.97879028320312, -83.1016845703125, -20.636474609375, 165.71566772460938, 60.61962890625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000308.npy"}
{"epoch": 0.6450261780104712, "step": 309, "batch_size": 128, "mean": 51.465328216552734, "std": 77.35250091552734, "min": -121.37408447265625, "p10": -44.6249599456787, "median": 42.67413330078125, "p90": 152.21206970214843, "max": 263.7127990722656, "pos_frac": 0.7265625, "sample": [0.614532470703125, 153.40631103515625, 125.5296630859375, 115.48745727539062, -1.1163787841796875, -20.6998291015625, -22.58642578125, -93.22327423095703, 126.96658325195312, 45.144378662109375, 152.7723388671875, 11.225303649902344, -12.080963134765625, 4.9369049072265625, -92.54151916503906, 93.55941009521484, 12.92608642578125, 21.61939239501953, -7.0572662353515625, 39.741058349609375, -9.31878662109375, 101.97367095947266, -13.593597412109375, -99.94602966308594, 11.54156494140625, 23.95166015625, 1.3621673583984375, 263.7127990722656, 109.253173828125, 65.93386840820312, -5.9420623779296875, 156.580322265625, 44.31756591796875, -1.9164199829101562, -5.903350830078125, -10.917404174804688, 82.89820098876953, -58.302581787109375, 189.694580078125, -75.38656616210938, 176.25137329101562, -54.462982177734375, -4.634101867675781, 63.2215576171875, 114.14578247070312, -57.263465881347656, 1.808349609375, -34.095947265625, -74.61395263671875, 23.60125732421875, 124.68280029296875, -13.807008743286133, 29.15057373046875, -102.94891357421875, 141.2822265625, 113.56283569335938, 118.99870300292969, 15.5054931640625, 85.697265625, 15.54693603515625, 152.60635375976562, -15.4482421875, 61.460609436035156, 209.458251953125, -12.323722839355469, 148.876220703125, 10.594894409179688, 94.8487777709961, 22.315879821777344, 63.447235107421875, 23.856918334960938, 28.53387451171875, 1.8191146850585938, 167.9638671875, 3.4640655517578125, 137.38499450683594, 115.98736572265625, 49.43382263183594, 10.311553955078125, 152.9195556640625, 118.97430419921875, 123.03836059570312, 5.567329406738281, 83.25802612304688, 116.27018737792969, 134.52517700195312, -2.5921249389648438, 90.91315460205078, 
41.03070068359375, -40.40866470336914, 119.32186889648438, -121.37408447265625, 4.398956298828125, -16.150190353393555, 152.0430908203125, 122.57058715820312, 19.418243408203125, 124.11708068847656, 105.52987670898438, 12.141098022460938, -98.2535400390625, 59.05976867675781, 125.8433837890625, 148.22744750976562, 133.53152465820312, -1.84515380859375, 176.88388061523438, 30.739013671875, 86.09130859375, -109.88040161132812, 145.6051025390625, 59.07501220703125, 30.99224853515625, 128.100830078125, 29.906280517578125, 126.94924926757812, 53.14215087890625, 154.51605224609375, -19.333709716796875, 133.77760314941406, 157.56365966796875, 53.189903259277344, 103.18165588378906, -59.598388671875, 89.68145751953125, 54.613037109375, -4.2506866455078125, 73.70355224609375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000309.npy"}
{"epoch": 0.6471204188481675, "step": 310, "batch_size": 128, "mean": 41.79604721069336, "std": 65.34387969970703, "min": -119.947265625, "p10": -36.88656425476074, "median": 29.147552490234375, "p90": 127.48927612304686, "max": 190.166015625, "pos_frac": 0.734375, "sample": [188.62918090820312, 18.405914306640625, 3.8398971557617188, 167.28704833984375, 36.71868896484375, 31.404144287109375, 17.590057373046875, 5.948280334472656, 116.21533203125, 35.793121337890625, 103.2104721069336, 157.45721435546875, 61.19509506225586, 111.71267700195312, 24.7139892578125, 38.64447021484375, 7.508941650390625, -5.07420539855957, 38.13323974609375, -104.31532287597656, 4.95843505859375, 5.313751220703125, 17.57257080078125, 55.221923828125, 83.91851043701172, -84.9344482421875, 101.29801940917969, -40.947906494140625, 114.67623901367188, -0.3985595703125, -3.989988327026367, -7.199378967285156, -1.72235107421875, 60.8411865234375, -6.01629638671875, -1.448486328125, 54.685455322265625, 155.40603637695312, -45.575164794921875, -82.10382080078125, 106.95980834960938, -1.7049407958984375, -10.017318725585938, 23.409210205078125, 101.771728515625, 36.845703125, 133.45565795898438, 38.73101043701172, 64.82124328613281, 8.74407958984375, -9.257904052734375, 41.44041442871094, 106.92129516601562, 93.41778564453125, 80.86679077148438, 126.8707275390625, 165.588134765625, 16.199249267578125, -4.301353454589844, -119.947265625, -36.56353759765625, 150.21575927734375, 22.18236541748047, 21.197998046875, -3.052703857421875, -99.75688934326172, 9.013900756835938, 16.174591064453125, 100.6641845703125, 4.140098571777344, -20.65972137451172, 55.54376220703125, 41.21073913574219, 29.3602294921875, 28.93487548828125, 103.86300659179688, 114.00799560546875, 154.5098876953125, 56.024505615234375, -5.5903778076171875, -37.64029312133789, 84.77876281738281, 99.48727416992188, -0.4986114501953125, -43.009246826171875, 97.95853424072266, 120.65345001220703, 4.64215087890625, 106.84496307373047, 
-8.17388916015625, 78.24298095703125, -47.12762451171875, 123.23892211914062, -68.36395263671875, 38.26658248901367, 2.6169967651367188, 12.043075561523438, 125.18354797363281, 6.243743896484375, 27.89599609375, 49.009605407714844, 141.92330932617188, 6.80078125, -35.072662353515625, 128.93255615234375, -9.52545166015625, 4.420806884765625, -51.238922119140625, -79.59857177734375, 88.33120727539062, -18.83782958984375, 95.06927490234375, 53.5252685546875, 120.32574462890625, 7.5997314453125, -3.511474609375, 17.67401123046875, 18.440765380859375, 8.646598815917969, 93.08668518066406, 40.036956787109375, 6.346485137939453, 125.3868408203125, 134.401123046875, 190.166015625, 72.86300659179688, 133.9398193359375, 110.66007995605469], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000310.npy"}
{"epoch": 0.6492146596858639, "step": 311, "batch_size": 128, "mean": 46.89714813232422, "std": 77.87544250488281, "min": -175.76528930664062, "p10": -37.555208587646476, "median": 33.830020904541016, "p90": 141.0401184082031, "max": 199.25466918945312, "pos_frac": 0.71875, "sample": [9.612861633300781, -1.8029937744140625, 114.39717102050781, 105.48979949951172, 136.7236328125, 31.973068237304688, 27.015182495117188, -7.6656036376953125, 135.89715576171875, 109.5076904296875, 126.83474731445312, 5.628993988037109, -9.947044372558594, -24.23712158203125, 33.79722595214844, -0.8846435546875, 125.8087158203125, 65.19007873535156, -71.42411804199219, 73.86492919921875, 165.685302734375, 130.05038452148438, 2.6746673583984375, 82.78677368164062, -1.4434280395507812, -16.1719970703125, 145.71548461914062, 110.62203979492188, 6.899677276611328, -2.427734375, -83.3140640258789, -152.2000732421875, -175.76528930664062, 113.71502685546875, 31.55523681640625, 147.50946044921875, -118.76416015625, -80.190673828125, 124.41178894042969, 49.91925048828125, -42.418983459472656, 89.37068176269531, 69.03147888183594, 2.49658203125, -4.274202346801758, 52.40992736816406, -14.464057922363281, 117.90084838867188, -99.76417541503906, 22.797752380371094, 97.00336456298828, 23.13623046875, 2.337554931640625, 27.8907470703125, 108.86720275878906, -24.03125, 74.32670593261719, -21.17138671875, -18.79986572265625, 150.91574096679688, 148.25738525390625, 127.57574462890625, 136.4835205078125, 107.1443862915039, 106.67572021484375, -89.74497985839844, 36.167266845703125, 11.107559204101562, 32.50458526611328, 33.862815856933594, -88.60107421875, -13.09029769897461, 160.49203491210938, 10.117828369140625, 22.456298828125, 80.05985260009766, 140.9688720703125, 96.134765625, 45.26703643798828, 141.20635986328125, 199.25466918945312, 2.1689453125, -93.9488525390625, 31.660606384277344, -0.042110443115234375, -34.4244384765625, 154.16525268554688, 148.59173583984375, 127.2457504272461, 
20.469696044921875, 109.38949584960938, 88.70095825195312, 97.54756164550781, 70.0263671875, 160.203857421875, -121.12403869628906, 158.983154296875, -7.538970947265625, -115.51632690429688, 109.67601013183594, 7.972026824951172, 138.32763671875, -21.887161254882812, 30.74859619140625, 10.341217041015625, -2.9698333740234375, 137.25662231445312, 133.92898559570312, 8.927583694458008, 48.12860107421875, 112.5758056640625, 95.94204711914062, 75.41339111328125, 33.297393798828125, -35.470733642578125, 5.221351623535156, 103.51399230957031, 130.7259521484375, 130.49887084960938, 2.8242111206054688, -25.301753997802734, -2.178985595703125, 5.684181213378906, 181.82806396484375, 96.75869750976562, 114.18009185791016, 79.5078125, -24.103118896484375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000311.npy"}
{"epoch": 0.6513089005235602, "step": 312, "batch_size": 128, "mean": 46.32633972167969, "std": 71.46038818359375, "min": -153.22845458984375, "p10": -38.17874755859375, "median": 41.237030029296875, "p90": 135.54689331054686, "max": 188.54281616210938, "pos_frac": 0.7265625, "sample": [32.536346435546875, -3.7091140747070312, -16.97454833984375, 117.27850341796875, 128.2225799560547, -153.22845458984375, 123.68255615234375, 114.560791015625, -36.94744110107422, -96.85836791992188, -18.753204345703125, 31.40387725830078, 3.187744140625, 103.08016967773438, 60.48039245605469, 146.36410522460938, 93.04217529296875, 122.6441650390625, -20.822677612304688, -80.25723266601562, 7.61175537109375, 110.69953918457031, 117.6083984375, -0.894256591796875, 6.7547454833984375, 48.491065979003906, 56.522705078125, 188.54281616210938, -39.5120849609375, -13.11619758605957, 138.73257446289062, -5.235416412353516, 2.38812255859375, 144.84771728515625, 57.6513671875, 46.52903747558594, 135.40963745117188, 0.6282520294189453, 85.87518310546875, 106.62322998046875, 161.62808227539062, 137.7874298095703, 17.300567626953125, 65.973876953125, 100.35009765625, 108.77751159667969, 53.4954833984375, 102.736572265625, 114.89588928222656, 62.917694091796875, 123.80813598632812, 128.17515563964844, 17.164413452148438, 2.946990966796875, -39.436431884765625, 15.1944580078125, 42.696624755859375, 17.25469970703125, -0.9430389404296875, 156.64462280273438, 9.007575988769531, -28.46929931640625, 134.59368896484375, -103.01828002929688, 55.005615234375, -34.4261474609375, 155.64248657226562, 121.12686157226562, -0.75152587890625, 94.30137634277344, 27.10174560546875, 120.74249267578125, 22.155536651611328, -4.881988525390625, 59.49493408203125, 86.40460205078125, 3.4392013549804688, 135.86715698242188, -4.411865234375, 103.4964599609375, 116.74681091308594, 135.91363525390625, -63.009613037109375, 117.18925476074219, -2.1761016845703125, 0.0, 56.35528564453125, -63.131919860839844, 
-100.272216796875, -85.32115173339844, 18.43310546875, 68.46002197265625, 7.205280303955078, 38.72572326660156, 106.91549682617188, 1.1845779418945312, -100.95671081542969, 20.577106475830078, 107.27642822265625, 2.6232147216796875, 33.58123779296875, 63.49186706542969, 117.99551391601562, 98.953369140625, -19.909210205078125, 94.84658813476562, 115.84344482421875, -1.3170166015625, 162.67288208007812, 35.7147216796875, 126.73223876953125, 19.5440673828125, 39.777435302734375, -4.743495941162109, -37.639739990234375, 43.11381530761719, 100.76995849609375, 6.564971923828125, -18.69647216796875, 163.59005737304688, -103.09916687011719, 144.95013427734375, -28.850006103515625, 122.00413513183594, 104.31021881103516, 19.550689697265625, 14.546792984008789, -54.14567565917969], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000312.npy"}
{"epoch": 0.6534031413612565, "step": 313, "batch_size": 128, "mean": 43.24102020263672, "std": 75.07164001464844, "min": -141.6334228515625, "p10": -52.484329223632805, "median": 34.32398986816406, "p90": 139.5916275024414, "max": 209.1826171875, "pos_frac": 0.7109375, "sample": [127.600830078125, 131.26681518554688, -31.6795654296875, -21.10113525390625, -40.61322021484375, -3.9917373657226562, 58.019287109375, 122.08128356933594, 148.0240478515625, -16.58099365234375, -11.431793212890625, 125.98123168945312, -125.78680419921875, 131.9586639404297, 89.16275787353516, 141.5623321533203, 129.45260620117188, 42.05743408203125, -19.31146240234375, -23.84063720703125, 11.298080444335938, 16.154052734375, 4.218379974365234, 128.6636199951172, 105.122802734375, -6.889404296875, -2.235107421875, 23.650604248046875, 101.6666259765625, 168.42739868164062, 160.236572265625, 197.6627197265625, 17.319015502929688, 65.88916015625, -51.260498046875, 61.234619140625, 81.00973510742188, 24.26165771484375, 165.13992309570312, 50.68915557861328, -51.188201904296875, 168.560791015625, 126.45664978027344, 37.165283203125, -84.18618774414062, 84.74382019042969, 23.729583740234375, -55.542236328125, 93.38929748535156, 86.10357666015625, 42.43751525878906, 84.79330444335938, -69.20162963867188, 16.191152572631836, 78.28045654296875, -0.6926078796386719, 138.74703979492188, 59.2581787109375, 27.29914093017578, 2.79949951171875, 112.7228012084961, 6.7034454345703125, 113.351318359375, 18.87908172607422, 11.773155212402344, 50.35302734375, -55.339935302734375, 94.1290283203125, -97.47500610351562, -18.126068115234375, 153.6138458251953, -13.69195556640625, 22.270042419433594, 143.87808227539062, 18.3878173828125, -4.3212890625, 31.482696533203125, 128.45669555664062, -96.87220764160156, 97.28880310058594, 9.809051513671875, -100.87106323242188, 75.78070068359375, -43.04644775390625, 37.33294677734375, 118.43824768066406, 76.35427856445312, -10.248291015625, 10.587713241577148, 
5.009193420410156, 26.938232421875, 209.1826171875, 104.11708068847656, 58.0201416015625, 116.74591064453125, 136.6972198486328, 24.439617156982422, 150.72406005859375, -38.081329345703125, -138.94650268554688, 55.59332275390625, 44.298500061035156, 8.052885055541992, 137.01971435546875, 77.50032043457031, -58.02958679199219, -0.6854133605957031, -1.8974609375, 120.67269897460938, 101.98629760742188, 25.5689697265625, 69.2178955078125, 57.400299072265625, 3.626373291015625, -0.9251556396484375, -11.002449035644531, 108.87059020996094, 28.712844848632812, -95.71881103515625, -105.52055358886719, -0.33229827880859375, -141.6334228515625, 68.97779846191406, 71.94156646728516, 22.9959716796875, 143.2873077392578, 24.264862060546875, 149.92726135253906], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000313.npy"}
{"epoch": 0.6554973821989529, "step": 314, "batch_size": 128, "mean": 44.87549591064453, "std": 75.32698059082031, "min": -153.27783203125, "p10": -53.7540771484375, "median": 37.286834716796875, "p90": 134.58890686035156, "max": 193.28787231445312, "pos_frac": 0.71875, "sample": [65.0170669555664, 97.79910278320312, 108.32550048828125, 42.38714599609375, -12.36151123046875, 19.10101318359375, -80.873046875, 80.4281005859375, -77.52706146240234, -113.20057678222656, -22.124649047851562, 40.43370056152344, 125.3201904296875, 139.8702392578125, 146.1893310546875, -62.809326171875, -83.54527282714844, 165.74270629882812, 110.26874542236328, 140.0045166015625, 93.04306030273438, -9.099151611328125, 53.226566314697266, 112.18162536621094, 7.416290283203125, 13.865806579589844, 10.510658264160156, 158.7396240234375, 7.000946044921875, 160.68002319335938, 4.984369277954102, -16.316162109375, 11.26727294921875, -93.59893798828125, 124.62342834472656, 76.32476806640625, 14.66241455078125, 134.47335815429688, 38.06927490234375, 134.8585205078125, -10.453033447265625, 101.2996826171875, -45.57275390625, 106.40434265136719, 4.4587860107421875, 65.07081604003906, 169.93753051757812, 115.17498779296875, 9.37200927734375, 5.1795654296875, 124.61654663085938, -0.23760223388671875, 36.50439453125, 97.63201904296875, -1.8188629150390625, 97.50369262695312, 30.11676025390625, 18.72882080078125, -14.434112548828125, -85.70285034179688, 119.45196533203125, 128.15379333496094, 110.52067565917969, 0.0, 1.8580322265625, -64.30970764160156, -149.17141723632812, 115.80661010742188, 20.74560546875, -16.742034912109375, -53.926513671875, 18.15924072265625, 87.117919921875, 109.74960327148438, 141.7569580078125, -32.738502502441406, -26.42243194580078, 150.76153564453125, -31.01239013671875, -153.27783203125, 112.35010528564453, 58.54827880859375, 121.45181274414062, 86.36611938476562, 0.0, 128.61590576171875, -6.6455078125, 56.24351501464844, 121.58135986328125, 18.925811767578125, 
31.970748901367188, 119.4970703125, 120.56961059570312, 33.818756103515625, 94.79141235351562, 131.26651000976562, 106.70748138427734, 19.53326416015625, 42.2490234375, 5.961669921875, 9.087871551513672, -23.13861846923828, -4.135711669921875, -43.4310302734375, 102.36880493164062, 1.619873046875, 104.5174789428711, 113.29071044921875, 44.24237060546875, 103.46340942382812, 156.41729736328125, 10.792251586914062, -53.68017578125, -3.3961029052734375, 2.306976318359375, 193.28787231445312, 125.42839050292969, -13.302864074707031, 18.194839477539062, 75.61541748046875, 118.60588073730469, 93.25235748291016, -65.11784362792969, 26.799224853515625, -149.25326538085938, 156.40347290039062, 80.12626647949219, -5.726545333862305], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000314.npy"}
{"epoch": 0.6575916230366492, "step": 315, "batch_size": 128, "mean": 52.22566223144531, "std": 71.41708374023438, "min": -151.05845642089844, "p10": -23.13702545166015, "median": 32.46699523925781, "p90": 147.73860778808594, "max": 226.34133911132812, "pos_frac": 0.78125, "sample": [2.5478057861328125, 9.703643798828125, 3.2542076110839844, 4.3938140869140625, 33.875762939453125, -35.3348388671875, -21.04144287109375, 12.683547973632812, 150.71298217773438, -11.627593994140625, 31.0582275390625, -11.157760620117188, 112.59019470214844, 19.97833251953125, 112.69589233398438, -36.25569152832031, 37.94648742675781, 147.37994384765625, 51.24517822265625, -20.970077514648438, -45.29533004760742, -28.026718139648438, 2.610076904296875, -2.3443069458007812, 165.0020751953125, 9.76153564453125, 106.13717651367188, 150.9451904296875, 179.62994384765625, 1.2154808044433594, 21.421398162841797, 39.509063720703125, 139.4208526611328, 142.7242431640625, 30.032264709472656, 164.93988037109375, 145.953369140625, 133.37759399414062, 94.3145751953125, -1.23675537109375, 5.516998291015625, -32.650970458984375, 22.1116943359375, 133.62802124023438, -45.871490478515625, 35.46055603027344, 7.549957275390625, 7.38421630859375, 15.64129638671875, 8.894775390625, 127.02215576171875, -18.93624496459961, 131.37832641601562, 148.39547729492188, 127.32998657226562, 130.38819885253906, 12.114471435546875, 45.22112274169922, 35.60272216796875, 67.96820068359375, 11.53192138671875, -2.567535400390625, -7.209598541259766, 103.58889770507812, 97.03610229492188, 133.75367736816406, 68.27645874023438, 97.41918182373047, 72.25018310546875, 226.34133911132812, -20.0213623046875, 28.351577758789062, 114.62460327148438, 24.49188232421875, 128.5546417236328, -40.234405517578125, -125.952880859375, 154.78448486328125, 4.307096481323242, 150.05377197265625, 104.86221313476562, 29.67303466796875, -1.2002410888671875, 125.4703369140625, 175.29473876953125, 8.607658386230469, 35.414703369140625, 
101.35435485839844, 22.3095703125, 166.134765625, 12.129287719726562, 67.69863891601562, -118.27764892578125, -42.134368896484375, 136.65765380859375, 155.193603515625, 73.2587890625, 38.05244064331055, -151.05845642089844, 140.1953125, 25.97187042236328, -6.31683349609375, -0.8587646484375, 88.46678924560547, 10.497512817382812, -53.751312255859375, -64.78939819335938, 49.91259765625, 22.002704620361328, 17.74908447265625, 138.262451171875, 13.567031860351562, 119.92340087890625, 183.28480529785156, 17.66265869140625, 98.84341430664062, 147.45709228515625, 10.4029541015625, -9.566680908203125, 14.331085205078125, 101.76744842529297, -3.0717430114746094, 123.89382934570312, 29.17535400390625, 121.44456481933594, 81.95703125, 34.71598815917969, 65.01380157470703], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000315.npy"}
{"epoch": 0.6596858638743456, "step": 316, "batch_size": 128, "mean": 53.209686279296875, "std": 74.13626861572266, "min": -129.09942626953125, "p10": -32.98233337402343, "median": 38.32499694824219, "p90": 153.2975311279297, "max": 240.67529296875, "pos_frac": 0.7421875, "sample": [113.82403564453125, 62.87542724609375, 37.5057373046875, 35.958740234375, -10.160888671875, -42.35417175292969, -9.822265625, 152.93704223632812, 70.52241516113281, 158.77420043945312, 25.716552734375, -37.5888671875, 132.13450622558594, 27.96765899658203, -55.30384826660156, 23.90851402282715, 114.09552001953125, -44.80909729003906, 122.1466064453125, 9.92266845703125, 123.17593383789062, 142.79241943359375, -25.7476806640625, 110.17359924316406, 123.15129089355469, 208.5372314453125, 42.11834716796875, 122.71588134765625, 43.459259033203125, 170.9077911376953, -8.422653198242188, 130.05799865722656, -23.845611572265625, 109.44872283935547, 159.35513305664062, 12.715225219726562, 95.1573257446289, 134.04296875, 150.1822052001953, 31.672149658203125, 3.299896240234375, 82.48831176757812, 4.63153076171875, 56.9183349609375, 12.809974670410156, 66.7337417602539, -45.630367279052734, 142.54034423828125, 25.64501953125, 1.4071044921875, 74.38299560546875, 5.822845458984375, -129.09942626953125, -85.67803955078125, -26.114761352539062, 179.6251220703125, 100.58731079101562, 88.8516845703125, -2.67822265625, 0.01061248779296875, -11.69301986694336, 51.844818115234375, 166.908203125, 129.2598419189453, 21.529022216796875, 23.51824951171875, -58.4403076171875, 240.67529296875, 37.443946838378906, 95.40106201171875, -1.1579132080078125, -1.6581573486328125, -31.60284423828125, -1.932708740234375, 2.819366455078125, 126.02678680419922, 55.5548095703125, 136.4803924560547, 5.2306671142578125, 128.59512329101562, 112.4739761352539, 121.45718383789062, 154.138671875, -4.7988433837890625, -6.362539291381836, 2.71197509765625, 38.60797119140625, 171.44403076171875, -72.8068618774414, 
140.92669677734375, -36.201141357421875, 57.0147705078125, 0.0, -47.13592529296875, 17.452545166015625, 157.48080444335938, 158.55081176757812, 12.607368469238281, -1.58612060546875, 169.06976318359375, 91.32534790039062, 21.102752685546875, -113.74185180664062, 9.43408203125, -2.034027099609375, 56.41473388671875, 137.633544921875, 7.648902893066406, 82.98523712158203, 107.27264404296875, 164.55096435546875, 13.37213134765625, 144.05267333984375, -25.82550048828125, 21.830108642578125, 50.7342529296875, -25.605331420898438, -116.36857604980469, 38.042022705078125, 6.9885711669921875, 94.39825439453125, 102.13848876953125, 106.59710693359375, 39.038421630859375, 108.39070892333984, 21.7874755859375, 122.45703887939453, -10.0804443359375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000316.npy"}
{"epoch": 0.6617801047120419, "step": 317, "batch_size": 128, "mean": 47.20562744140625, "std": 70.30571746826172, "min": -134.10243225097656, "p10": -24.851294708251952, "median": 31.805145263671875, "p90": 139.4584548950195, "max": 224.22283935546875, "pos_frac": 0.734375, "sample": [118.09956359863281, -113.06322479248047, 0.0, 7.892021179199219, 175.69384765625, -25.29938507080078, 149.2932586669922, 82.34132385253906, 10.017814636230469, 147.5416259765625, 14.833251953125, 138.66770935058594, 63.15809631347656, -15.5028076171875, 133.66128540039062, -70.7158203125, -24.659255981445312, 128.40829467773438, 72.51507568359375, 7.89892578125, 51.7935791015625, 8.43994140625, 119.00482177734375, 103.6815185546875, 6.391441345214844, 123.55024719238281, 6.833442687988281, 0.9405574798583984, -9.878997802734375, 23.72869873046875, -12.188713073730469, -49.28135681152344, 32.32472229003906, -6.91778564453125, -23.709548950195312, 94.38835144042969, 106.92659759521484, 146.97442626953125, 110.58905029296875, -37.180816650390625, -26.372100830078125, 31.285568237304688, 153.54327392578125, -124.93940734863281, -116.38424682617188, 134.72900390625, 162.92303466796875, 134.35430908203125, 16.28635025024414, 0.0, 81.73468017578125, 60.797119140625, 30.2557373046875, 171.50094604492188, -117.57955932617188, -13.812187194824219, 29.463088989257812, 41.34977722167969, 1.59014892578125, 15.19744873046875, 18.31744384765625, 141.9483642578125, 134.4766387939453, 96.21237182617188, 0.0, 11.51806640625, 17.706329345703125, -0.61041259765625, 25.990859985351562, 80.20693969726562, 133.3289794921875, -5.704673767089844, -12.900588989257812, -134.10243225097656, 152.02847290039062, 68.26652526855469, 141.30352783203125, -59.71343994140625, 40.82215881347656, -13.892166137695312, 54.568450927734375, 111.66497802734375, 100.58616638183594, -11.338916778564453, 31.20660400390625, 71.166748046875, 47.367340087890625, -27.76727294921875, 101.14683532714844, 135.23065185546875, 
-1.7666778564453125, -1.0533447265625, 147.87631225585938, 11.243324279785156, 7.9263458251953125, 0.3230762481689453, 1.275634765625, 108.453125, 95.40615844726562, 224.22283935546875, 2.8964691162109375, 102.47270202636719, 114.1728515625, -1.78839111328125, -32.31414794921875, 38.74519348144531, 19.110626220703125, 36.479278564453125, 0.3967304229736328, 59.96490478515625, 87.78471374511719, 4.867633819580078, 125.45208740234375, 6.471569061279297, 86.75283813476562, 104.41995239257812, 103.18196868896484, 37.6158447265625, 149.4578399658203, 20.943958282470703, -6.823614120483398, 63.9859619140625, -7.6921844482421875, 137.34059143066406, 61.00462341308594, 108.15177917480469, -3.7302818298339844, 116.94241333007812], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000317.npy"}
|
||||
{"epoch": 0.6638743455497382, "step": 318, "batch_size": 128, "mean": 49.19019317626953, "std": 68.1949691772461, "min": -146.6738739013672, "p10": -26.799906921386718, "median": 51.65284729003906, "p90": 136.9917785644531, "max": 175.14996337890625, "pos_frac": 0.7421875, "sample": [91.09536743164062, 21.104995727539062, 126.77191925048828, 121.46985626220703, 19.693695068359375, -6.952484130859375, 33.183868408203125, 21.09710693359375, 133.953857421875, 42.65931701660156, 4.90594482421875, -22.006668090820312, 136.85260009765625, 75.78592681884766, 154.45123291015625, 29.75091552734375, 102.75070190429688, -14.597091674804688, 99.30976867675781, 72.43231201171875, -23.6689453125, -20.235198974609375, 117.96414184570312, 138.17535400390625, -15.885101318359375, 37.29624938964844, 110.92007446289062, 127.01676940917969, -103.89502716064453, 30.53814697265625, 0.9370574951171875, -27.476119995117188, 33.67195129394531, 65.10266876220703, 54.06727600097656, -26.510101318359375, 8.881220817565918, 62.834075927734375, 124.63461303710938, 162.85830688476562, 71.82275390625, 106.39645385742188, 102.17068481445312, 58.384124755859375, 19.007095336914062, 6.103973388671875, -121.70840454101562, 102.32689666748047, -2.19805908203125, 105.04229736328125, -137.33273315429688, 13.46682357788086, 83.6661148071289, 151.04611206054688, 110.50531005859375, 42.87181091308594, -17.034454345703125, 109.23226928710938, 132.86141967773438, -14.814453125, 82.35557556152344, 106.45254516601562, 156.17001342773438, 53.589996337890625, -58.99348068237305, 61.7476806640625, -40.9293212890625, 3.443572998046875, 141.309326171875, -3.6720352172851562, 116.46697998046875, 16.002349853515625, 8.341827392578125, 83.46858215332031, 106.75196838378906, 121.36422729492188, 9.630889892578125, 114.36788940429688, 62.158050537109375, -10.29241943359375, 105.80599212646484, 14.355743408203125, 32.181640625, 124.11224365234375, 64.44828796386719, -38.04644775390625, 80.23515319824219, 
149.21115112304688, -41.26789093017578, -6.6639556884765625, -10.048004150390625, 51.90380859375, 26.58782958984375, 105.77168273925781, 56.6473388671875, 46.06097412109375, 104.57675170898438, -53.48963928222656, 51.401885986328125, -22.17022705078125, 102.55751037597656, 175.14996337890625, 151.27346801757812, 53.316558837890625, -40.91716003417969, 105.441162109375, 65.95965576171875, 10.385238647460938, -2.9765472412109375, -42.79685974121094, 138.34375, 34.0106201171875, 64.9874267578125, 140.48460388183594, 25.606704711914062, 137.3165283203125, 41.35281753540039, 130.7607421875, -26.371734619140625, 100.31695556640625, -116.17266845703125, 41.880462646484375, -2.7603225708007812, 27.40057373046875, 142.0319061279297, -7.668205261230469, -146.6738739013672, 0.0], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000318.npy"}
|
||||
{"epoch": 0.6659685863874345, "step": 319, "batch_size": 128, "mean": 53.875404357910156, "std": 63.22944641113281, "min": -141.83689880371094, "p10": -13.457952880859374, "median": 40.01371765136719, "p90": 135.18512573242185, "max": 184.43646240234375, "pos_frac": 0.8203125, "sample": [95.37615966796875, 2.864837646484375, 58.86871337890625, 51.644805908203125, 31.09050941467285, 184.43646240234375, 106.35858154296875, -38.9521484375, 97.14332580566406, 25.094085693359375, -14.008331298828125, 11.620830535888672, 91.65491485595703, 147.0667724609375, 76.71145629882812, 140.10665893554688, 4.08770751953125, 0.0, 64.33840942382812, 94.14144897460938, 9.75497055053711, 21.23553466796875, 72.36514282226562, 10.258987426757812, 8.90374755859375, 112.84112548828125, 6.68304443359375, 76.29400634765625, -12.269012451171875, 91.73371887207031, -8.206390380859375, 8.86456298828125, 142.839599609375, 4.695915222167969, 112.67464447021484, 47.530517578125, 7.320709228515625, 115.73062133789062, 125.35137939453125, 41.13421630859375, 3.743185043334961, 128.54656982421875, 52.7628173828125, -97.2833251953125, 1.7707138061523438, 110.0308837890625, 148.1737060546875, 6.181793212890625, -48.456390380859375, 131.7852783203125, 127.57308959960938, 109.31666564941406, 13.260894775390625, 3.7232818603515625, 124.51611328125, 26.3193359375, 26.966522216796875, -47.28816223144531, 58.599830627441406, 126.3416748046875, -141.83689880371094, 62.133323669433594, 93.5580825805664, -74.1239013671875, 73.33876037597656, -19.512588500976562, 24.1121826171875, 24.38922119140625, 15.65594482421875, 109.40536499023438, 3.5778121948242188, 97.08574676513672, 80.75701904296875, 132.00225830078125, -3.681976318359375, -13.222076416015625, 106.06982421875, 104.68174743652344, 134.1817626953125, -23.143356323242188, 18.89185333251953, 127.04421997070312, 116.38113403320312, 43.548553466796875, 31.55645751953125, -0.5786781311035156, 177.16867065429688, 183.86865234375, 83.89303588867188, 
15.576568603515625, -12.5052490234375, 38.893218994140625, 87.34625244140625, -30.85833740234375, 117.94395446777344, 22.61083984375, 27.73553466796875, 141.338623046875, -38.2579345703125, 151.450927734375, 31.1214599609375, 11.26416015625, 151.52621459960938, 21.29180908203125, -9.96551513671875, 13.372573852539062, 108.46578979492188, 66.09152221679688, 128.89630126953125, -6.086723327636719, 34.75933837890625, 20.634654998779297, -4.948844909667969, 123.98857879638672, -23.586944580078125, 102.52882385253906, 108.83003234863281, 159.76278686523438, -29.112091064453125, 31.56024169921875, 140.3890380859375, 12.395294189453125, 32.45940399169922, 1.3718681335449219, 130.69561767578125, 50.53680419921875, 29.839378356933594, 137.52630615234375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000319.npy"}
|
||||
{"epoch": 0.6680628272251309, "step": 320, "batch_size": 128, "mean": 48.59565734863281, "std": 76.41400146484375, "min": -140.0714111328125, "p10": -30.340875244140616, "median": 49.54180908203125, "p90": 142.04168243408202, "max": 178.89324951171875, "pos_frac": 0.7265625, "sample": [101.12435913085938, 74.24385070800781, -64.37484741210938, -127.74761199951172, 107.00961303710938, 138.61404418945312, -62.9521484375, 5.9104766845703125, 136.4952392578125, 138.62939453125, 19.5352783203125, 108.29779052734375, 16.019317626953125, 10.75225830078125, 150.178955078125, 83.67727661132812, -17.994720458984375, -0.10475540161132812, 16.553466796875, 17.275863647460938, 17.2447509765625, 127.95793151855469, 38.150634765625, 6.3721160888671875, 2.7321624755859375, -25.26812744140625, 155.73956298828125, 67.00312805175781, 62.03520202636719, 53.163963317871094, 90.63423919677734, 112.1068115234375, 142.4393310546875, 43.52313232421875, 49.1029052734375, 126.3154296875, -10.886993408203125, 142.460205078125, 10.056396484375, -1.0449256896972656, 5.777931213378906, 20.86359405517578, 103.65850830078125, -25.980621337890625, 149.29554748535156, 163.89068603515625, 100.82354736328125, 54.61737060546875, 95.76324462890625, 83.17320251464844, -71.64230346679688, 11.71466064453125, -116.89898681640625, -20.61767578125, 144.1107177734375, -16.2899169921875, 62.879608154296875, -27.740997314453125, 82.36682891845703, 138.46371459960938, 126.50848388671875, 136.8712158203125, 20.910064697265625, 131.8377685546875, 139.82827758789062, -13.104751586914062, 132.07174682617188, -16.70989990234375, 150.09561157226562, 128.11773681640625, 16.52911376953125, -5.3040771484375, 25.202056884765625, -14.453460693359375, -140.0714111328125, 86.35986328125, 9.302001953125, 138.686279296875, 8.777862548828125, 119.52912902832031, -119.44760131835938, 33.290061950683594, 3.4347400665283203, -1.366302490234375, -17.907943725585938, 74.45403289794922, 95.50563049316406, 129.04080200195312, 
82.66217041015625, -1.0450439453125, -73.54161071777344, 44.178558349609375, 169.16351318359375, -2.92626953125, 112.0523681640625, -25.46316146850586, 9.453079223632812, 87.99017333984375, -27.292861938476562, 147.3643798828125, 83.33538818359375, -36.407257080078125, 86.01739501953125, 141.8712615966797, 125.976806640625, 178.89324951171875, 121.28990173339844, -15.248245239257812, 90.46090698242188, -108.7440185546875, 50.30555725097656, 62.3118896484375, -121.751953125, -3.6826934814453125, 150.6197509765625, 24.44903564453125, 133.79428100585938, 31.61065673828125, 98.85420227050781, 99.19270324707031, -25.933958053588867, 49.980712890625, 27.7633056640625, -114.11652374267578, 163.74044799804688, 46.510772705078125, -129.8833770751953, 109.24203491210938], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000320.npy"}
|
||||
{"epoch": 0.6701570680628273, "step": 321, "batch_size": 128, "mean": 43.01298522949219, "std": 68.99885559082031, "min": -143.02276611328125, "p10": -30.091673278808592, "median": 24.204696655273438, "p90": 139.50805358886717, "max": 170.27191162109375, "pos_frac": 0.734375, "sample": [124.79751586914062, -17.59423828125, 72.26904296875, -134.16973876953125, 25.463760375976562, 154.07492065429688, 19.19171142578125, 6.808502197265625, -53.43299865722656, 159.1512451171875, 116.47798156738281, 22.945632934570312, -55.1239013671875, 32.14081573486328, 81.76603698730469, 6.82501220703125, 1.0049819946289062, 126.90934753417969, 152.85540771484375, 3.9364166259765625, 73.15779876708984, -61.05242919921875, 2.861156463623047, -7.668243408203125, 144.01934814453125, 98.55158996582031, -2.3901214599609375, 7.31317138671875, 98.7674560546875, 5.0670166015625, 113.31977081298828, 85.5906982421875, 94.18626403808594, 11.224334716796875, 38.02630615234375, -52.18536376953125, -29.90045166015625, 57.49302673339844, 6.0818023681640625, 117.9932861328125, 62.18882369995117, 29.530303955078125, -6.7967987060546875, 46.531005859375, 10.478759765625, 148.6220245361328, -28.895523071289062, 134.14915466308594, 170.27191162109375, 98.89236450195312, 6.1616058349609375, 22.039505004882812, -17.81057357788086, 41.6688232421875, -0.5582122802734375, 10.964569091796875, 138.78128051757812, -9.1617431640625, 136.3280487060547, -22.77703857421875, 60.98838806152344, 92.11377716064453, 84.98819732666016, 115.7190170288086, 72.590087890625, 141.203857421875, -10.1019287109375, 59.365142822265625, -48.959075927734375, -4.20135498046875, -6.271209716796875, 56.042572021484375, -30.537857055664062, 150.01165771484375, -137.11505126953125, 101.27951049804688, 151.14413452148438, 168.13723754882812, 22.3748779296875, 112.572998046875, 21.279953002929688, -29.827880859375, 92.78900146484375, 87.27137756347656, 8.945648193359375, 132.0037841796875, -28.660446166992188, 141.4813690185547, 
-75.109130859375, 116.49087524414062, 11.117568969726562, 15.745098114013672, 99.76087951660156, 1.5022811889648438, -3.2912559509277344, 159.4608154296875, 1.98193359375, 64.41342163085938, 27.214019775390625, 79.51766967773438, 110.75741577148438, -5.102447509765625, 89.65512084960938, 20.24932861328125, -4.428131103515625, 104.39697265625, 6.249584197998047, 3.8103904724121094, 1.1149616241455078, -0.814178466796875, -32.344390869140625, -37.2213134765625, 9.233972549438477, -22.303512573242188, 64.80486297607422, 161.2666015625, -143.02276611328125, 134.52493286132812, 34.802154541015625, 68.21428680419922, 15.171295166015625, 49.214447021484375, 11.194931030273438, 1.5971908569335938, 131.169921875, 131.22576904296875, 0.0, -90.54946899414062], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000321.npy"}
|
||||
{"epoch": 0.6722513089005235, "step": 322, "batch_size": 128, "mean": 51.42274475097656, "std": 70.06802368164062, "min": -222.97528076171875, "p10": -27.400074768066407, "median": 46.03815460205078, "p90": 140.56994018554687, "max": 180.75830078125, "pos_frac": 0.7421875, "sample": [-54.18402099609375, -9.992034912109375, -53.735260009765625, 80.24928283691406, 30.654754638671875, 13.267166137695312, 45.35115051269531, -40.96234130859375, 124.30641174316406, 64.4068603515625, -13.565765380859375, -3.0511627197265625, 78.20767211914062, 11.0032958984375, 130.6526641845703, 107.48265075683594, 70.56488037109375, -9.67327880859375, 95.24444580078125, 46.72515869140625, 78.49029541015625, -15.579193115234375, 154.9609832763672, 5.0001220703125, 1.3668365478515625, 38.06587219238281, -27.17919921875, 94.64598083496094, 6.3099365234375, 118.96435546875, 71.2344970703125, 121.19279479980469, -0.19831085205078125, -10.211639404296875, 117.69744873046875, 31.19976806640625, 5.589256286621094, 128.47994995117188, -66.88545227050781, 112.04145050048828, 128.96426391601562, 103.86466979980469, -41.977142333984375, 18.0877685546875, -4.91168212890625, 35.41935729980469, 105.0251693725586, -27.915451049804688, 119.13497924804688, -117.20704650878906, 111.65626525878906, 5.862823486328125, 127.41511535644531, 172.139892578125, -9.0047607421875, 120.10098266601562, 101.26234436035156, -17.90789794921875, 64.126953125, 105.75825500488281, 146.640869140625, 20.724700927734375, 169.99105834960938, 180.75830078125, -52.3349609375, 112.02417755126953, 81.54057312011719, 36.1043701171875, 96.34169006347656, 13.231521606445312, 160.963134765625, 139.41717529296875, 102.01202392578125, 57.346405029296875, 10.1064453125, 0.0, -60.3477783203125, 48.63031005859375, 59.708984375, -36.63722229003906, 142.6011962890625, 54.34493637084961, 124.10064697265625, 139.69940185546875, -16.4407958984375, 57.278900146484375, 59.876617431640625, 7.971282958984375, 1.670135498046875, 104.95231628417969, 
156.3958740234375, -2.424285888671875, 9.505950927734375, -38.760650634765625, -12.2899169921875, 0.5919399261474609, 79.50100708007812, 4.467193603515625, -222.97528076171875, 71.04791259765625, 121.40843200683594, 20.36309051513672, 43.869232177734375, 157.10748291015625, 123.09686279296875, -33.18303680419922, -1.2688446044921875, 176.44065856933594, 172.10757446289062, 5.3682861328125, -13.493988037109375, 123.689697265625, 23.697906494140625, 40.50569152832031, -3.4017181396484375, 11.734241485595703, -12.044715881347656, 19.588592529296875, 23.091552734375, 99.20384216308594, 84.91143798828125, 162.67758178710938, 93.18394470214844, -5.29205322265625, 16.33441162109375, 131.846435546875, 59.33351135253906, 155.8658447265625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000322.npy"}
|
||||
{"epoch": 0.6743455497382199, "step": 323, "batch_size": 128, "mean": 33.87662124633789, "std": 78.81759643554688, "min": -188.12567138671875, "p10": -59.4457321166992, "median": 18.76827335357666, "p90": 132.45803833007812, "max": 171.16668701171875, "pos_frac": 0.6640625, "sample": [12.629974365234375, -43.30244445800781, -82.51170349121094, -105.92767333984375, 114.08758544921875, 119.24278259277344, -72.82756042480469, 59.816497802734375, 54.6817626953125, 169.16793823242188, -78.60679626464844, 122.47854614257812, 25.18389892578125, -19.45172119140625, 2.1298828125, 96.97472381591797, 113.81791687011719, -11.252120971679688, 69.2995834350586, -24.60833740234375, 169.96575927734375, 95.68545532226562, 20.533172607421875, 127.36618041992188, 21.594131469726562, 152.59568786621094, 1.92633056640625, 118.37909698486328, -77.83204650878906, 130.64944458007812, -16.0941162109375, -165.7587890625, -7.606666564941406, 16.70068359375, 0.26774024963378906, 96.57231140136719, 19.675888061523438, 11.898935317993164, 103.14997863769531, 18.453378677368164, 26.41656494140625, 120.28176879882812, -34.436981201171875, 112.2235107421875, 123.53713989257812, -41.18922424316406, -18.556549072265625, -188.12567138671875, 139.8299560546875, -0.32701873779296875, 7.883686065673828, 9.55997085571289, -53.710662841796875, 8.545608520507812, 83.10928344726562, 62.61456298828125, 48.9708251953125, 130.213623046875, 167.92420959472656, -24.519088745117188, 78.58282470703125, 71.60382080078125, 113.4935531616211, -21.19317626953125, 143.50643920898438, 25.32061767578125, 111.70597839355469, -15.203964233398438, -46.1055908203125, 101.54046630859375, 13.8922119140625, 2.8722057342529297, -44.63771057128906, 2.65777587890625, 112.23191833496094, 65.00999450683594, 160.29904174804688, -47.492950439453125, 150.18630981445312, 4.258880615234375, 111.61698150634766, -126.41189575195312, 16.566368103027344, 29.333282470703125, 16.528762817382812, 160.60140991210938, -51.13352966308594, 
131.64459228515625, -119.400634765625, 7.776935577392578, 126.249267578125, 26.594078063964844, -38.98338317871094, 103.7162094116211, -5.3885498046875, -9.779621124267578, -79.54747772216797, -15.032966613769531, -24.806396484375, -7.029327392578125, 30.803253173828125, -92.15655517578125, -10.80322265625, 4.0656280517578125, 26.66748046875, 117.08740234375, 14.427196502685547, -10.618659973144531, 114.08770751953125, 156.64715576171875, -2.628694534301758, 11.003982543945312, 137.87013244628906, 130.5098419189453, -117.98239135742188, -127.070068359375, 66.07160949707031, 23.732391357421875, 80.84523010253906, 19.083168029785156, 15.874275207519531, -6.260009765625, 91.0567626953125, 134.3560791015625, 113.7967529296875, -29.294952392578125, -1.1643943786621094, 171.16668701171875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000323.npy"}
|
||||
{"epoch": 0.6764397905759162, "step": 324, "batch_size": 128, "mean": 56.82297134399414, "std": 65.46845245361328, "min": -94.79374694824219, "p10": -11.818051910400385, "median": 39.36174774169922, "p90": 143.5288070678711, "max": 187.223876953125, "pos_frac": 0.7890625, "sample": [62.77278137207031, 148.085693359375, 13.82086181640625, 115.35224914550781, 23.834503173828125, 106.39312744140625, 4.5838165283203125, 9.87270736694336, 123.99322509765625, 28.654449462890625, 32.6519775390625, 2.34490966796875, -55.5242919921875, 148.93365478515625, 4.508636474609375, 148.43478393554688, 111.791259765625, 15.497001647949219, -9.11724853515625, 0.0, 7.5263671875, 7.2250823974609375, 117.99864196777344, -7.416633605957031, 3.8160953521728516, 30.352294921875, 123.7772216796875, -41.56181335449219, 114.91818237304688, -18.508079528808594, 6.09712028503418, 129.27780151367188, 135.0029754638672, 142.52340698242188, 118.77265930175781, 112.18463897705078, -10.463348388671875, 50.486846923828125, 75.39297485351562, 0.0, -3.878183364868164, 132.49383544921875, 146.76693725585938, 102.62310791015625, 52.333465576171875, 118.18546295166016, 0.0, -5.0262451171875, 98.46212768554688, -17.74932861328125, -82.28929138183594, 39.47087097167969, 28.841903686523438, 21.43017578125, 95.58880615234375, 79.91974639892578, 8.954254150390625, 122.39756774902344, 148.47314453125, 10.5430908203125, -46.82228088378906, 144.10861206054688, 10.638595581054688, 119.79269409179688, 94.69464874267578, -3.1413497924804688, 93.29466247558594, 107.98757934570312, 19.280914306640625, -94.79374694824219, 143.2803192138672, 38.618141174316406, -62.165618896484375, 116.2860107421875, 42.878936767578125, -2.8961181640625, 17.99823760986328, 61.225860595703125, 12.719268798828125, 187.223876953125, 170.07955932617188, 7.238670349121094, -16.08575439453125, 55.164764404296875, 165.74615478515625, 135.42950439453125, 15.910247802734375, 24.507080078125, -8.08514404296875, 8.26239013671875, 
119.78872680664062, 48.49017333984375, 24.69292449951172, -29.780731201171875, -10.461532592773438, 55.05279541015625, 33.36628341674805, 143.16705322265625, 117.45465087890625, 59.28131103515625, -44.985862731933594, -4.644134521484375, 7.7357635498046875, 82.19612121582031, -46.114349365234375, 171.8504638671875, 167.46188354492188, 151.29354858398438, 39.25262451171875, 7.426971435546875, 139.49526977539062, 80.00759887695312, -14.979026794433594, 113.24156188964844, 18.8157958984375, 119.56289672851562, 144.477294921875, 119.234130859375, -0.4256706237792969, 134.76974487304688, 119.64599609375, 0.7770156860351562, 121.17825317382812, 128.40675354003906, 127.43913269042969, 8.784759521484375, 2.3779296875, 27.801559448242188], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000324.npy"}
|
||||
{"epoch": 0.6785340314136126, "step": 325, "batch_size": 128, "mean": 45.59696578979492, "std": 81.9312515258789, "min": -133.14169311523438, "p10": -32.38389434814452, "median": 26.242034912109375, "p90": 158.4437225341797, "max": 225.4947509765625, "pos_frac": 0.6796875, "sample": [62.849639892578125, 134.36813354492188, -105.46565246582031, 11.42413330078125, 90.55133056640625, 23.0498046875, 38.405731201171875, 135.77859497070312, 30.84844970703125, -1.107208251953125, 4.88037109375, 14.143646240234375, 130.95066833496094, 130.70285034179688, -10.636322021484375, 4.404726028442383, 146.49314880371094, 79.20782470703125, 60.22406005859375, 160.38504028320312, 129.3416748046875, 161.1561737060547, 97.06558227539062, -2.7831764221191406, 119.4830322265625, 4.420539855957031, 13.392322540283203, 72.78875732421875, 117.30485534667969, 26.037841796875, -89.18649291992188, 6.9537353515625, 71.6519775390625, 149.1244354248047, 145.96630859375, 84.02943420410156, 21.30535316467285, 149.81333923339844, 47.346405029296875, -9.10601806640625, 126.46601867675781, -131.90591430664062, 142.89373779296875, 0.3474884033203125, -69.91566467285156, -0.71856689453125, 225.45474243164062, -113.58131408691406, 169.08091735839844, -3.424102783203125, -2.022369384765625, 26.44622802734375, 147.31427001953125, 13.005218505859375, 8.493927001953125, -102.50833892822266, -8.571319580078125, -106.53133392333984, -5.40570068359375, -61.17974853515625, 53.32475280761719, 123.9393310546875, -4.7882843017578125, -3.412811279296875, 38.31654357910156, 37.87724304199219, -0.6543960571289062, -109.40455627441406, 178.30609130859375, 6.616851806640625, 93.24935913085938, 19.3736572265625, 225.4947509765625, -13.9241943359375, -21.3385009765625, -5.0389862060546875, -12.5394287109375, -11.577285766601562, 89.65194702148438, 129.82879638671875, -119.03939056396484, -9.5673828125, -22.29644775390625, 158.54974365234375, -111.93992614746094, 61.462158203125, 115.85052490234375, 40.970703125, 
43.11897277832031, -26.28643798828125, -24.885498046875, 102.89002227783203, 158.39828491210938, -29.961807250976562, 2.7852916717529297, -133.14169311523438, -16.364356994628906, 163.63619995117188, 32.160919189453125, 126.99079895019531, 159.03277587890625, 39.24658203125, 79.11724853515625, 179.98257446289062, 147.75506591796875, -38.035430908203125, 4.727928161621094, 25.99742889404297, 23.1798095703125, -3.19732666015625, -19.242172241210938, 139.08047485351562, 46.63670349121094, 128.0037841796875, 6.700218200683594, 113.89971160888672, 67.34310913085938, 173.28387451171875, 11.164215087890625, 0.07787322998046875, 0.6847991943359375, -29.114471435546875, 191.55841064453125, -22.50250244140625, -14.745758056640625, 188.27743530273438, 62.293243408203125, 137.27117919921875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000325.npy"}
|
||||
{"epoch": 0.680628272251309, "step": 326, "batch_size": 128, "mean": 47.57628631591797, "std": 75.50325012207031, "min": -176.8203125, "p10": -35.4315185546875, "median": 37.392539978027344, "p90": 135.41036071777344, "max": 167.2999725341797, "pos_frac": 0.7109375, "sample": [127.86737060546875, 121.72891235351562, 114.20584106445312, 160.61790466308594, 61.7130126953125, 62.249176025390625, -42.10150146484375, -19.81854248046875, 121.46370697021484, -35.06764221191406, 83.42288208007812, -13.354248046875, 124.05227661132812, 127.3076171875, 22.77460479736328, 2.5999412536621094, 4.77598762512207, 139.67929077148438, 167.2999725341797, 17.668838500976562, 48.1212158203125, 127.48793029785156, 5.5393218994140625, -20.104644775390625, 60.2635498046875, 88.7125244140625, 110.23129272460938, 131.829833984375, 14.601207733154297, 15.651512145996094, -108.69773864746094, 135.28878784179688, -0.13922119140625, 110.43504333496094, -111.83377075195312, -75.40060424804688, 14.142776489257812, 73.34626770019531, -30.8350830078125, 123.32662963867188, 76.2178955078125, 98.55853271484375, -22.20184326171875, 129.21063232421875, 46.263885498046875, 105.66246032714844, -24.693145751953125, 126.35096740722656, -22.828125, 152.34104919433594, 109.8927001953125, 93.25930786132812, 131.91342163085938, 16.785240173339844, 135.04995727539062, -111.06168365478516, -34.64569091796875, 10.900100708007812, 116.16384887695312, -9.343982696533203, 114.35307312011719, 104.13257598876953, 22.777099609375, 90.10183715820312, 0.0, -176.8203125, 14.955657958984375, -1.841064453125, 108.9415283203125, -28.28125, 13.327316284179688, 145.03469848632812, 165.42120361328125, 51.99395751953125, 39.13923645019531, -56.525054931640625, 32.00775146484375, -51.95692443847656, 30.345748901367188, 95.06085968017578, -2.31085205078125, 146.51416015625, -5.1031494140625, 25.330474853515625, 12.420066833496094, 123.80039978027344, -29.93450927734375, -79.37026977539062, 97.89933776855469, 11.320840835571289, 
126.23448944091797, 15.2310791015625, 153.91986083984375, -9.254486083984375, 24.60700225830078, 35.645843505859375, 99.27051544189453, 135.69403076171875, 23.966796875, 118.42044067382812, 144.66708374023438, 0.0, 120.47952270507812, 50.482025146484375, 58.634979248046875, 119.65652465820312, -11.06103515625, 132.45372009277344, 97.72624206542969, 0.0, 121.44491577148438, -129.76556396484375, -29.51800537109375, -110.91732025146484, 10.5968017578125, -113.1387939453125, 145.10394287109375, -12.9586181640625, 134.031982421875, 15.950668334960938, 28.255088806152344, 23.4276123046875, -18.378692626953125, -36.28056335449219, 136.34963989257812, 142.09759521484375, 25.07440185546875, 124.03225708007812], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000326.npy"}
|
||||
{"epoch": 0.6827225130890052, "step": 327, "batch_size": 128, "mean": 54.640499114990234, "std": 74.73738861083984, "min": -141.96749877929688, "p10": -23.5098388671875, "median": 39.38105773925781, "p90": 151.00719299316407, "max": 202.64501953125, "pos_frac": 0.78125, "sample": [32.59843444824219, 116.38552856445312, 110.13766479492188, 3.0679855346679688, 180.87118530273438, -128.39730834960938, 66.67889404296875, 129.841796875, 6.48638916015625, 103.05833435058594, 0.07472801208496094, 169.650146484375, -6.85894775390625, -121.87771606445312, 43.516273498535156, -17.165283203125, -41.8043212890625, 8.970733642578125, 162.25439453125, 158.45184326171875, 97.03252410888672, 98.50463104248047, 23.436370849609375, 47.606231689453125, 95.86624145507812, -40.3253173828125, 17.215530395507812, 97.71888732910156, 11.927947998046875, 113.4988784790039, 39.93609619140625, 162.0816650390625, -7.395055770874023, -4.970024108886719, 73.79216003417969, 103.73025512695312, 154.54547119140625, 8.445770263671875, 202.64501953125, 111.37355041503906, -23.35943603515625, -7.847087860107422, -7.843841552734375, 105.4141845703125, 31.30211639404297, -15.429916381835938, 144.23956298828125, 38.035614013671875, 102.08360290527344, 10.032699584960938, 135.29190063476562, 3.94952392578125, -3.88067626953125, -141.96749877929688, 110.81940460205078, 29.089263916015625, 126.22545623779297, 118.27072143554688, 166.078369140625, 56.655487060546875, 67.55103302001953, 122.064208984375, 100.05247497558594, 31.63446044921875, 145.16433715820312, 137.663818359375, 155.23773193359375, 20.30108642578125, 95.71551513671875, -21.332427978515625, 27.70611572265625, 5.374481201171875, -10.9967041015625, -42.585289001464844, 105.58908081054688, 9.10614013671875, 38.50439453125, 54.751220703125, 156.79345703125, 8.262786865234375, -23.86077880859375, 3.684356689453125, 26.6353759765625, 110.12738037109375, 36.65049743652344, 151.3094482421875, 12.3055419921875, 0.0, 134.1630859375, 2.8054885864257812, 
26.5374755859375, 146.27685546875, 173.4808349609375, 21.292144775390625, 116.52398681640625, 12.50204849243164, 124.6614990234375, 32.672637939453125, 130.42918395996094, 126.78923797607422, 65.17425537109375, 5.281494140625, -133.4561767578125, -21.111770629882812, 95.03607177734375, 103.53746795654297, -86.25032806396484, 38.826019287109375, 59.77685546875, 121.88479614257812, -119.6294174194336, 80.84161376953125, 153.5599365234375, 9.226348876953125, -36.51068115234375, 132.91510009765625, -4.427928924560547, 150.87765502929688, 0.7882957458496094, -7.2846221923828125, 144.18753051757812, 133.83447265625, 127.98544311523438, 6.4535064697265625, -74.82369995117188, 130.3204345703125, 18.730621337890625, -33.07080078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000327.npy"}
|
||||
{"epoch": 0.6848167539267016, "step": 328, "batch_size": 128, "mean": 46.65143585205078, "std": 70.12755584716797, "min": -132.1219482421875, "p10": -25.130580139160156, "median": 33.590789794921875, "p90": 138.3501434326172, "max": 192.50030517578125, "pos_frac": 0.7578125, "sample": [18.453231811523438, 100.72628784179688, 134.7648468017578, 34.08013916015625, 97.13937377929688, 31.20709228515625, 1.807098388671875, 111.66377258300781, 108.21830749511719, 102.87252807617188, 79.09445190429688, 27.96329116821289, 4.890834808349609, 106.152099609375, 50.35870361328125, -0.37753868103027344, 10.37158203125, -6.266998291015625, 44.29217529296875, 16.424118041992188, 22.242721557617188, 93.9268798828125, 83.76849365234375, -25.96865463256836, 167.59829711914062, 34.06536865234375, -1.0585479736328125, 38.6602783203125, 68.7408447265625, 109.67079162597656, -3.7334136962890625, 113.25531768798828, 36.706390380859375, 147.29928588867188, 181.84423828125, 92.88614654541016, 155.55459594726562, 5.0193634033203125, 157.8297119140625, -0.5424709320068359, 3.99383544921875, 131.05177307128906, 82.20706176757812, -0.0355224609375, 7.954498291015625, -7.852846145629883, -19.72113037109375, 24.1307373046875, -21.007568359375, 126.60163879394531, 39.6126708984375, -1.3211669921875, -25.180450439453125, 151.83343505859375, 10.735809326171875, 33.6766357421875, 10.87896728515625, 9.884864807128906, 31.28606414794922, -25.361709594726562, 16.65601348876953, 39.15313720703125, 142.18832397460938, 138.34039306640625, 54.87406921386719, 148.1768798828125, 111.1823501586914, -71.55523681640625, 124.40892791748047, -93.64201354980469, 12.442276000976562, 109.52044677734375, 114.38397979736328, 117.28414916992188, 29.5972900390625, 170.45626831054688, 164.77487182617188, 16.06488037109375, -0.39312744140625, 120.30126953125, -127.73544311523438, 8.287841796875, 131.80401611328125, 36.81037902832031, -104.38967895507812, 8.10552978515625, 104.64578247070312, 116.09801483154297, 
107.05831909179688, -25.3194580078125, -14.90631103515625, 154.2440185546875, 5.9691925048828125, -2.593536376953125, 6.7982177734375, -19.97869873046875, 4.2836456298828125, 22.41033935546875, -99.15322875976562, 21.821256637573242, 49.91499328613281, 34.1058349609375, 101.71794128417969, 2.21966552734375, -2.7889404296875, 131.16738891601562, 131.740234375, -132.1219482421875, -3.1997222900390625, -25.109207153320312, 34.494384765625, 19.291900634765625, 130.25656127929688, -58.708984375, 33.50494384765625, 23.720123291015625, 2.804351806640625, 128.316650390625, -4.791399002075195, -125.47554016113281, 23.650787353515625, 138.37289428710938, 85.53953552246094, -77.48997497558594, 65.4482421875, 192.50030517578125, 115.4422607421875, 43.41645812988281], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000328.npy"}
|
||||
{"epoch": 0.6869109947643979, "step": 329, "batch_size": 128, "mean": 39.579559326171875, "std": 80.26961517333984, "min": -157.59649658203125, "p10": -63.515655517578125, "median": 24.955841064453125, "p90": 145.76054534912106, "max": 197.19140625, "pos_frac": 0.65625, "sample": [-3.584259033203125, 8.654647827148438, -6.0290985107421875, 42.40045166015625, 1.9078521728515625, -16.357372283935547, 9.293594360351562, 24.5093994140625, 4.8939666748046875, 34.195098876953125, 131.52108764648438, -16.23556900024414, -89.0042724609375, 104.87374877929688, 93.9451904296875, 164.86553955078125, -12.808349609375, -40.0531005859375, 130.65672302246094, 105.52291107177734, 42.3858642578125, 19.48309326171875, 139.53915405273438, -10.4599609375, 113.36688232421875, 143.26405334472656, -112.8131103515625, 107.00167846679688, 160.8178253173828, 51.88385009765625, 0.45355224609375, 20.619583129882812, -10.25439453125, 139.74905395507812, -40.606781005859375, 41.002288818359375, 91.45924377441406, 89.24920654296875, 15.765975952148438, 104.82391357421875, 11.71942138671875, 157.0009765625, -0.563262939453125, -64.93304443359375, 48.4215087890625, -91.7667236328125, -10.767318725585938, 138.3797607421875, 14.2222900390625, -157.59649658203125, -4.924659729003906, 157.69439697265625, -62.908203125, -17.11590576171875, 118.770751953125, 35.99845886230469, 38.418212890625, 35.584381103515625, 28.260757446289062, -118.2989501953125, 6.558017730712891, 135.30343627929688, 3.57403564453125, -42.15568542480469, -1.1443405151367188, -4.666194915771484, -0.9485225677490234, -52.123321533203125, 185.455322265625, 13.938804626464844, 35.23927307128906, 1.350433349609375, 96.6259765625, 193.0233154296875, 14.367145538330078, 129.08782958984375, 59.884552001953125, -22.110931396484375, -92.40786743164062, 135.9122314453125, 124.49996948242188, -12.189788818359375, -139.52413940429688, 104.89949035644531, -14.74420166015625, 160.7969207763672, 92.04959106445312, 3.358428955078125, 
111.792724609375, -1.556304931640625, 174.6119384765625, -3.0002517700195312, -83.52082824707031, 160.92019653320312, 118.82331848144531, -112.02789306640625, 187.85606384277344, 151.585693359375, -2.240753173828125, 116.715087890625, -4.6577606201171875, 19.653900146484375, 92.26872253417969, 37.07293701171875, -1.1410751342773438, 1.3843803405761719, 197.19140625, 54.6851806640625, 132.49220275878906, 195.094970703125, -15.580230712890625, 115.83802795410156, 114.66142272949219, -7.88848876953125, 34.57135009765625, -98.08502960205078, 25.40228271484375, -23.249237060546875, 28.4935302734375, -2.421905517578125, 55.686004638671875, 118.31082153320312, 0.8763084411621094, 50.164588928222656, 124.1622314453125, 47.09684753417969, -69.6202392578125, -125.64366912841797], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000329.npy"}
|
||||
{"epoch": 0.6890052356020943, "step": 330, "batch_size": 128, "mean": 34.96834945678711, "std": 74.02091979980469, "min": -120.419677734375, "p10": -54.23674926757811, "median": 23.600341796875, "p90": 141.43985595703126, "max": 231.70745849609375, "pos_frac": 0.734375, "sample": [-5.114898681640625, 124.52037048339844, 18.4007568359375, -84.42459106445312, 78.86744689941406, 108.47723388671875, 38.535797119140625, 163.96401977539062, 132.0812225341797, 29.0853271484375, -14.63970947265625, 23.03143310546875, 18.20843505859375, -112.6339111328125, 71.41827392578125, -62.76353454589844, 142.96820068359375, 28.681777954101562, 1.4501953125, -0.40655517578125, 147.20758056640625, -98.32539367675781, -30.088424682617188, 8.991668701171875, 0.4981689453125, 108.33758544921875, 24.132537841796875, 4.146505355834961, 12.987106323242188, 0.035675048828125, 55.64112854003906, 43.19874572753906, 161.68780517578125, 91.79306030273438, 7.750633239746094, -5.125244140625, -2.99298095703125, -84.155517578125, -32.309356689453125, 16.30218505859375, 42.30291748046875, 155.58734130859375, 33.9041748046875, 42.785919189453125, 45.69744873046875, 24.68878173828125, 18.400596618652344, 175.13336181640625, 11.225257873535156, 93.08177185058594, 2.9768543243408203, 140.78485107421875, 57.466461181640625, -119.50051879882812, 8.64263916015625, 24.269729614257812, -108.96321868896484, 31.642547607421875, 37.59442138671875, 37.45447540283203, 13.712478637695312, 156.8221893310547, 11.931072235107422, 99.82855224609375, -120.19630432128906, 0.099639892578125, 117.1190185546875, -94.72314453125, 6.885311126708984, 11.371467590332031, 42.667633056640625, -50.58241271972656, 161.82366943359375, 38.364776611328125, -2.0272216796875, 89.70855712890625, -120.419677734375, 160.10098266601562, -30.571990966796875, 3.821319580078125, 21.757568359375, 80.40780639648438, -6.095123291015625, -15.620452880859375, 20.876495361328125, 0.7822723388671875, 31.73907470703125, 139.50564575195312, 
139.26416015625, 46.546226501464844, 161.5596466064453, 9.0048828125, 93.63382720947266, 108.21853637695312, 15.6473388671875, -25.052749633789062, 16.54864501953125, 15.345537185668945, 70.37757110595703, 126.09683990478516, 162.9708251953125, 34.050140380859375, -97.79788208007812, 79.7586669921875, -114.23846435546875, 137.14471435546875, -14.844970703125, 23.068145751953125, -20.345355987548828, 36.64424133300781, 126.00088500976562, -92.58521270751953, -0.4660186767578125, -46.420166015625, 1.4029541015625, 170.7515869140625, 48.037261962890625, 44.345703125, -2.04119873046875, 124.68528747558594, -24.858245849609375, 231.70745849609375, 66.64085388183594, -31.0367431640625, -25.838104248046875, 123.76412963867188, 77.9620132446289, 26.610015869140625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000330.npy"}
|
||||
{"epoch": 0.6910994764397905, "step": 331, "batch_size": 128, "mean": 43.889503479003906, "std": 70.53678131103516, "min": -153.10191345214844, "p10": -22.164315795898435, "median": 25.677051544189453, "p90": 146.35391082763672, "max": 179.2113037109375, "pos_frac": 0.7578125, "sample": [1.1763687133789062, 172.03421020507812, 115.67083740234375, -84.4234619140625, 179.2113037109375, 99.30288696289062, 1.194580078125, -5.265167236328125, 44.0513916015625, 60.569915771484375, -3.4771575927734375, -69.24443054199219, 51.38032531738281, -36.51197814941406, 169.64599609375, -19.80615234375, 1.907196044921875, 3.8086585998535156, -10.2303466796875, -6.16717529296875, 12.814926147460938, 44.348388671875, -31.043975830078125, 14.458816528320312, 11.581256866455078, 140.10488891601562, 128.569580078125, 36.02687072753906, 8.248733520507812, 117.26692199707031, 31.932708740234375, 42.5032958984375, -9.86102294921875, 31.542633056640625, 25.546920776367188, 146.65814208984375, 13.647750854492188, 52.57672119140625, 156.17703247070312, 113.63536834716797, -28.36114501953125, 11.454803466796875, -21.776214599609375, -147.97727966308594, 4.543468475341797, 8.464683532714844, 146.22352600097656, 21.669692993164062, 40.747314453125, 0.0, 30.81158447265625, -0.0811767578125, 91.87881469726562, -11.3839111328125, 11.77734375, 86.45756530761719, 11.70135498046875, 133.2100830078125, 37.043182373046875, 55.771636962890625, 24.13848876953125, 48.8736572265625, -34.77227783203125, -24.469650268554688, 165.487060546875, 23.08594512939453, 151.81967163085938, 96.71923828125, -16.35009002685547, 25.727821350097656, 5.396453857421875, 23.463409423828125, 135.68698120117188, 33.794677734375, 36.28041458129883, 14.796497344970703, -2.0679931640625, 31.662872314453125, 10.539283752441406, -23.06988525390625, 25.62628173828125, 28.9044189453125, 111.914794921875, 30.943283081054688, -153.10191345214844, 175.0421142578125, 6.37005615234375, 143.67788696289062, -19.199783325195312, 
32.199676513671875, 166.5223388671875, 110.2767333984375, 91.04708099365234, 165.66830444335938, 51.86210632324219, 150.01409912109375, 133.63172912597656, 5.806297302246094, -17.13372802734375, -6.9738006591796875, -8.2021484375, 31.2783203125, 139.6612091064453, 117.51290893554688, -127.50708770751953, 20.552032470703125, 176.1610107421875, 2.9218101501464844, 117.38156127929688, -2.154163360595703, 4.79266357421875, 15.158050537109375, 121.47500610351562, 12.81890869140625, 101.66143035888672, 118.74755859375, -2.4254150390625, 79.2626953125, 18.228717803955078, 111.78734588623047, -135.0059814453125, 22.79737091064453, 164.74526977539062, 124.21736145019531, 91.89834594726562, -30.997467041015625, 122.66780853271484, 14.842018127441406], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000331.npy"}
|
||||
{"epoch": 0.6931937172774869, "step": 332, "batch_size": 128, "mean": 54.026344299316406, "std": 76.57466125488281, "min": -163.64955139160156, "p10": -33.60346527099608, "median": 52.821407318115234, "p90": 142.89545135498048, "max": 209.00668334960938, "pos_frac": 0.75, "sample": [83.35263061523438, 63.853759765625, 11.213859558105469, 170.53646850585938, 112.57579040527344, 123.63595581054688, 95.14244079589844, 97.69098663330078, 177.62939453125, 122.054931640625, -1.0543365478515625, 11.439188003540039, 8.48038101196289, -4.1065521240234375, 61.64646911621094, 98.8076171875, -30.152023315429688, 12.503524780273438, 132.9903564453125, -55.900413513183594, 134.95028686523438, -9.60528564453125, 41.70806884765625, 20.523963928222656, 1.1333084106445312, 0.312164306640625, -9.822967529296875, 4.5174713134765625, 11.957138061523438, 8.133697509765625, 132.65435791015625, 154.6810302734375, 3.76080322265625, 2.777862548828125, 140.78375244140625, 101.22502136230469, 124.73379516601562, 151.79595947265625, 34.8109130859375, 110.03462219238281, 44.65235900878906, 10.4044189453125, 120.16043090820312, -0.9774246215820312, 126.77978515625, 29.39276123046875, 98.53887939453125, 133.02923583984375, -3.7136993408203125, 23.37579345703125, 136.22348022460938, 167.1546630859375, -95.8048095703125, -91.76019287109375, 173.330322265625, 116.09818267822266, -77.83901977539062, 84.50469970703125, 135.27880859375, -124.15066528320312, 53.547447204589844, 142.99404907226562, 133.53977966308594, -1.0832366943359375, 11.622077941894531, 7.70159912109375, 73.49846649169922, 20.56451416015625, 12.623184204101562, -41.656829833984375, -0.930084228515625, 137.710693359375, 100.62026977539062, 20.355270385742188, -1.8072662353515625, 156.79608154296875, 47.97125244140625, 105.95010375976562, -112.33335876464844, 55.062835693359375, -4.9956512451171875, 106.01057434082031, -22.854598999023438, 2.4809722900390625, 108.64263916015625, -93.24446105957031, 103.56350708007812, 
19.22747802734375, 3.7772216796875, 106.04171752929688, 135.07754516601562, 97.43640899658203, -83.59416961669922, 133.90478515625, 69.17108154296875, -6.94891357421875, 129.94451904296875, -21.970436096191406, -15.486801147460938, 42.575439453125, 151.30917358398438, 137.27615356445312, -100.9967041015625, -57.92657470703125, 209.00668334960938, 175.92877197265625, 113.11663818359375, 20.310653686523438, 66.52557373046875, 11.13934326171875, 142.8531951904297, 61.85247802734375, 112.65019226074219, 167.5184326171875, 52.095367431640625, 121.07470703125, -17.683456420898438, 137.8065185546875, -9.41988754272461, 111.53804779052734, 103.80940246582031, 162.40158081054688, -5.727806091308594, -44.467864990234375, 0.0, 11.57421875, -163.64955139160156, 55.86676025390625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000332.npy"}
|
||||
{"epoch": 0.6952879581151833, "step": 333, "batch_size": 128, "mean": 38.51242446899414, "std": 68.93851470947266, "min": -131.86380004882812, "p10": -47.950907897949214, "median": 29.550804138183594, "p90": 128.70985107421876, "max": 162.01483154296875, "pos_frac": 0.703125, "sample": [107.21465301513672, -131.86380004882812, -97.27809143066406, 153.136962890625, 148.4996337890625, -42.717926025390625, 63.83221435546875, 35.490386962890625, 44.906890869140625, 86.6107177734375, -49.10307312011719, 11.27789306640625, -16.163467407226562, 138.87432861328125, 129.5457763671875, 125.49676513671875, 110.66284942626953, 4.5561676025390625, 102.52455139160156, 36.617462158203125, -22.10546875, 21.755569458007812, 7.977081298828125, 131.6783447265625, 124.40248107910156, 80.21719360351562, 53.39208984375, 95.6966552734375, 121.88760375976562, -5.6033935546875, 39.799163818359375, 149.00363159179688, -12.334014892578125, 71.580810546875, -121.50384521484375, 3.1575279235839844, 49.5616455078125, -9.704765319824219, 120.87588500976562, 113.53321838378906, 39.561309814453125, 15.079360961914062, 13.759185791015625, 12.160858154296875, 74.1129150390625, 127.77656555175781, 4.571836471557617, 86.56198120117188, 25.35479736328125, -105.64097595214844, -63.37255859375, 64.85992431640625, 0.0, -4.3836669921875, 123.17201232910156, 20.7679443359375, 122.517822265625, 150.447021484375, 129.37332153320312, 65.96792602539062, -2.14971923828125, 35.43719482421875, -99.42105102539062, 124.59628295898438, 76.5908203125, -25.71917724609375, 128.42550659179688, 162.01483154296875, 47.31257629394531, 31.747344970703125, -97.02603149414062, 77.11257934570312, 29.8385009765625, -62.68316650390625, 10.988685607910156, 119.75230407714844, -2.300018310546875, -3.097900390625, 24.164794921875, 109.04314422607422, 38.046836853027344, 25.514419555664062, 29.263107299804688, 132.64031982421875, -21.7503662109375, 134.61427307128906, 114.00732421875, -5.52117919921875, -47.457122802734375, 
136.5469207763672, 28.174602508544922, 67.64566040039062, -10.012733459472656, 95.61624145507812, -25.786239624023438, -18.02691650390625, 17.01251220703125, 21.388565063476562, -106.58001708984375, 96.66334533691406, -11.103973388671875, 103.49388885498047, -94.95986938476562, -9.190521240234375, 109.4534912109375, 121.50796508789062, -0.028472900390625, 64.36380004882812, 132.88848876953125, 11.03515625, 84.0048599243164, 3.9021873474121094, 112.89309692382812, 12.931671142578125, 16.54633140563965, -8.63995361328125, -64.38591003417969, 85.90863800048828, -17.5377197265625, 30.688568115234375, -7.79693603515625, 5.62506103515625, -102.7261962890625, 15.017616271972656, -0.88690185546875, 51.37425231933594, 18.560455322265625, 1.9888458251953125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000333.npy"}
|
||||
{"epoch": 0.6973821989528796, "step": 334, "batch_size": 128, "mean": 47.834197998046875, "std": 77.77783966064453, "min": -172.1069793701172, "p10": -32.56955261230468, "median": 34.380645751953125, "p90": 150.30847930908203, "max": 248.66131591796875, "pos_frac": 0.7421875, "sample": [103.34783172607422, 93.2850341796875, -23.6300048828125, -7.80035400390625, 0.0, -7.4586181640625, 6.540191650390625, -31.480880737304688, 22.3365421295166, 3.5262680053710938, 26.470489501953125, 21.6636962890625, 32.623565673828125, 97.25489807128906, -111.8768310546875, 118.84307098388672, -2.561370849609375, 155.826904296875, 38.067047119140625, -11.54425048828125, 104.85026550292969, 61.97613525390625, 189.01524353027344, 151.41307067871094, -80.8778076171875, 33.25468444824219, 20.199600219726562, 28.4422607421875, -2.0064315795898438, 16.640579223632812, 56.67144775390625, -9.756851196289062, 0.2573890686035156, 84.19000244140625, 115.22692108154297, 33.91192626953125, -3.6153488159179688, 110.3367919921875, -149.27716064453125, 131.7607421875, -1.2568702697753906, 4.923797607421875, 20.91156005859375, 16.5667724609375, 86.63741302490234, 82.24136352539062, 10.55572509765625, -38.306915283203125, 120.2537841796875, -3.312713623046875, 99.18305969238281, -114.63003540039062, 137.7764129638672, 82.4237060546875, 178.2078857421875, 123.1021728515625, -4.826557159423828, 57.6549072265625, 7.928314208984375, 41.071929931640625, 12.309211730957031, 6.511081695556641, 176.6270751953125, 55.463775634765625, -33.93385314941406, -101.54156494140625, 45.659332275390625, -83.56539916992188, 25.490447998046875, -32.129852294921875, 149.8350830078125, 10.962814331054688, -16.78447723388672, 164.11444091796875, 127.7049560546875, 117.19490814208984, 87.56373596191406, 19.30054473876953, 12.286376953125, 121.99079895019531, 34.849365234375, 41.2037467956543, -5.304412841796875, 117.09590148925781, 105.10401916503906, 163.1072540283203, 66.6945571899414, 50.436248779296875, 
123.11566162109375, 0.60064697265625, 129.70254516601562, 120.59637451171875, 127.98533630371094, -17.316070556640625, 2.5641937255859375, 26.01962661743164, -11.603370666503906, 9.115753173828125, 140.12603759765625, 162.14019775390625, 153.25990295410156, 23.04638671875, 11.649688720703125, -105.36245727539062, 137.49517822265625, -33.59552001953125, 90.89144134521484, 113.380126953125, 153.70199584960938, 130.78768920898438, -172.1069793701172, 177.77801513671875, 139.40057373046875, 0.4470691680908203, 112.81161499023438, -141.54708862304688, 20.619720458984375, 84.47625732421875, -23.297409057617188, 55.36726379394531, 41.804656982421875, 248.66131591796875, 125.78048706054688, 44.179595947265625, 51.6761474609375, -11.221893310546875, -56.01416015625, 174.2615966796875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000334.npy"}
|
||||
{"epoch": 0.6994764397905759, "step": 335, "batch_size": 128, "mean": 43.11384582519531, "std": 79.38816833496094, "min": -157.25860595703125, "p10": -50.61505737304686, "median": 25.18649387359619, "p90": 143.79428405761718, "max": 190.95230102539062, "pos_frac": 0.6953125, "sample": [-63.690155029296875, -129.37619018554688, 21.467575073242188, -40.97294616699219, 96.06112670898438, -102.27774047851562, 167.7998046875, -10.573577880859375, 21.742706298828125, 124.38056945800781, 101.5601806640625, -129.80612182617188, 43.290771484375, -7.445560455322266, -2.464080810546875, 21.714111328125, 16.093673706054688, -1.7430000305175781, -27.8369140625, -46.20176696777344, -16.192405700683594, 161.15469360351562, -0.360382080078125, 10.047775268554688, 13.680419921875, -0.228607177734375, 50.044586181640625, 68.01656341552734, -34.823394775390625, 121.4527587890625, -157.25860595703125, 177.76132202148438, -21.146560668945312, 121.69302368164062, 47.4410400390625, 174.04220581054688, 8.441635131835938, 117.18093872070312, 140.9405517578125, -0.23535919189453125, 143.24636840820312, 129.18954467773438, 145.07275390625, 108.36390686035156, 150.9500732421875, 140.87246704101562, 86.55250549316406, 116.58393096923828, 121.9014892578125, 148.06182861328125, -6.143798828125, 114.79444885253906, 13.674163818359375, 103.77123260498047, 119.5000991821289, 116.66908264160156, 177.9993896484375, 67.66543579101562, 2.962005615234375, 37.68853759765625, 23.1796875, -125.92221069335938, 4.898590087890625, 28.586517333984375, -109.91009521484375, 134.6827392578125, 65.44065856933594, 138.20819091796875, 95.21888732910156, 18.223846435546875, 14.282730102539062, -1.5145721435546875, 7.0953826904296875, 114.11543273925781, 179.63967895507812, 125.65399169921875, 3.482311248779297, -60.91273498535156, 10.021728515625, 27.193300247192383, 68.08042907714844, 70.74168395996094, -20.879058837890625, 112.02615356445312, 148.09104919433594, -16.94843292236328, -75.3858642578125, 
5.3824005126953125, 52.396240234375, 34.27850341796875, -36.398223876953125, -0.9499969482421875, 117.22848510742188, 4.129255294799805, 9.778472900390625, -14.77056884765625, 14.696319580078125, 20.449668884277344, 77.70458984375, -3.9995880126953125, 160.75271606445312, -120.66964721679688, 14.5404052734375, 83.20018768310547, 129.0301513671875, 68.18008422851562, 0.0, 139.51483154296875, 147.4036865234375, 120.51300811767578, -92.4219970703125, 129.0638427734375, 125.65058898925781, -108.48321533203125, -112.47982788085938, 190.95230102539062, 7.45667839050293, -1.2051620483398438, 14.626676559448242, -2.492034912109375, 12.221588134765625, 97.19461822509766, 39.961944580078125, 88.36953735351562, -44.4801025390625, -14.51239013671875, 55.2010498046875, 61.417083740234375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000335.npy"}
|
||||
{"epoch": 0.7015706806282722, "step": 336, "batch_size": 128, "mean": 40.29804229736328, "std": 71.69638061523438, "min": -139.93763732910156, "p10": -32.94631652832031, "median": 22.157821655273438, "p90": 139.77186889648436, "max": 231.3577880859375, "pos_frac": 0.703125, "sample": [55.74542236328125, 17.9593505859375, 57.555389404296875, -11.441009521484375, 125.93869018554688, 28.2939453125, -18.226470947265625, 115.03546142578125, 114.33926391601562, 134.49273681640625, -14.078216552734375, 22.971221923828125, 37.63069152832031, 64.65415954589844, 3.4481773376464844, 60.476165771484375, -2.511199951171875, 164.16860961914062, 7.459297180175781, 141.22988891601562, 165.18861389160156, 8.781028747558594, 96.3275146484375, 119.95124816894531, -49.702392578125, -15.546173095703125, 7.14892578125, 16.635536193847656, 13.762374877929688, 31.194580078125, 129.28321838378906, 16.877723693847656, -3.0623512268066406, 120.44186401367188, 117.5146484375, 143.4271697998047, 35.29719543457031, 194.61990356445312, 15.018714904785156, 10.462066650390625, -139.93763732910156, 118.72027587890625, -41.49053955078125, 131.03610229492188, 100.5087890625, -0.5677261352539062, -1.1462421417236328, 122.50076293945312, -35.781158447265625, 23.632389068603516, 78.47917938232422, 0.73443603515625, 21.34442138671875, 105.02145385742188, 33.88368225097656, -49.421630859375, -12.809539794921875, -12.141166687011719, 59.917091369628906, -16.20063018798828, 83.02601623535156, 33.387664794921875, -88.95391845703125, 80.68060302734375, -12.48333740234375, 85.3865737915039, 46.235870361328125, -36.568359375, 86.87492370605469, 122.25359344482422, 2.9434814453125, -7.323738098144531, 132.09689331054688, 87.30039978027344, -2.6601104736328125, 9.52264404296875, -5.655059814453125, 15.827186584472656, 78.83219909667969, -54.63008117675781, 2.9969730377197266, 36.87847900390625, 44.54193115234375, -106.80465698242188, 115.00689697265625, 36.95353698730469, 104.9757080078125, 51.790191650390625, 
19.343894958496094, 150.5815887451172, -20.45001220703125, -104.53936004638672, 146.7335662841797, 150.63436889648438, 66.4432373046875, 121.22467041015625, 159.48541259765625, 29.400535583496094, 19.444427490234375, 99.91928100585938, -4.471738815307617, 147.8705291748047, 8.21575927734375, -19.415191650390625, -126.94525146484375, 231.3577880859375, 16.435791015625, 116.42070007324219, 173.37026977539062, -136.830078125, 3.4858436584472656, -6.916877746582031, 1.098388671875, 139.14700317382812, 24.605422973632812, 10.105236053466797, 16.28125, -15.5216064453125, -84.39556884765625, 0.0, -16.371780395507812, 23.48004150390625, 11.249298095703125, 147.30734252929688, -31.73138427734375, -21.03546142578125, -15.439437866210938, 21.093734741210938], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000336.npy"}
|
||||
{"epoch": 0.7036649214659686, "step": 337, "batch_size": 128, "mean": 45.67418670654297, "std": 72.01741790771484, "min": -118.79258728027344, "p10": -48.56407012939453, "median": 34.143959045410156, "p90": 140.9084701538086, "max": 183.8577880859375, "pos_frac": 0.7109375, "sample": [34.315155029296875, -34.894378662109375, 16.214370727539062, 0.41104698181152344, 10.880523681640625, -4.358856201171875, -16.171615600585938, 165.72079467773438, 45.57281494140625, 183.8577880859375, -2.1793365478515625, 36.40910720825195, 20.515396118164062, 45.778564453125, 161.005615234375, 143.94955444335938, -65.03546142578125, 23.668113708496094, -11.261474609375, 19.100479125976562, -109.20700073242188, -56.5806884765625, -6.457305908203125, -42.978240966796875, 45.121559143066406, 3.1218185424804688, 6.61334228515625, 140.9899139404297, 3.3365211486816406, 131.7767333984375, 55.61187744140625, 147.5885009765625, 69.65994262695312, 126.95993041992188, 139.88821411132812, 116.2501220703125, 25.4287109375, 5.3356475830078125, 93.99069213867188, 102.06451416015625, -10.171913146972656, -9.31927490234375, -7.186065673828125, 134.03799438476562, 57.121429443359375, 131.4507293701172, -9.837554931640625, -107.38076782226562, -22.821807861328125, 29.37188720703125, 9.8927001953125, 120.88080596923828, -17.705856323242188, 112.3468017578125, 88.43911743164062, -51.577392578125, 27.260711669921875, 96.85610961914062, 58.73078918457031, 20.87652587890625, 0.4654998779296875, -4.761312484741211, 62.181373596191406, 156.02032470703125, 73.63526153564453, -18.600784301757812, 121.66424560546875, 114.7596435546875, -118.79258728027344, 47.25531005859375, 115.1138916015625, 152.36260986328125, 38.331390380859375, 27.551544189453125, 112.31512451171875, 3.382476806640625, -4.01019287109375, -24.88555145263672, 146.83981323242188, 67.793701171875, 15.07577896118164, 91.70196533203125, 92.2049789428711, 137.17950439453125, 116.22651672363281, 116.68975830078125, -16.822429656982422, 
101.77529907226562, -48.50276184082031, -64.36395263671875, 31.228057861328125, 128.01930236816406, 73.19775390625, 173.44760131835938, -52.323516845703125, 99.91714477539062, -6.270233154296875, -6.60333251953125, 103.04305267333984, 118.42279052734375, 81.76270294189453, -14.13531494140625, -99.43179321289062, 150.97711181640625, -34.2001953125, -77.14708709716797, 161.998291015625, 33.97276306152344, 4.357978820800781, 1.9625225067138672, -24.00830078125, 10.415847778320312, 136.48272705078125, 86.17486572265625, -64.01211547851562, 147.36790466308594, 0.0294189453125, 3.525482177734375, -76.58697509765625, 120.27133178710938, 122.95142364501953, 99.557861328125, -48.707122802734375, 51.983123779296875, 14.049468994140625, 140.87356567382812, 134.9640655517578, 85.7032470703125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000337.npy"}
|
||||
{"epoch": 0.7057591623036649, "step": 338, "batch_size": 128, "mean": 55.123435974121094, "std": 74.09307861328125, "min": -142.20790100097656, "p10": -20.591644287109375, "median": 38.1119384765625, "p90": 153.87250061035155, "max": 202.76828002929688, "pos_frac": 0.7578125, "sample": [152.20663452148438, -0.8744449615478516, 8.77325439453125, 115.83489990234375, 146.14114379882812, 160.20303344726562, 114.20995330810547, 106.36883544921875, 4.236625671386719, -79.92958068847656, 98.26002502441406, 128.91822814941406, 34.49658203125, 16.22937774658203, 140.7340087890625, 23.676116943359375, 125.336669921875, 41.1126708984375, -5.8282928466796875, -20.88494873046875, 150.98150634765625, 59.07533264160156, -1.5248851776123047, 168.3128662109375, 6.25, 66.70492553710938, 157.60977172851562, 113.76130676269531, -9.126007080078125, 1.672037124633789, 40.99664306640625, 9.6993408203125, -2.5516357421875, 16.658782958984375, 56.58695983886719, 112.68476104736328, 51.9132080078125, 146.77783203125, 24.160964965820312, 35.86601257324219, 91.05267333984375, 29.622528076171875, 73.01998901367188, 139.82034301757812, 155.583984375, 6.8390655517578125, -31.328857421875, 133.49819946289062, 104.89749145507812, 38.85516357421875, 94.283935546875, -16.88140869140625, 152.56973266601562, -106.00079345703125, 26.175506591796875, -2.9669418334960938, -6.758087158203125, 160.96087646484375, -92.24659729003906, 149.64373779296875, 121.78132629394531, 202.76828002929688, 133.31533813476562, 43.642181396484375, 55.34466552734375, 37.36871337890625, -20.4659423828125, -11.177658081054688, -119.659423828125, -4.858146667480469, 10.63094711303711, 118.22946166992188, 163.63943481445312, 21.975997924804688, -24.137664794921875, 3.0162811279296875, 6.397548675537109, 145.38983154296875, 139.7671661376953, -0.5076370239257812, 103.9669189453125, 0.6086807250976562, 114.30050659179688, -4.107719421386719, 112.1475830078125, 24.59564208984375, -9.552810668945312, 10.2918701171875, 96.3203125, 
126.72468566894531, 105.35211181640625, 165.89886474609375, 19.607391357421875, -3.8422698974609375, -107.50852966308594, 142.30722045898438, 21.516632080078125, 175.4315185546875, -80.86549377441406, 79.10045623779297, 22.97857666015625, -34.718505859375, 17.49335479736328, 41.387939453125, 31.906532287597656, 153.13900756835938, 108.62380981445312, 14.714447021484375, -34.396240234375, 170.553466796875, 44.990386962890625, -0.39492034912109375, 0.5603160858154297, 31.972412109375, -7.8646240234375, 155.9058837890625, 43.06105041503906, 101.35203552246094, 108.4171142578125, 25.01177978515625, -142.20790100097656, 1.236053466796875, 28.67901611328125, -10.801780700683594, 172.52825927734375, 171.12918090820312, -33.57330322265625, 142.99143981933594], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000338.npy"}
|
||||
{"epoch": 0.7078534031413612, "step": 339, "batch_size": 128, "mean": 55.081756591796875, "std": 75.89386749267578, "min": -166.58071899414062, "p10": -30.179758071899414, "median": 43.29835510253906, "p90": 139.2803924560547, "max": 207.189453125, "pos_frac": 0.7578125, "sample": [76.68966674804688, -46.32415771484375, 124.05683135986328, 117.56646728515625, 69.6176528930664, -10.559539794921875, 28.39599609375, -69.90602111816406, 113.25660705566406, -4.3775634765625, 165.93377685546875, 19.87738037109375, 5.709716796875, 4.231025695800781, 129.96218872070312, 2.110565185546875, 65.94784545898438, 93.64205932617188, 80.7297592163086, -58.90924072265625, 102.47077941894531, 13.224761962890625, 60.420623779296875, 115.93804931640625, 0.0, 161.91729736328125, -30.044677734375, 42.460906982421875, 0.0, 46.00754928588867, 26.968652725219727, -30.494945526123047, 170.31396484375, 127.31591796875, 7.3973388671875, -128.8465118408203, 135.8997802734375, 152.47467041015625, 25.94384765625, -13.186569213867188, 103.49148559570312, 130.75575256347656, -29.65875244140625, 120.29225158691406, 83.50860595703125, 90.14584350585938, 110.64398193359375, 0.00212860107421875, -5.0295257568359375, 127.17236328125, 39.745361328125, 9.9771728515625, 200.79949951171875, 102.07133483886719, 134.15487670898438, 126.95712280273438, 15.816200256347656, 28.492591857910156, 134.6977996826172, 152.19619750976562, 133.42877197265625, -0.08734893798828125, -10.768768310546875, -68.54910278320312, -1.4223518371582031, 161.90780639648438, 7.027313232421875, 110.69802856445312, 114.40234375, 30.7298583984375, 41.0855712890625, 17.431060791015625, -74.8947525024414, 101.17828369140625, 119.61904907226562, 21.21197509765625, 108.13128662109375, 139.8482666015625, -48.870819091796875, 118.6116943359375, 4.376380920410156, 129.25672912597656, 143.43646240234375, -10.77337646484375, 138.71438598632812, 207.189453125, 44.13580322265625, 0.0, 46.263641357421875, 42.267242431640625, 122.04266357421875, 
128.56402587890625, 128.01028442382812, 139.03701782226562, 127.74319458007812, -9.900848388671875, 70.45796203613281, -8.1065673828125, 24.3052978515625, 66.49539184570312, 132.167724609375, -46.444854736328125, 28.211822509765625, 13.382537841796875, 10.762908935546875, 122.13600158691406, 115.68148803710938, 9.033294677734375, -29.871780395507812, 3.8751449584960938, 158.7828369140625, 97.96087646484375, -166.58071899414062, 136.30621337890625, 26.035888671875, 11.097923278808594, 10.974800109863281, -12.756683349609375, 146.0037841796875, -99.61019897460938, -148.44325256347656, 171.11004638671875, 41.44122314453125, 26.60205078125, 134.14187622070312, -3.158477783203125, -95.26720428466797, 132.59158325195312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000339.npy"}
|
||||
{"epoch": 0.7099476439790576, "step": 340, "batch_size": 128, "mean": 46.09893035888672, "std": 75.33020782470703, "min": -130.83218383789062, "p10": -44.33215179443359, "median": 30.96924591064453, "p90": 139.61356658935546, "max": 202.37371826171875, "pos_frac": 0.71875, "sample": [185.1650848388672, 83.38077545166016, -45.42326354980469, 126.64237976074219, 1.5050506591796875, 139.5570831298828, -43.864532470703125, -18.245811462402344, -51.0743408203125, 136.56419372558594, -118.28628540039062, 115.78565979003906, -70.7713623046875, 2.4921112060546875, 0.0, 139.745361328125, 20.97344970703125, -46.11183166503906, 105.73644256591797, 142.16636657714844, 134.81829833984375, 14.92681884765625, -32.132171630859375, 80.59793090820312, -40.10247802734375, -4.49652099609375, -26.6771240234375, 123.1270980834961, 8.350845336914062, 141.39889526367188, 176.81768798828125, 4.89802360534668, 65.50314331054688, 139.36029052734375, 26.54718017578125, -13.010574340820312, -0.3321495056152344, -102.88601684570312, 93.94169616699219, 14.934463500976562, 133.79263305664062, 120.49896240234375, -96.09147644042969, 44.28411865234375, 7.483734130859375, 139.4001007080078, -15.17681884765625, 65.72996520996094, 7.3980712890625, -15.054092407226562, 94.933349609375, 170.38168334960938, -2.263355255126953, -78.81298828125, 8.980016708374023, -10.425994873046875, 124.75282287597656, 104.88519287109375, 33.075042724609375, -24.924285888671875, 12.673995971679688, -130.83218383789062, 145.96705627441406, 0.0, 177.09927368164062, 21.65191650390625, 49.3709716796875, 16.788162231445312, 132.3160400390625, 32.86834716796875, 16.181396484375, -6.172563552856445, 120.98503112792969, -90.33808898925781, 148.14697265625, 31.358123779296875, 77.2518310546875, -22.826171875, 139.89471435546875, 9.97509765625, 136.72406005859375, 202.37371826171875, -78.9539794921875, 32.4495849609375, -13.298065185546875, 115.02182006835938, 79.98501586914062, 30.580368041992188, 116.16778564453125, 
22.243377685546875, 46.51800537109375, -13.241090774536133, 81.2884521484375, -16.6063232421875, 121.4857177734375, 125.0362548828125, -62.24629211425781, 5.285247802734375, 49.864471435546875, 51.317657470703125, 196.70306396484375, 8.575721740722656, 129.23306274414062, 13.124130249023438, 132.49844360351562, -31.82122802734375, 14.691650390625, -116.62478637695312, 128.16744995117188, 109.90440368652344, 122.6035385131836, 37.428367614746094, 12.160465240478516, 97.79502868652344, 144.38656616210938, 139.34193420410156, 22.9632568359375, 28.76311492919922, 12.436298370361328, 95.31732177734375, 35.09466552734375, 57.435577392578125, -22.006103515625, 20.14980697631836, 44.8023681640625, 16.268173217773438, 122.22199249267578, -9.67556381225586], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000340.npy"}
|
||||
{"epoch": 0.7120418848167539, "step": 341, "batch_size": 128, "mean": 36.02146530151367, "std": 78.53331756591797, "min": -139.488037109375, "p10": -60.13180847167968, "median": 16.857803344726562, "p90": 135.58721160888672, "max": 199.9520263671875, "pos_frac": 0.6796875, "sample": [6.772453308105469, 6.32159423828125, 93.34796142578125, 131.85845947265625, 97.30624389648438, -38.234275817871094, 110.26249694824219, 15.4466552734375, 21.290054321289062, 3.54931640625, 120.44232177734375, 38.13653564453125, 129.73788452148438, 12.541309356689453, 172.041015625, 164.39683532714844, -41.48309326171875, 6.39910888671875, 17.451019287109375, 12.944028854370117, 127.55477905273438, 14.5645751953125, 117.85214233398438, -116.79945373535156, 167.90045166015625, 50.839569091796875, 10.626274108886719, 7.6305999755859375, 7.3875274658203125, -2.54730224609375, 1.83538818359375, 37.87677001953125, -15.675872802734375, -6.013214111328125, -110.77285766601562, 8.110992431640625, -62.89495849609375, -117.67681884765625, -0.29217529296875, 25.618927001953125, 5.44476318359375, 40.006710052490234, 132.59500122070312, 80.40841674804688, 65.18583679199219, 8.30084228515625, 133.82110595703125, 103.31758117675781, -2.388731002807617, 104.83062744140625, 95.63360595703125, 135.6601104736328, 171.01983642578125, -23.0706787109375, 113.43359375, 96.99057006835938, 97.7996826171875, 49.71099853515625, 10.900222778320312, -27.0089111328125, 41.1239013671875, -98.4862060546875, 199.9520263671875, -9.787139892578125, 93.16998291015625, -112.92158508300781, -98.482421875, 128.14285278320312, 150.9346466064453, 6.684967041015625, 61.67535400390625, 135.55596923828125, 97.13516235351562, -22.31584930419922, 112.61079406738281, 133.19631958007812, 130.1661376953125, -0.792633056640625, 20.491714477539062, -130.88160705566406, 64.16253662109375, 4.294586181640625, -13.102157592773438, -27.4232177734375, 179.03448486328125, 28.47369384765625, -21.32916259765625, 43.01458740234375, 
4.53143310546875, 142.3394775390625, -98.6455078125, 66.87705993652344, 161.00308227539062, 122.95291137695312, -104.49969482421875, -13.936676025390625, -46.92848205566406, -1.8423080444335938, -12.778778076171875, 30.43133544921875, -17.999725341796875, 134.07012939453125, 18.6280517578125, -13.4444580078125, -106.33197021484375, 35.77903747558594, 35.802642822265625, -19.506256103515625, -139.488037109375, -27.97589874267578, 16.26458740234375, -58.947601318359375, -4.45062255859375, 145.27728271484375, -120.88108825683594, -5.7014312744140625, 118.48019409179688, 140.48260498046875, 121.99856567382812, 74.71165466308594, 119.7332763671875, 7.3319091796875, 32.574562072753906, -25.2640380859375, 177.0224609375, 15.20318603515625, -15.97039794921875, 11.302909851074219], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000341.npy"}
|
||||
{"epoch": 0.7141361256544503, "step": 342, "batch_size": 128, "mean": 46.51683044433594, "std": 69.98486328125, "min": -161.5255126953125, "p10": -24.136993408203118, "median": 22.976425170898438, "p90": 141.8795654296875, "max": 199.80889892578125, "pos_frac": 0.734375, "sample": [-6.824920654296875, 24.77691650390625, 19.695098876953125, 19.659912109375, 92.30279541015625, 192.00393676757812, 18.904338836669922, 44.88933563232422, 7.214851379394531, 37.253448486328125, -49.52183532714844, 5.649932861328125, -4.56182861328125, 121.980224609375, 98.54948425292969, 121.99676513671875, -7.114356994628906, 62.5281982421875, -65.74153900146484, 8.079498291015625, 66.92385864257812, 7.42291259765625, 11.06341552734375, 55.60186004638672, -49.619651794433594, 9.264511108398438, 7.668769836425781, -43.44586181640625, 39.550018310546875, 116.75233459472656, 21.175933837890625, 126.46768188476562, 63.306800842285156, 65.1251220703125, -8.290435791015625, 16.48602294921875, 16.728317260742188, -13.272319793701172, -22.00335693359375, 31.35614013671875, 21.150339126586914, 160.72207641601562, 19.57568359375, -29.115478515625, -14.656539916992188, -21.91405487060547, -30.797210693359375, 58.051605224609375, 61.078857421875, 5.2191162109375, 113.65585327148438, -1.0113677978515625, -30.73931884765625, 144.94284057617188, 10.038803100585938, 168.6903076171875, 136.26577758789062, -1.8935546875, 57.7115478515625, 99.76763916015625, 141.8140869140625, 146.84088134765625, 105.47978210449219, 104.72693634033203, 107.71650695800781, -98.95439147949219, 1.2816009521484375, 0.0, -8.7711181640625, 31.192138671875, 161.68634033203125, 11.998764038085938, 149.47079467773438, 5.062941551208496, 31.492828369140625, 117.72357177734375, 134.06875610351562, 76.27904510498047, 160.2473907470703, 3.8544273376464844, -15.1214599609375, 123.7734375, 32.8165283203125, 98.648193359375, -66.7789306640625, -19.147781372070312, -161.5255126953125, -17.360015869140625, 102.84521484375, 
88.45030212402344, 137.8702392578125, 104.33084106445312, -1.914093017578125, 8.939056396484375, 20.545146942138672, 72.85670471191406, 106.36886596679688, 20.196487426757812, 13.829513549804688, -94.43280029296875, 18.452667236328125, 121.123291015625, -11.812240600585938, 199.80889892578125, 12.628570556640625, 134.25933837890625, -64.57879638671875, 20.161224365234375, 83.46466064453125, -89.99639129638672, 146.89471435546875, 111.59371948242188, 53.25848388671875, 130.6089324951172, 117.38775634765625, 142.0323486328125, 24.81256103515625, 13.9056396484375, 116.08477020263672, 129.5175018310547, -14.161125183105469, -12.736495971679688, -19.981491088867188, 176.1314697265625, 111.47122192382812, -6.845451354980469, 170.06591796875, 15.47607421875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000342.npy"}
|
||||
{"epoch": 0.7162303664921466, "step": 343, "batch_size": 128, "mean": 41.50767517089844, "std": 75.15618133544922, "min": -185.340576171875, "p10": -34.330152893066405, "median": 22.041027069091797, "p90": 139.2221969604492, "max": 193.19412231445312, "pos_frac": 0.7109375, "sample": [-30.573974609375, 113.97006225585938, 130.958740234375, 4.995338439941406, 40.290191650390625, 61.27850341796875, 34.099365234375, 19.60198974609375, -85.29095458984375, 23.9202880859375, 135.08734130859375, -39.4521484375, -11.505615234375, -25.532012939453125, 92.80874633789062, 157.40884399414062, 145.80169677734375, 44.2802734375, -10.27471923828125, -6.15159797668457, 174.73828125, 22.05908203125, 143.1458740234375, 136.70968627929688, 93.89832305908203, 35.53355407714844, 2.093292236328125, 71.28964233398438, 10.72430419921875, 118.10577392578125, 44.690643310546875, -121.32569122314453, 50.308319091796875, -8.51129150390625, -106.31340026855469, 19.107887268066406, 14.486572265625, 131.07955932617188, -24.176513671875, 13.9527587890625, -36.27650451660156, 163.12905883789062, 38.35931396484375, 94.081787109375, 0.0, -96.83221435546875, 21.99725341796875, 126.59690856933594, 103.7044448852539, 187.02960205078125, -114.8681640625, 4.02984619140625, 11.07377815246582, 10.383234024047852, 136.99261474609375, -33.496002197265625, 106.893798828125, 66.84564208984375, -10.563880920410156, 87.77835845947266, 116.45515441894531, 4.3795318603515625, 6.3706817626953125, 2.71893310546875, 122.45494079589844, 131.6438751220703, -21.163604736328125, 133.06149291992188, 17.774993896484375, 67.43296813964844, 65.32608795166016, -17.989898681640625, 60.8126220703125, -12.951858520507812, -0.6905517578125, 64.87029266357422, 3.644256591796875, 193.19412231445312, 113.9439697265625, 137.9703826904297, 32.043800354003906, 40.26051330566406, -45.519744873046875, 12.523651123046875, 109.89396667480469, 7.4022979736328125, 3.786224365234375, 33.12876892089844, 11.65032958984375, -26.91937255859375, 
5.155788421630859, 132.265380859375, 131.82839965820312, 185.38232421875, 13.3524169921875, -25.53734588623047, 114.39964294433594, 56.91552734375, 127.0234375, 22.022972106933594, 150.0854034423828, 129.36741638183594, -1.99951171875, 118.4266357421875, -185.340576171875, 142.14309692382812, -14.346527099609375, -3.4824180603027344, -76.47579956054688, -63.74713134765625, 11.026874542236328, 116.84449005126953, 142.74957275390625, 1.813995361328125, -67.97003173828125, -0.681365966796875, -9.187210083007812, -20.26031494140625, 26.92083740234375, 168.1226806640625, 80.05599975585938, -0.3356971740722656, 26.2576904296875, 8.710235595703125, 6.58197021484375, 151.38296508789062, -14.85809326171875, -123.3158187866211], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000343.npy"}
|
||||
{"epoch": 0.7183246073298429, "step": 344, "batch_size": 128, "mean": 48.025794982910156, "std": 74.1978988647461, "min": -144.13153076171875, "p10": -31.604293823242184, "median": 29.658103942871094, "p90": 153.3678192138672, "max": 192.88211059570312, "pos_frac": 0.7265625, "sample": [17.652496337890625, -2.156341552734375, 112.0008544921875, 78.40823364257812, -18.490692138671875, 129.61874389648438, 131.23252868652344, 19.797515869140625, 127.9908447265625, -15.9053955078125, 110.33026123046875, 19.542831420898438, 106.45143127441406, -11.908842086791992, 51.020416259765625, 159.25808715820312, 19.08648681640625, 26.809814453125, -137.93365478515625, 84.79837036132812, -100.24136352539062, 5.46923828125, 0.1822509765625, -34.234954833984375, 168.20159912109375, 77.83324432373047, -13.608634948730469, 157.58334350585938, 152.74818420410156, 11.85400390625, 157.02005004882812, -37.90753173828125, -2.405303955078125, 123.00048828125, -144.13153076171875, 30.914138793945312, 129.45257568359375, 120.55522918701172, 12.8463134765625, -11.919509887695312, -138.09059143066406, 18.285736083984375, 21.299392700195312, -26.785629272460938, -5.544410705566406, 16.393753051757812, -61.4443359375, 128.7705841064453, 166.03118896484375, 37.036956787109375, -37.45855712890625, -42.04473876953125, -1.10052490234375, -47.83338165283203, 98.7557373046875, 94.6844482421875, 159.0242156982422, 18.510101318359375, 173.96347045898438, 155.56190490722656, 133.79901123046875, 130.12667846679688, 192.88211059570312, 120.19253540039062, 141.1720428466797, -4.973335266113281, -30.47686767578125, 157.27896118164062, -10.774185180664062, -10.112701416015625, -59.157379150390625, 14.687286376953125, 21.91802978515625, 181.32077026367188, 48.85174560546875, 66.35881042480469, 18.66143798828125, 88.21820068359375, 21.24261474609375, 80.07383728027344, 30.599517822265625, 19.921600341796875, 43.90277099609375, 126.77409362792969, -9.21826171875, 78.39715576171875, 3.1841583251953125, 
132.68927001953125, 20.396621704101562, 42.07536315917969, 163.47381591796875, 101.68140411376953, 19.39043426513672, -15.545303344726562, -5.941864013671875, 147.327880859375, 52.203399658203125, -1.9873123168945312, 25.477188110351562, -17.759017944335938, -8.3636474609375, 32.07347106933594, 17.591888427734375, 129.75689697265625, -12.94708251953125, -69.87841796875, 134.34765625, 38.3350830078125, 17.541542053222656, 9.422393798828125, 3.35345458984375, 96.9178466796875, 11.089736938476562, 153.455810546875, 84.95773315429688, 28.716690063476562, 45.423797607421875, 45.99644470214844, 28.608154296875, 153.33010864257812, 115.1138916015625, 30.749649047851562, 64.07305908203125, 139.40316772460938, 46.790740966796875, -19.57469940185547, 146.83221435546875, -112.97946166992188], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000344.npy"}
|
||||
{"epoch": 0.7204188481675393, "step": 345, "batch_size": 128, "mean": 39.71239471435547, "std": 65.84900665283203, "min": -154.97027587890625, "p10": -23.548479080200195, "median": 19.139442443847656, "p90": 137.03343811035157, "max": 177.6351318359375, "pos_frac": 0.7421875, "sample": [33.728271484375, 109.92024993896484, 72.11123657226562, -126.61836242675781, 160.694580078125, 57.276092529296875, -8.776123046875, -37.587799072265625, 16.658843994140625, -1.2891521453857422, 116.81838989257812, 10.345191955566406, -0.18088531494140625, 5.917854309082031, 58.10047912597656, 6.301025390625, 147.21102905273438, 14.26953125, 11.48199462890625, -6.3535614013671875, 15.646728515625, -24.1929931640625, 101.8515396118164, 136.6306610107422, 3.3525609970092773, 44.150634765625, 124.75906372070312, 33.32965087890625, -98.188232421875, 0.5485763549804688, -8.972114562988281, 26.100128173828125, -23.272258758544922, 16.692718505859375, -1.136383056640625, 53.7171630859375, 56.869140625, 3.960113525390625, 15.3582763671875, 25.54351806640625, 30.315887451171875, -6.89622688293457, -30.102275848388672, 118.90179443359375, 12.78887939453125, 76.18310546875, 18.5169677734375, -1.1574592590332031, 44.10010528564453, 171.11016845703125, 111.15776062011719, 122.82904052734375, 45.2098388671875, 137.97325134277344, 155.7368621826172, 51.87705993652344, 19.7550048828125, -51.23272705078125, 15.576446533203125, -7.448486328125, 143.40142822265625, 27.617965698242188, 60.82427978515625, 18.859237670898438, 18.7864990234375, -62.74433898925781, -154.97027587890625, 14.837730407714844, -86.4754638671875, -22.231842041015625, 151.76068115234375, 155.66671752929688, 133.1720428466797, 20.933761596679688, 23.680381774902344, -18.13574981689453, 177.6351318359375, -33.842567443847656, 14.701873779296875, -2.7068405151367188, 135.9615478515625, 145.1060791015625, 62.391014099121094, -81.63507080078125, 72.25604248046875, 3.5529937744140625, -36.67762756347656, 166.1051025390625, 
104.7608642578125, 57.06395721435547, 150.45225524902344, 118.65415954589844, 19.434722900390625, -16.13629150390625, 90.76887512207031, 14.982940673828125, 38.298805236816406, 124.3230972290039, 133.1510009765625, -6.54803466796875, 136.2073974609375, -26.3270263671875, 113.64228057861328, -22.806060791015625, 2.0443115234375, 26.630767822265625, 4.3788299560546875, 19.370208740234375, 6.2086029052734375, -14.254058837890625, -12.71466064453125, 100.83011627197266, 18.908676147460938, 24.166885375976562, 141.10577392578125, 121.51300048828125, 13.01617431640625, 31.366039276123047, 0.0, 9.907768249511719, 3.1159133911132812, 16.182830810546875, 6.8603515625, -0.24269676208496094, 127.87568664550781, 7.28155517578125, 89.52484130859375, 84.38330078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000345.npy"}
|
||||
{"epoch": 0.7225130890052356, "step": 346, "batch_size": 128, "mean": 46.955223083496094, "std": 73.39617919921875, "min": -170.29119873046875, "p10": -31.755299568176262, "median": 31.2237548828125, "p90": 146.96556396484374, "max": 176.71176147460938, "pos_frac": 0.734375, "sample": [-50.161285400390625, -30.020475387573242, 1.0636234283447266, -55.14573669433594, 72.0006103515625, 40.9591064453125, 157.58786010742188, 159.34927368164062, 115.91475677490234, -3.2343902587890625, 126.3946533203125, -19.951065063476562, -16.81256103515625, 27.44427490234375, 127.49368286132812, -115.89544677734375, -58.07810974121094, 2.8743896484375, 23.5299072265625, 161.06503295898438, 21.147384643554688, -3.3382110595703125, 8.70166015625, -35.80322265625, 21.105770111083984, 117.30105590820312, 11.211380004882812, 174.0670166015625, -12.04095458984375, -4.9822845458984375, 35.6656494140625, 49.00109100341797, -89.83441925048828, 24.9449462890625, 4.5461273193359375, 144.88308715820312, 81.61857604980469, 153.37347412109375, 16.323522567749023, 19.13482666015625, 74.22406005859375, 78.20343017578125, -96.01571655273438, 20.334564208984375, -9.455978393554688, 85.92474365234375, -8.96185302734375, -18.242935180664062, 16.305328369140625, 39.42250442504883, 126.75186157226562, 14.7884521484375, 117.55284881591797, -45.820709228515625, 19.33624267578125, 53.21173858642578, 168.26324462890625, 13.132736206054688, -1.3991317749023438, 85.94454956054688, 151.495361328125, 172.16302490234375, 29.7274169921875, 9.41646957397461, 27.432083129882812, 71.84968566894531, 156.10015869140625, -15.311370849609375, 72.84603881835938, 92.29660034179688, 71.54849243164062, 26.16986083984375, 108.52261352539062, 165.69830322265625, 146.525634765625, 140.12765502929688, -98.58224487304688, 147.9920654296875, 54.87156677246094, 32.7200927734375, 109.35418701171875, -154.447998046875, 144.37686157226562, 5.802001953125, -7.159698486328125, -8.273983001708984, -10.403556823730469, 5.207389831542969, 
-28.476348876953125, 105.72076416015625, 96.5064697265625, 43.091552734375, -13.01953125, 138.17666625976562, -1.6381988525390625, -12.826301574707031, 111.26689147949219, -170.29119873046875, 2.129505157470703, 131.38250732421875, 1.9193382263183594, 7.30438232421875, 111.34213256835938, 98.35462951660156, 158.71311950683594, 37.742408752441406, 111.84930419921875, 127.3658447265625, 15.556976318359375, 15.28863525390625, 0.0, 47.29957580566406, -2.761444091796875, 71.74278259277344, 120.2669677734375, 113.98439025878906, 38.434600830078125, 122.782470703125, 35.513877868652344, -60.942962646484375, 118.19465637207031, 23.062713623046875, -37.98188781738281, 112.12542724609375, 138.66128540039062, 116.10206604003906, 4.643621444702148, 176.71176147460938], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000346.npy"}
|
||||
{"epoch": 0.724607329842932, "step": 347, "batch_size": 128, "mean": 42.19906234741211, "std": 71.86200714111328, "min": -188.69488525390625, "p10": -32.131701660156246, "median": 28.277008056640625, "p90": 138.3924377441406, "max": 167.2215576171875, "pos_frac": 0.71875, "sample": [1.5258331298828125, 73.79568481445312, 119.208740234375, 4.553680419921875, 52.61700439453125, 94.3962173461914, 3.0177536010742188, 141.15069580078125, 14.693824768066406, 134.7466583251953, -188.69488525390625, 70.3414306640625, 133.90919494628906, 33.7880859375, 150.875732421875, -59.164459228515625, -36.6177978515625, -2.5103607177734375, 125.64334106445312, 151.42332458496094, 113.54023742675781, -28.760894775390625, 115.70108032226562, 10.625564575195312, 87.33882904052734, 36.31634521484375, 131.6851806640625, 19.173095703125, 11.21542739868164, -3.6908721923828125, 2.29498291015625, 58.3778076171875, 161.7326202392578, 22.789398193359375, 118.31793212890625, 40.296207427978516, 142.1988525390625, -8.894134521484375, 41.116546630859375, 12.996650695800781, 15.672599792480469, 137.2103271484375, 165.37109375, 25.395965576171875, 24.14996337890625, 85.4697265625, -14.756546020507812, 12.889892578125, 17.002273559570312, -105.28680419921875, -3.1383056640625, -0.5993194580078125, 65.77290344238281, -24.077072143554688, 68.75595092773438, 102.88351440429688, -1.51708984375, 129.03565979003906, 107.37002563476562, -87.17631530761719, 111.63729858398438, 17.993682861328125, -0.334197998046875, -15.715660095214844, -6.1431427001953125, 89.591796875, 127.47091674804688, 40.72868347167969, 161.45370483398438, 38.126556396484375, 23.1917724609375, -126.62030029296875, -34.27117919921875, 3.6474952697753906, 99.89584350585938, -4.853919982910156, -56.706504821777344, -15.031211853027344, 4.3819122314453125, 3.884429931640625, 6.231941223144531, 103.20086669921875, 148.105712890625, -15.06396484375, 14.074268341064453, -133.83645629882812, 104.5704116821289, 167.2215576171875, 
27.1593017578125, 66.86640930175781, -14.835067749023438, 71.1270751953125, -56.785308837890625, 127.8198471069336, 17.55792999267578, -13.777130126953125, 63.866455078125, -1.6947689056396484, 84.10662841796875, -37.272216796875, -156.86166381835938, -44.204071044921875, 159.3150634765625, 135.01229858398438, 0.0, 26.991729736328125, 20.706817626953125, 33.5303955078125, 118.84730529785156, 149.34954833984375, 29.39471435546875, 2.0084304809570312, 130.6926727294922, 117.37074279785156, -8.80670166015625, -3.44598388671875, 38.790191650390625, -4.5045166015625, 121.60643005371094, 29.931869506835938, 34.4095458984375, 150.65243530273438, 159.61521911621094, 70.57992553710938, 14.744789123535156, -31.21478271484375, 58.422515869140625, 34.074729919433594], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000347.npy"}
|
||||
{"epoch": 0.7267015706806282, "step": 348, "batch_size": 128, "mean": 37.06032943725586, "std": 68.82481384277344, "min": -177.672119140625, "p10": -22.625549316406246, "median": 12.641761779785156, "p90": 133.77763977050782, "max": 181.91693115234375, "pos_frac": 0.671875, "sample": [14.375885009765625, 12.23779296875, 3.185832977294922, 4.5093994140625, -4.3157958984375, 9.71380615234375, 127.3836669921875, -7.942806243896484, 92.29855346679688, 113.83477020263672, -11.580718994140625, 132.98365783691406, 50.08978271484375, 120.52099609375, 37.681488037109375, 12.79925537109375, 167.5504150390625, 110.94393920898438, -89.7899398803711, 0.0, 14.08612060546875, 47.48756408691406, 92.43340301513672, -0.519073486328125, -37.520843505859375, 163.05677795410156, -4.640869140625, 0.5139007568359375, -9.73504638671875, 102.85824584960938, 56.17732238769531, 96.67857360839844, -9.55010986328125, -2.0841140747070312, 69.8079833984375, 0.00506591796875, 135.97137451171875, -5.1327972412109375, 116.16917419433594, 146.11309814453125, 5.042097091674805, -18.13995361328125, -71.815185546875, 28.382171630859375, -2.590057373046875, 130.1953887939453, 34.729583740234375, 1.794891357421875, 20.257400512695312, 86.8713607788086, 21.05748176574707, 93.81060791015625, -8.51190185546875, -0.3647651672363281, -10.93267822265625, 43.12199401855469, 139.72012329101562, 170.90420532226562, 145.0596923828125, -94.84910583496094, -8.77655029296875, -19.689453125, -25.793212890625, 108.9407958984375, 0.6076507568359375, 83.73968505859375, 15.49261474609375, -2.356904983520508, 8.755874633789062, 43.7734375, 121.61750030517578, -112.0657958984375, -122.32052612304688, -3.655641555786133, -71.92666625976562, -1.182708740234375, -14.58538818359375, 11.736801147460938, 90.66957092285156, 7.63165283203125, 165.62506103515625, 7.96527099609375, 49.892120361328125, 174.7804718017578, 3.0052490234375, 159.44537353515625, 16.754180908203125, -25.7799072265625, -28.9283447265625, 17.5723876953125, 
-8.042974472045898, 135.63026428222656, 119.35211181640625, -9.813499450683594, 181.91693115234375, -21.273681640625, 8.988075256347656, 6.0362091064453125, -65.96745300292969, 8.03192138671875, 125.97042846679688, 10.6868896484375, 115.77760314941406, -11.480369567871094, 59.97624969482422, -0.17441558837890625, 130.768798828125, 40.863311767578125, 1.011077880859375, 129.42037963867188, 105.40078735351562, -14.3992919921875, 72.80975341796875, -41.51677703857422, 8.250022888183594, 39.15577697753906, 149.08399963378906, 34.03314208984375, 107.57624053955078, 32.018768310546875, 118.642578125, 3.5704879760742188, -4.5937652587890625, 12.484268188476562, -13.51678466796875, 42.12147521972656, -177.672119140625, 79.2518310546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000348.npy"}
|
||||
{"epoch": 0.7287958115183246, "step": 349, "batch_size": 128, "mean": 51.03040313720703, "std": 76.63325500488281, "min": -170.59210205078125, "p10": -26.072302246093745, "median": 30.14319610595703, "p90": 156.3229568481445, "max": 212.29168701171875, "pos_frac": 0.765625, "sample": [94.11390686035156, 136.4423065185547, 4.28765869140625, -170.59210205078125, -74.59909057617188, -0.2723846435546875, 212.29168701171875, 107.95001220703125, -100.24093627929688, 51.7003173828125, 42.75395202636719, -70.03437805175781, 111.7174072265625, 158.597412109375, 52.62225341796875, 129.38259887695312, 171.755615234375, 9.562026977539062, 100.62005615234375, 161.218994140625, 73.41937255859375, 1.07574462890625, -24.7044677734375, -22.068359375, -5.9934844970703125, 8.754379272460938, -47.4986572265625, 2.731386184692383, 162.2869873046875, 155.3481903076172, 50.88471984863281, 117.79461669921875, 122.37881469726562, -45.69354248046875, 30.308334350585938, 27.2645263671875, 172.7378387451172, -1.814849853515625, 17.691085815429688, 47.01849365234375, 7.370201110839844, -1.9425640106201172, 131.51022338867188, 2.1315765380859375, 15.778961181640625, 45.61737060546875, -10.7166748046875, 193.04876708984375, 26.074127197265625, 15.063690185546875, -50.543182373046875, -0.9351367950439453, -12.082275390625, 145.2955322265625, 138.72860717773438, 25.202957153320312, 6.880134582519531, 4.2116241455078125, -96.38198852539062, 73.95271301269531, 0.0, 187.13418579101562, -4.15478515625, 75.5211181640625, 121.56842041015625, 27.176025390625, 131.76904296875, 149.34014892578125, 134.23257446289062, -5.284446716308594, 141.9591064453125, -29.263916015625, -84.63655090332031, 194.794189453125, 1.435333251953125, 45.61444091796875, -8.8074951171875, 110.4900894165039, 5.347015380859375, 19.38750457763672, 18.69427490234375, 102.7414779663086, 175.469482421875, 88.84649658203125, 159.36199951171875, 9.2174072265625, 61.24211120605469, 10.52796745300293, 136.1845703125, 44.3277587890625, 
56.746856689453125, 8.29522705078125, 135.7237091064453, 8.331668853759766, -17.866241455078125, 7.36505126953125, 135.2686004638672, 71.42417907714844, 25.563079833984375, -107.76922607421875, 36.63232421875, 31.011375427246094, 47.04420471191406, 1.42303466796875, -132.880859375, 21.88458251953125, -3.8220062255859375, 5.920766830444336, 13.174308776855469, 122.00674438476562, 21.8118896484375, 132.21563720703125, 29.978057861328125, 117.26112365722656, 138.2599334716797, 162.06851196289062, 138.92758178710938, -20.13897705078125, 100.32125854492188, 175.21963500976562, 9.830230712890625, 91.12736511230469, 117.83798217773438, 112.57186889648438, -43.582733154296875, -3.0285797119140625, 13.291122436523438, 122.74375915527344], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000349.npy"}
|
||||
{"epoch": 0.7308900523560209, "step": 350, "batch_size": 128, "mean": 46.67707061767578, "std": 78.1683120727539, "min": -141.92398071289062, "p10": -46.08222961425781, "median": 37.787879943847656, "p90": 148.64996643066405, "max": 194.72174072265625, "pos_frac": 0.78125, "sample": [148.31390380859375, 49.291107177734375, 1.2823371887207031, -3.5467987060546875, 9.77872085571289, 38.407012939453125, 15.350601196289062, 166.71417236328125, 1.7895336151123047, 116.66464233398438, 29.2894287109375, 14.647048950195312, 144.41567993164062, -8.757553100585938, 24.933807373046875, 149.43411254882812, 133.39666748046875, 149.51390075683594, 75.5294418334961, -38.09305953979492, 136.6363067626953, 14.9718017578125, 3.785064697265625, 99.02186584472656, 97.36739349365234, 124.26457214355469, 46.793304443359375, 93.18191528320312, 180.53585815429688, 166.43283081054688, 53.300628662109375, 51.49737548828125, 16.88900375366211, -3.0195159912109375, -45.604827880859375, 49.073951721191406, 13.364120483398438, 18.850738525390625, 55.779205322265625, 33.3131103515625, 14.8016357421875, 96.47918701171875, 108.78585815429688, -121.96795654296875, 143.10147094726562, -14.3421630859375, 7.7868804931640625, 46.29130554199219, -141.92398071289062, 4.0175323486328125, 41.71778869628906, 134.61505126953125, 188.90158081054688, 99.40830993652344, 26.229248046875, 57.110137939453125, 143.1639404296875, 129.17343139648438, 57.609771728515625, 177.76473999023438, 100.16081237792969, -42.501953125, 146.27940368652344, 8.830657958984375, -50.437652587890625, -27.80181121826172, 40.19255065917969, 25.738258361816406, -127.40728759765625, 8.52271842956543, 121.66204833984375, 121.617431640625, 2.937530517578125, -47.1961669921875, -12.135589599609375, 126.28738403320312, -8.965728759765625, 24.989532470703125, 17.91436767578125, 63.743408203125, 163.40863037109375, 137.22225952148438, 105.40753936767578, 14.5604248046875, 31.235260009765625, 122.89073944091797, 16.59429931640625, 
156.624755859375, -94.90518188476562, 7.749725341796875, 166.82034301757812, 37.76390075683594, 23.347305297851562, 37.811859130859375, 141.28192138671875, 110.60575866699219, 28.696197509765625, 125.4393081665039, 194.72174072265625, 143.57757568359375, 140.07455444335938, -47.6795654296875, 54.164031982421875, 3.2417945861816406, -128.39590454101562, -93.77007293701172, -82.3905029296875, 5.526336669921875, -128.4075927734375, 44.68487548828125, -131.2454833984375, -75.97038269042969, 18.727203369140625, 158.54315185546875, 77.91846466064453, 44.76129150390625, 10.19927978515625, 47.61079406738281, 153.49966430664062, -45.001068115234375, -6.402130126953125, 19.74119758605957, -24.572906494140625, -7.68511962890625, 145.35501098632812, -41.58527374267578, 45.2353515625, 31.690040588378906], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000350.npy"}
{"epoch": 0.7329842931937173, "step": 351, "batch_size": 128, "mean": 31.657249450683594, "std": 80.2809066772461, "min": -186.78558349609375, "p10": -44.12251434326172, "median": 13.47446060180664, "p90": 148.67961730957032, "max": 212.07373046875, "pos_frac": 0.640625, "sample": [0.0, 130.1334228515625, 16.41455078125, 177.94647216796875, -1.6200714111328125, 165.80972290039062, 10.053604125976562, -58.053733825683594, 44.674468994140625, -6.490142822265625, 60.2806396484375, -57.71522521972656, 12.787384033203125, 8.991607666015625, 144.27197265625, -114.20932006835938, 27.917972564697266, 98.12816619873047, -30.805984497070312, 19.259521484375, 20.27167510986328, 121.43278503417969, -15.061508178710938, 188.2464599609375, 64.52738952636719, 142.84414672851562, 7.531820297241211, -17.180450439453125, -2.326934814453125, 1.4767608642578125, 152.68434143066406, 164.3921661376953, 53.414878845214844, 22.711185455322266, -41.45281982421875, -44.11433410644531, -2.274465560913086, 124.11798095703125, -28.12261199951172, -30.234966278076172, -91.46878051757812, 129.504150390625, 11.958824157714844, 47.38507080078125, -35.8526611328125, 126.26924133300781, 8.04168701171875, 19.885040283203125, -24.063018798828125, 16.26416015625, -38.297576904296875, -28.534393310546875, 16.89947509765625, -11.58721923828125, 10.19769287109375, 1.2406005859375, 139.7899169921875, -27.823944091796875, -11.49188232421875, 16.442626953125, 160.58016967773438, 207.07809448242188, 16.334421157836914, 181.57049560546875, 138.56605529785156, -18.48699951171875, 58.576324462890625, 12.431121826171875, 115.5401611328125, -88.69691467285156, -3.482757568359375, 104.45047760009766, -25.28778076171875, 12.579940795898438, 148.55392456054688, -163.12307739257812, 30.821807861328125, -12.19671630859375, 52.8668212890625, 27.57366943359375, 0.0, 0.75048828125, 23.340133666992188, 148.972900390625, 127.48374938964844, -0.8984146118164062, 1.7158050537109375, 17.211692810058594, 112.51470947265625, 
-24.539520263671875, 31.432586669921875, 5.3185577392578125, 83.543212890625, -13.404136657714844, 0.0, 163.10369873046875, 149.3392333984375, 3.20263671875, 14.161537170410156, -32.4676513671875, 5.2826385498046875, -186.78558349609375, 96.66613006591797, 143.71658325195312, 67.49493408203125, -44.1416015625, 212.07373046875, 21.350494384765625, 141.2123565673828, -1.4639511108398438, -0.5415153503417969, 8.691574096679688, -13.93597412109375, 29.908172607421875, 5.093593597412109, 136.5682373046875, 19.112960815429688, -44.96630859375, -134.0848388671875, -135.7435302734375, 24.05169677734375, 63.769134521484375, -162.174072265625, 74.07887268066406, -37.37016296386719, -44.970184326171875, 199.776123046875, 37.01017761230469], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000351.npy"}
{"epoch": 0.7350785340314137, "step": 352, "batch_size": 128, "mean": 51.848243713378906, "std": 74.72885131835938, "min": -126.032470703125, "p10": -31.782675170898425, "median": 42.89330291748047, "p90": 153.45039825439454, "max": 184.39553833007812, "pos_frac": 0.7421875, "sample": [107.77426147460938, 20.516082763671875, 145.2542724609375, -45.824615478515625, 11.7896728515625, 173.37628173828125, 88.91144561767578, 84.36349487304688, 102.17152404785156, 3.9941864013671875, -1.255615234375, 124.11216735839844, -126.032470703125, 16.17938232421875, 130.87928771972656, 4.693996429443359, 31.81884765625, 160.828369140625, 130.54684448242188, 102.86761474609375, 180.41705322265625, -18.785409927368164, -41.08416748046875, 162.69183349609375, 38.44091796875, -93.20660400390625, -17.840423583984375, 95.94859313964844, 17.16998291015625, 106.94567108154297, -1.4975128173828125, -0.1507720947265625, 159.49200439453125, 14.702804565429688, 130.14166259765625, -96.70899963378906, 13.041923522949219, 82.74420166015625, -14.265640258789062, 13.799346923828125, 60.22918701171875, -114.4818115234375, -24.431640625, 92.50494384765625, -27.086746215820312, 25.3458251953125, 0.0, 65.695556640625, 55.265350341796875, 153.87509155273438, -2.637603759765625, 168.81491088867188, 110.48590087890625, 26.38671875, 134.92703247070312, 86.79806518554688, 19.838241577148438, 0.0, 64.0306396484375, 84.6192626953125, 96.34918975830078, 1.835906982421875, 17.40569305419922, 101.3135986328125, -96.12930297851562, -20.683074951171875, 8.5570068359375, 55.07731628417969, -86.152099609375, 4.9845123291015625, 97.67131042480469, 176.1151123046875, 65.14151000976562, 11.250173568725586, 7.407461166381836, -106.2177734375, 161.45611572265625, 11.118480682373047, -71.628662109375, -76.55978393554688, 184.39553833007812, 157.9868621826172, 153.2683868408203, 21.522377014160156, -8.59405517578125, 126.77496337890625, 11.006591796875, -38.95440673828125, 0.57196044921875, 126.07260131835938, 
42.37554931640625, -9.695846557617188, 54.32342529296875, 23.358232498168945, 43.12730407714844, 42.6593017578125, 139.31243896484375, 119.3196029663086, -8.113037109375, 163.82752990722656, 50.9322509765625, 73.66152954101562, 116.89555358886719, 123.24667358398438, 130.57313537597656, -28.709075927734375, 124.7314453125, 141.92294311523438, 53.82579040527344, -7.352577209472656, 24.105499267578125, 114.29466247558594, 145.6055908203125, 4.12127685546875, -67.58282470703125, -11.39453125, 29.92264175415039, 150.12210083007812, 118.2979965209961, 163.82420349121094, -24.2476806640625, -19.4571533203125, 123.50436401367188, 100.38229370117188, 69.80296325683594, 105.09503936767578, 21.124855041503906, 127.32803344726562], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000352.npy"}
{"epoch": 0.7371727748691099, "step": 353, "batch_size": 128, "mean": 51.031314849853516, "std": 75.65597534179688, "min": -149.47857666015625, "p10": -26.98517150878906, "median": 37.487831115722656, "p90": 140.3115249633789, "max": 223.4647216796875, "pos_frac": 0.765625, "sample": [28.647594451904297, 125.08319091796875, 31.195842742919922, -59.84674072265625, 40.557342529296875, 150.3262939453125, -96.0038833618164, 137.19976806640625, 133.72543334960938, 15.839584350585938, 23.668838500976562, 28.043685913085938, 5.3299560546875, 80.19076538085938, -7.04913330078125, 14.045562744140625, -13.770294189453125, -25.504608154296875, 41.2547607421875, -17.543212890625, 75.96846008300781, 99.14822387695312, 9.499870300292969, -149.47857666015625, 123.52253723144531, 20.4571533203125, 68.31016540527344, 115.73257446289062, 207.0211181640625, -1.9707832336425781, 160.85986328125, 120.77962493896484, 138.22177124023438, -62.85809326171875, 136.63653564453125, -118.9415283203125, 167.35012817382812, 91.0115966796875, 76.4449462890625, 223.4647216796875, 27.664474487304688, -84.91087341308594, 36.360687255859375, 140.1546630859375, 33.32276916503906, -11.9671630859375, 140.242919921875, 211.8245849609375, 183.05935668945312, 115.8624267578125, 123.18315124511719, 53.810890197753906, -79.91410064697266, 109.41067504882812, 14.577957153320312, 194.64163208007812, 3.7361316680908203, -17.160011291503906, 110.65750885009766, 55.606658935546875, 167.0948486328125, 72.11355590820312, -35.80938720703125, -95.65765380859375, 3.0354766845703125, 93.51904296875, -110.85786437988281, -15.50592041015625, 65.50567626953125, 26.736709594726562, 48.52685546875, 92.18577575683594, 121.80497741699219, 135.72900390625, 131.0264129638672, 114.30905151367188, 16.577369689941406, 124.12762451171875, 65.32926940917969, 9.511024475097656, 127.42724609375, 7.671607971191406, 48.77580261230469, 24.260772705078125, 26.355987548828125, 148.63861083984375, 34.409271240234375, 117.50349426269531, 
110.03924560546875, 4.285236358642578, -3.991788864135742, -24.21942138671875, 117.82089233398438, 53.559295654296875, 13.011955261230469, -20.6705322265625, -3.3776397705078125, 16.01251220703125, 13.513435363769531, -15.678619384765625, 67.51904296875, 11.332107543945312, 27.45355224609375, 5.195219039916992, 144.49847412109375, 38.61497497558594, 85.82879638671875, 50.10829162597656, 132.02601623535156, -37.636474609375, 27.66948699951172, -0.7506332397460938, 28.901798248291016, 60.2740478515625, 6.153697967529297, 0.0, 183.887939453125, 64.461669921875, -30.4398193359375, -13.6409912109375, 2.048370361328125, -129.2713623046875, -12.233169555664062, 18.217071533203125, 140.4716033935547, 100.02972412109375, 126.68790435791016, 113.21624755859375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000353.npy"}
{"epoch": 0.7392670157068063, "step": 354, "batch_size": 128, "mean": 31.247615814208984, "std": 79.09293365478516, "min": -153.365478515625, "p10": -64.62171173095703, "median": 10.838900566101074, "p90": 136.75571899414064, "max": 196.812255859375, "pos_frac": 0.640625, "sample": [7.989463806152344, -44.49299621582031, -122.03691101074219, 135.93923950195312, 44.260009765625, 187.35052490234375, 33.62432861328125, 6.50701904296875, -3.1087608337402344, 2.2458343505859375, 23.211776733398438, 40.822113037109375, -25.735435485839844, 14.613906860351562, 136.71600341796875, -9.7359619140625, -12.939064025878906, -0.6093597412109375, 8.352371215820312, -1.837249755859375, 96.12921142578125, 109.43275451660156, 89.77456665039062, -81.49298095703125, 27.072288513183594, 17.91374969482422, -9.861625671386719, -7.1309356689453125, 127.3883056640625, 184.60342407226562, -28.6756591796875, -3.791595458984375, -0.5412693023681641, 55.764007568359375, 160.98916625976562, 25.898284912109375, 12.166004180908203, -1.451080322265625, 132.4841766357422, 128.06564331054688, 5.2755889892578125, -56.69854736328125, 93.75483703613281, -18.483062744140625, 130.83094787597656, 131.97674560546875, 5.42828369140625, 7.832954406738281, -147.0405731201172, 5.0326080322265625, 1.1754150390625, 181.80148315429688, 29.525970458984375, 159.34469604492188, 154.49319458007812, 160.30535888671875, 116.12174224853516, 68.41714477539062, 41.3131103515625, 66.43157958984375, 59.34135437011719, -25.695846557617188, -19.06024932861328, -84.38796997070312, 6.557098388671875, 117.11534118652344, 9.169921875, 196.812255859375, -3.2768478393554688, -4.332832336425781, 137.7769775390625, -64.50340270996094, 113.06124877929688, 6.901296615600586, 183.90380859375, 56.53486633300781, 97.0598373413086, -100.03657531738281, 3.67864990234375, 0.03541755676269531, -6.18505859375, -64.89776611328125, 167.07272338867188, 38.854095458984375, -19.597198486328125, 36.254913330078125, -19.67388916015625, 128.46875, 
-20.420425415039062, -57.29951477050781, 20.9364013671875, -24.24097442626953, 136.0089111328125, 9.397109985351562, 110.16078186035156, 26.529739379882812, -9.725418090820312, -17.49304962158203, -43.15958786010742, -131.32333374023438, 49.23747253417969, 2.806121826171875, -15.087783813476562, 99.60973358154297, 111.96981811523438, 37.633941650390625, 3.74493408203125, -2.0921630859375, 18.908935546875, 166.8900146484375, 131.231689453125, 55.981719970703125, -93.26549530029297, 136.848388671875, -117.03044128417969, 68.89959716796875, -153.365478515625, 106.93309020996094, -107.95343017578125, -56.29991149902344, 101.5823974609375, 9.511796951293945, -99.98806762695312, 19.83343505859375, -21.986083984375, -101.36137390136719, 94.55641174316406, 12.91119384765625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000354.npy"}
{"epoch": 0.7413612565445026, "step": 355, "batch_size": 128, "mean": 50.84136962890625, "std": 77.21736907958984, "min": -176.06427001953125, "p10": -32.43272705078125, "median": 35.77118682861328, "p90": 152.68844451904297, "max": 193.0435791015625, "pos_frac": 0.7890625, "sample": [-1.8600730895996094, -176.06427001953125, 93.8756103515625, 99.94033813476562, 0.96990966796875, 12.360221862792969, 49.972686767578125, 5.2005615234375, 59.30108642578125, 33.48388671875, 144.0976104736328, 8.034378051757812, -9.710975646972656, 117.31716918945312, 3.4826393127441406, 137.91294860839844, 167.62796020507812, 136.7308349609375, -4.36199951171875, 0.83538818359375, -38.61851501464844, 19.441009521484375, 39.744293212890625, 165.53231811523438, 13.65771484375, 166.5269012451172, -14.456058502197266, -11.97650146484375, 35.46405029296875, 26.19500732421875, 24.455230712890625, 21.691986083984375, 48.009429931640625, 166.39599609375, 162.69305419921875, 48.07499694824219, 173.33621215820312, 21.12433624267578, 2.991180419921875, -94.03363037109375, 90.2761001586914, 134.00567626953125, 109.47409057617188, 91.67161560058594, -54.77037048339844, 21.83758544921875, -7.9781494140625, 73.02198791503906, 34.37493133544922, 20.96636962890625, 132.19808959960938, 53.905792236328125, 127.39617919921875, 12.536537170410156, 0.90753173828125, 114.33551025390625, 2.675548553466797, 31.152231216430664, 5.5899658203125, -11.83334732055664, 155.80490112304688, 10.200515747070312, -32.23284912109375, -11.6094970703125, 112.60850524902344, -150.12948608398438, 23.63824462890625, 193.0435791015625, 73.9736328125, 134.72393798828125, 1.306783676147461, 152.42242431640625, -142.300048828125, 139.85073852539062, 1.3995399475097656, 44.03337097167969, 127.55987548828125, 48.60190963745117, 172.12600708007812, 88.54595947265625, 91.93576049804688, 131.11526489257812, 2.840911865234375, -4.2808837890625, 22.6524658203125, -54.11383056640625, 39.059173583984375, -61.836181640625, 
30.46478271484375, 35.57000732421875, 125.58903503417969, -23.383377075195312, -129.00680541992188, 4.0572509765625, -7.623748779296875, 39.154296875, 19.908111572265625, 153.3091583251953, -32.89910888671875, 137.66314697265625, 36.332977294921875, -78.62903594970703, 184.509765625, 108.86930847167969, 161.66934204101562, 140.67132568359375, 143.6059112548828, 80.40997314453125, -50.85639953613281, -13.78741455078125, 16.648590087890625, 35.97236633300781, 19.596343994140625, 148.51138305664062, 119.34756469726562, 124.86876678466797, 135.06124877929688, 114.21451568603516, 130.12530517578125, 102.59307861328125, 3.34381103515625, -1.9547119140625, 71.53250122070312, 156.86624145507812, 53.852325439453125, -113.94906616210938, 146.07228088378906, 31.314544677734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000355.npy"}
{"epoch": 0.743455497382199, "step": 356, "batch_size": 128, "mean": 42.74309539794922, "std": 69.42635345458984, "min": -158.52993774414062, "p10": -36.8487564086914, "median": 27.0853328704834, "p90": 132.55061950683594, "max": 166.09495544433594, "pos_frac": 0.71875, "sample": [0.030548095703125, -1.70782470703125, 131.39730834960938, -2.315643310546875, 103.51171875, -40.3873291015625, 24.086517333984375, -39.25914001464844, 6.982383728027344, 12.859298706054688, -12.552261352539062, 9.821044921875, 22.015106201171875, 69.74795532226562, -7.76866340637207, 3.2904834747314453, 13.815528869628906, -18.63043212890625, 3.8682861328125, 50.18780517578125, -43.932403564453125, 128.23240661621094, 31.624176025390625, 35.143768310546875, 0.3289070129394531, 157.29083251953125, -99.65016174316406, 123.37520599365234, 3.1053924560546875, 118.4461669921875, 35.558197021484375, 143.05731201171875, 17.998992919921875, 79.81114196777344, 127.75486755371094, 140.73707580566406, 106.2147216796875, 108.66806030273438, 157.29681396484375, 109.42509460449219, -108.74922180175781, 132.27545166015625, -0.5374755859375, 8.538154602050781, 18.613357543945312, 27.211212158203125, -7.2428741455078125, 160.39736938476562, 85.30490112304688, -6.8244476318359375, 35.032318115234375, 129.69500732421875, 134.88014221191406, 153.22857666015625, -0.00360107421875, 35.79925537109375, -81.66340637207031, 124.54850769042969, 103.08797454833984, 39.371246337890625, 133.07589721679688, -115.95118713378906, 119.02619934082031, 99.419189453125, 126.40299987792969, 115.47319793701172, 110.64691162109375, -9.570388793945312, 2.0812911987304688, 4.72198486328125, -17.021408081054688, 24.223297119140625, 117.37997436523438, 18.582008361816406, 6.78948974609375, 98.95507049560547, 80.17960357666016, 0.2554931640625, 147.28321838378906, -48.27464294433594, 27.892921447753906, -24.191204071044922, 163.16159057617188, 36.28253173828125, -65.89677429199219, 87.04421997070312, 119.29182434082031, 
108.53535461425781, 26.959453582763672, -1.3590660095214844, -78.23820495605469, -4.854007720947266, -10.25872802734375, -158.52993774414062, 28.669189453125, 14.537261962890625, 46.183349609375, 0.0, 48.688323974609375, 27.27178955078125, 90.62196350097656, 20.530776977539062, -8.587837219238281, -20.733734130859375, 0.3567619323730469, 17.009063720703125, 23.186038970947266, 108.34825134277344, -1.3513946533203125, -78.98716735839844, 136.08148193359375, 166.09495544433594, 130.25799560546875, 119.11904907226562, -21.11321258544922, 138.7747802734375, 119.77236938476562, 7.319541931152344, 49.97100830078125, 122.23782348632812, -35.81573486328125, 47.107208251953125, -2.5941925048828125, -54.13932800292969, 55.171295166015625, -11.663679122924805, 24.506973266601562, 132.32550048828125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000356.npy"}
{"epoch": 0.7455497382198953, "step": 357, "batch_size": 128, "mean": 45.502906799316406, "std": 78.23091125488281, "min": -160.3754119873047, "p10": -37.02628784179687, "median": 21.90802574157715, "p90": 153.87926788330077, "max": 224.58831787109375, "pos_frac": 0.71875, "sample": [80.019775390625, 136.09335327148438, -14.91354751586914, 0.433563232421875, 140.52670288085938, 40.78363037109375, 0.0, 4.707275390625, 78.99246215820312, 6.620849609375, 132.6568603515625, -121.82302856445312, 3.906522750854492, 224.58831787109375, 183.58660888671875, 188.712890625, 12.376399993896484, -0.413848876953125, 36.68702697753906, 46.65667724609375, 14.7147216796875, 13.90966796875, 83.4630126953125, 135.8304901123047, 156.7719268798828, -64.51223754882812, 206.095947265625, 20.45526123046875, 25.022884368896484, 165.10244750976562, -31.622955322265625, 12.0855712890625, -18.93549346923828, 7.39356803894043, 190.29428100585938, 38.0374870300293, -16.359573364257812, -88.19918060302734, 6.001373291015625, 78.28116607666016, 92.14508056640625, 186.06964111328125, 25.6756591796875, 104.58396911621094, 65.33055114746094, -41.288665771484375, 6.0941162109375, -90.594970703125, 144.618408203125, 18.68157958984375, -29.810867309570312, 37.33600616455078, 37.825531005859375, 10.25042724609375, 89.23225402832031, 122.06465148925781, 163.52484130859375, 149.90264892578125, 34.66094970703125, -3.199432373046875, 128.4825439453125, 62.134552001953125, 19.819854736328125, -53.3917236328125, 52.528564453125, -10.672943115234375, 129.051025390625, 128.63070678710938, 37.729827880859375, 41.5418701171875, -6.905242919921875, -84.47769165039062, -19.834793090820312, 3.1574554443359375, 1.6263580322265625, 131.79132080078125, 76.13240051269531, -58.813751220703125, 19.298614501953125, -35.199554443359375, -52.459808349609375, 8.492645263671875, 15.894424438476562, 25.219680786132812, 165.73013305664062, 20.48895263671875, 143.74569702148438, -34.30047607421875, -19.779800415039062, 
-0.41478729248046875, -6.357887268066406, 16.187225341796875, 4.689056396484375, -74.23988342285156, 50.910858154296875, 127.00819396972656, 0.342193603515625, -85.7308349609375, 0.0, 152.63955688476562, -121.09982299804688, 0.0, 33.817047119140625, -3.524505615234375, 151.58251953125, 10.685585021972656, 15.009757995605469, 134.77381896972656, 141.31735229492188, 122.22171020507812, 159.44058227539062, -160.3754119873047, 141.72213745117188, 82.79483032226562, 2.6439380645751953, 89.29556274414062, -6.238922119140625, 116.51399230957031, 104.94380950927734, 167.003662109375, 172.69387817382812, 23.327098846435547, -10.791099548339844, -22.106369018554688, 120.8941650390625, -0.7627029418945312, 3.3497085571289062, 97.41429138183594], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000357.npy"}
{"epoch": 0.7476439790575916, "step": 358, "batch_size": 128, "mean": 51.83292007446289, "std": 80.85848999023438, "min": -156.83297729492188, "p10": -40.22066192626953, "median": 35.51853561401367, "p90": 155.38541564941406, "max": 192.73388671875, "pos_frac": 0.7578125, "sample": [-0.045806884765625, 2.0054092407226562, 97.85055541992188, -58.493377685546875, 9.33544921875, 19.106576919555664, -30.8443603515625, 104.73513793945312, 168.35818481445312, 192.73388671875, 8.923038482666016, 2.7457542419433594, 28.479873657226562, 138.91612243652344, 159.71908569335938, 139.69552612304688, 21.307647705078125, 161.83587646484375, 155.45144653320312, 36.827850341796875, -106.0, 125.82605743408203, -21.264129638671875, -109.08065795898438, 75.56134033203125, 1.0330429077148438, 115.78958129882812, -0.6612548828125, 158.24636840820312, -9.831512451171875, 9.230850219726562, 91.60188293457031, 27.520858764648438, 113.29354858398438, 78.8553466796875, 41.616485595703125, 185.68228149414062, -74.44058227539062, 32.2808837890625, 115.61067199707031, -102.70097351074219, 149.65029907226562, 146.11239624023438, 154.9532470703125, 29.522483825683594, 140.43478393554688, 111.5773696899414, 137.7728729248047, 142.81869506835938, 4.96551513671875, -30.31329345703125, -10.02935791015625, -6.631256103515625, 21.53741455078125, 135.1543426513672, -10.697586059570312, 143.4742431640625, 146.96438598632812, -57.52824401855469, 192.0814208984375, -42.4788818359375, -156.83297729492188, 168.26666259765625, -14.842376708984375, 3.1906356811523438, 159.05484008789062, 80.98773193359375, 28.732715606689453, 35.619140625, 9.886474609375, 35.417930603027344, 11.597686767578125, 18.231231689453125, 27.85796356201172, 4.57373046875, -116.02301025390625, 129.45606994628906, 112.39280700683594, -12.297523498535156, 165.23834228515625, 53.67875671386719, -54.65950012207031, 120.26441955566406, 48.07501220703125, 155.35711669921875, 33.6796875, 100.05865478515625, 37.44915771484375, 
37.23749923706055, 148.052001953125, 11.120269775390625, 166.46853637695312, 129.66641235351562, 101.59713745117188, 110.3621826171875, 137.5616455078125, -30.87994384765625, 9.29058837890625, 6.49090576171875, 105.65325927734375, 177.2659912109375, 9.958587646484375, 89.38682556152344, 154.35531616210938, 101.358154296875, 147.73028564453125, 142.61239624023438, -143.6939697265625, -15.599952697753906, -119.97457885742188, -99.54856872558594, 6.4571380615234375, 120.9945068359375, 4.593193054199219, 37.216835021972656, -34.47319793701172, 11.047584533691406, 12.686088562011719, -2.47418212890625, 30.48504638671875, 0.0, 45.52166748046875, 70.72518920898438, 43.1734619140625, 132.78668212890625, -39.25285339355469, 31.4696044921875, -3.3798370361328125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000358.npy"}
{"epoch": 0.749738219895288, "step": 359, "batch_size": 128, "mean": 48.932315826416016, "std": 72.33525848388672, "min": -176.662841796875, "p10": -28.306690979003907, "median": 35.267822265625, "p90": 146.51954345703123, "max": 198.37200927734375, "pos_frac": 0.765625, "sample": [48.82482147216797, -52.3115234375, 123.26431274414062, 162.41812133789062, 97.9805908203125, -51.3919677734375, 74.85106658935547, 15.890953063964844, 30.34857177734375, 44.57647705078125, 26.466800689697266, 80.24298095703125, 16.954593658447266, 3.040933609008789, -12.086700439453125, -17.154449462890625, 101.25333404541016, 45.88365173339844, 20.1903076171875, 17.756134033203125, 109.3221206665039, -13.732315063476562, 129.76556396484375, 154.08224487304688, -29.054794311523438, 131.91360473632812, 11.01397705078125, 33.89471435546875, 68.00070190429688, -162.75421142578125, -1.81982421875, 188.08001708984375, 5.39080810546875, 56.00029754638672, 2.3966426849365234, 54.652008056640625, -7.643350601196289, 128.3759765625, 144.22210693359375, 148.49871826171875, -23.904342651367188, -106.2396240234375, 126.4154052734375, 3.467987060546875, 108.52128601074219, 98.33892822265625, 12.984893798828125, 58.139190673828125, 120.70945739746094, 112.560302734375, 121.20794677734375, 1.5321426391601562, 195.91287231445312, -33.61420440673828, 118.88665771484375, 1.1717605590820312, 51.60931396484375, -148.29730224609375, 110.56311798095703, 42.7099609375, 176.98675537109375, 96.96603393554688, -176.662841796875, 61.812774658203125, 38.904266357421875, -8.987037658691406, -9.95428466796875, 44.68890380859375, 131.5887451171875, 88.06036376953125, -2.6900634765625, 29.5494384765625, -30.656524658203125, 36.64093017578125, 87.1420669555664, 82.12332153320312, -9.3184814453125, 118.65859985351562, 13.455154418945312, 128.17066955566406, 1.643310546875, 3.4503936767578125, 5.746795654296875, 114.0513916015625, 25.424545288085938, 17.19792938232422, 16.740936279296875, 127.50370788574219, 
28.26434326171875, 19.797653198242188, 151.69105529785156, 13.889144897460938, 150.906982421875, 115.43669891357422, 20.91546630859375, -28.649459838867188, -47.08103942871094, 45.39044189453125, -25.496246337890625, -0.3598442077636719, 28.937225341796875, 163.74746704101562, 30.787933349609375, -48.32872009277344, 15.131698608398438, -27.410491943359375, 152.21044921875, -28.1597900390625, 56.300872802734375, 137.16574096679688, 115.49600219726562, 104.5475082397461, 13.054534912109375, 11.501899719238281, -9.771484375, 162.3319091796875, 145.67132568359375, 91.17617797851562, 101.88003540039062, -2.3104171752929688, -7.040645599365234, 198.37200927734375, 28.767364501953125, 5.494306564331055, 152.09765625, 47.20121765136719, 103.4488525390625, -32.1868896484375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000359.npy"}
{"epoch": 0.7518324607329843, "step": 360, "batch_size": 128, "mean": 42.31696319580078, "std": 69.17524719238281, "min": -187.46173095703125, "p10": -21.853399658203124, "median": 22.562562942504883, "p90": 140.80420532226563, "max": 204.65655517578125, "pos_frac": 0.7265625, "sample": [92.51486206054688, 91.27813720703125, -44.05714416503906, 143.64300537109375, -120.93954467773438, -8.695060729980469, -103.72349548339844, 70.6317138671875, 204.65655517578125, -43.45234680175781, 31.31402587890625, 91.60964965820312, -22.190475463867188, 118.33209228515625, 47.76605224609375, 88.48492431640625, 14.850860595703125, 124.7347412109375, -23.445159912109375, 42.324188232421875, 157.0360107421875, 51.7838134765625, 15.0286865234375, 108.4676513671875, 111.9596939086914, -21.708938598632812, 4.360256195068359, 21.041015625, -15.096282958984375, 149.48825073242188, 33.71014404296875, 165.08665466308594, 84.387939453125, 16.64093017578125, 63.53851318359375, 6.0599365234375, -18.017364501953125, 121.60690307617188, -31.3812255859375, 40.00511169433594, 93.8489990234375, 93.77682495117188, 119.7762451171875, 23.756500244140625, 72.1304931640625, 21.36862564086914, 12.848419189453125, 124.52784729003906, -7.707000732421875, 36.737884521484375, -12.03521728515625, 130.2321319580078, 102.88262939453125, 82.60269165039062, 176.24765014648438, -9.529289245605469, 153.3770294189453, -33.8472900390625, 134.27772521972656, 12.052146911621094, 40.78717041015625, 33.2620849609375, 138.10928344726562, 41.65477752685547, 20.260986328125, 2.302581787109375, 156.05520629882812, 17.140777587890625, 52.86952209472656, -121.72640991210938, -1.011871337890625, 105.6785888671875, 1.4469852447509766, 140.48309326171875, -15.428436279296875, 21.023265838623047, 67.31651306152344, 152.4097900390625, 129.93490600585938, 123.64622497558594, -10.479019165039062, 103.996826171875, 6.207738876342773, -3.6612777709960938, -187.46173095703125, 105.86672973632812, 42.530242919921875, 
13.546234130859375, -18.16082763671875, 9.342422485351562, 136.50796508789062, -34.110748291015625, 165.46896362304688, 11.307533264160156, -0.7272262573242188, -0.88702392578125, 4.4575042724609375, -121.38607788085938, 8.463653564453125, 1.4300537109375, 109.89268493652344, 0.0, 20.58740234375, 17.303985595703125, 141.553466796875, 1.7979316711425781, 65.0689697265625, 39.442626953125, -0.96820068359375, 6.834260940551758, 16.62939453125, -5.172149658203125, -6.271568298339844, 112.446044921875, 9.451744079589844, 37.87114715576172, -28.75403594970703, -15.7138671875, 0.0, 11.172950744628906, 25.040786743164062, 27.19611358642578, 34.515167236328125, -21.405738830566406, -14.398773193359375, 18.454681396484375, 143.53363037109375, 145.03518676757812], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000360.npy"}
{"epoch": 0.7539267015706806, "step": 361, "batch_size": 128, "mean": 57.381004333496094, "std": 79.64491271972656, "min": -157.03811645507812, "p10": -32.97619018554686, "median": 54.15729522705078, "p90": 153.0715133666992, "max": 215.9404296875, "pos_frac": 0.7265625, "sample": [69.0815200805664, -8.667572021484375, -106.1756591796875, 36.05279541015625, 31.842681884765625, -5.1065673828125, 1.2857208251953125, -26.611419677734375, 143.00430297851562, 44.462493896484375, 113.46463775634766, 15.764007568359375, 1.2433204650878906, 184.5120849609375, -5.730752944946289, 141.82765197753906, -3.9000587463378906, 13.887863159179688, -30.2528076171875, 1.1880416870117188, 143.604736328125, 166.81979370117188, -64.18693542480469, 141.22610473632812, -1.93017578125, 168.29306030273438, 112.95289611816406, 118.51254272460938, 49.070960998535156, 173.98907470703125, 10.111480712890625, -6.666374206542969, 108.4284439086914, 67.45289611816406, 14.69772720336914, 128.89254760742188, -130.25250244140625, 90.81048583984375, 113.59459686279297, 55.63694763183594, 172.19137573242188, 182.36416625976562, 21.814712524414062, 74.03067016601562, 187.0283203125, 25.92852783203125, -97.05718994140625, -9.6571044921875, 43.71037292480469, -11.925674438476562, -2.2660579681396484, 65.50694274902344, 94.6755599975586, 0.0, 55.588104248046875, 12.410491943359375, 132.132568359375, 63.735382080078125, 3.0506668090820312, -24.91754150390625, 40.12236022949219, 125.36647033691406, -12.324495315551758, 149.34442138671875, 112.169189453125, -46.83807373046875, 33.33416748046875, 89.95137786865234, 29.06280517578125, -63.800323486328125, 78.7117919921875, -4.815765380859375, -66.2852783203125, -28.77850341796875, 91.87285614013672, 0.0, 106.8489990234375, -40.88618469238281, 52.72648620605469, -9.579803466796875, 190.45138549804688, 141.468505859375, 159.95260620117188, 51.538604736328125, -157.03811645507812, 128.53964233398438, 0.56005859375, -39.33074951171875, 215.9404296875, 
125.56866455078125, -44.57646179199219, 56.476318359375, -114.1595458984375, 138.480712890625, 159.69363403320312, 108.49867248535156, 66.83983612060547, 23.693359375, 139.4578094482422, 114.57328796386719, 145.58584594726562, -8.892807006835938, 186.59523010253906, 129.8248291015625, 146.32046508789062, 139.92095947265625, 71.32490539550781, 25.841033935546875, 31.32012939453125, 125.88153076171875, 148.031005859375, 91.60638427734375, -16.894378662109375, 82.20268249511719, 155.54718017578125, 133.14144897460938, -5.6199951171875, 143.02890014648438, 152.01051330566406, 108.58659362792969, 11.585525512695312, 113.06625366210938, -3.6004791259765625, 37.902652740478516, 30.881080627441406, 18.36284637451172, 136.128662109375, -148.326416015625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000361.npy"}
|
||||
{"epoch": 0.7560209424083769, "step": 362, "batch_size": 128, "mean": 47.75523376464844, "std": 74.67219543457031, "min": -161.41860961914062, "p10": -37.07015647888183, "median": 33.74968719482422, "p90": 139.5164367675781, "max": 233.724365234375, "pos_frac": 0.75, "sample": [20.693679809570312, 90.5535888671875, -6.408802032470703, 80.21597290039062, 124.63827514648438, 31.311492919921875, 123.8270263671875, -10.393341064453125, 59.44732666015625, 13.577789306640625, 75.97689819335938, -161.41860961914062, -128.49594116210938, -91.66835021972656, 171.18411254882812, 8.412240982055664, 146.99742126464844, -43.927490234375, 37.4246826171875, 105.18305969238281, 83.17880249023438, 129.31393432617188, 233.724365234375, 46.15216064453125, 80.29486083984375, 144.04998779296875, 115.2781982421875, -15.451873779296875, -145.45547485351562, 83.00929260253906, -24.613739013671875, -40.8878059387207, 80.54620361328125, -6.243747711181641, -68.01919555664062, 119.92971801757812, 12.258304595947266, -11.245433807373047, 140.4495849609375, 14.000724792480469, 182.972900390625, 81.59100341796875, 165.4798583984375, 122.00782775878906, 111.1065673828125, 138.35775756835938, 118.54533386230469, -1.14447021484375, -16.5423583984375, 20.1358642578125, -35.43402099609375, 115.55563354492188, 71.5545654296875, -34.612884521484375, 2.294780731201172, 4.4830322265625, -28.39697265625, 158.92620849609375, -4.54913330078125, 126.74674987792969, 123.38893127441406, -17.387725830078125, 32.47637939453125, 109.6888427734375, 99.9193115234375, 102.34796142578125, 25.9229736328125, 18.591339111328125, 105.39704895019531, 60.806976318359375, 154.3372802734375, 100.85577392578125, 57.033714294433594, 20.222869873046875, -5.437782287597656, 176.56201171875, 100.72496032714844, 5.147972106933594, 95.54949951171875, 108.25746154785156, -72.54866027832031, 118.6636962890625, -57.900020599365234, 10.052947998046875, -44.5162353515625, -68.92047119140625, -15.888126373291016, 166.9342803955078, 
8.877437591552734, 14.366386413574219, 25.127716064453125, 3.7721405029296875, -8.627021789550781, 12.950103759765625, 23.69158935546875, 91.22930908203125, 108.94368743896484, 30.218826293945312, 12.250320434570312, 3.448455810546875, -6.41973876953125, 129.52325439453125, 129.29991149902344, 20.74774169921875, 3.5906143188476562, 85.46331787109375, -124.93463134765625, 17.77630615234375, 51.892547607421875, 17.765579223632812, 157.29454040527344, 11.790973663330078, 109.79353332519531, 78.42892456054688, 29.20867919921875, -9.584735870361328, 74.16738891601562, 158.0758056640625, 31.5657958984375, 21.441070556640625, -25.1177978515625, 99.51985931396484, 116.95829772949219, 139.11651611328125, 37.94195556640625, -103.36819458007812, 72.698974609375, 35.02299499511719], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000362.npy"}
|
||||
{"epoch": 0.7581151832460733, "step": 363, "batch_size": 128, "mean": 55.67426300048828, "std": 77.45149993896484, "min": -179.03482055664062, "p10": -20.87405910491943, "median": 40.292755126953125, "p90": 154.62447052001954, "max": 218.25128173828125, "pos_frac": 0.765625, "sample": [132.03469848632812, -3.6646270751953125, -13.87750244140625, 154.33941650390625, 134.8160400390625, 58.19915771484375, -0.4436626434326172, -103.79937744140625, 43.87445068359375, 110.4366226196289, -15.209014892578125, 7.271270751953125, 133.11636352539062, 96.2098388671875, 5.785972595214844, 67.1888427734375, 88.6644287109375, 176.7969512939453, -3.48785400390625, -25.021242141723633, 21.582183837890625, 113.56509399414062, 173.81591796875, 75.30844116210938, 21.917747497558594, 161.26751708984375, 172.6327667236328, 204.87811279296875, 12.194671630859375, 8.650741577148438, 135.13107299804688, -0.6899337768554688, -13.535224914550781, -2.7026214599609375, 98.23678588867188, 1.784524917602539, -2.9569168090820312, 82.9967041015625, 136.70138549804688, 0.9095458984375, -1.7087631225585938, 22.68450927734375, 76.70635986328125, 110.87281036376953, 71.72441101074219, 70.5023193359375, -9.378768920898438, 94.21783447265625, 147.5952911376953, 41.83154296875, 124.450927734375, 16.34663200378418, 218.25128173828125, 142.6204833984375, -117.14932250976562, 86.78546142578125, -0.456024169921875, 117.5755615234375, 32.904296875, 101.251953125, 26.327056884765625, 38.09059143066406, 155.2895965576172, 120.05244445800781, 160.17987060546875, 0.42253875732421875, 163.60726928710938, 76.5091552734375, 37.49957275390625, 5.240865707397461, 130.76377868652344, 1.079864501953125, 166.61822509765625, 211.20022583007812, 6.5391845703125, 145.39874267578125, -34.905426025390625, -10.953323364257812, 42.131954193115234, 58.187347412109375, 7.7498321533203125, 13.795074462890625, -38.779815673828125, 10.47296142578125, 47.466094970703125, -19.096694946289062, -4.082000732421875, -5.423248291015625, 
-86.5958251953125, 16.34259033203125, 114.4293212890625, -121.59669494628906, 123.726318359375, 6.432350158691406, 19.895919799804688, 152.66537475585938, 134.5432891845703, -18.833465576171875, 144.24517822265625, 92.3758544921875, 23.35528564453125, 116.22958374023438, -132.83470153808594, 114.48965454101562, 155.32666015625, 132.2913818359375, 25.58702850341797, 77.10232543945312, 148.1234893798828, 66.7706298828125, -179.03482055664062, -56.94071960449219, -39.336334228515625, 0.3668670654296875, 19.853302001953125, 38.75396728515625, 10.6177978515625, 7.350860595703125, 116.5618896484375, 152.98345947265625, 150.94395446777344, 143.69842529296875, 20.773468017578125, 160.02044677734375, -38.550804138183594, 10.602447509765625, 25.050262451171875, -25.41021728515625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000363.npy"}
|
||||
{"epoch": 0.7602094240837697, "step": 364, "batch_size": 128, "mean": 41.76121520996094, "std": 77.46251678466797, "min": -190.8619384765625, "p10": -30.25634002685547, "median": 25.07036590576172, "p90": 139.4536636352539, "max": 224.94732666015625, "pos_frac": 0.6796875, "sample": [62.63255310058594, -3.0019149780273438, 109.017822265625, 32.49635696411133, 92.20578002929688, 14.344131469726562, 118.228759765625, 121.46429443359375, 111.46609497070312, 25.37645721435547, 42.33961486816406, 91.80073547363281, -7.975189208984375, -7.991943359375, -17.0035400390625, 31.261810302734375, 52.896759033203125, 5.44866943359375, 0.127593994140625, 106.13078308105469, -139.13294982910156, -7.766632080078125, 86.35736083984375, -143.64129638671875, 21.25775146484375, -75.41281127929688, 166.36676025390625, 126.9974136352539, -11.579505920410156, 128.01898193359375, 5.19708251953125, 17.21666717529297, 48.75152587890625, 7.121095657348633, -24.55908203125, -1.5968475341796875, -2.812225341796875, 123.23265075683594, 5.44110107421875, 131.93377685546875, 101.86898803710938, 11.5994873046875, -166.6571044921875, 54.25294494628906, 119.43734741210938, 161.94058227539062, -36.74681854248047, 87.8297119140625, 14.422927856445312, -24.577529907226562, 133.43527221679688, 144.74789428710938, 4.525543212890625, 59.62054443359375, -10.00164794921875, -0.30999755859375, 158.2919921875, -1.2518119812011719, -190.8619384765625, 200.2458038330078, 110.32904052734375, -10.8226318359375, -7.30511474609375, -30.145401000976562, 32.89134216308594, 111.811767578125, -29.375732421875, 157.67868041992188, -17.061492919921875, 3.693450927734375, 33.87396240234375, 137.18470764160156, 80.80841064453125, -13.506095886230469, 96.85169982910156, 6.98480224609375, 161.57122802734375, 70.62042999267578, 17.275985717773438, 127.71783447265625, 120.56791687011719, -12.190826416015625, 26.50543212890625, -6.913848876953125, -5.3274383544921875, 16.448318481445312, 24.76427459716797, -18.00604248046875, 
38.20008850097656, 187.10455322265625, 116.2777099609375, 1.7490921020507812, -30.51519775390625, 5.30419921875, 128.98590087890625, 133.3038330078125, 9.522418975830078, -1.2730674743652344, -34.393463134765625, 224.94732666015625, 36.11895751953125, 151.1852569580078, -97.24478149414062, 191.60696411132812, 112.41365051269531, -49.445648193359375, 95.67495727539062, 66.67709350585938, 3.569488525390625, -16.633636474609375, 116.20401000976562, -0.4075927734375, -15.361183166503906, 166.3235626220703, 45.61638641357422, -0.918182373046875, 8.67938232421875, -78.70469665527344, 94.6492919921875, -82.69453430175781, 6.029056549072266, 47.392250061035156, 37.13243865966797, 5.195159912109375, -59.5255126953125, 101.19683837890625, 196.4250030517578, 33.67475891113281], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000364.npy"}
|
||||
{"epoch": 0.762303664921466, "step": 365, "batch_size": 128, "mean": 52.525672912597656, "std": 75.03955841064453, "min": -147.56716918945312, "p10": -23.334467315673827, "median": 26.938644409179688, "p90": 156.5578155517578, "max": 190.93002319335938, "pos_frac": 0.7578125, "sample": [0.0, 3.5956954956054688, 103.16604614257812, 4.857025146484375, 153.25918579101562, 168.16824340820312, 17.628448486328125, 69.70050048828125, 0.0, 18.0010986328125, 17.865966796875, 4.667694091796875, 16.618362426757812, 160.07305908203125, 12.208709716796875, 12.292572021484375, 19.03118896484375, 35.092201232910156, 12.713088989257812, 147.8570098876953, 39.77703857421875, 129.8221435546875, -7.3548736572265625, 19.242767333984375, 57.21058654785156, 28.341217041015625, -24.54534149169922, 139.55654907226562, 122.43687438964844, 140.42022705078125, 141.78732299804688, -3.5252304077148438, -22.815521240234375, 111.20886993408203, -104.35687255859375, -62.329559326171875, 124.349609375, 54.136070251464844, 106.96029663085938, 58.485321044921875, -19.399625778198242, 125.41644287109375, 165.7698974609375, 112.7431640625, -33.08319091796875, 171.14947509765625, -1.259521484375, 117.621337890625, 33.30357360839844, 144.68255615234375, 124.34983825683594, -20.956100463867188, 25.53607177734375, -3.124664306640625, 15.476104736328125, 8.64459228515625, 100.99510192871094, 149.6946258544922, 139.11566162109375, -27.431777954101562, 24.5750732421875, 135.89059448242188, -34.730072021484375, 4.94500732421875, 5.074836730957031, 79.61572265625, 118.13909912109375, 124.76683044433594, 7.673004150390625, 89.54421997070312, 174.464599609375, -11.291488647460938, 167.41943359375, 2.0723114013671875, 8.024169921875, 174.24171447753906, 93.76913452148438, -120.62301635742188, 156.04241943359375, 3.4400291442871094, 165.89511108398438, 3.0670623779296875, 181.61831665039062, 2.075927734375, 16.766538619995117, -16.327606201171875, 132.99783325195312, 54.78578186035156, 5.389842987060547, 
-15.382820129394531, 75.46981811523438, 16.46240997314453, 59.760650634765625, 157.76040649414062, -4.936256408691406, 5.4124603271484375, 3.7113571166992188, -17.224456787109375, 121.54106903076172, -8.46014404296875, 23.8699951171875, -130.8737030029297, -15.3662109375, -3.817291259765625, 125.44219970703125, 0.0, 15.415475845336914, 4.265312194824219, 120.36799621582031, 167.62493896484375, 61.6397705078125, 95.34587097167969, 5.724254608154297, -25.526824951171875, -53.09075927734375, -147.56716918945312, 125.16290283203125, 163.5438995361328, -38.517730712890625, 115.73458862304688, 32.26853942871094, -103.92156982421875, 190.93002319335938, 89.34735107421875, 84.79385375976562, 126.22982788085938, 126.72291564941406, 69.25335693359375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000365.npy"}
|
||||
{"epoch": 0.7643979057591623, "step": 366, "batch_size": 128, "mean": 51.98725891113281, "std": 76.14994049072266, "min": -181.03585815429688, "p10": -32.02213439941406, "median": 40.96301460266113, "p90": 150.25621032714844, "max": 199.65155029296875, "pos_frac": 0.734375, "sample": [17.655372619628906, 33.31233215332031, 199.65155029296875, 128.21604919433594, 111.89580535888672, 126.001708984375, 151.91189575195312, 109.99417114257812, 81.83830261230469, 166.01251220703125, 149.73770141601562, 119.56253051757812, 102.41939544677734, -40.384735107421875, 30.868818283081055, 107.41548156738281, 100.47999572753906, 48.53929138183594, -120.53976440429688, 29.870956420898438, -181.03585815429688, 183.591796875, -105.36415100097656, 14.802093505859375, 48.3419189453125, 124.77374267578125, 120.46992492675781, 72.820068359375, 136.1915283203125, 95.64230346679688, -18.93768310546875, 130.93701171875, 25.222335815429688, -33.958221435546875, 146.03150939941406, -159.40286254882812, -3.0228271484375, 111.55426025390625, 25.889068603515625, 25.649749755859375, 3.993305206298828, -21.534622192382812, -31.1923828125, -15.775611877441406, -3.7607421875, 79.7806396484375, 135.91879272460938, -27.57415771484375, 99.76690673828125, -15.1463623046875, 70.33477783203125, 30.43292236328125, 11.792236328125, 55.41084671020508, 93.37217712402344, -11.712890625, 28.970947265625, 14.74749755859375, 34.311248779296875, 17.114540100097656, 1.598175048828125, 130.9964599609375, 85.56636047363281, 3.8679847717285156, 3.90032958984375, 157.5343017578125, 44.29835891723633, 98.66229248046875, 88.82219696044922, -67.29995727539062, 124.44735717773438, 154.36181640625, -57.33119201660156, 0.0, -26.873077392578125, 154.63829040527344, 56.6019287109375, 113.649658203125, 113.6258544921875, -1.5946044921875, -3.405029296875, 43.445098876953125, 165.1265869140625, 72.79592895507812, -5.4735107421875, 49.25384521484375, 19.1829833984375, 175.25790405273438, -2.6369857788085938, 
-11.923423767089844, -4.867431640625, -34.8302001953125, 93.83660888671875, 10.701309204101562, 38.48093032836914, 122.28146362304688, 93.53585815429688, -5.55792236328125, 54.8438720703125, 146.9722442626953, 129.35328674316406, 18.250900268554688, 3.230743408203125, 177.75088500976562, 132.658447265625, 21.79803466796875, 121.43992614746094, -16.95745086669922, 182.21234130859375, 0.6995563507080078, 4.31512451171875, 114.52851104736328, 151.466064453125, 4.4742431640625, -15.724937438964844, -39.361480712890625, 141.8045196533203, -124.84371948242188, 123.19963073730469, 33.7696533203125, 122.59890747070312, 13.78533935546875, 0.034454345703125, 132.40087890625, -47.027130126953125, -9.699264526367188, 153.01611328125, -35.20184326171875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000366.npy"}
|
||||
{"epoch": 0.7664921465968586, "step": 367, "batch_size": 128, "mean": 53.71388626098633, "std": 71.30438232421875, "min": -115.47305297851562, "p10": -24.854415893554688, "median": 41.757158279418945, "p90": 153.34688415527344, "max": 182.47415161132812, "pos_frac": 0.7734375, "sample": [120.47833251953125, 27.464202880859375, -13.34185791015625, 165.4381561279297, -2.6051902770996094, 16.047775268554688, 43.4024772644043, -112.22793579101562, -6.29644775390625, 40.111839294433594, 151.450927734375, 21.681293487548828, 10.307441711425781, -1.3739013671875, 61.238250732421875, 171.6231689453125, 43.437103271484375, 154.73553466796875, 0.4737548828125, 11.541748046875, 5.565399169921875, 173.793701171875, 95.95635986328125, -42.233123779296875, 25.31237030029297, 169.67453002929688, -115.47305297851562, -10.3505859375, 49.16175842285156, -13.528305053710938, -28.5928955078125, 76.340087890625, 2.3249359130859375, -12.721073150634766, 131.96646118164062, -11.141845703125, 82.38128662109375, 122.10659790039062, 104.86758422851562, -80.21011352539062, 141.974365234375, 77.48738098144531, 9.969339370727539, 107.50537109375, 145.7098388671875, 3.1119766235351562, -69.659912109375, 45.558349609375, -7.42840576171875, -30.184616088867188, 52.76995849609375, 10.121780395507812, 25.070037841796875, 72.26272583007812, 130.39183044433594, 140.44613647460938, 120.71684265136719, 101.97933197021484, 65.28959655761719, -24.3717041015625, 1.591552734375, -29.40478515625, 175.92596435546875, 100.90411376953125, 14.416641235351562, 157.85629272460938, -25.980743408203125, 117.38838195800781, 182.47415161132812, 37.299381256103516, 62.650848388671875, 21.543487548828125, 25.332168579101562, 142.40145874023438, 151.18743896484375, 143.15512084960938, 6.328094482421875, -6.971649169921875, 17.69329833984375, -42.1282958984375, 40.0111083984375, 155.44447326660156, 140.4393310546875, 151.5955810546875, 17.1754150390625, 32.778839111328125, 146.2093505859375, 17.48992919921875, 
-28.678924560546875, 86.10252380371094, 139.99166870117188, 111.6156005859375, 22.70623779296875, -8.906585693359375, 11.521951675415039, -1.6975479125976562, -84.5259780883789, 155.17483520507812, 48.378177642822266, 153.04678344726562, 2.8126258850097656, 161.83563232421875, 137.49717712402344, 1.21612548828125, -12.725120544433594, 80.26126098632812, 65.66510009765625, 154.047119140625, 9.9564208984375, 129.60223388671875, 45.43251037597656, 90.82711791992188, 25.994293212890625, -108.91474151611328, 61.23396301269531, 9.59506607055664, -0.6012725830078125, 125.95780944824219, 59.31938171386719, 135.8632354736328, 14.23121452331543, 43.87664794921875, 15.53497314453125, 5.34149169921875, 179.564697265625, 102.80496215820312, -17.364471435546875, 49.469573974609375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000367.npy"}
|
||||
{"epoch": 0.768586387434555, "step": 368, "batch_size": 128, "mean": 59.9697265625, "std": 74.12918853759766, "min": -137.88308715820312, "p10": -18.976728057861326, "median": 53.25372314453125, "p90": 152.35356903076172, "max": 219.01846313476562, "pos_frac": 0.7890625, "sample": [10.99261474609375, 167.7841796875, 99.64408874511719, 142.96185302734375, 18.688873291015625, 146.30975341796875, 25.543014526367188, -1.53875732421875, 7.0032958984375, -10.579833984375, 142.27883911132812, -28.415451049804688, 22.965240478515625, 173.86798095703125, 103.22604370117188, 53.69219970703125, 191.67193603515625, 17.98577880859375, 68.86865234375, 99.04926300048828, -20.42535400390625, -7.75146484375, 50.767303466796875, -18.35588836669922, 3.5758056640625, 23.73907470703125, -6.739967346191406, 4.91522216796875, 60.24668884277344, 142.201904296875, 2.5233383178710938, 52.81524658203125, -60.538475036621094, 40.29774475097656, 13.9884033203125, -137.88308715820312, -10.2310791015625, 219.01846313476562, 124.23477172851562, 115.97479248046875, 31.632171630859375, 152.2140655517578, 136.04571533203125, 119.62728881835938, 26.292236328125, 28.86981201171875, 123.836669921875, -8.552322387695312, 25.761865615844727, -131.15249633789062, 4.8571929931640625, 130.3525390625, 7.561521530151367, 102.008056640625, -32.56604766845703, -5.7450714111328125, 164.91375732421875, 120.29700469970703, -63.135406494140625, 106.05166625976562, 33.7315673828125, 64.99771118164062, 76.07147216796875, 135.53184509277344, 155.56704711914062, 108.29916381835938, -49.343170166015625, 83.26712036132812, 30.21160888671875, -6.5850372314453125, -1.9766845703125, 162.9991455078125, 152.6790771484375, 27.657028198242188, 31.205642700195312, 113.45225524902344, 132.67230224609375, 84.6319580078125, 147.46609497070312, 129.67840576171875, 3.433929443359375, 110.02130126953125, 142.37232971191406, 54.12109375, -9.451698303222656, 137.6678009033203, 10.279144287109375, 64.71379852294922, -1.7904205322265625, 
97.87765502929688, 0.07348442077636719, 11.114677429199219, 147.64022827148438, 15.628074645996094, 102.81558990478516, 26.106529235839844, 66.88346862792969, 101.9818115234375, -4.024040222167969, 110.04790496826172, 153.33914184570312, -52.70635223388672, 88.09281921386719, 166.8602294921875, -62.122802734375, 26.531005859375, -8.20361328125, 100.1800537109375, 188.3759765625, 160.57568359375, 28.46929931640625, 133.3940887451172, 31.272552490234375, 9.19183349609375, 17.81366729736328, 127.42680358886719, 154.98468017578125, 140.19720458984375, 120.71206665039062, 99.18399047851562, 31.647674560546875, -120.91474914550781, -74.8016357421875, 5.55718994140625, 111.51838684082031, 151.5186309814453, -74.90438842773438, 139.66482543945312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000368.npy"}
|
||||
{"epoch": 0.7706806282722513, "step": 369, "batch_size": 128, "mean": 43.541465759277344, "std": 78.92347717285156, "min": -161.49285888671875, "p10": -49.23150100708008, "median": 24.500481605529785, "p90": 142.6985122680664, "max": 208.70443725585938, "pos_frac": 0.7265625, "sample": [132.5914764404297, 27.021873474121094, 124.16264343261719, -50.100738525390625, 23.766407012939453, -9.693572998046875, 73.59724426269531, 5.539031982421875, 123.3426742553711, 0.824249267578125, 131.93215942382812, 62.274688720703125, -27.589305877685547, 87.8272705078125, 168.25430297851562, 159.39971923828125, 108.85139465332031, 105.33329772949219, 18.70599365234375, 118.26280212402344, 169.84848022460938, 3.417642593383789, 90.6827163696289, 111.14385986328125, -36.68939208984375, -144.358642578125, 130.19073486328125, -14.80743408203125, 25.560043334960938, 9.0833740234375, 117.63343811035156, 25.1741943359375, 136.72802734375, 107.27484130859375, -7.34063720703125, 125.43529510498047, -53.558349609375, 104.38458251953125, 130.94512939453125, 34.88301086425781, 9.094650268554688, 7.2122802734375, 140.747802734375, -36.0389404296875, 14.351959228515625, -114.12249755859375, -12.38665771484375, 84.167724609375, -100.47758483886719, 167.19189453125, -99.87554931640625, -110.91790008544922, 77.64730834960938, -85.31851196289062, 151.85311889648438, 110.35047149658203, -100.24227142333984, -7.260383605957031, 126.1014404296875, 8.092742919921875, -115.6565170288086, 2.774810791015625, -114.60022735595703, 142.61790466308594, 153.33277893066406, 23.82676887512207, 91.240478515625, -6.821937561035156, 147.13800048828125, -28.349166870117188, 208.70443725585938, -15.358917236328125, 19.202007293701172, 23.0460205078125, 3.1904296875, 186.3165283203125, 66.905517578125, 3.568328857421875, -48.858970642089844, 111.4726333618164, 6.536521911621094, -161.49285888671875, 6.695640563964844, 33.6588134765625, 19.385223388671875, 103.0378646850586, 146.04461669921875, 132.80955505371094, 
-8.060333251953125, 0.0, 37.06062316894531, 126.73602294921875, 6.128662109375, 114.73095703125, -5.2558135986328125, 2.5143280029296875, 66.87449645996094, -20.81463623046875, 47.1402587890625, 138.36923217773438, 15.476089477539062, 133.72763061523438, -23.90081787109375, -19.803863525390625, 62.984100341796875, -23.171070098876953, 129.81280517578125, 17.142959594726562, 142.8865966796875, -72.9005126953125, 66.679931640625, 32.05088806152344, 15.201019287109375, 128.87545776367188, 137.56277465820312, -1.699066162109375, 19.12884521484375, 159.75027465820312, -16.01702880859375, 61.114410400390625, 20.661476135253906, 157.64920043945312, 12.03387451171875, 19.244155883789062, 47.69126892089844, -21.36505126953125, 45.0321044921875, 1.563507080078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000369.npy"}
|
||||
{"epoch": 0.7727748691099476, "step": 370, "batch_size": 128, "mean": 55.00275421142578, "std": 78.71866607666016, "min": -179.85552978515625, "p10": -19.680633163452146, "median": 33.902442932128906, "p90": 146.98326416015624, "max": 274.6087341308594, "pos_frac": 0.7890625, "sample": [102.51239013671875, -136.61956787109375, 128.043212890625, 33.22004699707031, 39.286468505859375, 54.43574523925781, -22.1444091796875, 4.613311767578125, 108.31863403320312, 12.863021850585938, 99.84862518310547, 87.93307495117188, -9.914546966552734, 15.54217529296875, 111.49069213867188, 8.8057861328125, 0.384552001953125, 193.29812622070312, 5.631401062011719, 26.59368896484375, 114.905029296875, 28.947343826293945, 103.59512329101562, 128.48312377929688, -14.667633056640625, -3.7653274536132812, -18.62472915649414, 128.73397827148438, 4.806529998779297, 27.2000732421875, 26.207611083984375, -28.928466796875, 124.18716430664062, 30.683635711669922, 5.1833343505859375, 2.0202178955078125, 205.173583984375, 83.43138122558594, 4.676219940185547, 160.32327270507812, 107.89543914794922, -141.14219665527344, 125.19537353515625, 9.137474060058594, 94.83806610107422, 91.32640075683594, 52.94511413574219, 38.23069763183594, 38.164886474609375, 0.4972724914550781, 106.29782104492188, 14.859588623046875, 8.873664855957031, 113.90377807617188, 92.52520751953125, -8.673675537109375, 111.20162963867188, -5.95947265625, 146.9937744140625, -45.35736846923828, -22.348838806152344, 145.41714477539062, -10.412612915039062, -54.59443664550781, 7.099277496337891, 139.81564331054688, 195.74822998046875, 80.92840576171875, 110.2645263671875, -49.378822326660156, 2.934417724609375, -8.839508056640625, 56.02978515625, 146.978759765625, 144.774169921875, -11.9859619140625, 159.9736328125, 158.57379150390625, 103.2398681640625, 142.10975646972656, 155.28509521484375, 131.2327117919922, 34.5848388671875, 194.09750366210938, 132.81243896484375, 136.8665008544922, -24.316078186035156, 7.656768798828125, 
-144.76602172851562, 132.8842315673828, 116.86223602294922, 25.15852165222168, -72.60894775390625, 24.371063232421875, 274.6087341308594, -2.4606666564941406, 16.578765869140625, 33.18388366699219, 135.8643798828125, 137.90536499023438, 145.01780700683594, 156.4464111328125, 13.549346923828125, 150.43353271484375, -9.452163696289062, 87.89909362792969, 71.07527160644531, 26.981700897216797, 48.929656982421875, 61.72882080078125, 171.646484375, -179.85552978515625, 12.379356384277344, -10.581146240234375, 2.0557327270507812, 132.30581665039062, 13.754119873046875, 8.238269805908203, 144.98374938964844, 13.856414794921875, 15.95418930053711, -111.53298950195312, 30.26396942138672, 140.74606323242188, -7.859748840332031, 6.736053466796875, 109.91896057128906, -15.86297607421875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000370.npy"}
|
||||
{"epoch": 0.774869109947644, "step": 371, "batch_size": 128, "mean": 46.08458709716797, "std": 73.49795532226562, "min": -183.96133422851562, "p10": -25.946801757812494, "median": 34.91558837890625, "p90": 139.62055053710935, "max": 186.58172607421875, "pos_frac": 0.7578125, "sample": [0.2869682312011719, 81.95193481445312, 42.96728515625, 129.53390502929688, -12.49737548828125, 0.48175048828125, 169.69097900390625, 25.67218017578125, 50.44468688964844, 17.520355224609375, 1.566986083984375, -173.58499145507812, 9.035186767578125, 12.097869873046875, 26.045303344726562, -10.11328125, -56.28900146484375, 76.23637390136719, 10.492401123046875, -17.780029296875, -24.51165771484375, 35.820556640625, -9.08635139465332, 151.31103515625, 63.05548095703125, 72.09085083007812, -0.04662132263183594, 45.01171875, -23.668319702148438, 40.62005615234375, 116.0196533203125, 149.55723571777344, -78.67472839355469, 24.040634155273438, -183.96133422851562, 63.017120361328125, -37.795745849609375, 6.900604248046875, 109.93132019042969, 25.003753662109375, 143.302001953125, 10.73931884765625, -83.78611755371094, 15.7010498046875, 48.731689453125, 5.2340087890625, 11.426681518554688, 0.0, -29.29547119140625, 7.109809875488281, 160.6654815673828, 135.81800842285156, 90.785400390625, -65.7423095703125, -122.76614379882812, -5.9367828369140625, 32.06431579589844, 160.98150634765625, 137.46304321289062, 11.752006530761719, 20.374252319335938, 109.80841064453125, 73.75810241699219, 178.93838500976562, 44.48297119140625, 76.53521728515625, 87.74676513671875, 128.89361572265625, 149.27377319335938, -19.423248291015625, 32.22774887084961, 132.03964233398438, 111.18744659423828, -63.56414794921875, 103.81613159179688, 119.78766632080078, 104.21617126464844, 29.585006713867188, 138.04278564453125, 17.320632934570312, -6.11407470703125, -118.78689575195312, 20.48675537109375, 186.58172607421875, 94.21957397460938, -23.70623779296875, -82.70856475830078, 120.5787582397461, 120.5615234375, 
37.13587188720703, 1.7420120239257812, -17.218017578125, 89.28256225585938, 98.20762634277344, 38.20042419433594, -2.4365158081054688, 125.97116088867188, 75.71453857421875, 21.93744659423828, 178.77468872070312, 27.141061782836914, -2.3795833587646484, 80.20968627929688, 59.89839172363281, 128.411865234375, 128.1805419921875, -0.9895477294921875, 34.0106201171875, 17.653968811035156, 97.5374755859375, 123.3650894165039, -46.17350769042969, 47.7852783203125, 37.18030548095703, 9.413055419921875, 131.88482666015625, 147.16220092773438, 17.365209579467773, 174.80548095703125, 134.19149780273438, 0.836090087890625, -23.81787109375, 30.4725341796875, -6.735809326171875, 43.123390197753906, 124.92431640625, 99.91864776611328, 157.33816528320312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000371.npy"}
|
||||
{"epoch": 0.7769633507853403, "step": 372, "batch_size": 128, "mean": 60.29133605957031, "std": 76.72049713134766, "min": -146.5286865234375, "p10": -18.086067199707028, "median": 56.73408317565918, "p90": 158.610546875, "max": 232.8436279296875, "pos_frac": 0.78125, "sample": [165.004150390625, -4.8912353515625, 123.07060241699219, 56.14510726928711, -1.6032657623291016, 123.27459716796875, 110.42387390136719, -7.57281494140625, 111.60888671875, 152.9748992919922, 68.03366088867188, 64.13706970214844, 122.2935791015625, 116.79859924316406, -21.486251831054688, 98.50350952148438, 14.601276397705078, 120.937744140625, -8.94919204711914, 18.483062744140625, 76.75851440429688, 33.09855651855469, 110.655517578125, 2.04217529296875, -13.499755859375, 105.5133056640625, 185.6533203125, 68.03596496582031, 136.37486267089844, 117.85482788085938, -22.26556396484375, -22.5643310546875, 163.91268920898438, -8.20123291015625, 74.77487182617188, 108.84129333496094, 168.4600830078125, 5.7834014892578125, 83.03004455566406, 55.51148223876953, -15.932571411132812, 161.09664916992188, 89.06830596923828, 95.50435638427734, 173.2861328125, -135.96453857421875, 174.760986328125, 62.663909912109375, 11.296257019042969, 31.872711181640625, 15.450080871582031, 9.327896118164062, 103.14059448242188, -1.2041015625, 18.676483154296875, 31.225006103515625, 95.84207153320312, 122.76495361328125, 109.95242309570312, 39.212554931640625, 60.568145751953125, 29.96575927734375, 112.49807739257812, 16.284088134765625, 5.64874267578125, 48.61235046386719, 127.9547119140625, 5.95550537109375, 151.125732421875, -14.601146697998047, 1.441650390625, 8.75555419921875, 15.018203735351562, 142.85198974609375, 158.36053466796875, 170.576171875, -75.3260498046875, 165.415283203125, 24.13555908203125, -133.4314422607422, 136.0016632080078, 108.25250244140625, 46.41352844238281, 55.61920166015625, -125.3419189453125, 57.32305908203125, 57.998626708984375, 20.264907836914062, -5.760637283325195, 
148.20054626464844, -16.62884521484375, -8.1905517578125, 138.88592529296875, 112.42535400390625, 43.7086181640625, -146.5286865234375, -78.80607604980469, 36.10302734375, 114.30694580078125, -49.00945281982422, 144.9212646484375, -2.4760894775390625, -40.6346435546875, 15.5716552734375, 154.61669921875, 126.76339721679688, -142.291015625, 159.19390869140625, 232.8436279296875, 169.2869873046875, 40.709075927734375, 187.27398681640625, 150.923828125, 4.390708923339844, 54.520355224609375, 153.739013671875, -1.568136215209961, 66.9737548828125, -22.86346435546875, 133.19094848632812, 104.43447875976562, 126.02144622802734, 12.364967346191406, 4.6202392578125, -1.2772941589355469, 91.63880920410156, 43.5291748046875, 40.228271484375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000372.npy"}
|
||||
{"epoch": 0.7790575916230367, "step": 373, "batch_size": 128, "mean": 55.363372802734375, "std": 79.55511474609375, "min": -185.323974609375, "p10": -31.229144668579103, "median": 42.290283203125, "p90": 157.52300262451172, "max": 182.729248046875, "pos_frac": 0.7890625, "sample": [10.40069580078125, 57.974708557128906, 147.89276123046875, 31.079544067382812, 0.6636886596679688, -13.57965087890625, 20.878753662109375, 170.26312255859375, 18.5767822265625, 155.20419311523438, 134.23147583007812, -14.9500732421875, 17.192733764648438, 145.39112854003906, 139.87484741210938, 67.4005126953125, 1.0569000244140625, 54.824737548828125, -28.549652099609375, 105.099853515625, 62.153717041015625, 1.1402816772460938, 87.84880065917969, 16.474853515625, -39.9576416015625, 45.75506591796875, -7.8508148193359375, -33.34455871582031, 34.819854736328125, 175.855712890625, 3.5113754272460938, -23.293350219726562, 25.8741455078125, 55.85986328125, 56.13360595703125, 110.166748046875, 139.06204223632812, -16.5582275390625, 141.4754638671875, 36.76226806640625, 43.79473876953125, 129.0113525390625, 157.3187255859375, 60.22638702392578, 20.1219482421875, 154.51113891601562, 33.34440612792969, -68.04566955566406, 45.13665771484375, 164.57269287109375, -30.25333023071289, 138.36325073242188, -81.46517181396484, -39.34710693359375, 95.88641357421875, -105.01904296875, 176.45704650878906, -31.199798583984375, 22.434356689453125, 51.67097473144531, 7.193544387817383, -185.323974609375, 177.118408203125, 32.21824645996094, 143.685302734375, 14.808944702148438, -101.36531066894531, 149.91998291015625, 17.871597290039062, 117.04220581054688, 148.28506469726562, 145.7319793701172, 11.786888122558594, 67.38551330566406, 22.090911865234375, 61.04087829589844, 121.718017578125, 4.2567138671875, 64.67385864257812, 6.81451416015625, -134.548828125, -15.995780944824219, 172.86770629882812, 40.15167999267578, 40.78582763671875, 151.48904418945312, 142.21560668945312, 171.18902587890625, 
182.729248046875, 92.54734802246094, 4.2879638671875, 157.99964904785156, -2.684326171875, 19.598098754882812, 149.2743377685547, 150.4952392578125, 19.094024658203125, 171.5340576171875, -112.443603515625, -121.04344177246094, 60.543792724609375, 139.67051696777344, 179.97317504882812, 140.718994140625, 62.20616149902344, 15.063232421875, 116.03684997558594, -14.00347900390625, 181.57778930664062, 16.879791259765625, 12.110111236572266, 14.640541076660156, 23.252105712890625, 73.00834655761719, -31.297618865966797, -3.955047607421875, 127.46235656738281, 5.679412841796875, -39.1983642578125, 88.85063171386719, -3.1610984802246094, 147.07098388671875, -17.34848403930664, 11.54217529296875, 150.21737670898438, 168.71099853515625, 112.44684600830078, 10.981002807617188], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000373.npy"}
|
||||
{"epoch": 0.7811518324607329, "step": 374, "batch_size": 128, "mean": 38.08228302001953, "std": 79.67975616455078, "min": -166.61761474609375, "p10": -59.363201904296865, "median": 22.10540771484375, "p90": 146.7590148925781, "max": 213.41079711914062, "pos_frac": 0.71875, "sample": [193.2657470703125, 0.0, 28.59393310546875, 196.170166015625, 5.77301025390625, -122.72494506835938, -12.330879211425781, 26.306793212890625, 141.41246032714844, 90.78641510009766, 15.465553283691406, 187.61972045898438, 16.509613037109375, -108.72288513183594, 12.744430541992188, 35.16748046875, 110.3440933227539, 115.63787078857422, 17.38604736328125, 13.179290771484375, -115.87496948242188, -55.05352783203125, 95.09070587158203, -99.75851440429688, 144.2642364501953, 28.11377716064453, 14.51226806640625, 11.183624267578125, 5.397552490234375, -23.229652404785156, 0.68206787109375, 9.243606567382812, -23.994903564453125, 67.678955078125, -9.9522705078125, 27.89165496826172, 21.42303466796875, -17.593353271484375, 163.99993896484375, 126.88958740234375, 42.733642578125, 9.63690185546875, 213.41079711914062, -2.2245712280273438, 64.4794921875, 3.149505615234375, 8.351150512695312, 72.28726196289062, -166.61761474609375, -15.796867370605469, 48.24528503417969, -45.862396240234375, 151.98440551757812, 90.32779693603516, 25.707122802734375, -4.884613037109375, -9.72509765625, 63.004150390625, 71.86819458007812, -0.2146148681640625, 27.00865936279297, 168.20697021484375, 170.86517333984375, -32.23771667480469, -1.877899169921875, 60.626441955566406, 54.06671142578125, 58.53228759765625, -31.91217041015625, 123.94776916503906, 17.75689697265625, 55.4449462890625, 119.35674285888672, 103.79330444335938, -151.004638671875, 115.31932067871094, -56.07684326171875, 10.14019775390625, 18.618682861328125, 131.0935516357422, 114.56922149658203, 11.844024658203125, 57.659912109375, 135.44973754882812, 0.0, 134.1658477783203, 13.689273834228516, -67.0313720703125, -35.75244140625, -82.26641845703125, 
0.10071563720703125, 109.189453125, 141.8004150390625, 153.9923553466797, 22.78778076171875, -25.38433837890625, 181.10202026367188, 18.30902099609375, 165.95101928710938, 55.06268310546875, 19.08953857421875, -97.63069152832031, -78.06144714355469, 29.620513916015625, 15.597639083862305, 5.885581970214844, 30.14349365234375, 167.58154296875, -21.26531219482422, -91.26017761230469, 1.5443115234375, 83.70651245117188, 24.854705810546875, -0.7233734130859375, 144.51956176757812, 0.1900634765625, 120.94677734375, 60.01788330078125, -90.52041625976562, 97.26368713378906, 196.89730834960938, 129.8546142578125, 135.0369110107422, 36.24836730957031, 32.905914306640625, -3.0139503479003906, 10.575088500976562, -105.73727416992188], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000374.npy"}
|
||||
{"epoch": 0.7832460732984293, "step": 375, "batch_size": 128, "mean": 46.60120391845703, "std": 78.702392578125, "min": -166.42471313476562, "p10": -31.145823669433593, "median": 22.299583435058594, "p90": 152.0050079345703, "max": 233.58056640625, "pos_frac": 0.7578125, "sample": [31.536056518554688, 58.80475616455078, 81.3187255859375, 22.26263427734375, 40.13818359375, 3.457538604736328, 25.882598876953125, -104.27122497558594, 6.508079528808594, 35.094512939453125, 164.4776611328125, 0.6296787261962891, 145.69918823242188, 147.39785766601562, 74.00376892089844, -8.82330322265625, 53.32582092285156, 88.38175964355469, 7.596435546875, -2.3248062133789062, 142.22027587890625, 152.89895629882812, 12.977615356445312, 12.1656494140625, 59.48747253417969, 139.9879913330078, -19.130687713623047, 9.454475402832031, -52.49610900878906, 98.22444915771484, 10.443397521972656, 20.086788177490234, -4.448417663574219, -99.34210205078125, 16.40485382080078, 146.88543701171875, 102.04058837890625, 150.85418701171875, 21.389488220214844, -19.781646728515625, 36.186676025390625, 11.135955810546875, -109.96173095703125, -0.53924560546875, -166.42471313476562, 89.56156921386719, 130.85198974609375, 130.9730224609375, 14.453422546386719, -122.83578491210938, -111.3271484375, -18.413818359375, 58.638671875, 5.104183197021484, 20.685546875, -3.5651702880859375, 120.44607543945312, -10.614639282226562, -31.59527587890625, -109.26681518554688, 157.61273193359375, 148.77444458007812, -8.1949462890625, 194.53643798828125, 52.136356353759766, 21.4913330078125, -58.85121154785156, 16.083412170410156, 233.58056640625, -6.86192512512207, 67.61245727539062, 12.62590217590332, 22.336532592773438, 73.18964385986328, 32.999725341796875, 165.9810791015625, 99.20394897460938, 5.14251708984375, 57.6490478515625, 16.366928100585938, 120.88043212890625, 95.59751892089844, -16.78369140625, -30.953201293945312, 1.269195556640625, 105.30331420898438, 22.004486083984375, 86.23101806640625, 
108.799072265625, 137.66976928710938, -128.42050170898438, 156.1930694580078, 186.55474853515625, -14.286178588867188, 188.8372802734375, 201.96694946289062, -44.725982666015625, 144.2809295654297, 81.3846435546875, 155.4757080078125, -17.89712905883789, -70.63597869873047, 151.62188720703125, -1.745880126953125, 130.3828125, 3.4206714630126953, 108.86093139648438, 153.6346435546875, 18.63165283203125, 5.48919677734375, -2.5977630615234375, 161.58352661132812, 22.829864501953125, 34.41693115234375, 142.77590942382812, -27.26641082763672, 3.7728652954101562, 147.03868103027344, 1.7013702392578125, 18.65692138671875, 10.45361328125, 14.562248229980469, 56.35125732421875, 94.11911010742188, 10.421478271484375, 125.93815612792969, 7.715179443359375, 95.11349487304688], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000375.npy"}
|
||||
{"epoch": 0.7853403141361257, "step": 376, "batch_size": 128, "mean": 44.631004333496094, "std": 80.007568359375, "min": -135.41448974609375, "p10": -56.69985046386717, "median": 37.85154724121094, "p90": 149.43751525878906, "max": 228.22235107421875, "pos_frac": 0.703125, "sample": [70.78526306152344, 16.795379638671875, 154.21005249023438, 228.22235107421875, 106.15328979492188, 207.54833984375, 79.46002197265625, 80.26077270507812, 49.48896789550781, -105.053466796875, 78.7965087890625, -23.743759155273438, -124.22274780273438, 28.44964599609375, -18.8551025390625, 72.88107299804688, 1.2704238891601562, 41.555389404296875, 112.49462890625, 152.901123046875, 93.16262817382812, -135.41448974609375, 1.3617782592773438, 12.91168212890625, 58.59906005859375, 29.320823669433594, -23.891357421875, 164.03411865234375, -3.1030731201171875, 131.307373046875, 128.56790161132812, -2.5325469970703125, 147.73403930664062, 47.692291259765625, -5.205535888671875, -20.820556640625, 149.06576538085938, 75.98766326904297, 150.304931640625, 11.566513061523438, 0.0, 18.999114990234375, 18.430099487304688, 105.69107818603516, -52.019012451171875, 174.21206665039062, -75.45953369140625, 117.45538330078125, 77.29693603515625, 117.0238037109375, -112.33187866210938, 82.55126953125, 42.913543701171875, 9.5394287109375, 137.46353149414062, 5.366485595703125, 27.192169189453125, 85.36373901367188, -65.5079345703125, -25.85107421875, 93.78630065917969, -1.768157958984375, 193.09616088867188, 183.51922607421875, 33.625030517578125, 169.56207275390625, 143.66195678710938, 66.13327026367188, -28.3018798828125, -27.034027099609375, 79.78009033203125, -52.924957275390625, 26.81732177734375, -30.84081268310547, -3.1064910888671875, 18.2806396484375, -110.72505187988281, -0.8323593139648438, -81.67161560058594, 153.70977783203125, 98.60037231445312, 56.303192138671875, 147.2769775390625, 86.98016357421875, 138.84771728515625, 106.63735961914062, 7.037885665893555, -125.78741455078125, 
11.042816162109375, -84.20915222167969, 54.464874267578125, 2.911724090576172, 129.384521484375, -20.42940902709961, 9.033834457397461, 119.51761627197266, 71.7136001586914, -10.39599609375, -99.1168212890625, 123.9586181640625, 4.6204376220703125, 138.31414794921875, -87.44466400146484, 124.82550811767578, 14.274169921875, 12.066841125488281, 89.35852813720703, 28.82135009765625, 51.60136413574219, 129.158935546875, -16.6622314453125, -42.22869873046875, -14.72568130493164, 37.411224365234375, 50.87469482421875, -6.886104583740234, 35.445404052734375, -34.924537658691406, 55.14111328125, -1.5500869750976562, 4.990161895751953, 72.90644836425781, 101.17262268066406, 38.2918701171875, 156.38787841796875, -111.00154113769531, 177.64505004882812, 147.96875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000376.npy"}
|
||||
{"epoch": 0.787434554973822, "step": 377, "batch_size": 128, "mean": 42.707916259765625, "std": 73.84729766845703, "min": -137.864501953125, "p10": -37.17201538085938, "median": 27.222488403320312, "p90": 152.61626434326172, "max": 198.31088256835938, "pos_frac": 0.7109375, "sample": [165.6036376953125, 106.03273010253906, 108.80952453613281, -13.926002502441406, 66.771728515625, 19.889617919921875, 20.30682373046875, 138.89996337890625, 126.42796325683594, 14.564666748046875, 33.387290954589844, 13.500732421875, 117.22296142578125, -58.050811767578125, 96.2347412109375, 105.12742614746094, 152.66876220703125, -33.25927734375, 0.5040092468261719, -42.069610595703125, 124.96307373046875, 1.3893585205078125, 10.095169067382812, 32.49175262451172, -12.749723434448242, -16.649803161621094, 59.831878662109375, -9.89593505859375, 152.59376525878906, 24.931137084960938, 86.74613952636719, 1.31903076171875, 72.72698974609375, -120.9798583984375, -6.7929229736328125, 135.42840576171875, 6.060947418212891, -4.9305419921875, -17.434539794921875, 38.385223388671875, 14.98193359375, 86.60572814941406, 12.679573059082031, -61.61248779296875, 112.43052673339844, 9.05816650390625, 175.032958984375, 70.19677734375, 8.476470947265625, -89.00244140625, 9.7376708984375, 182.33575439453125, -17.18335723876953, -0.891571044921875, 161.987548828125, 5.663568496704102, -46.88175964355469, -5.746284484863281, 123.18966674804688, 132.14602661132812, 142.60906982421875, 25.7698974609375, 74.50732421875, 62.196014404296875, -37.161468505859375, 133.63946533203125, 61.17692565917969, 0.0, -37.196624755859375, 42.54179382324219, 124.33187866210938, -10.3809814453125, 107.23204040527344, 124.29978942871094, -1.330291748046875, 30.04534912109375, 8.478862762451172, -34.766937255859375, -0.40439414978027344, 107.48251342773438, 15.2674560546875, 192.73977661132812, 10.3751220703125, -30.295364379882812, -16.286773681640625, -90.11528778076172, 35.84771728515625, 30.214500427246094, 
144.83877563476562, 36.246063232421875, 58.888214111328125, 27.15386962890625, 8.145530700683594, 102.25360107421875, 1.431640625, 161.29818725585938, 11.861686706542969, 24.145416259765625, 42.20654296875, 119.75119018554688, 27.291107177734375, -87.85958862304688, -10.105720520019531, 73.87664794921875, 167.88262939453125, -48.30085754394531, 181.14077758789062, 33.525482177734375, 53.87959289550781, 132.87033081054688, 30.130043029785156, 10.502151489257812, 28.955368041992188, 10.09375, 165.186767578125, 118.20962524414062, 154.0518035888672, -119.55413818359375, -12.587860107421875, 198.31088256835938, 30.005760192871094, -36.299072265625, -82.02153015136719, 166.58380126953125, -11.186996459960938, -8.8330078125, -137.864501953125, 50.311187744140625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000377.npy"}
|
||||
{"epoch": 0.7895287958115184, "step": 378, "batch_size": 128, "mean": 41.529903411865234, "std": 74.71979522705078, "min": -145.6695556640625, "p10": -38.39461517333984, "median": 25.232940673828125, "p90": 150.6905731201172, "max": 186.4093017578125, "pos_frac": 0.6875, "sample": [99.04537963867188, 4.417938232421875, 27.502166748046875, 113.78536987304688, -24.06939697265625, 6.9466094970703125, -30.33349609375, 7.9637451171875, 157.15777587890625, 32.2940673828125, -16.002273559570312, 2.3464508056640625, 25.13763427734375, 118.11907958984375, 117.89459228515625, -44.66246032714844, -145.6695556640625, 161.08721923828125, 172.87448120117188, -19.93365478515625, 43.4898681640625, 20.090301513671875, 9.486923217773438, -40.87419128417969, 77.81439208984375, 164.0699920654297, 179.08718872070312, 129.58157348632812, 24.011627197265625, 92.79046630859375, -8.47686767578125, 124.43890380859375, 94.52757263183594, 22.020828247070312, 9.513671875, -115.30733489990234, 1.9233951568603516, 3.1606388092041016, 55.858306884765625, 88.84951782226562, 0.0, 142.30841064453125, -4.8341064453125, 118.79434204101562, 165.55831909179688, -46.60638427734375, -10.988468170166016, 186.4093017578125, -20.93470001220703, 162.87652587890625, 89.25482177734375, 101.9111328125, -5.678802490234375, 5.0069580078125, 7.288055419921875, -4.1617889404296875, -24.383056640625, 123.31967163085938, -23.5853271484375, 121.59304809570312, 46.98370361328125, -2.8404598236083984, 120.57255554199219, 126.95468139648438, 36.707061767578125, 85.83438110351562, -26.92938232421875, -57.261322021484375, 150.12496948242188, 71.3489761352539, 92.90786743164062, 25.365280151367188, 0.21392822265625, -27.312591552734375, -101.35475158691406, 148.37991333007812, -48.83140563964844, -63.133453369140625, 49.224334716796875, -1.7523384094238281, 42.08660888671875, 7.0694732666015625, 78.93487548828125, 92.22111511230469, 10.32269287109375, 89.988037109375, 161.3304443359375, 7.1504058837890625, 16.678955078125, 
39.236114501953125, 134.1527557373047, 139.76058959960938, -21.4755859375, -4.282379150390625, 34.185577392578125, -34.42876434326172, -73.40557861328125, 14.778457641601562, 12.442138671875, -17.294815063476562, -96.73590087890625, 158.51251220703125, -2.60015869140625, 71.79728698730469, -126.83494567871094, -3.1606292724609375, 25.3282470703125, 105.96991729736328, -12.020751953125, 26.534576416015625, 52.086212158203125, 40.14178466796875, 2.4758682250976562, -12.784713745117188, 179.126220703125, 117.85821533203125, 152.01031494140625, 135.26377868652344, 159.9486083984375, 16.990402221679688, -131.36801147460938, 99.02130889892578, 121.53701782226562, 49.609130859375, 28.398483276367188, -37.331939697265625, 17.5169677734375, -1.221710205078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000378.npy"}
|
||||
{"epoch": 0.7916230366492146, "step": 379, "batch_size": 128, "mean": 44.91410827636719, "std": 79.69709777832031, "min": -159.11297607421875, "p10": -48.77879257202147, "median": 24.351741790771484, "p90": 151.8182357788086, "max": 192.70501708984375, "pos_frac": 0.7265625, "sample": [-11.091964721679688, 8.857704162597656, 4.0692596435546875, 7.5751953125, -3.4106521606445312, -10.034149169921875, -0.34328460693359375, 134.25372314453125, 56.494140625, 13.39996337890625, 81.50033569335938, 126.31381225585938, 44.0531005859375, 31.241363525390625, 14.1820068359375, 9.037857055664062, 86.731201171875, -99.58920288085938, 160.45745849609375, 10.267440795898438, 91.59197998046875, 62.30865478515625, 139.87831115722656, -36.04315185546875, 22.276199340820312, -19.411346435546875, 139.2638397216797, 0.189178466796875, 53.25958251953125, -3.7864341735839844, 33.647918701171875, 152.0828399658203, 156.57720947265625, 54.855194091796875, 57.09405517578125, 3.69921875, 22.957839965820312, 164.43017578125, 91.25405883789062, 6.073585510253906, -6.481719970703125, -55.79534912109375, 29.677520751953125, 19.66602325439453, 3.29876708984375, 181.46466064453125, 12.642127990722656, -58.64178466796875, -33.7220458984375, 8.021270751953125, 121.01425170898438, -11.12486743927002, 38.160552978515625, -131.49884033203125, 7.504425048828125, 11.58197021484375, 147.01718139648438, 131.70938110351562, 115.6829833984375, 100.68789672851562, 64.99310302734375, -55.15155792236328, 94.06298828125, 22.0352783203125, 121.69650268554688, 192.70501708984375, 24.99139404296875, 154.59173583984375, 14.053497314453125, -21.435272216796875, -41.176300048828125, 0.0, -4.2181396484375, 106.16505432128906, 17.27081298828125, 160.45758056640625, 49.924041748046875, 137.92086791992188, 152.85650634765625, 140.3245391845703, 5.619873046875, 112.07251739501953, 21.086898803710938, 23.71208953857422, 134.2010040283203, 135.53280639648438, -37.628746032714844, 109.93850708007812, -9.566482543945312, 
101.40548706054688, 96.3961181640625, -27.6142578125, 163.58856201171875, 2.7781810760498047, 103.72442626953125, -11.78997802734375, 2.1907901763916016, 111.28494262695312, -104.6114501953125, 183.965576171875, 173.06301879882812, -159.11297607421875, 16.389892578125, -2.1729278564453125, 149.0518798828125, 76.1187744140625, 58.298614501953125, -46.047607421875, 140.96136474609375, 143.74807739257812, 162.05984497070312, 91.6248779296875, 140.12063598632812, 63.56080627441406, -105.40536499023438, -123.68446350097656, 29.593406677246094, 6.613788604736328, -6.14605712890625, 146.08139038085938, -61.182281494140625, 108.48267364501953, -130.42166137695312, -96.60356140136719, -129.01318359375, 21.99360466003418, -6.0545654296875, 151.704833984375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000379.npy"}
|
||||
{"epoch": 0.793717277486911, "step": 380, "batch_size": 128, "mean": 45.39341735839844, "std": 77.22552490234375, "min": -124.3921127319336, "p10": -44.90382843017578, "median": 31.68951416015625, "p90": 154.71816406249997, "max": 234.42376708984375, "pos_frac": 0.7265625, "sample": [123.34884643554688, 48.179656982421875, 83.7037353515625, 19.871871948242188, 101.53436279296875, 23.0479736328125, -95.03648376464844, 1.068084716796875, 0.8520183563232422, -55.6107177734375, 49.846282958984375, 71.023681640625, 132.46832275390625, -86.61192321777344, 33.283203125, 12.275077819824219, 116.86746215820312, 40.82537841796875, 125.38903045654297, -4.8870697021484375, 0.0, 43.33099365234375, 18.230270385742188, 24.7042236328125, 33.99462890625, 127.73220825195312, 4.35516357421875, 29.9483642578125, -2.707061767578125, 7.9925537109375, 46.199920654296875, 185.46612548828125, 17.770904541015625, 116.20712280273438, -0.445404052734375, -124.3921127319336, 113.1334228515625, -5.694919586181641, 39.218040466308594, -23.299766540527344, 44.471466064453125, 58.84181213378906, 149.70123291015625, -39.72320556640625, -17.448097229003906, 15.530975341796875, 14.163406372070312, 165.84088134765625, 162.6741943359375, -7.668304443359375, -86.26626586914062, 83.16682434082031, 17.304107666015625, 10.6541748046875, 27.88934326171875, -73.52030944824219, -25.20604705810547, 14.7537841796875, 164.48300170898438, 132.601318359375, -48.824462890625, 147.8069610595703, 48.218109130859375, 37.577362060546875, 168.61181640625, 149.6861572265625, -56.72826385498047, -5.12176513671875, 18.35333251953125, 50.25779342651367, 134.32098388671875, 167.78790283203125, 2.444549560546875, 112.85457611083984, 143.25100708007812, -3.7937164306640625, 81.51113891601562, -34.73919677734375, 35.505035400390625, -99.6781005859375, 90.72114562988281, 234.42376708984375, -29.293960571289062, 114.41368103027344, 144.23941040039062, 57.494140625, 135.37496948242188, 152.22833251953125, 142.4857177734375, 
91.52279663085938, 33.483184814453125, 200.9158935546875, -69.48995971679688, 60.26995849609375, 4.032268524169922, 56.965782165527344, 7.46435546875, -39.67193603515625, 162.9267578125, 227.67779541015625, 33.8560791015625, 8.514022827148438, 16.494781494140625, 145.63986206054688, 50.7421875, 14.780662536621094, 25.2091064453125, 40.748016357421875, -30.839614868164062, -43.22355651855469, -69.93658447265625, -84.23211669921875, 10.428924560546875, 16.75934410095215, 25.757095336914062, 106.71200561523438, -28.736785888671875, 33.7843017578125, 69.85850524902344, 210.58685302734375, 30.0958251953125, -15.961181640625, -32.296630859375, 160.52777099609375, -1.5780181884765625, -64.27244567871094, 184.17990112304688, -36.15374755859375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000380.npy"}
|
||||
{"epoch": 0.7958115183246073, "step": 381, "batch_size": 128, "mean": 52.02214050292969, "std": 70.33186340332031, "min": -86.28350830078125, "p10": -31.987272644042967, "median": 44.81388282775879, "p90": 145.49295654296876, "max": 174.2268829345703, "pos_frac": 0.703125, "sample": [99.30593872070312, 173.3133544921875, 146.4200439453125, 43.16620635986328, 155.60231018066406, -33.13990783691406, 152.3885040283203, 145.41217041015625, 101.07118225097656, 5.068576812744141, 161.1353302001953, 20.645614624023438, 12.958877563476562, 132.29983520507812, 94.22811889648438, 136.95388793945312, 0.0, 11.132781982421875, 89.93646240234375, 116.99795532226562, 119.28681945800781, 61.4063720703125, 12.197677612304688, 174.2268829345703, 50.144500732421875, 79.61554718017578, 81.16046905517578, -77.21101379394531, 0.0, -21.010528564453125, 82.50787353515625, 92.67440032958984, -9.97576904296875, 133.86387634277344, -8.392822265625, 79.33743286132812, 13.925445556640625, 92.01095581054688, 10.422454833984375, 100.45440673828125, 172.36669921875, 111.23771667480469, -17.17901611328125, 115.49714660644531, 116.450439453125, 6.23541259765625, 145.68145751953125, -15.088882446289062, -37.47239685058594, 119.92326354980469, 135.71429443359375, 145.2421112060547, 40.96528625488281, 0.6989898681640625, 137.082275390625, 144.30413818359375, 7.7628173828125, 103.10308837890625, 153.81341552734375, 57.877586364746094, -34.21644592285156, -6.418701171875, 51.28395080566406, 13.202911376953125, -5.515655517578125, 46.4615592956543, 140.308349609375, -4.466339111328125, -12.603851318359375, -72.7186279296875, -9.833526611328125, -0.8689517974853516, -5.280517578125, 135.62991333007812, -16.58319091796875, 52.79351806640625, -15.3948974609375, 70.00323486328125, 68.04704284667969, -83.785400390625, 147.80572509765625, -29.473342895507812, 130.20138549804688, 42.9688720703125, 163.218994140625, -1.6661376953125, -22.8831787109375, -7.215259552001953, 144.6398468017578, 11.397247314453125, 
-0.0948486328125, -51.327850341796875, 125.003662109375, 110.39640808105469, 81.33743286132812, -50.630462646484375, 4.214076995849609, 4.324920654296875, 35.03814697265625, -83.5672836303711, 77.22294616699219, 9.175338745117188, -86.28350830078125, 89.25359344482422, -82.14312744140625, 101.30998229980469, -31.4932861328125, 8.698911666870117, -14.6279296875, 107.46539306640625, -6.3095703125, 163.67445373535156, -40.975616455078125, 172.88372802734375, 4.38299560546875, 52.329673767089844, 28.42841339111328, 9.327705383300781, 110.94915771484375, 21.555419921875, 137.93032836914062, 140.365966796875, 31.554161071777344, -24.89752197265625, -43.28509521484375, 95.30216979980469, 34.86517333984375, 52.68359375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000381.npy"}
|
||||
{"epoch": 0.7979057591623037, "step": 382, "batch_size": 128, "mean": 62.767913818359375, "std": 78.2535171508789, "min": -115.98692321777344, "p10": -37.6986099243164, "median": 68.75343322753906, "p90": 153.93741455078126, "max": 197.478515625, "pos_frac": 0.7734375, "sample": [-109.41326904296875, 25.691360473632812, 72.04690551757812, 48.66937255859375, 119.3875732421875, 139.3701171875, 143.51727294921875, 19.0345458984375, 52.11548614501953, 101.63214111328125, 113.6583251953125, -102.51263427734375, 104.21821594238281, 15.751686096191406, 164.3433837890625, 109.70172119140625, 28.75962257385254, 63.04103088378906, 109.91552734375, -11.368721008300781, 48.3778076171875, -115.98692321777344, 143.4473114013672, 6.2349853515625, 9.662155151367188, 167.48727416992188, 48.72306823730469, -26.84337615966797, 127.06063842773438, 96.76175689697266, -9.2918701171875, 123.13772583007812, 83.64801025390625, 34.777259826660156, 135.15869140625, 89.26280975341797, -6.518058776855469, 71.45417022705078, -8.46602725982666, 153.29458618164062, 40.486839294433594, 9.475830078125, 150.45547485351562, 30.357177734375, 146.51478576660156, 152.13140869140625, 162.0476531982422, 46.11651611328125, 48.162139892578125, 8.470046997070312, 158.49826049804688, 123.3883056640625, 15.922691345214844, 96.11532592773438, -14.257568359375, 100.69284057617188, 197.478515625, 178.51611328125, -72.81561279296875, -16.117919921875, -37.11296081542969, -112.19953155517578, 194.43252563476562, 46.655731201171875, 3.63623046875, 82.54234313964844, 166.2999267578125, 62.366119384765625, 167.2110595703125, 27.579986572265625, 172.41836547851562, 23.112884521484375, 61.66850280761719, -2.141571044921875, 150.27383422851562, 69.4658203125, 109.57354736328125, 20.123458862304688, 103.83067321777344, 144.7689208984375, 54.08599853515625, -29.264678955078125, 68.04104614257812, 120.27902221679688, -106.86227416992188, -0.763519287109375, -86.17515563964844, 147.17889404296875, -25.426849365234375, 
149.65380859375, 155.43734741210938, -20.77374267578125, 10.938156127929688, 148.5392303466797, 149.46905517578125, 3.022918701171875, 95.74260711669922, -96.20475769042969, 6.0377960205078125, -13.796844482421875, 26.709884643554688, 67.71453857421875, 106.50480651855469, 130.0312042236328, 103.08990478515625, -64.33450317382812, 37.990997314453125, 100.05435180664062, -90.60916137695312, 119.51216125488281, 132.29052734375, -6.116325378417969, 126.57174682617188, 81.67617797851562, 56.18288040161133, 111.16915893554688, -68.30584716796875, 83.07034301757812, -39.06512451171875, 108.03556823730469, 152.76931762695312, -18.552600860595703, 181.9365234375, 104.16264343261719, 160.69891357421875, 138.008056640625, 139.98670959472656, -71.13427734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000382.npy"}
|
||||
{"epoch": 0.8, "step": 383, "batch_size": 128, "mean": 46.260807037353516, "std": 74.56214904785156, "min": -148.35801696777344, "p10": -42.76066284179687, "median": 28.158496856689453, "p90": 141.6534484863281, "max": 193.96417236328125, "pos_frac": 0.734375, "sample": [-2.7591400146484375, 39.67431640625, 129.1063690185547, 82.24114990234375, -39.18041229248047, 9.655181884765625, -53.85577392578125, 21.890960693359375, -1.763580322265625, 18.532424926757812, 118.57587432861328, -64.051025390625, 126.069580078125, 29.04925537109375, 0.0, -12.181877136230469, 10.388717651367188, -57.610321044921875, 53.188323974609375, -73.30874633789062, 18.7371826171875, 159.41357421875, 134.74720764160156, 88.98175048828125, -96.93658447265625, 161.4353790283203, -81.19131469726562, -4.993305206298828, -45.925018310546875, 9.839813232421875, 75.23020935058594, 10.28125, -29.87738800048828, -27.49163055419922, 134.94921875, 21.4658203125, 169.341552734375, 138.4691162109375, 32.52626037597656, 114.77296447753906, 82.05211639404297, 7.5739898681640625, 108.0369873046875, 192.7022705078125, -148.35801696777344, -2.74169921875, 62.72642517089844, 144.6790771484375, 62.2982177734375, 6.91583251953125, 146.070556640625, 0.942779541015625, 3.243806838989258, 131.69180297851562, 86.06289672851562, 20.708831787109375, -20.800067901611328, -74.09102630615234, 0.48148536682128906, -99.40591430664062, 118.73651123046875, 139.9141845703125, 12.046241760253906, 152.17013549804688, -25.24102783203125, 23.634490966796875, 148.642578125, 28.330604553222656, 9.635284423828125, -20.813812255859375, 78.43973541259766, 148.3870391845703, 111.36383056640625, 122.36102294921875, 91.791259765625, 46.230960845947266, -3.87786865234375, 12.671417236328125, 57.26129150390625, 128.8738250732422, 3.4055328369140625, 105.91899871826172, 105.56980895996094, 53.27410888671875, 140.35675048828125, 26.565155029296875, 27.98638916015625, 21.855205535888672, 83.41065979003906, -134.37228393554688, 
134.51626586914062, 14.103286743164062, 56.069732666015625, 77.00543975830078, 19.929290771484375, 5.243450164794922, -37.79203796386719, 171.58062744140625, 119.39981842041016, 113.71060943603516, 129.4876708984375, -29.0078125, 193.96417236328125, -76.89274597167969, -81.658447265625, 26.896080017089844, 82.651123046875, 17.758636474609375, 0.0, 114.29745483398438, 129.51861572265625, -3.3638229370117188, 60.8602294921875, 94.9261474609375, 76.19345092773438, -22.376449584960938, -41.404510498046875, 15.550806045532227, 149.76254272460938, 40.631195068359375, 10.123580932617188, 126.75609588623047, 177.33395385742188, -34.71106719970703, 3.87628173828125, -3.1676025390625, 117.79135131835938, 129.06423950195312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000383.npy"}
|
||||
{"epoch": 0.8020942408376963, "step": 384, "batch_size": 128, "mean": 32.18366622924805, "std": 80.02406311035156, "min": -154.8431396484375, "p10": -82.99129791259764, "median": 19.18646240234375, "p90": 142.17059326171875, "max": 197.89117431640625, "pos_frac": 0.6484375, "sample": [175.89968872070312, -154.8431396484375, 87.83880615234375, -20.27557373046875, -16.88323974609375, 117.07528686523438, 57.97206115722656, 197.89117431640625, 22.539443969726562, -41.62751770019531, 116.38923645019531, -17.524211883544922, -65.24066162109375, -85.75311279296875, 34.490875244140625, -96.48208618164062, 137.35983276367188, -3.7165756225585938, 165.95968627929688, -22.663482666015625, 6.360559463500977, 161.26431274414062, 118.97810363769531, 121.34284973144531, -2.7594451904296875, 111.622314453125, 22.557662963867188, 112.48939514160156, 17.9329833984375, -44.10125732421875, 3.314727783203125, -49.4635009765625, -23.616943359375, 64.56509399414062, -27.23809814453125, 61.79736328125, 16.919219970703125, 33.828338623046875, 133.7506866455078, -19.187599182128906, -3.3355865478515625, -108.94412231445312, 4.362297058105469, 16.34180450439453, 116.19216918945312, 135.1526336669922, 134.97396850585938, -141.0963897705078, 6.37469482421875, 20.43994140625, 111.01819610595703, 167.99520874023438, -88.80917358398438, -5.2435302734375, -128.1644287109375, 62.42951965332031, 75.9251708984375, 26.43402099609375, -108.97251892089844, -44.929107666015625, 14.247161865234375, -109.61956787109375, 112.70672607421875, 95.3035888671875, 152.72744750976562, 38.2738037109375, 42.1363525390625, 52.733642578125, 85.97549438476562, -5.757671356201172, -8.733989715576172, 74.87750244140625, 144.04171752929688, 14.159103393554688, 100.54342651367188, 141.36868286132812, -16.09783935546875, 16.521469116210938, -14.604351043701172, 88.355712890625, 69.26637268066406, -99.17585754394531, 84.87028503417969, 90.39306640625, 131.25576782226562, 183.77447509765625, 23.6339111328125, 5.7205810546875, 
62.89093017578125, 38.964599609375, 159.754150390625, -8.455436706542969, 132.3265380859375, 65.04586791992188, -4.37518310546875, 4.8722991943359375, 7.6876220703125, 155.29945373535156, 7.380096435546875, -19.422760009765625, -6.8158416748046875, -46.98419189453125, 36.840087890625, -108.81570434570312, -103.91510009765625, 27.345182418823242, 0.2791748046875, 28.390155792236328, 13.701330184936523, -7.202239990234375, 171.719970703125, -117.1199951171875, 55.3511962890625, -3.4000778198242188, -4.182798385620117, -23.2449951171875, 156.30859375, 145.3406982421875, -64.03773498535156, 113.53387451171875, 111.12520599365234, -81.80766296386719, 2.4130706787109375, 17.704269409179688, 37.8427734375, 56.759674072265625, 14.54736328125, -69.94027709960938], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000384.npy"}
{"epoch": 0.8041884816753927, "step": 385, "batch_size": 128, "mean": 45.56441879272461, "std": 75.50624084472656, "min": -144.78070068359375, "p10": -36.87652587890624, "median": 31.96923828125, "p90": 148.84911499023437, "max": 202.24127197265625, "pos_frac": 0.71875, "sample": [-40.9298095703125, 87.89805603027344, 150.720458984375, 126.69171142578125, 26.2259521484375, 25.08966064453125, 139.79425048828125, 25.5399169921875, 144.08612060546875, 126.55354309082031, 29.943084716796875, -0.0611419677734375, -54.92052459716797, 19.686126708984375, 25.447463989257812, 42.09220886230469, 31.34075927734375, 109.06853485107422, -52.330589294433594, -17.219863891601562, 150.5069580078125, -64.04391479492188, 119.67013549804688, -108.98716735839844, 9.275146484375, 146.3584747314453, 112.01129913330078, 10.524055480957031, 112.2298583984375, 23.193084716796875, 5.127655029296875, 184.87551879882812, 118.68708038330078, -23.0242919921875, 148.13861083984375, 202.24127197265625, 131.6775665283203, 96.15890502929688, 70.42382049560547, 197.2605743408203, -26.1641845703125, -32.36640930175781, 51.88409423828125, 98.80262756347656, 43.397216796875, -108.36360168457031, 122.64378356933594, 117.31806945800781, -105.9459228515625, 32.59771728515625, 189.78579711914062, 158.83969116210938, 199.89666748046875, -8.550033569335938, 112.405029296875, 19.4615478515625, 118.63180541992188, 1.3552570343017578, -144.78070068359375, 52.62281036376953, 15.94134521484375, 168.8568115234375, -21.016754150390625, 144.65341186523438, 2.64111328125, 2.73358154296875, -27.500274658203125, 115.46060180664062, -5.297515869140625, 62.23175048828125, 67.18122863769531, 4.014472961425781, -28.043701171875, 9.583717346191406, 19.698226928710938, -23.435943603515625, 80.83955383300781, 125.62435913085938, 107.42770385742188, -80.95150756835938, 13.561309814453125, 101.5616683959961, 128.30929565429688, 80.455322265625, 116.33453369140625, 13.01153564453125, -14.958213806152344, 25.861068725585938, 
-142.78155517578125, 3.7914772033691406, 73.58001708984375, -4.862712860107422, -5.2344512939453125, 56.70647430419922, 156.24795532226562, 45.68182373046875, 0.0, 39.502716064453125, 66.72555541992188, 50.858917236328125, 4.466621398925781, 9.90057373046875, 45.974273681640625, -35.139404296875, 167.82122802734375, -8.578407287597656, -60.0386962890625, 101.73367309570312, -19.845062255859375, 173.02151489257812, 59.34521484375, 0.0, -15.72210693359375, -59.34198760986328, 116.0816650390625, 3.118743896484375, 38.036895751953125, -4.532188415527344, -8.9483642578125, -5.32080078125, -61.19569396972656, 7.265625, 45.661041259765625, 36.96673583984375, 64.7001953125, 39.0892333984375, 156.069580078125, 16.1688232421875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000385.npy"}
{"epoch": 0.806282722513089, "step": 386, "batch_size": 128, "mean": 56.98755645751953, "std": 81.13665008544922, "min": -211.0394287109375, "p10": -26.587848663330075, "median": 43.574466705322266, "p90": 158.48978881835936, "max": 213.79696655273438, "pos_frac": 0.7421875, "sample": [96.69027709960938, -2.29779052734375, 3.7882308959960938, 115.88288116455078, 64.27252197265625, -2.45281982421875, 60.38739013671875, 155.09710693359375, 95.18826293945312, 125.11441040039062, 130.22952270507812, 213.79696655273438, -37.619384765625, 33.906005859375, 175.1707763671875, 30.57598876953125, 27.91058349609375, 32.379669189453125, 2.6489810943603516, 103.38201904296875, 89.96212768554688, 128.57025146484375, 22.9532470703125, 56.22821044921875, -8.778564453125, -129.9686279296875, 39.482208251953125, -10.651504516601562, 116.88056945800781, 35.5384521484375, 28.849639892578125, -111.16917419433594, 171.68817138671875, 30.070068359375, -8.211090087890625, 16.4964599609375, 14.922027587890625, -28.423324584960938, 48.192138671875, 130.92718505859375, 211.1951904296875, 160.99789428710938, 105.00108337402344, 43.50959014892578, 103.94869995117188, 119.734375, 172.2904052734375, -73.09524536132812, 107.9776611328125, 184.76104736328125, 63.5294189453125, -20.13726806640625, -42.34173583984375, -36.73693084716797, 177.636962890625, 115.54371643066406, -7.933357238769531, 0.0, -78.859619140625, -25.80121612548828, -116.84713745117188, -88.0091552734375, 68.51123046875, 40.3968505859375, 1.6382713317871094, -9.9691162109375, 56.52655029296875, -2.1534156799316406, 139.73023986816406, -162.7318115234375, 167.88827514648438, -41.3966064453125, 113.40936279296875, 182.77288818359375, 181.27490234375, 142.91903686523438, 7.289531707763672, -5.103851318359375, 149.2150115966797, 130.24794006347656, 110.73422241210938, 10.55255126953125, -211.0394287109375, 41.765380859375, 3.80352783203125, 153.29708862304688, 128.41311645507812, 12.7830810546875, 121.57769775390625, 0.72265625, 
47.719390869140625, 18.705890655517578, -2.58233642578125, 0.0, 175.79684448242188, 98.27665710449219, -10.0660400390625, 5.941932678222656, 77.6866455078125, 114.09524536132812, 133.67153930664062, 153.4326934814453, 62.89013671875, 147.7454071044922, 12.290176391601562, 154.67172241210938, 27.594940185546875, -19.814010620117188, -1.51068115234375, 129.81890869140625, 166.27003479003906, 43.63934326171875, 151.7483367919922, 98.79939270019531, 5.916717529296875, 25.66064453125, 32.62077331542969, -4.289215087890625, 77.7198257446289, 132.62319946289062, 23.83810043334961, 157.41488647460938, 13.184513092041016, -9.42333984375, 146.89828491210938, 0.0, 125.241943359375, 113.12898254394531], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000386.npy"}
{"epoch": 0.8083769633507853, "step": 387, "batch_size": 128, "mean": 52.16790008544922, "std": 79.95233154296875, "min": -172.94937133789062, "p10": -35.299319458007815, "median": 37.435359954833984, "p90": 159.2098602294922, "max": 194.62319946289062, "pos_frac": 0.6875, "sample": [47.292236328125, -52.209747314453125, 109.46795654296875, 32.664119720458984, 5.8045654296875, 158.75540161132812, 177.24647521972656, -49.14208984375, -121.995361328125, -35.139678955078125, 0.0, 163.3526611328125, -9.91802978515625, 181.6444091796875, 126.18466186523438, 62.863983154296875, 12.609573364257812, -12.84393310546875, -19.393798828125, 110.31002807617188, -31.269729614257812, 12.388130187988281, 124.25032043457031, -38.54376220703125, 38.052772521972656, 150.6824188232422, -87.51060485839844, -18.976043701171875, 148.63148498535156, 0.6171016693115234, 139.45999145507812, -48.329559326171875, 141.56124877929688, 37.08106994628906, 132.57992553710938, 130.57870483398438, -8.9996337890625, 76.14097595214844, 27.19043731689453, -9.74835205078125, -35.67181396484375, 122.90567016601562, 28.114715576171875, -19.277618408203125, 160.270263671875, 37.91014099121094, 180.38262939453125, 70.13763427734375, -21.69727897644043, -3.483642578125, -5.927467346191406, 164.30377197265625, -40.162384033203125, -12.642608642578125, 34.211669921875, 30.457977294921875, 8.907562255859375, -101.81623840332031, 128.69793701171875, 109.47819519042969, 84.064208984375, 35.7098388671875, 58.805450439453125, 19.7354736328125, 125.62269592285156, -118.29971313476562, 47.48944091796875, 36.6138916015625, 8.307098388671875, -12.948921203613281, -18.601394653320312, -5.41082763671875, 47.38067626953125, 167.39749145507812, 133.58189392089844, 98.90817260742188, 119.32203674316406, 37.789649963378906, -8.628646850585938, 98.7576904296875, 13.337387084960938, -2.59814453125, 150.90731811523438, 25.68890380859375, 36.76470947265625, 158.74636840820312, 112.22948455810547, 7.2861328125, 91.740966796875, 
127.49109649658203, 1.973297119140625, 150.26571655273438, 59.49183654785156, 21.538070678710938, 185.79412841796875, 84.65200805664062, 127.27206420898438, -30.158721923828125, 135.47634887695312, 126.96943664550781, 10.944520950317383, 54.82403564453125, -28.42913818359375, 130.23046875, -7.457141876220703, -73.92083740234375, -172.94937133789062, -15.808456420898438, 128.06307983398438, 194.62319946289062, 39.355712890625, -15.446563720703125, 107.91573333740234, 95.48098754882812, 177.08563232421875, 130.65713500976562, 109.30877685546875, 7.65802001953125, 186.7935791015625, 12.912322998046875, -24.953277587890625, -117.70486450195312, 170.76705932617188, 173.04696655273438, -3.4948196411132812, 130.12246704101562, -22.217281341552734, 121.12933349609375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000387.npy"}
{"epoch": 0.8104712041884817, "step": 388, "batch_size": 128, "mean": 42.630855560302734, "std": 78.7320327758789, "min": -160.54879760742188, "p10": -35.48488311767578, "median": 24.44596767425537, "p90": 148.6851623535156, "max": 218.1339111328125, "pos_frac": 0.7109375, "sample": [81.0535888671875, -29.484039306640625, 113.60751342773438, 0.0, 218.1339111328125, 135.99903869628906, -6.42138671875, 164.429443359375, 50.51338195800781, 140.26174926757812, -125.33840942382812, 127.51016235351562, -122.32876586914062, 3.0525245666503906, 5.0921783447265625, 24.439605712890625, 51.68645477294922, 108.22000122070312, 182.88333129882812, -20.70751953125, 101.01875305175781, 187.00537109375, -81.16021728515625, -16.33953857421875, -7.34722900390625, -35.92527770996094, 76.94384765625, 101.35165405273438, 149.65005493164062, 154.9940185546875, 23.7059326171875, 14.970008850097656, 0.0, 9.39801025390625, 17.197616577148438, -57.93743896484375, -17.67950439453125, 14.741241455078125, 91.0238037109375, 105.09843444824219, -29.216445922851562, 25.74561309814453, 52.16905212402344, 127.96932983398438, -24.552093505859375, -55.422821044921875, 153.421142578125, -13.73492431640625, -5.564998626708984, 4.9021759033203125, 81.38793182373047, -23.743942260742188, 161.88214111328125, -35.296142578125, 10.248641967773438, 105.65906524658203, -5.7367401123046875, 159.4765625, 62.9365234375, -27.61932373046875, -8.530364990234375, 74.06643676757812, 126.72419738769531, 8.171722412109375, 20.34006690979004, -52.8104248046875, 126.76486206054688, -156.58218383789062, 168.0327606201172, -87.5156478881836, 64.5899658203125, 142.11331176757812, 106.161376953125, 93.33158111572266, 26.819549560546875, -0.186431884765625, 3.8035888671875, 144.974853515625, 8.158744812011719, -122.30751037597656, -13.898773193359375, 163.4273681640625, 145.03216552734375, -19.254776000976562, 18.617267608642578, 7.9141082763671875, 26.441314697265625, 118.95941162109375, 83.03030395507812, 43.881187438964844, 
-106.1063232421875, 33.232994079589844, -0.234893798828125, -160.54879760742188, 16.308792114257812, 164.6038360595703, 140.5142822265625, 129.17794799804688, 24.452329635620117, 127.0550537109375, 5.440803527832031, 106.76235961914062, 148.27163696289062, 175.46514892578125, 7.07781982421875, 11.369499206542969, 28.839752197265625, 46.857879638671875, -28.28985595703125, 1.3756370544433594, 140.00912475585938, 12.279693603515625, -29.995223999023438, 20.450042724609375, 28.453418731689453, 3.6523818969726562, 122.85342407226562, 6.247528076171875, 116.74179077148438, 48.6754150390625, -75.99061584472656, -30.911331176757812, 30.68719482421875, 6.726318359375, 30.1151123046875, 116.93898010253906, 113.3458480834961, 6.351310729980469], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000388.npy"}
{"epoch": 0.812565445026178, "step": 389, "batch_size": 128, "mean": 53.640167236328125, "std": 76.83556365966797, "min": -140.6826171875, "p10": -35.44064559936523, "median": 56.64244842529297, "p90": 145.2145782470703, "max": 177.78494262695312, "pos_frac": 0.765625, "sample": [0.290252685546875, -89.51051330566406, -16.361244201660156, 90.53591918945312, 127.22549438476562, 131.1871337890625, 39.2896728515625, 25.23065185546875, 4.563865661621094, 157.2469482421875, 57.5948486328125, 143.53929138183594, 31.96661376953125, 170.52597045898438, 44.68926239013672, 128.06338500976562, 0.3906707763671875, 131.38453674316406, 81.84017944335938, 172.60543823242188, -72.52001953125, 116.85963439941406, 160.5003662109375, 133.78317260742188, 4.251152038574219, 129.45416259765625, -140.6826171875, 165.18356323242188, 33.674224853515625, 150.4498291015625, 26.0245361328125, -139.13705444335938, 112.44406127929688, -35.35389709472656, -46.24713134765625, -3.529388427734375, -30.230682373046875, 114.90352630615234, 163.69918823242188, 177.78494262695312, 128.68092346191406, 0.0, -88.42967224121094, 79.61190032958984, 111.8446044921875, -80.5631103515625, 149.1235809326172, 124.10232543945312, 134.34959411621094, 23.10308837890625, -12.948585510253906, 131.12217712402344, 43.45501708984375, 22.05739402770996, 110.10877990722656, 68.47967529296875, 6.4805755615234375, 53.589141845703125, -34.100189208984375, -14.108573913574219, 50.88990783691406, 45.787567138671875, 61.53538513183594, 127.42750549316406, -47.45123291015625, 121.82581329345703, 84.82620239257812, 121.52110290527344, 102.78166198730469, 97.61152648925781, 123.88327026367188, -61.015380859375, 51.68988037109375, 142.1331787109375, 2.58184814453125, 161.43017578125, 62.63322448730469, 70.4505615234375, -29.727554321289062, -6.51641845703125, 60.62213134765625, 63.24267578125, 16.219520568847656, 56.992462158203125, 14.33990478515625, 137.572509765625, -126.67388916015625, 106.2626953125, 89.17575073242188, 
4.389129638671875, 109.4329833984375, 39.29901123046875, 0.961029052734375, 3.822284698486328, -6.177528381347656, 105.5696792602539, 6.4964599609375, -139.4141845703125, -25.191253662109375, 149.22671508789062, -18.530120849609375, 132.88397216796875, -13.697998046875, -35.64305877685547, 85.29669189453125, 155.32614135742188, 0.0, 71.54908752441406, 130.4581756591797, 172.27268981933594, 13.5579833984375, 31.094528198242188, -4.548919677734375, 60.41650390625, 70.04347229003906, 133.96583557128906, 122.855712890625, 4.508171081542969, -30.342361450195312, 18.113258361816406, 52.87516784667969, 134.14295959472656, -121.80426025390625, 17.427658081054688, 128.34555053710938, 2.19305419921875, 56.29243469238281, 136.85443115234375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000389.npy"}
{"epoch": 0.8146596858638744, "step": 390, "batch_size": 128, "mean": 49.42596435546875, "std": 84.53832244873047, "min": -151.8553466796875, "p10": -64.26961364746093, "median": 53.12549591064453, "p90": 151.95433502197264, "max": 180.92294311523438, "pos_frac": 0.7265625, "sample": [98.3861083984375, 2.5936431884765625, 44.340850830078125, 17.364898681640625, 57.2073974609375, 126.42340087890625, 2.771219253540039, 130.4330596923828, 111.48583984375, 19.489913940429688, -33.66905212402344, 6.4898529052734375, 7.042388916015625, -30.0755615234375, 31.64019775390625, 96.76144409179688, 3.8610267639160156, 159.79348754882812, 141.25234985351562, 102.90542602539062, 88.35202026367188, -38.006988525390625, 147.5060577392578, 8.726638793945312, 65.51004028320312, 70.77532958984375, -117.1484375, 134.86773681640625, 128.56338500976562, 27.549774169921875, 146.75323486328125, 117.57424926757812, -88.47270202636719, 39.99907684326172, 130.23675537109375, 110.96124267578125, 86.40731048583984, 26.29974365234375, -2.6179351806640625, 65.13322448730469, -151.8553466796875, 119.0650863647461, -63.724151611328125, 114.07650756835938, 73.88211822509766, -87.75869750976562, -22.000648498535156, -148.74984741210938, 122.62405395507812, -34.25396728515625, 49.04359436035156, -13.772964477539062, -114.80401611328125, 31.84759521484375, 175.0274658203125, 16.358598709106445, 21.559051513671875, 148.56369018554688, 111.44584655761719, -130.82827758789062, -93.436279296875, 69.486328125, 150.02439880371094, -21.830780029296875, 170.98834228515625, -128.95217895507812, 180.92294311523438, 81.69039916992188, -16.357696533203125, 120.437744140625, 70.36082458496094, 172.52090454101562, 106.926513671875, 0.0, -17.44696044921875, -4.989654541015625, 136.2652587890625, -24.720809936523438, 130.0387420654297, 82.26744079589844, -65.5423583984375, 146.29421997070312, 169.62921142578125, -146.75823974609375, -16.902618408203125, 142.18893432617188, 180.33929443359375, 137.48631286621094, 
-5.923828125, 8.084651947021484, 0.42664337158203125, 163.20956420898438, 62.7529296875, 88.76821899414062, 70.68434143066406, 108.50196075439453, -11.942970275878906, 156.80987548828125, 31.777969360351562, -61.58721923828125, -1.2249431610107422, -67.06315612792969, 161.71627807617188, 138.5361328125, -30.38043212890625, 35.87359619140625, 168.87506103515625, 4.7474365234375, 1.1947212219238281, 6.260009765625, -56.565765380859375, 4.5714874267578125, -10.246864318847656, 111.80023193359375, 63.66484069824219, 26.631866455078125, 141.547119140625, 138.08224487304688, 31.58807373046875, 149.05279541015625, 125.26182556152344, 31.444610595703125, 6.09326171875, 157.2900390625, 156.45751953125, -117.55609130859375, 89.25546264648438, 75.91008758544922], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000390.npy"}
{"epoch": 0.8167539267015707, "step": 391, "batch_size": 128, "mean": 49.01108932495117, "std": 72.03805541992188, "min": -131.40924072265625, "p10": -25.137362670898433, "median": 34.62139129638672, "p90": 143.29542236328126, "max": 281.4912109375, "pos_frac": 0.75, "sample": [-5.508705139160156, -86.9694595336914, 1.336984634399414, 44.49224853515625, 18.516416549682617, 8.829833984375, -0.9599609375, 7.057731628417969, 24.062454223632812, 96.95561981201172, 1.0997543334960938, 11.641815185546875, -28.181373596191406, 75.05429077148438, 157.870849609375, 86.31781005859375, 80.10857391357422, 79.33041381835938, -21.820480346679688, -19.00566864013672, -17.300674438476562, 143.11984252929688, 69.73880004882812, 44.915283203125, -24.1143798828125, 45.4324951171875, 134.35345458984375, 39.21710205078125, 6.4078216552734375, 35.574859619140625, -3.500091552734375, -45.35003662109375, 131.57550048828125, -9.642669677734375, 75.99237060546875, 106.4820556640625, 104.45283508300781, 10.703193664550781, -12.14398193359375, 106.4881591796875, 10.32320785522461, 120.32027435302734, 132.1846466064453, 193.701416015625, 138.43084716796875, -6.431610107421875, 143.70510864257812, 8.420989990234375, -23.495540618896484, -2.6846923828125, 180.4627685546875, 41.078147888183594, -42.84197235107422, 19.091552734375, 11.3951416015625, 100.29568481445312, 110.06558227539062, 84.65058898925781, 122.55221557617188, -29.06829833984375, 48.331207275390625, -108.51338195800781, 122.72457122802734, 40.360809326171875, 31.689849853515625, 14.46820068359375, 89.27099609375, 7.880817413330078, 60.47248077392578, 4.0231781005859375, 1.26580810546875, 154.21466064453125, 26.26104736328125, 123.58918762207031, 96.1663589477539, -8.06182861328125, 120.0850830078125, 74.21121215820312, -7.6054229736328125, 138.9783935546875, 3.293365478515625, 40.304656982421875, 142.10321044921875, 281.4912109375, 162.241943359375, 5.0830078125, -7.9350738525390625, -17.10809326171875, 137.75103759765625, 
23.83807373046875, 12.8394775390625, -3.5485897064208984, -130.1612548828125, 140.49285888671875, 39.078857421875, 52.32470703125, -32.53083801269531, 110.95158386230469, 161.42886352539062, 158.39776611328125, -36.228271484375, -63.10272216796875, 33.66792297363281, -88.01547241210938, 3.643402099609375, -2.7930870056152344, 24.68608856201172, 91.09886169433594, 109.6683349609375, 149.5447998046875, 18.036651611328125, 114.82746124267578, 96.3552474975586, 12.287734985351562, 70.77540588378906, 70.08197021484375, 159.23416137695312, 124.70730590820312, 190.4095458984375, 9.960479736328125, -27.524322509765625, 8.50238037109375, 29.61968994140625, 6.775001525878906, -6.574775695800781, 64.94070434570312, -131.40924072265625, 145.30715942382812], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000391.npy"}
{"epoch": 0.818848167539267, "step": 392, "batch_size": 128, "mean": 64.38864135742188, "std": 72.10929870605469, "min": -136.8310546875, "p10": -12.487460327148437, "median": 49.52268981933594, "p90": 167.84979553222655, "max": 198.48663330078125, "pos_frac": 0.7890625, "sample": [144.49771118164062, 113.97321319580078, 83.37977600097656, 41.96868896484375, 36.348876953125, 4.521644592285156, 49.468658447265625, 68.3619384765625, 77.2655029296875, 54.6170654296875, 12.879608154296875, 48.852935791015625, 110.87583923339844, 53.0711669921875, -13.106689453125, 9.294052124023438, -136.8310546875, 198.48663330078125, 98.86734008789062, -22.253067016601562, 168.98004150390625, -6.332187652587891, 10.33074951171875, 102.60031127929688, 167.961669921875, 171.55474853515625, 0.0, -41.643768310546875, 145.27691650390625, 121.61697387695312, 8.285720825195312, 28.9324951171875, 172.99179077148438, 41.19195556640625, 67.77855682373047, 176.03271484375, 167.80184936523438, 146.95223999023438, 125.1739501953125, 33.199493408203125, 126.85354614257812, 145.46591186523438, 19.23590850830078, 26.94091796875, -5.42803955078125, 19.41278076171875, -16.845870971679688, 9.574951171875, 98.05232238769531, 143.95217895507812, 42.41021728515625, 154.11708068847656, 178.5679931640625, 104.28938293457031, 167.51290893554688, 120.57493591308594, 12.643157958984375, 74.24615478515625, 165.15951538085938, -5.4510498046875, -10.139884948730469, 102.69245147705078, 11.770660400390625, 166.4927978515625, 13.19400405883789, 59.13312530517578, -26.6962890625, 92.74295043945312, 181.96090698242188, -0.31317138671875, 52.004913330078125, 161.90521240234375, 172.11068725585938, 118.7381820678711, 97.35552978515625, 170.96780395507812, 43.951873779296875, -102.9639892578125, -9.828712463378906, -7.589023590087891, 14.783393859863281, 142.45774841308594, 4.37408447265625, 90.34588623046875, 27.4222412109375, -5.699249267578125, -20.7723388671875, 7.290985107421875, 20.767242431640625, 
-23.451553344726562, -1.86041259765625, 0.0, 124.09864044189453, 24.5174560546875, 138.559814453125, 178.17196655273438, 57.1541748046875, 140.75955200195312, 172.5581817626953, 29.717620849609375, 153.83401489257812, 0.28159523010253906, 10.7020263671875, 2.5320205688476562, 49.57672119140625, 155.76168823242188, -2.5497360229492188, 173.5311279296875, 18.077728271484375, 69.38727569580078, -97.54438781738281, 156.9136962890625, -13.23956298828125, 8.070556640625, 142.65008544921875, 27.6497802734375, 56.74407958984375, -12.222076416015625, -3.96044921875, 61.09736633300781, 35.97544860839844, 127.25302124023438, -24.627548217773438, 139.25112915039062, -22.621856689453125, 24.82324981689453, 138.18667602539062, 33.016387939453125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000392.npy"}
{"epoch": 0.8209424083769633, "step": 393, "batch_size": 128, "mean": 43.60399627685547, "std": 77.99876403808594, "min": -172.6719970703125, "p10": -39.62723083496093, "median": 33.14966583251953, "p90": 137.63373413085938, "max": 245.0736083984375, "pos_frac": 0.71875, "sample": [43.9516487121582, -75.47760009765625, 6.396690368652344, 63.990570068359375, 92.28742218017578, -61.7930908203125, 125.75170135498047, 67.11080932617188, 111.37606811523438, 5.318328857421875, 14.26495361328125, -4.238037109375, 58.58642578125, -33.18213653564453, 76.75157928466797, 30.731048583984375, -109.71221923828125, 114.913818359375, 98.05001068115234, 89.2701416015625, 144.84088134765625, 59.047027587890625, 12.185455322265625, 79.50479125976562, -37.7752685546875, 155.95281982421875, 145.2435760498047, 137.63681030273438, 22.346710205078125, 32.473297119140625, 135.3858642578125, -5.9366455078125, 65.062255859375, 104.88919067382812, 30.360626220703125, 36.396728515625, -2.1512451171875, 130.42050170898438, 20.21923828125, 132.62408447265625, 12.59686279296875, -62.415435791015625, 44.9544677734375, 133.26873779296875, -80.6640625, 143.068359375, -48.552032470703125, -34.1539306640625, -74.42691040039062, 73.43777465820312, 230.2109375, 59.44858169555664, 9.38848876953125, 18.85753631591797, 13.604011535644531, 74.8585205078125, 0.6231861114501953, -122.69068908691406, 100.86785888671875, 59.93968200683594, -158.3497314453125, 17.291290283203125, 24.860031127929688, -18.780059814453125, 20.181640625, 135.47515869140625, 76.27476501464844, 142.10708618164062, -127.49835205078125, -8.417694091796875, -35.523590087890625, 22.767738342285156, 22.061256408691406, 47.01936340332031, 15.48074722290039, 59.43731689453125, 178.474853515625, -27.048995971679688, 48.778045654296875, -7.904487609863281, 9.05447006225586, -5.556427001953125, 124.38308715820312, -143.10047912597656, 24.445281982421875, 99.121337890625, 131.59588623046875, 74.42921447753906, 125.50086975097656, 
137.07186889648438, -24.51080322265625, 137.63241577148438, 122.59487915039062, 145.89532470703125, 142.09815979003906, 33.978179931640625, 31.6072998046875, -39.411285400390625, 127.71437072753906, -172.6719970703125, 10.476974487304688, 33.82603454589844, -8.48089599609375, -2.2550048828125, 245.0736083984375, 128.01095581054688, 69.34402465820312, 0.0, 9.836105346679688, 70.36911010742188, 190.79519653320312, -40.131103515625, 4.6457061767578125, 8.991348266601562, -37.687469482421875, 81.57295227050781, 131.39089965820312, -24.97391128540039, -22.314254760742188, -0.403778076171875, 1.6908683776855469, -25.50146484375, 134.24176025390625, 138.6282958984375, -3.0618057250976562, 77.59210205078125, 103.25390625, 132.52655029296875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000393.npy"}
{"epoch": 0.8230366492146597, "step": 394, "batch_size": 128, "mean": 46.47724914550781, "std": 74.75751495361328, "min": -128.9974822998047, "p10": -34.400666046142575, "median": 26.669395446777344, "p90": 149.06402740478515, "max": 201.83660888671875, "pos_frac": 0.6796875, "sample": [151.01971435546875, 42.47821044921875, 155.54458618164062, 122.5220947265625, 48.81451416015625, 122.81580352783203, -12.9573974609375, -9.302703857421875, 22.632110595703125, 155.3853759765625, 122.9224853515625, -10.920455932617188, 32.26104736328125, 92.38021850585938, -13.07354736328125, -1.8273811340332031, 139.02520751953125, -36.33538055419922, 144.10595703125, -23.836950302124023, 3.70208740234375, 142.47189331054688, 186.76263427734375, -18.542495727539062, 1.6266899108886719, -42.625457763671875, 16.760406494140625, 124.52232360839844, 85.229736328125, 24.7745361328125, -19.605377197265625, 0.0, -44.03295135498047, -72.87992858886719, 137.957275390625, 13.000534057617188, 14.060714721679688, 93.04364776611328, 5.48602294921875, 30.512001037597656, 115.99661254882812, 66.88349914550781, 48.3607177734375, -128.9974822998047, 20.013038635253906, 98.36962890625, 145.9727783203125, 23.41042709350586, 3.878387451171875, 19.735107421875, -15.44219970703125, -39.1871337890625, 52.28570556640625, -11.98504638671875, -20.85498046875, 161.81884765625, 26.593948364257812, 50.701171875, 10.613502502441406, 54.18634033203125, 75.7596435546875, -53.75042724609375, -0.879974365234375, -66.27326965332031, 148.2258758544922, 3.519075393676758, 35.896156311035156, 80.19134521484375, 20.928321838378906, 113.29855346679688, 132.6244354248047, 142.87855529785156, 160.1744384765625, 3.566774368286133, 182.11151123046875, -3.053466796875, -63.548370361328125, -30.28460693359375, -6.1597442626953125, 201.83660888671875, 10.573986053466797, -107.19232177734375, 77.97030639648438, -30.598480224609375, -66.06967163085938, 18.41754150390625, 38.23651123046875, 28.937515258789062, 7.661460876464844, 
90.93478393554688, -1.2922515869140625, -10.025093078613281, 86.58526611328125, 24.037612915039062, -25.27459716796875, -26.0762939453125, 39.324615478515625, 37.9158935546875, 100.0902099609375, 64.18499755859375, -113.31385803222656, 135.52122497558594, 26.744842529296875, -25.939895629882812, 167.66244506835938, 193.263671875, -25.724990844726562, 48.7607421875, 127.55805969238281, 156.8028564453125, 108.46771240234375, 116.3321304321289, 141.4942626953125, 133.4364013671875, 142.52407836914062, -52.494110107421875, 180.395751953125, 17.69769287109375, -18.7908935546875, 153.0362548828125, 18.498870849609375, 141.66567993164062, -19.64892578125, 101.25776672363281, -31.637664794921875, -14.43511962890625, -33.571502685546875, 129.8896484375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000394.npy"}
{"epoch": 0.8251308900523561, "step": 395, "batch_size": 128, "mean": 47.05986022949219, "std": 74.32798767089844, "min": -130.52108764648438, "p10": -28.038117980957026, "median": 27.625572204589844, "p90": 153.2064987182617, "max": 191.704345703125, "pos_frac": 0.765625, "sample": [48.07879638671875, 9.4072265625, -15.129356384277344, -7.0218505859375, 9.86151123046875, -17.69158935546875, 0.66900634765625, 186.4334716796875, 174.66021728515625, 28.411544799804688, 19.46147918701172, 53.8875732421875, 26.839599609375, 10.085357666015625, 0.4056110382080078, 85.87623596191406, 8.086051940917969, -90.03887939453125, 60.839569091796875, -10.22674560546875, 45.368072509765625, 140.84304809570312, -18.908233642578125, -88.15109252929688, 163.22427368164062, 98.2264404296875, 9.1937255859375, 0.7079429626464844, 128.453125, 35.49560546875, 105.65966796875, 117.18478393554688, 99.90602111816406, 94.3695068359375, 45.2916259765625, 150.46261596679688, 80.630615234375, 46.1998291015625, 2.545513153076172, -24.52039337158203, 81.6368408203125, 12.804681777954102, -14.632904052734375, 26.216461181640625, 137.53564453125, 97.32467651367188, 4.40679931640625, -70.73463439941406, 113.9800033569336, 155.28460693359375, 173.15948486328125, -82.20257568359375, 76.16769409179688, 131.28448486328125, 150.46676635742188, 105.5263671875, 36.056488037109375, 152.45858764648438, 13.91265869140625, 84.69984436035156, -26.484207153320312, -79.59646606445312, 18.122116088867188, -39.68186950683594, 138.8385009765625, -23.4730224609375, 20.02532196044922, 40.246116638183594, -13.4864501953125, 7.8118896484375, -31.663909912109375, 7.8348236083984375, 7.244598388671875, -77.6923828125, 42.6212158203125, 86.96903991699219, 114.09425354003906, -3.08251953125, 16.8834228515625, 5.310302734375, 23.123687744140625, 20.800270080566406, 191.704345703125, 153.48684692382812, 149.68759155273438, -21.873992919921875, -3.420562744140625, -65.52987670898438, 17.021484375, 36.620208740234375, 
8.73031997680664, 46.89054870605469, -121.39053344726562, 89.43798828125, -12.90350341796875, 20.04083251953125, 165.19818115234375, 167.58570861816406, 159.1060333251953, 151.28192138671875, -8.244224548339844, 96.28900146484375, 32.494873046875, -11.444686889648438, 94.97671508789062, 96.6016845703125, 143.2606964111328, 8.467437744140625, 138.2891387939453, -112.97560119628906, -1.3712005615234375, 128.61221313476562, 162.82406616210938, -62.66192626953125, 8.4871826171875, -130.52108764648438, 32.02459716796875, 7.46986198425293, 161.53262329101562, 158.71011352539062, 148.80218505859375, 153.0863494873047, 25.10400390625, 12.8875732421875, 25.14208984375, 63.54795837402344, 7.4998779296875, 57.90325927734375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000395.npy"}
|
||||
{"epoch": 0.8272251308900523, "step": 396, "batch_size": 128, "mean": 47.97978210449219, "std": 77.76249694824219, "min": -186.43121337890625, "p10": -37.31218109130859, "median": 40.671791076660156, "p90": 144.75965423583983, "max": 201.23077392578125, "pos_frac": 0.71875, "sample": [117.6683120727539, 13.9901123046875, -11.11407470703125, 151.98727416992188, 126.40591430664062, 13.204238891601562, -6.0536651611328125, 15.848201751708984, -40.220947265625, 166.14053344726562, 58.54388427734375, -19.048965454101562, 70.35948181152344, 51.497894287109375, 2.19842529296875, 68.11721801757812, 139.63528442382812, -52.634857177734375, 8.370147705078125, 106.62353515625, -34.549530029296875, -10.121009826660156, 8.48175048828125, 201.23077392578125, -3.340301513671875, -18.6075439453125, 130.01220703125, 125.45709228515625, 66.00340270996094, 108.0287094116211, 105.260498046875, -97.95951843261719, 66.780029296875, 130.3433837890625, -113.49520874023438, 145.0570068359375, 32.877967834472656, -26.74072265625, -5.633228302001953, 66.40829467773438, 40.43559265136719, 125.81804656982422, 71.53512573242188, 116.30528259277344, 165.13818359375, 20.34881591796875, 11.376077651977539, 11.162872314453125, -4.00665283203125, 99.5531997680664, 173.83169555664062, 163.08474731445312, 198.7669677734375, 4.679466247558594, 138.73800659179688, 55.690948486328125, 135.79132080078125, 20.025909423828125, 16.00799560546875, 17.085447311401367, 6.836235046386719, -31.052032470703125, -104.156494140625, -60.827186584472656, 36.757720947265625, -56.05406188964844, 47.83245849609375, -126.41510009765625, 17.062255859375, 16.52204132080078, 128.36334228515625, -8.094024658203125, 0.75250244140625, -3.4523696899414062, 130.05892944335938, 144.5537109375, 56.9786376953125, 127.55960083007812, 21.739593505859375, -136.81004333496094, 47.283447265625, 100.18585968017578, 17.75292205810547, 123.86749267578125, 55.60003662109375, 144.2073516845703, 28.112651824951172, 168.72740173339844, 
-100.57723236083984, 134.4764404296875, -22.51593017578125, 81.28704833984375, -46.86639404296875, 80.2859115600586, 85.22557830810547, 100.32965087890625, 131.86688232421875, -3.57220458984375, 13.394683837890625, 40.907989501953125, 145.3463592529297, 66.60261535644531, -0.71832275390625, -186.43121337890625, -11.343429565429688, 144.63221740722656, -9.125381469726562, 162.25051879882812, 145.1944580078125, 153.87269592285156, 0.0, 15.847686767578125, 142.09881591796875, 35.495269775390625, 74.11959838867188, 8.653617858886719, 0.0, 46.75706481933594, 78.74433898925781, -14.764053344726562, 20.062530517578125, -105.68316650390625, 97.60464477539062, 142.09205627441406, 132.04476928710938, -23.123321533203125, -36.06556701660156, 90.73106384277344], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000396.npy"}
|
||||
{"epoch": 0.8293193717277487, "step": 397, "batch_size": 128, "mean": 45.90789031982422, "std": 77.76599884033203, "min": -157.10711669921875, "p10": -39.563970947265624, "median": 32.38181495666504, "p90": 147.0010971069336, "max": 185.421630859375, "pos_frac": 0.703125, "sample": [-7.6591796875, 7.683349609375, -11.901016235351562, 14.706863403320312, 157.6583251953125, 24.486236572265625, -85.93699645996094, 47.80859375, -5.218723297119141, 22.95489501953125, 9.55670166015625, 14.82391357421875, 100.74382019042969, 15.59332275390625, 29.671798706054688, 146.38621520996094, 62.69248962402344, 92.99687194824219, 180.28094482421875, 111.70835876464844, -113.73074340820312, 24.56500244140625, -19.96954345703125, 49.802886962890625, 117.63928985595703, 136.92245483398438, 0.0, -2.82464599609375, 121.63107299804688, 44.006256103515625, 0.0, -38.42730712890625, 109.6309814453125, -11.275558471679688, -5.18280029296875, 2.4994277954101562, 7.2081298828125, 136.74366760253906, -81.68463134765625, 126.12899780273438, 155.11459350585938, 62.138397216796875, 108.7196044921875, 151.09033203125, -28.50506591796875, 0.05504608154296875, 118.34283447265625, 136.38729858398438, 185.421630859375, -37.72064208984375, 182.78665161132812, -46.171905517578125, 168.78619384765625, -157.10711669921875, -78.1536865234375, -40.618499755859375, 47.575286865234375, -111.76553344726562, -10.357208251953125, -135.10577392578125, -38.84710693359375, 23.722915649414062, 151.839111328125, 123.8258056640625, 64.22679138183594, 32.16762161254883, 101.88888549804688, -8.13427734375, 31.616500854492188, -0.9118061065673828, 53.001312255859375, 160.58526611328125, 75.84469604492188, 78.27958679199219, 134.193603515625, 137.54049682617188, 113.6787338256836, -14.31610107421875, 71.47476196289062, 113.16226196289062, 136.484130859375, 32.59600830078125, 155.161865234375, 148.43582153320312, 39.718475341796875, 1.1591033935546875, 129.41079711914062, -3.7178955078125, 12.717559814453125, 
18.352447509765625, 116.59576416015625, 16.966506958007812, 17.123153686523438, 17.89379119873047, 136.5758056640625, 126.07070922851562, 28.339683532714844, 24.6761474609375, 151.82669067382812, 81.87930297851562, 18.701416015625, -103.17327880859375, 47.611083984375, -9.557666778564453, 164.4172821044922, 144.026123046875, 111.20152282714844, 5.226776123046875, -14.025627136230469, -29.949371337890625, -24.661453247070312, 146.06842041015625, 115.7081298828125, -37.91790771484375, 49.844879150390625, 39.09765625, -104.90682983398438, -16.6824951171875, 120.71910095214844, 24.101165771484375, -39.112030029296875, -86.29389953613281, 112.17831420898438, 87.65375518798828, 56.48028564453125, -99.3841552734375, 107.4556884765625, 94.64559936523438], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000397.npy"}
|
||||
{"epoch": 0.831413612565445, "step": 398, "batch_size": 128, "mean": 34.809818267822266, "std": 84.16205596923828, "min": -157.36767578125, "p10": -64.782421875, "median": 21.199542999267578, "p90": 149.636279296875, "max": 224.3533935546875, "pos_frac": 0.671875, "sample": [13.99713134765625, 20.735198974609375, 133.7501220703125, -11.54302978515625, 224.3533935546875, -44.97369384765625, 57.198944091796875, 2.869873046875, 59.081390380859375, 150.64462280273438, -119.3837890625, -55.371734619140625, -145.71453857421875, 25.246292114257812, 7.277305603027344, 14.413848876953125, 45.059814453125, -3.9526290893554688, 144.40887451171875, 155.51101684570312, 140.30624389648438, -123.77239990234375, -99.115478515625, 104.955322265625, 38.037353515625, 103.6827392578125, 10.377426147460938, 117.60452270507812, 131.61007690429688, 99.53843688964844, -84.86534118652344, 10.196609497070312, 37.32122802734375, -157.36767578125, -64.53240966796875, 103.05177307128906, 20.7427978515625, 117.34930419921875, 4.5784912109375, -96.70012664794922, 1.6858444213867188, -65.36578369140625, 21.656288146972656, 144.05569458007812, 0.021482467651367188, -51.62677001953125, -8.775314331054688, 124.52688598632812, 20.199951171875, -11.319122314453125, 33.8702392578125, -15.621124267578125, 73.58517456054688, -13.430160522460938, 116.28929138183594, 61.9886474609375, 25.442991256713867, -153.92123413085938, 24.719345092773438, -17.65195655822754, 9.484375, 48.12213134765625, 149.20413208007812, -28.8695068359375, 167.5958251953125, 7.795928955078125, 24.620750427246094, -121.5257568359375, 103.57244873046875, 158.92124938964844, 60.65440368652344, 177.48480224609375, 27.612655639648438, 182.80197143554688, -11.556060791015625, -18.298797607421875, 9.199216842651367, 148.030029296875, -29.763671875, -10.679168701171875, 157.54632568359375, 163.53173828125, 9.962638854980469, 35.83690643310547, 56.98076629638672, 6.507904052734375, -41.62611389160156, 128.25865173339844, 46.35791015625, 
-47.566192626953125, 113.27972412109375, 96.0792236328125, -14.694534301757812, -134.07745361328125, 11.7921142578125, -8.5780029296875, 107.17279052734375, -24.668243408203125, -3.74261474609375, 59.60881042480469, -18.410255432128906, -9.10400390625, 126.04071044921875, 96.37519836425781, 12.3087158203125, 4.600868225097656, 55.58709716796875, 153.20025634765625, 145.29544067382812, -33.740478515625, -111.26079559326172, 166.903564453125, 113.00054931640625, -55.10161590576172, 144.8468017578125, -5.270263671875, -44.60963439941406, -148.7188720703125, 13.139755249023438, 152.9708251953125, 118.2728271484375, -7.387451171875, 7.605039596557617, 196.04476928710938, 42.00048828125, 26.9703369140625, 105.69844055175781, 39.09149169921875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000398.npy"}
|
||||
{"epoch": 0.8335078534031414, "step": 399, "batch_size": 128, "mean": 43.15449905395508, "std": 72.66523742675781, "min": -153.35272216796875, "p10": -32.905008697509764, "median": 30.43695831298828, "p90": 145.88509368896484, "max": 191.95867919921875, "pos_frac": 0.7265625, "sample": [-8.690536499023438, 126.09616088867188, 129.24346923828125, 125.507568359375, 10.093231201171875, -33.41466522216797, 3.294189453125, 36.685760498046875, 33.08831787109375, 157.82070922851562, 47.88581085205078, -46.3260498046875, -29.97955322265625, 136.0760955810547, -3.1272201538085938, 41.61070251464844, 72.70635986328125, 4.3692474365234375, 151.6194305419922, 30.811813354492188, -2.164093017578125, -83.04075622558594, -19.97882080078125, 143.7711181640625, 1.347259521484375, -0.3050537109375, 20.196380615234375, -112.17155456542969, -153.35272216796875, 111.85289001464844, -3.63629150390625, 147.73052978515625, 33.94663619995117, 152.38653564453125, 17.93561553955078, -10.7418212890625, 134.19247436523438, -7.917091369628906, 17.94903564453125, 153.24273681640625, 34.07025146484375, 145.0941925048828, 191.95867919921875, 140.47451782226562, -18.16888427734375, -12.800529479980469, 144.97781372070312, -58.6104736328125, 2.8721370697021484, 130.707763671875, -40.494720458984375, 42.86603546142578, 118.60458374023438, 91.31524658203125, 166.9747314453125, -7.29046630859375, 31.732166290283203, 11.61578369140625, 11.562881469726562, -14.85141372680664, 6.2111053466796875, 148.3720703125, 0.68048095703125, 26.135536193847656, 129.93801879882812, 62.656978607177734, 133.20559692382812, 20.99115753173828, 75.016357421875, 42.39703369140625, 50.23237609863281, 94.49612426757812, -10.136871337890625, -101.83892822265625, -134.39068603515625, -107.69609069824219, -14.00616455078125, 45.21977233886719, 148.0428466796875, 139.93020629882812, 29.852386474609375, 129.77569580078125, 56.05108642578125, 126.37452697753906, 73.99923706054688, 60.109771728515625, 12.74853515625, 
3.5030670166015625, 27.4610595703125, 126.71041870117188, 5.86021614074707, -37.801910400390625, 14.283775329589844, 56.917266845703125, 9.546661376953125, 16.76531982421875, -23.5572509765625, 0.06623458862304688, 154.2479248046875, 10.497848510742188, 85.77056884765625, 30.062103271484375, 52.3048095703125, -56.945770263671875, 6.131103515625, 151.54653930664062, -24.126678466796875, 183.97708129882812, 88.90420532226562, 49.9656982421875, 159.9026336669922, -88.41912841796875, -32.68658447265625, 35.847503662109375, 24.39303207397461, 23.49609375, 82.0296630859375, 114.701171875, -13.681060791015625, 126.02212524414062, -27.405517578125, 42.085052490234375, 109.75730895996094, 68.3063735961914, -24.63702392578125, -1.607513427734375, 92.36750793457031, 17.62176513671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000399.npy"}
|
||||
{"epoch": 0.8356020942408376, "step": 400, "batch_size": 128, "mean": 51.497894287109375, "std": 76.16648864746094, "min": -162.82106018066406, "p10": -33.71959915161132, "median": 38.164302825927734, "p90": 147.32647247314452, "max": 184.56719970703125, "pos_frac": 0.765625, "sample": [5.1886444091796875, -56.49493408203125, 16.357666015625, 38.085296630859375, 107.567138671875, 20.402259826660156, 103.6631851196289, 108.47386169433594, 46.96543884277344, -86.41324615478516, 6.24072265625, 86.718505859375, -162.82106018066406, 38.243309020996094, 155.09825134277344, 15.554367065429688, -4.4268646240234375, 140.28945922851562, 111.379638671875, 140.04556274414062, 179.71527099609375, 3.84130859375, 26.029281616210938, 110.43704223632812, 122.45333862304688, 92.9715576171875, -41.26324462890625, 133.19638061523438, -20.2801513671875, 141.1165771484375, 47.46875, 7.151313781738281, 16.263572692871094, 113.57183837890625, 41.31639099121094, 152.35321044921875, 132.0311279296875, 104.469970703125, 24.638046264648438, 5.914794921875, -123.83389282226562, 127.96524810791016, 111.19148254394531, 147.4447021484375, 12.58453369140625, 138.02114868164062, 1.7042236328125, 115.62347412109375, 9.161376953125, 10.743133544921875, 43.3902587890625, 124.2496337890625, 156.90960693359375, -95.61453247070312, 0.73297119140625, -32.174827575683594, 11.845672607421875, 25.87810516357422, 137.06674194335938, 3.16290283203125, -1.5568580627441406, 50.439605712890625, 149.85479736328125, 107.56219482421875, 28.309112548828125, -37.324066162109375, 16.504364013671875, 139.057373046875, 73.92825317382812, 158.3148956298828, -0.46392822265625, 1.5629196166992188, -23.263214111328125, 26.749771118164062, -31.315277099609375, -84.22894287109375, 145.14041137695312, 156.64088439941406, 3.45147705078125, -42.911895751953125, 69.93405151367188, 109.20101928710938, 84.31915283203125, 83.21981811523438, -3.2209510803222656, 149.96527099609375, 121.1095962524414, -19.619537353515625, 
32.47142028808594, 48.744911193847656, 102.3001708984375, -20.63134765625, -8.310562133789062, 6.499162673950195, -1.645904541015625, 133.60198974609375, -25.44598388671875, 109.48799133300781, 2.9765377044677734, -94.32025146484375, 42.818328857421875, 155.10513305664062, 118.455078125, 25.092498779296875, 11.14678955078125, 14.668212890625, 117.9683837890625, -5.479835510253906, 9.858001708984375, 147.2758026123047, 129.74114990234375, 17.175018310546875, 142.06393432617188, -13.21368408203125, 140.45950317382812, 138.23036193847656, 139.1061248779297, -15.29644775390625, 184.56719970703125, 161.3979034423828, 151.03990173339844, 140.40127563476562, 0.0, -69.44512939453125, -85.6348876953125, 6.177886962890625, -125.0552978515625, 46.453819274902344], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000400.npy"}
|
||||
{"epoch": 0.837696335078534, "step": 401, "batch_size": 128, "mean": 59.9088134765625, "std": 74.63966369628906, "min": -196.46014404296875, "p10": -19.31439323425293, "median": 53.52934265136719, "p90": 152.62048950195313, "max": 217.459716796875, "pos_frac": 0.7734375, "sample": [-1.97247314453125, 3.512481689453125, 29.182952880859375, -6.14178466796875, 20.2032470703125, 109.77314758300781, 21.72052001953125, 105.24861907958984, 83.971923828125, -5.8973388671875, 134.0308837890625, 19.341827392578125, 46.72895050048828, 147.42987060546875, 94.21548461914062, 36.43310546875, 75.19000244140625, 127.96478271484375, -9.12115478515625, 116.65260314941406, -8.180526733398438, 30.29815673828125, -19.50259780883789, 72.24853515625, -5.3807373046875, 148.33773803710938, 76.7432861328125, 1.905029296875, 134.05419921875, 58.8155517578125, 165.18511962890625, 190.6055908203125, 126.044921875, 53.18634033203125, 10.518310546875, 11.1510009765625, 127.38441467285156, 121.47571563720703, 18.19219207763672, 124.64069366455078, 217.459716796875, -19.233734130859375, 139.670654296875, 152.52386474609375, 167.8238525390625, 136.73257446289062, 60.24993896484375, 16.028564453125, 130.42864990234375, 137.01693725585938, -77.8096923828125, 125.78221130371094, 31.30828857421875, 9.660896301269531, -196.46014404296875, 82.63461303710938, 10.32781982421875, 147.43798828125, 92.62832641601562, -78.03163146972656, -30.552215576171875, 114.94479370117188, 169.63900756835938, 112.65582275390625, 111.7664794921875, 53.872344970703125, -46.445648193359375, -98.79188537597656, 143.2649383544922, -11.85882568359375, -1.1996078491210938, 0.07866668701171875, 179.2349853515625, 20.289215087890625, 41.27671813964844, 5.75054931640625, 125.44343566894531, 86.128173828125, 50.975311279296875, 49.422401428222656, 135.0003662109375, -26.1287841796875, 133.5263671875, 152.845947265625, 146.25234985351562, 156.46429443359375, 46.423187255859375, -3.2483062744140625, 59.4039306640625, -22.573486328125, 
98.02520751953125, 21.1864013671875, -8.2742919921875, -10.85894775390625, 35.46745681762695, -22.322402954101562, -23.905303955078125, 160.52468872070312, 180.8347625732422, 21.5447998046875, 67.16705322265625, -1.5465240478515625, 154.25027465820312, 136.75003051757812, 157.88079833984375, 28.454376220703125, 5.273223876953125, 6.790496826171875, -19.557289123535156, 63.90972900390625, 62.967041015625, 0.334014892578125, 21.283477783203125, 15.880596160888672, -13.232208251953125, -1.792327880859375, 67.7855224609375, 124.37149047851562, 123.00444030761719, -18.694847106933594, 68.33302307128906, 135.35813903808594, 142.713623046875, 2.6742706298828125, 179.4180145263672, -155.86122131347656, 2.39776611328125, 133.561767578125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000401.npy"}
|
||||
{"epoch": 0.8397905759162304, "step": 402, "batch_size": 128, "mean": 48.934837341308594, "std": 77.52105712890625, "min": -139.37557983398438, "p10": -36.0342529296875, "median": 33.2681884765625, "p90": 145.9068176269531, "max": 219.41726684570312, "pos_frac": 0.7578125, "sample": [-6.1664886474609375, -3.241485595703125, -111.94915771484375, 44.99078369140625, 137.173095703125, -16.786422729492188, -76.38543701171875, -31.171630859375, 99.94186401367188, -139.37557983398438, -112.9993896484375, 118.04959106445312, 10.72979736328125, -31.573856353759766, 5.828239440917969, 196.02633666992188, 122.11575317382812, 111.06603240966797, 9.672836303710938, -27.145751953125, 123.4720458984375, 2.415802001953125, 13.409034729003906, 166.35137939453125, -116.61360168457031, 2.563192367553711, 1.391082763671875, 130.58602905273438, -137.09140014648438, 35.277374267578125, 101.61669921875, 2.37225341796875, 96.49765014648438, 112.06625366210938, 9.8861083984375, 0.0, 194.28662109375, 134.67654418945312, -18.807861328125, 131.36972045898438, 75.49880981445312, 30.32568359375, 121.74575805664062, 14.317718505859375, 88.65206909179688, 130.87966918945312, -99.52688598632812, -7.8223876953125, 160.85406494140625, 131.25051879882812, 26.6981201171875, -54.01020812988281, -34.4609375, 108.1636962890625, 104.3253173828125, 24.918685913085938, 116.67518615722656, 86.50633239746094, 4.212249755859375, 78.59060668945312, 12.539499282836914, 177.646240234375, 163.25982666015625, 10.382110595703125, 109.2112045288086, -20.860733032226562, -66.09898376464844, 74.60731506347656, 23.935882568359375, -118.67151641845703, 142.06466674804688, 132.92208862304688, 95.68701171875, -0.85260009765625, 124.75823974609375, -8.17108154296875, 219.41726684570312, 3.8827781677246094, 68.16525268554688, 156.58856201171875, 145.59423828125, 146.63616943359375, 27.481414794921875, 75.73332977294922, 29.748977661132812, 36.633453369140625, 153.44970703125, 93.79754638671875, -66.57243347167969, 
116.88116455078125, 110.71491241455078, 35.8052978515625, 12.226951599121094, 0.211395263671875, 42.68902587890625, 68.104736328125, 6.737754821777344, 50.54571533203125, 26.487396240234375, 145.3207550048828, 138.59735107421875, 148.72386169433594, 124.48367309570312, -32.093505859375, 179.7137451171875, -4.9556732177734375, 44.23851013183594, 10.28155517578125, 35.746826171875, 151.23886108398438, 16.231895446777344, 118.5205078125, 93.97662353515625, 90.7176513671875, -49.20269775390625, 7.3369140625, -0.8602447509765625, 24.3367919921875, 15.9169921875, 5.142547607421875, 104.09466552734375, 1.0192012786865234, 3.0518226623535156, -39.705322265625, -6.20703125, 31.259002685546875, 102.30772399902344, -1.2115287780761719], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000402.npy"}
|
||||
{"epoch": 0.8418848167539267, "step": 403, "batch_size": 128, "mean": 66.93035125732422, "std": 68.29164123535156, "min": -116.08160400390625, "p10": -13.67847747802734, "median": 60.74053955078125, "p90": 151.08029327392578, "max": 203.98883056640625, "pos_frac": 0.8125, "sample": [-12.726791381835938, 119.4771728515625, -5.9700927734375, 143.40711975097656, 94.7086181640625, 52.4703369140625, 147.69725036621094, 82.7480239868164, 24.47015380859375, 27.37091064453125, 147.20175170898438, 140.3118896484375, 44.60589599609375, 44.7705078125, 8.966461181640625, 84.34185791015625, 19.034439086914062, 77.47454833984375, 35.490142822265625, 16.25109100341797, 150.55213928222656, -16.08224105834961, -34.1024169921875, 33.1458740234375, 143.34796142578125, 123.83499145507812, 149.26992797851562, 44.2962646484375, 25.67156982421875, 170.4053955078125, 181.92474365234375, 9.692413330078125, 0.942047119140625, 100.3652114868164, -3.592905044555664, 88.00145721435547, 13.566482543945312, 28.33953857421875, 0.0, -18.5841064453125, -0.5789089202880859, 140.786376953125, 179.229736328125, 85.92053985595703, 145.171630859375, 143.73944091796875, 136.30584716796875, 20.048736572265625, 124.49221801757812, 184.35760498046875, 11.86651611328125, 45.98491668701172, 12.4405517578125, 9.516712188720703, 156.6995849609375, 48.906402587890625, 30.136932373046875, -48.916290283203125, 139.88641357421875, -38.94151306152344, -15.899078369140625, 60.8416748046875, -7.671665191650391, 160.20123291015625, 133.23544311523438, 9.090980529785156, 178.12188720703125, 1.267364501953125, 32.15789794921875, 24.49420166015625, 25.05633544921875, 133.08761596679688, 13.29071044921875, 82.77105712890625, -10.8660888671875, 57.86915588378906, 101.73855590820312, 152.31265258789062, 4.9947509765625, 66.97201538085938, -1.059478759765625, 60.639404296875, 163.16531372070312, 16.020263671875, 119.49624633789062, -19.928924560546875, 19.592300415039062, 130.30081176757812, 115.37493896484375, 
79.3795166015625, 200.71905517578125, 76.51168823242188, 203.98883056640625, 167.82647705078125, 50.86674499511719, 153.75015258789062, -30.691650390625, 119.5257568359375, 133.13711547851562, 112.49193572998047, 109.4798583984375, 61.969482421875, 128.37644958496094, 27.121978759765625, -116.08160400390625, 134.00514221191406, -24.441650390625, 135.7847442626953, -1.2720279693603516, 128.89222717285156, -49.457366943359375, -1.5464859008789062, 24.79461669921875, 97.14755249023438, -24.02069091796875, 47.39195251464844, 48.97747802734375, -2.231964111328125, 102.79484558105469, 9.599822998046875, -105.1123046875, 132.9169464111328, 133.84930419921875, 76.5953369140625, 120.67805480957031, 111.49925994873047, 113.68365478515625, 61.36798095703125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000403.npy"}
|
||||
{"epoch": 0.8439790575916231, "step": 404, "batch_size": 128, "mean": 46.723228454589844, "std": 80.21269226074219, "min": -183.19265747070312, "p10": -39.44465522766113, "median": 30.47149658203125, "p90": 155.1518127441406, "max": 210.9959716796875, "pos_frac": 0.7109375, "sample": [154.77890014648438, 147.88803100585938, 38.338592529296875, 164.15406799316406, -116.93063354492188, 139.0259552001953, 166.04547119140625, 5.5218048095703125, -82.94210052490234, 0.0, -135.11566162109375, 63.61553192138672, 26.383140563964844, -18.296875, -81.20565795898438, 124.15702819824219, 30.67315673828125, 158.21075439453125, -1.2365036010742188, -50.04632568359375, 149.166015625, 54.71092224121094, 20.06329345703125, -6.83538818359375, -16.7911376953125, 96.47659301757812, -183.19265747070312, 139.29910278320312, -37.8883171081543, 14.856864929199219, 10.105438232421875, -43.07611083984375, 20.734649658203125, 8.8321533203125, 86.83821105957031, -9.706809997558594, 3.651508331298828, -1.080535888671875, 210.9959716796875, 104.17259216308594, 141.22048950195312, -4.47650146484375, -6.406341552734375, -84.99569702148438, 127.91349792480469, 44.33293151855469, 64.42340087890625, 30.26983642578125, 3.9327526092529297, 18.63037109375, 93.89610290527344, 7.723747253417969, 145.47613525390625, -25.5130615234375, 43.581817626953125, 174.1964111328125, -102.50155639648438, 16.173660278320312, -14.383285522460938, 17.190170288085938, 181.81924438476562, 162.0330352783203, -113.92138671875, 120.18545532226562, 85.31954956054688, 132.61212158203125, 132.03402709960938, 15.6552734375, -4.274261474609375, 140.2628631591797, 124.31067657470703, 7.168035507202148, 9.79302978515625, -11.549688339233398, 10.60247802734375, 104.28509521484375, 15.498046875, -12.64013671875, 168.38433837890625, 39.57347106933594, 70.15577697753906, 21.314224243164062, 24.24560546875, 63.75677490234375, 7.888885498046875, -30.157943725585938, -16.6656494140625, -14.97528076171875, 160.30300903320312, 
-85.05880737304688, -6.036251068115234, 101.59188842773438, 142.95437622070312, 38.4947509765625, -119.51954650878906, -27.00726318359375, 73.78253173828125, 16.389991760253906, 152.52700805664062, 68.2940673828125, -24.84897804260254, 12.848373413085938, 156.02194213867188, 61.756011962890625, 152.3553466796875, 78.61918640136719, 52.80731964111328, 66.46829223632812, -104.73932647705078, 161.47177124023438, 30.98736572265625, 178.6007080078125, 142.0878448486328, 54.5401611328125, 29.23040771484375, -3.968109130859375, 61.012603759765625, -10.085315704345703, 76.21267700195312, 128.44760131835938, 28.613067626953125, -6.036247253417969, 12.614944458007812, 115.03985595703125, 127.9356689453125, 89.39909362792969, 143.76150512695312, 174.954345703125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000404.npy"}
|
||||
{"epoch": 0.8460732984293193, "step": 405, "batch_size": 128, "mean": 52.053016662597656, "std": 73.87206268310547, "min": -129.5447998046875, "p10": -31.062554931640626, "median": 35.70142364501953, "p90": 158.80519104003906, "max": 195.89703369140625, "pos_frac": 0.7578125, "sample": [14.6234130859375, 4.811248779296875, 9.600433349609375, 0.0, -50.969329833984375, -5.038360595703125, 9.45037841796875, 130.9976043701172, 106.12728881835938, 141.21319580078125, 2.8185882568359375, 18.03900146484375, 82.61174011230469, 34.41688537597656, 126.9343490600586, -70.85548400878906, -129.5447998046875, 21.601821899414062, 39.127296447753906, 165.640869140625, 58.445560455322266, 143.63491821289062, -2.118938446044922, -15.078506469726562, 125.89631652832031, 5.921173095703125, -26.036209106445312, 53.2205810546875, 156.7494659423828, 30.94709014892578, 113.59133911132812, 128.57754516601562, 5.8922119140625, -6.046875, 57.039146423339844, 128.16488647460938, 14.096511840820312, -68.38906860351562, -66.95772552490234, 115.50556182861328, -2.413301467895508, 172.40696716308594, 121.00006103515625, 0.9780616760253906, -9.238067626953125, 195.89703369140625, 126.39205932617188, 94.01182556152344, 119.6219482421875, 39.234100341796875, 10.145111083984375, 83.05023193359375, 160.13870239257812, 62.603851318359375, 25.398590087890625, 1.31561279296875, 120.49858093261719, 130.5087890625, 124.3773193359375, -40.155029296875, -95.07128143310547, 58.80322265625, 85.3603515625, -6.4661102294921875, 13.290496826171875, -6.058052062988281, 59.03656005859375, -4.554168701171875, 20.6192626953125, 181.30751037597656, 51.98344421386719, 56.446868896484375, 132.31744384765625, 76.50421142578125, 171.87600708007812, 181.69378662109375, -31.153045654296875, 101.53273010253906, 14.218193054199219, 39.45013427734375, 180.09219360351562, 10.19333267211914, -31.023773193359375, -83.08584594726562, 133.0452880859375, 25.467418670654297, 0.5901908874511719, 176.84371948242188, 28.776351928710938, 
71.99566650390625, 122.98373413085938, 25.597496032714844, -70.56591796875, 8.218048095703125, 157.1578369140625, 160.079345703125, 9.690162658691406, 12.477645874023438, 129.93997192382812, 110.34038543701172, -73.62870788574219, 161.55526733398438, -1.46331787109375, 130.73458862304688, 14.610931396484375, 56.59686279296875, 9.736316680908203, 132.30130004882812, 59.339874267578125, 36.9859619140625, 40.941558837890625, 0.7188034057617188, 158.25912475585938, -3.2191715240478516, 3.09942626953125, -15.039947509765625, -20.007217407226562, 9.754438400268555, -27.2095947265625, 3.063974380493164, -37.757232666015625, 164.57415771484375, 190.06756591796875, -11.22845458984375, -63.756874084472656, 153.55401611328125, 79.50726318359375, 90.31024169921875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000405.npy"}
|
||||
{"epoch": 0.8481675392670157, "step": 406, "batch_size": 128, "mean": 41.929039001464844, "std": 79.90755462646484, "min": -164.96005249023438, "p10": -46.33058166503906, "median": 28.499019622802734, "p90": 138.80856628417968, "max": 190.19271850585938, "pos_frac": 0.703125, "sample": [127.62933349609375, -27.34680938720703, -33.6966552734375, 18.1446533203125, -129.10302734375, 180.071044921875, 101.09843444824219, 138.22750854492188, 86.364013671875, 141.20765686035156, -11.601669311523438, 0.5795555114746094, -19.80126953125, 156.96124267578125, 56.06805419921875, -25.5177001953125, 113.24285888671875, -10.974369049072266, 70.16694641113281, 23.844146728515625, -113.17245483398438, -46.3077392578125, 13.9852294921875, 69.677978515625, 132.002197265625, 8.229019165039062, 92.16459655761719, 169.94036865234375, 25.930511474609375, 105.26600646972656, 54.30572509765625, 140.16436767578125, -133.31442260742188, -3.127716064453125, 21.928436279296875, -114.56787109375, -28.509307861328125, -8.943626403808594, -15.777137756347656, 127.23483276367188, -39.055755615234375, -50.9114990234375, -103.01739501953125, -22.836822509765625, 33.974327087402344, 144.77542114257812, 16.707733154296875, 127.89081573486328, 113.38742065429688, 136.52188110351562, -5.6563720703125, 21.323883056640625, 134.6727752685547, 105.97654724121094, -164.96005249023438, -9.371856689453125, 120.09683227539062, 123.16409301757812, 93.81624603271484, 19.40234375, 3.10595703125, 136.45904541015625, -105.50393676757812, 163.558837890625, 21.607223510742188, 28.73529052734375, 4.497528076171875, 14.37908935546875, -52.722991943359375, 120.2735595703125, 190.19271850585938, 134.11703491210938, 10.517929077148438, 36.5994873046875, 157.4388427734375, 34.62811279296875, 5.6721038818359375, 130.8271484375, 13.020526885986328, 164.28302001953125, 120.74575805664062, -16.650123596191406, -131.09774780273438, 12.114738464355469, 136.0290985107422, 113.62457275390625, 87.13784790039062, 
-12.647109985351562, 33.425628662109375, 89.13211059570312, 23.925613403320312, -14.412704467773438, 105.1900405883789, 108.35466003417969, -71.25606536865234, 188.30282592773438, 94.41757202148438, 1.6686477661132812, 114.66050720214844, 37.40643310546875, 55.55181884765625, 112.68600463867188, 35.0771484375, 126.4129638671875, -26.3372802734375, 47.3193359375, -128.26361083984375, 167.881103515625, 1.9296875, 39.337249755859375, -16.75213623046875, 56.98089599609375, -46.383880615234375, 115.77215576171875, 183.21234130859375, -0.16573333740234375, 2.795818328857422, 28.26274871826172, -13.664031982421875, -16.65704345703125, -45.579925537109375, 22.5118408203125, 9.0821533203125, 32.693267822265625, 135.97756958007812, -46.306884765625, 37.965328216552734, 15.245464324951172], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000406.npy"}
{"epoch": 0.8502617801047121, "step": 407, "batch_size": 128, "mean": 50.29819869995117, "std": 77.25228881835938, "min": -115.6856918334961, "p10": -38.2782341003418, "median": 33.438629150390625, "p90": 151.44912109375, "max": 188.2420654296875, "pos_frac": 0.6953125, "sample": [-3.7560958862304688, 27.42626953125, 83.9324951171875, -14.138290405273438, 134.87399291992188, 13.580341339111328, 147.83331298828125, 65.74604797363281, -34.313201904296875, 19.79625129699707, -2.3289947509765625, -49.735443115234375, -115.31549072265625, -3.873199462890625, -102.44204711914062, 111.44562530517578, -0.3921051025390625, -0.6185836791992188, 15.90374755859375, -25.7249755859375, 120.63336944580078, -13.2969970703125, -2.482391357421875, 86.6397705078125, 150.55691528320312, -60.66184997558594, 79.63568878173828, 109.91151428222656, -1.5638504028320312, -11.83282470703125, -115.6856918334961, 93.45127868652344, -38.10101318359375, 37.620147705078125, 98.85969543457031, 153.82748413085938, 145.21206665039062, 157.77554321289062, -38.691749572753906, -11.75689697265625, 66.00015258789062, 24.896881103515625, 32.820343017578125, 2.2050399780273438, -36.72430419921875, 172.49832153320312, 67.86829376220703, 98.63541412353516, 173.62310791015625, 9.600830078125, 69.70428466796875, 130.05706787109375, 6.92816162109375, 173.71780395507812, 0.662811279296875, 134.55825805664062, -45.648765563964844, -3.7333145141601562, 27.208465576171875, 84.0164794921875, 123.43415832519531, 34.056915283203125, 8.288360595703125, 34.34001922607422, 140.0137939453125, 14.67779541015625, -74.47586059570312, -16.179779052734375, 107.7603759765625, 152.25738525390625, 18.225860595703125, -24.961776733398438, 125.21389770507812, 149.74771118164062, 163.98681640625, 17.45913314819336, 6.2510528564453125, 25.37236785888672, 0.0, 183.8388671875, 140.44204711914062, 8.725471496582031, 84.66618347167969, -51.040557861328125, 149.39920043945312, -89.8870849609375, 166.30624389648438, 82.83139038085938, 
15.572479248046875, 120.29705810546875, 143.88619995117188, -4.59619140625, -6.3226318359375, 104.56295776367188, 92.93417358398438, 13.056907653808594, -36.017730712890625, 36.33270263671875, 40.376983642578125, 153.74008178710938, 133.8280029296875, -19.150192260742188, 179.541015625, -76.6661376953125, 109.38992309570312, -95.95785522460938, 74.46649169921875, 73.3782958984375, 151.10272216796875, 3.0487213134765625, 132.854736328125, 151.00186157226562, 51.3260498046875, -109.43768310546875, 162.4739990234375, 0.5439910888671875, 37.06153869628906, -30.072975158691406, 8.603401184082031, 140.32000732421875, 130.7310333251953, 129.61410522460938, 21.88958740234375, 8.589731216430664, -3.82440185546875, 188.2420654296875, 133.85379028320312, 0.0], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000407.npy"}
{"epoch": 0.8523560209424084, "step": 408, "batch_size": 128, "mean": 51.183265686035156, "std": 74.4272232055664, "min": -174.18234252929688, "p10": -27.520278930664062, "median": 41.160552978515625, "p90": 147.5619171142578, "max": 175.77560424804688, "pos_frac": 0.734375, "sample": [-3.6010208129882812, 14.571990966796875, -54.67308807373047, 62.924346923828125, 23.596038818359375, 140.02398681640625, 32.15751647949219, 160.925537109375, 148.43121337890625, 124.56483459472656, -14.286590576171875, 11.282119750976562, 93.8675537109375, 103.2337646484375, -43.9464111328125, 133.39266967773438, 18.519927978515625, -14.20318603515625, -2.2104949951171875, 73.34288787841797, 0.4362945556640625, 20.730670928955078, 0.1068572998046875, -7.615936279296875, 164.84521484375, 131.13629150390625, 165.998046875, 153.46282958984375, -13.664276123046875, -3.30615234375, -23.048995971679688, 138.71548461914062, 105.107421875, 116.5787353515625, -29.81573486328125, 68.89802551269531, 175.77560424804688, 109.73361206054688, 141.03814697265625, 44.813140869140625, -23.015945434570312, -160.9095458984375, -174.18234252929688, 10.350723266601562, 153.45623779296875, 112.07691955566406, 4.861480712890625, 22.034744262695312, -28.408203125, 140.75083923339844, 36.505462646484375, 100.41229248046875, -13.38177490234375, -8.24932861328125, 71.95631408691406, 37.861419677734375, 107.78533935546875, 129.29434204101562, -27.139739990234375, 65.27947998046875, 145.8241729736328, 124.4046630859375, 173.21463012695312, 14.993309020996094, -17.94775390625, 146.46197509765625, -41.112274169921875, -17.166610717773438, 120.5869140625, -87.69845581054688, 156.70489501953125, 17.652938842773438, 120.39804077148438, 14.2994384765625, 73.53396606445312, 147.18936157226562, -18.48126220703125, 85.09465026855469, -14.35089111328125, 27.355850219726562, 137.05712890625, 62.35040283203125, -0.9069595336914062, 41.246978759765625, -25.35546875, 4.187774658203125, 127.24493408203125, 130.43121337890625, 
102.93692016601562, -57.35466003417969, 19.841705322265625, 11.897438049316406, 41.074127197265625, 135.14578247070312, 140.9140625, -1.542938232421875, -129.51690673828125, 64.84576416015625, 152.29861450195312, 43.13739013671875, -39.32164001464844, 143.8784637451172, 17.9652099609375, 44.52647399902344, 116.37117004394531, -9.506072998046875, 15.282012939453125, 9.794525146484375, 12.23834228515625, 49.00213623046875, 62.493125915527344, 9.672950744628906, 26.704792022705078, 0.0, 10.603424072265625, 172.42123413085938, 62.08387756347656, 3.7016544342041016, 107.781494140625, -39.182647705078125, -102.70608520507812, 27.0703125, 149.23294067382812, 62.514312744140625, 165.96148681640625, 78.42620849609375, 109.84849548339844, 88.5313720703125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000408.npy"}
{"epoch": 0.8544502617801047, "step": 409, "batch_size": 128, "mean": 52.762847900390625, "std": 81.65744018554688, "min": -146.09967041015625, "p10": -44.196249389648436, "median": 34.2352180480957, "p90": 158.35003356933595, "max": 214.6053466796875, "pos_frac": 0.7421875, "sample": [135.20013427734375, 109.64002990722656, 129.30447387695312, 48.95137023925781, 166.4703369140625, -44.3043212890625, 109.90281677246094, 163.056884765625, 1.9425048828125, 39.742645263671875, 21.25959014892578, -2.8400192260742188, 17.779067993164062, 4.75811767578125, 24.63323974609375, 181.9019775390625, 29.84557342529297, 22.8956298828125, 155.11257934570312, -128.7215576171875, -8.659645080566406, -62.53352355957031, -1.4477348327636719, 150.8329315185547, 122.86788940429688, -75.27374267578125, 16.533233642578125, -93.73553466796875, 149.4167022705078, -142.75579833984375, 158.05136108398438, 130.62420654296875, -80.63640594482422, 21.517898559570312, -4.35345458984375, 176.50103759765625, 21.0452880859375, 37.94342041015625, 169.0213623046875, 155.82525634765625, 91.38092041015625, 147.24856567382812, -89.86297607421875, 30.280838012695312, 2.8749465942382812, 39.256072998046875, 63.746124267578125, 144.72976684570312, 14.14208984375, 170.7877197265625, 128.6826171875, 127.33625793457031, 107.7333984375, 128.39016723632812, 214.6053466796875, -40.72344970703125, -44.149932861328125, 41.394195556640625, 20.456817626953125, 3.8433189392089844, 195.06512451171875, 101.88388061523438, 17.235107421875, 53.769775390625, 133.56683349609375, 51.30145263671875, 23.365264892578125, -35.3697509765625, -4.064643859863281, -34.16026306152344, 138.766845703125, -31.958023071289062, -85.5379638671875, 159.04693603515625, -8.228092193603516, 149.63580322265625, 166.21551513671875, 177.19488525390625, 40.58440399169922, 23.89605712890625, 7.122364044189453, -25.674808502197266, 79.67898559570312, 0.0, 97.46562194824219, 205.51165771484375, 34.96421813964844, 157.55328369140625, 
124.34445190429688, -116.57232666015625, -38.471466064453125, 142.73388671875, 48.3765869140625, 3.4545211791992188, 24.645111083984375, -2.7649002075195312, 116.03456115722656, 142.27639770507812, 129.51905822753906, 107.96939086914062, 22.138214111328125, -4.075859069824219, 120.99578857421875, 72.17987060546875, 45.879913330078125, -7.127647399902344, 21.775314331054688, -45.0341796875, -3.9003143310546875, -13.013526916503906, 44.72275161743164, 3.6825027465820312, -146.09967041015625, 31.888444900512695, 126.66081237792969, 30.000152587890625, 2.636758804321289, 114.2137451171875, 28.941360473632812, -62.98748779296875, 135.53280639648438, 7.925712585449219, 7.781181335449219, 105.79840087890625, -13.246871948242188, 120.515869140625, 172.50912475585938, 33.50621795654297], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000409.npy"}
{"epoch": 0.856544502617801, "step": 410, "batch_size": 128, "mean": 47.015830993652344, "std": 79.56884765625, "min": -129.7777099609375, "p10": -42.53206329345702, "median": 39.03888702392578, "p90": 149.89414062499998, "max": 215.9625244140625, "pos_frac": 0.703125, "sample": [58.46514892578125, -29.491722106933594, 15.229972839355469, -38.758392333984375, -17.522727966308594, 137.46197509765625, 12.275491714477539, 28.1759033203125, -28.990386962890625, 115.07023620605469, 193.93069458007812, -69.52435302734375, 29.968061447143555, -72.63031005859375, 72.88072967529297, -129.7777099609375, -26.09210205078125, 36.237701416015625, 146.52557373046875, 93.10348510742188, 155.06378173828125, -121.71417236328125, 66.61634826660156, 11.644981384277344, 138.88040161132812, -5.76123046875, 108.26123046875, 81.87612915039062, 104.55059814453125, 126.47419738769531, 29.36077880859375, 77.61861419677734, 34.011962890625, 135.13595581054688, -4.291402816772461, 108.93315124511719, 41.22075653076172, 69.39932250976562, 35.253631591796875, 145.84237670898438, 148.52520751953125, 44.259063720703125, -5.2335662841796875, 144.70294189453125, -112.98724365234375, 117.70248413085938, -129.72962951660156, 6.168914794921875, -14.364974975585938, -21.218612670898438, 114.33334350585938, 136.28372192382812, 85.14961242675781, -51.779754638671875, -34.63005828857422, 46.45941162109375, -4.209526062011719, 137.84469604492188, 87.14912414550781, 47.21453857421875, -27.50804901123047, 171.8282470703125, 35.1260986328125, 157.63623046875, -1.8035945892333984, 6.5540924072265625, -31.32916259765625, -1.01031494140625, 103.61585235595703, 24.517852783203125, 114.02487182617188, 129.3058624267578, 42.16835403442383, -18.1627197265625, -52.489410400390625, 0.0, 36.857017517089844, 25.132492065429688, 179.67431640625, 147.53643798828125, 182.84278869628906, 0.091400146484375, 3.736572265625, -50.60820007324219, -33.245689392089844, 121.8992919921875, 22.39624786376953, 23.364501953125, 
-28.796585083007812, 27.18598175048828, -124.82447814941406, 54.7733039855957, -11.8350830078125, 103.16209411621094, -39.07086181640625, 183.39404296875, -85.44757080078125, 153.08831787109375, 3.3095779418945312, 52.65254211425781, 76.74927520751953, 1.0667190551757812, 75.30211639404297, -129.21823120117188, 43.28533935546875, 78.97683715820312, 92.17771911621094, 126.2928466796875, 202.9296875, 93.43116760253906, 143.93865966796875, -18.410797119140625, 113.1064453125, 59.563507080078125, 1.308135986328125, 62.7652587890625, 175.76089477539062, -54.074005126953125, 42.92066955566406, 8.580184936523438, 4.376688003540039, 158.51559448242188, 183.759765625, 19.733795166015625, 70.80708312988281, -34.66680908203125, 215.9625244140625, -3.2114009857177734], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000410.npy"}
{"epoch": 0.8586387434554974, "step": 411, "batch_size": 128, "mean": 53.87716293334961, "std": 76.04335021972656, "min": -181.75384521484375, "p10": -21.031834411621084, "median": 32.57537841796875, "p90": 150.75999755859374, "max": 263.242431640625, "pos_frac": 0.7734375, "sample": [108.35704040527344, 150.54840087890625, 31.8653564453125, 126.50482177734375, 20.97003173828125, 0.5740814208984375, -1.6300163269042969, 32.40062713623047, 42.60875701904297, 123.26966857910156, 128.685302734375, 118.921630859375, -1.1440544128417969, -18.766998291015625, -72.969482421875, -112.55905151367188, 177.13629150390625, -14.828765869140625, 121.712890625, 98.67762756347656, 39.4781494140625, 57.554840087890625, 130.13714599609375, 32.6014404296875, 115.02508544921875, 38.6143798828125, 23.67913818359375, 263.242431640625, 116.6495361328125, -17.01043701171875, 151.62734985351562, 22.506032943725586, 27.088584899902344, 22.509872436523438, 117.50166320800781, 17.709930419921875, 160.37744140625, 120.28207397460938, 34.43426513671875, 24.011287689208984, 9.634170532226562, 30.625701904296875, -0.33319091796875, 71.73979187011719, 5.4408416748046875, -62.189170837402344, 1.28155517578125, 6.499732971191406, -17.61688232421875, 121.36593627929688, 127.11455535888672, 26.1358642578125, -9.827102661132812, 3.680204391479492, 15.589288711547852, 151.25372314453125, -31.33057403564453, 87.49057006835938, 31.943878173828125, 13.028541564941406, -0.24136924743652344, 29.948577880859375, -7.601226806640625, -80.69464111328125, 125.00276184082031, 115.15744018554688, 91.50765991210938, 149.18118286132812, 23.62884521484375, 175.01934814453125, 7.243419647216797, 115.18199157714844, 42.98992919921875, 149.56829833984375, 10.107337951660156, 171.11349487304688, 19.071044921875, 7.43267822265625, 139.26815795898438, 125.58496856689453, 25.641334533691406, -181.75384521484375, 157.97811889648438, 12.451789855957031, 12.363510131835938, -115.3416976928711, 50.27831268310547, 
-1.9935874938964844, 132.3387908935547, 179.02178955078125, 102.98196411132812, -80.56755065917969, 31.67938232421875, 36.132415771484375, 14.367034912109375, 39.720603942871094, 95.87080383300781, 187.23919677734375, 105.98587799072266, -47.46588134765625, 91.830810546875, -26.316452026367188, -87.303466796875, -16.47711181640625, 8.255777359008789, 32.54931640625, 109.90614318847656, 45.76861572265625, 35.04911804199219, 111.83586883544922, -29.925323486328125, 112.41400146484375, 140.00364685058594, -7.725868225097656, 19.677886962890625, -17.98443603515625, 86.20272827148438, 193.9678955078125, 128.12039184570312, 139.2740936279297, 124.2467041015625, 179.38111877441406, 112.0501708984375, 168.85311889648438, 25.341064453125, -4.9188232421875, -71.55140686035156, -11.52862548828125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000411.npy"}
{"epoch": 0.8607329842931937, "step": 412, "batch_size": 128, "mean": 46.950218200683594, "std": 81.8581314086914, "min": -166.7060546875, "p10": -44.49461975097656, "median": 34.008522033691406, "p90": 149.8544158935547, "max": 223.25393676757812, "pos_frac": 0.6796875, "sample": [-112.9468994140625, 149.1668701171875, 5.816383361816406, 164.0703125, 39.53289794921875, 92.24667358398438, 33.14598083496094, 115.0216064453125, -35.50093078613281, -14.566436767578125, 39.596435546875, 33.79266357421875, 0.5894012451171875, -16.270151138305664, 20.940338134765625, -166.7060546875, -0.2326202392578125, -26.571685791015625, -86.51336669921875, 27.854141235351562, 5.3222808837890625, 2.6445045471191406, 16.78912353515625, 81.754638671875, 147.72850036621094, 139.396484375, 157.42919921875, 31.074623107910156, -17.40277099609375, 55.112403869628906, 106.96673583984375, 107.75091552734375, -17.428741455078125, 148.40438842773438, 164.57427978515625, 139.87744140625, 149.8525390625, 15.745292663574219, 132.88046264648438, 67.12457275390625, 117.11677551269531, -25.501983642578125, 147.287353515625, -108.21478271484375, 77.61662292480469, -15.9666748046875, -3.0924453735351562, -24.504104614257812, -127.29608154296875, 48.964996337890625, 112.38470458984375, 161.476806640625, 179.65411376953125, -20.7935791015625, 152.82852172851562, 11.916152954101562, -9.290863037109375, 68.41204833984375, -57.565673828125, -27.607635498046875, -58.59039306640625, -7.301441192626953, 184.9571533203125, 69.9022216796875, -5.4446563720703125, 153.76309204101562, 114.4733657836914, 223.25393676757812, 49.473419189453125, 61.844085693359375, -84.44390869140625, 61.09429931640625, -46.02008056640625, -17.89794921875, 65.73074340820312, 34.22438049316406, 174.44090270996094, 149.7330322265625, 17.227386474609375, -149.79888916015625, 48.666595458984375, 76.4290771484375, 138.54904174804688, 1.088409423828125, 149.77044677734375, 4.3740386962890625, 144.66891479492188, 135.3836669921875, 
167.26083374023438, -57.19224548339844, 16.228256225585938, 149.85879516601562, 125.27142333984375, -114.11123657226562, -5.452857971191406, 101.28724670410156, -16.130218505859375, 110.8206787109375, -5.64495849609375, 113.92192840576172, -13.81829833984375, -0.43908119201660156, 180.04690551757812, 65.91218566894531, 123.91009521484375, 6.404882431030273, -34.251678466796875, 82.66603088378906, 9.783111572265625, 103.16447448730469, 140.67437744140625, 27.120025634765625, 0.0, 64.42787170410156, -12.1142578125, 133.17291259765625, 3.9687042236328125, 26.71605682373047, -120.80255126953125, 3.749959945678711, 17.956329345703125, 121.12753295898438, -43.840850830078125, 54.33447265625, -10.985122680664062, -9.57586669921875, 112.82131958007812, 145.94332885742188], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000412.npy"}
{"epoch": 0.86282722513089, "step": 413, "batch_size": 128, "mean": 42.215675354003906, "std": 72.05658721923828, "min": -147.5159912109375, "p10": -35.89104385375976, "median": 22.20519256591797, "p90": 141.95648956298828, "max": 219.61441040039062, "pos_frac": 0.7265625, "sample": [20.37049674987793, -8.33270263671875, -1.2437610626220703, -0.7494125366210938, 6.452484130859375, 131.40029907226562, 46.477447509765625, 20.04461669921875, -147.5159912109375, -22.954544067382812, 158.22010803222656, 41.749969482421875, 159.2528076171875, -46.377227783203125, 30.241729736328125, 15.103130340576172, 6.25885009765625, 155.5078125, 61.646392822265625, -78.53703308105469, 50.67669677734375, 12.548480987548828, 100.47734069824219, 16.91845703125, 25.119049072265625, -15.872261047363281, 7.6949920654296875, 174.5306396484375, -7.255441665649414, 91.54093170166016, 136.38516235351562, -31.537353515625, -37.059364318847656, 53.243194580078125, -35.39033508300781, 135.8855438232422, 141.30113220214844, 131.55990600585938, 134.9747314453125, 10.042213439941406, 92.42274475097656, -3.092437744140625, -80.41156005859375, 5.71826171875, 82.43212890625, -29.9208984375, -79.05426788330078, 54.48912048339844, 158.37582397460938, 75.24566650390625, -39.99224853515625, 2.876800537109375, -20.484207153320312, 75.85025024414062, 25.13995361328125, 110.69577026367188, 147.34722900390625, -20.397705078125, 44.077392578125, 27.873947143554688, 153.8524169921875, 25.779495239257812, 12.631027221679688, 20.655838012695312, 31.598480224609375, 186.10662841796875, -60.90202713012695, 58.58575439453125, 138.69224548339844, -22.957244873046875, 129.37408447265625, 37.863807678222656, -61.80052185058594, 219.61441040039062, 18.8991756439209, -12.760208129882812, 1.6147308349609375, 23.084701538085938, -20.960845947265625, 92.88829040527344, 4.919342041015625, 130.965087890625, -48.49273681640625, 116.06857299804688, 124.19039916992188, -19.745712280273438, 10.5455322265625, 66.1939697265625, 
9.346433639526367, -57.424652099609375, 140.68502807617188, 31.337539672851562, -31.800277709960938, 116.324462890625, 19.001708984375, -17.629009246826172, 7.530830383300781, 8.799957275390625, 133.16766357421875, 10.056110382080078, 8.738468170166016, -28.22655487060547, 5.9779815673828125, 119.970703125, 100.22662353515625, 73.72357177734375, 10.41091537475586, 9.44818115234375, 112.3587646484375, 63.717437744140625, 31.199661254882812, 21.32568359375, 49.10874938964844, 137.9603271484375, -15.806182861328125, -71.8526611328125, 54.6497802734375, 143.48565673828125, -120.65121459960938, 187.21405029296875, 52.375396728515625, 162.82797241210938, 5.259162902832031, -0.11193084716796875, 157.41015625, 133.3140869140625, -18.587799072265625, 20.24859619140625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000413.npy"}
{"epoch": 0.8649214659685864, "step": 414, "batch_size": 128, "mean": 50.59251403808594, "std": 83.3232421875, "min": -186.04425048828125, "p10": -45.12858619689941, "median": 36.63359069824219, "p90": 154.97768859863282, "max": 239.05670166015625, "pos_frac": 0.7265625, "sample": [104.11752319335938, -120.25985717773438, 79.00709533691406, 130.49786376953125, 2.1370468139648438, 127.2803955078125, 46.5736083984375, 64.64251708984375, 53.47981262207031, -99.548828125, -7.9347686767578125, 7.0329437255859375, 156.89581298828125, 17.9354248046875, -0.5620880126953125, -1.901397705078125, -7.432220458984375, -93.38534545898438, 137.37420654296875, -9.384578704833984, -186.04425048828125, -54.843841552734375, 186.79766845703125, 8.10986328125, 112.73159790039062, -122.64869689941406, -41.067138671875, 8.500324249267578, 125.00567626953125, 128.96340942382812, -44.962581634521484, -25.44049072265625, 119.91822814941406, 132.6217803955078, -53.15966796875, 153.4319305419922, 128.205322265625, 1.1053314208984375, 179.0869140625, 0.9718017578125, 28.9754638671875, 144.47308349609375, 77.65728759765625, -9.2774658203125, 100.71620178222656, 127.9510498046875, 6.5487060546875, 11.141403198242188, 38.69049072265625, 178.9820556640625, -6.554473876953125, -12.018959045410156, -3.842304229736328, 186.68624877929688, 130.2471160888672, -7.139617919921875, 135.6552276611328, 153.65357971191406, 153.85577392578125, -83.22842407226562, 108.12142944335938, 3.3175716400146484, 0.11750030517578125, 134.6520538330078, 6.5330810546875, 69.7205810546875, 86.47008514404297, 96.23977661132812, 55.46240234375, -2.646331787109375, 132.0299835205078, -7.714813232421875, 13.027099609375, 174.1597900390625, 127.60765838623047, 155.2159423828125, 137.44422912597656, 0.24139404296875, 50.541839599609375, 109.70167541503906, 2.3093719482421875, -35.46405029296875, -108.36627960205078, 180.19973754882812, 239.05670166015625, 147.3464813232422, 145.82363891601562, 87.28113555908203, 
3.4729385375976562, 123.74349975585938, -170.2523193359375, 12.90472412109375, 36.75689697265625, 134.22760009765625, 11.612163543701172, 154.87557983398438, -30.3023681640625, -4.31201171875, -17.777603149414062, 14.943229675292969, -58.33045196533203, 67.1497802734375, 206.4150390625, 36.510284423828125, 8.220787048339844, 9.695613861083984, 132.82229614257812, 92.94674682617188, 90.81901550292969, 19.125152587890625, -45.51593017578125, 64.4288101196289, 176.43719482421875, 53.195068359375, -3.267803192138672, 14.8992919921875, 109.95504760742188, 167.97409057617188, 22.983749389648438, 38.20649719238281, 21.349624633789062, 0.1229248046875, -57.5191650390625, -13.82073974609375, 30.172088623046875, 64.24563598632812, 159.03094482421875, -7.7526397705078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000414.npy"}
{"epoch": 0.8670157068062827, "step": 415, "batch_size": 128, "mean": 59.23355484008789, "std": 77.84939575195312, "min": -149.06781005859375, "p10": -25.141513061523437, "median": 47.95552062988281, "p90": 158.8675994873047, "max": 220.0447998046875, "pos_frac": 0.7421875, "sample": [18.66011619567871, 114.18095397949219, 20.35082244873047, 47.543914794921875, -34.45323181152344, 2.0413055419921875, 125.5119857788086, 0.7418212890625, 124.01384735107422, 10.17205810546875, 50.95342254638672, -2.3580360412597656, 154.66647338867188, 88.65582275390625, -25.6624755859375, -6.3199005126953125, 141.63238525390625, 3.82867431640625, 128.14202880859375, 42.65000915527344, 134.24038696289062, 21.5848388671875, 167.53713989257812, 4.640869140625, 117.08311462402344, 36.04486083984375, 83.94853210449219, 148.25506591796875, 147.03182983398438, 95.8082275390625, -116.11849975585938, 174.0718994140625, 162.2640380859375, 22.2406005859375, -118.99688720703125, 21.47796630859375, 54.341552734375, 158.05410766601562, -9.1788330078125, 98.43130493164062, 100.02542877197266, 22.235321044921875, -1.2814483642578125, 18.245269775390625, 66.36683654785156, -20.764724731445312, 146.49423217773438, 41.26544189453125, 160.7657470703125, 122.13563537597656, -10.465293884277344, 31.058380126953125, -85.31192016601562, 48.294830322265625, 134.83944702148438, 151.00247192382812, -45.697509765625, 140.71182250976562, 34.4132080078125, -28.593780517578125, 47.6162109375, 134.96644592285156, -73.93893432617188, 108.65383911132812, 220.0447998046875, -22.97100830078125, 27.482295989990234, 47.543418884277344, -15.755584716796875, -0.6821632385253906, -21.542144775390625, 169.387939453125, 125.62567138671875, 33.40400695800781, -5.0498504638671875, -7.80712890625, 139.36093139648438, 175.73394775390625, 105.53294372558594, -39.18878173828125, 86.50448608398438, 6.3956298828125, 84.79632568359375, -56.2958984375, 152.8507080078125, -22.658111572265625, -102.46727752685547, 65.65219116210938, 
88.23028564453125, -24.293304443359375, -0.1599578857421875, 3.58062744140625, 39.96995544433594, 138.5306396484375, 97.0323486328125, -24.918243408203125, 59.736480712890625, 63.530242919921875, 139.63946533203125, -149.06781005859375, 183.57400512695312, 45.364013671875, 123.87994384765625, 21.343307495117188, 27.812042236328125, 163.36761474609375, 76.39468383789062, -18.588287353515625, 146.97653198242188, 117.47200012207031, 133.54592895507812, 50.560218811035156, 128.9754180908203, 111.03396606445312, -90.94398498535156, 194.99749755859375, 20.195846557617188, 167.63868713378906, 167.25680541992188, -5.990318298339844, 3.140045166015625, 24.63494873046875, 124.85519409179688, -1.5832672119140625, 137.46533203125, 116.92707824707031, -10.792579650878906, 193.92254638671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000415.npy"}
{"epoch": 0.8691099476439791, "step": 416, "batch_size": 128, "mean": 56.2318115234375, "std": 78.29242706298828, "min": -171.94015502929688, "p10": -24.37852630615234, "median": 43.417686462402344, "p90": 155.3370834350586, "max": 206.07183837890625, "pos_frac": 0.7421875, "sample": [146.98095703125, 185.078125, 160.71957397460938, 24.917236328125, 3.1617584228515625, 2.5180435180664062, -26.946319580078125, 14.051727294921875, -13.3331298828125, 116.01129150390625, -36.38812255859375, 206.07183837890625, 60.469482421875, 39.813507080078125, 22.628997802734375, 12.991111755371094, -13.006439208984375, 167.98406982421875, 193.3291015625, 83.35101318359375, 115.86323547363281, 135.01992797851562, 29.34576416015625, 155.32167053222656, -15.3311767578125, 35.56256103515625, -44.12324523925781, 123.042236328125, -81.41635131835938, 26.83819580078125, 95.05801391601562, 139.19265747070312, 131.72329711914062, -4.380035400390625, -7.2966156005859375, 157.05902099609375, -120.43217468261719, 141.763427734375, 19.316368103027344, 20.06292724609375, 22.26861572265625, 35.51191711425781, -19.911773681640625, -171.94015502929688, -18.662227630615234, 148.0916748046875, 25.955734252929688, 110.54728698730469, 119.16888427734375, 155.373046875, 44.69244384765625, 155.6488494873047, 14.1689453125, 24.870445251464844, -88.21603393554688, 138.63320922851562, 149.69915771484375, 109.02049255371094, -7.0675048828125, -12.063812255859375, 113.19598388671875, 39.19439697265625, -119.71903228759766, -24.1375732421875, 178.6495361328125, 94.38729858398438, -2.1642074584960938, 119.19888305664062, -19.935821533203125, 122.90846252441406, 0.8561859130859375, -24.940750122070312, 47.95513916015625, 125.49921417236328, 188.7005615234375, 121.0062255859375, 96.356689453125, 136.86631774902344, -9.575302124023438, 13.871421813964844, 83.9962158203125, 25.14599609375, 146.07269287109375, -23.870052337646484, -37.2841796875, 90.62257385253906, 49.243927001953125, 84.50018310546875, 
-12.127799987792969, 24.309921264648438, 39.205787658691406, 124.34443664550781, 57.974647521972656, 94.99609375, -142.2264404296875, 119.63082885742188, 57.4132080078125, 1.7776031494140625, 32.245819091796875, 144.15455627441406, -61.083831787109375, 113.84059143066406, 46.042205810546875, 11.449493408203125, 56.137481689453125, -13.175872802734375, 169.55877685546875, 102.25164031982422, 187.27816772460938, -15.489883422851562, 3.9583206176757812, 183.23095703125, 47.27105712890625, 21.682373046875, -18.1614990234375, 137.645751953125, 131.67367553710938, 68.71917724609375, 42.14292907714844, -9.356571197509766, 31.690704345703125, -61.301513671875, 150.40982055664062, -16.20218276977539, 126.22055053710938, 12.792800903320312, 109.76065063476562, 142.0038604736328], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000416.npy"}
{"epoch": 0.8712041884816754, "step": 417, "batch_size": 128, "mean": 47.32077407836914, "std": 80.1475601196289, "min": -149.22073364257812, "p10": -42.0138946533203, "median": 36.69294738769531, "p90": 152.08047180175782, "max": 201.59701538085938, "pos_frac": 0.71875, "sample": [-0.17478179931640625, 41.52610778808594, 35.38648986816406, 80.20222473144531, -26.929534912109375, 57.95355224609375, 26.409210205078125, -48.860107421875, 165.0731201171875, 166.62060546875, 10.455154418945312, 44.70404052734375, 5.1551971435546875, 112.1678466796875, -128.13534545898438, 7.7980499267578125, 9.748626708984375, -8.165451049804688, -10.62640380859375, 136.18911743164062, 147.56436157226562, -25.29156494140625, 163.31756591796875, -83.38835144042969, -23.116439819335938, -126.51378631591797, 77.6866455078125, 5.229713439941406, -36.87791442871094, 179.9659423828125, 50.966285705566406, -8.123626708984375, 1.4705810546875, 87.65204620361328, 98.46672058105469, 152.20294189453125, 19.659881591796875, -102.16583251953125, 123.04768371582031, 115.03175354003906, 142.34735107421875, 21.709381103515625, 94.5455322265625, -7.70458984375, 120.45770263671875, 109.52716064453125, 15.038238525390625, 80.58197021484375, 99.21583557128906, 70.11402893066406, 22.494417190551758, -0.79901123046875, -4.5150909423828125, -74.75251770019531, -70.44155883789062, 36.440216064453125, -3.0189208984375, 141.06039428710938, 61.939117431640625, -10.896484375, 21.97576904296875, 14.4034423828125, 3.211057662963867, 14.59423828125, 152.91873168945312, 1.7166900634765625, 140.18246459960938, -28.531707763671875, -149.22073364257812, 3.0901641845703125, -23.93603515625, 152.02798461914062, 104.6533203125, 36.9456787109375, 53.46697998046875, -79.21649169921875, 74.68794250488281, 148.66162109375, 0.0, -1.432159423828125, 70.51947021484375, 0.0, 49.70721435546875, 129.942138671875, 99.4090576171875, 107.65721130371094, 41.33482360839844, 4.5251312255859375, 118.45841979980469, -19.099517822265625, 
-71.02322387695312, 150.52728271484375, 135.7034454345703, 139.973388671875, 151.74310302734375, 23.06829833984375, -135.69931030273438, 81.80055236816406, 195.45892333984375, 21.028182983398438, 131.67474365234375, -110.2614517211914, -39.079803466796875, 64.3775634765625, 136.75369262695312, 153.5771942138672, 168.20217895507812, -35.37542724609375, -109.29524230957031, 14.875, 8.03631591796875, -5.1299591064453125, 11.932615280151367, 146.69686889648438, 164.93948364257812, 143.21014404296875, -24.598236083984375, 45.74951171875, 156.79751586914062, 6.807872772216797, 44.459930419921875, 33.38519287109375, 46.08043670654297, 10.371109008789062, 133.79217529296875, 103.86288452148438, 201.59701538085938, 181.7606201171875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000417.npy"}
|
||||
{"epoch": 0.8732984293193717, "step": 418, "batch_size": 128, "mean": 41.19898223876953, "std": 76.05888366699219, "min": -159.27426147460938, "p10": -50.97838745117187, "median": 26.617828369140625, "p90": 142.10702972412108, "max": 213.16775512695312, "pos_frac": 0.7265625, "sample": [-32.709930419921875, 132.56854248046875, 36.301300048828125, 46.102783203125, -62.43817138671875, -105.36180114746094, 0.4561767578125, 166.23956298828125, -39.94049072265625, -159.27426147460938, 40.755340576171875, 95.18402862548828, 29.788421630859375, -86.48258972167969, 56.368316650390625, 92.30833435058594, 98.20629119873047, -89.73916625976562, 100.78268432617188, 12.417327880859375, 140.53701782226562, 98.88119506835938, -17.0074462890625, -1.4629440307617188, -1.0963134765625, 12.5499267578125, -18.875503540039062, 118.79571533203125, 51.0953369140625, 167.8980712890625, 130.28897094726562, 41.01043701171875, 141.7788543701172, 108.14494323730469, 17.940536499023438, 79.28619384765625, -23.74273681640625, 123.62637329101562, -48.37713623046875, -6.7852630615234375, 0.646484375, 146.16604614257812, 16.3736572265625, 93.6063232421875, 7.229217529296875, 20.60882568359375, -4.351594924926758, 46.56690979003906, 27.291107177734375, -18.323379516601562, 1.651123046875, 149.05096435546875, 109.11802673339844, 25.944549560546875, 21.743682861328125, -9.9931640625, 7.9454345703125, 25.667022705078125, 63.779815673828125, 71.618408203125, -73.50115966796875, 120.28726196289062, 140.8671112060547, 30.695846557617188, 31.347900390625, 9.0999755859375, 113.49065399169922, -73.58271026611328, -119.802001953125, 5.867378234863281, -71.5924072265625, 9.038871765136719, 116.73716735839844, 19.62646484375, 55.348724365234375, 163.76710510253906, 137.6800537109375, -13.225799560546875, 82.95587158203125, -44.252593994140625, 43.498748779296875, 128.88751220703125, -9.339691162109375, 46.37353515625, 85.4423828125, 142.87277221679688, 133.69619750976562, -42.56073760986328, 3.114013671875, 
-77.35040283203125, 13.575469970703125, -31.097259521484375, -19.8021240234375, 113.35302734375, 129.6064453125, 14.262367248535156, 154.76260375976562, 89.18901062011719, 1.3538742065429688, 137.84774780273438, -23.37005615234375, 0.7265090942382812, 10.464424133300781, 17.95867919921875, -95.07281494140625, 35.97137451171875, 55.175048828125, 124.3797378540039, -137.5057373046875, 146.6751708984375, 23.649394989013672, 16.95721435546875, 49.82554626464844, -2.028257369995117, 138.58724975585938, 60.539085388183594, 213.16775512695312, 147.26052856445312, 9.543525695800781, 200.58425903320312, 54.679229736328125, -24.264915466308594, -8.83685302734375, 18.325439453125, -57.0479736328125, 145.50473022460938, 192.04873657226562, 12.675094604492188], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000418.npy"}
|
||||
{"epoch": 0.875392670157068, "step": 419, "batch_size": 128, "mean": 57.8297119140625, "std": 85.26213073730469, "min": -199.5817413330078, "p10": -39.76154479980468, "median": 48.329532623291016, "p90": 157.65805358886718, "max": 205.3154296875, "pos_frac": 0.7734375, "sample": [41.3447265625, 14.512771606445312, 0.79522705078125, 156.604248046875, 25.477333068847656, 26.53106689453125, 6.981891632080078, 16.223617553710938, 63.1868896484375, -83.943359375, -70.42247009277344, 77.98419189453125, -114.461181640625, 144.4636993408203, 8.706672668457031, 27.268402099609375, 179.64202880859375, 10.861770629882812, 118.724365234375, 149.88888549804688, 150.7372283935547, 131.9500732421875, 161.5338134765625, 79.92018127441406, 164.2430419921875, 2.0167236328125, 44.85673522949219, 9.1365966796875, -27.16796875, -135.35165405273438, 146.55691528320312, 134.77252197265625, 120.43258666992188, -4.436531066894531, 64.4444580078125, -25.024459838867188, 82.676025390625, -0.11712837219238281, 135.5484619140625, -120.75567626953125, 149.87704467773438, 3.984771728515625, 146.02352905273438, 101.07229614257812, 201.72991943359375, 9.376350402832031, 188.9924774169922, -60.057373046875, 49.179443359375, 179.41445922851562, 48.40057373046875, 78.66189575195312, -25.03411865234375, 54.93852996826172, 42.413909912109375, 145.30999755859375, 183.24493408203125, 90.04348754882812, 7.93780517578125, -14.345489501953125, -96.17967224121094, 15.889068603515625, -32.947227478027344, 2.862895965576172, 205.3154296875, -104.41326904296875, 140.66270446777344, 159.71961975097656, 8.45013427734375, 36.0074462890625, -25.35417938232422, 129.63552856445312, 121.71652221679688, 20.862884521484375, 106.08708190917969, 0.0, -76.46981811523438, -43.32415771484375, 13.530426025390625, 0.0, 31.298110961914062, 122.27345275878906, 129.26425170898438, 155.09707641601562, -14.333938598632812, 137.41424560546875, 23.718719482421875, 157.322265625, 82.03286743164062, 19.964096069335938, 
156.90602111816406, 36.90435791015625, 159.3094482421875, 86.012451171875, 0.718048095703125, 13.281982421875, 115.15321350097656, 158.44155883789062, 137.53004455566406, 155.7174072265625, 169.12493896484375, 12.48291015625, 203.1072998046875, 18.20672607421875, 147.79891967773438, -38.234710693359375, 131.62562561035156, 134.71568298339844, 58.07440185546875, 126.25711059570312, -36.63787841796875, -62.58905029296875, 154.29989624023438, 133.48301696777344, -199.5817413330078, 73.11833190917969, -131.217529296875, -14.920166015625, 132.85728454589844, -25.86175537109375, 91.82666015625, 137.46737670898438, -12.33013916015625, 20.771728515625, 12.233840942382812, 48.25849151611328, 144.94155883789062, 29.340423583984375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000419.npy"}
|
||||
{"epoch": 0.8774869109947644, "step": 420, "batch_size": 128, "mean": 55.19243621826172, "std": 79.1772689819336, "min": -171.68385314941406, "p10": -36.065286254882814, "median": 36.10631561279297, "p90": 153.60947875976564, "max": 204.9180908203125, "pos_frac": 0.75, "sample": [-47.662689208984375, 29.080535888671875, 28.268203735351562, -155.87600708007812, -22.71759033203125, -14.616600036621094, 16.018157958984375, 143.99221801757812, -40.20782470703125, 80.2237548828125, 151.6103515625, 7.53399658203125, -70.18374633789062, 114.549560546875, 114.10325622558594, 10.778732299804688, 131.55099487304688, -100.49610137939453, 124.56097412109375, 103.42414093017578, 31.500274658203125, 64.82427215576172, -5.0941162109375, 38.4681396484375, 5.11181640625, 153.82925415039062, 15.09771728515625, 24.581825256347656, 108.27002716064453, 26.649749755859375, -1.5933990478515625, 186.7125244140625, 37.21092224121094, 132.4151611328125, 11.725067138671875, -6.1926422119140625, 114.6502456665039, 122.44256591796875, 170.86572265625, 12.621917724609375, 172.50152587890625, 84.00794982910156, 24.06414031982422, 134.15101623535156, 16.3953857421875, 32.5704345703125, 78.62013244628906, -4.099945068359375, -0.312042236328125, 88.08473205566406, -9.90240478515625, 97.14892578125, 128.2934112548828, 146.57720947265625, 111.49711608886719, 120.02816772460938, 0.0, 34.50532531738281, -37.216949462890625, -119.71980285644531, 120.5748519897461, 119.80095672607422, 3.8623046875, 5.759178161621094, 14.840240478515625, 33.24397277832031, -2.003997802734375, 168.183837890625, 189.72015380859375, 126.46812438964844, 131.34039306640625, -88.1966552734375, 28.262939453125, 97.68263244628906, -35.57171630859375, 162.18014526367188, 103.08135986328125, 21.61407470703125, 149.91702270507812, 35.001708984375, -18.373260498046875, 12.270286560058594, 204.9180908203125, -53.58404541015625, 134.55410766601562, 53.428314208984375, -3.1243743896484375, -27.205867767333984, 147.79380798339844, 
154.48304748535156, 64.33468627929688, 22.069244384765625, 153.9884033203125, 153.51528930664062, 10.895843505859375, 14.478851318359375, 86.76783752441406, 58.034881591796875, 184.3856201171875, -6.537841796875, 172.34152221679688, 106.05572509765625, -171.68385314941406, -3.829620361328125, 33.642791748046875, 140.54998779296875, 112.68109130859375, 11.83905029296875, -66.52188110351562, -70.44407653808594, 92.93789672851562, 114.05108642578125, 95.69586181640625, 145.30459594726562, 135.52841186523438, 19.975738525390625, 105.80319213867188, -149.0894775390625, 19.21209716796875, 157.78671264648438, 120.94671630859375, 121.87664794921875, -35.539459228515625, -9.125202178955078, 4.1887359619140625, 109.20352172851562, -1.7206344604492188, 74.88442993164062], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000420.npy"}
|
||||
{"epoch": 0.8795811518324608, "step": 421, "batch_size": 128, "mean": 58.47954177856445, "std": 73.33659362792969, "min": -131.40078735351562, "p10": -23.22535858154297, "median": 51.06916046142578, "p90": 149.93584289550782, "max": 184.23464965820312, "pos_frac": 0.75, "sample": [180.27093505859375, 76.98690795898438, 107.21623992919922, 45.396484375, 118.16357421875, 80.73014831542969, 118.52618408203125, 10.878143310546875, -131.40078735351562, -5.509529113769531, 132.5469207763672, -17.979736328125, 4.60516357421875, 106.08882141113281, 31.279651641845703, 107.82767486572266, 47.328857421875, -121.16696166992188, 31.82598876953125, 103.68446350097656, -30.345619201660156, -23.431671142578125, 42.04618835449219, 14.758354187011719, 114.41655731201172, 182.60775756835938, 133.31753540039062, 41.74322509765625, 9.63437271118164, 79.57050323486328, 0.0, -85.362548828125, 78.4727783203125, 34.97856140136719, 11.425857543945312, 62.11894226074219, 150.63458251953125, 142.5125732421875, 30.189620971679688, 139.4046630859375, 163.8558349609375, 133.39401245117188, 43.9864501953125, 15.549537658691406, -5.277523040771484, 18.403671264648438, 127.2984848022461, -23.409576416015625, 169.40371704101562, 117.31788635253906, 119.19438171386719, -7.4080810546875, 93.45564270019531, 80.39239501953125, 10.0787353515625, 156.915283203125, -103.9594955444336, 141.80465698242188, 13.007171630859375, 90.61466979980469, 33.5850830078125, -12.291511535644531, 182.7293701171875, 167.76092529296875, 139.11044311523438, 59.4306640625, 123.09034729003906, 142.37327575683594, 142.2682342529297, 132.4486846923828, 136.24415588378906, 149.63638305664062, 82.1561279296875, 138.69985961914062, 64.34693908691406, 47.34355163574219, -20.554107666015625, 37.507293701171875, 142.90257263183594, 12.752388000488281, 98.01882934570312, 3.2922821044921875, 42.8914794921875, 141.45590209960938, -5.92041015625, 96.39736938476562, 33.50544738769531, -22.291748046875, 130.149658203125, 
-12.595024108886719, 111.4053726196289, 96.26570129394531, 177.63650512695312, 79.10546875, -36.30815505981445, 54.794769287109375, 84.21160888671875, -77.01612854003906, -119.27267456054688, 73.58985900878906, -35.52960205078125, -21.450210571289062, 46.634124755859375, -5.213592529296875, 23.521240234375, 154.572265625, 11.711944580078125, -7.6309814453125, 103.0646743774414, -64.68185424804688, -26.103424072265625, 144.38589477539062, 35.705596923828125, 123.8375244140625, -23.146408081054688, 150.94406127929688, 88.00811004638672, 0.0, -18.22149658203125, -17.85012435913086, 184.23464965820312, 77.3828125, 26.448287963867188, 0.0, 3.001068115234375, -5.17034912109375, 171.84393310546875, 3.61737060546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000421.npy"}
|
||||
{"epoch": 0.881675392670157, "step": 422, "batch_size": 128, "mean": 41.4867057800293, "std": 82.40203857421875, "min": -168.04046630859375, "p10": -63.129750823974604, "median": 35.196136474609375, "p90": 149.89031219482422, "max": 193.0350341796875, "pos_frac": 0.71875, "sample": [17.168182373046875, -16.57190704345703, 141.6388397216797, -25.665267944335938, 102.23501586914062, -25.74848175048828, 139.54531860351562, -132.3772735595703, 29.26092529296875, 0.180938720703125, 103.42544555664062, 4.64044189453125, 51.28924560546875, 138.88339233398438, 110.91375732421875, -168.04046630859375, 8.89483642578125, 12.88458251953125, -22.597274780273438, 111.06354522705078, 102.20415496826172, 45.063507080078125, 94.44425964355469, 29.257705688476562, -45.350494384765625, 118.81076049804688, -125.15184020996094, 109.520751953125, 116.20765686035156, 162.7576904296875, 43.86302947998047, 4.755340576171875, -17.122817993164062, 71.74752807617188, 8.269050598144531, 119.56350708007812, 1.8066864013671875, -104.14802551269531, 167.0490264892578, 6.7167205810546875, 10.837265014648438, 147.68588256835938, -57.77386474609375, 52.161346435546875, 27.64801025390625, -151.07666015625, -11.449859619140625, 193.0350341796875, 64.25320434570312, 138.30404663085938, 48.52972412109375, 161.18377685546875, 108.76925659179688, 15.579862594604492, 84.94497680664062, 6.891754150390625, 35.37774658203125, 15.59678840637207, 21.166259765625, 62.445343017578125, 123.07522583007812, -66.15032196044922, -33.900474548339844, -131.11767578125, 162.23245239257812, 6.345794677734375, 105.1974105834961, 10.238052368164062, 2.89630126953125, 164.58602905273438, 99.99435424804688, -158.20208740234375, 146.71559143066406, -23.348403930664062, 61.86834716796875, -26.906856536865234, 131.12574768066406, -61.83522033691406, 158.40194702148438, 9.214439392089844, 164.07568359375, -3.0934829711914062, 116.63229370117188, 43.18048095703125, 163.78216552734375, 50.732147216796875, -74.14654541015625, 
7.9566650390625, 149.22927856445312, 0.6547355651855469, -14.44607925415039, -19.2808837890625, 51.7108154296875, -21.167724609375, 119.12129211425781, 79.08120727539062, -36.94940185546875, -1.31134033203125, 53.8125, 104.21853637695312, 114.70292663574219, 15.87261962890625, 176.17901611328125, 145.92962646484375, -97.33984375, 126.79425048828125, 47.637420654296875, -105.27597045898438, -38.0242919921875, 94.47503662109375, 0.0, -93.2735595703125, 151.43272399902344, 127.48161315917969, -91.787353515625, 35.0145263671875, 23.18560791015625, 119.97191619873047, 35.513893127441406, -1.4886627197265625, -25.976852416992188, 76.10723114013672, 3.7960281372070312, 154.00648498535156, 41.29981994628906, 21.66759490966797, -0.9743156433105469, 173.75015258789062], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000422.npy"}
|
||||
{"epoch": 0.8837696335078534, "step": 423, "batch_size": 128, "mean": 58.97514724731445, "std": 80.92790222167969, "min": -141.4320068359375, "p10": -34.634387207031246, "median": 52.747276306152344, "p90": 160.97716217041014, "max": 187.41131591796875, "pos_frac": 0.75, "sample": [0.72900390625, 160.6542510986328, -17.129837036132812, 14.09033203125, 145.5723876953125, 129.26528930664062, 26.358478546142578, -13.914306640625, 2.7275047302246094, 130.70767211914062, 77.0245361328125, 126.42549133300781, 25.726226806640625, 120.81196594238281, -101.01760864257812, -12.088531494140625, 140.62127685546875, -53.093109130859375, 154.4241943359375, 10.610153198242188, 6.584957122802734, -33.919342041015625, 20.911148071289062, 122.6586685180664, -130.1854248046875, 103.81320190429688, 113.71778869628906, 15.053390502929688, 139.24392700195312, 166.96206665039062, 4.5012969970703125, 23.0145263671875, 61.949371337890625, 65.63749694824219, 92.96635437011719, 119.78436279296875, 149.48907470703125, -0.162994384765625, -18.196014404296875, 75.56570434570312, 92.68951416015625, 155.97149658203125, -11.796478271484375, 136.007080078125, -1.92364501953125, 26.73895263671875, 8.629119873046875, 175.67755126953125, -32.53350830078125, -33.387420654296875, 163.05224609375, -40.31611633300781, -53.937591552734375, 50.9759521484375, 155.3211669921875, -36.302825927734375, -76.48779296875, -98.3034896850586, 163.1528778076172, 151.27822875976562, 30.330047607421875, -5.7725830078125, 129.45681762695312, 15.897430419921875, -10.347633361816406, 109.2208251953125, 15.093023300170898, 142.00204467773438, 7.528511047363281, 36.145263671875, 128.39041137695312, 141.38397216796875, 108.42501831054688, 120.35879516601562, 35.534332275390625, 58.308868408203125, -42.190643310546875, 151.56832885742188, 185.61074829101562, 49.745697021484375, 16.261260986328125, 153.81491088867188, 14.312576293945312, 145.23336791992188, 175.71456909179688, 54.51860046386719, 99.11178588867188, 
21.211883544921875, 3.7730255126953125, 169.24276733398438, 128.03721618652344, 129.14747619628906, 164.5845947265625, -1.460205078125, 181.4154052734375, 120.97366333007812, 3.202047348022461, -19.833251953125, 161.73062133789062, -7.43792724609375, 5.2487640380859375, 107.228515625, 32.706146240234375, 168.83380126953125, 187.41131591796875, -31.85797119140625, -9.171188354492188, 116.79459381103516, 162.3543243408203, -18.7635498046875, -115.55549621582031, 31.82276153564453, -77.36593627929688, 56.92498779296875, 16.98764419555664, 150.69586181640625, 72.88510131835938, 130.28396606445312, 155.2508544921875, 34.003692626953125, 106.3402099609375, 123.27197265625, 142.9776611328125, -127.41741943359375, 26.398712158203125, 62.772254943847656, -3.454832077026367, -141.4320068359375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000423.npy"}
|
||||
{"epoch": 0.8858638743455497, "step": 424, "batch_size": 128, "mean": 55.35126876831055, "std": 72.85440063476562, "min": -144.1619873046875, "p10": -34.64798431396484, "median": 60.1602783203125, "p90": 146.22651062011718, "max": 197.243896484375, "pos_frac": 0.75, "sample": [85.77439880371094, 52.25901794433594, -0.553863525390625, -3.977294921875, 66.92424011230469, 15.40106201171875, -9.189125061035156, -4.58709716796875, 32.033775329589844, -3.263946533203125, 4.8561248779296875, -95.55252075195312, 197.243896484375, 73.99392700195312, -5.0195159912109375, 89.57894897460938, 11.903564453125, 169.5840301513672, 151.1038818359375, 129.52444458007812, -7.068599700927734, 6.440673828125, 171.53509521484375, 127.84649658203125, 72.24392700195312, 87.50145721435547, -0.3968544006347656, -39.59112548828125, 111.35018920898438, 110.93131256103516, 99.7459487915039, 170.02963256835938, 136.6212921142578, -82.96836853027344, 106.39312744140625, 119.9989013671875, 32.373565673828125, 127.7176513671875, 153.1854248046875, -78.51922607421875, 29.39581298828125, 136.49464416503906, 105.80978393554688, -144.1619873046875, 0.9808807373046875, 175.39373779296875, 159.44589233398438, 76.61215209960938, -128.9348602294922, 31.0533447265625, -19.941646575927734, -7.70135498046875, -5.227630615234375, 146.45147705078125, 60.915374755859375, 120.29464721679688, 16.289764404296875, 131.8681640625, 119.85867309570312, 59.405181884765625, 41.918853759765625, -10.4332275390625, 142.98486328125, 63.1036376953125, 170.84091186523438, 121.28206634521484, 61.407867431640625, 30.575042724609375, -58.24591064453125, -40.55419921875, -35.76141357421875, 66.56271362304688, 120.2536392211914, 173.8070068359375, 172.63739013671875, 146.13009643554688, 9.143081665039062, 75.33905792236328, 113.44395446777344, 69.872314453125, 44.428985595703125, 115.3489990234375, -1.1294174194335938, 1.6872024536132812, 143.79075622558594, 74.26439666748047, -30.871185302734375, 0.0, -34.91181945800781, 
34.01292419433594, 13.4239501953125, 123.97811126708984, 128.44195556640625, -8.16119384765625, 29.5994873046875, 135.0331573486328, 82.74859619140625, 129.706787109375, 133.4062957763672, 107.59351348876953, 13.503547668457031, -81.79693603515625, 107.0504150390625, 121.10501861572266, 100.24459838867188, 3.122631072998047, -12.031538009643555, 10.59149169921875, -42.6080322265625, 6.200571060180664, 112.26799011230469, -68.11653900146484, 101.21014404296875, 127.15736389160156, -33.956787109375, 2.1641921997070312, 15.031461715698242, 40.55491256713867, 26.365234375, 63.87457275390625, 26.249237060546875, 85.85130310058594, -34.534912109375, 13.151809692382812, 86.982177734375, 12.65960693359375, 8.91104507446289, 169.34811401367188], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000424.npy"}
|
||||
{"epoch": 0.8879581151832461, "step": 425, "batch_size": 128, "mean": 48.18128204345703, "std": 77.89530181884766, "min": -184.88784790039062, "p10": -28.79190139770507, "median": 27.290451049804688, "p90": 151.5662658691406, "max": 205.6890869140625, "pos_frac": 0.71875, "sample": [127.68597412109375, 132.80319213867188, 67.6827392578125, 53.36228942871094, -11.5709228515625, 15.371246337890625, 42.5374755859375, 119.86271667480469, 36.01457214355469, -19.60089111328125, 141.1268310546875, 25.26416015625, 143.02963256835938, 61.558441162109375, 119.65084838867188, 0.4090766906738281, 107.53434753417969, 133.28982543945312, 1.301279067993164, 21.183197021484375, 59.75006103515625, 122.77839660644531, 27.63922119140625, -3.2771987915039062, 136.98605346679688, 72.44510650634766, 3.71484375, 19.8812255859375, -79.64933776855469, -3.7147903442382812, 46.469573974609375, 0.65802001953125, -16.606369018554688, 24.071792602539062, 161.82919311523438, -15.24620246887207, 12.200164794921875, 169.76300048828125, 6.599761962890625, 18.21868133544922, -18.649749755859375, 171.336181640625, 122.13957977294922, 30.8095703125, -1.4732666015625, 9.82180404663086, -33.297088623046875, -1.965087890625, -129.59918212890625, 153.1807861328125, 75.07354736328125, 113.089599609375, -106.64434814453125, 47.943145751953125, 148.79779052734375, -2.5068206787109375, 130.3163299560547, 205.6890869140625, 38.40338134765625, 21.537933349609375, -33.304779052734375, 26.546722412109375, 8.06294059753418, 37.47767639160156, 162.39703369140625, -43.985626220703125, 55.916473388671875, -3.9654903411865234, 144.09596252441406, 120.72298431396484, 121.54507446289062, 9.350006103515625, 177.22503662109375, 6.635204315185547, -18.527618408203125, -8.358512878417969, -10.301261901855469, 26.941680908203125, -18.43804931640625, 88.37581634521484, -0.42292022705078125, 150.87432861328125, 24.51629638671875, 126.36920166015625, 158.637939453125, -92.22611999511719, 14.630683898925781, -54.58387756347656, 
28.826324462890625, -22.66168212890625, 16.7435302734375, -26.861106872558594, 15.2530517578125, -106.85237121582031, -42.34735107421875, -66.61526489257812, 98.578125, 12.501823425292969, 120.36801147460938, -111.154296875, 43.988128662109375, 193.18328857421875, 22.62030029296875, 139.5384063720703, 49.00341033935547, 93.02989196777344, 147.98797607421875, 0.887359619140625, 4.375251770019531, 128.55511474609375, 0.0, -17.410614013671875, 140.8173828125, 150.069580078125, 192.79244995117188, -4.108144760131836, 30.527286529541016, 142.90382385253906, -184.88784790039062, 43.00007629394531, 163.53546142578125, 158.34938049316406, 102.94296264648438, 143.1754150390625, 159.03192138671875, -26.520652770996094, -2.5248870849609375, 3.3450164794921875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000425.npy"}
|
||||
{"epoch": 0.8900523560209425, "step": 426, "batch_size": 128, "mean": 51.855613708496094, "std": 72.21925354003906, "min": -127.79824829101562, "p10": -22.6402193069458, "median": 28.227514266967773, "p90": 147.65513916015624, "max": 180.7747802734375, "pos_frac": 0.78125, "sample": [128.18251037597656, -1.9997711181640625, 1.554203987121582, -34.918975830078125, 136.7933349609375, 21.337081909179688, 107.91204833984375, 2.5556411743164062, -104.7979736328125, 1.1014938354492188, -112.72463989257812, -44.8880615234375, -47.16331481933594, 112.66354370117188, 135.79214477539062, 146.24758911132812, 145.60272216796875, 24.793182373046875, 125.07305908203125, 28.86370849609375, -45.94464111328125, 63.46575927734375, 163.34555053710938, 17.68609619140625, 3.58538818359375, 22.581695556640625, 152.5897979736328, 157.93304443359375, 130.42550659179688, 161.39300537109375, 102.7570571899414, 3.228801727294922, 137.48812866210938, 115.91339874267578, 135.87384033203125, 10.068845748901367, 2.4370346069335938, 22.143814086914062, 150.47607421875, 160.67596435546875, 39.13433837890625, 160.61390686035156, 6.801322937011719, -6.183319091796875, 141.71580505371094, 141.38284301757812, -1.18994140625, 103.67643737792969, 17.19244384765625, 6.3209228515625, -87.17778015136719, 18.447967529296875, 23.920425415039062, 0.600677490234375, 27.591320037841797, 141.27145385742188, -69.79116821289062, 72.54722595214844, 143.7987060546875, 109.24821472167969, 101.15335083007812, 119.08602905273438, -32.43988037109375, -2.5033226013183594, 0.0854644775390625, 83.0146484375, 46.64707946777344, 114.48291015625, 3.694305419921875, -7.5050048828125, 64.3931884765625, 25.358291625976562, -12.917678833007812, 11.25335693359375, 0.2939796447753906, 180.75921630859375, -24.6962890625, 3.2222442626953125, 56.15760803222656, 120.96686553955078, -19.504837036132812, 135.51898193359375, 71.291748046875, 69.20252990722656, 49.58830261230469, 29.30999755859375, 26.5419921875, 9.393074035644531, 
13.502655029296875, 125.42935180664062, 123.95947265625, 19.26593017578125, 1.3350086212158203, -14.065597534179688, 164.64837646484375, 37.66261291503906, 3.3931884765625, 158.01467895507812, -10.25164794921875, -21.75904655456543, 79.25562286376953, 6.30718994140625, 0.0, 10.0067138671875, -10.914276123046875, 174.43539428710938, 69.57901000976562, 6.194084167480469, 8.8824462890625, 62.949676513671875, 115.55154418945312, 35.1324462890625, -95.18035888671875, -1.70379638671875, -6.501518249511719, 120.97844696044922, -35.2113037109375, 146.4461669921875, 133.30982971191406, 115.15692138671875, 115.84516143798828, 3.8201675415039062, -127.79824829101562, 95.73202514648438, 180.7747802734375, 36.24713134765625, -13.1424560546875, 158.35731506347656], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000426.npy"}
|
||||
{"epoch": 0.8921465968586387, "step": 427, "batch_size": 128, "mean": 69.73506164550781, "std": 74.59123229980469, "min": -141.343017578125, "p10": -17.77030334472656, "median": 72.3101806640625, "p90": 163.574755859375, "max": 260.96771240234375, "pos_frac": 0.796875, "sample": [18.17407989501953, 164.3974609375, 22.992904663085938, -27.19384765625, -42.648475646972656, 11.324264526367188, 111.57484436035156, 71.370849609375, 106.5338134765625, 37.62217712402344, 104.35400390625, 76.03160095214844, 166.92562866210938, 30.788902282714844, 137.61639404296875, 151.02218627929688, 134.81385803222656, 7.32989501953125, 124.47848510742188, 173.21392822265625, 172.64901733398438, 53.35694885253906, -23.340713500976562, 101.02264404296875, 136.75537109375, 124.6314697265625, 120.539306640625, 95.28179931640625, 40.3191032409668, 53.290794372558594, 184.95516967773438, 146.75405883789062, 86.1209716796875, 36.733062744140625, 2.424896240234375, -1.025125503540039, 109.75350952148438, 11.867015838623047, -4.338470458984375, -33.85198974609375, 156.04490661621094, 23.842010498046875, -94.44401550292969, 70.96394348144531, -105.43246459960938, 124.20069885253906, 0.8627986907958984, 165.7291717529297, -16.751510620117188, 129.03329467773438, 163.22216796875, 25.89190673828125, 132.619140625, 77.18115234375, 178.843994140625, 158.498046875, 190.70816040039062, -9.279922485351562, 27.054412841796875, -11.077857971191406, 118.93766784667969, 89.05099487304688, 172.24050903320312, 16.3724365234375, 73.24951171875, 53.782958984375, -16.764862060546875, 136.44400024414062, 134.76910400390625, -8.562286376953125, -141.343017578125, 153.9176025390625, 105.62049102783203, -20.1163330078125, 116.115966796875, 19.81511878967285, -24.499496459960938, 131.8823699951172, 107.0716552734375, 25.443313598632812, 176.00408935546875, -87.2308349609375, 117.38542938232422, -38.63311767578125, 27.767379760742188, 78.39346313476562, 127.22969055175781, 67.3779296875, 74.21505737304688, 
109.23101806640625, 111.89407348632812, 156.569091796875, -28.97509765625, 36.26800537109375, 143.2977294921875, 117.61878204345703, 164.64501953125, 54.57568359375, 8.337181091308594, 106.7198715209961, -15.86395263671875, 15.0330810546875, 39.58055114746094, -27.361709594726562, 8.020744323730469, 260.96771240234375, -8.05743408203125, 119.48213195800781, 3.3252716064453125, 148.0108642578125, 161.086669921875, 45.48158264160156, 113.07559204101562, -0.681915283203125, 44.101043701171875, -6.2068939208984375, 134.08123779296875, 71.1617431640625, 2.574737548828125, 1.989166259765625, 152.66390991210938, 26.860992431640625, -0.3971710205078125, -12.503990173339844, 28.289520263671875, 141.95233154296875, 181.38600158691406, 149.5863800048828], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000427.npy"}
|
||||
{"epoch": 0.8942408376963351, "step": 428, "batch_size": 128, "mean": 66.0711441040039, "std": 73.24652862548828, "min": -128.34176635742188, "p10": -14.759718132019042, "median": 62.802276611328125, "p90": 153.0246841430664, "max": 241.46286010742188, "pos_frac": 0.796875, "sample": [34.693641662597656, -8.874908447265625, -101.20892333984375, 149.32525634765625, 162.4927978515625, -16.272125244140625, 55.144805908203125, 53.939697265625, -80.69869995117188, 15.109909057617188, 241.46286010742188, 21.6785888671875, 68.71136474609375, 99.50347900390625, -7.16339111328125, 149.0858154296875, 26.92584228515625, -8.22509765625, 14.79661750793457, -4.717132568359375, 159.03036499023438, -0.40252685546875, -61.360260009765625, 178.58364868164062, 131.18072509765625, 164.422119140625, 120.95169067382812, 114.6515121459961, 88.93486022949219, 205.544921875, 27.36981201171875, 159.90029907226562, 2.13555908203125, 70.28363800048828, -18.12835693359375, 14.099113464355469, -1.654296875, 117.52285766601562, 34.01666259765625, 119.98248291015625, 66.36431884765625, -10.615772247314453, -128.34176635742188, 127.200927734375, 142.90225219726562, 185.52774047851562, -30.396713256835938, 46.34698486328125, 1.7248382568359375, 13.663467407226562, 134.39768981933594, 132.29733276367188, 105.39459228515625, 92.96025085449219, 26.984954833984375, 113.99813842773438, 135.59591674804688, 85.50439453125, -40.48431396484375, 197.22854614257812, -111.21673583984375, 177.13125610351562, 132.00628662109375, 48.78997802734375, 124.832275390625, 7.3757476806640625, 120.76885986328125, 133.60107421875, 90.50802612304688, 49.968994140625, 21.525894165039062, 29.931060791015625, 140.03884887695312, 136.428466796875, 190.29666137695312, 17.148284912109375, 46.033721923828125, 4.010772705078125, 74.03520202636719, 22.499008178710938, 48.68451690673828, 20.325439453125, 128.58554077148438, 67.78057861328125, -3.592071533203125, 9.597412109375, -18.623687744140625, 10.16241455078125, 
76.2349853515625, 99.18537902832031, 222.03460693359375, 151.4614715576172, 149.60427856445312, -0.23516845703125, 87.5928955078125, -73.78515625, -24.508209228515625, 156.67218017578125, 114.70501708984375, 0.0, 23.67999267578125, 146.71961975097656, 130.5558319091797, 19.692630767822266, 122.88128662109375, 18.086097717285156, -12.54888916015625, 94.26953125, 106.24652099609375, 68.9617919921875, 59.240234375, 124.4527587890625, 17.783287048339844, 53.658103942871094, 143.4564666748047, 68.23223876953125, 3.1673583984375, 50.746337890625, 108.236572265625, -3.351593017578125, 43.96295166015625, 32.19927978515625, 144.967529296875, -15.811487197875977, -14.3089599609375, 130.52972412109375, 115.26513671875, 107.51103973388672], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000428.npy"}
|
||||
{"epoch": 0.8963350785340314, "step": 429, "batch_size": 128, "mean": 51.696651458740234, "std": 89.32054901123047, "min": -171.3199005126953, "p10": -51.51214294433593, "median": 35.390926361083984, "p90": 163.93328857421875, "max": 195.3974609375, "pos_frac": 0.6796875, "sample": [137.7417755126953, -158.20785522460938, -5.61639404296875, 111.58468627929688, 148.31044006347656, 143.35092163085938, 34.146080017089844, 30.003982543945312, -76.09529113769531, 110.83590698242188, 6.486976623535156, -114.56398010253906, 6.01385498046875, 141.1998291015625, 163.906005859375, 142.0922088623047, 3.7989463806152344, 142.57958984375, -27.836639404296875, 24.146484375, 104.66854858398438, 121.28133392333984, 9.759931564331055, 145.17254638671875, -37.606597900390625, 153.05096435546875, 100.59286499023438, 176.2845916748047, 20.8902587890625, 81.6695556640625, -109.95974731445312, 122.537841796875, 125.1761474609375, 145.4761962890625, 26.201705932617188, 25.905120849609375, -110.0576171875, -41.12811279296875, 146.84515380859375, 195.3974609375, 136.0723876953125, 138.09117126464844, -56.72792053222656, -2.8748321533203125, 82.10009765625, -4.0508880615234375, 184.2589111328125, 91.8377685546875, 6.8506622314453125, -37.80029296875, 5.145551681518555, 169.939453125, 177.82241821289062, 150.20303344726562, 14.99725341796875, 64.70455932617188, 186.002685546875, -87.07644653320312, -115.54156494140625, 138.92221069335938, -1.598388671875, 51.975860595703125, 147.7213134765625, 165.23171997070312, -32.21942138671875, 39.19023895263672, 52.2889404296875, 36.635772705078125, 22.85247039794922, 65.42935180664062, -20.57037353515625, -67.92610168457031, -34.42071533203125, -46.25413513183594, 134.147705078125, -171.3199005126953, 156.01373291015625, -16.54900360107422, 18.65582275390625, 151.51231384277344, -4.544506072998047, 0.0, 146.45252990722656, 182.98153686523438, 137.44683837890625, 68.14779663085938, 151.55636596679688, 3.3142318725585938, -1.411590576171875, 
5.071178436279297, 180.72109985351562, -30.9356689453125, 115.51634216308594, -12.226829528808594, 11.033126831054688, 50.803985595703125, 21.09225082397461, 4.641654968261719, -5.4068603515625, 109.98150634765625, -64.17245483398438, -17.16302490234375, -40.251129150390625, 19.497329711914062, -37.20953369140625, 109.00039672851562, 110.58551025390625, 163.9969482421875, -139.39688110351562, 113.03019714355469, 169.40426635742188, 135.28396606445312, -49.27680969238281, 148.72654724121094, 171.73687744140625, -14.581268310546875, 10.718704223632812, 3.3555221557617188, 163.6180419921875, 164.75738525390625, -34.171600341796875, 125.51171875, -15.644943237304688, -30.855819702148438, 88.93301391601562, -5.73858642578125, 101.05120849609375, -127.51837158203125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000429.npy"}
{"epoch": 0.8984293193717278, "step": 430, "batch_size": 128, "mean": 46.47850036621094, "std": 80.79393005371094, "min": -141.81619262695312, "p10": -49.869773864746094, "median": 32.42169952392578, "p90": 159.577978515625, "max": 203.91326904296875, "pos_frac": 0.71875, "sample": [-60.007843017578125, -23.480056762695312, -115.28546142578125, 5.450836181640625, 110.66567993164062, 114.33526611328125, -111.92677307128906, 75.64764404296875, -11.1953125, 11.756507873535156, 123.60723876953125, 13.083599090576172, 34.35350036621094, 11.548469543457031, 65.53994750976562, -141.81619262695312, 163.75564575195312, 20.8294677734375, 2.518310546875, 132.63998413085938, 99.83444213867188, -81.96908569335938, -22.370590209960938, 119.09994506835938, 18.03875732421875, 34.908935546875, -41.1875, 0.0, 159.5184326171875, 73.71728515625, 79.8875961303711, -20.40087890625, 129.04559326171875, 60.56907653808594, 17.783676147460938, 0.0, 57.8824462890625, -2.7408065795898438, 2.0396728515625, 113.31549835205078, -84.49456787109375, 166.69732666015625, -40.853271484375, 120.28488159179688, 169.13626098632812, 90.22012329101562, 40.7413330078125, 18.750686645507812, -22.783782958984375, 203.91326904296875, 82.44468688964844, -17.0186767578125, 115.85723876953125, 13.9744873046875, 145.42343139648438, 165.09544372558594, 18.663467407226562, -50.88710021972656, 144.27178955078125, -131.07894897460938, 133.0834503173828, 149.2776336669922, 160.23809814453125, 147.0, 72.61131286621094, 55.397491455078125, 137.90420532226562, 125.5810775756836, 10.739961624145508, 170.5556640625, 24.672119140625, 29.462448120117188, 184.45858764648438, 129.78521728515625, 1.9254302978515625, -29.97705078125, -3.7481689453125, 17.884490966796875, -3.3736953735351562, 78.16897583007812, 127.8115234375, 32.68450927734375, -49.43377685546875, 12.580921173095703, -23.524864196777344, 83.19696044921875, 51.01165771484375, 189.09149169921875, 0.0, 5.746482849121094, 28.705215454101562, -22.5772705078125, 
92.238037109375, 13.462295532226562, 172.37222290039062, 106.30170440673828, 29.8441162109375, 152.55303955078125, 7.112081527709961, -105.60026550292969, 176.43312072753906, 166.38839721679688, 50.19879150390625, -39.72083282470703, -91.54818725585938, 78.11448669433594, 132.21315002441406, 57.29405975341797, 159.7169189453125, 85.40447998046875, -97.97216796875, 12.214431762695312, 59.427825927734375, -13.258827209472656, 32.15888977050781, 141.72943115234375, 9.515274047851562, -2.189849853515625, 0.0580902099609375, -113.3436279296875, -11.66876220703125, -123.76632690429688, 65.21624755859375, 111.0057373046875, 106.69977569580078, 21.43096923828125, -23.418548583984375, 138.34048461914062], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000430.npy"}
{"epoch": 0.900523560209424, "step": 431, "batch_size": 128, "mean": 43.86272048950195, "std": 76.70436096191406, "min": -134.8040771484375, "p10": -42.54294738769531, "median": 32.36798095703125, "p90": 153.60946350097655, "max": 215.0548095703125, "pos_frac": 0.6875, "sample": [-10.3408203125, -134.8040771484375, 54.35906982421875, 131.0586395263672, 116.66806030273438, 75.90927124023438, 19.227569580078125, 75.29281616210938, 54.613494873046875, -71.54452514648438, -1.9150390625, 36.4771728515625, 9.150421142578125, 33.53131103515625, -41.640869140625, 63.88018798828125, -26.951202392578125, 110.08360290527344, 94.93061828613281, 17.350921630859375, -5.09857177734375, 112.32254028320312, 45.62609100341797, -95.97230529785156, 81.16067504882812, 130.38194274902344, 76.66648864746094, -51.70428466796875, 137.81546020507812, 182.32101440429688, 35.49946594238281, 66.05498504638672, 104.23849487304688, -106.37236785888672, -46.04341125488281, 124.7178955078125, 88.19290161132812, -13.61053466796875, -5.533210754394531, 97.68887329101562, -35.822811126708984, -125.4671630859375, 8.614313125610352, 11.7666015625, 119.18251037597656, 53.9779052734375, 164.33047485351562, -14.683673858642578, -34.43640899658203, -122.08698272705078, 10.819829940795898, 32.13958740234375, 1.9923591613769531, 130.53466796875, 1.495849609375, 133.12957763671875, 44.2760009765625, 170.7384033203125, 142.73538208007812, -16.85260009765625, 0.68499755859375, 32.59637451171875, 147.7402801513672, 27.64764404296875, 20.229095458984375, 95.0340576171875, 63.4739990234375, 157.26226806640625, -20.18828582763672, -18.332366943359375, 140.67657470703125, -17.1851806640625, -1.49774169921875, 165.93820190429688, -2.642852783203125, 202.21929931640625, 3.64544677734375, 183.4825439453125, 44.78709411621094, 33.17729187011719, 166.6448974609375, 215.0548095703125, 71.30120849609375, -47.2578125, -1.1780509948730469, -12.160064697265625, 124.746826171875, 30.935623168945312, -44.647796630859375, 
17.30571746826172, 186.14129638671875, 124.15646362304688, 152.58175659179688, -25.109085083007812, 3.894500732421875, 201.73291015625, -27.307891845703125, 156.0074462890625, -1.093893051147461, 13.45654296875, 36.51325988769531, 46.52116394042969, 8.124755859375, 6.51080322265625, 28.760101318359375, 3.07379150390625, 131.61172485351562, 55.00419616699219, 100.44960021972656, -11.230712890625, 48.02263641357422, -14.098541259765625, 13.751602172851562, 126.1111831665039, 25.000579833984375, 89.15992736816406, -2.2016448974609375, -32.39863586425781, -100.21412658691406, 32.84629821777344, -7.5822601318359375, -47.47711181640625, 12.803886413574219, -5.255859375, -72.14788818359375, 186.25575256347656, 119.28619384765625, 33.20082092285156], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000431.npy"}
{"epoch": 0.9026178010471204, "step": 432, "batch_size": 128, "mean": 56.163490295410156, "std": 77.91231536865234, "min": -159.96197509765625, "p10": -28.088809204101562, "median": 41.03430938720703, "p90": 163.11555175781248, "max": 203.59820556640625, "pos_frac": 0.703125, "sample": [-12.557647705078125, 146.9842529296875, 3.037078857421875, 141.4459228515625, 188.2144775390625, 172.1414794921875, 135.4312744140625, 66.58737182617188, -43.3017578125, -139.03289794921875, 168.3818359375, -5.15576171875, 145.54241943359375, 42.819091796875, 22.756332397460938, 15.388740539550781, 34.013214111328125, 148.53465270996094, -40.94862365722656, 5.6939849853515625, 113.97262573242188, -159.96197509765625, 46.21607971191406, -0.31992149353027344, -7.605072021484375, 7.091856956481934, -24.7357177734375, 143.59295654296875, 64.65635681152344, -12.472198486328125, 26.3228759765625, 149.8465118408203, 171.47476196289062, 154.0020294189453, -27.416259765625, -5.756355285644531, 88.55636596679688, -33.780364990234375, 163.97640991210938, 24.973876953125, 113.41178894042969, 119.13449096679688, 9.778900146484375, 2.406045913696289, -2.2449722290039062, -0.01531982421875, 56.13336181640625, 10.214408874511719, -101.79354858398438, -3.6732177734375, 111.7145004272461, 80.92254638671875, 10.182075500488281, 165.75137329101562, -11.797508239746094, 11.454071044921875, -31.52013397216797, 133.10150146484375, -107.68695068359375, 3.5502395629882812, -7.896453857421875, 22.019363403320312, 135.3186492919922, -14.480712890625, 130.62184143066406, 3.7613754272460938, 166.65097045898438, 120.228271484375, 118.40719604492188, 119.21432495117188, -67.13880157470703, 88.463623046875, 5.213287353515625, -13.356281280517578, 18.354652404785156, 175.88058471679688, 119.49681091308594, 182.46609497070312, 24.281524658203125, 124.90484619140625, 43.787109375, 3.5914535522460938, 203.59820556640625, -15.407302856445312, 132.14468383789062, 99.2564697265625, 153.15138244628906, 114.05439758300781, 
-27.713653564453125, 116.44292449951172, 8.211532592773438, -13.11025619506836, 0.0, 162.74661254882812, 144.86874389648438, -18.959503173828125, 11.126617431640625, -28.96417236328125, -8.91632080078125, 52.823089599609375, 51.037841796875, -6.773708343505859, -36.52687072753906, 125.72979736328125, 99.00902557373047, 103.48217010498047, 177.18972778320312, 132.50196838378906, 181.22711181640625, 147.6131134033203, -0.6246337890625, 121.91339111328125, 40.330169677734375, 179.941162109375, -0.3273582458496094, 38.6273193359375, 67.36383056640625, 118.92514038085938, 4.627202987670898, 109.0538330078125, 88.21083068847656, 131.88882446289062, -39.76904296875, 4.864768981933594, 41.73844909667969, 112.10020446777344, -2.7196712493896484, -34.484771728515625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000432.npy"}
{"epoch": 0.9047120418848168, "step": 433, "batch_size": 128, "mean": 43.75343322753906, "std": 76.18968200683594, "min": -163.410400390625, "p10": -30.61188964843749, "median": 42.20994567871094, "p90": 126.99390029907227, "max": 256.9356689453125, "pos_frac": 0.75, "sample": [109.42528533935547, -135.32394409179688, -161.9698486328125, 106.14242553710938, 52.24220275878906, 31.753746032714844, 100.03717041015625, -12.35113525390625, -22.29731559753418, 35.351593017578125, 27.383087158203125, 20.66241455078125, 124.94599151611328, 66.66676330566406, 57.693695068359375, 25.645721435546875, -15.138031005859375, 59.57341766357422, 57.4246826171875, 16.742263793945312, 41.77813720703125, 14.802154541015625, 115.82154846191406, -17.695602416992188, -0.38257408142089844, 16.26644515991211, 159.40359497070312, 6.218151092529297, 166.97796630859375, 130.89410400390625, 126.88182830810547, -13.563468933105469, 119.28958129882812, 50.764404296875, 70.35711669921875, 172.17510986328125, 127.25540161132812, 112.88601684570312, -111.62295532226562, 83.1627197265625, -9.002342224121094, 19.527618408203125, 1.266265869140625, 40.98399353027344, 43.25355529785156, 84.83078002929688, 79.38656616210938, -2.720661163330078, 28.7596435546875, 113.6139144897461, 116.06819152832031, 58.535804748535156, 105.10835266113281, 3.912506103515625, 52.6004638671875, -79.76614379882812, 42.641754150390625, 122.89616394042969, 18.43222427368164, 9.309085845947266, -18.75238037109375, 55.819610595703125, -73.56946563720703, 104.72355651855469, 153.82217407226562, 68.88528442382812, 114.26029968261719, -163.410400390625, 103.46095275878906, 114.9862060546875, -26.62548828125, 17.326553344726562, 137.248046875, -5.573406219482422, 5.98748779296875, 28.375381469726562, 99.01679229736328, 116.98084259033203, 126.6392822265625, -152.96868896484375, -97.36770629882812, 141.95045471191406, 90.49870300292969, -38.86933898925781, 26.25018310546875, -1.190155029296875, 133.34652709960938, 
-4.763492584228516, 62.91168212890625, 38.105560302734375, 119.96409606933594, -0.67315673828125, -161.02923583984375, 6.804258346557617, 20.38641357421875, 91.21322631835938, 82.09649658203125, 148.4156951904297, 107.90460205078125, -81.95623779296875, 57.8973388671875, -8.605934143066406, 3.3753280639648438, 256.9356689453125, -13.2796630859375, 101.32461547851562, 139.62298583984375, 108.50212097167969, 126.62181091308594, 1.2918243408203125, 50.85772705078125, 15.1787109375, 48.2364501953125, -27.946533203125, 23.98333740234375, 86.390869140625, 40.156150817871094, -6.166110992431641, 110.078369140625, 174.72442626953125, 32.3521728515625, 109.8841552734375, -137.14141845703125, 4.863979339599609, -21.23078155517578, -36.8310546875, 98.2626953125, 4.55096435546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000433.npy"}
{"epoch": 0.9068062827225131, "step": 434, "batch_size": 128, "mean": 53.18547821044922, "std": 79.02644348144531, "min": -158.30517578125, "p10": -21.344898223876953, "median": 32.068756103515625, "p90": 152.0500015258789, "max": 217.70849609375, "pos_frac": 0.703125, "sample": [35.14183044433594, -15.3642578125, 0.64862060546875, 5.1110076904296875, 151.8315887451172, 18.951675415039062, 113.65778350830078, 109.60984802246094, -14.366729736328125, 211.31414794921875, -9.456186294555664, 139.5330047607422, -42.73065185546875, 4.720184326171875, -15.864704132080078, 7.3699951171875, -17.989501953125, 106.46380615234375, 88.52478790283203, -2.026123046875, 15.19964599609375, 203.73501586914062, -8.1031494140625, 217.70849609375, 150.42916870117188, -77.92962646484375, 0.0, -12.696952819824219, -12.493095397949219, 119.14886474609375, 166.39990234375, 129.06719970703125, -20.846336364746094, 167.65594482421875, 116.14453125, 122.734619140625, -10.291015625, 48.54949951171875, 25.77642822265625, 171.20849609375, 43.319427490234375, 89.12603759765625, 160.90728759765625, -43.172149658203125, 32.22113037109375, -114.48289489746094, 93.08511352539062, 123.15179443359375, 29.2510986328125, 112.06396484375, 151.7208709716797, 142.64846801757812, 100.56513977050781, 206.603515625, 22.063858032226562, 105.7320327758789, 136.4737548828125, 14.00152587890625, -27.441802978515625, 152.55963134765625, 48.37120819091797, 97.958740234375, 183.6539306640625, -18.75799560546875, 1.001708984375, 9.332473754882812, 83.51437377929688, 143.67532348632812, 21.71624755859375, 33.15349578857422, 172.5055389404297, 9.813644409179688, -19.870819091796875, 157.04751586914062, -60.0338134765625, 0.6436958312988281, -24.422454833984375, 169.48226928710938, 6.3734130859375, 111.83404541015625, 141.37557983398438, -35.75115966796875, 64.25173950195312, 14.298431396484375, 31.9163818359375, 132.8162841796875, 141.67962646484375, -155.589111328125, 119.1649169921875, -22.508209228515625, 
63.00669860839844, -19.150924682617188, -103.73605346679688, 0.0, 129.6402587890625, 29.298049926757812, 19.300537109375, 149.75909423828125, 147.14694213867188, 8.201744079589844, -15.643157958984375, -5.3454742431640625, -18.450387954711914, 102.17596435546875, -8.518951416015625, 8.996955871582031, -6.9545440673828125, -19.165969848632812, 131.58349609375, 77.71023559570312, 42.633026123046875, 3.789752960205078, 0.0, 81.15936279296875, 99.53338623046875, -4.839019775390625, 19.019882202148438, 125.83758544921875, 77.58203125, 75.35818481445312, -158.30517578125, 14.597564697265625, 27.20953369140625, -6.4819793701171875, -99.32855224609375, 146.55648803710938, 86.68254089355469, 131.35977172851562], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000434.npy"}
{"epoch": 0.9089005235602095, "step": 435, "batch_size": 128, "mean": 48.143550872802734, "std": 75.4719009399414, "min": -156.51171875, "p10": -34.28992843627929, "median": 35.46482467651367, "p90": 143.94720764160155, "max": 197.20623779296875, "pos_frac": 0.7578125, "sample": [114.18452453613281, 5.21636962890625, 4.331798553466797, 35.679344177246094, 31.715362548828125, 161.32810974121094, 135.3147735595703, -101.77337646484375, 115.94584655761719, -5.615386962890625, -6.872409820556641, -21.518905639648438, -156.51171875, 42.515525817871094, 4.994489669799805, -17.67437744140625, 66.92059326171875, 67.97048950195312, 142.2232666015625, 102.71368408203125, 17.81097412109375, 23.19646644592285, 142.65673828125, 0.0, 3.5228633880615234, -8.56146240234375, 68.81246948242188, 20.33037567138672, 114.97235107421875, 20.427505493164062, 122.01195526123047, 135.67137145996094, 109.924560546875, -39.0859375, -10.492561340332031, 180.82177734375, 75.88114929199219, 139.47268676757812, 43.33868408203125, 10.87127685546875, 149.99420166015625, 153.16110229492188, 152.0225830078125, 26.781326293945312, 132.33058166503906, 20.2491455078125, 129.93075561523438, -31.083221435546875, 143.3283233642578, 29.84429931640625, 130.05364990234375, 83.14407348632812, 173.26116943359375, 128.222900390625, 25.99173927307129, 3.7081298828125, 42.666778564453125, 25.176376342773438, 62.2257080078125, -97.07855224609375, 12.326187133789062, 169.4935302734375, 8.60556411743164, -124.57315063476562, 19.03093719482422, 13.472076416015625, 38.154205322265625, 152.88995361328125, -105.19766235351562, -12.29583740234375, 23.554092407226562, -1.71502685546875, 29.1949462890625, 160.78030395507812, 162.74761962890625, -37.2950439453125, 124.97712707519531, 143.8731231689453, -68.35639953613281, 108.62611389160156, 1.17926025390625, 127.35488891601562, 55.72442626953125, 123.87467956542969, -113.732666015625, 5.165191650390625, 75.9345703125, 3.853832244873047, 1.04620361328125, -2.878997802734375, 
-7.359432220458984, 126.368408203125, 103.80645751953125, 197.20623779296875, 97.36344909667969, 144.1200714111328, 71.17533874511719, 104.66827392578125, -33.00202178955078, 59.749267578125, -7.3857269287109375, 41.94401550292969, 50.53785705566406, 4.571210861206055, 26.841339111328125, 18.36376953125, 35.804595947265625, 8.342330932617188, 117.41455078125, 5.989585876464844, -8.35516357421875, -44.072723388671875, -113.7392578125, -38.34740447998047, -2.02313232421875, 189.491455078125, 9.25811767578125, 35.25030517578125, 85.038818359375, -112.95077514648438, 45.73248291015625, 113.08563232421875, 123.89187622070312, -3.3412322998046875, -32.1436767578125, 132.95840454101562, 40.90313720703125, 102.80195617675781], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000435.npy"}
{"epoch": 0.9109947643979057, "step": 436, "batch_size": 128, "mean": 32.90284729003906, "std": 73.75289916992188, "min": -155.7020263671875, "p10": -37.72017936706543, "median": 19.094738006591797, "p90": 142.27652130126953, "max": 194.720458984375, "pos_frac": 0.671875, "sample": [55.141754150390625, -32.1422119140625, 167.85507202148438, 119.6124267578125, 3.04144287109375, 22.19532012939453, 123.24365997314453, 32.994293212890625, -37.07403564453125, 91.3701171875, -77.395263671875, 52.99273681640625, -59.352210998535156, 47.502593994140625, 5.7084503173828125, 163.23077392578125, 27.559814453125, -18.73607635498047, -36.652679443359375, -81.63427734375, 9.805145263671875, 109.0324478149414, 115.84481811523438, 104.7083740234375, 158.38839721679688, -35.53025817871094, 140.399658203125, -0.646820068359375, 125.19005584716797, 30.55230712890625, 60.394187927246094, 8.044189453125, -4.437082290649414, 23.0863037109375, 146.65586853027344, 39.561256408691406, -12.630905151367188, 36.958396911621094, -2.626373291015625, -32.05235290527344, 96.90723419189453, 10.073577880859375, 15.926338195800781, 79.43670654296875, 150.19232177734375, 33.00885009765625, 4.220611572265625, 6.808563232421875, -10.38592529296875, 138.57839965820312, 42.64642333984375, -22.688629150390625, 16.332305908203125, -11.26568603515625, 98.36904907226562, -21.64496612548828, 160.23333740234375, -155.7020263671875, 4.77191162109375, 147.62864685058594, 131.20130920410156, 27.794326782226562, 13.08514404296875, 32.68362808227539, 46.87646484375, 119.67343139648438, -12.553436279296875, -112.96734619140625, 53.735565185546875, 118.71875, -1.603271484375, 194.720458984375, 57.855712890625, -103.99705505371094, 22.44635009765625, 41.678497314453125, 15.360328674316406, 108.6239013671875, 9.32177734375, -2.73712158203125, 68.98416137695312, -30.74346923828125, 14.199310302734375, -5.109199523925781, 56.036529541015625, 86.65911865234375, -1.60791015625, -57.03973388671875, 32.56278610229492, 
-84.49411010742188, 154.80392456054688, 12.167953491210938, -5.862495422363281, -51.76483154296875, 13.92425537109375, 24.9468994140625, 193.849609375, 1.2857208251953125, 6.082160949707031, -30.713699340820312, -14.074066162109375, -144.9500732421875, 19.267112731933594, 88.65545654296875, 86.86688232421875, 8.87091064453125, -127.96405029296875, 109.771484375, 193.43991088867188, -13.342483520507812, 9.298408508300781, 22.74622344970703, -28.547332763671875, 102.33737182617188, 18.92236328125, 102.97270202636719, -2.815185546875, 177.2645263671875, -39.227848052978516, 10.503204345703125, -24.4261474609375, 44.38189697265625, -138.12814331054688, 70.59658813476562, -28.611572265625, 157.66180419921875, 30.943893432617188, -12.54058837890625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000436.npy"}
{"epoch": 0.9130890052356021, "step": 437, "batch_size": 128, "mean": 56.87449264526367, "std": 80.63442993164062, "min": -137.08517456054688, "p10": -21.462202453613276, "median": 50.28757095336914, "p90": 156.37777404785155, "max": 208.71502685546875, "pos_frac": 0.7578125, "sample": [23.455078125, 145.69766235351562, 49.46905517578125, 122.89898681640625, -134.87570190429688, 163.7100830078125, 5.612701416015625, 7.4040069580078125, 77.36245727539062, 14.30609130859375, 150.34783935546875, 18.769699096679688, 107.27085876464844, 67.00764465332031, 11.776107788085938, 129.72021484375, -23.859832763671875, 2.5739288330078125, 42.906341552734375, 132.48049926757812, 26.343215942382812, 162.96304321289062, 208.71502685546875, 145.87632751464844, 23.202125549316406, 29.906951904296875, -3.55865478515625, 139.56686401367188, 129.72943115234375, 85.95338439941406, -18.5703125, 6.3079833984375, 7.85662841796875, 56.30522155761719, -4.7021484375, 36.520751953125, -119.27279663085938, -9.212417602539062, 9.345993041992188, 31.2041015625, 126.89570617675781, 55.2557373046875, 154.05413818359375, 144.02249145507812, -40.187835693359375, 4.952728271484375, 29.28814697265625, 18.779296875, 165.4923095703125, 133.4378662109375, 117.10546875, 180.76576232910156, 106.21049499511719, 142.469970703125, 8.299861907958984, 1.989349365234375, -30.257354736328125, -14.534969329833984, 169.4775390625, 162.11996459960938, 137.95639038085938, -2.27825927734375, 149.03750610351562, 167.79098510742188, 194.41525268554688, 82.78657531738281, 2.4384994506835938, 54.475250244140625, -18.201858520507812, -20.434646606445312, 51.10608673095703, 10.39434814453125, -35.136566162109375, 155.38119506835938, -137.08517456054688, 138.97515869140625, 2.896540641784668, 70.31788635253906, -131.17153930664062, 0.0, 131.2322998046875, -105.16645812988281, 3.664886474609375, -9.709512710571289, 158.703125, 109.18927001953125, -74.25079345703125, 83.04180908203125, -116.12847900390625, 105.119873046875, 
168.79229736328125, 73.16622924804688, -2.6228675842285156, 112.01409912109375, 36.997406005859375, -3.00244140625, 116.53350830078125, 25.28215789794922, 92.44009399414062, 90.80796813964844, 154.35874938964844, 168.49032592773438, 103.81787109375, 151.44943237304688, 35.86029052734375, -14.283226013183594, 56.18145751953125, 99.24993133544922, 145.9974365234375, -110.59309387207031, 2.4618072509765625, -0.875732421875, 124.09548950195312, -129.724853515625, -15.619659423828125, 134.4034423828125, 33.8447265625, 131.0867156982422, 117.86029052734375, -5.6581878662109375, 90.18142700195312, -14.434661865234375, 25.5745849609375, 70.92120361328125, 0.0, 33.915740966796875, 122.210205078125, 205.24395751953125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000437.npy"}
{"epoch": 0.9151832460732985, "step": 438, "batch_size": 128, "mean": 25.934612274169922, "std": 78.47480773925781, "min": -147.6025390625, "p10": -86.540966796875, "median": 12.172721862792969, "p90": 136.9315216064453, "max": 205.5400390625, "pos_frac": 0.6171875, "sample": [20.403358459472656, -139.15890502929688, 102.58120727539062, -14.691696166992188, -27.13482666015625, 114.46124267578125, 46.08172607421875, 31.42095947265625, 0.0, 12.544904708862305, 157.35708618164062, -87.76583862304688, 16.895492553710938, 133.13922119140625, 40.8084716796875, 8.032058715820312, 16.5115966796875, 137.9006805419922, 112.70410919189453, -16.49311065673828, -40.18365478515625, 2.27862548828125, 59.898162841796875, -0.4386749267578125, -83.39932250976562, -67.63727569580078, -20.055999755859375, 107.72486877441406, 10.28271484375, 55.25225830078125, 10.464147567749023, 162.3775634765625, 49.622833251953125, -13.280593872070312, 28.65338134765625, 136.70989990234375, -110.15133666992188, -13.263252258300781, 48.15191650390625, -143.97171020507812, -147.6025390625, -116.5286865234375, 58.6845703125, 98.03600311279297, -12.449691772460938, -7.32928466796875, 0.8630828857421875, 25.604736328125, -132.37149047851562, -3.0952491760253906, 2.536041259765625, -119.77606201171875, -12.99001693725586, 18.321365356445312, 7.34075927734375, 29.546401977539062, 95.7320556640625, -22.8021240234375, 30.316558837890625, 114.6347885131836, 12.505569458007812, -25.277421951293945, -1.892141342163086, -128.17044067382812, 85.2679443359375, -4.150236129760742, 12.997100830078125, -1.2889862060546875, 141.4617919921875, -98.50248718261719, 65.85870361328125, -37.24834442138672, 150.28472900390625, -18.80998992919922, 27.435150146484375, 70.10952758789062, 107.70030212402344, -97.90126037597656, 125.38589477539062, 133.10293579101562, -31.1158447265625, 129.23989868164062, -53.28416442871094, 1.472442626953125, 174.6497802734375, -25.758010864257812, 118.73799133300781, 91.46115112304688, 
-103.55712890625, 51.27745819091797, 75.38651275634766, 12.976921081542969, 1.6917171478271484, -3.6573486328125, 151.87286376953125, 124.64649963378906, -1.1849288940429688, 146.8418731689453, 199.241943359375, -25.731842041015625, -17.943832397460938, -35.485595703125, 38.8675537109375, 91.29782104492188, 117.12483978271484, -25.55615234375, 23.547622680664062, 16.10192108154297, -29.298858642578125, 127.12760925292969, 154.8383331298828, 10.870555877685547, -86.01602172851562, 11.54990005493164, 0.0, 0.3916778564453125, 78.782958984375, 8.582435607910156, -9.734962463378906, 94.24560546875, 149.19525146484375, -88.72305297851562, 7.011396408081055, 137.44863891601562, 205.5400390625, -5.355339050292969, 11.839874267578125, 28.000625610351562], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000438.npy"}
{"epoch": 0.9172774869109948, "step": 439, "batch_size": 128, "mean": 47.68614959716797, "std": 77.01081848144531, "min": -170.96249389648438, "p10": -47.16234741210936, "median": 26.313125610351562, "p90": 147.89925689697264, "max": 193.77310180664062, "pos_frac": 0.71875, "sample": [-68.11553955078125, -81.00259399414062, -124.81817626953125, 175.80770874023438, -57.27557373046875, 148.62844848632812, 22.676559448242188, 19.02178192138672, 63.5234375, 125.54617309570312, -2.164562225341797, 13.116607666015625, 23.634767532348633, 2.711944580078125, 94.26284790039062, -107.06549072265625, 8.583541870117188, 100.20065307617188, -0.1558990478515625, 9.627532958984375, 162.42041015625, 135.87237548828125, 6.809906005859375, 29.890457153320312, 1.305908203125, 125.72293853759766, 40.23334503173828, 131.9786376953125, 152.9019775390625, 17.771259307861328, -56.30601501464844, 85.0738525390625, 13.913555145263672, 177.35186767578125, -21.719085693359375, 103.06118774414062, 176.58050537109375, 63.402374267578125, 193.77310180664062, 20.7127685546875, -30.932708740234375, 16.423152923583984, 132.7960205078125, 120.71009826660156, 6.107513427734375, 116.33103942871094, 123.91586303710938, 9.17431640625, 2.3088531494140625, -1.439208984375, 21.456787109375, 21.230392456054688, -60.8614501953125, 15.4498291015625, 62.68428039550781, 146.4759521484375, 120.05252075195312, -26.32215118408203, 35.82670593261719, 0.0, 22.484237670898438, 0.0, -20.0501708984375, 16.987777709960938, -71.78414916992188, 143.86911010742188, 25.253265380859375, -55.75660705566406, 12.22989273071289, -34.290008544921875, -1.29608154296875, 133.1563720703125, 14.56304931640625, -25.020172119140625, -6.594135284423828, 44.043800354003906, -1.9126472473144531, 95.15229034423828, -41.1773681640625, 100.45474243164062, -43.866485595703125, 90.44930267333984, 137.66285705566406, 27.37298583984375, 59.536895751953125, 91.31201171875, -28.976470947265625, -3.6404876708984375, -73.04996490478516, 
-6.9288787841796875, 147.5867462158203, -119.08993530273438, 0.0, 152.4710693359375, 120.8096923828125, 86.43035125732422, 173.38601684570312, 130.49325561523438, 40.919891357421875, -31.304672241210938, 42.736572265625, -54.852691650390625, 120.39926147460938, 3.1868133544921875, 51.30577087402344, 3.4857177734375, 107.35324096679688, 104.66407775878906, 116.14237213134766, 36.571044921875, 141.3190460205078, 70.03179931640625, 56.36317825317383, 146.44342041015625, 131.72433471679688, 2.665231704711914, -170.96249389648438, 165.31529235839844, 98.93545532226562, 162.436279296875, 141.50921630859375, 117.30470275878906, 13.517417907714844, -20.296798706054688, 164.82542419433594, 0.0, 156.08517456054688, 134.84771728515625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000439.npy"}
{"epoch": 0.9193717277486911, "step": 440, "batch_size": 128, "mean": 37.10902786254883, "std": 79.10539245605469, "min": -161.87896728515625, "p10": -59.6355369567871, "median": 23.88465118408203, "p90": 145.3245178222656, "max": 213.74554443359375, "pos_frac": 0.6640625, "sample": [-39.858795166015625, 0.0, 33.27362060546875, -123.843994140625, -22.432689666748047, 85.92214965820312, -117.78363037109375, 91.73635864257812, -11.724693298339844, 50.50563049316406, 11.86090087890625, 24.631378173828125, -84.75250244140625, 3.517120361328125, 38.2984619140625, -97.26937866210938, 86.68960571289062, 118.37796020507812, 114.50114440917969, 122.71438598632812, 50.816558837890625, 165.54364013671875, -4.384605407714844, 24.1240234375, -66.80679321289062, -5.337432861328125, -161.87896728515625, 103.50028991699219, 45.58740234375, 75.53451538085938, 164.71337890625, 184.90826416015625, 9.258430480957031, -14.6151123046875, -18.26788330078125, 40.9754638671875, 109.52397155761719, -135.52227783203125, 0.0, 56.90911865234375, -147.33038330078125, -1.02105712890625, 8.24310302734375, 93.38196563720703, 153.6615753173828, -17.145668029785156, -92.50131225585938, -0.7972412109375, 16.778282165527344, 122.43778991699219, 178.1905517578125, -16.14794158935547, 115.7840576171875, -0.04485130310058594, 36.32745361328125, 163.32186889648438, -2.312602996826172, -108.63554382324219, -46.19952392578125, -31.683135986328125, -44.05706787109375, -1.0155029296875, 7.11968994140625, -36.752593994140625, 213.74554443359375, 113.19012451171875, 178.33889770507812, 120.94760131835938, 135.22900390625, 43.180511474609375, -1.8826236724853516, 26.171722412109375, 12.057662963867188, 120.54209899902344, 133.04104614257812, 11.670303344726562, -0.308563232421875, -46.83858108520508, 22.143035888671875, 16.587554931640625, 46.62261962890625, 102.7252197265625, 110.4769515991211, 78.47109985351562, 167.001708984375, -84.81512451171875, 43.755828857421875, 158.83740234375, 37.797943115234375, 
-80.220947265625, 170.0091552734375, -22.501907348632812, 135.33627319335938, -17.42212677001953, 123.21002197265625, 19.658493041992188, 23.645278930664062, 30.0290584564209, 33.94602966308594, 40.69244384765625, 142.04486083984375, -16.239166259765625, 22.770477294921875, 17.544097900390625, 45.766357421875, 64.822265625, 6.3093109130859375, 152.97705078125, 113.81951141357422, 70.31536865234375, 10.715606689453125, 37.4473876953125, 21.039703369140625, 136.84690856933594, -41.221221923828125, 120.83554077148438, 117.66668701171875, 18.797454833984375, 10.623687744140625, 87.375, -28.23187255859375, 160.4827117919922, 3.3836517333984375, -78.20426940917969, -9.86279296875, 16.812530517578125, -56.56214141845703, 128.2630615234375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000440.npy"}
{"epoch": 0.9214659685863874, "step": 441, "batch_size": 128, "mean": 55.953346252441406, "std": 72.13493347167969, "min": -151.49563598632812, "p10": -12.928466033935546, "median": 45.481895446777344, "p90": 153.8313217163086, "max": 193.82296752929688, "pos_frac": 0.765625, "sample": [-11.7525634765625, 32.224365234375, -104.53402709960938, 91.0032958984375, -0.24017333984375, -11.937652587890625, 164.2724609375, -2.90728759765625, 7.2230224609375, -5.324462890625, -45.17938232421875, 193.82296752929688, 27.217750549316406, 126.31219482421875, 49.66685485839844, 125.88250732421875, 139.75460815429688, 136.9091796875, 115.60846710205078, 71.13279724121094, 54.35601806640625, 115.65823364257812, 17.677913665771484, 15.57745361328125, 132.98455810546875, 52.642578125, 124.517333984375, 145.67031860351562, 52.99066162109375, 5.507049560546875, 14.463409423828125, 132.9959716796875, 73.59611511230469, -12.758216857910156, 127.4991455078125, 107.15374755859375, 14.69677734375, 145.81866455078125, 108.69436645507812, -66.59819793701172, 142.3042449951172, 125.15156555175781, -72.84297943115234, -7.15777587890625, 154.42962646484375, -1.508443832397461, 45.16943359375, 35.49577331542969, 117.11309814453125, 121.9490966796875, 133.07052612304688, 8.00006103515625, 47.034210205078125, 15.968193054199219, 16.436397552490234, 13.75238037109375, -0.76727294921875, 127.18551635742188, 22.818405151367188, -11.39691162109375, 1.031341552734375, 28.338226318359375, 9.323143005371094, 128.84747314453125, -1.3056640625, 0.0, 9.305839538574219, 71.16203308105469, 45.79396057128906, 169.76638793945312, -42.57135009765625, -33.339385986328125, -2.244295120239258, 11.690902709960938, 0.0, 121.82916259765625, 47.268798828125, 40.84014892578125, 122.453369140625, 55.161773681640625, 13.568099975585938, 137.39407348632812, 24.544265747070312, -58.4033203125, 45.169830322265625, 69.64419555664062, 124.85282897949219, 81.98544311523438, -1.3915863037109375, 23.7244873046875, 
3.4405517578125, 9.6595458984375, 35.339813232421875, 98.24285888671875, 112.57121276855469, 14.13116455078125, 153.5749053955078, -93.9552993774414, 156.57781982421875, 50.60626220703125, 156.33676147460938, -151.49563598632812, 56.647674560546875, 3.398712158203125, -67.404052734375, 4.3094024658203125, 174.48971557617188, -96.12171936035156, 138.50596618652344, -13.325714111328125, 27.90851593017578, 158.12716674804688, 158.79092407226562, 156.3772735595703, 168.35134887695312, 106.79052734375, 65.40406799316406, -4.29180908203125, -29.97967529296875, 8.5784912109375, 183.510498046875, -6.634063720703125, 122.37572479248047, 170.48040771484375, 148.96737670898438, 88.50712585449219, 118.87932586669922, 35.410888671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000441.npy"}
{"epoch": 0.9235602094240838, "step": 442, "batch_size": 128, "mean": 42.366493225097656, "std": 85.56124877929688, "min": -181.505615234375, "p10": -66.74936218261718, "median": 27.345264434814453, "p90": 151.5256591796875, "max": 231.79156494140625, "pos_frac": 0.7265625, "sample": [126.32888793945312, 14.810302734375, 20.350601196289062, 21.415569305419922, 189.61676025390625, 148.4317169189453, 2.2521743774414062, -9.355575561523438, 156.91790771484375, 5.37457275390625, 58.313934326171875, 20.892730712890625, -81.25189971923828, -6.9107818603515625, 132.1070556640625, -132.54156494140625, 26.72735595703125, 20.78570556640625, 14.430419921875, -114.32804107666016, 14.56597900390625, 39.2041015625, -6.7850799560546875, -7.162990570068359, 107.8079833984375, -5.9571075439453125, 111.2473373413086, -63.710479736328125, 171.38555908203125, 159.24066162109375, -126.23870849609375, 7.5681915283203125, -36.420623779296875, 133.03543090820312, 105.18470764160156, 131.70599365234375, -15.449050903320312, 4.3209381103515625, -118.3477783203125, 5.1945343017578125, 33.08052062988281, -45.568824768066406, 60.50813293457031, 6.145820617675781, -21.95086669921875, -140.59100341796875, 10.456405639648438, 35.95147705078125, 7.5217437744140625, -30.78350830078125, 13.93267822265625, 9.74871826171875, 89.77351379394531, 184.71002197265625, 7.355705261230469, 77.84527587890625, 121.44450378417969, 0.0, 62.734405517578125, -114.4091796875, -35.806884765625, 11.694938659667969, -131.61500549316406, 127.63737487792969, -111.93658447265625, 166.05450439453125, 95.7052001953125, -7.4056854248046875, 32.0223388671875, 116.1075668334961, 0.3247814178466797, -181.505615234375, 161.58120727539062, -8.205093383789062, 81.45187377929688, 62.110809326171875, 185.88504028320312, -23.05792236328125, 113.68878173828125, 52.75065612792969, 35.69769287109375, 73.09033203125, -5.352958679199219, 183.69497680664062, 127.59249877929688, 143.12278747558594, -73.840087890625, -123.3267822265625, 
-48.5240478515625, -46.204368591308594, 20.849395751953125, 46.57453155517578, 10.014755249023438, 100.621826171875, 15.132720947265625, 121.28461456298828, 99.36006164550781, 122.88098907470703, 105.30191040039062, 128.96994018554688, 165.60714721679688, 151.5328369140625, 138.94528198242188, 151.5225830078125, 24.664520263671875, 231.79156494140625, 13.14459228515625, -30.909584045410156, -32.19805908203125, 183.81753540039062, 130.5302734375, 117.9293212890625, 57.77046203613281, 119.90093994140625, 1.872833251953125, 16.295364379882812, 27.819496154785156, 53.2283935546875, 128.49432373046875, -56.621826171875, 87.09199523925781, 124.5542221069336, 139.50238037109375, 26.87103271484375, -119.51954650878906, 60.28106689453125, 114.43704223632812, 57.468902587890625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000442.npy"}
{"epoch": 0.9256544502617801, "step": 443, "batch_size": 128, "mean": 44.22319793701172, "std": 76.88037872314453, "min": -153.6278076171875, "p10": -36.66853027343749, "median": 30.431926727294922, "p90": 148.1758590698242, "max": 206.20404052734375, "pos_frac": 0.7109375, "sample": [31.12408447265625, 7.934356689453125, 105.0377197265625, 21.975936889648438, -30.84564208984375, 50.2491455078125, 166.5663299560547, 2.243316650390625, 160.2574462890625, -11.563148498535156, 10.056121826171875, 63.47637939453125, -7.109245300292969, 96.725341796875, -10.102447509765625, -108.74964904785156, -6.306915283203125, 103.64593505859375, 147.11074829101562, -6.634494781494141, 38.97453308105469, 80.04171752929688, 64.29307556152344, 37.652557373046875, -8.980789184570312, 30.457550048828125, 30.889556884765625, 134.8452911376953, 31.7762451171875, -61.08942413330078, -23.7685546875, 61.839447021484375, -127.20050048828125, 7.4295501708984375, -10.478515625, 26.504669189453125, 142.55743408203125, 24.415287017822266, 141.51409912109375, 10.793876647949219, 156.2219696044922, -15.273040771484375, -10.3157958984375, 0.651611328125, 17.963607788085938, 30.40630340576172, 206.20404052734375, 9.169387817382812, 51.85343933105469, -33.587738037109375, 64.19708251953125, -148.27020263671875, -56.96562194824219, -41.293212890625, 51.8670654296875, 123.9422607421875, 112.39988708496094, 198.06707763671875, 73.62200927734375, 165.11636352539062, 105.30145263671875, 32.237823486328125, -13.146141052246094, 117.39132690429688, 143.81927490234375, 17.248863220214844, -4.0972900390625, 93.32770538330078, 125.57115173339844, 129.756103515625, -4.8992767333984375, -6.0943603515625, 161.72900390625, 14.126235961914062, 150.66111755371094, 5.623748779296875, -1.3146133422851562, 122.81356811523438, -79.55685424804688, 10.18292236328125, -18.100906372070312, 151.8690185546875, 142.5150146484375, 94.92140197753906, 66.0675048828125, -43.69951629638672, 14.289764404296875, -70.37274169921875, 
114.17173767089844, 183.464599609375, 154.30029296875, 88.07220458984375, 31.63604736328125, 25.195999145507812, -58.97041320800781, 113.54364013671875, 12.348602294921875, 176.62588500976562, 112.90554809570312, 129.63006591796875, 51.03697204589844, -15.086761474609375, -138.3212890625, 143.87278747558594, -2.628204345703125, 1.8037109375, 47.75401306152344, 107.4224853515625, 60.63140869140625, 6.8679351806640625, 111.35270690917969, 7.635101318359375, -34.6865234375, 10.231430053710938, -1.2998504638671875, -10.936012268066406, 74.61004638671875, 154.86204528808594, 4.764533996582031, -153.6278076171875, 123.40533447265625, 131.60946655273438, 145.2369842529297, 5.780601501464844, -1.394317626953125, -104.46925354003906, 9.795883178710938, 9.71551513671875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000443.npy"}
{"epoch": 0.9277486910994764, "step": 444, "batch_size": 128, "mean": 49.70329284667969, "std": 80.14091491699219, "min": -178.81195068359375, "p10": -33.499525451660155, "median": 35.78559875488281, "p90": 154.54358520507813, "max": 246.77789306640625, "pos_frac": 0.703125, "sample": [-6.4385833740234375, 12.956840515136719, 157.60968017578125, 21.33660888671875, 77.72860717773438, 62.83992004394531, -22.65289306640625, 2.957244873046875, 20.13928985595703, 44.3685302734375, -8.0821533203125, 30.102081298828125, -31.418502807617188, 38.80352783203125, 122.3734130859375, 40.17791748046875, 117.87715148925781, 63.54303741455078, -25.518280029296875, 33.794761657714844, 95.12721252441406, 29.687103271484375, 91.204833984375, -50.47705078125, 18.491912841796875, -36.4307861328125, 134.2174072265625, 10.216243743896484, 6.136566162109375, 59.014251708984375, 14.449920654296875, 133.028564453125, 120.21452331542969, -15.221176147460938, 147.6784210205078, -29.62396240234375, -27.387954711914062, 42.942291259765625, 146.33004760742188, 127.45257568359375, 142.6937255859375, 183.0364990234375, -14.303955078125, -3.463531494140625, 41.549896240234375, 106.0509033203125, 8.224365234375, -116.0499038696289, -107.99697875976562, -37.2703857421875, 199.201416015625, 116.77694702148438, -53.99725341796875, 182.40673828125, -4.806934356689453, 70.8525390625, 211.417724609375, 154.47662353515625, -7.8504638671875, -12.73516845703125, 1.717254638671875, 246.77789306640625, 123.40931701660156, -29.219070434570312, 73.42787170410156, -15.814910888671875, 168.22409057617188, 101.47840881347656, -32.24327087402344, 190.4564208984375, 119.53262329101562, 23.846908569335938, -6.49114990234375, 154.6998291015625, -16.737701416015625, 53.39208984375, 160.42169189453125, 146.1907958984375, 128.07760620117188, 52.688934326171875, 151.69863891601562, -2.9464855194091797, 1.6747665405273438, 169.23526000976562, -1.8381233215332031, 139.2071533203125, -8.357315063476562, 128.6125946044922, 
129.13589477539062, -107.09374237060547, 2.227325439453125, 170.16439819335938, -18.66551971435547, 119.9195556640625, 37.77643585205078, 28.346359252929688, 65.7349853515625, 28.90447998046875, 9.843841552734375, -48.4261474609375, -18.710357666015625, 6.4370269775390625, 156.0538330078125, 4.88836669921875, -78.20884704589844, 40.31756591796875, -42.842437744140625, 129.63308715820312, -40.85719299316406, 116.45748901367188, 5.840764999389648, 66.92683410644531, 8.483917236328125, 101.60342407226562, 83.94677734375, 38.6126708984375, 151.10791015625, 123.60749816894531, 123.01981353759766, 120.33129119873047, -3.098783493041992, 13.40530014038086, -21.238327026367188, 1.5122909545898438, -152.67239379882812, 23.9224853515625, 115.60130310058594, -178.81195068359375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000444.npy"}
{"epoch": 0.9298429319371728, "step": 445, "batch_size": 128, "mean": 38.75702667236328, "std": 78.02845001220703, "min": -143.87928771972656, "p10": -50.5708625793457, "median": 18.448235511779785, "p90": 152.96505126953124, "max": 197.95547485351562, "pos_frac": 0.7109375, "sample": [9.847335815429688, -143.87928771972656, 167.08270263671875, 67.80082702636719, 42.16447448730469, 163.95855712890625, -4.50152587890625, 23.456649780273438, 152.82363891601562, -91.83563232421875, 8.884346008300781, 82.31658935546875, 135.27427673339844, -60.05413818359375, -24.94097900390625, 7.5325927734375, 49.926666259765625, 12.890274047851562, 175.17312622070312, 30.547607421875, 153.70843505859375, -3.61981201171875, 11.697807312011719, -97.14875793457031, 24.810882568359375, 114.10638427734375, 37.97593688964844, -2.909038543701172, 130.76722717285156, 20.401168823242188, 175.86041259765625, 78.69003295898438, 37.16265869140625, 4.326469421386719, -25.774734497070312, -12.169174194335938, 18.42533302307129, 37.589271545410156, 135.86407470703125, 92.04440307617188, 8.98846435546875, 153.29501342773438, 169.94790649414062, -0.295196533203125, 109.15521240234375, 4.028604507446289, 1.6963958740234375, 27.7313232421875, 39.184959411621094, -112.07968139648438, -35.051109313964844, 13.282257080078125, 125.908935546875, -49.676422119140625, 122.98957061767578, 0.03655242919921875, 45.33392333984375, 156.64288330078125, 14.46337890625, 138.7360382080078, 84.29342651367188, 50.908538818359375, 197.95547485351562, 4.758270263671875, 142.0095672607422, 104.05775451660156, 93.70785522460938, -134.18067932128906, 51.600006103515625, 171.89651489257812, 26.474853515625, 96.50274658203125, -24.037353515625, 118.73237609863281, 116.67568969726562, 1.063995361328125, -15.339996337890625, -124.26736450195312, 144.86358642578125, 85.77793884277344, -4.050201416015625, -26.37445068359375, -40.167205810546875, -106.37527465820312, 4.939605712890625, 22.908714294433594, 16.203048706054688, 
-26.989898681640625, 92.4856948852539, 1.8521041870117188, 62.093475341796875, 122.66290283203125, 170.34693908691406, 132.91668701171875, 43.8062744140625, 120.87310791015625, -3.245412826538086, -124.25601196289062, 156.36404418945312, -3.6826171875, 12.403022766113281, 42.115821838378906, 147.07644653320312, 15.645030975341797, -7.267578125, 17.781021118164062, 8.356834411621094, 18.47113800048828, 110.3233642578125, 1.223846435546875, -41.418701171875, -8.958938598632812, 0.0, 5.192607879638672, 12.01885986328125, -50.5484619140625, 1.8446807861328125, -8.7049560546875, 6.0153045654296875, 142.89776611328125, -50.646026611328125, -50.97119140625, -130.93487548828125, -1.2947654724121094, 50.503173828125, 140.27032470703125, -50.623130798339844, 153.7680206298828], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000445.npy"}
{"epoch": 0.9319371727748691, "step": 446, "batch_size": 128, "mean": 51.405670166015625, "std": 78.93782806396484, "min": -184.902099609375, "p10": -30.0060302734375, "median": 40.55586814880371, "p90": 155.86875, "max": 213.42227172851562, "pos_frac": 0.7578125, "sample": [104.892822265625, 193.23486328125, 0.03816986083984375, 2.37762451171875, 0.11248779296875, 44.02500915527344, 123.96971130371094, 38.27191925048828, 126.30760192871094, -15.785446166992188, 6.373748779296875, -49.65769958496094, 83.10958862304688, 41.64415740966797, 7.085906982421875, 32.96593475341797, 184.50860595703125, 20.41368865966797, -21.435531616210938, -8.680221557617188, 19.982757568359375, 111.58810424804688, 141.392578125, 125.55679321289062, -6.255743026733398, -64.08465576171875, 164.02276611328125, 4.489082336425781, 9.6009521484375, -49.560394287109375, 12.1756591796875, 11.656806945800781, -26.089874267578125, 32.44227600097656, -55.881195068359375, 24.355247497558594, 145.64886474609375, 0.3779754638671875, 18.983642578125, 24.25939178466797, 143.90655517578125, 5.888160705566406, 78.08243560791016, 22.813186645507812, 43.93890380859375, 123.00128173828125, 155.8150634765625, 119.4593505859375, 143.57241821289062, -184.902099609375, 40.918548583984375, 99.38197326660156, 182.8990478515625, 3.542144775390625, 117.61125183105469, 135.25689697265625, 8.502288818359375, 46.59613037109375, -169.67330932617188, 97.26461791992188, 155.9940185546875, 1.2166175842285156, -26.9378662109375, 0.0, 45.442108154296875, 142.76303100585938, 41.2684326171875, 86.293701171875, 88.030029296875, 129.84356689453125, -10.364013671875, 134.59971618652344, 107.40571594238281, 130.2225341796875, 121.97048950195312, 8.92071533203125, 144.78643798828125, 115.33946228027344, -34.405487060546875, 0.0, 63.52947998046875, -47.39344787597656, 169.42269897460938, -11.066070556640625, 28.531692504882812, -74.56781005859375, 74.19680786132812, 190.817138671875, 72.838623046875, 69.8875732421875, 
-29.6907958984375, 213.42227172851562, 179.56195068359375, 146.55624389648438, 95.5335693359375, 50.516845703125, 44.16046142578125, 120.83331298828125, -14.081501007080078, 90.41400146484375, 20.905242919921875, -13.58172607421875, 170.47406005859375, 169.80828857421875, -164.13931274414062, -122.31153869628906, -12.52703857421875, 20.163818359375, -10.38531494140625, 5.19854736328125, 46.596710205078125, 0.0, -30.7415771484375, -15.400375366210938, 120.20408630371094, -44.1436767578125, 168.90869140625, 37.91656494140625, 188.07925415039062, 14.065109252929688, 4.88555908203125, 53.648773193359375, 17.696182250976562, -10.454521179199219, 137.11309814453125, 40.19318771362305, 120.99191284179688, 78.64077758789062], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000446.npy"}
{"epoch": 0.9340314136125655, "step": 447, "batch_size": 128, "mean": 56.9735107421875, "std": 71.66927337646484, "min": -162.01254272460938, "p10": -21.676353454589844, "median": 46.59856986999512, "p90": 147.88336944580078, "max": 195.74285888671875, "pos_frac": 0.7734375, "sample": [-50.35682678222656, 10.064739227294922, 124.96426391601562, 124.66542053222656, 27.483230590820312, 9.350051879882812, 166.73062133789062, 3.15350341796875, -39.8343505859375, 137.90859985351562, 34.508331298828125, 112.71072387695312, 57.933135986328125, 152.71844482421875, 143.89341735839844, 47.488277435302734, 0.0, 6.270751953125, 160.59780883789062, 10.582794189453125, -7.3802642822265625, 37.094512939453125, 128.28350830078125, 24.193572998046875, 76.37030029296875, -5.81146240234375, 0.21417236328125, 116.78268432617188, -117.70553588867188, 121.64056396484375, -95.14262390136719, 22.9132080078125, 108.68415832519531, 93.72647094726562, -17.518909454345703, 23.840354919433594, 142.86598205566406, 145.8299560546875, 27.66259765625, 121.16642761230469, 9.40639877319336, 130.2996826171875, 157.1772918701172, 156.4505615234375, 32.2283935546875, 6.7635498046875, 128.1832275390625, 149.36300659179688, 142.0948944091797, 0.0, 145.1046142578125, 17.131500244140625, 57.375762939453125, 112.57894134521484, -34.55473327636719, -36.36347961425781, -9.14883804321289, 0.0142822265625, -63.77704620361328, 146.95619201660156, 9.662017822265625, 57.15034484863281, 13.7918701171875, 68.46279907226562, -21.430374145507812, 175.81884765625, 109.96261596679688, -6.385986328125, -34.93760681152344, 15.006103515625, 50.7689208984375, -31.793182373046875, -24.6463623046875, 10.74298095703125, 74.62350463867188, 0.9679985046386719, 44.63600540161133, -13.63214111328125, 51.17340087890625, 149.5218505859375, 195.5443115234375, -1.57818603515625, -4.018732070922852, 34.445465087890625, -162.01254272460938, 0.0, 7.001655578613281, 124.9100112915039, 45.7088623046875, 148.45187377929688, 
51.96514892578125, 35.193695068359375, 33.253265380859375, 110.35334777832031, 186.055908203125, 106.65914916992188, 137.15933227539062, 195.74285888671875, 0.0, 146.69566345214844, 93.036376953125, 117.33858489990234, 31.21728515625, 27.65301513671875, 33.6239013671875, 94.08944702148438, 28.828094482421875, -105.01129150390625, 147.6397247314453, 95.28329467773438, -22.25030517578125, 111.07746887207031, 64.59243774414062, 137.03704833984375, 76.00555419921875, 93.43753051757812, 13.022491455078125, 114.47892761230469, -8.983039855957031, -14.073721885681152, 58.469635009765625, 151.10260009765625, 123.41056823730469, 69.97039794921875, 9.69598388671875, 144.49932861328125, -9.216766357421875, 89.8133316040039], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000447.npy"}
{"epoch": 0.9361256544502617, "step": 448, "batch_size": 128, "mean": 50.42095184326172, "std": 76.73674774169922, "min": -133.2828369140625, "p10": -24.308114624023435, "median": 36.17262268066406, "p90": 151.25614318847656, "max": 205.26910400390625, "pos_frac": 0.75, "sample": [11.153335571289062, 8.854568481445312, -101.08808898925781, 11.4296875, 72.89535522460938, 158.125244140625, 5.557098388671875, -122.23077392578125, 90.01286315917969, 131.02774047851562, -7.251007080078125, -119.29568481445312, 66.33355712890625, 35.396759033203125, 1.408447265625, 88.62651062011719, 119.63568878173828, 129.94924926757812, 86.14263916015625, -7.872270584106445, 86.00286102294922, 156.3629150390625, 193.4109344482422, 5.41802978515625, 74.39286804199219, 33.25138854980469, 51.29443359375, -21.600006103515625, 12.987213134765625, 7.533721923828125, -9.55572509765625, 36.948486328125, 153.46694946289062, -83.04621887207031, 113.18951416015625, -31.80462646484375, 140.57879638671875, 150.54742431640625, 48.77362060546875, 86.79425048828125, 59.38861083984375, 25.20562744140625, 164.15386962890625, -1.2643280029296875, 60.26750946044922, 7.125818252563477, 60.44538116455078, 114.56303405761719, -2.4550323486328125, 186.39450073242188, -11.64581298828125, -4.5860595703125, -11.37225341796875, 70.60609436035156, 3.4809646606445312, -18.215301513671875, 40.244384765625, 98.680908203125, -73.09283447265625, 156.041015625, 119.41432189941406, 76.28515625, 12.85638427734375, 0.9025955200195312, 149.573974609375, 179.63375854492188, 24.78664779663086, 11.33587646484375, 142.12814331054688, 172.2835693359375, 140.177001953125, 96.07716369628906, 103.49601745605469, 33.03411865234375, 11.671241760253906, 104.6060791015625, 155.49514770507812, 136.77903747558594, 132.9376220703125, 89.86984252929688, -16.5804443359375, 154.62191772460938, -3.9055023193359375, 140.22462463378906, 1.4626617431640625, -126.34580993652344, -118.67623901367188, -133.2828369140625, 135.34567260742188, 
16.9808349609375, -10.115386962890625, -4.014862060546875, 137.17123413085938, 144.1088104248047, 25.525299072265625, -114.32989501953125, 79.4139404296875, -1.6479778289794922, 21.1007080078125, 132.70944213867188, 131.13726806640625, 13.319206237792969, 152.90982055664062, 6.996631622314453, 114.6871337890625, -25.468963623046875, -59.561981201171875, 64.1214599609375, 32.241241455078125, -22.566177368164062, 143.7560272216797, -23.81060791015625, -15.2432861328125, 13.1927490234375, -26.971405029296875, 21.0748291015625, 5.755859375, 16.63409423828125, 87.87387084960938, 1.3388099670410156, 102.51921844482422, -11.05743408203125, 7.932975769042969, 142.52655029296875, 205.26910400390625, 105.84941101074219, 64.5833740234375, 64.00857543945312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000448.npy"}
{"epoch": 0.9382198952879581, "step": 449, "batch_size": 128, "mean": 55.79218292236328, "std": 77.2677993774414, "min": -162.19381713867188, "p10": -33.23502426147461, "median": 53.431488037109375, "p90": 155.17087402343748, "max": 190.9803466796875, "pos_frac": 0.734375, "sample": [134.12655639648438, 79.38482666015625, -43.78169250488281, 108.19754028320312, 56.325164794921875, -13.73651123046875, 6.05462646484375, 167.20895385742188, -5.8568267822265625, 24.07295799255371, 15.79766845703125, 20.1634521484375, 109.99465942382812, 44.15196990966797, -9.226638793945312, 123.2987289428711, 174.68310546875, 151.47799682617188, 14.91522216796875, -3.5842742919921875, -84.28790283203125, 113.99174499511719, 157.3665771484375, 102.55144500732422, 55.6273193359375, 90.73347473144531, -43.38691711425781, 63.00506591796875, 6.008095741271973, 157.72021484375, 177.40087890625, 154.2298583984375, 141.180908203125, 163.28221130371094, 27.171722412109375, 138.41946411132812, 20.552703857421875, 83.06559753417969, 132.25245666503906, 109.04008483886719, 149.72540283203125, 40.9642333984375, 112.62039184570312, 14.068214416503906, 190.9803466796875, 180.4239501953125, 0.14801025390625, -1.364532470703125, 166.02133178710938, 134.63592529296875, -53.34162139892578, 56.95212173461914, 14.93218994140625, 25.288543701171875, 6.56837272644043, 126.16325378417969, 20.95477294921875, 92.64273071289062, 4.8499755859375, 75.96728515625, -1.244232177734375, 45.703338623046875, 141.521484375, 19.10992431640625, 89.71597290039062, -25.146377563476562, 58.05694580078125, 56.491302490234375, -7.430633544921875, 136.83413696289062, -42.3326416015625, 95.05012512207031, 132.14260864257812, 131.45242309570312, -162.19381713867188, -4.524242401123047, 3.976715087890625, 125.9290771484375, 51.23565673828125, 142.46377563476562, 128.3056640625, 79.52497863769531, 189.91488647460938, 18.734451293945312, 102.28545379638672, 90.16047668457031, 142.580810546875, 25.12982177734375, 109.59321594238281, 
-26.645282745361328, -5.533638000488281, 35.181182861328125, -4.89215087890625, 165.9798583984375, 125.56964111328125, 158.4500732421875, -98.91423034667969, -15.45213508605957, 112.95663452148438, 60.25628662109375, -35.87213134765625, 120.82642364501953, -32.104835510253906, -20.736404418945312, 70.36190795898438, -21.54473876953125, -105.14767456054688, -8.800590515136719, 127.92242431640625, 20.57568359375, 5.911430358886719, 123.4208984375, -49.578826904296875, -3.2213516235351562, 37.384613037109375, -8.400360107421875, 132.83847045898438, 66.80632019042969, -1.4540863037109375, 151.6407470703125, 34.410125732421875, -143.05271911621094, 162.75827026367188, 18.283424377441406, 46.939666748046875, -140.5339813232422, -58.49395751953125, -8.53009033203125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000449.npy"}
{"epoch": 0.9403141361256544, "step": 450, "batch_size": 128, "mean": 41.805599212646484, "std": 74.8165283203125, "min": -150.13711547851562, "p10": -34.60014343261718, "median": 25.730446815490723, "p90": 150.88148193359373, "max": 185.35928344726562, "pos_frac": 0.7109375, "sample": [0.0, -93.76434326171875, 153.47682189941406, -9.050979614257812, 56.06048583984375, 2.7095718383789062, 133.0934600830078, -14.2220458984375, 157.7022705078125, 45.7547607421875, 104.66192626953125, 164.56600952148438, -1.9585895538330078, 63.600616455078125, 132.22555541992188, 13.53594970703125, -6.626220703125, 108.12774658203125, 16.473663330078125, 25.99815559387207, -46.50883483886719, 80.11433410644531, 169.13467407226562, 143.4357147216797, 102.9693374633789, 35.883544921875, -14.041999816894531, 16.341766357421875, 73.79457092285156, 25.462738037109375, -6.5055999755859375, -7.5159912109375, -6.74786376953125, 4.7586669921875, 52.392723083496094, 15.39630126953125, 184.86734008789062, -119.31132507324219, 58.67192077636719, -40.023284912109375, -2.306598663330078, 154.35101318359375, 23.920116424560547, 89.71334838867188, 17.473068237304688, 21.37952423095703, -25.128782272338867, 61.26947021484375, 167.14639282226562, 29.139068603515625, 4.60821533203125, 126.47830200195312, 16.24853515625, 9.051376342773438, 83.52249145507812, -10.982986450195312, 34.28985595703125, 168.3921356201172, 10.986602783203125, -123.72686767578125, 43.13787841796875, -150.13711547851562, 131.92889404296875, 31.1436767578125, -24.783447265625, 160.8460693359375, 130.22398376464844, 3.222299575805664, 100.30087280273438, 92.94454956054688, -2.0181732177734375, 104.12323760986328, 0.649688720703125, 143.6619873046875, 3.853851318359375, 24.108901977539062, 151.6998291015625, 116.94390869140625, -92.89469909667969, 12.181640625, 96.9580078125, 44.1724853515625, -43.37794494628906, 4.930267333984375, 160.86322021484375, 31.880615234375, -1.5832462310791016, 13.4078369140625, 34.07330322265625, 
-15.55926513671875, 24.15283203125, 83.12327575683594, 150.53076171875, 185.35928344726562, 24.04638671875, 130.65679931640625, -31.328765869140625, -97.77371215820312, -32.27593994140625, -97.5345458984375, 15.696430206298828, 83.45425415039062, 133.37905883789062, 119.12635803222656, 162.61337280273438, 37.877838134765625, -113.61109924316406, 46.17669677734375, 8.2052001953125, 139.55992126464844, 141.2565155029297, 12.631011962890625, 26.331947326660156, -18.356964111328125, 144.30081176757812, -93.27532958984375, 7.341796875, -10.8846435546875, 39.8409423828125, 52.892364501953125, 0.0, -12.088996887207031, 92.94509887695312, 37.896934509277344, -10.1094970703125, 134.03318786621094, -21.695724487304688, -81.03805541992188], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000450.npy"}
{"epoch": 0.9424083769633508, "step": 451, "batch_size": 128, "mean": 49.522300720214844, "std": 77.0085220336914, "min": -149.43003845214844, "p10": -32.23462829589843, "median": 31.110027313232422, "p90": 148.54138793945313, "max": 208.77145385742188, "pos_frac": 0.765625, "sample": [61.9993896484375, 28.320308685302734, -23.381256103515625, 6.960052490234375, 175.0839080810547, 134.39669799804688, 54.15606689453125, 2.513833999633789, -133.87498474121094, -25.86041259765625, 104.38040161132812, 5.807546615600586, 161.855224609375, 22.55633544921875, 29.378509521484375, 100.27789306640625, 102.844970703125, 11.342071533203125, 160.43807983398438, -66.15701293945312, 138.69558715820312, -1.225738525390625, 13.219039916992188, 26.329238891601562, 185.75723266601562, 0.229248046875, 208.77145385742188, 12.08782958984375, -7.0721282958984375, 28.836822509765625, -49.92591857910156, 22.939170837402344, -149.43003845214844, 155.26177978515625, 70.62155151367188, 40.608551025390625, 21.613685607910156, 84.00543212890625, 38.1552734375, 118.99803924560547, 17.457225799560547, 124.82998657226562, 106.45651245117188, 123.68862915039062, 117.26505279541016, 148.14089965820312, 12.652664184570312, -0.743560791015625, 171.76254272460938, 2.95623779296875, 6.5135040283203125, 173.94000244140625, 17.431610107421875, 186.06674194335938, 121.40972900390625, 131.54177856445312, -23.13483428955078, 67.25091552734375, -148.0899658203125, 118.27772521972656, 105.7281494140625, 111.37570190429688, 22.99945068359375, 147.28298950195312, 3.3258209228515625, 46.68104553222656, -91.06574249267578, 2.589111328125, 71.43099975585938, 101.63087463378906, 7.6294403076171875, 8.491058349609375, -29.006988525390625, 137.72845458984375, 11.246826171875, -45.515716552734375, 78.40275573730469, 8.53375244140625, 132.80270385742188, 40.052276611328125, -1.5604248046875, -3.1090011596679688, -29.85260009765625, 87.6329116821289, 144.41558837890625, -24.014984130859375, 16.129013061523438, 
88.78549194335938, 22.823104858398438, 14.008369445800781, 107.30410766601562, 99.8258285522461, -21.139892578125, 147.4957733154297, 29.264053344726562, -130.62890625, 32.84154510498047, -37.792694091796875, -15.525459289550781, 149.47586059570312, 50.4017333984375, 140.23019409179688, 59.552093505859375, 153.30184936523438, -21.431198120117188, -82.23410034179688, -85.3177490234375, -59.15147399902344, 81.80047607421875, 49.353057861328125, 121.31748962402344, 20.10558319091797, 156.88565063476562, 138.46432495117188, -4.502983093261719, 17.820114135742188, 93.47637939453125, 3.808380126953125, 69.91165161132812, 207.6934814453125, 38.85004806518555, -1.6898345947265625, -27.491348266601562, 9.469047546386719, 115.29798889160156, 124.0609130859375, -42.710845947265625, 115.67530822753906], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000451.npy"}
{"epoch": 0.9445026178010472, "step": 452, "batch_size": 128, "mean": 46.86863708496094, "std": 73.29729461669922, "min": -137.8667755126953, "p10": -30.24347229003906, "median": 33.41946792602539, "p90": 151.0196807861328, "max": 205.224609375, "pos_frac": 0.765625, "sample": [-56.559814453125, 124.71768188476562, 42.97967529296875, 183.954345703125, -50.52301025390625, 5.845848083496094, 11.067710876464844, 157.82675170898438, -6.7518310546875, -23.223968505859375, 155.50369262695312, 0.738067626953125, 6.6302337646484375, 17.490333557128906, -2.6468276977539062, 11.23870849609375, 62.849456787109375, -2.239410400390625, -4.58343505859375, 14.61810302734375, 0.008794784545898438, -119.75279235839844, 82.64749145507812, 133.790771484375, 117.48028564453125, 61.240753173828125, 50.069366455078125, 15.690322875976562, 0.010608673095703125, -90.23599243164062, -74.93370056152344, 168.94180297851562, -33.7861328125, 174.29275512695312, 78.26171875, -21.5439453125, 157.04116821289062, 149.09796142578125, -19.162872314453125, 122.03108215332031, 77.08317565917969, 13.4403076171875, 7.998382568359375, 63.630859375, 123.6490478515625, 12.517301559448242, 35.02210235595703, 113.7906494140625, 81.52061462402344, 22.044105529785156, 147.1466064453125, 36.64697265625, 81.8026123046875, -10.805908203125, 10.638160705566406, -45.56048583984375, 14.374465942382812, 17.53106689453125, 22.107452392578125, 45.20654296875, 12.47930908203125, 37.900482177734375, -52.44696044921875, 88.395751953125, -60.67510986328125, 126.20875549316406, 110.14567565917969, 148.12515258789062, 10.032806396484375, 13.576263427734375, 0.8774261474609375, 147.67523193359375, 26.391883850097656, 11.654296875, 126.16456604003906, 205.224609375, 132.2443389892578, -16.956268310546875, 36.40081787109375, 57.92179870605469, -127.18054962158203, 95.5223388671875, 157.7650604248047, -3.7828750610351562, 156.42562866210938, 132.3689422607422, -5.882331848144531, 128.2987823486328, 155.98443603515625, 
31.81683349609375, 60.74812316894531, 45.9366455078125, 11.20651626586914, -0.9159011840820312, 30.72295379638672, -46.15147399902344, 129.01373291015625, 37.56401062011719, 57.453643798828125, -17.28509521484375, 190.1544189453125, 54.618988037109375, 20.3775634765625, 112.746337890625, 45.469642639160156, 1.0370159149169922, -25.4449462890625, 27.545730590820312, 146.75009155273438, 138.49305725097656, 192.96841430664062, 155.9927215576172, 12.109725952148438, 35.26812744140625, 144.94143676757812, 0.0, 55.92041015625, -2.176239013671875, -91.79705810546875, -28.725189208984375, 36.361541748046875, 47.74627685546875, -137.8667755126953, 47.47100830078125, 3.618865966796875, 147.34852600097656, 6.93121337890625, 0.4490985870361328], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000452.npy"}
{"epoch": 0.9465968586387434, "step": 453, "batch_size": 128, "mean": 53.70438766479492, "std": 73.73004150390625, "min": -150.85137939453125, "p10": -32.797717285156246, "median": 45.64855194091797, "p90": 147.31453857421874, "max": 214.47930908203125, "pos_frac": 0.7890625, "sample": [26.798301696777344, 126.80308532714844, 4.125640869140625, 56.66357421875, 128.465087890625, 5.2559356689453125, 17.96045684814453, 5.325202941894531, 132.03956604003906, 82.88088989257812, 60.96587371826172, 62.32411193847656, 19.112716674804688, 90.4158935546875, -28.90350341796875, -45.88763427734375, 61.43048095703125, -150.85137939453125, 12.17364501953125, 146.96493530273438, -25.588348388671875, 141.19876098632812, 144.68386840820312, 113.94853210449219, 66.46990966796875, 8.15228271484375, -43.866943359375, 142.36572265625, 52.713653564453125, 139.6424560546875, 18.84368896484375, 116.93118286132812, 38.37872314453125, 120.15847778320312, 34.33465576171875, 30.2149658203125, 36.22833251953125, -3.34417724609375, 160.513671875, 162.898193359375, 6.795440673828125, 149.50726318359375, 31.0401611328125, 50.916595458984375, 111.3727035522461, -95.462890625, 11.94683837890625, 23.7318115234375, 14.2921142578125, -55.607330322265625, 9.437744140625, 121.58514404296875, 8.040847778320312, 18.797164916992188, -135.787353515625, 33.995086669921875, -4.358062744140625, -46.261627197265625, 52.523834228515625, 10.041046142578125, 110.19607543945312, -9.64453125, 0.0, 113.95947265625, 106.03935241699219, 22.553543090820312, 104.23396301269531, 10.669403076171875, 130.12103271484375, 46.09931945800781, 119.10269165039062, 58.57916259765625, 60.98541259765625, -124.89425659179688, 16.603515625, 145.290283203125, 162.60406494140625, 143.48806762695312, -2.75433349609375, 130.43331909179688, -14.883438110351562, 167.13943481445312, 115.53103637695312, -7.8433837890625, 9.7371826171875, 27.4627685546875, 167.89794921875, -45.246337890625, -32.473876953125, 161.939697265625, 
45.197784423828125, 82.66973876953125, -47.074798583984375, -33.5533447265625, 148.13027954101562, 19.04510498046875, -128.04736328125, 136.9842529296875, -6.752685546875, 214.47930908203125, 53.88916015625, 56.19293212890625, 31.596969604492188, 95.4996337890625, 24.625091552734375, 59.2864990234375, 0.561248779296875, 2.7387237548828125, 156.49526977539062, 55.0576171875, 141.65599060058594, 114.09074401855469, 24.68756103515625, 124.12606811523438, 33.06298828125, 116.56834411621094, 158.64859008789062, 161.52593994140625, -8.811508178710938, -5.235023498535156, 129.77621459960938, 9.532051086425781, 107.83938598632812, -18.022216796875, 131.857666015625, -45.56512451171875, 65.0244140625, 187.96453857421875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000453.npy"}
{"epoch": 0.9486910994764398, "step": 454, "batch_size": 128, "mean": 49.214027404785156, "std": 74.66571044921875, "min": -189.16079711914062, "p10": -22.62349624633789, "median": 25.550467491149902, "p90": 155.51685943603516, "max": 205.96466064453125, "pos_frac": 0.7890625, "sample": [45.00677490234375, 135.61932373046875, -125.72897338867188, -13.891494750976562, -4.9781494140625, 8.5386962890625, 158.3138427734375, 127.38162231445312, 9.35052490234375, 38.39869689941406, 23.606719970703125, 34.00520324707031, -12.25126838684082, 109.03751373291016, 28.88372039794922, 97.72647857666016, -118.42965698242188, 6.136873245239258, 18.093643188476562, -2.8303661346435547, 105.89248657226562, 20.080215454101562, 20.135345458984375, 2.7291259765625, 192.90643310546875, 205.96466064453125, 0.2318115234375, 164.5806884765625, -189.16079711914062, 3.741607666015625, 19.746177673339844, 131.14308166503906, 53.56218719482422, 113.5334701538086, 140.59471130371094, -1.2058639526367188, 166.24508666992188, -22.228622436523438, 15.704833984375, 17.409988403320312, 143.40765380859375, 16.213592529296875, 157.97064208984375, 123.57455444335938, 28.017601013183594, 128.25338745117188, 100.149169921875, 3.75390625, 161.90084838867188, -12.1986083984375, 20.624923706054688, 56.746246337890625, 32.05194091796875, -0.85107421875, 120.83169555664062, -14.3089599609375, 95.15213012695312, 4.3279876708984375, 145.8670654296875, 159.63214111328125, 106.46505737304688, 128.51658630371094, 161.23419189453125, 95.68325805664062, 160.25970458984375, 152.13909912109375, 137.49703979492188, 19.39532470703125, 26.814727783203125, 0.650665283203125, 159.26190185546875, 130.01675415039062, 2.0184326171875, -58.21415328979492, 11.654510498046875, 31.82977294921875, -23.54486846923828, 67.14971923828125, 9.161788940429688, -91.6663818359375, -21.168075561523438, 1.8555221557617188, 152.8284912109375, -40.23486328125, -28.675567626953125, 42.4248046875, 130.20591735839844, 62.3804931640625, 
-25.3094482421875, 141.0101318359375, 28.674972534179688, 12.386520385742188, 17.981689453125, 157.59832763671875, 182.44723510742188, 0.0247802734375, 20.75267791748047, 7.883209228515625, 154.6248016357422, 6.246738433837891, -36.955352783203125, 3.8492279052734375, 150.16629028320312, 25.61720085144043, 25.483734130859375, 26.757293701171875, 146.99343872070312, 32.64887237548828, -58.39013671875, 121.32582092285156, 3.806060791015625, 21.170379638671875, 66.92633056640625, 0.8547821044921875, 25.70068359375, 123.67932891845703, 114.02044677734375, -5.3209228515625, -48.85693359375, 37.170379638671875, 14.953407287597656, -7.1717681884765625, 22.877899169921875, 149.55691528320312, -16.098236083984375, -2.373626708984375, 12.505172729492188, -76.47592163085938], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000454.npy"}
{"epoch": 0.9507853403141361, "step": 455, "batch_size": 128, "mean": 41.185081481933594, "std": 73.11112213134766, "min": -149.69326782226562, "p10": -34.298562622070314, "median": 26.926780700683594, "p90": 140.20736236572264, "max": 199.63616943359375, "pos_frac": 0.6953125, "sample": [127.06866455078125, 72.57278442382812, 24.425613403320312, -10.401092529296875, 99.44821166992188, -16.669677734375, 107.14161682128906, 146.87351989746094, 20.171829223632812, -25.551025390625, 3.988189697265625, 83.35479736328125, -1.97589111328125, 65.41212463378906, 54.74037170410156, 140.0062713623047, 150.8637237548828, 112.41606140136719, 194.39059448242188, -14.2701416015625, 5.7978515625, 35.9078369140625, 11.62744140625, 136.635498046875, 199.63616943359375, 140.12428283691406, 136.14584350585938, 24.485565185546875, 33.17036056518555, 5.236236572265625, -17.611541748046875, -149.69326782226562, -0.7157325744628906, 27.162887573242188, 23.545516967773438, 6.90179443359375, -10.997406005859375, -24.68499755859375, -79.13792419433594, 25.236083984375, 26.177780151367188, -8.574127197265625, 108.88526916503906, 184.82546997070312, -10.7071533203125, 53.80194091796875, 124.23727416992188, -5.96331787109375, 140.40121459960938, 27.33880615234375, 10.116409301757812, 10.02947998046875, 41.28462219238281, -6.103008270263672, -8.641983032226562, -3.42999267578125, 10.76568603515625, -94.31695556640625, -39.189720153808594, -1.0763130187988281, 17.685562133789062, -33.72242736816406, 8.969039916992188, -34.268341064453125, -31.986907958984375, -36.982139587402344, 85.95369720458984, 42.64906311035156, 119.796142578125, 95.58798217773438, -130.47052001953125, 98.30704498291016, 142.93130493164062, 138.23077392578125, 43.5977783203125, -12.723907470703125, -16.49700927734375, 145.53631591796875, 140.53919982910156, 73.62547302246094, 40.426055908203125, -3.0484771728515625, 7.420536041259766, -85.32150268554688, 137.3905029296875, -41.284202575683594, -140.0314483642578, 
61.77217102050781, -4.409149169921875, 23.565792083740234, -7.150177001953125, 88.49627685546875, 2.07574462890625, 41.41484069824219, 20.450454711914062, 95.2838134765625, 183.06646728515625, 185.438232421875, 90.67692565917969, -34.36907958984375, 0.03057861328125, 5.690799713134766, 78.59280395507812, 77.09649658203125, 9.287506103515625, -2.286510467529297, 10.587379455566406, 89.558837890625, 26.690673828125, 34.12041473388672, -83.02532958984375, 135.07257080078125, 65.02841186523438, 82.44929504394531, 125.2744140625, -70.47710418701172, -125.67437744140625, 52.74554443359375, -22.37664794921875, 134.23269653320312, 124.875244140625, 56.943145751953125, 157.67459106445312, 50.89422607421875, 74.84136962890625, 55.487640380859375, 150.5708770751953, 36.52264404296875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000455.npy"}
{"epoch": 0.9528795811518325, "step": 456, "batch_size": 128, "mean": 59.58992004394531, "std": 84.2351303100586, "min": -156.20205688476562, "p10": -45.044238281249996, "median": 40.999515533447266, "p90": 162.84196166992186, "max": 286.078369140625, "pos_frac": 0.734375, "sample": [35.089508056640625, -12.577972412109375, -50.259063720703125, 111.50251770019531, 42.000823974609375, -5.326622009277344, 133.9800567626953, -89.87667846679688, 163.9989013671875, 96.40188598632812, 150.03863525390625, -156.20205688476562, 18.369384765625, -2.65374755859375, -25.88604736328125, 173.81573486328125, 112.64849853515625, -46.282012939453125, 5.264312744140625, 22.2449951171875, 156.37213134765625, 68.72119140625, -18.24542236328125, 145.9767608642578, 1.1587610244750977, -20.244213104248047, 60.3331298828125, 36.46855163574219, 137.83023071289062, -44.513763427734375, 126.79196166992188, -99.0718994140625, 124.84281921386719, -49.50605773925781, 95.47976684570312, 157.4857177734375, 79.32168579101562, 128.24844360351562, 78.62419128417969, 107.060791015625, 193.71173095703125, -19.034210205078125, 161.13134765625, 22.92462158203125, 162.34613037109375, 127.04306030273438, 175.51028442382812, -4.4932861328125, 10.05047607421875, 136.8660888671875, 39.02630615234375, 76.42292785644531, -4.600433349609375, 43.848899841308594, -48.239593505859375, 20.178085327148438, 5.02886962890625, 17.867816925048828, 14.94757080078125, 182.5389404296875, 2.43353271484375, -6.399541854858398, 17.659400939941406, -100.6430435180664, -11.759185791015625, 150.4425048828125, 129.21253967285156, -118.05897521972656, 28.214599609375, 22.25555419921875, 13.59207534790039, 115.13395690917969, -3.214588165283203, 115.3807373046875, 39.998207092285156, 113.90960693359375, 187.3751220703125, 115.1837158203125, -47.09808349609375, 60.41558837890625, 125.14263153076172, 143.05859375, -112.34280395507812, 162.07098388671875, 5.1671142578125, -11.3875732421875, 25.141708374023438, 171.4276123046875, 
140.58802795410156, 168.5421600341797, 118.1046142578125, 176.9132080078125, -6.539253234863281, 156.1263427734375, 0.0, 3.481016159057617, 106.95532989501953, 153.31463623046875, 113.78858184814453, 107.53883361816406, 15.9744873046875, 14.46942138671875, 111.78375244140625, 0.541259765625, 51.45319366455078, -51.9271240234375, 127.90037536621094, 5.8413238525390625, 9.6531982421875, -14.1121826171875, 4.550994873046875, 98.54571533203125, 78.48361206054688, 108.19039916992188, 164.07762145996094, -0.9655513763427734, 236.5982666015625, 21.023651123046875, 286.078369140625, 192.5826416015625, 151.24832153320312, 30.314727783203125, -17.64337158203125, -6.781646728515625, 145.88197326660156, -26.711029052734375, -104.16412353515625, 160.99462890625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000456.npy"}
{"epoch": 0.9549738219895288, "step": 457, "batch_size": 128, "mean": 53.89155578613281, "std": 81.43975067138672, "min": -173.14859008789062, "p10": -38.718469238281244, "median": 42.31947326660156, "p90": 156.4436279296875, "max": 188.5737762451172, "pos_frac": 0.7265625, "sample": [158.68405151367188, 25.960693359375, -61.20252227783203, -6.242828369140625, 109.43032836914062, 32.5030517578125, 142.8887481689453, 151.60150146484375, 171.12026977539062, 11.1326904296875, 138.89190673828125, -0.7025604248046875, 170.26654052734375, 99.1492919921875, 3.867919921875, 174.41995239257812, -3.6766357421875, 31.42437744140625, -104.9509506225586, 156.55484008789062, 114.93646240234375, 135.74009704589844, 90.51703643798828, 188.5737762451172, 113.748291015625, 50.8900146484375, 151.51556396484375, 0.8448028564453125, 150.78045654296875, 18.28741455078125, 58.57373046875, -7.913360595703125, -4.049346923828125, -29.501998901367188, 122.77886962890625, 149.6039581298828, -21.379127502441406, 47.38902282714844, -99.22238159179688, 106.60964965820312, -2.339935302734375, 125.6591567993164, -26.454116821289062, 1.96160888671875, 4.2427978515625, -173.14859008789062, 168.265869140625, 29.927001953125, 163.71051025390625, -10.774383544921875, -42.79046630859375, 67.75238037109375, 97.28009033203125, -59.695068359375, 124.34385681152344, 17.6098690032959, -128.99591064453125, 114.89718627929688, 49.183685302734375, 25.29901123046875, -31.980224609375, -19.765769958496094, 18.731048583984375, -24.41510772705078, 32.75152587890625, 98.53213500976562, 116.30673217773438, 21.283424377441406, 188.39320373535156, -11.121856689453125, 21.114456176757812, 132.26263427734375, 127.76235961914062, 37.44805908203125, 22.83685302734375, -127.68791198730469, -43.033935546875, 32.776519775390625, 23.349349975585938, 79.23420715332031, 22.612060546875, 147.931640625, 110.36540222167969, 71.21014404296875, 129.12362670898438, -19.51971435546875, 92.50334167480469, 1.2855072021484375, 
131.13363647460938, 179.70538330078125, 7.301597595214844, -14.97247314453125, 145.10906982421875, -101.86505126953125, -46.35260009765625, 144.43069458007812, -36.97332763671875, 115.22227478027344, -18.691192626953125, 154.73388671875, 124.74954223632812, 51.66229248046875, 12.878433227539062, 47.190887451171875, -87.60919189453125, 169.7259521484375, 29.2001953125, 100.75748443603516, 9.5955810546875, 109.27684020996094, 159.04534912109375, 95.2547836303711, 8.99261474609375, 156.08372497558594, 141.48333740234375, 122.68817901611328, 5.21527099609375, -1.0679988861083984, -122.22244262695312, 103.89176940917969, 166.0455322265625, 0.9901905059814453, 0.0, 156.39596557617188, -5.3003692626953125, 132.46102905273438, 51.596588134765625, -35.712249755859375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000457.npy"}
{"epoch": 0.9570680628272251, "step": 458, "batch_size": 128, "mean": 48.549034118652344, "std": 77.7068099975586, "min": -156.05496215820312, "p10": -28.74693450927734, "median": 22.644989013671875, "p90": 157.93057556152343, "max": 205.1242218017578, "pos_frac": 0.75, "sample": [-1.3529129028320312, -48.034027099609375, 11.050765991210938, -138.9915771484375, 175.35984802246094, 15.486854553222656, -34.65667724609375, 156.47503662109375, -7.9051055908203125, -29.864044189453125, 21.391571044921875, -66.27665710449219, -46.9959716796875, 0.9755916595458984, 2.2534971237182617, 65.37452697753906, 30.03179931640625, 49.318603515625, -23.21923828125, 63.711639404296875, 159.80825805664062, -0.9101886749267578, 93.30517578125, -1.5100021362304688, 25.418838500976562, 4.963552474975586, 137.36798095703125, 161.73956298828125, -11.57720947265625, 124.92727661132812, 22.42486572265625, -19.887550354003906, 114.10711669921875, 188.58135986328125, 43.11180114746094, 19.1007080078125, 20.38580322265625, 69.49761962890625, 154.41036987304688, -28.268173217773438, 157.1258544921875, -2.376178741455078, -84.59152221679688, 22.8651123046875, 9.316741943359375, 5.8172760009765625, 16.2020263671875, 9.565982818603516, 16.621166229248047, 27.43042755126953, 3.9252662658691406, -36.78924560546875, -17.976242065429688, 146.38916015625, 126.75015258789062, -55.82208251953125, -0.6644287109375, -0.5586166381835938, 114.05611419677734, 180.84600830078125, 34.71612548828125, 2.556072235107422, 197.04144287109375, 51.8055419921875, 101.046142578125, 3.064056396484375, 9.33721923828125, -86.97957611083984, -27.2144775390625, 175.84072875976562, 138.4886474609375, 58.77197265625, 8.959030151367188, 26.17218017578125, -53.27520751953125, 134.8170928955078, 91.08029174804688, -18.031829833984375, 12.520130157470703, 110.3924560546875, 134.44717407226562, -25.378753662109375, -8.939529418945312, 28.948715209960938, 181.47952270507812, 20.1121826171875, 205.1242218017578, 11.584381103515625, 
47.231353759765625, 133.33566284179688, 28.558258056640625, 173.21475219726562, 124.72514343261719, 119.03854370117188, 188.48651123046875, -4.744667053222656, 123.0728759765625, -7.112579345703125, 5.978507995605469, 112.42764282226562, 45.170989990234375, 11.40093994140625, 9.113922119140625, 144.89761352539062, 102.36422729492188, 32.31926727294922, 9.886482238769531, 12.891693115234375, 140.6284637451172, 151.62132263183594, 71.9940185546875, 174.96717834472656, 150.03819274902344, 90.51708984375, 121.2093505859375, 18.52435302734375, 123.6397705078125, 1.4412994384765625, -153.2725830078125, -156.05496215820312, 120.85023498535156, 165.27517700195312, 16.645488739013672, 4.24261474609375, 116.00456237792969, 30.970672607421875, 12.094490051269531, -27.067291259765625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000458.npy"}
{"epoch": 0.9591623036649215, "step": 459, "batch_size": 128, "mean": 36.10858917236328, "std": 82.14163208007812, "min": -160.60305786132812, "p10": -59.49742584228515, "median": 18.971046447753906, "p90": 154.05808715820314, "max": 205.0364990234375, "pos_frac": 0.6640625, "sample": [5.3191375732421875, -38.72666931152344, 8.923500061035156, 131.40185546875, 6.25738525390625, 130.07534790039062, -122.60194396972656, 40.14984130859375, 94.42158508300781, 29.534439086914062, 56.46429443359375, 96.96987915039062, 17.016109466552734, -15.82159423828125, 15.463493347167969, 8.174922943115234, 158.96339416503906, 89.23715209960938, -15.884611129760742, 44.2568359375, 89.1678466796875, 18.53106689453125, -63.083160400390625, 29.863433837890625, 2.894378662109375, -12.95294189453125, 128.68983459472656, -6.339790344238281, 129.14068603515625, -37.44108581542969, -16.983125686645508, 0.0, 4.6715240478515625, -47.776397705078125, 136.0186767578125, -106.9569091796875, 56.02381134033203, -29.67245101928711, -17.883224487304688, 62.968017578125, 142.96075439453125, -18.098434448242188, 18.096923828125, 66.55661010742188, -126.902587890625, 37.682655334472656, -29.229248046875, 26.708587646484375, -17.21694564819336, 164.729248046875, 5.315086364746094, 7.125646591186523, -3.696929931640625, 193.85812377929688, 43.55851745605469, 19.411026000976562, 44.75262451171875, -131.48165893554688, 148.99256896972656, 205.0364990234375, 101.40625, -66.26278686523438, 156.6417236328125, 121.44911193847656, 134.76626586914062, 167.86190795898438, 44.35082244873047, 121.19668579101562, -123.91018676757812, -126.561279296875, 15.37469482421875, 40.5875244140625, 133.19395446777344, 139.04925537109375, -15.086288452148438, -150.75650024414062, 172.91876220703125, 74.89065551757812, 7.559722900390625, -58.079498291015625, -160.60305786132812, -3.66119384765625, -6.771820068359375, -131.88601684570312, 154.17596435546875, 48.044281005859375, -17.24639892578125, 154.007568359375, 0.0, 
16.07434844970703, 130.43016052246094, -4.068756103515625, 20.51324462890625, 186.62557983398438, -88.3577880859375, 177.63088989257812, -5.1640472412109375, 126.22221374511719, 14.842300415039062, 46.149559020996094, -62.80592346191406, 112.96688842773438, 15.0599365234375, -13.357101440429688, -8.472442626953125, 43.316253662109375, 118.50851440429688, 14.791519165039062, 7.3004608154296875, -10.0977783203125, -43.37811279296875, -7.15631103515625, 23.019500732421875, -38.89227294921875, 138.0997314453125, 167.36846923828125, 1.0099029541015625, 21.74970245361328, 147.85794067382812, 29.70489501953125, -6.337921142578125, 163.49099731445312, 177.70697021484375, 28.060089111328125, 38.75848388671875, 5.57672119140625, 45.15740966796875, 108.71142578125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000459.npy"}
{"epoch": 0.9612565445026178, "step": 460, "batch_size": 128, "mean": 36.738304138183594, "std": 84.05647277832031, "min": -170.2200927734375, "p10": -67.46690673828125, "median": 24.913482666015625, "p90": 144.78116455078126, "max": 208.344970703125, "pos_frac": 0.7109375, "sample": [16.31890869140625, 66.62161254882812, -110.56317138671875, 117.26797485351562, 61.74664306640625, 141.7009735107422, -130.52044677734375, 109.6018295288086, -43.11981201171875, 164.41058349609375, -108.48944091796875, 14.223663330078125, 16.68115234375, 11.102752685546875, 44.61237335205078, -9.11676025390625, -75.48776245117188, 13.1922607421875, 145.08779907226562, 160.37548828125, -23.25592041015625, 27.55633544921875, 36.49822998046875, 100.18527221679688, 41.74711608886719, 12.12109375, 92.92034912109375, -5.9262237548828125, 181.76239013671875, -54.464263916015625, -13.64532470703125, 102.75392150878906, 40.05596923828125, 208.344970703125, 99.68612670898438, -56.150474548339844, 110.5777816772461, -122.4466552734375, 197.57379150390625, -18.543060302734375, -154.730224609375, 156.275634765625, -65.38458251953125, 9.354127883911133, 75.59173583984375, 129.64175415039062, -45.304931640625, 120.46028137207031, 9.589614868164062, -56.78839111328125, 92.07418823242188, -1.4527015686035156, 0.9102668762207031, -0.42887115478515625, -19.63982391357422, 9.543975830078125, 2.1171417236328125, -3.18707275390625, -9.822490692138672, 26.320541381835938, 117.3575439453125, 31.823577880859375, 123.06990814208984, 10.596954345703125, 22.871978759765625, -89.39193725585938, 4.399477005004883, 49.91893005371094, 137.83096313476562, -0.3459205627441406, 14.732025146484375, 56.22003173828125, 24.08123779296875, 16.0509033203125, 92.10052490234375, 111.97349548339844, 135.20257568359375, 128.63330078125, 29.43115234375, 30.270381927490234, 166.80010986328125, 19.18408203125, -51.15899658203125, 200.11907958984375, 47.3634033203125, 196.30221557617188, 151.40020751953125, 139.58773803710938, 
-92.0015869140625, 155.75210571289062, 102.52743530273438, 112.01852416992188, -109.80435180664062, 6.8850555419921875, -41.13946533203125, 10.5513916015625, 135.7164306640625, -170.2200927734375, 104.35250854492188, -68.91787719726562, 55.23650360107422, -136.644775390625, 130.53662109375, 45.48280334472656, 14.629005432128906, -64.72309875488281, -12.034614562988281, -66.84506225585938, 144.08689880371094, 37.579620361328125, 25.7457275390625, 117.67294311523438, 30.92169189453125, 140.69668579101562, -101.73406982421875, 3.3312320709228516, 5.184803009033203, 124.93435668945312, 22.10382080078125, 154.29212951660156, 144.64974975585938, 14.338768005371094, 33.349365234375, -63.71010971069336, -60.82057189941406, 7.003326416015625, 23.172332763671875, 33.78352355957031], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000460.npy"}
{"epoch": 0.9633507853403142, "step": 461, "batch_size": 128, "mean": 47.046722412109375, "std": 82.7116928100586, "min": -158.615966796875, "p10": -56.71105957031249, "median": 34.40678405761719, "p90": 160.37742767333984, "max": 197.3892822265625, "pos_frac": 0.7109375, "sample": [16.478179931640625, -129.20040893554688, 162.4176025390625, 18.56371307373047, 33.26887512207031, 120.80368041992188, 136.07025146484375, 23.774169921875, -24.141387939453125, 164.69284057617188, 129.0931396484375, 0.0, 15.137298583984375, 197.3892822265625, 117.59280395507812, 120.1314926147461, 118.45792388916016, 42.8626708984375, -4.137451171875, 9.821891784667969, 57.330848693847656, 163.78948974609375, 82.66419219970703, -69.49510192871094, 73.89218139648438, -7.276153564453125, 12.8011474609375, 172.478271484375, -37.20359802246094, 133.62033081054688, -0.9568634033203125, 176.12091064453125, 53.9930419921875, 130.31846618652344, -29.917129516601562, 137.9446563720703, 173.31256103515625, 17.95849609375, -134.151123046875, -40.856170654296875, 160.83380126953125, -126.226318359375, 3.9547271728515625, 17.550994873046875, 69.9776611328125, 2.93670654296875, 104.93134307861328, 116.46511840820312, 160.1818389892578, -1.44561767578125, 120.08078002929688, -68.73101806640625, 22.773223876953125, -2.28875732421875, 10.474494934082031, -17.386398315429688, 124.69534301757812, -155.60693359375, -61.054229736328125, 60.113616943359375, 113.67449188232422, 1.9627876281738281, -16.194168090820312, 110.96432495117188, 161.33560180664062, 25.980621337890625, -158.615966796875, 7.7885894775390625, 153.35313415527344, 6.4326171875, 3.236927032470703, 140.810791015625, -7.6323699951171875, 148.67340087890625, -11.1412353515625, 175.85516357421875, 78.85723876953125, 119.862548828125, -54.849700927734375, -34.828277587890625, 135.92855834960938, 3.764984130859375, 119.13218688964844, 93.09274291992188, 136.84454345703125, 52.71607208251953, -21.208953857421875, -85.9044189453125, -7.20068359375, 
64.59909057617188, 74.33488464355469, 101.29759216308594, 11.81243896484375, 114.5596923828125, 3.12554931640625, 168.66099548339844, 4.0420684814453125, 35.258087158203125, 136.1402587890625, -112.08353424072266, 87.62682342529297, 0.9423580169677734, 164.92601013183594, 20.453811645507812, -79.99404907226562, -9.5364990234375, 33.55548095703125, 132.37327575683594, 134.1472625732422, -5.7664794921875, -32.83592987060547, 92.40887451171875, -6.446720123291016, 74.21904754638672, 6.20587158203125, 2.8792648315429688, 102.99443054199219, 82.62544250488281, -129.90603637695312, 49.52098846435547, 37.15606689453125, -4.282936096191406, -85.42132568359375, 165.85723876953125, -17.210790634155273, 70.22360229492188, 110.96536254882812, 152.18637084960938], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000461.npy"}
{"epoch": 0.9654450261780104, "step": 462, "batch_size": 128, "mean": 60.074554443359375, "std": 83.52590942382812, "min": -157.72979736328125, "p10": -30.835591888427732, "median": 61.485389709472656, "p90": 161.99844970703126, "max": 211.5135498046875, "pos_frac": 0.7578125, "sample": [125.72261047363281, 56.07862854003906, 2.1305084228515625, 56.894317626953125, 151.84854125976562, 41.118896484375, 133.66024780273438, 78.79620361328125, 64.85396575927734, 160.66915893554688, -125.18023681640625, 125.70660400390625, 170.7999267578125, 42.66224670410156, 162.43731689453125, 12.308303833007812, 50.14740753173828, 120.46942138671875, 80.54965209960938, 7.57257080078125, 50.17156982421875, 122.212646484375, 121.5313720703125, 44.31298828125, 67.73103332519531, -2.00665283203125, 144.8828582763672, -1.299234390258789, 141.416259765625, -85.95828247070312, -123.5318603515625, 85.8857421875, -6.935211181640625, 23.369537353515625, -5.988555908203125, 137.36216735839844, 164.8763427734375, -72.30548095703125, -13.89825439453125, 19.715362548828125, 69.5294189453125, 139.26754760742188, 125.98733520507812, 6.248138427734375, 201.295166015625, 91.65704345703125, 87.03419494628906, 155.35781860351562, 119.80615234375, 3.05340576171875, 163.6021270751953, 7.511329650878906, 125.78355407714844, -99.4735107421875, 12.793251037597656, -11.597183227539062, 161.81036376953125, 78.16586303710938, 12.556983947753906, 14.514312744140625, 130.20199584960938, 185.04022216796875, -4.690299987792969, 146.1586456298828, -17.08135986328125, 87.07312774658203, -98.81253814697266, 72.7890625, 211.5135498046875, -24.327102661132812, 133.80352783203125, 150.91830444335938, 141.0240020751953, 120.86773681640625, 106.87240600585938, 62.469451904296875, -20.63763427734375, 23.506256103515625, 8.67170524597168, -34.74786376953125, 13.970638275146484, -26.01837158203125, -3.70159912109375, 6.201690673828125, -134.30093383789062, 60.50132751464844, 123.5599594116211, 200.2742919921875, 
-88.47064208984375, 102.25505065917969, -87.74636840820312, 177.4951171875, 56.949371337890625, 35.23341369628906, 82.27244567871094, 210.60833740234375, -31.35869598388672, 91.34100341796875, 113.46138000488281, 142.8353271484375, 95.17611694335938, -6.8978271484375, 207.486083984375, 19.449813842773438, -16.02716064453125, 18.182037353515625, 124.54708099365234, 19.440277099609375, 47.60809326171875, 11.27911376953125, 134.078857421875, 5.482818603515625, 56.90887451171875, 158.9127197265625, -131.70281982421875, 189.363525390625, -30.611404418945312, 100.30667877197266, 101.03790283203125, -20.928665161132812, 168.59048461914062, 23.953765869140625, -0.726104736328125, -11.534271240234375, 101.7164306640625, 124.22955322265625, 136.2812042236328, -157.72979736328125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000462.npy"}
|
||||
{"epoch": 0.9675392670157068, "step": 463, "batch_size": 128, "mean": 43.48271179199219, "std": 74.28623962402344, "min": -141.89544677734375, "p10": -32.92863159179687, "median": 28.202590942382812, "p90": 146.5341613769531, "max": 190.03399658203125, "pos_frac": 0.7265625, "sample": [121.18881225585938, 32.80955505371094, 39.18323516845703, 2.2928237915039062, 14.3876953125, -1.873708724975586, 172.1312255859375, 9.8931884765625, -137.5721435546875, 70.2418212890625, 140.65371704101562, 0.0989532470703125, -131.6245574951172, -38.79261016845703, 4.1502685546875, 154.31890869140625, 37.521331787109375, 47.5584716796875, 19.76226806640625, 7.043306350708008, 18.657546997070312, -124.50503540039062, -21.008026123046875, 7.360576629638672, 81.30328369140625, 0.0982208251953125, -74.444091796875, 32.025970458984375, 150.128173828125, 158.13180541992188, -129.88978576660156, 94.43582916259766, 17.497756958007812, -32.291534423828125, -35.02996826171875, 169.3756103515625, 147.8026123046875, 136.85145568847656, 190.03399658203125, -35.539337158203125, 103.07623291015625, -34.415191650390625, 81.19485473632812, 179.63134765625, 63.784576416015625, 12.9349365234375, 168.4471435546875, -3.248748779296875, 23.86944580078125, 2.0421066284179688, 16.725318908691406, 28.730323791503906, 153.6473388671875, 51.73138427734375, 7.2576141357421875, 21.39910888671875, 157.19802856445312, -20.6541748046875, 138.87335205078125, 118.68539428710938, 71.816162109375, 138.46664428710938, 10.12237548828125, -15.886032104492188, -25.051773071289062, 20.549118041992188, 145.99053955078125, -12.78375244140625, 112.30828857421875, 50.79913330078125, 138.47015380859375, -27.599853515625, -5.91796875, 60.158203125, 73.80062866210938, -11.23175048828125, 109.54635620117188, -23.886398315429688, -22.870101928710938, 28.410797119140625, 135.92935180664062, 14.52996826171875, 72.0023193359375, 106.74844360351562, 82.74041748046875, 6.896223068237305, 25.139984130859375, 123.59146881103516, 
49.09825134277344, -16.77794075012207, -16.66546630859375, 95.46746826171875, -39.115997314453125, 0.0, -10.1976318359375, 4.322391510009766, 8.306119918823242, 167.29946899414062, 40.82066345214844, -12.904304504394531, 76.51741027832031, 111.9530029296875, -10.307769775390625, 1.9281768798828125, 27.994384765625, -77.01828002929688, 32.131591796875, 10.965087890625, 97.84161376953125, -0.8434200286865234, 6.3900146484375, -44.183380126953125, -7.741241455078125, 84.0993881225586, 0.4307861328125, 134.83065795898438, 140.87844848632812, 35.33040237426758, 140.86532592773438, 143.49612426757812, 184.18789672851562, 36.89044189453125, 47.085052490234375, 73.51608276367188, 114.587158203125, 138.976318359375, -141.89544677734375, -30.838134765625], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000463.npy"}
|
||||
{"epoch": 0.9696335078534032, "step": 464, "batch_size": 128, "mean": 44.975013732910156, "std": 74.77018737792969, "min": -181.11212158203125, "p10": -34.67355041503906, "median": 26.413820266723633, "p90": 143.21588134765625, "max": 183.3138427734375, "pos_frac": 0.7578125, "sample": [0.0, 89.68539428710938, 23.455490112304688, 127.93014526367188, -13.209915161132812, 114.79093170166016, 75.35702514648438, -34.828277587890625, 145.63504028320312, -0.2574119567871094, 58.844146728515625, 56.9588623046875, 128.54647827148438, 27.189712524414062, 118.47697448730469, 21.89072036743164, 143.25198364257812, -4.091552734375, 134.05526733398438, 97.02116394042969, 46.9169921875, 138.42495727539062, 12.288492202758789, 8.156463623046875, 60.22149658203125, 95.07373046875, 38.17060089111328, -133.07723999023438, 183.3138427734375, 16.670608520507812, 0.19537353515625, -17.21160888671875, 20.793441772460938, 167.26904296875, 148.0457763671875, 150.17721557617188, 30.343658447265625, 11.728782653808594, -11.016571044921875, 96.90345764160156, 12.650726318359375, 28.0185546875, 138.94512939453125, 83.99174499511719, 94.51535034179688, -3.7823486328125, -11.07318115234375, 20.637786865234375, 1.1426849365234375, 62.22660827636719, 15.035446166992188, 22.392578125, 7.377777099609375, 121.26615905761719, 9.518413543701172, 95.67724609375, 126.78616333007812, 101.80596923828125, -134.43511962890625, 86.80374145507812, -3.867462158203125, 125.72897338867188, -8.14556884765625, 136.7001953125, 163.57528686523438, -9.23150634765625, 86.51142883300781, 24.900070190429688, 30.01068115234375, 103.52880859375, -80.7537841796875, -181.11212158203125, -96.67623901367188, 1.956085205078125, 13.44464111328125, 20.601806640625, -17.95758056640625, -58.04974365234375, 114.601806640625, 6.737861633300781, 147.93711853027344, 172.76519775390625, 133.39404296875, 157.82754516601562, -36.6287841796875, 143.20040893554688, 1.3771858215332031, -34.60723876953125, 10.7144775390625, 
3.2518672943115234, 39.847320556640625, -81.85194396972656, 140.78634643554688, -6.26934814453125, -77.6866455078125, 7.752128601074219, -2.0962066650390625, 121.28872680664062, 15.731903076171875, 11.021419525146484, 96.07817840576172, 156.2906036376953, 25.637928009033203, 149.59579467773438, 144.62841796875, -32.90972900390625, 111.40585327148438, 4.88226318359375, 126.72588348388672, -99.18930053710938, 11.329376220703125, 7.782978057861328, -15.518299102783203, 63.342529296875, 112.0430908203125, 45.13771438598633, 105.50286865234375, 130.30047607421875, 124.37030029296875, -156.92620849609375, 21.09991455078125, 24.915283203125, -48.93800735473633, 89.58477783203125, 18.129093170166016, -21.721603393554688, 36.34315490722656, 33.02685546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000464.npy"}
|
||||
{"epoch": 0.9717277486910995, "step": 465, "batch_size": 128, "mean": 59.35228729248047, "std": 80.50604248046875, "min": -162.10317993164062, "p10": -33.29132080078125, "median": 43.28504180908203, "p90": 162.1976058959961, "max": 196.57443237304688, "pos_frac": 0.75, "sample": [-1.7865447998046875, -7.2039794921875, 10.8834228515625, 42.96708679199219, 149.2779541015625, -14.224752426147461, -40.9017333984375, 168.613037109375, 10.8846435546875, 26.906173706054688, 3.428253173828125, 12.20074462890625, -2.6856689453125, 9.054718017578125, -108.34597778320312, -33.9300537109375, 164.3916015625, 139.76760864257812, 156.64361572265625, 155.7379150390625, 139.53289794921875, -43.44317626953125, 41.11140441894531, 145.71829223632812, 32.4478759765625, -0.28766632080078125, 4.52178955078125, 148.31430053710938, 162.17572021484375, 123.56858825683594, 0.29705810546875, 128.47145080566406, 29.897369384765625, 108.52426147460938, 0.7569122314453125, -9.95989990234375, -34.5579833984375, 163.77853393554688, 117.61195373535156, 196.57443237304688, 3.7087860107421875, 141.10919189453125, -1.24853515625, -28.48785400390625, -68.53131103515625, 33.6063232421875, -6.679901123046875, 35.706642150878906, -25.32684326171875, 113.544921875, 120.61463928222656, 70.55740356445312, 80.37812805175781, 119.90324401855469, 105.65736389160156, -64.47655487060547, 181.63302612304688, 38.25846862792969, -12.975677490234375, 21.4124755859375, 27.594818115234375, 45.141754150390625, 43.602996826171875, 97.84425354003906, 154.31063842773438, 27.683761596679688, 166.0029296875, 184.27267456054688, -11.16729736328125, 24.7094783782959, 69.61508178710938, 141.90240478515625, 122.94389343261719, 108.86305236816406, 87.90010070800781, -75.8101806640625, -7.72259521484375, 47.08917236328125, 25.642959594726562, 85.64991760253906, 27.0281982421875, 9.971893310546875, 4.8521728515625, 10.811767578125, 6.2625732421875, 116.07233428955078, 172.35064697265625, -41.115394592285156, 56.15723419189453, 
-12.473358154296875, 158.93572998046875, 177.23150634765625, 195.6513671875, 84.45044708251953, 116.27691650390625, 47.149261474609375, 172.00689697265625, 153.47833251953125, 159.35955810546875, 3.8882369995117188, 144.49209594726562, 148.1917266845703, 173.0439453125, 153.55038452148438, 10.228591918945312, 127.96649932861328, -16.691131591796875, 138.88232421875, -127.42547607421875, 64.67031860351562, -75.95718383789062, 2.085540771484375, 152.9212646484375, 144.5322265625, 141.6208038330078, 162.24867248535156, -114.12588500976562, 132.15545654296875, 4.830780029296875, 10.123193740844727, -33.017578125, 97.831298828125, -14.4693603515625, -17.144317626953125, 142.1646728515625, -162.10317993164062, -8.69482421875, 75.66970825195312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000465.npy"}
|
||||
{"epoch": 0.9738219895287958, "step": 466, "batch_size": 128, "mean": 62.473541259765625, "std": 72.90467071533203, "min": -136.270263671875, "p10": -14.173952484130854, "median": 55.61919403076172, "p90": 147.37969512939452, "max": 203.7099609375, "pos_frac": 0.8125, "sample": [1.2674636840820312, 97.56767272949219, 121.92341613769531, 135.37490844726562, 113.59402465820312, 155.24404907226562, 54.569122314453125, 0.0, 76.91876220703125, 70.52407836914062, 135.41566467285156, -24.19643783569336, 112.23941040039062, 94.11968994140625, 43.2657470703125, 193.19293212890625, 43.468406677246094, 5.98457145690918, 122.4586181640625, 70.82110595703125, 106.36784362792969, 25.94189453125, 56.66926574707031, -113.65255737304688, 21.789642333984375, 52.774749755859375, 102.72102355957031, 25.82965087890625, 115.68244171142578, 99.92295837402344, -0.5589103698730469, 139.330078125, 5.237144470214844, 116.8912353515625, 119.31341552734375, 5.937625885009766, 140.8870849609375, -8.195541381835938, 140.09344482421875, 71.79437255859375, 197.13267517089844, 4.9705810546875, 189.98028564453125, 51.16229248046875, 100.97454833984375, 163.19500732421875, 33.615272521972656, -51.87548828125, -104.86672973632812, 8.163055419921875, 6.0538787841796875, 130.70248413085938, 2.694610595703125, 131.39215087890625, 98.45196533203125, 145.12667846679688, -12.832427978515625, -136.270263671875, 28.43511962890625, 123.97064208984375, 114.95314025878906, 128.61500549316406, 99.13743591308594, 104.64198303222656, 15.227691650390625, 115.82882690429688, 138.85821533203125, 118.49420166015625, 167.18447875976562, 124.97051239013672, 0.0, 132.80909729003906, 188.3922119140625, -26.490447998046875, -8.152061462402344, 36.70746612548828, 113.69364929199219, -27.325973510742188, 82.573974609375, 203.7099609375, 5.747657775878906, 9.38604736328125, 185.6673583984375, 69.55776977539062, 30.066146850585938, -132.50881958007812, 21.07708740234375, 25.779571533203125, 24.85986328125, 102.1905288696289, 
135.16880798339844, 20.171981811523438, 18.06427764892578, -19.276046752929688, 183.5074462890625, 3.3620500564575195, 53.75274658203125, 126.4296875, -54.41517639160156, -115.353271484375, 66.73907470703125, 25.740966796875, -1.217010498046875, 152.63673400878906, 22.163124084472656, 0.0, 139.62030029296875, 17.370849609375, -0.8522167205810547, 81.14224243164062, 158.84286499023438, 103.08624267578125, -3.354034423828125, 125.78401184082031, 77.27224731445312, -2.718231201171875, 25.81683349609375, 41.35227966308594, 23.84815216064453, 3.964111328125, 26.127044677734375, 16.3104248046875, 175.9512939453125, 122.48577880859375, -17.34423828125, -17.304176330566406, 128.81512451171875, 28.562423706054688], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000466.npy"}
|
||||
{"epoch": 0.9759162303664921, "step": 467, "batch_size": 128, "mean": 47.513648986816406, "std": 80.28214263916016, "min": -145.08908081054688, "p10": -49.19741516113281, "median": 33.60150718688965, "p90": 154.27455444335936, "max": 191.77017211914062, "pos_frac": 0.671875, "sample": [149.83541870117188, 125.2972412109375, 86.51951599121094, 56.5035514831543, 41.69953155517578, -9.865081787109375, 53.1060791015625, 95.0767593383789, 104.7093505859375, -0.777252197265625, -140.43145751953125, -145.08908081054688, -10.2132568359375, -11.986297607421875, 20.07623291015625, 129.615966796875, -58.424407958984375, 172.44259643554688, 0.38897705078125, 100.794189453125, -18.55743408203125, 114.81118774414062, -18.328125, 98.49455261230469, 147.8745574951172, 62.952789306640625, 20.714813232421875, 116.43498229980469, 1.3132858276367188, 13.229248046875, 23.13861083984375, 138.8651123046875, 74.642822265625, 151.62066650390625, 161.26959228515625, 77.79623413085938, 28.408493041992188, 164.80938720703125, 61.55220031738281, -61.007080078125, 165.06448364257812, -4.383941650390625, 0.0304107666015625, 156.07223510742188, 75.12557983398438, 133.06231689453125, -12.479072570800781, -39.881591796875, 156.99191284179688, -108.2977523803711, 11.395111083984375, -48.976470947265625, -33.65655517578125, 142.25535583496094, -46.541046142578125, 125.5977783203125, -21.09787368774414, -41.59625244140625, 131.75009155273438, 177.30419921875, 13.7144775390625, 36.580650329589844, -0.5513820648193359, 11.80419921875, 119.98910522460938, 15.56976318359375, 0.32466888427734375, 77.80520629882812, -15.941741943359375, -60.03962326049805, 7.967803955078125, -127.02798461914062, 131.61929321289062, 0.0, -1.3610687255859375, -5.5148468017578125, 73.49237060546875, 169.17054748535156, 3.189666748046875, -0.876953125, 0.0, 34.891876220703125, 121.26541137695312, -8.10443115234375, 101.43646240234375, 85.15997314453125, 124.42645263671875, -39.53082275390625, 184.81478881835938, 
-96.02981567382812, -61.53215026855469, 172.29248046875, 107.20843505859375, 111.06227111816406, 153.50411987304688, 0.8691139221191406, -19.610671997070312, 26.202545166015625, -7.3920135498046875, 81.52043151855469, -34.471954345703125, 139.46710205078125, 140.02493286132812, 38.535552978515625, -17.824432373046875, 31.172103881835938, -79.15434265136719, -64.41912841796875, 14.486712455749512, 90.8990478515625, 148.04205322265625, 114.51243591308594, -103.74800109863281, 119.26350402832031, 9.704277038574219, 9.03436279296875, -13.4349365234375, 51.86161804199219, 191.77017211914062, 32.31113815307617, 145.172119140625, 104.57675170898438, 43.0177001953125, 149.93251037597656, 161.78390502929688, 170.168212890625, -49.71295166015625, -20.645721435546875], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000467.npy"}
|
||||
{"epoch": 0.9780104712041885, "step": 468, "batch_size": 128, "mean": 40.69056701660156, "std": 72.39042663574219, "min": -120.86965942382812, "p10": -42.4084228515625, "median": 20.652969360351562, "p90": 143.89686279296873, "max": 215.043212890625, "pos_frac": 0.6953125, "sample": [36.469024658203125, 166.9375, 126.41651916503906, -44.47095489501953, 33.62925720214844, 18.0101318359375, -30.230682373046875, 139.14031982421875, 54.79656219482422, 33.68518829345703, 31.2696533203125, -37.64134216308594, 26.97796630859375, 161.36761474609375, -60.479217529296875, -15.030746459960938, -0.160064697265625, 152.36764526367188, -87.29679870605469, -23.49310302734375, 71.9547119140625, 14.973085403442383, -68.18571472167969, 59.49407958984375, -7.11260986328125, 189.92327880859375, 48.78167724609375, 55.59185791015625, 6.743133544921875, -24.128490447998047, 2.42071533203125, 20.246566772460938, 163.09780883789062, -17.16263771057129, 34.821441650390625, -20.232114791870117, -22.644119262695312, 143.20291137695312, -107.5469970703125, 19.621978759765625, 8.825263977050781, 24.414276123046875, -42.231231689453125, 125.34909057617188, 5.217132568359375, 6.6237335205078125, 68.84136962890625, -77.74382019042969, 41.144805908203125, 3.868804931640625, -4.4719085693359375, 6.676700592041016, 28.17840576171875, -2.512451171875, 13.801727294921875, 17.4449462890625, 10.406768798828125, 116.675048828125, 114.06182098388672, 130.61973571777344, 150.57888793945312, -23.18023681640625, -120.86965942382812, 10.571525573730469, 215.043212890625, 8.421478271484375, 14.680046081542969, 150.52423095703125, -16.54534912109375, 68.50521850585938, 50.409698486328125, 93.03945922851562, -26.931884765625, 75.1142578125, 129.921142578125, 135.25054931640625, 21.059371948242188, 19.128158569335938, 8.309455871582031, 103.18478393554688, -26.28396224975586, 178.23565673828125, 32.73657989501953, 152.29420471191406, 31.1361083984375, 103.78491973876953, 124.9771728515625, -18.188705444335938, 
119.28329467773438, 101.03636932373047, -9.694259643554688, 5.481813430786133, 1.666748046875, 172.7977294921875, -0.95556640625, 33.72801208496094, 109.50947570800781, 2.8207168579101562, 36.962493896484375, -1.85235595703125, 135.08316040039062, 140.783935546875, 49.61317443847656, 7.17095947265625, -47.79278564453125, -28.1932373046875, -55.271148681640625, 69.79302978515625, 52.6322021484375, 131.9083251953125, -28.59539794921875, -4.208251953125, 94.09107971191406, -72.38798522949219, -54.363311767578125, 15.36767578125, 130.94081115722656, 9.19683837890625, -4.3579254150390625, -96.06522369384766, 124.67146301269531, 133.69097900390625, 90.22335815429688, 149.90484619140625, -42.821868896484375, 145.51608276367188, -28.383941650390625, 133.23995971679688], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000468.npy"}
|
||||
{"epoch": 0.9801047120418848, "step": 469, "batch_size": 128, "mean": 54.22024917602539, "std": 82.36051177978516, "min": -187.67721557617188, "p10": -30.50327453613281, "median": 39.46431350708008, "p90": 154.6607681274414, "max": 198.07403564453125, "pos_frac": 0.7578125, "sample": [16.692962646484375, 152.32156372070312, 0.0, 12.57956314086914, 28.13116455078125, 99.82106018066406, -18.440582275390625, 27.00091552734375, 0.132354736328125, 90.931640625, 106.17440795898438, 36.009674072265625, 129.42300415039062, -173.20248413085938, 31.349639892578125, -5.4594573974609375, 172.36788940429688, 131.57923889160156, 21.4915771484375, 116.40489959716797, -13.987075805664062, 73.84379577636719, 5.978065490722656, -0.8534488677978516, -132.718994140625, 146.60733032226562, 18.32757568359375, 138.69854736328125, 156.73458862304688, 39.369964599609375, 122.0720443725586, 62.62066650390625, 192.2774658203125, 139.30599975585938, 2.227886199951172, 89.16482543945312, 48.89642333984375, 84.742919921875, 150.62896728515625, 15.4122314453125, -2.27166748046875, -32.54766845703125, 30.30645751953125, 6.65521240234375, 124.75234985351562, -4.537628173828125, 20.993072509765625, 25.22100830078125, 52.592742919921875, 36.1448974609375, 114.37202453613281, 158.2176513671875, 145.35487365722656, 159.82440185546875, -118.30108642578125, 108.9137954711914, 134.01153564453125, 180.14950561523438, 141.08868408203125, -107.31381225585938, 5.266613006591797, -10.5169677734375, 168.4381103515625, -73.08645629882812, -105.45542907714844, 62.68927001953125, 23.94921875, -3.61419677734375, -28.947296142578125, 147.55215454101562, 6.5148162841796875, 126.81233215332031, 70.94775390625, 143.66624450683594, -187.67721557617188, 7.581005096435547, 50.4896240234375, 133.9174041748047, -7.6053619384765625, 198.07403564453125, 146.2342071533203, 101.11404418945312, 34.0654296875, 0.20550537109375, 149.02191162109375, -12.4656982421875, -169.65997314453125, 5.365129470825195, 45.5819091796875, 
20.738502502441406, 98.37417602539062, 167.30615234375, -2.647216796875, 147.18649291992188, -33.08056640625, -36.409454345703125, 0.478759765625, 4.409637451171875, -16.705219268798828, 132.5467071533203, -12.485763549804688, 4.097984313964844, -20.086883544921875, 185.26571655273438, 10.875167846679688, -46.80322265625, 153.77198791503906, 82.83993530273438, 124.3671875, 167.121826171875, 123.4154052734375, 9.425979614257812, 149.25442504882812, 44.337646484375, -1.8092193603515625, 70.0770263671875, 17.52178192138672, 137.66302490234375, 116.29986572265625, 39.55866241455078, 162.13470458984375, 148.7984619140625, 85.90225219726562, -29.627105712890625, 46.375244140625, 31.6273193359375, 178.57376098632812, -37.242462158203125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000469.npy"}
|
||||
{"epoch": 0.9821989528795811, "step": 470, "batch_size": 128, "mean": 60.41712951660156, "std": 79.7147445678711, "min": -149.636474609375, "p10": -26.708163452148433, "median": 46.577850341796875, "p90": 161.56522064208983, "max": 203.7023162841797, "pos_frac": 0.78125, "sample": [-15.8558349609375, 19.599395751953125, 153.79696655273438, 48.36651611328125, -2.4897918701171875, 112.79293823242188, 4.43450927734375, 136.01593017578125, 147.07025146484375, 22.771240234375, -28.703125, 5.9364471435546875, -4.7185516357421875, 139.09942626953125, 107.926025390625, 131.44821166992188, 152.6107177734375, 0.760528564453125, -12.735740661621094, 46.5458984375, 105.66265869140625, -56.50518798828125, -84.21482849121094, -113.95010375976562, 189.1339111328125, 120.2747802734375, 56.11384582519531, 26.6434326171875, 142.74337768554688, 116.08657836914062, 192.6536865234375, 112.66145324707031, 7.714759826660156, -0.182373046875, 119.99702453613281, -72.3785400390625, 122.85211181640625, 122.4019775390625, 27.0595703125, -77.45054626464844, 11.11224365234375, 178.22897338867188, 108.99592590332031, 19.10479736328125, 11.255950927734375, 134.10708618164062, 171.28256225585938, 40.183319091796875, 31.544994354248047, 104.05242919921875, 5.576751708984375, 126.2015380859375, 26.46527099609375, 32.49952697753906, -16.450042724609375, 122.27349853515625, 84.93339538574219, -0.2144927978515625, 131.1881866455078, 15.1986083984375, 65.04157257080078, 3.420635223388672, 140.9556427001953, 38.8204345703125, 25.579055786132812, 24.211692810058594, 4.468162536621094, -149.636474609375, 17.00384521484375, 91.09561157226562, 2.68756103515625, 128.043701171875, 119.20339965820312, -13.842529296875, -100.285400390625, -48.152198791503906, -25.853179931640625, 111.9943618774414, 156.31446838378906, 132.68341064453125, 32.94866943359375, 33.99851989746094, 128.33563232421875, 184.86370849609375, 100.45054626464844, -19.654014587402344, 7.2721710205078125, 168.82623291015625, 
167.97125244140625, 166.35914611816406, -21.5115966796875, 142.59561157226562, 110.14350891113281, 131.57815551757812, -1.8574905395507812, 46.583831787109375, 18.927452087402344, 119.69706726074219, 101.05642700195312, -9.280517578125, 152.73828125, 159.51068115234375, 102.5789794921875, 203.7023162841797, 7.3787078857421875, 129.54498291015625, 0.630340576171875, 7.621185302734375, -5.287200927734375, -124.94953155517578, 10.343505859375, -50.73199462890625, 179.11300659179688, -23.504669189453125, 105.25040435791016, 185.20257568359375, 2.28265380859375, 46.571868896484375, 70.9473876953125, 69.2171630859375, 175.69091796875, -104.27951049804688, 65.4921875, 37.28900146484375, 179.06494140625, 147.21405029296875, 154.91806030273438, -42.745880126953125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000470.npy"}
|
||||
{"epoch": 0.9842931937172775, "step": 471, "batch_size": 128, "mean": 43.34907913208008, "std": 75.12560272216797, "min": -143.0671844482422, "p10": -30.215058517456047, "median": 28.718502044677734, "p90": 148.51367645263673, "max": 201.18289184570312, "pos_frac": 0.6953125, "sample": [-68.24687194824219, 137.0481414794922, 30.966644287109375, 61.882904052734375, 122.00912475585938, 186.56988525390625, 131.7457733154297, -104.44033813476562, -11.661670684814453, 123.76463317871094, -45.87547302246094, -19.715362548828125, 14.494354248046875, 109.50108337402344, 0.753387451171875, 44.17633056640625, 17.91058349609375, 33.35554504394531, -22.5675048828125, -15.9501953125, 26.8648681640625, 27.263214111328125, 19.1343994140625, 35.41375732421875, -46.53923034667969, -9.132377624511719, 170.393798828125, -6.37835693359375, -28.212066650390625, 33.35749816894531, 162.71115112304688, 53.06689453125, 58.5074462890625, 11.25970458984375, -4.308197021484375, 68.10861206054688, 94.42640686035156, 100.48067474365234, 38.47392272949219, 15.175506591796875, 6.6472930908203125, -4.60968017578125, 145.45477294921875, 134.76284790039062, -20.80856704711914, 75.14804077148438, 29.246566772460938, 178.00938415527344, 128.17184448242188, -7.785923004150391, 7.4756317138671875, 11.5262451171875, 148.41799926757812, 60.73065185546875, -99.17843627929688, 121.859375, 105.19882202148438, 3.2168655395507812, -3.020050048828125, 28.19043731689453, 192.63702392578125, 200.36181640625, 13.943685531616211, -4.50286865234375, -34.88870620727539, -7.089128494262695, -9.036430358886719, 5.2382965087890625, -143.0671844482422, 27.53192138671875, 59.05909729003906, 75.82546997070312, 129.29139709472656, 36.6705322265625, -125.91838073730469, 34.439849853515625, 27.5509033203125, 93.70763397216797, 33.463592529296875, -25.59771728515625, 97.08984375, -17.341705322265625, -3.84027099609375, -8.477333068847656, 51.31689453125, 117.7728500366211, 15.030189514160156, 6.035697937011719, 
110.27497863769531, -2.84576416015625, 124.68963623046875, 139.07540893554688, -5.9515228271484375, 53.67401123046875, 13.710235595703125, 148.73692321777344, 151.84173583984375, -54.54051208496094, -7.1359405517578125, -23.637893676757812, 35.4948616027832, 18.072555541992188, 11.360870361328125, 64.8492431640625, 126.33932495117188, 151.1463623046875, 44.524749755859375, 201.18289184570312, 70.45051574707031, 8.802032470703125, 101.86471557617188, 106.03173828125, 160.09356689453125, -102.61009979248047, 140.4810791015625, 21.035308837890625, 114.23919677734375, 0.0, 49.569427490234375, -66.62158203125, 164.33245849609375, 132.78240966796875, 164.9256591796875, -16.726806640625, 17.353271484375, -18.609405517578125, -108.61427307128906, -122.60916137695312], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000471.npy"}
|
||||
{"epoch": 0.9863874345549738, "step": 472, "batch_size": 128, "mean": 57.035438537597656, "std": 80.103271484375, "min": -140.5115966796875, "p10": -32.57711486816406, "median": 43.61918640136719, "p90": 153.9479766845703, "max": 286.281494140625, "pos_frac": 0.75, "sample": [286.281494140625, -6.565185546875, -4.839242935180664, -85.42440032958984, 3.0194129943847656, 5.206024169921875, -13.545795440673828, 126.64344787597656, 1.825439453125, 16.428253173828125, 91.09259033203125, 71.44415283203125, 152.43212890625, 36.307212829589844, -61.753196716308594, 130.29833984375, -15.324119567871094, 10.960182189941406, -31.5986328125, 78.94554138183594, 9.4713134765625, 28.18096923828125, 96.5068359375, 42.601348876953125, -16.173606872558594, 158.77450561523438, 114.6965560913086, -105.21588134765625, 4.109500885009766, 134.69058227539062, 44.892425537109375, -58.81886291503906, 20.267913818359375, 189.511474609375, 25.494827270507812, -27.763412475585938, 115.06585693359375, 147.85537719726562, -32.51066589355469, 4.750587463378906, 24.16082763671875, 112.90615844726562, 91.81466674804688, 88.66778564453125, -27.450637817382812, 19.906387329101562, 16.501220703125, 7.643280029296875, 22.0711669921875, 15.214447021484375, 125.69937133789062, 52.0133056640625, 150.26785278320312, -98.649658203125, 121.096923828125, 128.85726928710938, 148.04000854492188, 121.76437377929688, 95.87091064453125, 28.365394592285156, -7.6868743896484375, 75.1956787109375, 11.773580551147461, -1.5542716979980469, 126.6903076171875, 149.88641357421875, 107.8517074584961, 74.15093994140625, 100.86524200439453, -2.501800537109375, 38.828121185302734, 34.70716094970703, 46.151641845703125, 139.06089782714844, 122.99349975585938, 186.85586547851562, 157.48495483398438, -8.36134147644043, 148.14271545410156, 26.96167755126953, -11.89990234375, 211.59771728515625, 142.7745819091797, 186.9495849609375, 29.924713134765625, 187.55349731445312, 180.13229370117188, 152.39056396484375, 128.38067626953125, 
-140.5115966796875, -32.73216247558594, -93.98089599609375, 101.13795471191406, 67.85086059570312, 3.0531005859375, 116.78524780273438, 134.61407470703125, 11.329208374023438, -33.390960693359375, 95.905517578125, 144.34420776367188, -10.650924682617188, 149.52426147460938, 118.34252166748047, 66.70472717285156, 50.44837188720703, -3.245849609375, 22.68255615234375, -2.8388233184814453, 3.02117919921875, 164.21047973632812, 157.72279357910156, 4.811151504516602, 15.33319091796875, 41.11865234375, -102.65574645996094, -68.91572570800781, 212.9326171875, -12.81561279296875, 44.63702392578125, 158.19003295898438, 75.34100341796875, -96.17822265625, 97.38290405273438, 109.59916687011719, 112.61390686035156, -35.141929626464844, -14.324419021606445], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000472.npy"}
|
||||
{"epoch": 0.9884816753926702, "step": 473, "batch_size": 128, "mean": 57.18824005126953, "std": 78.77610778808594, "min": -160.33119201660156, "p10": -25.829074096679673, "median": 45.34823417663574, "p90": 157.09491729736328, "max": 195.07168579101562, "pos_frac": 0.7578125, "sample": [10.136978149414062, 161.7662353515625, 60.57905578613281, -1.518991470336914, 107.71023559570312, 4.848968505859375, 33.514068603515625, 16.67620849609375, 41.255126953125, -12.607513427734375, 54.70855712890625, 37.05290222167969, -66.75164794921875, 27.588279724121094, -98.25574493408203, 47.518680572509766, 45.41033935546875, 12.489593505859375, 107.92315673828125, 129.06622314453125, 97.57327270507812, 128.72113037109375, 175.14361572265625, 195.07168579101562, 74.1202392578125, -101.18389892578125, -14.543594360351562, 112.34918212890625, 70.02568054199219, 32.24400329589844, 163.82406616210938, 155.97406005859375, 137.07638549804688, 35.165283203125, -14.634391784667969, -41.71392822265625, 49.28717041015625, -104.05269622802734, 185.70083618164062, -16.021804809570312, 94.70240020751953, 125.0877685546875, 37.218658447265625, -14.010196685791016, -14.385452270507812, -36.835235595703125, -43.91064453125, 153.872802734375, 52.9300537109375, 28.61761474609375, 21.040313720703125, 23.088241577148438, -0.21218109130859375, 168.71417236328125, -16.840240478515625, 158.94024658203125, -21.65411376953125, 124.10313415527344, 177.213134765625, -144.88848876953125, 65.48837280273438, 164.5296630859375, 159.67874145507812, 4.329437255859375, 27.698287963867188, -35.570648193359375, 194.97708129882812, 149.9508056640625, 72.32455444335938, 174.4599609375, 90.28277587890625, 120.53206634521484, 100.6300048828125, 13.497940063476562, 3.6941986083984375, 131.65512084960938, 125.66000366210938, 17.88794708251953, 1.5869140625, 117.89556884765625, -3.90008544921875, -20.8798828125, 149.8182373046875, 116.47892761230469, 120.96511840820312, 156.30406188964844, -160.33119201660156, 
-95.4885025024414, 127.17282104492188, -17.05889892578125, 22.168212890625, 136.10479736328125, 3.039642333984375, 44.98828125, -4.811553955078125, -59.06263732910156, 45.286128997802734, 34.870941162109375, 42.871238708496094, -5.00872802734375, 155.97308349609375, 123.65444946289062, 122.96062469482422, 0.0, 131.91351318359375, 75.0970458984375, -122.39480590820312, 126.48094177246094, 30.51123809814453, 118.27837371826172, 130.39776611328125, 15.724418640136719, 5.038074493408203, 154.91326904296875, 24.9259033203125, -6.386871337890625, 3.599639892578125, -13.998153686523438, 26.39215087890625, 48.17839050292969, 141.26348876953125, 87.90646362304688, 142.8958282470703, 51.71282196044922, 189.8656005859375, 122.82492065429688, 145.1652374267578, 40.45648193359375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000473.npy"}
{"epoch": 0.9905759162303664, "step": 474, "batch_size": 128, "mean": 41.6748046875, "std": 74.33451080322266, "min": -152.91293334960938, "p10": -36.269120025634756, "median": 31.942689895629883, "p90": 139.57413940429686, "max": 217.35662841796875, "pos_frac": 0.734375, "sample": [-4.8908843994140625, 14.1427001953125, 217.35662841796875, -4.916526794433594, -11.392171859741211, -40.832366943359375, 111.71711730957031, 170.1072998046875, 64.50846099853516, 135.01516723632812, 7.1905364990234375, 45.667724609375, -9.403076171875, -90.191162109375, 6.599491119384766, -120.59291076660156, 1.7432708740234375, 4.826423645019531, 83.48123931884766, 35.74140167236328, 79.32038116455078, 151.1356658935547, 4.092193603515625, -63.66162109375, 127.69869995117188, -31.39760971069336, -79.20574951171875, 27.14263916015625, -17.57672882080078, 43.19329833984375, 119.5914306640625, 46.9361572265625, 2.208038330078125, 119.02035522460938, 25.785232543945312, 95.82373046875, 200.87518310546875, 21.173587799072266, 3.1383438110351562, 165.01171875, 114.69329833984375, 153.43032836914062, 163.94821166992188, 153.0337677001953, 56.735931396484375, 10.667179107666016, 0.0, 8.075027465820312, 21.699737548828125, 9.051399230957031, -1.9310073852539062, -8.872589111328125, -73.76617431640625, 19.338394165039062, 40.53865432739258, 99.49732971191406, 20.708351135253906, -40.77404022216797, 144.38381958007812, 126.43852233886719, -10.370758056640625, -6.867706298828125, -6.9174957275390625, 132.86114501953125, 107.34126281738281, 24.440689086914062, 137.51284790039062, 76.85787963867188, -34.33843994140625, 8.971771240234375, 96.12881469726562, 32.214847564697266, -114.4300765991211, 44.17890548706055, 57.6051025390625, -25.90838623046875, 159.6298828125, 112.32620239257812, 80.85484313964844, -7.206390380859375, -24.928436279296875, 18.205474853515625, -0.06473731994628906, -0.8213672637939453, 0.8098945617675781, -149.3873291015625, -91.92842864990234, 96.08941650390625, 
-134.1180877685547, 130.8172607421875, 74.74964904785156, -152.91293334960938, -34.0406494140625, -3.842041015625, 103.55494689941406, 7.568614959716797, 40.59788513183594, -12.6468505859375, 31.6705322265625, 104.5242919921875, 71.248046875, 116.817626953125, 122.2652587890625, 62.967987060546875, 186.382080078125, 83.65220642089844, 129.03219604492188, 20.692108154296875, 48.88525390625, 21.74493408203125, 40.01812744140625, 39.15838623046875, 51.797760009765625, 72.56451416015625, 36.973602294921875, 42.4967041015625, 130.60018920898438, 146.18002319335938, 19.383193969726562, 39.7667236328125, 38.73169708251953, 151.86085510253906, 30.607635498046875, -102.3663330078125, 23.887969970703125, 14.466400146484375, 28.47393798828125, 124.22406005859375], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000474.npy"}
{"epoch": 0.9926701570680628, "step": 475, "batch_size": 128, "mean": 44.418426513671875, "std": 76.19638061523438, "min": -170.10443115234375, "p10": -45.555014801025386, "median": 32.094627380371094, "p90": 139.07565307617188, "max": 219.17190551757812, "pos_frac": 0.7109375, "sample": [169.37881469726562, 138.78530883789062, -44.04395294189453, 135.09310913085938, 114.37728118896484, 28.069305419921875, -170.10443115234375, 187.16653442382812, -59.79730987548828, -47.545196533203125, -8.732254028320312, -109.41070556640625, -66.79419708251953, -44.70207977294922, 14.350341796875, 105.06385803222656, 61.6941032409668, 132.15846252441406, 142.87841796875, -14.73299789428711, 167.44808959960938, -13.020179748535156, 28.609352111816406, 34.12332534790039, 154.15708923339844, 137.95033264160156, 127.41492462158203, -9.665008544921875, 201.54888916015625, -1.626138687133789, 37.95466613769531, 81.39564514160156, -30.36114501953125, 136.72080993652344, 44.12019348144531, 18.5789794921875, -67.3183822631836, 25.402114868164062, 0.5926189422607422, 138.79083251953125, 116.62384033203125, 31.041015625, -112.51793670654297, 90.68601989746094, 83.60694885253906, -16.453948974609375, 102.7750473022461, -12.26336669921875, 131.30377197265625, -26.642410278320312, 13.867298126220703, -68.7715072631836, 131.26727294921875, 131.78903198242188, -90.61602783203125, 139.740234375, 161.43267822265625, -5.208391189575195, 23.188419342041016, 52.9254150390625, 133.99790954589844, 39.66120910644531, 7.066734313964844, -22.926025390625, 30.344696044921875, 82.91044616699219, 21.194229125976562, 43.53118896484375, 11.932723999023438, 27.105712890625, 15.04461669921875, 143.53701782226562, 66.60275268554688, -16.33349609375, 111.35784912109375, -35.1749267578125, 0.7676448822021484, 219.17190551757812, 36.26007080078125, -3.19708251953125, 51.38444519042969, 171.27059936523438, 137.21240234375, -31.842132568359375, 128.68820190429688, -21.327861785888672, 56.95235061645508, 
10.71575927734375, 50.44757080078125, 42.862815856933594, 1.2225532531738281, 32.39262390136719, 119.27789306640625, 115.78543090820312, -1.828826904296875, 137.24221801757812, 124.43760681152344, 143.31878662109375, 35.46018981933594, 58.1661376953125, 12.155265808105469, -80.80691528320312, 32.97698974609375, 0.2292022705078125, -6.506622314453125, -32.0455322265625, -76.81329345703125, -14.705352783203125, 121.86286926269531, 82.46046447753906, 83.18257141113281, 94.67864990234375, -0.9535865783691406, 13.76171875, 125.41166687011719, -32.8165283203125, 43.09184265136719, -50.045433044433594, -100.647216796875, 17.772247314453125, 124.00991821289062, 31.796630859375, 18.898910522460938, 173.578857421875, 64.26962280273438, 10.4869384765625, 0.06566619873046875, 1.772491455078125], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000475.npy"}
{"epoch": 0.9947643979057592, "step": 476, "batch_size": 128, "mean": 52.29417419433594, "std": 77.45762634277344, "min": -145.79617309570312, "p10": -30.460891723632812, "median": 32.20880126953125, "p90": 156.24412536621094, "max": 211.18389892578125, "pos_frac": 0.734375, "sample": [-70.36436462402344, -104.170654296875, -4.4864349365234375, -36.99456787109375, 7.797119140625, 123.33189392089844, -5.6251068115234375, -8.62476921081543, 18.184295654296875, 75.22959899902344, 115.27543640136719, 64.1590576171875, 142.80001831054688, 170.26025390625, 111.1333999633789, 183.937255859375, -30.374420166015625, 159.13291931152344, 127.36767578125, 2.7456893920898438, -6.263359069824219, 36.869232177734375, 14.307525634765625, 28.485870361328125, 111.24649047851562, -17.97808074951172, 143.5404052734375, -7.477983474731445, -145.79617309570312, 167.59556579589844, 13.91192626953125, 38.28509521484375, -34.19920349121094, 164.57400512695312, 136.12966918945312, 15.104888916015625, 156.05227661132812, -47.23451232910156, 27.45135498046875, -12.46868896484375, 0.9571533203125, -124.50408935546875, 113.65042114257812, 18.829322814941406, 108.65069580078125, 28.833740234375, 119.86722564697266, 169.69866943359375, 19.991477966308594, -73.266357421875, 94.44236755371094, 151.1619873046875, 13.751007080078125, 13.719276428222656, -0.6021862030029297, 174.43618774414062, 10.20989990234375, -16.4998779296875, 151.03733825683594, 28.0665283203125, 125.301513671875, 90.78157806396484, -7.43707275390625, 152.69903564453125, 18.13800048828125, 166.06423950195312, 156.9871826171875, 13.2259521484375, 121.74166870117188, 60.8310546875, 29.48187255859375, -24.539901733398438, 190.10987854003906, 39.336761474609375, 44.455055236816406, 2.0032482147216797, 123.18639373779297, -109.16370391845703, 135.461669921875, 45.07952880859375, 129.32814025878906, 121.14649963378906, 117.35319519042969, 3.7181396484375, 8.745956420898438, 5.855499267578125, -3.8857345581054688, 118.28880310058594, 
-140.8979949951172, 98.3187484741211, -13.153533935546875, 38.31298828125, 156.6917724609375, 3.4375534057617188, 95.4315185546875, -9.524337768554688, -17.83197021484375, 77.93915557861328, 3.43280029296875, 122.69619750976562, -37.76556396484375, 34.93572998046875, -30.66265869140625, 211.18389892578125, 18.07257843017578, 18.588333129882812, -8.07659912109375, 21.31684112548828, 86.78555297851562, 96.83868408203125, 52.7972412109375, 127.27566528320312, 136.148193359375, 15.17877197265625, 6.355743408203125, -17.258148193359375, 111.1246337890625, -16.766372680664062, -4.881195068359375, 90.46908569335938, 147.94821166992188, 116.6417465209961, -73.414306640625, 151.5646209716797, -21.370681762695312, 157.98446655273438, 113.52908325195312, 74.68180084228516], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000476.npy"}
{"epoch": 0.9968586387434555, "step": 477, "batch_size": 128, "mean": 56.48642349243164, "std": 82.9750747680664, "min": -168.04161071777344, "p10": -42.37093734741211, "median": 53.839111328125, "p90": 157.70826110839843, "max": 207.5076904296875, "pos_frac": 0.7109375, "sample": [-63.698516845703125, 58.00665283203125, 126.50166320800781, -93.93255615234375, 83.98883056640625, 36.29002380371094, -76.94268798828125, 128.6536102294922, -27.048049926757812, 81.89619445800781, -8.333024978637695, 154.51513671875, 14.4949951171875, 119.290283203125, 27.605377197265625, -13.34210205078125, 182.80401611328125, 86.90341186523438, -168.04161071777344, -17.091064453125, 3.4150390625, -34.967132568359375, 75.42288208007812, 156.07980346679688, 165.34751892089844, 161.50799560546875, 117.34168243408203, -9.070871353149414, 19.9677734375, 30.548614501953125, 129.6186981201172, 148.631103515625, 26.7486572265625, 190.8463134765625, 112.5130615234375, 37.90875244140625, 182.5191650390625, 38.997642517089844, -14.919235229492188, 0.0, 16.18878173828125, 130.62986755371094, -1.5326385498046875, -28.608688354492188, 111.6514663696289, -15.16607666015625, 14.520477294921875, 130.348388671875, 140.21636962890625, -74.13064575195312, -25.84516143798828, 7.8931121826171875, -25.804061889648438, 143.34426879882812, 26.757843017578125, 14.30303955078125, 130.94442749023438, 33.18815612792969, 136.70953369140625, 99.93550109863281, 123.70831298828125, 150.83389282226562, 207.5076904296875, -9.9720458984375, 103.1417236328125, 108.89643096923828, 169.49710083007812, -88.0882568359375, 66.076171875, -59.3914794921875, 99.90778350830078, 140.1844482421875, 98.93254089355469, 126.67672729492188, 87.94696044921875, 163.2406005859375, -41.52344512939453, 16.752166748046875, 1.4650421142578125, 134.1964111328125, -23.2745361328125, 147.2613067626953, 74.3948974609375, -9.427284240722656, -33.39576721191406, -140.6729736328125, 141.05709838867188, 105.3853759765625, 69.10231018066406, 
105.45730590820312, -2.9838666915893555, 19.946029663085938, -12.50351333618164, 177.039794921875, -11.135986328125, 132.19737243652344, -102.25089263916016, 109.15940856933594, 39.559356689453125, 139.32980346679688, -73.91073608398438, 183.72491455078125, -35.1947021484375, -135.75851440429688, 17.3389892578125, -17.915008544921875, -44.348419189453125, -4.86871337890625, 172.7352294921875, 66.27911376953125, 39.488494873046875, 199.041259765625, 147.14852905273438, 163.17660522460938, 4.0094146728515625, 143.3934326171875, 140.72598266601562, -46.388648986816406, 148.1949005126953, 13.414947509765625, 55.16961669921875, 118.76968383789062, 129.00961303710938, 28.21045684814453, 3.04180908203125, 91.98428344726562, 52.50860595703125, 10.525100708007812], "npy": "/scratch/feng.yulu/dynamic-dpo-v4/outputs/qwen3-8b-base-margin-dpo-ultrafeedback-4xh200-batch-128-20260423-040315/margin_logs/step_0000477.npy"}
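Each record in this log carries summary fields (`mean`, `min`, `max`, `pos_frac`) alongside a `sample` list of per-example margins. The following is a minimal sketch of how one such record could be re-summarized from its `sample` list. The function name `summarize_margin_record` and the synthetic input are hypothetical, and treating only strictly positive margins as "positive" is an assumption about how `pos_frac` was computed. A record that is wrapped across two physical lines in this view would need to be rejoined before parsing.

```python
import json
import statistics


def summarize_margin_record(line: str) -> dict:
    """Recompute simple statistics from the "sample" list of one log record."""
    record = json.loads(line)
    sample = record["sample"]
    return {
        "step": record["step"],
        "n": len(sample),
        "mean": statistics.fmean(sample),
        "min": min(sample),
        "max": max(sample),
        # Assumption: a margin counts as positive only if it is > 0.
        "pos_frac": sum(1 for m in sample if m > 0) / len(sample),
    }


# Synthetic record for illustration only; it is not taken from the log.
example = '{"step": 1, "batch_size": 4, "sample": [1.0, -2.0, 3.0, 0.0]}'
summary = summarize_margin_record(example)
print(summary)
```

Comparing the recomputed values against the logged `mean` and `pos_frac` fields is one way to check whether `sample` holds the full batch or only a subset.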
3
margin_logs/step_0000001.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2b4e5bed25bd4783993e7d8bed8c0dcb604850687b6133068c9ebbcfe9593fce
size 640
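The `.npy` files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, the object id, and the byte size. Below is a minimal sketch of reading such a pointer, using the `step_0000001.npy` pointer shown above as input. The helper name `parse_lfs_pointer` is hypothetical and the parser handles only the three keys seen here.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:2b4e5bed25bd4783993e7d8bed8c0dcb604850687b6133068c9ebbcfe9593fce
size 640
"""
pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algo"], pointer["size"])
```

A size of 640 bytes would be consistent with 128 float32 values (512 bytes) plus a 128-byte `.npy` header, though the pointer alone does not confirm the array's dtype or shape.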
3
margin_logs/step_0000002.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:923d8d34ebce2d992a81fde80eff4a6f422a21f64bc55e9cf132991c17436c41
size 640
3
margin_logs/step_0000003.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:357c891257f5a8820daf07e6c0cf11d78b4f89f808c553e6b3d560bb6a450775
size 640
3
margin_logs/step_0000004.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:10a34d5d352b034acf5fa1f20844d29e22588e4b4aeccb6695ff9400fa196151
size 640
3
margin_logs/step_0000005.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7a14a7bf1543af41a5838e5b820ef0b50938671a86c21ca5b4756bb24cf5c74b
size 640
3
margin_logs/step_0000006.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7057e3e3697e0c29b3c79838a5c4cac5e27a277e9696a698f6993287d0299f64
size 640
3
margin_logs/step_0000007.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:382a6a58aa431474b300d34a178e3fe9f2869e30e4317c1f9024452019ab33b9
size 640
3
margin_logs/step_0000008.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e067d0ad9c23858f1911bb2b9393663b94b21d0f94213e501550708a1b0cdc53
size 640
3
margin_logs/step_0000009.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9365e72876a6ce359754652749ffbb2e2042b6fba7078248302e55a6d38596be
size 640
3
margin_logs/step_0000010.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:31ad9a5eb0fc94091246f61502a165584b671f287b8f1eea55b82757da45b9cd
size 640
3
margin_logs/step_0000011.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4c58b4e9071d5eaa5da45f1ee75597bfc4340bf996015bac552aba19ba082c9b
size 640
3
margin_logs/step_0000012.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8a08a856b46685b99f8407b828cc5450ec590271355803a84f20fe0e002181b5
size 640
3
margin_logs/step_0000013.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1045b604fe684bcfb158e46ae1daaeff929c8b0dc5fe23b54d80872ee86eeb07
size 640
3
margin_logs/step_0000014.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cacd386af430aca5eeb694a524c36132b9eaadbf23466654fe6439ebeebb52ad
size 640
3
margin_logs/step_0000015.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2ee02e9a037af2a721326eb5e3f3ab818bb0aecfee13957ba0dce9974d1f1955
size 640
3
margin_logs/step_0000016.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fd165d199bcdc35a1b87adad45aaaee04daa20a16005b76989316664363b50d6
size 640
3
margin_logs/step_0000017.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:20a478a8c14561d610c6db19181e18ac470995bf0cddecb72767af3b448068b5
size 640
3
margin_logs/step_0000018.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9f464aae0a9305e1a738e4daa348e4cc3eb30da4c4563c0ccc47fd208b1b420d
size 640
3
margin_logs/step_0000019.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:414b13218f3ed878816ee4bc5f5d2aa78a0fc3e37a7320ef2de53cd63c237660
size 640
3
margin_logs/step_0000020.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9a7c73eca82974f6b9649a51ef7a29b4fcab83dcc9206f2cffe52c8f48012f79
size 640
3
margin_logs/step_0000021.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f06317af1496b4936955599105c6a15b4c5d4a726b3b81bff3d9a1cce353d107
size 640
3
margin_logs/step_0000022.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c6cd610ffa283b932c26c83785a4cb0738de0a4f8a7be5630e80f8cfa7ca648b
size 640
3
margin_logs/step_0000023.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3ccd87bc1bff94312e36a506b360119938367ee2bfaab571a4408d5983b0a504
size 640
3
margin_logs/step_0000024.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fe06fcc850cb43908ca7bce7a454f5a4f92ddbc99087f1bf42ffad47e16f18b9
size 640
3
margin_logs/step_0000025.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d7ad984d80012fa17db8be8156de183f498ee9e93ba93534844fe2c252c9173d
size 640
3
margin_logs/step_0000026.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:95263dbd87570878ad0c4da85a80d69718210b5ccfd8a896f86dc59329893478
size 640
3
margin_logs/step_0000027.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5c1085223187372d23a2a285381d8b634b93b0fabada214270ce07704d568597
size 640
3
margin_logs/step_0000028.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fadb8b7166183d52272de4f97534467db14a7e7ba8a7cd8df8715fab0f48c840
size 640
3
margin_logs/step_0000029.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:44f51044b6a3b16d1bde347502cb8ad302eb9bfadb11bc99e5b7e555d5e54ccf
size 640
3
margin_logs/step_0000030.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:29ca0c72d325d6acf4fdd4f4439b1af1de88992ec2a13582c7f4b0202a78a051
size 640
3
margin_logs/step_0000031.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:70983727d1041a234affbff96d9ac40a8673064b4dd44c5ca37b410f57a4e265
size 640
3
margin_logs/step_0000032.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0d5a8b6dce7a138c31173ef01ff9edc0fdb88b5f314e18a2b280f783b5f6d926
size 640
3
margin_logs/step_0000033.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d643630cda9871b891ab03e000ab66b1b2634c68ec65d2826c2d945d1170dba9
size 640
3
margin_logs/step_0000034.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a997b2ae404eb1ea53e5e8cf912512bcd61e0344023afb08175cb7917d46f214
size 640
3
margin_logs/step_0000035.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0f45c838b8b9c79992e2a7027fa54087cdccd6b95085610af8e96949a565c0c5
size 640
3
margin_logs/step_0000036.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1e94237433ae4a21b480415d0f4e63919729055324f818da762d11daa555f6fd
size 640
3
margin_logs/step_0000037.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:13066931fbabba9dd98d2a56ae5e017f483b88366051b87a2f3f9a19abe30953
size 640
3
margin_logs/step_0000038.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ba0d1755af03910f514f9615044b3a4c9c4aebe9b19ff748936dea86cff17b81
size 640
3
margin_logs/step_0000039.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8a3a4fd96a13f131de3662447e141b75073ec6921b68a2c67d078aec4249556a
size 640
3
margin_logs/step_0000040.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:636611ea53a18e411482bc65b67416c926d8dec0e58ad8beb0579e827a2ea411
size 640
3
margin_logs/step_0000041.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2a5ca2648f0d195f2891ca5d3059f493a9eb2e564062916ac57a636e65edf517
size 640
3
margin_logs/step_0000042.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5d335dbebcb40943b71f210430a235115f08c6770538273248d38b7fde991a9d
size 640
3
margin_logs/step_0000043.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c9255fb63161ae68e1753735c3586f9971c6ea2f153df0ad12a711b684167bba
size 640
3
margin_logs/step_0000044.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d339edbb2cb99eee016ea72c7b400b34733e60a19dcf3d362cc169a135181494
size 640
3
margin_logs/step_0000045.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2ba881453da6b6269e7c6a779c96196b27f55df7e147f9b197246b63cf7f749a
size 640
3
margin_logs/step_0000046.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:928aa99d6117e5200e5d4f039bee2dd2a04c149efe718c6981290e4cdb9f7049
size 640
3
margin_logs/step_0000047.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:708eb8521908d06fbd444fc7032b8e05b901d1e42bcc76b504b5b1a83cf92a52
size 640
3
margin_logs/step_0000048.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bf9617bdfa8cbdbadf6b22944d54abebb20823d8bbc80e457907205195ffe2d7
size 640
3
margin_logs/step_0000049.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:74c6d9d8cb1fb0293dcde15a155f5323e4415a5fc030ed8e654fc6dfeffc7087
size 640
3
margin_logs/step_0000050.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aa42f73f07bd18e17353a1981aa9c230070b3df5d8f81e36bdd54411e895c458
size 640
3
margin_logs/step_0000051.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:629cc0a5250c2603d4dd20656afef264e00990bd9a66f4da614cc920e7f716ba
size 640
3
margin_logs/step_0000052.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d652ab86c3108c8b2c9fb7257694dad50b6776a97d90da406352d80016bd8d4b
size 640
3
margin_logs/step_0000053.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8235479772fed3738abec57a6e0808822457764ab3e54b068019986cb636d558
size 640
3
margin_logs/step_0000054.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5daf1898addeedae26313de471c79c1cd63996f85f2cb990f7c939b6872a5d3b
size 640
3
margin_logs/step_0000055.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0b3d885afddab3c3622f6a0016fbaab14eb319db0ec3d5e1b3c82cf0a24a5381
size 640
3
margin_logs/step_0000056.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b79a8ed57bd692ccbc664e4df3ffa61bf3548f2d3994f6e8339a1faa5ef44ff2
size 640
3
margin_logs/step_0000057.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5333597b369f5de9905fe0d72fe7fbeaa7750cc0887fc6f48556e8837d954071
size 640
3
margin_logs/step_0000058.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f59ef3a30576349f434419947ff3ac21f4891fec9405124e32bc302aae3a904e
size 640
3
margin_logs/step_0000059.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:389bbf1f1c5dbbc7581b9cba02bca9fe26ea217d0936c789dad7cfe38370e707
size 640
3
margin_logs/step_0000060.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d2ba3b12eb557153acbcb3aa05e2d2d6c802536d3cfe89934ed6bc949c4233af
size 640
3
margin_logs/step_0000061.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c6df401fd370f1d08356f12f2e7f9df86cd9e206201ae9da85920e5d48853856
size 640
3
margin_logs/step_0000062.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a5003b11642a3114e5355f8c3d79a0d61f7e64fbfd2b3ebcd3be9d66e2d14088
size 640
3
margin_logs/step_0000063.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:72d7e18b52d23b683f69807402eb39ff09570a85b9bc3c40e5d9724277f4bfba
size 640
3
margin_logs/step_0000064.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b75bb9af84fdad75a8090cf0cd3a289321845ed84c390a5c987b54c11bedfbdb
size 640
3
margin_logs/step_0000065.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e347144414985bc2a527aa097aad8f22daeacc271c9fffc56b07c21f6542e6ea
size 640
3
margin_logs/step_0000066.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f3970eadc5448bbf70da3456f294bb80a7ccbc8545c30a2ee422b1bcb0b46257
size 640
3
margin_logs/step_0000067.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c67c4c835825011a4909c704fa2453482385e910768cc21f1fece76a6f94e464
size 640
3
margin_logs/step_0000068.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:93cc75b20488447600d1a0d460114ce62cc6c44e53c8f4044c3ebedf24fdd976
size 640
3
margin_logs/step_0000069.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0b34fd76e9210181593f75bfd04e3634ed3c139780730fb98e847bc2a2ba6ce4
size 640
3
margin_logs/step_0000070.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d25654b3d3392160688901c191b417be75cf853712358329c214ef05c645bbc0
size 640
3
margin_logs/step_0000071.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ff5b12a0724dbfd4ab4a69607fdcf73aa7c6b754811c5da0a9a38dc435953ebd
size 640
3
margin_logs/step_0000072.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1cfa27b8f0abecf0a454f195d74b9cea71816b39767bc6baae3c8bf8fce0a458
size 640
3
margin_logs/step_0000073.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2bda0c64b003878aac33e5165960b7de1f268d7229ac55ebe7598b0cbd57291d
size 640
3
margin_logs/step_0000074.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b3a2498035a44bf52c372a19bb158e7228321524e9cf8650e75625508be320de
size 640
3
margin_logs/step_0000075.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ff23f2e3df54489ef18bc4ef883a2821836bc78db37e1caec09763eb7d8b31e3
size 640
3
margin_logs/step_0000076.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:843c82c352af0c53aebb45f293047d7def92dfcae15b191e063b8349a43882f9
size 640
3
margin_logs/step_0000077.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f792ce45ec3be3071e45f0b0fd0c2345902596bbde8e8a8d39a17ba02ccb1183
size 640
3
margin_logs/step_0000078.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:51b437efd3a67fbb7d171cac471b54e646be733c19cbb3b0ce05f7e246a497a8
size 640
3
margin_logs/step_0000079.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fc8c17bfa32e4753190f9c668d8fb10e9bc6e74a92bfe5cafdf37dad8971477a
size 640
3
margin_logs/step_0000080.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fa223d59e6e9d0f68beb63baae81f2571be0a41fcbd1219996fad88956dccb8b
size 640
3
margin_logs/step_0000081.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ddb26d4edc2a098806ed6c46470e38813a004760b49eadc8bafafbfb29d3d3b7
size 640
3
margin_logs/step_0000082.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:32edb3c8da1b21d88e2f58520ddc3f9367c6a09366ff541f3ecaa7b3db20feb2
size 640
3
margin_logs/step_0000083.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ed9b4226e507dc42419879a08579f4f39849e3770ea26485a2d7400cf4be14b0
size 640
3
margin_logs/step_0000084.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b1c548d711bdab08be20c72ef90e87ee08b375d2d7de5dc60f397a49b96698d2
size 640
3
margin_logs/step_0000085.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:002c02b3975ba1f1879703db3ebf8e14903eefbc6b7a66e428cf09182cc102e3
size 640
3
margin_logs/step_0000086.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1d7ea929e119af80bb9ad82ae2fdd667f7c8b5b85d6e059f58086d0755af6926
size 640
3
margin_logs/step_0000087.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:44d7c2474c900edd35d2509fbd33b7258abd5f728eea96f3122bbaa4acac9399
size 640
3
margin_logs/step_0000088.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:28e0fc0f1d2063652e71ce6ddbbe5f175d487713ec9dad266b17a7a34302e5de
size 640
3
margin_logs/step_0000089.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ad2a9c57daa222535fd85c2ac70aa9c97850d6d4ff9b07cc529db635716f03ef
size 640
3
margin_logs/step_0000090.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2b3761b53cfb0f3b337a75964b8cd996dbf1262b115b57c5962c072a72fd3c36
size 640
3
margin_logs/step_0000091.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9f5ad14975ee0df104f52610b70442bce39a2cd3b1d8b0688248d0a35893ac36
size 640
3
margin_logs/step_0000092.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b3b9fe824ada6504ef2124e40f419b42f0eecc6ae630e7b643e17e3e48ccc1e4
size 640
Some files were not shown because too many files have changed in this diff.