Initialize project; model provided by the ModelHub XC community
Model: jackf857/llama-3-8b-base-margin-dpo-hh-harmless-batch-size-64 Source: Original Platform
36
.gitattributes
vendored
Normal file
@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
80
README.md
Normal file
@@ -0,0 +1,80 @@
---
library_name: transformers
base_model: W-61/llama-3-8b-base-sft-hh-harmless-4xh200
tags:
- alignment-handbook
- margin-dpo
- generated_from_trainer
datasets:
- Anthropic/hh-rlhf
model-index:
- name: llama-3-8b-base-margin-dpo-hh-harmless
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# llama-3-8b-base-margin-dpo-hh-harmless

This model is a fine-tuned version of [W-61/llama-3-8b-base-sft-hh-harmless-4xh200](https://huggingface.co/W-61/llama-3-8b-base-sft-hh-harmless-4xh200) on the Anthropic/hh-rlhf dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5259
- Margin Dpo/margin Mean: 9.3649
- Margin Dpo/margin Std: 14.8097
- Logps/chosen: -92.0386
- Logps/rejected: -106.0930
- Logps/ref Chosen: -74.8595
- Logps/ref Rejected: -79.5490
- Logits/chosen: 0.3798
- Logits/rejected: 0.3285
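
The card does not define the margin metric. The reported numbers are, however, consistent with an unscaled DPO-style log-ratio margin: the policy-minus-reference log-probability on the chosen response, minus the same quantity on the rejected response. The sketch below checks that arithmetic against the evaluation values above. The formula is an inference from the numbers, not something the card states, and the function name is hypothetical.

```python
# Evaluation log-probabilities copied from the list above.
logps_chosen = -92.0386
logps_rejected = -106.0930
ref_chosen = -74.8595
ref_rejected = -79.5490


def log_ratio_margin(pi_chosen, pi_rejected, ref_chosen, ref_rejected):
    """Assumed margin: (policy - reference) on chosen minus the same on rejected.

    No beta scaling is applied; that is an assumption inferred from the
    reported values, not taken from the training code.
    """
    return (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)


margin = log_ratio_margin(logps_chosen, logps_rejected, ref_chosen, ref_rejected)
print(f"implied margin mean: {margin:.4f}")
```

Under this assumption the implied value agrees with the reported `Margin Dpo/margin Mean` to within rounding. Because the mean is linear, the check works on the averaged log-probabilities even though the logged margin is presumably computed per example.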

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 32
- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
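
The totals in the list follow from the per-device values, and the scheduler entries describe a warmup followed by cosine decay. The sketch below reproduces the batch-size arithmetic and gives one plausible reading of the schedule: linear warmup to the peak learning rate, then cosine decay to zero. The `lr_at` helper, the rounding of the warmup length with `ceil`, and the decay-to-zero endpoint are assumptions for illustration, not confirmed details of this run.

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 8             # per device
eval_batch_size = 8              # per device
num_devices = 4
gradient_accumulation_steps = 2
learning_rate = 5e-07
warmup_ratio = 0.1

# Effective batch sizes across devices (and accumulation steps for training).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices


def lr_at(step, total_steps, peak_lr=learning_rate, ratio=warmup_ratio):
    """Hypothetical helper: linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch_size, total_eval_batch_size)
```

`total_steps` is left to the caller because the card does not list it. If the run took 661 optimizer steps, as the line count of `margin_logs/margins.jsonl` suggests, this reading gives a 67-step warmup.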

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Margin Dpo/margin Mean | Margin Dpo/margin Std | Logps/chosen | Logps/rejected | Logps/ref Chosen | Logps/ref Rejected | Logits/chosen | Logits/rejected |
|:-------------:|:------:|:----:|:---------------:|:----------------------:|:---------------------:|:------------:|:--------------:|:----------------:|:------------------:|:-------------:|:---------------:|
| 1.3342        | 0.1512 | 100  | 0.6557          | 1.4205                 | 4.9786                | -79.7014     | -85.8115       | -74.8595         | -79.5490           | 0.2556        | 0.2183          |
| 0.9165        | 0.3023 | 200  | 0.5447          | 7.4721                 | 12.5600               | -86.5507     | -98.7123       | -74.8595         | -79.5490           | 0.3345        | 0.2868          |
| 0.9692        | 0.4535 | 300  | 0.5345          | 9.3794                 | 14.9738               | -93.1794     | -107.2484      | -74.8595         | -79.5490           | 0.4017        | 0.3507          |
| 1.084         | 0.6047 | 400  | 0.5337          | 8.8635                 | 14.3566               | -91.2627     | -104.8157      | -74.8595         | -79.5490           | 0.3912        | 0.3394          |
| 1.0037        | 0.7559 | 500  | 0.5277          | 9.5078                 | 15.0672               | -92.1725     | -106.3698      | -74.8595         | -79.5490           | 0.3937        | 0.3419          |
| 1.0459        | 0.9070 | 600  | 0.5259          | 9.3649                 | 14.8097               | -92.0386     | -106.0930      | -74.8595         | -79.5490           | 0.3798        | 0.3285          |


### Framework versions

- Transformers 4.51.0
- Pytorch 2.3.1+cu121
- Datasets 2.21.0
- Tokenizers 0.21.4
9
all_results.json
Normal file
@@ -0,0 +1,9 @@
{
  "epoch": 0.999244142101285,
  "total_flos": 0.0,
  "train_loss": 1.049779979526185,
  "train_runtime": 1908.5591,
  "train_samples": 42336,
  "train_samples_per_second": 22.182,
  "train_steps_per_second": 0.346
}
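The values in `all_results.json` can be cross-checked against each other. The sketch below is a rough consistency check, assuming the effective training batch size of 64 from the README and assuming a trailing partial batch does not count as an optimizer step; both assumptions are mine, not part of the file.

```python
# Values copied from all_results.json.
train_samples = 42336
train_runtime = 1908.5591        # seconds
reported_epoch = 0.999244142101285

# Assumption: effective batch size taken from the README hyperparameters.
total_train_batch_size = 64

# Assumption: only full batches produce an optimizer step.
steps = train_samples // total_train_batch_size

samples_per_second = train_samples / train_runtime
steps_per_second = steps / train_runtime
epoch_fraction = steps * total_train_batch_size / train_samples

print(steps)
print(round(samples_per_second, 3), round(steps_per_second, 3))
print(epoch_fraction)
```

Under these assumptions the step count matches the 661 lines of `margin_logs/margins.jsonl`, and the derived throughput and epoch fraction agree with the reported fields to the precision shown.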
29
config.json
Normal file
@@ -0,0 +1,29 @@
{
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 128000,
  "eos_token_id": 128001,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 8192,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 500000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "float32",
  "transformers_version": "4.51.0",
  "use_cache": true,
  "vocab_size": 128256
}
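The fields in `config.json` are enough to estimate the parameter count. The sketch below assumes the standard Llama decoder layout (grouped-query attention with separate q/k/v/o projections, a gated MLP with three projections, two RMSNorm weights per layer, one final norm, and parameter-free rotary embeddings). The helper name is hypothetical, and the layout is an assumption about the architecture rather than something the config spells out.

```python
# Subset of config.json needed for the estimate.
config = {
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "vocab_size": 128256,
    "tie_word_embeddings": False,
}


def llama_param_count(cfg):
    """Hypothetical helper: count weights for a bias-free Llama-style decoder."""
    h = cfg["hidden_size"]
    q_dim = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]
    attn = h * q_dim + 2 * h * kv_dim + q_dim * h    # q, k, v, o projections
    mlp = 3 * h * cfg["intermediate_size"]           # gate, up, down projections
    norms = 2 * h                                    # two RMSNorm weights per layer
    per_layer = attn + mlp + norms
    embed = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else cfg["vocab_size"] * h
    final_norm = h
    return cfg["num_hidden_layers"] * per_layer + embed + lm_head + final_norm


params = llama_param_count(config)
bytes_fp32 = params * 4  # torch_dtype is float32, i.e. 4 bytes per weight
print(params, bytes_fp32)
```

With `torch_dtype` set to `float32`, the weights alone would occupy roughly four bytes per parameter, which is worth keeping in mind when choosing a device or a lower-precision dtype for loading.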
9
generation_config.json
Normal file
@@ -0,0 +1,9 @@
{
  "bos_token_id": 128000,
  "do_sample": true,
  "eos_token_id": 128001,
  "max_length": 4096,
  "temperature": 0.6,
  "top_p": 0.9,
  "transformers_version": "4.51.0"
}
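`generation_config.json` enables sampling with `temperature` 0.6 and `top_p` 0.9. The sketch below illustrates what those two settings do to a next-token distribution, using a made-up four-token logit vector. It is a simplified pure-Python illustration with a hypothetical function name, not the implementation used by the Transformers library, which may treat the cutoff boundary differently.

```python
import math


def nucleus_probs(logits, temperature=0.6, top_p=0.9):
    """Temperature-scale logits, then keep the smallest top set with mass >= top_p."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Walk tokens from most to least likely until the cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = set()
    cumulative = 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize over the kept tokens; everything else gets zero probability.
    mass = sum(probs[i] for i in kept)
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]


# Made-up logits for illustration only.
example = nucleus_probs([2.0, 1.0, 0.0, -1.0])
print(example)
```

A temperature below 1 sharpens the distribution before the cutoff is applied, so fewer tokens survive the `top_p` filter than would at temperature 1.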
661
margin_logs/margins.jsonl
Normal file
@@ -0,0 +1,661 @@
{"epoch": 0.0, "step": 1, "batch_size": 64, "mean": -0.0013527870178222656, "std": 0.2564818859100342, "min": -0.736083984375, "p10": -0.3432229995727539, "median": 0.038166046142578125, "p90": 0.29227676391601565, "max": 0.645111083984375, "pos_frac": 0.578125, "sample": [0.1120758056640625, 0.12518310546875, 0.31621551513671875, 0.13765716552734375, -0.12592506408691406, 0.23141098022460938, -0.21887779235839844, 0.21950721740722656, 0.04480743408203125, 0.020877838134765625, 0.0570220947265625, 0.058269500732421875, -0.4338226318359375, -0.030628204345703125, 0.645111083984375, -0.395477294921875, 0.09050941467285156, 0.0007190704345703125, -0.34615325927734375, 0.016077041625976562, -0.33638572692871094, 0.293853759765625, 0.17610931396484375, 0.22386932373046875, 0.21470260620117188, -0.08536529541015625, 0.0907745361328125, -0.03816986083984375, 0.39190101623535156, 0.16336441040039062, 0.08024787902832031, -0.031158447265625, 0.08477020263671875, 0.002460479736328125, -0.242034912109375, 0.07232666015625, -0.60186767578125, 0.20531463623046875, 0.155731201171875, -0.14299774169921875, -0.25698089599609375, 0.12331962585449219, -0.26497650146484375, 0.15140533447265625, -0.0920257568359375, -0.18599319458007812, 0.19028091430664062, 0.2496490478515625, 0.42162322998046875, 0.17873382568359375, -0.1525421142578125, -0.4972076416015625, 0.32010650634765625, -0.10365867614746094, -0.233795166015625, -0.19828224182128906, -0.4018898010253906, -0.13407135009765625, -0.09596633911132812, 0.031524658203125, 0.28859710693359375, -0.192962646484375, -0.736083984375, 0.3026123046875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000001.npy"}
{"epoch": 0.0015117157974300832, "step": 2, "batch_size": 64, "mean": 0.03744968771934509, "std": 0.2875921130180359, "min": -0.7604827880859375, "p10": -0.2812448501586914, "median": 0.03963661193847656, "p90": 0.3654294967651367, "max": 0.8134727478027344, "pos_frac": 0.5625, "sample": [0.30594635009765625, -0.24289894104003906, -0.11509323120117188, -0.13417816162109375, 0.06942558288574219, 0.36568641662597656, -0.14640045166015625, 0.1497650146484375, 0.30261993408203125, 0.10124588012695312, 0.13028717041015625, -0.0031890869140625, 0.0361480712890625, 0.5662612915039062, 0.09694290161132812, -0.01091766357421875, 0.1128997802734375, 0.0411834716796875, -0.21860504150390625, -0.1236419677734375, -0.08812713623046875, 0.10360527038574219, 0.1790008544921875, -0.5114288330078125, 0.3056755065917969, -0.14553451538085938, 0.28168487548828125, 0.26990509033203125, 0.1686878204345703, 0.038089752197265625, 0.19541168212890625, -0.10783576965332031, -0.2644004821777344, -0.19707489013671875, -0.140472412109375, 0.1349811553955078, 0.19672012329101562, -0.0714111328125, 0.53369140625, 0.1271820068359375, 0.8134727478027344, 0.2990264892578125, -0.7604827880859375, -0.08274078369140625, 0.05890846252441406, 0.029361724853515625, 0.4510040283203125, -0.1599273681640625, -0.29346656799316406, 0.10005569458007812, -0.27509117126464844, -0.1937713623046875, 0.19167327880859375, 0.28173065185546875, -0.09406471252441406, -0.3380699157714844, -0.29186248779296875, 0.36483001708984375, 0.009979248046875, 0.44391632080078125, -0.126708984375, -0.6550216674804688, 0.6160736083984375, -0.28388214111328125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000002.npy"}
{"epoch": 0.0030234315948601664, "step": 3, "batch_size": 64, "mean": -0.02046513557434082, "std": 0.294729620218277, "min": -0.6945343017578125, "p10": -0.3885284423828125, "median": -0.03394126892089844, "p90": 0.3117528915405273, "max": 0.555877685546875, "pos_frac": 0.4375, "sample": [0.07639312744140625, -0.1809234619140625, -0.0682525634765625, -0.5877304077148438, -0.0416107177734375, 0.3059844970703125, 0.05162811279296875, 0.0534515380859375, 0.555877685546875, -0.19660186767578125, -0.30046844482421875, -0.289520263671875, -0.2647247314453125, 0.19112014770507812, -0.37471771240234375, 0.25583648681640625, -0.1798248291015625, 0.5108871459960938, 0.18897247314453125, -0.4187164306640625, -0.14422607421875, -0.10743904113769531, -0.07501220703125, 0.03575897216796875, -0.6945343017578125, 0.47162437438964844, 0.30352020263671875, -0.006805419921875, -0.027957916259765625, 0.13306427001953125, 0.055606842041015625, -0.18733596801757812, -0.11013031005859375, -0.16483306884765625, -0.079437255859375, -0.002017974853515625, -0.4828643798828125, -0.6525115966796875, -0.10840797424316406, 0.5537261962890625, 0.49761199951171875, 0.16332244873046875, 0.03350830078125, 0.5349884033203125, 0.19634246826171875, 0.16209030151367188, -0.03992462158203125, 0.17009353637695312, 0.22017669677734375, -0.2071380615234375, -0.594635009765625, 0.19889068603515625, -0.04132080078125, 0.31229591369628906, 0.07370376586914062, 0.31048583984375, -0.01373291015625, -0.39444732666015625, -0.18184661865234375, -0.351104736328125, -0.14605712890625, 0.20172119140625, -0.34865570068359375, -0.06298446655273438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000003.npy"}
{"epoch": 0.0045351473922902496, "step": 4, "batch_size": 64, "mean": 0.005287140607833862, "std": 0.3103655278682709, "min": -0.7440032958984375, "p10": -0.35634613037109375, "median": -0.002872467041015625, "p90": 0.3888633728027345, "max": 0.8995361328125, "pos_frac": 0.484375, "sample": [-0.3033294677734375, 0.39763641357421875, -0.35538482666015625, -0.18173980712890625, -0.04018402099609375, -0.3078727722167969, 0.10132789611816406, 0.4627971649169922, 0.171600341796875, 0.1356048583984375, -0.3710517883300781, 0.0204010009765625, -0.040618896484375, 0.19652557373046875, -0.54290771484375, -0.399932861328125, 0.3592338562011719, -0.0995025634765625, 0.3329315185546875, -0.5560226440429688, -0.2303009033203125, 0.12933731079101562, 0.1298828125, 0.4094390869140625, -0.12431716918945312, -0.10562896728515625, 0.05596923828125, 0.25368499755859375, -0.20920562744140625, 0.01776123046875, 0.8995361328125, -0.00176239013671875, 0.19585418701171875, -0.02559661865234375, -0.5551528930664062, 0.3683929443359375, 0.809967041015625, 0.050136566162109375, 0.134124755859375, 0.15836334228515625, -0.2842445373535156, -0.1509265899658203, -0.03857421875, -0.155609130859375, -0.35675811767578125, -0.02264404296875, -0.28681373596191406, 0.21956825256347656, -0.7440032958984375, 0.013805389404296875, -0.04993438720703125, -0.043155670166015625, 0.08580398559570312, 0.047153472900390625, 0.564117431640625, 0.035247802734375, -0.2762298583984375, 0.4529876708984375, 0.30234527587890625, -0.31472015380859375, -0.0039825439453125, -0.145263671875, 0.20026016235351562, -0.050048828125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000004.npy"}
{"epoch": 0.006046863189720333, "step": 5, "batch_size": 64, "mean": -0.06676921248435974, "std": 0.3281959891319275, "min": -1.235107421875, "p10": -0.39916000366210935, "median": -0.05465221405029297, "p90": 0.321966552734375, "max": 0.9239654541015625, "pos_frac": 0.390625, "sample": [-0.28993988037109375, -0.4829254150390625, 0.019628524780273438, 0.332977294921875, -0.00994873046875, -0.8381500244140625, -0.13919830322265625, 0.01725006103515625, 0.328216552734375, 0.228790283203125, 0.19123077392578125, 0.1629638671875, -0.0590972900390625, -0.2345752716064453, -0.378326416015625, -0.08934783935546875, 0.34867095947265625, -0.21573638916015625, -0.30730438232421875, -0.3112220764160156, -0.016618728637695312, 0.12637710571289062, 0.01508331298828125, -0.4885673522949219, -1.235107421875, -0.347686767578125, -0.431793212890625, -0.03658103942871094, -0.214874267578125, -0.017711639404296875, -0.0788421630859375, -0.11042022705078125, -0.32904052734375, -0.020738601684570312, 0.4633941650390625, -0.2723197937011719, -0.2106781005859375, -0.24764251708984375, 0.03985404968261719, -0.06413459777832031, 0.15123748779296875, -0.53289794921875, -0.15620040893554688, 0.32183837890625, 0.0825958251953125, -0.1245880126953125, 0.2178802490234375, 0.12408447265625, 0.29381561279296875, -0.3016204833984375, 0.175506591796875, 0.24581146240234375, 0.47000885009765625, 0.9239654541015625, 0.322021484375, -0.40808868408203125, -0.2762489318847656, -0.19301986694335938, 0.1915874481201172, -0.05020713806152344, -0.03673362731933594, -0.34600830078125, -0.35755348205566406, 0.19367599487304688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000005.npy"}
{"epoch": 0.007558578987150416, "step": 6, "batch_size": 64, "mean": -0.09001976251602173, "std": 0.35035836696624756, "min": -0.938812255859375, "p10": -0.3727842330932617, "median": -0.0887460708618164, "p90": 0.36508483886718757, "max": 0.7586517333984375, "pos_frac": 0.34375, "sample": [-0.18019485473632812, 0.11639976501464844, 0.073089599609375, 0.071075439453125, -0.18075942993164062, -0.21762847900390625, 0.1571979522705078, 0.12486648559570312, -0.1494140625, -0.15071868896484375, -0.09056854248046875, -0.292755126953125, -0.9306907653808594, -0.36844635009765625, 0.183380126953125, -0.8502960205078125, -0.08548736572265625, 0.7586517333984375, 0.3733978271484375, -0.5860538482666016, -0.07471466064453125, 0.01325225830078125, -0.05647468566894531, 0.3930530548095703, -0.36115264892578125, -0.2994842529296875, 0.08062744140625, -0.053497314453125, -0.3455009460449219, 0.28318023681640625, -0.2001018524169922, 0.20746994018554688, 0.471099853515625, -0.00615692138671875, -0.34820556640625, -0.01909637451171875, -0.15827560424804688, 0.11688232421875, -0.3862800598144531, -0.0852203369140625, -0.25727081298828125, -0.023244857788085938, -0.2725944519042969, 0.07135391235351562, -0.20151519775390625, -0.21138381958007812, 0.637451171875, 0.6780776977539062, -0.37464332580566406, 0.1361103057861328, -0.187225341796875, 0.5601806640625, -0.3678760528564453, -0.3294849395751953, -0.19229888916015625, -0.3489646911621094, -0.07196426391601562, -0.148406982421875, -0.938812255859375, -0.930419921875, -0.3116607666015625, 0.11811447143554688, 0.3456878662109375, -0.08692359924316406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000006.npy"}
{"epoch": 0.009070294784580499, "step": 7, "batch_size": 64, "mean": 0.014449506998062134, "std": 0.22572197020053864, "min": -0.5367050170898438, "p10": -0.291986083984375, "median": 0.07516098022460938, "p90": 0.2760061264038086, "max": 0.6507568359375, "pos_frac": 0.578125, "sample": [0.10291671752929688, -0.1425628662109375, 0.155609130859375, 0.11102294921875, 0.12586402893066406, -0.021261215209960938, 0.08564186096191406, 0.146514892578125, -0.5367050170898438, 0.159881591796875, 0.21593475341796875, 0.28275108337402344, 0.26506996154785156, -0.006732940673828125, 0.1773204803466797, -0.0944671630859375, -0.23670196533203125, -0.18444061279296875, 0.11432266235351562, 0.066802978515625, 0.1476593017578125, 0.0367889404296875, -0.18735504150390625, 0.3713226318359375, -0.30072021484375, 0.10528945922851562, -0.128204345703125, -0.2994842529296875, -0.2744903564453125, 0.13846969604492188, -0.091339111328125, 0.22035980224609375, -0.12504005432128906, 0.1094818115234375, 0.05912017822265625, 0.28069305419921875, -0.14696884155273438, 0.6507568359375, 0.34813690185546875, 0.025526046752929688, 0.1174163818359375, 0.014604568481445312, 0.22087478637695312, 0.119537353515625, -0.3606719970703125, -0.18450164794921875, -0.09222793579101562, -0.16179275512695312, 0.15790557861328125, 0.17901611328125, -0.1760692596435547, -0.4979667663574219, 0.28907203674316406, -0.15240478515625, 0.12224578857421875, 0.10079193115234375, -0.1716766357421875, -0.3705291748046875, -0.3364410400390625, 0.1885223388671875, 0.08351898193359375, -0.204925537109375, 0.39533233642578125, -0.08164596557617188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000007.npy"}
{"epoch": 0.010582010582010581, "step": 8, "batch_size": 64, "mean": -0.010671883821487427, "std": 0.36916667222976685, "min": -0.8608436584472656, "p10": -0.4003923416137695, "median": -0.08148956298828125, "p90": 0.42314376831054706, "max": 1.3581695556640625, "pos_frac": 0.40625, "sample": [0.16869544982910156, 1.3581695556640625, 0.85198974609375, 0.0554962158203125, -0.43451690673828125, -0.11055755615234375, 0.13161468505859375, 0.44545745849609375, -0.16187667846679688, 0.00408935546875, -0.172637939453125, -0.1851043701171875, -0.3110332489013672, -0.093414306640625, 0.6110992431640625, 0.8843917846679688, 0.0697021484375, 0.14108848571777344, -0.17863845825195312, -0.4050140380859375, -0.08934783935546875, -0.2401580810546875, -0.5929946899414062, -0.018341064453125, 0.10056114196777344, -0.36682891845703125, -0.04411888122558594, -0.2443695068359375, 0.19778060913085938, -0.1305103302001953, 0.1105804443359375, -0.07363128662109375, -0.048091888427734375, -0.20296478271484375, 0.28173065185546875, 0.27020263671875, 0.1864013671875, 0.10556793212890625, 0.4572906494140625, -0.09233856201171875, 0.4810638427734375, -0.025350570678710938, -0.14833450317382812, -0.070953369140625, -0.14307022094726562, -0.8608436584472656, 0.33876609802246094, -0.148834228515625, -0.47670745849609375, -0.6622772216796875, 0.14965438842773438, -0.09239006042480469, -0.2613563537597656, -0.10903167724609375, -0.54656982421875, 0.13031387329101562, -0.38960838317871094, -0.3614654541015625, 0.22858810424804688, 0.3332557678222656, -0.2142200469970703, 0.3710784912109375, -0.2696685791015625, -0.17045974731445312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000008.npy"}
{"epoch": 0.012093726379440665, "step": 9, "batch_size": 64, "mean": 0.0005104541778564453, "std": 0.29696235060691833, "min": -0.7371330261230469, "p10": -0.3260459899902343, "median": 0.008625030517578125, "p90": 0.31724700927734373, "max": 1.2818756103515625, "pos_frac": 0.515625, "sample": [-0.097320556640625, 0.34482574462890625, 0.17206954956054688, -0.25433349609375, -0.12911415100097656, -0.3480072021484375, -0.27480316162109375, -0.19367599487304688, -0.237091064453125, -0.086578369140625, -0.6039810180664062, 0.16962432861328125, 0.36992645263671875, 0.10158157348632812, -0.399261474609375, -0.38488006591796875, -0.069580078125, 0.0072784423828125, -0.13819503784179688, 0.3276863098144531, 0.3262176513671875, -0.015781402587890625, 1.2818756103515625, 0.05810546875, -0.07764625549316406, -0.233001708984375, 0.07639122009277344, 0.2080230712890625, 0.17031478881835938, -0.7371330261230469, -0.24798583984375, -0.38297271728515625, 0.1341094970703125, 0.3173065185546875, -0.23701095581054688, -0.18337631225585938, 0.23676300048828125, -0.1496734619140625, -0.1689453125, 0.11551666259765625, 0.107269287109375, -0.20324325561523438, -0.06011962890625, 0.29132080078125, 0.07648468017578125, 0.1983795166015625, 0.054988861083984375, 0.38964271545410156, 0.1826343536376953, -0.2723884582519531, 0.2503547668457031, 0.0153656005859375, 0.06991958618164062, 0.0496826171875, 0.2364482879638672, 0.317108154296875, 0.16937828063964844, -0.0327606201171875, -0.0155487060546875, 0.2050495147705078, -0.126678466796875, -0.6251068115234375, 0.00997161865234375, -0.0227508544921875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000009.npy"}
{"epoch": 0.013605442176870748, "step": 10, "batch_size": 64, "mean": 0.040148526430130005, "std": 0.3693771958351135, "min": -0.6218833923339844, "p10": -0.42201690673828124, "median": 0.0010223388671875, "p90": 0.5712539672851569, "max": 1.22137451171875, "pos_frac": 0.5, "sample": [-0.084075927734375, -0.18448257446289062, 0.20983123779296875, -0.019590377807617188, -0.3976783752441406, -0.4228363037109375, -0.2293109893798828, -0.164459228515625, 0.037322998046875, -0.2749786376953125, -0.1135711669921875, 0.2033233642578125, 0.6335372924804688, 0.333770751953125, -0.42010498046875, -0.033466339111328125, 0.01702880859375, -0.202606201171875, 0.692291259765625, 0.06432342529296875, -0.00604248046875, 0.1515827178955078, 0.326751708984375, 0.951751708984375, -0.20140647888183594, -0.15668487548828125, -0.04709625244140625, 0.018310546875, -0.5177688598632812, -0.06505966186523438, 0.008087158203125, 0.095855712890625, 0.10023880004882812, 0.42592620849609375, 0.3271522521972656, 0.742523193359375, -0.6218833923339844, 0.34941673278808594, -0.019725799560546875, 0.23211669921875, -0.1423187255859375, -0.46007347106933594, -0.518096923828125, 0.03583526611328125, -0.257537841796875, -0.023677825927734375, -0.44002532958984375, -0.22830581665039062, 1.22137451171875, -0.2953338623046875, -0.5164642333984375, -0.057098388671875, 0.07172393798828125, -0.02344512939453125, 0.14960098266601562, 0.23046875, 0.8401336669921875, 0.06606292724609375, 0.04778289794921875, 0.13253021240234375, 0.34689903259277344, 0.209442138671875, 0.6966896057128906, -0.254974365234375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000010.npy"}
{"epoch": 0.015117157974300832, "step": 11, "batch_size": 64, "mean": 0.02575582265853882, "std": 0.311528742313385, "min": -0.7578048706054688, "p10": -0.31547698974609373, "median": 0.025903701782226562, "p90": 0.41491298675537114, "max": 0.78826904296875, "pos_frac": 0.578125, "sample": [0.01950836181640625, 0.025838851928710938, 0.013629913330078125, 0.28794097900390625, -0.22619056701660156, 0.014133453369140625, -0.06551551818847656, 0.025968551635742188, -0.471160888671875, -0.6309051513671875, -0.1910552978515625, 0.07957077026367188, -0.21169281005859375, 0.24401092529296875, -0.15792465209960938, 0.20175933837890625, 0.78826904296875, 0.521087646484375, -0.1317138671875, 0.188812255859375, 0.4883537292480469, -0.0513153076171875, -0.7578048706054688, 0.10359954833984375, -0.04312896728515625, -0.3130950927734375, 0.033382415771484375, 0.022045135498046875, 0.0551605224609375, 0.19124603271484375, -0.034252166748046875, 0.16277503967285156, 0.15229034423828125, 0.3057403564453125, -0.17569732666015625, -0.1495513916015625, -0.02835845947265625, 0.1545257568359375, 0.7416152954101562, 0.4808349609375, -0.5852813720703125, 0.2403411865234375, -0.4715118408203125, 0.0699920654296875, -0.09960174560546875, -0.740478515625, 0.07028579711914062, 0.10368919372558594, 0.1747283935546875, 0.40193939208984375, 0.16406631469726562, 0.15826416015625, -0.11525917053222656, 0.14200973510742188, -0.1512603759765625, 0.126373291015625, 0.039005279541015625, 0.4204730987548828, -0.016506195068359375, -0.1341705322265625, -0.16619873046875, 0.7204971313476562, -0.316497802734375, -0.04926300048828125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000011.npy"}
{"epoch": 0.016628873771730914, "step": 12, "batch_size": 64, "mean": -0.018443971872329712, "std": 0.2739444673061371, "min": -0.678924560546875, "p10": -0.39494667053222654, "median": 0.03540992736816406, "p90": 0.2897815704345705, "max": 0.742706298828125, "pos_frac": 0.53125, "sample": [0.06185722351074219, -0.5461502075195312, 0.14237022399902344, 0.1355304718017578, -0.018209457397460938, -0.370819091796875, 0.009328842163085938, 0.0911407470703125, -0.04772377014160156, -0.07561492919921875, -0.678924560546875, -0.3762626647949219, -0.48316192626953125, 0.45196533203125, 0.12859344482421875, -0.061676025390625, 0.10150146484375, -0.06763458251953125, 0.106903076171875, 0.3204689025878906, -0.4029541015625, 0.2458648681640625, -0.3401031494140625, 0.742706298828125, 0.22892189025878906, -0.21714019775390625, -0.1952686309814453, 0.3452262878417969, -0.299713134765625, -0.1415119171142578, -0.0437774658203125, 0.100799560546875, -0.326904296875, -0.4814605712890625, -0.13023757934570312, -0.2249603271484375, 0.05277252197265625, 0.34473419189453125, 0.05925750732421875, -0.30744361877441406, 0.071136474609375, -0.105682373046875, -0.06406402587890625, 0.06608200073242188, 0.20545196533203125, 0.6149711608886719, -0.0408782958984375, 0.0375518798828125, 0.14090728759765625, 0.12711715698242188, 0.2510490417480469, 0.3063812255859375, 0.15649795532226562, -0.445892333984375, 0.1804351806640625, 0.132293701171875, -0.42756080627441406, 0.1470489501953125, -0.3044700622558594, -0.08450126647949219, -0.13267135620117188, 0.033267974853515625, 0.07474708557128906, 0.048076629638671875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000012.npy"}
{"epoch": 0.018140589569160998, "step": 13, "batch_size": 64, "mean": -0.0182991623878479, "std": 0.30076324939727783, "min": -1.02276611328125, "p10": -0.3535144805908203, "median": 0.036365509033203125, "p90": 0.25807781219482423, "max": 0.6974220275878906, "pos_frac": 0.546875, "sample": [-1.02276611328125, 0.33300018310546875, -0.2059478759765625, 0.24129867553710938, -0.116455078125, -0.9535408020019531, 0.6974220275878906, 0.25150489807128906, 0.260894775390625, 0.11397552490234375, 0.13677406311035156, 0.012826919555664062, 0.10303497314453125, 0.03687286376953125, 0.1575775146484375, 0.4327812194824219, 0.059906005859375, 0.17344284057617188, -0.071868896484375, 0.0995025634765625, -0.15641021728515625, -0.24106597900390625, -0.325225830078125, 0.1323089599609375, -0.021327972412109375, 0.13417816162109375, -0.3412284851074219, 0.060909271240234375, 0.10587310791015625, -0.6021804809570312, -0.05479240417480469, -0.24633026123046875, 0.4967498779296875, -0.6577835083007812, -0.49920654296875, 0.027812957763671875, -0.0055084228515625, 0.2173919677734375, 0.41893577575683594, -0.3587799072265625, 0.17972183227539062, 0.14307403564453125, -0.13214874267578125, -0.00420379638671875, -0.06920623779296875, 0.219970703125, 0.035858154296875, -0.10251617431640625, -0.20656585693359375, -0.11309051513671875, -0.04506683349609375, -0.4143218994140625, 0.07967948913574219, 0.03797149658203125, 0.03743743896484375, -0.1148223876953125, -0.1106719970703125, 0.22454261779785156, -0.2475128173828125, 0.19105148315429688, 0.2666778564453125, 0.1526031494140625, -0.043701171875, 0.03953742980957031], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000013.npy"}
{"epoch": 0.019652305366591082, "step": 14, "batch_size": 64, "mean": 0.019834458827972412, "std": 0.3216404914855957, "min": -0.820465087890625, "p10": -0.2998764038085937, "median": -0.02169322967529297, "p90": 0.4532241821289065, "max": 0.932708740234375, "pos_frac": 0.4375, "sample": [-0.21947097778320312, -0.2041015625, -0.226318359375, 0.6399993896484375, 0.1626567840576172, 0.29486656188964844, 0.19488525390625, 0.932708740234375, -0.2217121124267578, 0.11029052734375, -0.00841522216796875, 0.5809669494628906, -0.10531044006347656, -0.18853759765625, 0.0555877685546875, 0.05692863464355469, -0.02614593505859375, 0.26849365234375, -0.272216796875, -0.06758499145507812, -0.22878265380859375, -0.6046676635742188, -0.1337432861328125, -0.020290374755859375, -0.01735687255859375, 0.09674835205078125, -0.820465087890625, 0.53692626953125, 0.36952972412109375, 0.14687347412109375, 0.02611541748046875, 0.7499008178710938, 0.19008636474609375, -0.2308349609375, -0.1389598846435547, 0.38796234130859375, 0.5237388610839844, -0.37470245361328125, 0.29969215393066406, -0.3809967041015625, 0.312652587890625, 0.3756122589111328, -0.18365859985351562, -0.026702880859375, 0.1076812744140625, -0.024679183959960938, -0.308807373046875, 0.09030914306640625, 0.48119354248046875, -0.03155517578125, -0.2790374755859375, -0.5374298095703125, 0.07992172241210938, -0.344818115234375, -0.018201828002929688, -0.105010986328125, -0.15468597412109375, -0.034626007080078125, -0.023096084594726562, -0.24173927307128906, 0.3647613525390625, -0.1265106201171875, -0.2721824645996094, 0.035671234130859375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000014.npy"}
{"epoch": 0.021164021164021163, "step": 15, "batch_size": 64, "mean": 0.03488925099372864, "std": 0.29750171303749084, "min": -0.65313720703125, "p10": -0.2864540100097656, "median": 0.024736404418945312, "p90": 0.35476531982421877, "max": 0.769775390625, "pos_frac": 0.53125, "sample": [0.3054180145263672, -0.1760406494140625, 0.05723762512207031, -0.1519451141357422, 0.036106109619140625, 0.5141220092773438, -0.0980072021484375, -0.28029632568359375, -0.161651611328125, -0.17938232421875, 0.17296981811523438, -0.05262184143066406, -0.1661834716796875, -0.06410408020019531, 0.356170654296875, 0.18093299865722656, 0.7328262329101562, 0.1895599365234375, 0.0013065338134765625, -0.6042499542236328, -0.20792388916015625, -0.30298614501953125, 0.210296630859375, 0.1589813232421875, -0.10671234130859375, 0.609649658203125, 0.06412506103515625, -0.1997528076171875, 0.65362548828125, 0.266754150390625, -0.035327911376953125, -0.22107696533203125, -0.58099365234375, -0.5189056396484375, 0.3202323913574219, 0.1100921630859375, 0.0584564208984375, 0.3514862060546875, -0.07515335083007812, -0.062225341796875, 0.09844970703125, 0.2211456298828125, 0.21352386474609375, 0.1001129150390625, 0.19684219360351562, -0.65313720703125, -0.11182403564453125, 0.03839111328125, 0.2875556945800781, -0.289093017578125, -0.4684906005859375, 0.10229301452636719, -0.07805633544921875, 0.4527740478515625, 0.08385848999023438, -0.07822418212890625, -0.17816162109375, 0.17678451538085938, -0.053112030029296875, -0.024250030517578125, 0.769775390625, 0.01336669921875, 0.325592041015625, -0.01801300048828125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000015.npy"}
{"epoch": 0.022675736961451247, "step": 16, "batch_size": 64, "mean": -0.011950835585594177, "std": 0.2813331186771393, "min": -0.8603515625, "p10": -0.317413330078125, "median": -0.05694103240966797, "p90": 0.3721603393554688, "max": 0.6271133422851562, "pos_frac": 0.453125, "sample": [-0.10750579833984375, 0.2953071594238281, -0.36710357666015625, 0.05404853820800781, 0.019298553466796875, -0.17335128784179688, -0.18323326110839844, -0.43346405029296875, -0.1114501953125, 0.1800994873046875, -0.05992889404296875, 0.2683868408203125, 0.38134765625, -0.35584259033203125, 0.27640342712402344, 0.0308837890625, 0.1492938995361328, 0.08923912048339844, -0.17043304443359375, -0.111328125, 0.2056427001953125, -0.16281509399414062, 0.45849609375, 0.28119659423828125, 0.056232452392578125, -0.3843994140625, 0.3988075256347656, -0.018680572509765625, 0.05941581726074219, 0.3507232666015625, 0.6160659790039062, 0.1471710205078125, 0.29447174072265625, -0.20668411254882812, -0.0992889404296875, 0.34319305419921875, 0.394744873046875, -0.2870025634765625, -0.23379898071289062, -0.284637451171875, -0.19850921630859375, -0.05395317077636719, -0.15218353271484375, -0.15294265747070312, 0.077880859375, -0.014148712158203125, 0.23871612548828125, -0.30892181396484375, 0.12349510192871094, -0.2064800262451172, -0.08480453491210938, -0.1350536346435547, 0.02673625946044922, -0.32105255126953125, -0.239013671875, -0.14518356323242188, -0.0618896484375, -0.5997161865234375, -0.8603515625, 0.3854198455810547, 0.025325775146484375, -0.20310211181640625, 0.6271133422851562, -0.13175582885742188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000016.npy"}
{"epoch": 0.02418745275888133, "step": 17, "batch_size": 64, "mean": 0.015699952840805054, "std": 0.3616268038749695, "min": -0.89453125, "p10": -0.4206916809082031, "median": -0.012938499450683594, "p90": 0.47805881500244163, "max": 1.060394287109375, "pos_frac": 0.46875, "sample": [0.10935592651367188, -0.0501861572265625, -0.012037277221679688, 0.03812980651855469, 0.5039825439453125, 0.22548294067382812, -0.711883544921875, -0.06574249267578125, 0.4175701141357422, -0.06270599365234375, 0.1824493408203125, -0.5401420593261719, -0.10534858703613281, -0.2506828308105469, -0.06776809692382812, -0.1545562744140625, 0.06494903564453125, -0.36965179443359375, -0.05776405334472656, -0.89453125, 0.923675537109375, -0.2274608612060547, -0.606536865234375, -0.238739013671875, 0.02046966552734375, -0.025236129760742188, -0.2064056396484375, -0.12743186950683594, -0.04881477355957031, 0.14104080200195312, -0.34006500244140625, -0.5337753295898438, 0.6071853637695312, -0.21961593627929688, 0.32634735107421875, 1.060394287109375, 0.1581573486328125, 0.5522232055664062, 0.1099395751953125, -0.44256591796875, -0.0138397216796875, -0.07402801513671875, 0.04855155944824219, 0.23810577392578125, -0.00543212890625, 0.12197113037109375, -0.2622528076171875, 0.53106689453125, 0.32596588134765625, 0.351654052734375, 0.3694610595703125, -0.7137451171875, -0.19609832763671875, -0.16618728637695312, 0.5781478881835938, 0.07889556884765625, -0.06489372253417969, 0.18387222290039062, 0.39936065673828125, -0.05155181884765625, 0.16077041625976562, 0.06303024291992188, 0.18595504760742188, -0.16568756103515625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000017.npy"}
{"epoch": 0.025699168556311415, "step": 18, "batch_size": 64, "mean": 0.06356379389762878, "std": 0.2850642204284668, "min": -0.786163330078125, "p10": -0.2439249038696289, "median": 0.04323863983154297, "p90": 0.42846298217773443, "max": 0.7415084838867188, "pos_frac": 0.609375, "sample": [0.0031585693359375, 0.09661102294921875, 0.1745147705078125, 0.10734367370605469, 0.496124267578125, 0.12885284423828125, 0.19077110290527344, 0.6613388061523438, 0.41033172607421875, -0.19665908813476562, 0.12743377685546875, 0.0244903564453125, 0.009019851684570312, 0.043060302734375, 0.18141937255859375, 0.38095855712890625, -0.09364128112792969, 0.2638702392578125, 0.4362335205078125, 0.623565673828125, 0.18928909301757812, 0.7415084838867188, 0.16748809814453125, 0.47855377197265625, -0.02106475830078125, 0.04317283630371094, -0.42998504638671875, 0.2130889892578125, -0.24639511108398438, 0.09043312072753906, -0.1315631866455078, -0.04703521728515625, -0.5220222473144531, -0.4376678466796875, -0.2381610870361328, -0.01441192626953125, -0.08144378662109375, -0.17945480346679688, 0.002716064453125, 0.043304443359375, 0.34755706787109375, 0.12732696533203125, -0.18042755126953125, -0.786163330078125, -0.1010284423828125, 0.27386474609375, -0.08245658874511719, -0.07384681701660156, -0.1990509033203125, -0.3344879150390625, -0.3492889404296875, 0.020009994506835938, 0.07855606079101562, -0.057209014892578125, 0.27278900146484375, -0.108917236328125, -0.07288360595703125, 0.061237335205078125, 0.25616455078125, 0.09862899780273438, 0.4895820617675781, -0.0755767822265625, 0.3964958190917969, 0.37805938720703125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000018.npy"}
{"epoch": 0.027210884353741496, "step": 19, "batch_size": 64, "mean": 0.06925900280475616, "std": 0.27696365118026733, "min": -0.9151687622070312, "p10": -0.25036964416503904, "median": 0.04220867156982422, "p90": 0.36347808837890627, "max": 0.6920661926269531, "pos_frac": 0.609375, "sample": [0.248382568359375, 0.32624053955078125, 0.08829689025878906, -0.06658935546875, 0.0447540283203125, 0.3393268585205078, -0.16373443603515625, -0.3111381530761719, -0.39910888671875, 0.15016937255859375, -0.029571533203125, -0.0773162841796875, 0.20281982421875, 0.03966331481933594, 0.5680389404296875, -0.04150199890136719, 0.19208335876464844, 0.028501510620117188, 0.05426025390625, 0.56756591796875, 0.171142578125, -0.058502197265625, -0.16997528076171875, 0.0356903076171875, 0.48583984375, 0.25247955322265625, 0.23741912841796875, 0.03495025634765625, 0.03577423095703125, -0.06714057922363281, 0.6629180908203125, -0.22880172729492188, -0.3132781982421875, 0.2357330322265625, -0.39019775390625, -0.259613037109375, 0.1902618408203125, -0.00600433349609375, 0.3204689025878906, 0.05520820617675781, -0.08857917785644531, -0.18732833862304688, 0.36602783203125, 0.15093231201171875, -0.04454803466796875, 0.2118511199951172, -0.0881805419921875, 0.5075225830078125, 0.11289787292480469, 0.29283905029296875, 0.1699371337890625, 0.011568069458007812, 0.6920661926269531, -0.15631580352783203, 0.3575286865234375, 0.016384124755859375, -0.9151687622070312, -0.05268096923828125, -0.08054542541503906, 0.34310150146484375, -0.15755844116210938, -0.3398704528808594, 0.07647705078125, 0.2487030029296875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000019.npy"}
{"epoch": 0.02872260015117158, "step": 20, "batch_size": 64, "mean": -0.009327858686447144, "std": 0.2824243903160095, "min": -0.7684783935546875, "p10": -0.3807548522949218, "median": -0.007671356201171875, "p90": 0.29600181579589846, "max": 0.74249267578125, "pos_frac": 0.5, "sample": [-0.18596649169921875, -0.18576812744140625, 0.06111907958984375, -0.09043502807617188, 0.3005943298339844, -0.41855621337890625, -0.06175994873046875, 0.19287109375, 0.28528594970703125, 0.204620361328125, 0.2801475524902344, 0.24556350708007812, -0.5815277099609375, 0.16736984252929688, -0.1935882568359375, -0.06317901611328125, -0.5356674194335938, 0.030176162719726562, 0.37055206298828125, -0.7684783935546875, 0.1016845703125, -0.13475799560546875, -0.11942291259765625, -0.027496337890625, 0.15558815002441406, -0.1549530029296875, 0.2616291046142578, -0.1630992889404297, 0.018218994140625, 0.06073570251464844, -0.31170654296875, -0.6772422790527344, -0.17278289794921875, 0.20223236083984375, -0.41034698486328125, 0.10517501831054688, 0.02307891845703125, 0.06853485107421875, -0.3091583251953125, 0.48248291015625, -0.09268379211425781, -0.01599884033203125, -0.19338226318359375, -0.066131591796875, -0.30475616455078125, 0.221343994140625, 0.39293670654296875, -0.03293609619140625, 0.10089111328125, -0.094696044921875, -0.5582656860351562, 0.2233734130859375, 0.3501739501953125, 0.74249267578125, 0.13854217529296875, -0.04553413391113281, 0.2561607360839844, 0.44663238525390625, -0.0235748291015625, 0.038509368896484375, -0.049808502197265625, 0.029842376708984375, -0.11253738403320312, 0.0006561279296875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000020.npy"}
{"epoch": 0.030234315948601664, "step": 21, "batch_size": 64, "mean": -0.022250384092330933, "std": 0.29816141724586487, "min": -0.76824951171875, "p10": -0.3880998611450195, "median": 0.011896133422851562, "p90": 0.3661300659179689, "max": 0.7617034912109375, "pos_frac": 0.515625, "sample": [0.12182044982910156, -0.0712890625, 0.26354026794433594, -0.4771270751953125, 0.20545196533203125, 0.1587982177734375, 0.016376495361328125, -0.1644763946533203, -0.026170730590820312, -0.23850631713867188, 0.043560028076171875, -0.515533447265625, -0.041473388671875, 0.09474945068359375, 0.053131103515625, 0.3859100341796875, -0.49039459228515625, 0.6265182495117188, -0.15627670288085938, -0.05777740478515625, -0.02425384521484375, 0.017621994018554688, 0.051082611083984375, -0.2964019775390625, 0.259674072265625, -0.76824951171875, -0.1725444793701172, 0.7617034912109375, -0.1005859375, 0.15750503540039062, 0.172027587890625, 0.018144607543945312, -0.711761474609375, -0.20507049560546875, -0.3205108642578125, -0.06456756591796875, 0.471954345703125, 0.11928558349609375, -0.39600372314453125, -0.3438739776611328, 0.4495086669921875, -0.32733154296875, 0.007415771484375, -0.3696575164794922, -0.3263092041015625, 0.15472793579101562, 0.120269775390625, -0.055210113525390625, -0.4635467529296875, 0.09283638000488281, 0.09075164794921875, 0.420135498046875, 0.13697052001953125, 0.319976806640625, -0.02483367919921875, 0.1415863037109375, -0.14745330810546875, 0.07367134094238281, 0.1839141845703125, -0.2583465576171875, -0.20294570922851562, 0.42484283447265625, 0.05063629150390625, -0.2716407775878906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000021.npy"}
{"epoch": 0.031746031746031744, "step": 22, "batch_size": 64, "mean": -0.0016689598560333252, "std": 0.2811715602874756, "min": -0.7690582275390625, "p10": -0.3966651916503906, "median": 0.06168174743652344, "p90": 0.34532470703125, "max": 0.601318359375, "pos_frac": 0.5625, "sample": [-0.46952056884765625, -0.5551681518554688, -0.3511695861816406, 0.2945365905761719, 0.3471202850341797, 0.20610809326171875, 0.1690540313720703, 0.1183624267578125, -0.07348060607910156, 0.2689647674560547, 0.2602577209472656, 0.22008514404296875, -0.0498199462890625, 0.06599807739257812, -0.041172027587890625, -0.451751708984375, -0.2473602294921875, -0.21732330322265625, 0.09392547607421875, -0.38854217529296875, 0.047149658203125, 0.2520599365234375, -0.08654022216796875, 0.036128997802734375, -0.14144515991210938, -0.40521240234375, 0.17270278930664062, 0.17110633850097656, -0.0430145263671875, 0.15901756286621094, -0.7690582275390625, -0.166107177734375, 0.09313201904296875, -0.009185791015625, 0.27179908752441406, -0.17690277099609375, 0.10629844665527344, 0.11767578125, -0.5494308471679688, -0.2912139892578125, 0.34113502502441406, 0.41259765625, -0.33051300048828125, 0.1708984375, -0.023403167724609375, -0.15313720703125, 0.05736541748046875, -0.2752532958984375, 0.40606689453125, 0.08905029296875, 0.055713653564453125, 0.601318359375, 0.08092498779296875, 0.4042396545410156, 0.471221923828125, 0.08037376403808594, -0.2233409881591797, 0.11423301696777344, 0.08289337158203125, -0.1515045166015625, -0.32535552978515625, 0.35152435302734375, 0.06822013854980469, -0.400146484375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000022.npy"}
{"epoch": 0.03325774754346183, "step": 23, "batch_size": 64, "mean": 0.01681581139564514, "std": 0.334221214056015, "min": -0.9820404052734375, "p10": -0.41097259521484375, "median": 0.0262451171875, "p90": 0.34746856689453126, "max": 1.1717529296875, "pos_frac": 0.5625, "sample": [-0.68353271484375, 0.67108154296875, 0.20926666259765625, -0.07135009765625, -0.07539176940917969, 0.10191917419433594, 0.05713653564453125, -0.3943634033203125, 0.18221282958984375, 0.02874755859375, 0.10332489013671875, -0.12074470520019531, 0.28136444091796875, -0.07684707641601562, 0.049068450927734375, 0.350067138671875, -0.4883880615234375, 0.3465232849121094, -0.1166229248046875, 0.35227203369140625, 0.11671066284179688, 0.06476020812988281, 0.039745330810546875, 0.13858795166015625, 0.3079185485839844, -0.0356597900390625, 0.0103759765625, -0.2442169189453125, 1.1717529296875, 0.02374267578125, -0.9820404052734375, -0.08602142333984375, -0.06880950927734375, 0.24441909790039062, 0.6565895080566406, -0.370880126953125, -0.276336669921875, -0.0676727294921875, 0.15955734252929688, -0.3077831268310547, 0.2271575927734375, 0.06864166259765625, 0.000247955322265625, -0.4245758056640625, 0.20423507690429688, 0.615692138671875, -0.156982421875, 0.0060882568359375, -0.01892852783203125, -0.4180908203125, -0.5038909912109375, -0.008819580078125, -0.033420562744140625, 0.13277053833007812, -0.5488090515136719, -0.24250030517578125, 0.24900054931640625, 0.13632965087890625, 0.3478736877441406, 0.30637359619140625, 0.0634307861328125, -0.2686500549316406, 0.15610885620117188, -0.013553619384765625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000023.npy"}
{"epoch": 0.03476946334089191, "step": 24, "batch_size": 64, "mean": 0.019780874252319336, "std": 0.26083430647850037, "min": -0.6362476348876953, "p10": -0.288701057434082, "median": 0.02825927734375, "p90": 0.3249584197998047, "max": 0.8259201049804688, "pos_frac": 0.53125, "sample": [0.17391204833984375, -0.054050445556640625, 0.002044677734375, 0.3901176452636719, 0.0909423828125, -0.14543533325195312, 0.22599411010742188, -0.071533203125, -0.16773605346679688, -0.0095672607421875, 0.09355354309082031, -0.08811187744140625, -0.2584800720214844, 0.0730743408203125, 0.3183441162109375, -0.57550048828125, 0.13768768310546875, -0.3016529083251953, -0.14725494384765625, 0.2671222686767578, -0.12398529052734375, 0.3277931213378906, -0.01499176025390625, -0.2249755859375, 0.8259201049804688, -0.18109512329101562, 0.13860702514648438, -0.012042999267578125, -0.2578277587890625, 0.3591156005859375, -0.03399658203125, -0.37882423400878906, -0.1013946533203125, -0.07456588745117188, -0.091705322265625, 0.3111724853515625, -0.34758758544921875, 0.19618797302246094, -0.001514434814453125, 0.1898040771484375, 0.34174346923828125, 0.14239120483398438, 0.2942962646484375, 0.10604095458984375, 0.277587890625, 0.0918731689453125, 0.04489898681640625, -0.38262939453125, 0.07067298889160156, 0.130218505859375, -0.6362476348876953, -0.18939208984375, 0.18479156494140625, -0.09598541259765625, -0.47998046875, 0.0273284912109375, 0.03800201416015625, -0.207794189453125, 0.4795494079589844, -0.25121307373046875, 0.1996612548828125, 0.172332763671875, 0.0291900634765625, 0.42107582092285156], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000024.npy"}
{"epoch": 0.036281179138321996, "step": 25, "batch_size": 64, "mean": 0.0047473907470703125, "std": 0.35908302664756775, "min": -1.2697906494140625, "p10": -0.3058113098144531, "median": -0.031131744384765625, "p90": 0.4371833801269532, "max": 0.936676025390625, "pos_frac": 0.46875, "sample": [0.038974761962890625, -0.29022216796875, 0.0153656005859375, 0.39896392822265625, 0.11746978759765625, -0.1326007843017578, 0.36719512939453125, 0.011445999145507812, 0.1739959716796875, 0.10718917846679688, -0.0904998779296875, -0.14320755004882812, -0.33563995361328125, 0.464111328125, 0.339141845703125, 0.03136444091796875, 0.055267333984375, 0.07683563232421875, -0.07863616943359375, 0.3720855712890625, -0.017974853515625, 0.08901214599609375, -0.31249237060546875, 0.71917724609375, -0.10468673706054688, -0.04538154602050781, 0.038555145263671875, 0.936676025390625, 0.061809539794921875, -0.2573089599609375, -0.11212158203125, -0.06423187255859375, -0.2354888916015625, 0.22840118408203125, -0.0337677001953125, -0.02849578857421875, -0.0986328125, -0.11767005920410156, -0.1783123016357422, 0.423858642578125, -1.2697906494140625, -0.12249374389648438, 0.1800079345703125, 0.1327991485595703, -0.0345001220703125, -0.1370849609375, -0.085235595703125, 0.001476287841796875, -0.28720855712890625, -0.63555908203125, -0.3718376159667969, -0.1348400115966797, 0.44289398193359375, -0.39281463623046875, -0.8988189697265625, 0.13181495666503906, 0.78936767578125, 0.0796051025390625, 0.5144577026367188, -0.28265380859375, -0.2082061767578125, -0.11968994140625, -0.178070068359375, 0.800689697265625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000025.npy"}
{"epoch": 0.03779289493575208, "step": 26, "batch_size": 64, "mean": 0.02373906970024109, "std": 0.31259018182754517, "min": -0.6033248901367188, "p10": -0.3210590362548828, "median": -0.024019241333007812, "p90": 0.3895320892333985, "max": 0.9209327697753906, "pos_frac": 0.484375, "sample": [0.17121505737304688, 0.2776031494140625, 0.379638671875, -0.10860443115234375, 0.2552986145019531, -0.023214340209960938, -0.324676513671875, 0.03472137451171875, -0.14554595947265625, 0.3937721252441406, -0.3126182556152344, 0.76055908203125, 0.2044219970703125, -0.0704803466796875, -0.4621429443359375, -0.36292266845703125, -0.12169647216796875, 0.29991722106933594, 0.3636436462402344, 0.4478797912597656, 0.5898895263671875, -0.1735095977783203, -0.36464691162109375, -0.2572975158691406, 0.16284561157226562, -0.53765869140625, 0.0455169677734375, -0.03979301452636719, 0.029844284057617188, -0.18008041381835938, -0.024824142456054688, 0.9209327697753906, -0.08831024169921875, 0.31952667236328125, 0.21955108642578125, -0.312530517578125, -0.29834747314453125, -0.13222312927246094, -0.1229705810546875, 0.3768310546875, 0.21640586853027344, 0.05275726318359375, -0.10888099670410156, -0.08814239501953125, 0.521087646484375, -0.138824462890625, -0.5374755859375, 0.34381103515625, 0.6252288818359375, -0.17392730712890625, 0.0318450927734375, -0.12652587890625, 0.04581451416015625, 0.1595458984375, -0.08892250061035156, -0.290191650390625, -0.15150833129882812, 0.10808944702148438, 0.13832855224609375, -0.22599220275878906, -0.6033248901367188, 0.1499004364013672, -0.22336578369140625, 0.09405326843261719], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000026.npy"}
{"epoch": 0.039304610733182165, "step": 27, "batch_size": 64, "mean": -0.004584580659866333, "std": 0.287662148475647, "min": -0.7574005126953125, "p10": -0.3563343048095703, "median": -0.0019588470458984375, "p90": 0.3404384613037111, "max": 0.9500732421875, "pos_frac": 0.5, "sample": [0.0767974853515625, -0.07748794555664062, 0.3572578430175781, -0.07262039184570312, 0.18301773071289062, 0.07837677001953125, -0.05294036865234375, -0.29170989990234375, 0.00226593017578125, -0.7574005126953125, -0.1564178466796875, -0.636962890625, 0.20349884033203125, 0.2951698303222656, 0.07008934020996094, 0.08294677734375, 0.04830360412597656, -0.07791900634765625, -0.36644744873046875, -0.16341400146484375, -0.19704437255859375, -0.17254638671875, -0.2838287353515625, -0.14669036865234375, 0.35820770263671875, -0.4006195068359375, -0.0416259765625, 0.2819671630859375, 0.39786529541015625, -0.3327369689941406, -0.0859527587890625, 0.1581249237060547, 0.2061138153076172, 0.28228759765625, 0.27524375915527344, -0.2429962158203125, 0.3539161682128906, 0.9500732421875, 0.37158203125, -0.04453277587890625, -0.48430633544921875, 0.059234619140625, -0.37450408935546875, -0.057521820068359375, 0.019237518310546875, 0.410247802734375, -0.287506103515625, -0.0346221923828125, 0.089599609375, 0.06256484985351562, -0.31645965576171875, 0.00615692138671875, -0.08455848693847656, -0.1116943359375, 0.15172386169433594, -0.006183624267578125, 0.308990478515625, 0.04657745361328125, -0.1988525390625, -0.5732002258300781, 0.23093032836914062, -0.083892822265625, 0.2874412536621094, 0.215972900390625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000027.npy"}
{"epoch": 0.04081632653061224, "step": 28, "batch_size": 64, "mean": -0.03518790006637573, "std": 0.2802904546260834, "min": -0.77984619140625, "p10": -0.40495071411132805, "median": -0.013835906982421875, "p90": 0.27501640319824217, "max": 0.545135498046875, "pos_frac": 0.484375, "sample": [0.3062744140625, -0.77984619140625, 0.14605712890625, 0.2358245849609375, -0.099456787109375, -0.13254928588867188, -0.0452117919921875, 0.00719451904296875, 0.2753334045410156, -0.135894775390625, -0.3221015930175781, 0.08138275146484375, -0.16133880615234375, -0.6705551147460938, -0.0122528076171875, 0.11964035034179688, -0.43935394287109375, -0.19607162475585938, -0.2947654724121094, 0.3691062927246094, 0.17034912109375, -0.21162033081054688, 0.09641456604003906, 0.060760498046875, 0.23998451232910156, -0.036106109619140625, -0.324676513671875, 0.3741607666015625, -0.587860107421875, 0.17229080200195312, 0.2742767333984375, -0.3021202087402344, 0.16535186767578125, 0.06340789794921875, 0.40912628173828125, -0.17169952392578125, -0.01541900634765625, -0.12554168701171875, -0.5367355346679688, -0.0981597900390625, -0.10500526428222656, -0.15126800537109375, -0.026157379150390625, 0.07807350158691406, 0.006439208984375, -0.073211669921875, -0.06841659545898438, -0.6767654418945312, 0.10046958923339844, 0.054248809814453125, 0.19467926025390625, 0.545135498046875, 0.4264564514160156, -0.18161773681640625, 0.251312255859375, -0.3203125, 0.020893096923828125, -0.47034454345703125, -0.29001617431640625, 0.13590431213378906, 0.1580047607421875, 0.15519332885742188, 0.22715377807617188, -0.1104736328125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000028.npy"}
{"epoch": 0.042328042328042326, "step": 29, "batch_size": 64, "mean": 0.005606889724731445, "std": 0.2923359274864197, "min": -0.8553314208984375, "p10": -0.31536178588867186, "median": 0.01105499267578125, "p90": 0.32391471862792987, "max": 0.9709320068359375, "pos_frac": 0.515625, "sample": [0.3430824279785156, -0.2552757263183594, 0.27097511291503906, -0.487213134765625, 0.5467376708984375, -0.1761646270751953, 0.0115203857421875, -0.23037338256835938, -0.0405120849609375, 0.448272705078125, 0.03502655029296875, 0.35523223876953125, -0.09716796875, 0.022602081298828125, 0.18014144897460938, 0.176300048828125, -0.184783935546875, -0.03652763366699219, -0.2001800537109375, 0.235015869140625, 0.07966995239257812, 0.0176544189453125, -0.00144195556640625, -0.1815013885498047, 0.010589599609375, -0.166534423828125, 0.08287811279296875, -0.5094070434570312, 0.085723876953125, 0.2567863464355469, 0.125579833984375, -0.32470703125, -0.8553314208984375, 0.042682647705078125, 0.028961181640625, 0.08699226379394531, -0.11043930053710938, -0.10349273681640625, -0.40777587890625, 0.4076080322265625, -0.09497833251953125, 0.03606414794921875, -0.061794281005859375, -0.07117080688476562, -0.24387550354003906, -0.05901336669921875, 0.06702423095703125, 0.2791900634765625, 0.14589691162109375, -0.04096794128417969, 0.2785491943359375, 0.06653594970703125, 0.6870040893554688, -0.013092041015625, -0.29355621337890625, 0.23530960083007812, -0.15756988525390625, -0.5315170288085938, -0.45903778076171875, 0.9709320068359375, -0.13934326171875, -0.040126800537109375, 0.05975914001464844, 0.257415771484375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000029.npy"}
{"epoch": 0.04383975812547241, "step": 30, "batch_size": 64, "mean": 0.02944222092628479, "std": 0.336262047290802, "min": -0.764984130859375, "p10": -0.3537506103515625, "median": -0.009145736694335938, "p90": 0.3972328186035158, "max": 1.389312744140625, "pos_frac": 0.484375, "sample": [-0.3742389678955078, 0.13494873046875, 0.01019287109375, -0.4172821044921875, -0.025964736938476562, -0.0347900390625, 0.06405830383300781, 0.35491943359375, -0.4506988525390625, -0.4708404541015625, 0.040821075439453125, 0.6160202026367188, -0.05615997314453125, 0.41536712646484375, -0.00453948974609375, 0.2693939208984375, -0.3082275390625, 0.8491058349609375, -0.1905975341796875, -0.203948974609375, 0.22924041748046875, -0.022491455078125, -0.05425453186035156, 0.4597492218017578, 0.178466796875, -0.25746917724609375, -0.2855491638183594, -0.39061737060546875, 0.2890625, -0.039249420166015625, 0.1371917724609375, -0.020177841186523438, -0.03326416015625, 0.7466964721679688, -0.35137939453125, 0.34107208251953125, -0.22722625732421875, -0.09087371826171875, 0.212554931640625, 0.05068397521972656, -0.03361701965332031, 0.2812614440917969, 1.389312744140625, 0.1328887939453125, -0.10135269165039062, 0.045196533203125, -0.16884613037109375, -0.013751983642578125, 0.0291290283203125, -0.764984130859375, 0.00241851806640625, 0.4357757568359375, 0.2855815887451172, 0.06415557861328125, 0.004161834716796875, -0.16098403930664062, -0.15655136108398438, -0.02796173095703125, -0.354766845703125, 0.019552230834960938, 0.16649627685546875, -0.24088287353515625, -0.21219635009765625, 0.1745624542236328], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000030.npy"}
{"epoch": 0.045351473922902494, "step": 31, "batch_size": 64, "mean": 0.023108333349227905, "std": 0.3122015595436096, "min": -0.6493911743164062, "p10": -0.3149787902832031, "median": -0.0425567626953125, "p90": 0.44008636474609386, "max": 0.8874740600585938, "pos_frac": 0.421875, "sample": [0.55706787109375, 0.21805572509765625, -0.520965576171875, 0.5085887908935547, -0.1411113739013672, 0.5472412109375, -0.13950729370117188, 0.22750473022460938, -0.29341888427734375, -0.116668701171875, -0.06253814697265625, 0.2817840576171875, 0.28536415100097656, -0.552398681640625, 0.04491424560546875, 0.409912109375, 0.37490272521972656, -0.06455230712890625, 0.28948211669921875, -0.2078094482421875, 0.23968505859375, -0.041896820068359375, 0.032840728759765625, -0.07940292358398438, -0.04254913330078125, 0.4530181884765625, 0.8874740600585938, 0.084197998046875, -0.14565277099609375, -0.172027587890625, -0.0634002685546875, -0.16099166870117188, -0.47223663330078125, -0.2248554229736328, -0.10117912292480469, 0.211761474609375, -0.22850799560546875, -0.07207489013671875, -0.43582916259765625, 0.06632614135742188, 0.3286285400390625, -0.6493911743164062, -0.15863418579101562, 0.5734405517578125, 0.6593017578125, -0.4429512023925781, 0.05577659606933594, -0.08952903747558594, 0.18454360961914062, 0.34598541259765625, -0.32421875, -0.03241729736328125, -0.132049560546875, -0.13699913024902344, -0.12774658203125, -0.04256439208984375, -0.11925888061523438, 0.3681793212890625, -0.03707313537597656, -0.216766357421875, 0.35837554931640625, -0.029230117797851562, -0.2878875732421875, 0.050872802734375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000031.npy"}
{"epoch": 0.04686318972033258, "step": 32, "batch_size": 64, "mean": 0.036289215087890625, "std": 0.31476888060569763, "min": -0.9563827514648438, "p10": -0.2761035919189453, "median": 0.040009498596191406, "p90": 0.395229721069336, "max": 0.7628288269042969, "pos_frac": 0.515625, "sample": [-0.2469024658203125, -0.09946441650390625, -0.25867462158203125, 0.2311248779296875, 0.12908935546875, 0.13678741455078125, 0.4547119140625, -0.49022674560546875, 0.239532470703125, -0.0072479248046875, 0.37329864501953125, 0.7628288269042969, 0.23627090454101562, 0.1098785400390625, 0.5573635101318359, 0.213348388671875, -0.7454681396484375, 0.46312713623046875, -0.4906044006347656, -0.11835098266601562, -0.21863365173339844, 0.3314208984375, -0.01454925537109375, 0.328094482421875, -0.186004638671875, -0.1384105682373047, 0.371826171875, 0.2514495849609375, 0.4488983154296875, 0.3792266845703125, 0.16609764099121094, 0.3415031433105469, 0.12871932983398438, -0.10018157958984375, 0.13231658935546875, -0.2449493408203125, -0.29619598388671875, -0.2835731506347656, -0.1177520751953125, 0.7464599609375, 0.22171783447265625, 0.323944091796875, 0.16790771484375, -0.31819915771484375, 0.010425567626953125, -0.15362548828125, -0.127166748046875, -0.038585662841796875, -0.1620941162109375, -0.20952796936035156, -0.1937255859375, 0.4020881652832031, -0.1623382568359375, -0.07159614562988281, -0.07749557495117188, 0.06959342956542969, -0.9563827514648438, 0.14141273498535156, -0.10606575012207031, 0.10674095153808594, -0.18935775756835938, 0.1482391357421875, 0.106353759765625, -0.0859375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000032.npy"}
{"epoch": 0.04837490551776266, "step": 33, "batch_size": 64, "mean": 0.004706323146820068, "std": 0.3486509919166565, "min": -0.91973876953125, "p10": -0.4237173080444336, "median": 0.05871391296386719, "p90": 0.4144979476928713, "max": 0.83868408203125, "pos_frac": 0.546875, "sample": [-0.40203857421875, -0.449066162109375, 0.26653289794921875, -0.11222076416015625, 0.07458686828613281, 0.18787384033203125, 0.6316757202148438, 0.026081085205078125, 0.6725692749023438, 0.45256805419921875, -0.1946563720703125, 0.15581512451171875, -0.2765674591064453, -0.12333297729492188, 0.23332977294921875, -0.8672866821289062, -0.513702392578125, 0.1876678466796875, 0.049713134765625, 0.495758056640625, -0.390380859375, 0.83868408203125, 0.26409912109375, -0.03798103332519531, -0.7832412719726562, 0.2229766845703125, -0.18851470947265625, 0.30043792724609375, 0.0817413330078125, 0.29473876953125, 0.20188522338867188, -0.1512622833251953, 0.13971710205078125, -0.0827789306640625, -0.0369873046875, 0.1469573974609375, 0.1317462921142578, -0.09783935546875, -0.08092689514160156, -0.1845226287841797, 0.23681640625, 0.06771469116210938, -0.43300819396972656, 0.5620269775390625, 0.14952468872070312, -0.293487548828125, 0.11314773559570312, -0.1891937255859375, 0.06888008117675781, 0.10361099243164062, -0.36215782165527344, -0.18339920043945312, 0.008291244506835938, -0.46588134765625, -0.07593536376953125, 0.31510162353515625, -0.2072906494140625, 0.4336090087890625, -0.91973876953125, -0.3202362060546875, 0.17103195190429688, 0.3231048583984375, 0.3699054718017578, -0.2550811767578125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000033.npy"}
{"epoch": 0.049886621315192746, "step": 34, "batch_size": 64, "mean": 0.040650635957717896, "std": 0.2951290011405945, "min": -0.8173141479492188, "p10": -0.3108184814453125, "median": 0.05209636688232422, "p90": 0.444269180297852, "max": 0.6971664428710938, "pos_frac": 0.546875, "sample": [-0.2557048797607422, 0.644989013671875, 0.1360015869140625, 0.12838363647460938, -0.07013702392578125, 0.219757080078125, -0.2285633087158203, -0.3151054382324219, 0.2481212615966797, 0.14788436889648438, 0.2908668518066406, -0.2910919189453125, -0.07445526123046875, 0.0489959716796875, -0.3838653564453125, -0.05620574951171875, 0.64483642578125, 0.3304443359375, 0.24982452392578125, 0.05519676208496094, -0.13762664794921875, -0.15814781188964844, 0.07149887084960938, 0.0574188232421875, -0.0669403076171875, 0.1521148681640625, 0.18314361572265625, 0.5208320617675781, -0.4067230224609375, -0.46732330322265625, -0.11052703857421875, 0.33010101318359375, -0.06888961791992188, -0.004268646240234375, -0.12836837768554688, -0.2181243896484375, 0.061740875244140625, 0.1956501007080078, 0.17084503173828125, -0.12233734130859375, -0.08132171630859375, -0.4676780700683594, 0.026523590087890625, 0.18659400939941406, 0.5344085693359375, 0.48822021484375, -0.25222015380859375, 0.3417167663574219, 0.2599639892578125, 0.5267753601074219, -0.04432868957519531, 0.01334381103515625, 0.6971664428710938, -0.3307952880859375, 0.17905426025390625, -0.3008155822753906, 0.26123046875, -0.20712661743164062, -0.10080528259277344, 0.11138916015625, 0.08221435546875, 0.2573699951171875, -0.0861663818359375, -0.8173141479492188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000034.npy"}
{"epoch": 0.05139833711262283, "step": 35, "batch_size": 64, "mean": 0.06845930218696594, "std": 0.32675719261169434, "min": -0.92254638671875, "p10": -0.19050674438476561, "median": 0.04993629455566406, "p90": 0.41238250732421877, "max": 0.990478515625, "pos_frac": 0.59375, "sample": [-0.199554443359375, -0.3829994201660156, 0.5805892944335938, -0.1421661376953125, -0.27519989013671875, 0.9762496948242188, -0.16939544677734375, 0.2848491668701172, -0.04589080810546875, -0.11525154113769531, 0.231292724609375, 0.12029647827148438, 0.043636322021484375, 0.025648117065429688, 0.05623626708984375, 0.10999488830566406, 0.8475341796875, -0.09976959228515625, 0.0724334716796875, -0.0220794677734375, 0.08275985717773438, 0.07470321655273438, 0.29693031311035156, -0.147430419921875, -0.017444610595703125, 0.02495574951171875, 0.128936767578125, -0.0832061767578125, 0.4487113952636719, -0.00363922119140625, -0.92254638671875, 0.5719375610351562, 0.2123260498046875, -0.3699798583984375, -0.0627899169921875, -0.0904083251953125, -0.1299571990966797, -0.69378662109375, 0.09572792053222656, -0.6192359924316406, 0.16420745849609375, 0.3401832580566406, 0.25646209716796875, -0.16123199462890625, 0.01982879638671875, 0.14644622802734375, 0.187164306640625, -0.06592559814453125, 0.03766632080078125, 0.0863189697265625, -0.1346874237060547, 0.28060150146484375, -0.12970733642578125, 0.11687469482421875, 0.11795806884765625, -0.09948348999023438, 0.990478515625, 0.0316162109375, 0.370819091796875, 0.123626708984375, 0.4071807861328125, 0.41461181640625, -0.07737159729003906, 0.264739990234375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000035.npy"}
{"epoch": 0.05291005291005291, "step": 36, "batch_size": 64, "mean": 0.05655008554458618, "std": 0.35572123527526855, "min": -0.847625732421875, "p10": -0.4291261672973632, "median": 0.04271888732910156, "p90": 0.581382751464844, "max": 0.9178466796875, "pos_frac": 0.578125, "sample": [0.09014892578125, 0.4591217041015625, 0.03034210205078125, -0.057903289794921875, -0.20811080932617188, 0.011379241943359375, 0.8083267211914062, 0.20218658447265625, 0.15389442443847656, -0.507232666015625, 0.11116790771484375, 0.06734085083007812, 0.6074981689453125, 0.1488800048828125, 0.014600753784179688, 0.101898193359375, 0.10678863525390625, -0.0630340576171875, 0.06266212463378906, 0.9178466796875, 0.055095672607421875, 0.2309417724609375, -0.01090240478515625, -0.10267066955566406, -0.22235107421875, -0.257537841796875, 0.17777633666992188, 0.7538528442382812, 0.3206062316894531, 0.4821929931640625, -0.1703033447265625, -0.023599624633789062, -0.03276824951171875, -0.081298828125, -0.3356781005859375, -0.17387771606445312, -0.4872856140136719, -0.248931884765625, -0.15797996520996094, -0.4665184020996094, -0.10402297973632812, 0.52044677734375, 0.290191650390625, -0.46543312072753906, -0.00072479248046875, 0.08571815490722656, 0.2928123474121094, 0.6380844116210938, 0.7264328002929688, 0.4722251892089844, 0.16466522216796875, -0.62640380859375, -0.2019939422607422, 0.076934814453125, 0.004638671875, -0.4739990234375, 0.229400634765625, 0.20961761474609375, -0.847625732421875, 0.10652923583984375, -0.12830162048339844, 0.6742095947265625, -0.3444099426269531, 0.01364898681640625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000036.npy"}
{"epoch": 0.05442176870748299, "step": 37, "batch_size": 64, "mean": 0.12516844272613525, "std": 0.3942371904850006, "min": -0.7794342041015625, "p10": -0.3184616088867187, "median": 0.08268356323242188, "p90": 0.6478523254394531, "max": 1.34295654296875, "pos_frac": 0.59375, "sample": [-0.20191192626953125, 0.460205078125, 0.5594215393066406, -0.03351593017578125, 0.0787353515625, 0.19381332397460938, -0.041484832763671875, 0.1733074188232422, 0.1818523406982422, 0.6034393310546875, 1.34295654296875, 0.19658660888671875, 0.1819305419921875, 0.8535308837890625, -0.294952392578125, -0.3883514404296875, 0.11458396911621094, 0.26071929931640625, 0.449310302734375, 0.64251708984375, 0.021451950073242188, -0.33187103271484375, -0.6767425537109375, 0.23035430908203125, -0.4900245666503906, -0.0147857666015625, 0.23819732666015625, 0.08663177490234375, -0.137420654296875, 0.00792694091796875, -0.2897186279296875, -0.3285369873046875, 0.6501388549804688, 0.013202667236328125, 0.881500244140625, 0.5635299682617188, 0.30536651611328125, -0.08717536926269531, -0.23563003540039062, -0.07434463500976562, 0.887542724609375, -0.22033309936523438, -0.11922836303710938, 0.11144256591796875, -0.7794342041015625, 0.075958251953125, 0.7045440673828125, 0.7133941650390625, 0.43643951416015625, -0.07830810546875, 0.020416259765625, 0.15262603759765625, -0.10146903991699219, -0.15652847290039062, 0.39693450927734375, 0.36419677734375, -0.151397705078125, 0.30035400390625, -0.11199951171875, 0.11248016357421875, 0.4109954833984375, -0.44036102294921875, -0.01372528076171875, -0.1685028076171875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000037.npy"}
{"epoch": 0.055933484504913075, "step": 38, "batch_size": 64, "mean": 0.037221550941467285, "std": 0.4079892337322235, "min": -0.7763137817382812, "p10": -0.530389404296875, "median": 0.06541061401367188, "p90": 0.5139091491699219, "max": 1.1215667724609375, "pos_frac": 0.578125, "sample": [0.03562355041503906, 0.8384361267089844, 0.4809722900390625, 1.1215667724609375, -0.363433837890625, 0.3461647033691406, -0.09088897705078125, 0.14376449584960938, -0.5911483764648438, -0.0872650146484375, 0.12370491027832031, -0.0289154052734375, 0.033416748046875, 0.02877044677734375, -0.35347747802734375, -0.4739494323730469, -0.7763137817382812, -0.0029964447021484375, 0.084808349609375, 0.1393890380859375, -0.3845367431640625, 0.14251708984375, -0.5774383544921875, 0.06976318359375, 0.02254486083984375, 0.08197021484375, 0.22479248046875, -0.5419387817382812, 0.5179214477539062, -0.31072998046875, -0.21829986572265625, -0.2403717041015625, -0.07685661315917969, 0.2472991943359375, 0.561920166015625, 0.07367324829101562, -0.7195816040039062, -0.7077560424804688, 0.06105804443359375, 0.13532447814941406, 0.5029792785644531, 0.1734161376953125, 0.3899688720703125, -0.04248809814453125, 0.3603630065917969, -0.5979995727539062, -0.400390625, -0.01068115234375, 0.7042617797851562, 0.504547119140625, 0.423828125, -0.181884765625, 0.94390869140625, -0.28205108642578125, -0.2103290557861328, 0.23863983154296875, 0.09506988525390625, 0.089630126953125, -0.21856689453125, 0.10145950317382812, 0.2888298034667969, 0.7216796875, 0.3219261169433594, -0.5034408569335938], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000038.npy"}
{"epoch": 0.05744520030234316, "step": 39, "batch_size": 64, "mean": 0.04584622383117676, "std": 0.4392756521701813, "min": -1.461700439453125, "p10": -0.49886894226074213, "median": 0.06268978118896484, "p90": 0.5606544494628907, "max": 0.889129638671875, "pos_frac": 0.609375, "sample": [0.6655158996582031, -0.30625152587890625, -0.19959259033203125, 0.29274940490722656, -0.107330322265625, -0.52301025390625, 0.8324508666992188, 0.4617958068847656, 0.08856964111328125, 0.22014427185058594, 0.048553466796875, 0.09486770629882812, 0.5222721099853516, -0.11503028869628906, 0.450836181640625, 0.22084426879882812, -0.11081695556640625, 0.6729755401611328, 0.4325714111328125, -0.2929840087890625, -0.7898445129394531, -0.5415191650390625, -0.04630279541015625, 0.22323989868164062, 0.565673828125, -0.026277542114257812, -1.461700439453125, 0.2513313293457031, -0.24463653564453125, -0.4425392150878906, 0.1425457000732422, -0.2570953369140625, 0.15752410888671875, 0.04776763916015625, -0.026058197021484375, 0.326934814453125, -0.4231376647949219, 0.6338882446289062, 0.0316162109375, -0.7528610229492188, 0.0626373291015625, -1.21881103515625, 0.05138397216796875, 0.302886962890625, 0.098846435546875, -0.07863998413085938, 0.889129638671875, 0.22418212890625, 0.5915946960449219, 0.06274223327636719, 0.5489425659179688, -0.692718505859375, 0.13536453247070312, -0.062427520751953125, -0.008287429809570312, 0.1738433837890625, 0.00656890869140625, 0.4680023193359375, 0.32669830322265625, -0.1204071044921875, 0.345550537109375, 0.23442649841308594, -0.16550445556640625, 0.04047393798828125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000039.npy"}
{"epoch": 0.05895691609977324, "step": 40, "batch_size": 64, "mean": 0.0119723379611969, "std": 0.2963016629219055, "min": -0.942291259765625, "p10": -0.315643310546875, "median": 0.022867202758789062, "p90": 0.36672248840332045, "max": 0.7061004638671875, "pos_frac": 0.5625, "sample": [-0.06815338134765625, 0.16689300537109375, -0.12900924682617188, 0.22705650329589844, -0.1733531951904297, 0.20233535766601562, 0.43344879150390625, 0.0254669189453125, 0.1522064208984375, 0.2920989990234375, 0.21215057373046875, 0.1474781036376953, 0.27332305908203125, -0.2912883758544922, 0.3792228698730469, 0.010137557983398438, 0.065185546875, 0.7061004638671875, 0.22810745239257812, -0.144256591796875, -0.2377166748046875, 0.337554931640625, 0.26311492919921875, 0.17676162719726562, 0.25335693359375, -0.29189300537109375, 0.013965606689453125, -0.003265380859375, 0.079681396484375, -0.13654136657714844, 0.5275020599365234, -0.75262451171875, -0.2047119140625, 0.07279205322265625, 0.19565391540527344, 0.020267486572265625, 0.03633880615234375, -0.11504364013671875, -0.1249542236328125, -0.325531005859375, 0.04926300048828125, -0.422698974609375, 0.40229034423828125, -0.144378662109375, -0.292572021484375, -0.043079376220703125, -0.04667472839355469, -0.4167442321777344, -0.12776565551757812, 0.01214599609375, 0.4966926574707031, 0.1019744873046875, 0.19552993774414062, 0.456390380859375, 0.26021575927734375, -0.438720703125, -0.942291259765625, 0.196044921875, -0.1621246337890625, -0.44280242919921875, 0.15044403076171875, -0.10312652587890625, -0.25323486328125, -0.21840667724609375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000040.npy"}
{"epoch": 0.06046863189720333, "step": 41, "batch_size": 64, "mean": 0.12007108330726624, "std": 0.3749207854270935, "min": -0.6768798828125, "p10": -0.3119544982910156, "median": 0.07917594909667969, "p90": 0.6283332824707032, "max": 1.085601806640625, "pos_frac": 0.609375, "sample": [0.15972900390625, 0.22739028930664062, -0.16782379150390625, -0.5939865112304688, 0.0023193359375, -0.6768798828125, -0.24686431884765625, 0.7336463928222656, -0.06694412231445312, -0.21523284912109375, 0.4386138916015625, 0.15956878662109375, 0.00341033935546875, 0.2525787353515625, 0.27863311767578125, 0.9423065185546875, 0.08455276489257812, 0.6162261962890625, -0.0408935546875, -0.06860160827636719, 0.6363677978515625, -0.48394775390625, -0.28055572509765625, 0.469146728515625, 0.03133964538574219, -0.05458831787109375, 0.1843414306640625, 0.09092330932617188, 0.14951324462890625, 0.049407958984375, -0.023866653442382812, 0.7677459716796875, 1.085601806640625, 0.28971099853515625, -0.15710067749023438, -0.062591552734375, 0.47137451171875, -0.39604949951171875, 0.5728378295898438, -0.06263160705566406, 0.32247161865234375, -0.302154541015625, -0.15959930419921875, 0.007846832275390625, 0.82373046875, 0.4043693542480469, 0.13988113403320312, 0.150787353515625, 0.07379913330078125, 0.4053955078125, -0.07065582275390625, -0.141937255859375, -0.15692138671875, -0.31615447998046875, 0.42595672607421875, -0.3768768310546875, 0.16324615478515625, 0.11286163330078125, 0.5154781341552734, 0.0213165283203125, 0.588348388671875, 0.6335220336914062, -0.14043807983398438, -0.5384521484375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000041.npy"}
{"epoch": 0.06198034769463341, "step": 42, "batch_size": 64, "mean": 0.00273972749710083, "std": 0.4251692295074463, "min": -1.1341629028320312, "p10": -0.5154098510742188, "median": -0.018983840942382812, "p90": 0.5566230773925783, "max": 0.9421234130859375, "pos_frac": 0.46875, "sample": [-0.0776214599609375, 0.668212890625, -0.7215194702148438, -0.20885086059570312, -0.2060070037841797, -0.580078125, 0.1929779052734375, -0.056972503662109375, -0.35645294189453125, 0.086029052734375, -0.5434837341308594, 0.05474090576171875, -0.3136444091796875, -0.14594268798828125, -0.05818939208984375, 0.25785064697265625, 0.8047943115234375, 0.578277587890625, -0.10740280151367188, 0.5814743041992188, -0.11713027954101562, 0.5060958862304688, 0.4061164855957031, 0.26761627197265625, -0.54901123046875, -0.01105499267578125, -0.026912689208984375, 0.0536651611328125, -0.14265060424804688, -0.37244606018066406, 0.020122528076171875, 0.6924400329589844, -0.11731529235839844, -0.18547821044921875, 0.8793716430664062, 0.3121185302734375, 0.3456268310546875, 0.0944976806640625, -0.5239334106445312, -0.1402130126953125, -0.27201271057128906, -1.0776214599609375, -1.1341629028320312, -0.4148406982421875, -0.24568939208984375, 0.033721923828125, 0.16469573974609375, 0.9421234130859375, -0.3985099792480469, -0.49552154541015625, -0.21802330017089844, 0.4577751159667969, 0.1702117919921875, -0.005924224853515625, 0.35387420654296875, -0.1638031005859375, -0.30587005615234375, 0.02750396728515625, 0.4329833984375, 0.4831428527832031, 0.24628257751464844, 0.4502983093261719, 0.16689300537109375, -0.26190185546875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000042.npy"}
{"epoch": 0.06349206349206349, "step": 43, "batch_size": 64, "mean": 0.0393202006816864, "std": 0.3804587125778198, "min": -1.1278152465820312, "p10": -0.49287567138671873, "median": 0.07504940032958984, "p90": 0.4800422668457033, "max": 0.969482421875, "pos_frac": 0.59375, "sample": [0.05525970458984375, 0.0613861083984375, -0.36456298828125, -0.043254852294921875, -0.5011863708496094, -0.5375995635986328, 0.10042953491210938, -0.03592491149902344, 0.009979248046875, -0.6056900024414062, 0.1742725372314453, 0.2497711181640625, 0.4940338134765625, -0.44329071044921875, -0.091217041015625, 0.4335479736328125, 0.2689189910888672, -0.4063568115234375, 0.044239044189453125, 0.191070556640625, 0.2707347869873047, 0.32818603515625, 0.1031951904296875, -0.025136947631835938, 0.6687736511230469, 0.08871269226074219, 0.1824626922607422, 0.969482421875, -0.19345855712890625, 0.3172149658203125, 0.5949249267578125, 0.304779052734375, 0.3244361877441406, -0.092620849609375, 0.3696765899658203, 0.413360595703125, -0.09098052978515625, -0.5572566986083984, 0.44739532470703125, 0.55804443359375, 0.10875701904296875, -0.2947120666503906, -0.3417205810546875, -1.1278152465820312, 0.15058326721191406, 0.597564697265625, 0.11334991455078125, 0.4281044006347656, -0.37516021728515625, 0.3379993438720703, -0.05980110168457031, 0.5005416870117188, 0.060577392578125, 0.0149383544921875, 0.13828659057617188, -0.049774169921875, -0.4734840393066406, -0.09841537475585938, 0.36517333984375, -0.084808349609375, -0.51922607421875, -0.45674896240234375, 0.15183258056640625, -0.6053009033203125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000043.npy"}
{"epoch": 0.06500377928949358, "step": 44, "batch_size": 64, "mean": -0.005816161632537842, "std": 0.4509422481060028, "min": -1.3038482666015625, "p10": -0.49583358764648433, "median": -0.016294479370117188, "p90": 0.532943344116211, "max": 1.3270263671875, "pos_frac": 0.46875, "sample": [-0.562042236328125, -0.4209098815917969, 0.19908905029296875, -0.21479225158691406, 0.9391098022460938, -0.139556884765625, 0.32489013671875, -0.31612396240234375, -0.7786293029785156, -0.5827102661132812, -0.07769775390625, 0.7393951416015625, -1.3038482666015625, 0.30043792724609375, -0.010562896728515625, -0.271026611328125, -0.1645488739013672, -0.09391403198242188, 0.06809806823730469, 0.25876426696777344, -0.23661041259765625, 0.4227790832519531, 0.2775707244873047, -0.1972503662109375, 0.029087066650390625, -0.244659423828125, 0.658660888671875, -0.5030593872070312, -0.247650146484375, -0.478973388671875, 0.035675048828125, -0.31484222412109375, 0.05938720703125, -0.02202606201171875, 0.08985710144042969, 0.5015754699707031, 1.14764404296875, -0.04480171203613281, 0.0286102294921875, 0.052093505859375, -0.06411933898925781, 0.011745452880859375, 0.0089263916015625, 0.28668212890625, 0.5741195678710938, 0.54638671875, -0.010162353515625, 0.4751129150390625, -0.097015380859375, 0.14490699768066406, -0.4710884094238281, 0.1879730224609375, 0.00310516357421875, -0.5693817138671875, -0.2716064453125, -0.09904098510742188, 0.26842498779296875, 1.3270263671875, 0.3268871307373047, -0.44731903076171875, -0.1185455322265625, -0.034717559814453125, -0.30922698974609375, -0.9477958679199219], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000044.npy"}
{"epoch": 0.06651549508692366, "step": 45, "batch_size": 64, "mean": 0.10335150361061096, "std": 0.37495580315589905, "min": -0.832000732421875, "p10": -0.39203643798828125, "median": 0.1043548583984375, "p90": 0.5609703063964846, "max": 1.0227813720703125, "pos_frac": 0.65625, "sample": [-0.2978363037109375, -0.45905303955078125, 0.33713722229003906, -0.046478271484375, 0.09002876281738281, 0.11470794677734375, -0.035373687744140625, 0.2367706298828125, -0.3941650390625, -0.6880340576171875, 0.12323760986328125, 0.5871429443359375, 0.3289222717285156, 0.3607635498046875, 0.0002765655517578125, 0.148406982421875, 0.28707122802734375, -0.5547714233398438, 0.2802276611328125, 0.392913818359375, 0.4863739013671875, -0.30538177490234375, 0.32491111755371094, -0.011960983276367188, 0.2817039489746094, 0.6732330322265625, 0.1973419189453125, 0.08518028259277344, 0.7657623291015625, 0.2581787109375, 0.10485458374023438, 0.23236083984375, -0.3870697021484375, 0.10385513305664062, 0.49990081787109375, 0.65673828125, -0.344024658203125, 0.3392791748046875, -0.6187744140625, 0.078369140625, -0.832000732421875, -0.0204010009765625, -0.08507537841796875, 0.06711959838867188, 0.40895843505859375, 0.10703659057617188, -0.04400634765625, 0.07724952697753906, 0.099365234375, 0.8645858764648438, 0.14818572998046875, 0.09948539733886719, 0.7354507446289062, -0.128265380859375, 0.0018978118896484375, -0.23247528076171875, 1.0227813720703125, -0.1330718994140625, 0.13550758361816406, -0.22849464416503906, -0.5077781677246094, 0.42108154296875, 0.4988555908203125, -0.0942230224609375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000045.npy"}
{"epoch": 0.06802721088435375, "step": 46, "batch_size": 64, "mean": 0.1505853533744812, "std": 0.4320390224456787, "min": -0.7872314453125, "p10": -0.33516273498535154, "median": 0.17556476593017578, "p90": 0.698698425292969, "max": 1.2563552856445312, "pos_frac": 0.640625, "sample": [0.10344696044921875, -0.22164154052734375, -0.6058502197265625, 0.32118988037109375, 0.317718505859375, -0.1707916259765625, 0.032501220703125, -0.146331787109375, -0.16777801513671875, -0.7872314453125, 0.5048828125, 0.21378707885742188, 0.3483123779296875, -0.19844818115234375, 0.0673980712890625, -0.14136123657226562, -0.749725341796875, 0.3543586730957031, 0.2517719268798828, 0.39685630798339844, 0.38025665283203125, 0.5325164794921875, 0.18830299377441406, 0.52874755859375, 0.6278762817382812, 0.642547607421875, 0.8223724365234375, 1.087554931640625, -0.085540771484375, -0.27718353271484375, -0.4904327392578125, 0.32492828369140625, -0.247406005859375, 0.106170654296875, -0.5434055328369141, 0.7447052001953125, 0.112823486328125, 0.31812286376953125, -0.30470848083496094, 0.3868293762207031, 0.19140625, -0.15240478515625, 0.26868438720703125, -0.039272308349609375, 0.7802505493164062, 0.021148681640625, -0.31052398681640625, -0.02506256103515625, -0.27524566650390625, 0.21602630615234375, 0.5680465698242188, 0.07171058654785156, -0.09653282165527344, 0.5980491638183594, 0.1628265380859375, 1.2563552856445312, 0.222930908203125, 0.05689239501953125, -0.5383129119873047, 1.023193359375, 0.4589996337890625, 0.223114013671875, -0.3457221984863281, 0.7227630615234375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000046.npy"}
{"epoch": 0.06953892668178382, "step": 47, "batch_size": 64, "mean": 0.19316792488098145, "std": 0.33682122826576233, "min": -0.5944805145263672, "p10": -0.18251800537109372, "median": 0.184722900390625, "p90": 0.5927932739257813, "max": 0.9856033325195312, "pos_frac": 0.734375, "sample": [0.4832420349121094, 0.058502197265625, 0.2566413879394531, 0.35451507568359375, 0.9856033325195312, 0.3696632385253906, -0.07320213317871094, 0.39646148681640625, -0.1900482177734375, 0.03377532958984375, 0.07103919982910156, -0.2857170104980469, 0.2877349853515625, -0.5574111938476562, 0.196502685546875, 0.19779205322265625, 0.2032623291015625, -0.10248565673828125, 0.1253814697265625, -0.1260986328125, -0.02660369873046875, 0.545013427734375, -0.214599609375, 0.5227680206298828, -0.1366100311279297, 0.8730621337890625, 0.2505779266357422, 0.3154792785644531, 0.2584266662597656, 0.12755393981933594, -0.35019493103027344, 0.5271511077880859, 0.16845703125, 0.39764404296875, 0.12635040283203125, 0.16735458374023438, 0.6958084106445312, 0.2157745361328125, 0.2853965759277344, 0.5971603393554688, 0.151824951171875, 0.4185333251953125, 0.217315673828125, 0.1412811279296875, 0.6682167053222656, -0.5522994995117188, 0.122283935546875, 0.007293701171875, 0.52227783203125, 0.6290321350097656, 0.126373291015625, 0.172943115234375, -0.164947509765625, -0.16214752197265625, -0.5944805145263672, 0.8520545959472656, -0.06361198425292969, 0.5543937683105469, 0.5611839294433594, -0.153228759765625, 0.5826034545898438, 0.3665618896484375, 0.0703582763671875, -0.14219284057617188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000047.npy"}
{"epoch": 0.0710506424792139, "step": 48, "batch_size": 64, "mean": 0.042948633432388306, "std": 0.40119773149490356, "min": -0.78070068359375, "p10": -0.5044685363769531, "median": 0.05225372314453125, "p90": 0.4538795471191408, "max": 1.4864730834960938, "pos_frac": 0.546875, "sample": [0.13898277282714844, -0.119659423828125, 0.23616600036621094, 0.0631866455078125, 0.10625839233398438, 0.6280174255371094, -0.5567779541015625, -0.09649658203125, -0.36144447326660156, -0.0759735107421875, 0.19905853271484375, 0.06797218322753906, -0.06501007080078125, -0.026214599609375, 0.6645889282226562, 0.1882781982421875, 0.23322296142578125, 0.24048614501953125, -0.5116348266601562, 0.3854827880859375, -0.13225936889648438, 0.47376251220703125, -0.0434112548828125, 0.3035392761230469, 0.16414642333984375, 0.31290435791015625, 0.04132080078125, 0.5507164001464844, -0.6741485595703125, 0.35721588134765625, -0.15741729736328125, 0.4074859619140625, -0.09790802001953125, 0.32068634033203125, 0.3923797607421875, 0.25139617919921875, -0.6896171569824219, -0.0167999267578125, 0.8313999176025391, -0.0164337158203125, -0.3442974090576172, 0.134796142578125, 0.10126495361328125, -0.0462646484375, -0.5227584838867188, 0.10311126708984375, 0.29651641845703125, 0.03496551513671875, -0.40328216552734375, -0.3258495330810547, 0.3140411376953125, -0.2156829833984375, -0.4877471923828125, 0.03954505920410156, -0.44193458557128906, 0.30658721923828125, 1.4864730834960938, -0.10992050170898438, -0.7008209228515625, -0.06519317626953125, 0.21432113647460938, 0.6284713745117188, -0.38437652587890625, -0.78070068359375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000048.npy"}
{"epoch": 0.07256235827664399, "step": 49, "batch_size": 64, "mean": 0.09795981645584106, "std": 0.4102455973625183, "min": -1.3295135498046875, "p10": -0.3413032531738281, "median": 0.1643848419189453, "p90": 0.4555505752563478, "max": 1.324127197265625, "pos_frac": 0.6875, "sample": [0.1949920654296875, 0.1878509521484375, -0.0018157958984375, 0.268646240234375, 0.37184906005859375, -0.2072772979736328, 0.16534996032714844, 0.29034423828125, 0.18174362182617188, 0.3132171630859375, 0.2057209014892578, -0.15579605102539062, 0.35296630859375, 0.15543365478515625, -0.2109050750732422, -0.24443817138671875, -0.408538818359375, -0.2852325439453125, 0.18358230590820312, -0.968994140625, 0.3697795867919922, 0.08084869384765625, -0.07566070556640625, 0.4906463623046875, 0.16407394409179688, 0.07924652099609375, 0.2847747802734375, 0.16469573974609375, -0.020807266235351562, 0.02651214599609375, 0.4275646209716797, 0.416412353515625, -0.1138763427734375, 0.7147712707519531, 0.11774826049804688, 0.898590087890625, 0.33318328857421875, 0.316741943359375, 0.344390869140625, 0.41037750244140625, -0.1119384765625, -0.09563446044921875, 0.07770538330078125, 0.4675445556640625, -0.5144882202148438, 0.0889739990234375, 0.3888092041015625, 0.2671070098876953, 0.011117935180664062, 0.016139984130859375, 0.6158294677734375, 0.489013671875, -1.3295135498046875, -0.2583160400390625, -0.4877510070800781, -0.36533355712890625, -0.14618682861328125, 1.324127197265625, 0.31740570068359375, 0.04997825622558594, 0.34664154052734375, 0.09228515625, 0.1765899658203125, -0.969390869140625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000049.npy"}
{"epoch": 0.07407407407407407, "step": 50, "batch_size": 64, "mean": 0.0826747715473175, "std": 0.44819995760917664, "min": -1.1360702514648438, "p10": -0.3920423507690429, "median": 0.09418678283691406, "p90": 0.6266429901123048, "max": 1.0522537231445312, "pos_frac": 0.625, "sample": [0.33074188232421875, 0.8940277099609375, 0.041534423828125, 0.20679092407226562, 0.40570068359375, 0.47217559814453125, 0.201812744140625, -0.26938629150390625, 0.9098968505859375, 0.4761772155761719, 0.36240196228027344, 0.7146759033203125, -0.4049530029296875, -0.18632888793945312, 0.3196601867675781, -0.36191749572753906, 0.5441513061523438, 0.12178421020507812, 0.5837287902832031, -0.08647918701171875, 0.056732177734375, -0.102294921875, -0.1968536376953125, -0.22872161865234375, 0.4704627990722656, 0.7520751953125, -0.2923851013183594, 0.28736114501953125, -0.2515449523925781, 0.031890869140625, 0.23855018615722656, 0.10927009582519531, 0.4562110900878906, 0.3619518280029297, -0.8700027465820312, 0.24521636962890625, 0.383087158203125, 0.6674575805664062, 0.06665611267089844, -0.24609375, 1.0522537231445312, 0.6450347900390625, -0.0803680419921875, -1.1360702514648438, 0.021860122680664062, -0.0242767333984375, -0.2774810791015625, 0.09044647216796875, -1.1163864135742188, 0.11789703369140625, 0.35945701599121094, 0.02948760986328125, 0.09792709350585938, 0.16571044921875, 0.43247032165527344, 0.07378387451171875, -0.7487983703613281, -0.18582916259765625, -0.6707305908203125, -0.05334663391113281, 0.11702537536621094, -0.12241363525390625, -0.048046112060546875, -0.6636428833007812], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000050.npy"}
{"epoch": 0.07558578987150416, "step": 51, "batch_size": 64, "mean": 0.17682453989982605, "std": 0.5282368063926697, "min": -0.9228096008300781, "p10": -0.4975418090820312, "median": 0.1751871109008789, "p90": 0.9055732727050783, "max": 1.6107177734375, "pos_frac": 0.609375, "sample": [0.87823486328125, 0.3207073211669922, -0.04200172424316406, -0.182037353515625, -0.21567535400390625, 0.7130203247070312, 0.15471839904785156, 0.1593647003173828, -0.5803451538085938, -0.44091796875, -0.04828453063964844, 0.5704460144042969, -0.07512283325195312, 1.0044174194335938, -0.234405517578125, -0.0852203369140625, 0.5764083862304688, 0.2332916259765625, 0.49596405029296875, 0.563751220703125, 0.191009521484375, 0.2410602569580078, 0.2819194793701172, 1.27984619140625, 0.255767822265625, -0.639434814453125, -0.5072784423828125, -0.46267127990722656, 0.25279808044433594, 0.4483795166015625, 0.47106170654296875, 0.842987060546875, 0.015363693237304688, 0.3510284423828125, -0.8072357177734375, 0.07157516479492188, 0.2777976989746094, -0.3004493713378906, -0.7351608276367188, 0.26593017578125, -0.100738525390625, -0.109100341796875, 0.36397552490234375, 1.120941162109375, 0.5943565368652344, -0.015323638916015625, -0.9228096008300781, 1.6107177734375, -0.474822998046875, 0.9172897338867188, 0.5240802764892578, 0.978118896484375, 0.26917266845703125, 0.02245330810546875, -0.0084075927734375, 0.010515213012695312, -0.13040542602539062, 1.3436279296875, 0.2587471008300781, 0.00921630859375, -0.549407958984375, 0.329376220703125, -0.03452110290527344, -0.2508888244628906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000051.npy"}
{"epoch": 0.07709750566893424, "step": 52, "batch_size": 64, "mean": 0.19093959033489227, "std": 0.5613593459129333, "min": -0.8073959350585938, "p10": -0.43205184936523433, "median": 0.1544647216796875, "p90": 0.801788330078125, "max": 1.961456298828125, "pos_frac": 0.59375, "sample": [1.1806564331054688, -0.444793701171875, 0.3537750244140625, 0.6605987548828125, 0.7978363037109375, -0.24600982666015625, 0.23607635498046875, -0.269561767578125, -0.40232086181640625, 1.5479660034179688, 1.00384521484375, 0.4871368408203125, -0.40024566650390625, 0.8034820556640625, 0.6618423461914062, -0.26720428466796875, 0.9436798095703125, -0.2576904296875, 0.01300048828125, -0.40111732482910156, -0.031833648681640625, -0.15651321411132812, 0.5448074340820312, 0.3711261749267578, 0.1744384765625, 0.179901123046875, -0.1720123291015625, 0.1437053680419922, 0.1221771240234375, 0.17078781127929688, 0.05938911437988281, -0.731414794921875, 0.2373809814453125, -0.20012474060058594, 0.160675048828125, 0.7495651245117188, 0.03821563720703125, 0.2545032501220703, 0.7865447998046875, -0.48614501953125, -0.21346664428710938, -0.06652069091796875, 0.33385467529296875, 0.44379425048828125, -0.12467765808105469, 0.7522964477539062, -0.20165538787841797, -0.15317916870117188, 1.961456298828125, -0.8073959350585938, 0.65692138671875, -0.5226058959960938, -0.5777053833007812, 0.4931373596191406, 0.6453628540039062, -0.16115570068359375, 1.4291152954101562, -0.6478538513183594, 0.7611083984375, 0.17563247680664062, 0.14825439453125, 0.19055557250976562, -0.12012100219726562, -0.391143798828125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000052.npy"}
{"epoch": 0.07860922146636433, "step": 53, "batch_size": 64, "mean": 0.1907462179660797, "std": 0.44482186436653137, "min": -1.1405181884765625, "p10": -0.46289672851562497, "median": 0.18593406677246094, "p90": 0.7314891815185548, "max": 1.1375160217285156, "pos_frac": 0.71875, "sample": [0.19545745849609375, 0.7824058532714844, 0.8872566223144531, 0.187255859375, 0.18665122985839844, 0.1681499481201172, 0.25458526611328125, 0.0513153076171875, 0.5099411010742188, -0.5912132263183594, -0.279937744140625, -0.2211761474609375, -0.06169891357421875, 0.6648521423339844, -0.46551513671875, -0.06481170654296875, 0.05628013610839844, 1.0665054321289062, -0.17263412475585938, -0.5136184692382812, 0.184051513671875, 0.31830596923828125, 0.1077117919921875, 0.5555686950683594, -1.1405181884765625, 0.39990997314453125, -0.4908638000488281, 0.6275367736816406, 0.5533466339111328, 0.64630126953125, 0.23894500732421875, -0.6989593505859375, -0.2518501281738281, 0.20240020751953125, 0.34964752197265625, 0.5715179443359375, 0.45275306701660156, 1.1375160217285156, -0.16037940979003906, -0.012311935424804688, -0.4728279113769531, 0.0461883544921875, 0.4451465606689453, -0.456787109375, 0.48557281494140625, 0.597991943359375, 0.5174560546875, 0.765716552734375, 0.7396888732910156, 1.0108184814453125, 0.29955291748046875, 0.08346939086914062, 0.07341194152832031, 0.7123565673828125, 0.30507659912109375, -0.09422683715820312, 0.36688232421875, -0.20808792114257812, 0.1051177978515625, 0.04063224792480469, 0.18109130859375, 0.16728973388671875, 0.08032989501953125, 0.18521690368652344], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000053.npy"}
{"epoch": 0.0801209372637944, "step": 54, "batch_size": 64, "mean": 0.15445497632026672, "std": 0.530115008354187, "min": -1.1484031677246094, "p10": -0.40606918334960934, "median": 0.06073284149169922, "p90": 0.8643959045410157, "max": 1.7960357666015625, "pos_frac": 0.609375, "sample": [0.6376876831054688, 1.3857345581054688, 0.0013427734375, 0.5241260528564453, -0.09903717041015625, -0.6020965576171875, 1.0556983947753906, 0.036285400390625, 1.0128326416015625, -0.7503585815429688, 0.022443771362304688, -0.018352508544921875, 0.00281524658203125, -1.1484031677246094, 0.2654991149902344, 0.06181907653808594, 0.0596466064453125, 0.1826915740966797, -0.047809600830078125, -0.29738616943359375, -0.08617401123046875, 1.7960357666015625, -0.036651611328125, 0.442840576171875, -0.11663818359375, 0.8681678771972656, 0.2841911315917969, -0.139129638671875, 0.2344512939453125, 0.03430938720703125, 0.23079681396484375, -0.2900810241699219, -0.25817108154296875, -0.6659259796142578, 0.5160446166992188, 0.202667236328125, 0.2631378173828125, -0.10620880126953125, 0.9927940368652344, 0.8555946350097656, 0.21004676818847656, -0.42050933837890625, 0.3267707824707031, 0.46836090087890625, -0.11229324340820312, 0.648651123046875, 0.7617645263671875, 0.3839397430419922, -0.3550224304199219, 0.341522216796875, 0.0201416015625, -0.7510528564453125, 0.258758544921875, -0.193817138671875, 0.5932769775390625, -0.16490554809570312, 0.08679771423339844, -0.31057167053222656, 0.984405517578125, -0.28469085693359375, -0.37237548828125, 0.7312469482421875, 0.276641845703125, -0.5491981506347656], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000054.npy"}
{"epoch": 0.08163265306122448, "step": 55, "batch_size": 64, "mean": 0.30795878171920776, "std": 0.72398442029953, "min": -0.8996200561523438, "p10": -0.4741533279418945, "median": 0.1667194366455078, "p90": 1.2880229949951172, "max": 2.618743896484375, "pos_frac": 0.65625, "sample": [1.9339752197265625, 0.32570648193359375, 0.27458953857421875, 0.495269775390625, -0.048900604248046875, 0.2713623046875, 1.0844879150390625, 0.070281982421875, -0.5106163024902344, -0.18454742431640625, 0.14652633666992188, 0.923309326171875, -0.681610107421875, -0.37554168701171875, 0.25276947021484375, 1.2999267578125, 1.2801856994628906, 0.57061767578125, 0.0954742431640625, -0.16300582885742188, 0.24600601196289062, -0.79632568359375, 0.24309539794921875, 0.15114784240722656, 2.33551025390625, 0.2768726348876953, 0.08741188049316406, 1.2913818359375, -0.07190704345703125, -0.172210693359375, 0.18229103088378906, 0.36053466796875, 0.4114532470703125, -0.48628807067871094, -0.056396484375, -0.2191314697265625, 0.19676494598388672, 0.04741859436035156, 0.3035469055175781, -0.08964920043945312, -0.00447845458984375, 0.9580879211425781, -0.22163009643554688, 1.572418212890625, 0.34372711181640625, 0.29645729064941406, 1.8620452880859375, -0.684417724609375, -0.41245269775390625, -0.8996200561523438, 0.14064407348632812, 0.7941455841064453, -0.44583892822265625, 0.06897735595703125, -0.21751022338867188, 1.0202140808105469, 0.90850830078125, 0.15027236938476562, 0.5152187347412109, 2.618743896484375, -0.1377716064453125, 0.1085357666015625, 0.8661994934082031, -0.7929000854492188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000055.npy"}
{"epoch": 0.08314436885865457, "step": 56, "batch_size": 64, "mean": 0.18851301074028015, "std": 0.6484047770500183, "min": -2.184326171875, "p10": -0.41484260559082026, "median": 0.21430301666259766, "p90": 0.8196100234985352, "max": 2.0366363525390625, "pos_frac": 0.671875, "sample": [-0.29204559326171875, 0.3748207092285156, 0.002506256103515625, -0.2742290496826172, 0.30074310302734375, 0.05399322509765625, 1.144561767578125, -1.365509033203125, -0.10408401489257812, 0.21389198303222656, 0.39231109619140625, -0.74981689453125, 0.5951690673828125, 0.25724029541015625, 0.21471405029296875, 1.5783843994140625, -0.01442718505859375, -0.5180797576904297, 0.031280517578125, 0.9624366760253906, -0.3846015930175781, 2.0366363525390625, 0.5608673095703125, 0.1678466796875, 0.28997039794921875, 0.8265285491943359, -0.2569160461425781, 0.0534515380859375, 0.2905158996582031, 0.5095367431640625, -0.42780303955078125, 0.2176666259765625, 0.77508544921875, 0.7157630920410156, 0.24172210693359375, -0.10742950439453125, -2.184326171875, 0.5899543762207031, 0.06548309326171875, 0.01601409912109375, 0.5211391448974609, 0.803466796875, -1.0200042724609375, -0.53021240234375, 0.09967994689941406, -0.17669296264648438, 0.0187530517578125, 0.41565704345703125, -0.1336650848388672, 0.4726696014404297, 0.7161979675292969, 0.6347198486328125, 0.9772453308105469, 0.44602203369140625, 0.7857666015625, -0.054821014404296875, -0.34642791748046875, 0.024749755859375, 0.2635040283203125, 0.26134490966796875, -0.20807647705078125, 1.6061019897460938, -0.06633949279785156, -0.2457733154296875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000056.npy"}
{"epoch": 0.08465608465608465, "step": 57, "batch_size": 64, "mean": 0.30083510279655457, "std": 0.68483966588974, "min": -0.9184799194335938, "p10": -0.5801200866699219, "median": 0.23040390014648438, "p90": 1.1814729690551762, "max": 2.41265869140625, "pos_frac": 0.65625, "sample": [-0.6123428344726562, -0.49744224548339844, 0.30846595764160156, 2.41265869140625, -0.18601036071777344, 1.54400634765625, 0.493560791015625, -0.5593185424804688, 0.16048431396484375, 0.9235076904296875, 1.384857177734375, 1.0955257415771484, -0.9184799194335938, 0.10433769226074219, 0.6323013305664062, -0.630584716796875, 0.3086090087890625, 0.5647354125976562, 1.9533538818359375, -0.051029205322265625, 0.24558258056640625, -0.0966644287109375, 0.6723899841308594, 0.9356231689453125, -0.2031402587890625, -0.7351512908935547, 0.349273681640625, 0.15203857421875, -0.0557861328125, -0.5130996704101562, 0.8581314086914062, 0.5056629180908203, 0.2152252197265625, 0.01035308837890625, -0.37635040283203125, 0.1795806884765625, -0.49365234375, 0.4244041442871094, -0.9164276123046875, 1.2183074951171875, 0.20163726806640625, -0.24068069458007812, -0.5890350341796875, 0.7926807403564453, -0.3968658447265625, 0.494232177734375, 0.0970916748046875, 0.42273712158203125, 0.9785614013671875, 0.6964282989501953, 0.14642715454101562, 0.8779067993164062, -0.2020111083984375, 1.371551513671875, 0.15674781799316406, 0.9775543212890625, 0.381805419921875, 0.8010425567626953, -0.06404495239257812, 0.529052734375, 0.3810157775878906, -0.0299835205078125, 1.2865524291992188, -0.6244544982910156], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000057.npy"}
{"epoch": 0.08616780045351474, "step": 58, "batch_size": 64, "mean": 0.24241399765014648, "std": 1.0720629692077637, "min": -1.5333137512207031, "p10": -0.8658069610595703, "median": 0.118499755859375, "p90": 1.1293996810913087, "max": 4.73944091796875, "pos_frac": 0.59375, "sample": [-1.5333137512207031, 0.8449592590332031, 1.1097583770751953, 0.11411285400390625, 0.4708709716796875, -0.3456573486328125, 0.0440673828125, 0.4857025146484375, -0.10207366943359375, -0.5822677612304688, 0.9565162658691406, -0.299591064453125, 1.1378173828125, -1.21343994140625, -0.8765678405761719, 0.8649177551269531, 1.5782852172851562, -0.7158050537109375, -0.38275146484375, 0.954437255859375, 0.027553558349609375, -0.203582763671875, 0.7707901000976562, 0.6776351928710938, -0.6194610595703125, -0.42342376708984375, -1.1950836181640625, 0.0670013427734375, -0.7839794158935547, 0.954132080078125, 0.06256103515625, -1.0316028594970703, 0.21136093139648438, 0.7178726196289062, 4.73944091796875, -0.45389556884765625, -0.951873779296875, 3.900909423828125, 1.5313186645507812, 0.5218963623046875, -0.8406982421875, 0.12439537048339844, 0.4161415100097656, -0.486907958984375, 0.191802978515625, 2.74420166015625, -0.3440895080566406, -0.9284515380859375, -0.166290283203125, 0.9321022033691406, 0.2675018310546875, 1.3373908996582031, 0.22414398193359375, 0.028667449951171875, 0.48403167724609375, 0.7359466552734375, 0.2733135223388672, -0.2543792724609375, -0.7689437866210938, 0.5847320556640625, -0.6170654296875, -0.144500732421875, 0.12288665771484375, 0.5690174102783203], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000058.npy"}
{"epoch": 0.08767951625094482, "step": 59, "batch_size": 64, "mean": 0.30412447452545166, "std": 0.8560804128646851, "min": -1.2905654907226562, "p10": -0.6824054718017578, "median": 0.10054779052734375, "p90": 1.3561515808105469, "max": 2.97015380859375, "pos_frac": 0.578125, "sample": [-0.2083740234375, 0.05142974853515625, 0.2124176025390625, -0.75543212890625, -0.13100624084472656, -0.5061054229736328, 0.3456611633300781, -1.2905654907226562, 0.6563491821289062, 1.13775634765625, 2.3682861328125, -0.053936004638671875, 0.89520263671875, 1.1335105895996094, 0.6264286041259766, -0.31824302673339844, 1.7013702392578125, -0.7640209197998047, -0.3280029296875, 0.2907562255859375, 0.2149982452392578, -0.04498291015625, 0.217376708984375, 0.14966583251953125, -0.0906524658203125, 0.43907928466796875, 0.015117645263671875, 1.3473129272460938, 2.97015380859375, 1.1086578369140625, -0.05556488037109375, -0.12971878051757812, 0.7347469329833984, -0.003856658935546875, -0.47576904296875, 0.2077503204345703, 0.00077056884765625, 0.6149940490722656, -0.48970794677734375, -0.32572174072265625, 2.90008544921875, 0.7993850708007812, 1.6464080810546875, 0.49016380310058594, -0.6421775817871094, 0.48239707946777344, -0.69964599609375, 1.0182247161865234, -0.000904083251953125, 1.6532745361328125, -0.8005294799804688, 0.7623748779296875, -0.0700225830078125, -1.200042724609375, 0.6275253295898438, -0.1042938232421875, -0.7706298828125, -0.170074462890625, 0.19421768188476562, 0.03325653076171875, 0.021728515625, 1.3599395751953125, 0.5329360961914062, -0.06776237487792969], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000059.npy"}
{"epoch": 0.08919123204837491, "step": 60, "batch_size": 64, "mean": 0.1287935972213745, "std": 0.8152156472206116, "min": -2.2818756103515625, "p10": -0.8892501831054688, "median": 0.21413898468017578, "p90": 0.9179000854492189, "max": 2.0403289794921875, "pos_frac": 0.609375, "sample": [0.145782470703125, -0.06128692626953125, 0.6683578491210938, -0.477783203125, 0.2548828125, 1.5270614624023438, -1.5006256103515625, -0.6158447265625, 0.4955940246582031, -1.5659332275390625, 0.5773811340332031, 0.84344482421875, -0.19721603393554688, 0.5925807952880859, -0.7671279907226562, 0.7530364990234375, -0.3193244934082031, 0.8234691619873047, 1.2051334381103516, 0.23292922973632812, -0.9024200439453125, 0.3050804138183594, 0.4203033447265625, -0.8585205078125, 0.19534873962402344, 0.1535797119140625, 0.494354248046875, 0.25023651123046875, 1.8180084228515625, -0.9992790222167969, 0.45226287841796875, -0.1055145263671875, 0.9304885864257812, 0.7224960327148438, 0.0703277587890625, 0.45523834228515625, 0.6439018249511719, -0.2856025695800781, -0.4764518737792969, -1.1740856170654297, -0.6766891479492188, 0.7637405395507812, 0.8885269165039062, -1.4182968139648438, -0.029682159423828125, 1.6657257080078125, -0.17778396606445312, -0.3277740478515625, 0.8441123962402344, -0.0898895263671875, 1.10064697265625, -0.15960311889648438, 0.6583919525146484, 0.24352264404296875, 2.0403289794921875, -0.2718639373779297, 0.3648223876953125, 0.5503807067871094, 0.02923583984375, 0.05092811584472656, -2.2818756103515625, -0.7681312561035156, 0.4405498504638672, 0.07920265197753906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000060.npy"}
{"epoch": 0.09070294784580499, "step": 61, "batch_size": 64, "mean": 0.0772022008895874, "std": 1.0485469102859497, "min": -2.4391937255859375, "p10": -1.4656799316406248, "median": 0.0685739517211914, "p90": 1.4009994506835943, "max": 2.3125152587890625, "pos_frac": 0.53125, "sample": [0.7999725341796875, 0.4283447265625, 1.8486328125, 2.3125152587890625, 0.08978271484375, -0.056003570556640625, 0.344451904296875, 1.8765220642089844, 1.0111541748046875, -0.16725540161132812, 0.22193145751953125, -0.11875534057617188, -0.48265838623046875, 0.1864776611328125, 0.3465728759765625, -0.5330772399902344, 0.5578079223632812, -0.3932228088378906, -0.0489501953125, 0.31293296813964844, 0.013492584228515625, 0.6797008514404297, 0.614593505859375, -0.12462234497070312, -0.13538742065429688, -0.2624855041503906, -1.2540283203125, -2.13262939453125, 0.29001617431640625, -0.17450332641601562, 0.6123256683349609, -0.5516185760498047, -1.28363037109375, 1.5377197265625, -0.6839752197265625, 0.64013671875, -0.03170013427734375, -0.5311393737792969, -0.42957305908203125, 1.677001953125, 2.2612457275390625, -2.0086746215820312, -0.43349456787109375, -1.7372779846191406, 1.2207489013671875, 1.1499710083007812, 0.3770751953125, 0.1928081512451172, -0.678070068359375, -0.20376014709472656, 1.2491455078125, -1.543701171875, 0.04736518859863281, -0.1606903076171875, 0.7116546630859375, -2.4391937255859375, 1.4660797119140625, 0.6559276580810547, 0.14002227783203125, -0.10076904296875, 0.9969100952148438, 0.8722457885742188, -2.247608184814453, -1.8538894653320312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000061.npy"}
{"epoch": 0.09221466364323508, "step": 62, "batch_size": 64, "mean": 0.2140759527683258, "std": 0.9205039143562317, "min": -2.869049072265625, "p10": -0.8065845489501953, "median": 0.2886810302734375, "p90": 1.1738616943359375, "max": 2.405548095703125, "pos_frac": 0.625, "sample": [0.7717361450195312, -1.6445732116699219, -0.7937164306640625, 0.3149528503417969, 0.767181396484375, 0.13449859619140625, 1.1222953796386719, -0.2786979675292969, -1.3964157104492188, -0.3141021728515625, 1.8126869201660156, -0.2939605712890625, 1.1743011474609375, -1.4063262939453125, 1.1728363037109375, 0.37334442138671875, 0.37953948974609375, 0.7489471435546875, -0.2625141143798828, 0.5294647216796875, -0.6311721801757812, -0.7003631591796875, -1.056182861328125, 0.1989917755126953, 0.2624092102050781, -0.042236328125, -0.4060173034667969, 0.7423934936523438, 1.1066207885742188, 0.6065216064453125, -0.3011932373046875, -0.15266799926757812, -0.03377532958984375, 0.0209197998046875, 1.3073539733886719, -2.869049072265625, -0.170745849609375, 1.7719001770019531, -0.17232322692871094, 0.46539306640625, 0.4788246154785156, 0.6077346801757812, 0.500335693359375, 0.21825408935546875, 1.081817626953125, 0.36042022705078125, 0.4974822998046875, -0.8120994567871094, 0.33282470703125, 1.58721923828125, -0.1143951416015625, 0.7779865264892578, 0.7406673431396484, 0.07903671264648438, -0.5261764526367188, 0.1742706298828125, 2.1912841796875, 2.405548095703125, -0.08898735046386719, 0.47045135498046875, 0.3274993896484375, 0.21652984619140625, -1.6625823974609375, 0.9986591339111328], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000062.npy"}
{"epoch": 0.09372637944066516, "step": 63, "batch_size": 64, "mean": 0.5596264600753784, "std": 0.8312181234359741, "min": -1.9854278564453125, "p10": -0.4931089401245116, "median": 0.4959297180175781, "p90": 1.6520864486694338, "max": 2.699859619140625, "pos_frac": 0.828125, "sample": [0.7263641357421875, 0.048004150390625, -0.6542205810546875, 0.0459442138671875, 2.59100341796875, -0.3001728057861328, -0.4174823760986328, 2.699859619140625, 0.9192733764648438, 0.5065231323242188, -1.9854278564453125, -0.5255203247070312, 0.4312114715576172, -1.1709747314453125, 1.1935501098632812, 0.41778564453125, 1.5556640625, 0.12969207763671875, 0.4853363037109375, 0.8602752685546875, -0.6579170227050781, 0.32584571838378906, 1.09173583984375, 2.003692626953125, 0.8579483032226562, 0.8408603668212891, 0.6896209716796875, -0.037487030029296875, 0.46390533447265625, 1.0012359619140625, 0.5855178833007812, 0.45745849609375, 1.1783905029296875, 0.9716720581054688, 0.10424041748046875, 0.03043365478515625, 0.5457954406738281, -0.6362991333007812, 0.5821704864501953, 0.4468364715576172, 0.47925376892089844, 2.3389892578125, 0.25453662872314453, 1.6170215606689453, 0.7211055755615234, 1.169637680053711, 0.74420166015625, 1.6671142578125, 0.4600067138671875, 0.06821441650390625, 0.1460132598876953, 0.7229499816894531, 0.14407730102539062, 0.03182220458984375, 0.9401779174804688, 0.8093719482421875, 0.8426132202148438, 0.4372062683105469, -0.04004669189453125, 0.08044624328613281, 0.7324790954589844, 1.9090423583984375, 1.6792449951171875, -0.5417327880859375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000063.npy"}
{"epoch": 0.09523809523809523, "step": 64, "batch_size": 64, "mean": 0.09469178318977356, "std": 0.8834921717643738, "min": -3.5447845458984375, "p10": -0.8414871215820312, "median": 0.12833118438720703, "p90": 1.0161796569824222, "max": 2.4412078857421875, "pos_frac": 0.59375, "sample": [0.8443984985351562, 2.4412078857421875, -0.844329833984375, -1.2381973266601562, 0.18927383422851562, 0.19808197021484375, 0.8645820617675781, 0.3729133605957031, 0.3423290252685547, 1.5868301391601562, 0.6125602722167969, 0.3324165344238281, 1.1740550994873047, -0.43343353271484375, 0.2254486083984375, 0.006175994873046875, -0.2554473876953125, 0.9364395141601562, 0.2524070739746094, 0.3749351501464844, 1.834503173828125, 0.07303619384765625, -0.33295440673828125, -0.13796234130859375, -0.8443527221679688, 0.03023529052734375, 0.3006858825683594, -3.5447845458984375, -0.3835411071777344, 0.3383617401123047, 0.3101329803466797, -0.23900604248046875, 0.11075592041015625, -1.0039825439453125, 0.003387451171875, 0.07117462158203125, 0.8987197875976562, -2.3362884521484375, -0.9072475433349609, 0.5852127075195312, 0.75079345703125, -0.08713531494140625, 1.05035400390625, -0.1623516082763672, 0.49419593811035156, -0.348358154296875, -0.4442939758300781, 0.40460205078125, -0.6498069763183594, 1.41015625, -0.3032569885253906, -0.1814136505126953, 1.2409591674804688, -0.0359649658203125, 0.1459064483642578, 0.3454723358154297, 0.6492061614990234, -0.20067977905273438, -0.8348541259765625, -0.30684661865234375, 0.161712646484375, -0.548095703125, -0.17489242553710938, 0.8761329650878906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000064.npy"}
{"epoch": 0.09674981103552532, "step": 65, "batch_size": 64, "mean": 0.48643139004707336, "std": 1.1278084516525269, "min": -2.736431121826172, "p10": -0.9136001586914062, "median": 0.45426177978515625, "p90": 1.9133581161499027, "max": 4.353271484375, "pos_frac": 0.703125, "sample": [0.35006141662597656, 0.7436447143554688, 1.1805191040039062, 0.2726287841796875, 0.057697296142578125, 0.7886810302734375, -0.8863525390625, 0.361328125, -0.6942329406738281, 1.9908218383789062, -0.0381317138671875, 0.3915214538574219, 1.0597400665283203, 0.5506515502929688, 1.1571006774902344, 0.6912288665771484, 0.5883731842041016, 0.2779407501220703, -0.9890289306640625, 2.819671630859375, -0.0871429443359375, -0.05176544189453125, -1.05218505859375, 2.013671875, 1.6574058532714844, -0.059185028076171875, -2.736431121826172, 0.3347015380859375, -0.9252777099609375, 0.413238525390625, 0.005687713623046875, 0.4952850341796875, -0.829345703125, -1.2360610961914062, 0.6532363891601562, 2.6225814819335938, 0.5031585693359375, 2.0977630615234375, 1.82733154296875, 1.0864810943603516, 1.9393482208251953, 0.19384002685546875, 1.3904190063476562, 0.9606361389160156, -0.4647407531738281, 0.16353607177734375, 0.7037792205810547, 0.1249237060546875, -1.3466472625732422, -0.49092864990234375, 0.5689964294433594, 1.0005226135253906, -0.1125640869140625, 0.9099159240722656, -0.2039642333984375, 1.0353317260742188, 1.8527145385742188, -0.2085857391357422, 1.256011962890625, 0.962127685546875, 0.5045318603515625, -1.4954833984375, 4.353271484375, 0.12760353088378906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000065.npy"}
{"epoch": 0.0982615268329554, "step": 66, "batch_size": 64, "mean": 0.31815215945243835, "std": 1.1856145858764648, "min": -2.796173095703125, "p10": -0.9597282409667968, "median": 0.24280548095703125, "p90": 2.0407814025878914, "max": 3.770965576171875, "pos_frac": 0.640625, "sample": [0.12849044799804688, 3.770965576171875, 0.7813339233398438, 0.7203426361083984, 0.45453453063964844, 0.6442642211914062, 0.34595680236816406, 0.03380584716796875, -0.3082466125488281, 0.16624832153320312, -0.5492095947265625, 1.1197357177734375, 2.420562744140625, 0.046295166015625, -0.99285888671875, -1.5501022338867188, 0.10445404052734375, 0.4670600891113281, 2.129241943359375, 1.8343734741210938, 0.3401641845703125, 1.766448974609375, -1.564605712890625, -0.02501678466796875, 0.24925994873046875, 1.054300308227539, 0.4831409454345703, 1.0456657409667969, 0.573822021484375, 0.23635101318359375, 0.7558822631835938, 0.7233219146728516, -0.47243499755859375, -0.006984710693359375, 0.11074638366699219, -0.19194793701171875, 0.33687400817871094, 0.47718048095703125, -0.8799972534179688, 0.18396377563476562, -2.796173095703125, -2.2709579467773438, -0.21198654174804688, -0.13849639892578125, -0.7615814208984375, -1.0354385375976562, 2.1359519958496094, -0.32659149169921875, 0.5507965087890625, 0.9954090118408203, -0.09637069702148438, -0.14008331298828125, 0.09560012817382812, 3.663299560546875, 2.2153778076171875, -1.5303726196289062, 0.747711181640625, -0.8824234008789062, -0.5913238525390625, 2.4932861328125, 0.5314788818359375, -0.28046417236328125, 0.6397705078125, 0.391937255859375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000066.npy"}
{"epoch": 0.09977324263038549, "step": 67, "batch_size": 64, "mean": 0.33653974533081055, "std": 0.9262024164199829, "min": -1.7634124755859375, "p10": -0.844869041442871, "median": 0.38215065002441406, "p90": 1.4511219024658204, "max": 2.7962875366210938, "pos_frac": 0.671875, "sample": [-0.103912353515625, -0.0816192626953125, -0.3897590637207031, -0.4372367858886719, -1.7113113403320312, 0.4201831817626953, 1.708669662475586, 0.9763717651367188, 0.00786590576171875, -0.0927581787109375, 1.4363365173339844, 0.15801239013671875, -0.8813323974609375, -1.2700881958007812, 0.8971405029296875, 0.8096275329589844, -0.27136802673339844, 0.5443534851074219, -0.32207489013671875, 0.8558349609375, -0.770660400390625, 1.45745849609375, -1.7634124755859375, 2.7962875366210938, 0.23598670959472656, 0.6702880859375, 0.592193603515625, 0.6195144653320312, 0.9672393798828125, -0.8766727447509766, -0.080108642578125, 0.5367813110351562, 0.5902786254882812, 1.8889236450195312, 0.36772918701171875, 0.12393951416015625, 0.764678955078125, 0.3965721130371094, -0.5876426696777344, 0.7460861206054688, 2.262298583984375, -0.2565460205078125, 1.2659683227539062, 1.2394332885742188, 0.6858444213867188, 0.1414337158203125, -0.2770843505859375, 0.14815139770507812, 0.517333984375, -0.5126571655273438, 0.4976844787597656, 2.475555419921875, 0.4924659729003906, 0.03662300109863281, -0.27754974365234375, 0.0308074951171875, 0.89544677734375, 0.106475830078125, -1.0396194458007812, -1.5321578979492188, 1.8260765075683594, 0.31945037841796875, 0.8846435546875, 0.6800689697265625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000067.npy"}
{"epoch": 0.10128495842781557, "step": 68, "batch_size": 64, "mean": 0.3413764536380768, "std": 1.3143774271011353, "min": -4.231048583984375, "p10": -1.1051284790039062, "median": 0.4183998107910156, "p90": 1.6522811889648439, "max": 3.897064208984375, "pos_frac": 0.671875, "sample": [3.897064208984375, 1.6032257080078125, 0.21793556213378906, 2.55157470703125, -0.33512115478515625, 2.2695388793945312, 0.42362213134765625, -0.47049713134765625, 0.6789321899414062, -0.0463409423828125, 1.7779922485351562, -4.231048583984375, 1.3681716918945312, 0.81793212890625, 1.1265411376953125, 0.9211349487304688, -0.7252960205078125, -0.12157821655273438, 0.0679168701171875, 0.7359848022460938, 1.6573944091796875, 0.11629104614257812, 0.6845855712890625, 1.0118331909179688, 0.413177490234375, 0.7684383392333984, 0.4928741455078125, -0.6331787109375, 2.0167999267578125, -0.651702880859375, -0.29842567443847656, 0.6269302368164062, 0.35150909423828125, 0.040569305419921875, -1.105499267578125, 0.07035064697265625, 3.65191650390625, 0.2711639404296875, 0.083465576171875, 0.13744354248046875, -1.848663330078125, -1.3061943054199219, 0.618743896484375, 0.42777252197265625, -1.7009201049804688, -2.8144683837890625, 0.4456024169921875, 0.385772705078125, -0.8686866760253906, -0.11676788330078125, 1.640350341796875, 0.5377044677734375, -1.1042633056640625, 1.5054855346679688, 0.868408203125, 0.6352005004882812, -0.21903228759765625, 1.5900650024414062, 1.6220703125, 0.7079505920410156, 1.1000289916992188, -0.998992919921875, -0.011890411376953125, -1.480804443359375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000068.npy"}
{"epoch": 0.10279667422524566, "step": 69, "batch_size": 64, "mean": 0.5912973880767822, "std": 1.252090573310852, "min": -2.1119461059570312, "p10": -0.7729665756225584, "median": 0.4678153991699219, "p90": 2.179357147216798, "max": 5.265655517578125, "pos_frac": 0.703125, "sample": [0.15087127685546875, -0.234405517578125, 2.8608169555664062, -0.22393798828125, 0.40406036376953125, -0.20756912231445312, -0.27029991149902344, -0.31181907653808594, 1.3638916015625, 0.294586181640625, 0.8751068115234375, 0.5315704345703125, 0.28045654296875, 1.77239990234375, 1.2664756774902344, 2.2777557373046875, 0.3516979217529297, 1.6950225830078125, 2.723052978515625, -1.6963958740234375, 2.453460693359375, -2.1119461059570312, 1.54974365234375, -0.0313873291015625, 2.298095703125, -0.08653640747070312, 0.659881591796875, 0.9890899658203125, 0.850738525390625, 0.568878173828125, 0.7678928375244141, -1.820037841796875, 0.34496307373046875, 0.31801605224609375, 0.7512321472167969, -0.6356964111328125, 1.3484039306640625, 0.6386756896972656, 0.0407562255859375, 0.8729019165039062, -0.1840057373046875, 0.5354423522949219, 0.8285675048828125, 0.30234527587890625, 1.7667236328125, -0.30876922607421875, 1.869903564453125, 0.9491310119628906, -0.38455772399902344, -1.3801803588867188, 0.19014739990234375, 1.125396728515625, -2.0007972717285156, -0.8317966461181641, 1.0827484130859375, -0.029296875, 1.5888214111328125, 0.24017333984375, 1.9497604370117188, -1.1400184631347656, 5.265655517578125, 0.2325611114501953, 2.3647918701171875, 0.13982009887695312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000069.npy"}
{"epoch": 0.10430839002267574, "step": 70, "batch_size": 64, "mean": 0.35000741481781006, "std": 1.1895856857299805, "min": -2.136688232421875, "p10": -1.1632139205932617, "median": 0.4033184051513672, "p90": 2.0593044281005874, "max": 3.136749267578125, "pos_frac": 0.640625, "sample": [0.26818084716796875, -1.1449317932128906, 0.992645263671875, -1.80975341796875, 0.417816162109375, -0.8603515625, -0.4645957946777344, 1.0998001098632812, 1.24432373046875, -0.3939399719238281, 1.051605224609375, 1.5589790344238281, -2.136688232421875, 1.1755714416503906, 0.16040420532226562, -0.7295379638671875, 2.2106285095214844, -1.6666946411132812, -0.603485107421875, -0.3733692169189453, 3.136749267578125, 0.6065311431884766, 1.236846923828125, 0.7139053344726562, -1.9335861206054688, 0.7582359313964844, 0.5854454040527344, 2.5378265380859375, -0.9408798217773438, 3.132049560546875, 0.04840087890625, 0.3098182678222656, -0.5972671508789062, -0.3213043212890625, 0.3657684326171875, 2.2173538208007812, 1.7062149047851562, 0.49603271484375, 0.7088642120361328, 1.4209880828857422, 0.5681037902832031, -1.1710491180419922, -0.2882118225097656, 0.42725372314453125, 1.269439697265625, 0.6714286804199219, -0.014270782470703125, 2.46630859375, -0.09496688842773438, 0.8786239624023438, 0.31089019775390625, 2.312225341796875, 0.3946075439453125, -1.5207977294921875, 0.4120292663574219, 0.3276329040527344, 0.15674591064453125, -1.7687721252441406, 1.0287704467773438, 0.6612663269042969, -0.7435150146484375, 0.7376670837402344, -0.5436038970947266, -0.261932373046875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000070.npy"}
{"epoch": 0.10582010582010581, "step": 71, "batch_size": 64, "mean": 0.23569674789905548, "std": 1.403541922569275, "min": -3.992401123046875, "p10": -1.6529037475585937, "median": 0.14943313598632812, "p90": 1.7821929931640625, "max": 2.9306182861328125, "pos_frac": 0.546875, "sample": [-0.05103874206542969, 1.4690742492675781, 1.0824089050292969, 0.4259052276611328, -0.09032440185546875, -0.6652488708496094, 0.8483448028564453, -1.7141036987304688, 0.024688720703125, 2.36273193359375, 0.33585548400878906, -1.7085609436035156, -0.0786590576171875, 1.014801025390625, -2.55194091796875, 0.049896240234375, -0.18256378173828125, 2.79339599609375, 0.9085884094238281, 1.6576385498046875, -3.31610107421875, 2.9306182861328125, 0.197174072265625, -0.6173439025878906, 1.2567138671875, 2.45501708984375, 0.39136505126953125, -0.16077232360839844, 0.6582679748535156, -0.1377716064453125, -3.992401123046875, -0.348297119140625, -1.5230369567871094, -0.6040229797363281, 0.33366966247558594, -0.021106719970703125, 0.35924339294433594, 1.0620269775390625, 1.6256370544433594, -1.2519989013671875, -0.29570770263671875, -2.2222137451171875, -0.409576416015625, 1.7716827392578125, -0.452117919921875, -0.2175130844116211, -1.8496551513671875, 1.7866973876953125, -0.12445068359375, -0.053256988525390625, 1.7030277252197266, 2.230487823486328, -1.1244735717773438, 1.72430419921875, 1.400299072265625, 0.8814735412597656, 0.9804306030273438, 0.10169219970703125, 0.438812255859375, 0.45960426330566406, 1.2073287963867188, -0.08035469055175781, 2.8655319213867188, -0.8652305603027344], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000071.npy"}
{"epoch": 0.1073318216175359, "step": 72, "batch_size": 64, "mean": 0.6959859132766724, "std": 1.313461184501648, "min": -3.3927459716796875, "p10": -0.5754743576049804, "median": 0.7092323303222656, "p90": 1.9448516845703125, "max": 4.915191650390625, "pos_frac": 0.703125, "sample": [2.2058792114257812, 1.3959808349609375, 0.6784896850585938, 1.6482086181640625, -3.3927459716796875, 0.2843284606933594, 0.3380870819091797, 4.915191650390625, -1.47357177734375, -0.5978717803955078, 1.4337844848632812, 1.37939453125, 0.107177734375, 1.788726806640625, -0.06850051879882812, -0.44611358642578125, -0.3464641571044922, 1.0326309204101562, 1.9128952026367188, -0.42437744140625, 3.753021240234375, 0.9312915802001953, 1.609323501586914, 0.33725547790527344, 1.164520263671875, -0.0859222412109375, 0.38791465759277344, -0.5801029205322266, -1.24896240234375, -0.3940582275390625, 2.941162109375, 1.2577438354492188, 0.7948226928710938, 1.9488983154296875, 0.8284378051757812, 0.4963035583496094, 0.6086826324462891, 1.0462150573730469, 0.3657493591308594, -0.11069107055664062, 1.8001785278320312, 1.1393318176269531, 1.8628997802734375, 0.018596649169921875, 0.7336959838867188, -0.40424346923828125, 1.9354095458984375, 1.1421051025390625, -0.5542182922363281, 1.5171318054199219, -0.9959716796875, 0.890106201171875, 3.1677780151367188, -0.3783912658691406, 2.5163192749023438, -0.1851062774658203, 1.4788322448730469, 0.7485084533691406, -1.7659454345703125, 0.09718513488769531, -0.5646743774414062, 0.8119964599609375, 0.4240684509277344, 0.6847686767578125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000072.npy"}
{"epoch": 0.10884353741496598, "step": 73, "batch_size": 64, "mean": 0.19277739524841309, "std": 1.6517032384872437, "min": -2.9333648681640625, "p10": -1.5608095169067382, "median": 0.04687690734863281, "p90": 1.5999975204467776, "max": 7.9058837890625, "pos_frac": 0.515625, "sample": [1.4242324829101562, 0.7146034240722656, 0.29538726806640625, -0.9588546752929688, -1.3234519958496094, -0.97064208984375, -1.718475341796875, -1.3772506713867188, -2.9333648681640625, 1.3416423797607422, -1.751678466796875, 0.5358505249023438, 0.16991424560546875, -0.9575653076171875, -1.181070327758789, 0.6391334533691406, -1.8443374633789062, 0.08200454711914062, -0.549224853515625, 0.89788818359375, -0.221160888671875, -0.42913818359375, -0.8071365356445312, 0.35953521728515625, 7.9058837890625, 2.4156341552734375, -0.56927490234375, -0.7728252410888672, -0.08530235290527344, 1.0169181823730469, 0.4275169372558594, -0.09286117553710938, 0.6037731170654297, 0.011749267578125, -2.326568603515625, 0.3135986328125, 1.5656909942626953, 2.215667724609375, -1.421926498413086, -0.9920654296875, -0.73406982421875, 2.6332550048828125, -0.0042266845703125, -1.9633026123046875, -0.05210113525390625, 1.3723678588867188, 0.884429931640625, -0.7213230133056641, -0.009822845458984375, -0.47891998291015625, -1.620330810546875, 0.7007331848144531, 0.9875431060791016, 0.6765632629394531, 1.6147003173828125, 0.9625244140625, 0.9691276550292969, 0.353729248046875, -0.7327079772949219, 0.2247467041015625, 0.5557003021240234, 2.5761871337890625, 5.038665771484375, -0.5481643676757812], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000073.npy"}
|
||||
{"epoch": 0.11035525321239607, "step": 74, "batch_size": 64, "mean": 0.7022018432617188, "std": 1.3231043815612793, "min": -2.1056976318359375, "p10": -0.8304599761962891, "median": 0.7823047637939453, "p90": 2.073669433593751, "max": 5.63775634765625, "pos_frac": 0.6875, "sample": [-0.545379638671875, 1.8050384521484375, -0.487457275390625, 0.8439712524414062, 1.5842361450195312, -0.5882797241210938, 1.6169281005859375, 5.63775634765625, 1.5846672058105469, 2.1887969970703125, 1.3906230926513672, 0.48970794677734375, 2.2484893798828125, 0.8184585571289062, -0.367279052734375, 0.7962532043457031, 1.406829833984375, 0.7683563232421875, -0.08259963989257812, -2.1056976318359375, 0.3552818298339844, 2.20928955078125, 0.6763763427734375, 0.7590866088867188, 0.6872100830078125, -0.05301094055175781, -0.4635944366455078, 1.3934707641601562, 2.3375511169433594, -1.4862442016601562, 0.30010414123535156, 0.8967342376708984, 1.4377193450927734, -1.0981483459472656, -0.9902191162109375, 1.1661605834960938, -0.8397789001464844, 1.2460708618164062, -1.7365341186523438, -0.4568309783935547, 0.6408271789550781, -0.8087158203125, 1.6017417907714844, 0.8258857727050781, 1.30047607421875, 3.0838775634765625, 1.7190170288085938, -0.7461433410644531, 1.385162353515625, 1.5443801879882812, -2.0514373779296875, -0.2763938903808594, 0.7375373840332031, -0.14754486083984375, -0.7885818481445312, 0.34406471252441406, 1.0147476196289062, 3.371307373046875, 1.7709732055664062, 1.4025115966796875, 1.540740966796875, 1.0827903747558594, 0.47946739196777344, 0.5701141357421875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000074.npy"}
|
||||
{"epoch": 0.11186696900982615, "step": 75, "batch_size": 64, "mean": 0.7536483407020569, "std": 1.343436360359192, "min": -2.711414337158203, "p10": -0.701566696166992, "median": 0.5830326080322266, "p90": 2.6676536560058595, "max": 4.5780181884765625, "pos_frac": 0.734375, "sample": [-0.5351676940917969, -0.3362579345703125, -0.007305145263671875, 1.0143013000488281, 2.6858749389648438, -0.048572540283203125, 0.32610321044921875, 1.319366455078125, 1.4560165405273438, 3.1252899169921875, -0.28437042236328125, 0.25241851806640625, -1.5121002197265625, 0.5872268676757812, -0.016757965087890625, 1.1166267395019531, 2.6251373291015625, -1.3322219848632812, 0.5537347793579102, 0.5913352966308594, 3.308624267578125, 1.2311897277832031, 1.7103042602539062, 2.1940841674804688, 1.0120296478271484, 0.4586677551269531, 1.1653823852539062, -1.409027099609375, 1.8583755493164062, 0.07252120971679688, 0.7479267120361328, 2.8056640625, 0.1099090576171875, 1.7998886108398438, 1.0617294311523438, 0.5640125274658203, 4.5780181884765625, 0.69921875, 0.8888931274414062, 0.7184104919433594, -0.12322235107421875, 1.1522388458251953, -2.711414337158203, 0.5617561340332031, 1.3647537231445312, -0.3756523132324219, 0.15568161010742188, 0.299957275390625, 0.3354511260986328, 0.26372718811035156, 0.5788383483886719, 2.5894546508789062, -1.15582275390625, -0.24763870239257812, 0.887054443359375, -0.7728805541992188, 3.0615234375, -0.1461048126220703, -1.2393798828125, 0.17201614379882812, 0.6080741882324219, 0.05951881408691406, 4.0973358154296875, 1.6617279052734375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000075.npy"}
|
||||
{"epoch": 0.11337868480725624, "step": 76, "batch_size": 64, "mean": 0.32796770334243774, "std": 1.5456434488296509, "min": -4.499176025390625, "p10": -1.3040246963500977, "median": 0.30381202697753906, "p90": 1.9909675598144532, "max": 4.268890380859375, "pos_frac": 0.578125, "sample": [-4.499176025390625, 1.6188545227050781, -2.524200439453125, 0.1782073974609375, 0.46430206298828125, -0.8566703796386719, -0.4074287414550781, 0.75286865234375, 0.4307117462158203, 3.4814529418945312, 2.2426986694335938, 4.268890380859375, -0.5493373870849609, -3.5465545654296875, 1.796539306640625, -1.3155136108398438, 0.4972381591796875, 1.1228866577148438, 0.7430877685546875, 0.3255500793457031, 0.3518218994140625, 2.6282272338867188, -1.3137664794921875, 3.3444290161132812, -0.8715057373046875, 1.774749755859375, 0.6338691711425781, 1.6035842895507812, -0.39690589904785156, -0.3334178924560547, 0.11688423156738281, -0.7531242370605469, -1.7259521484375, -0.791748046875, -0.5122089385986328, 0.12237548828125, 2.0038833618164062, 1.169952392578125, -0.5021743774414062, 1.1641921997070312, -0.016124725341796875, 0.282073974609375, -0.10542678833007812, 0.4660224914550781, 1.8657302856445312, 0.43551063537597656, 0.3819427490234375, -1.2812938690185547, -0.2994194030761719, -0.35752105712890625, 0.1094818115234375, 0.7288360595703125, -0.30477142333984375, 1.9608306884765625, 2.8614044189453125, 0.8314037322998047, -0.41338157653808594, -1.1929588317871094, -2.027618408203125, 1.9115142822265625, 1.5773048400878906, -0.1827983856201172, 1.8688812255859375, -0.04726219177246094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000076.npy"}
|
||||
{"epoch": 0.11489040060468632, "step": 77, "batch_size": 64, "mean": 0.748406171798706, "std": 1.464658260345459, "min": -3.3835067749023438, "p10": -1.076511764526367, "median": 0.4405698776245117, "p90": 2.221298980712891, "max": 4.669586181640625, "pos_frac": 0.734375, "sample": [-0.284027099609375, 1.7501945495605469, -1.5344467163085938, -1.3097763061523438, 1.7826728820800781, 2.2282180786132812, -1.1431999206542969, 0.11828804016113281, -3.3835067749023438, 0.1126708984375, 1.7457962036132812, 1.6220855712890625, 0.03407478332519531, -1.3685226440429688, 0.14331436157226562, 2.8716049194335938, 0.2322845458984375, 1.4615936279296875, 0.6615982055664062, 4.27593994140625, -0.15329360961914062, 0.09031105041503906, 2.2051544189453125, 1.957611083984375, 0.5206184387207031, 0.4373931884765625, 0.17737960815429688, 1.7372055053710938, 0.3312034606933594, 0.7137832641601562, -0.46935081481933594, -0.9209060668945312, 0.08333969116210938, 0.323333740234375, 1.28662109375, 1.961334228515625, 0.9384231567382812, 1.5027236938476562, 1.2179031372070312, 2.0423622131347656, 1.8881912231445312, 0.26917266845703125, 0.8939208984375, -0.16663551330566406, -0.00119781494140625, 4.669586181640625, 0.164581298828125, -0.119720458984375, 2.8571243286132812, 1.4478874206542969, 3.8837432861328125, -0.5201187133789062, 0.3490581512451172, 1.861001968383789, 1.9940643310546875, 0.3558616638183594, -1.9629783630371094, -1.2378387451171875, 2.1779861450195312, -0.20812225341796875, 0.8969879150390625, 2.8819122314453125, 0.44374656677246094, -0.92022705078125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000077.npy"}
|
||||
{"epoch": 0.1164021164021164, "step": 78, "batch_size": 64, "mean": 0.4618911147117615, "std": 1.4039136171340942, "min": -3.5293655395507812, "p10": -1.195899200439453, "median": 0.46004581451416016, "p90": 2.2886995315551757, "max": 3.6596641540527344, "pos_frac": 0.671875, "sample": [2.292539596557617, -3.176849365234375, 0.7450637817382812, 2.40252685546875, 0.5449371337890625, 2.3314266204833984, 1.0702972412109375, 1.0857315063476562, 0.32527923583984375, 0.41065216064453125, 1.572723388671875, -1.0997314453125, -0.2891082763671875, 2.2797393798828125, -0.05340576171875, -0.45934295654296875, -1.7580680847167969, 3.20941162109375, -1.2371139526367188, 0.325286865234375, -0.922637939453125, -0.9958114624023438, 0.3891143798828125, -0.4033927917480469, 0.8430633544921875, 0.81683349609375, 0.12579345703125, -3.5293655395507812, 2.04296875, -2.31201171875, 1.5861091613769531, 1.2650680541992188, 0.04740142822265625, 1.6112327575683594, 0.06060028076171875, -0.3892822265625, -1.4722366333007812, 0.01662445068359375, 1.6621570587158203, 1.7304801940917969, 0.7350540161132812, 0.5665931701660156, 0.95074462890625, 1.543701171875, -0.6122512817382812, 1.78521728515625, -1.4071846008300781, 0.5094394683837891, 0.29734039306640625, 1.5848846435546875, -0.5757122039794922, 2.5538101196289062, -0.46860504150390625, -0.7195587158203125, 0.577545166015625, -0.5866508483886719, 0.13958358764648438, -0.2263641357421875, 1.239349365234375, 2.3712387084960938, 0.8190460205078125, 0.3997764587402344, 3.6596641540527344, 1.729665756225586], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000078.npy"}
|
||||
{"epoch": 0.11791383219954649, "step": 79, "batch_size": 64, "mean": 1.0946189165115356, "std": 2.0348281860351562, "min": -3.1635360717773438, "p10": -0.6821798324584961, "median": 0.5981330871582031, "p90": 3.038525009155274, "max": 8.010345458984375, "pos_frac": 0.703125, "sample": [-0.10035133361816406, 0.28519439697265625, -0.9656867980957031, 1.595794677734375, -0.9999351501464844, 1.1172256469726562, -3.1635360717773438, -0.07204627990722656, 0.49752044677734375, 0.44387054443359375, -0.6829242706298828, 0.5774993896484375, 1.7863502502441406, 7.6984710693359375, -1.7960052490234375, 0.48841094970703125, 2.030200958251953, 0.5144844055175781, -0.45795440673828125, 0.4001350402832031, 1.0183391571044922, 7.148040771484375, 1.8532257080078125, 3.2728271484375, 1.3411712646484375, -1.4145317077636719, 4.490818023681641, 1.5880355834960938, 1.6220474243164062, 2.2194747924804688, -1.3895721435546875, -0.059906005859375, 2.48431396484375, 2.1351699829101562, 1.8771114349365234, 0.43109130859375, 0.9869613647460938, 1.795206069946289, -0.6804428100585938, 2.5969314575195312, -0.5485115051269531, 3.07208251953125, -0.10007476806640625, -0.3305492401123047, 0.26117897033691406, 0.8849277496337891, 2.4767303466796875, 0.1591815948486328, 0.8139133453369141, 0.20752716064453125, 1.1023330688476562, -0.4567108154296875, 5.232696533203125, -0.2184906005859375, 0.7542686462402344, 8.010345458984375, 0.3840484619140625, 0.6187667846679688, 0.3076324462890625, 2.069427490234375, -0.499053955078125, 0.7684745788574219, 2.960224151611328, -0.3877906799316406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000079.npy"}
|
||||
{"epoch": 0.11942554799697656, "step": 80, "batch_size": 64, "mean": 0.5406800508499146, "std": 1.933139443397522, "min": -4.4472198486328125, "p10": -1.73245849609375, "median": 0.24652576446533203, "p90": 3.230627822875977, "max": 4.9974365234375, "pos_frac": 0.515625, "sample": [-0.5588626861572266, 3.2426109313964844, 0.79644775390625, 2.4496841430664062, 1.3656902313232422, -0.5366363525390625, -0.4813499450683594, -0.339111328125, -1.8119964599609375, -1.7491531372070312, 2.171682357788086, 3.2578048706054688, -0.8938522338867188, -1.3281021118164062, -0.043170928955078125, 4.0649566650390625, -1.449371337890625, -0.5575027465820312, 0.7086105346679688, -0.09415245056152344, 0.5313568115234375, -1.5423507690429688, -0.22074317932128906, 1.9091339111328125, 3.6803436279296875, 2.9906139373779297, 2.212432861328125, -0.24501800537109375, 2.9928741455078125, 3.9927520751953125, -1.7779827117919922, 1.9751129150390625, 0.3659210205078125, 0.44356346130371094, 1.450653076171875, 2.04229736328125, -0.6648635864257812, -0.048797607421875, 4.139858245849609, -4.4472198486328125, 3.202667236328125, -0.30410003662109375, -2.8670730590820312, 1.6694602966308594, -0.0075359344482421875, 1.1843147277832031, 1.2638092041015625, 0.3850269317626953, -1.767791748046875, -1.6935043334960938, 2.8204498291015625, -0.2317333221435547, -0.419647216796875, -0.8113250732421875, -0.4867839813232422, 0.12713050842285156, 1.38916015625, -1.4596881866455078, 1.7966480255126953, -3.3089637756347656, 0.41953277587890625, -0.4415283203125, 4.9974365234375, 1.1533966064453125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000080.npy"}
|
||||
{"epoch": 0.12093726379440665, "step": 81, "batch_size": 64, "mean": 0.8504737615585327, "std": 1.8003772497177124, "min": -2.834320068359375, "p10": -0.9882423400878905, "median": 0.6565723419189453, "p90": 3.0811393737792976, "max": 7.6277923583984375, "pos_frac": 0.671875, "sample": [1.0135955810546875, 2.085378646850586, 1.8089790344238281, -1.0392532348632812, 1.1283187866210938, -2.4552230834960938, 0.2550849914550781, 0.3582916259765625, -0.4912834167480469, -0.5844192504882812, 4.8354034423828125, -0.166656494140625, 0.3566131591796875, 2.3355789184570312, 0.6418571472167969, 0.18202972412109375, 0.38860130310058594, 0.20064544677734375, 3.4945144653320312, 1.9501781463623047, 0.7839603424072266, -0.4511451721191406, 2.3097610473632812, 1.00799560546875, 2.368682861328125, 0.9164505004882812, 0.102630615234375, -0.3903045654296875, -0.28379058837890625, -1.2978515625, 0.9425125122070312, -1.8621788024902344, 1.9266510009765625, 0.22539520263671875, -0.7184906005859375, 3.210693359375, -0.491668701171875, 7.6277923583984375, 1.642486572265625, 5.3667755126953125, -0.7602691650390625, 1.2185935974121094, 0.6712875366210938, -0.0916748046875, 1.3765029907226562, 0.09276580810546875, 1.9053497314453125, 1.04095458984375, 2.9528427124023438, 1.9434051513671875, 3.6126155853271484, -0.8692169189453125, 2.118459701538086, -0.45204925537109375, -0.10969924926757812, 1.0434799194335938, -1.594533920288086, 0.3084239959716797, 1.0065345764160156, 3.1361236572265625, -0.5151424407958984, 1.192230224609375, -2.834320068359375, -1.1969375610351562], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000081.npy"}
|
||||
{"epoch": 0.12244897959183673, "step": 82, "batch_size": 64, "mean": 1.0444717407226562, "std": 1.5496211051940918, "min": -4.196929931640625, "p10": -0.6091178894042968, "median": 1.0111923217773438, "p90": 3.1479141235351573, "max": 5.4624176025390625, "pos_frac": 0.75, "sample": [0.6080951690673828, 1.56201171875, 0.6814956665039062, -0.6580104827880859, 2.2157974243164062, 0.28804588317871094, 1.0437126159667969, 2.2362823486328125, 3.5149383544921875, 1.1394786834716797, 1.1024169921875, 0.5882396697998047, 1.358306884765625, 2.5211563110351562, -0.1038360595703125, 1.4792709350585938, 0.9917449951171875, 0.19387054443359375, 1.496419906616211, 0.4349822998046875, 0.3820037841796875, -0.17826461791992188, 0.8830928802490234, -0.018877029418945312, -0.7416229248046875, 3.9691085815429688, 1.6338958740234375, 1.0306396484375, 1.6803874969482422, 1.3371047973632812, 0.34113311767578125, -0.35414886474609375, 5.4624176025390625, 0.2053375244140625, 1.8417816162109375, 3.5376434326171875, 2.8618621826171875, 1.2369918823242188, 1.3482437133789062, 3.70428466796875, -0.02193450927734375, -1.5117645263671875, 2.7954788208007812, -1.4140357971191406, 2.0955963134765625, 0.6011734008789062, 2.0850448608398438, -0.025205612182617188, 4.205284118652344, 0.2527923583984375, -0.1361236572265625, 0.23681640625, 1.980072021484375, 3.2705078125, -0.17475509643554688, -0.7426986694335938, 2.0721397399902344, 1.1325302124023438, 1.883941650390625, 0.5508041381835938, -0.982574462890625, -4.196929931640625, -0.49503517150878906, 0.5276336669921875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000082.npy"}
|
||||
{"epoch": 0.12396069538926682, "step": 83, "batch_size": 64, "mean": 1.1835986375808716, "std": 1.847933053970337, "min": -3.476083755493164, "p10": -0.7220520019531249, "median": 1.0591964721679688, "p90": 3.0880966186523446, "max": 8.451065063476562, "pos_frac": 0.796875, "sample": [2.0027008056640625, 0.56005859375, -0.672821044921875, 0.24071502685546875, 0.8219776153564453, 0.5857620239257812, 1.6272430419921875, 1.714447021484375, 2.5454864501953125, 4.712371826171875, 0.071044921875, 1.5333023071289062, -2.0480918884277344, 0.9074058532714844, 1.8715763092041016, 0.8647403717041016, 1.827239990234375, 2.925689697265625, 2.2438507080078125, 1.3813438415527344, 1.8251800537109375, -0.7419281005859375, 4.1589813232421875, -0.016588211059570312, -1.2383041381835938, 1.3756179809570312, 3.46612548828125, 1.0577774047851562, 0.41162872314453125, -0.39575958251953125, -0.6756744384765625, 0.6671524047851562, 2.7982254028320312, -3.476083755493164, -0.8059291839599609, 2.545839309692383, 1.503763198852539, 0.26142120361328125, 1.5437049865722656, -2.1931228637695312, 0.3192100524902344, 0.7429580688476562, 0.3687477111816406, 0.016021728515625, -0.47028160095214844, -0.17661476135253906, 0.35688018798828125, 2.5846710205078125, 3.7755584716796875, 1.4064254760742188, 8.451065063476562, 1.6730461120605469, 3.1576995849609375, 1.0606155395507812, 0.8094253540039062, 1.5538883209228516, 1.3261070251464844, 1.7779006958007812, 1.8225250244140625, 6.21014404296875, 1.07427978515625, 0.7973251342773438, 0.2908973693847656, -0.9662551879882812], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000083.npy"}
|
||||
{"epoch": 0.1254724111866969, "step": 84, "batch_size": 64, "mean": 0.9225125908851624, "std": 1.9130252599716187, "min": -3.626495361328125, "p10": -1.5291976928710938, "median": 0.816716194152832, "p90": 3.6082847595214846, "max": 5.1717529296875, "pos_frac": 0.734375, "sample": [1.1920166015625, 2.0365066528320312, 0.8398532867431641, 1.8641471862792969, 4.853199005126953, 2.1164302825927734, 3.6267623901367188, -0.4731292724609375, -0.3385448455810547, -2.5604934692382812, 0.2139892578125, 1.9335803985595703, 3.2074851989746094, 0.5085678100585938, 0.41085052490234375, 0.35394287109375, 0.5807342529296875, -1.4848175048828125, 4.130638122558594, -1.3005027770996094, 1.3623504638671875, 2.1472244262695312, 1.1394920349121094, 2.2473297119140625, 0.4125843048095703, 0.7935791015625, -1.8127288818359375, 3.5651702880859375, 0.06661033630371094, 0.7743072509765625, 0.5929908752441406, 0.851715087890625, -0.17400741577148438, 5.1717529296875, -1.8061141967773438, 1.3221054077148438, 1.5439739227294922, 2.103425979614258, 1.2075843811035156, -2.3533401489257812, -3.626495361328125, 4.493133544921875, 1.6209449768066406, 0.16407203674316406, 4.705253601074219, 0.7155818939208984, 0.48449134826660156, 2.7605056762695312, 1.2899055480957031, -1.7323684692382812, 0.8538894653320312, -1.015777587890625, 1.9366741180419922, 1.3071136474609375, 1.5430641174316406, -1.5482177734375, 4.5866851806640625, 0.5929241180419922, 0.17186737060546875, -1.394287109375, 2.8623199462890625, -0.01753997802734375, -1.311492919921875, -1.2686614990234375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000084.npy"}
|
||||
{"epoch": 0.12698412698412698, "step": 85, "batch_size": 64, "mean": 1.265273094177246, "std": 2.0298678874969482, "min": -3.832569122314453, "p10": -1.504831314086914, "median": 1.0618257522583008, "p90": 3.5489494323730475, "max": 6.922027587890625, "pos_frac": 0.765625, "sample": [2.6063461303710938, -1.4818000793457031, 1.2209625244140625, 2.35687255859375, 0.8543167114257812, 1.2292098999023438, 1.6923065185546875, 0.7932338714599609, -0.285003662109375, 3.295543670654297, 3.6245651245117188, -0.1616058349609375, 2.2254276275634766, 1.0012969970703125, 1.3035392761230469, 1.828765869140625, 1.5471572875976562, 3.2285614013671875, 2.7148399353027344, 2.3724441528320312, -1.8185310363769531, 2.5072364807128906, -2.179901123046875, 2.6637840270996094, 0.33751678466796875, 3.3725128173828125, 0.5295753479003906, 1.061248779296875, 4.660709381103516, -0.008378982543945312, -1.5147018432617188, 1.876007080078125, 2.3240203857421875, 1.688812255859375, 1.0624027252197266, 0.7916793823242188, 3.8092575073242188, 2.455686569213867, 0.42278289794921875, 0.03679466247558594, -3.832569122314453, 1.76397705078125, -0.4085235595703125, 1.6159133911132812, 6.922027587890625, 3.1874618530273438, 5.345794677734375, 0.24917221069335938, 0.9935226440429688, 0.9552097320556641, 0.929046630859375, -2.1592941284179688, 0.8125877380371094, -0.37747955322265625, 0.4878425598144531, 1.15106201171875, -2.0947608947753906, -1.682586669921875, 4.771800994873047, 0.3993721008300781, 6.419425964355469, -1.1723861694335938, -0.32120513916015625, 0.9765720367431641], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000085.npy"}
|
||||
{"epoch": 0.12849584278155707, "step": 86, "batch_size": 64, "mean": 0.9834346175193787, "std": 2.089958429336548, "min": -4.491432189941406, "p10": -1.495336151123047, "median": 0.9292373657226562, "p90": 3.9605171203613283, "max": 6.168952941894531, "pos_frac": 0.65625, "sample": [6.168952941894531, -0.2877082824707031, -1.484649658203125, 0.0060710906982421875, 2.3820877075195312, -0.06085777282714844, 0.200958251953125, 0.46227264404296875, -4.491432189941406, 1.491811752319336, -0.37530517578125, -1.6349639892578125, 4.3406219482421875, 5.456367492675781, -0.8558502197265625, 2.8080673217773438, 1.7549610137939453, -0.7454948425292969, -3.4223556518554688, 0.47948455810546875, 3.92608642578125, -0.10125732421875, -0.21807098388671875, 2.2823333740234375, 3.9752731323242188, 0.2589740753173828, 1.659759521484375, 1.2138404846191406, 1.7247085571289062, 0.5625820159912109, 2.511402130126953, 4.705718994140625, 2.42022705078125, -2.3383102416992188, 3.2797088623046875, 1.8688507080078125, -1.4999160766601562, 0.9966697692871094, -1.2350349426269531, 0.48394012451171875, -1.641937255859375, 2.7709407806396484, 1.9271621704101562, -0.5366649627685547, 1.3391647338867188, 0.4911613464355469, 3.474649429321289, -1.2164192199707031, 1.8376998901367188, -0.5353145599365234, 1.9855194091796875, 4.30328369140625, 0.5146713256835938, -0.17809486389160156, 2.1589088439941406, 4.564781188964844, -2.011259078979492, 1.4987030029296875, 1.0255584716796875, -0.9107398986816406, 1.2729988098144531, -0.004787445068359375, 0.8618049621582031, 1.2775001525878906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000086.npy"}
|
||||
{"epoch": 0.13000755857898716, "step": 87, "batch_size": 64, "mean": 1.0872149467468262, "std": 2.5169198513031006, "min": -6.54681396484375, "p10": -1.8834251403808593, "median": 1.107147216796875, "p90": 4.411101341247559, "max": 7.126739501953125, "pos_frac": 0.703125, "sample": [1.2556190490722656, 4.5977783203125, 7.126739501953125, 0.02353668212890625, 1.7451171875, 0.5031890869140625, 3.51361083984375, -1.55279541015625, 3.4533843994140625, 2.6783676147460938, -1.9628334045410156, -1.2110042572021484, 1.8109607696533203, -2.281808853149414, -6.54681396484375, 5.60028076171875, -2.350433349609375, 4.448383331298828, 1.1813411712646484, 1.2515296936035156, -2.659881591796875, 0.453369140625, 1.5655841827392578, -1.1164932250976562, 3.91729736328125, -4.39410400390625, -0.12154769897460938, -0.5889663696289062, 1.71533203125, -1.2792911529541016, -1.166839599609375, 0.35037994384765625, 4.32411003112793, -2.4598770141601562, 1.2074432373046875, -1.1853790283203125, 0.714508056640625, 1.7306385040283203, 2.62091064453125, 0.5357398986816406, 0.96514892578125, 3.1235122680664062, 1.1427421569824219, -1.6981391906738281, 1.4904632568359375, 1.2456626892089844, 1.0105552673339844, 3.18402099609375, -0.31751060485839844, 2.843994140625, 3.8508148193359375, 1.267730712890625, 5.6684112548828125, 0.5885467529296875, -0.08978271484375, 1.2952423095703125, 0.6814289093017578, -0.24547576904296875, 6.73126220703125, 4.788246154785156, 0.95892333984375, 1.0715522766113281, 0.5680999755859375, 2.0092239379882812], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000087.npy"}
|
||||
{"epoch": 0.13151927437641722, "step": 88, "batch_size": 64, "mean": 0.8278965950012207, "std": 3.024770736694336, "min": -8.224472045898438, "p10": -2.430110549926758, "median": 0.7006816864013672, "p90": 4.044215393066406, "max": 11.1026611328125, "pos_frac": 0.65625, "sample": [6.203117370605469, 1.0121784210205078, -2.1143226623535156, -0.9557113647460938, 6.2960052490234375, 3.27886962890625, 5.856689453125, 0.7200508117675781, 1.2538681030273438, -0.1552600860595703, 3.8936691284179688, 3.8788833618164062, 1.7843551635742188, 1.6553211212158203, -0.7271499633789062, 1.6274337768554688, 4.594429016113281, -1.765350341796875, -2.5447998046875, 0.8556671142578125, 2.6331844329833984, 0.8165683746337891, 0.19919967651367188, -0.5069198608398438, -2.33380126953125, 1.6836318969726562, -0.21293258666992188, 11.1026611328125, -0.9360580444335938, -2.926137924194336, -7.0605926513671875, -2.9794464111328125, 3.9486083984375, 0.4419403076171875, 0.9050559997558594, -0.154296875, -0.81158447265625, 2.1669368743896484, 1.5289878845214844, 1.1943435668945312, 0.3800487518310547, 3.0047836303710938, 2.303457260131836, 0.6105823516845703, -3.3174896240234375, -0.280609130859375, 0.22829437255859375, 0.5553932189941406, 0.6813125610351562, 0.15282821655273438, 2.4505996704101562, 0.7837142944335938, 6.3237762451171875, 4.0851898193359375, 2.6964340209960938, -2.315896987915039, 1.6719436645507812, 0.2568931579589844, 0.425628662109375, 2.9038753509521484, -1.2481689453125, -8.224472045898438, -2.018644332885742, -2.471385955810547], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000088.npy"}
|
||||
{"epoch": 0.1330309901738473, "step": 89, "batch_size": 64, "mean": 1.409203290939331, "std": 2.6141104698181152, "min": -4.146827697753906, "p10": -0.850894737243652, "median": 1.0860137939453125, "p90": 4.109409523010254, "max": 12.062255859375, "pos_frac": 0.71875, "sample": [0.2690773010253906, 1.8160171508789062, -0.44861793518066406, 1.445220947265625, 3.6059417724609375, 4.046346664428711, -0.5740432739257812, 0.7523288726806641, 1.1352996826171875, 3.8763809204101562, 1.8601417541503906, 3.866790771484375, 2.609893798828125, -0.42877197265625, 0.2411956787109375, 0.5350914001464844, -0.21822738647460938, 1.82501220703125, -0.5286636352539062, -2.7411537170410156, 3.11871337890625, -0.9695453643798828, -0.2233734130859375, 3.640625, -0.4488811492919922, 12.062255859375, 2.6330337524414062, 0.9219207763671875, -0.47307586669921875, 4.39892578125, 0.11214447021484375, 1.274322509765625, 0.9139556884765625, 0.4690093994140625, 4.3498687744140625, -0.2130451202392578, 3.5174789428710938, 1.0367279052734375, -1.44775390625, 2.7677459716796875, 2.7778549194335938, 2.9868240356445312, 6.858390808105469, 7.078046798706055, 1.3440570831298828, 0.09811210632324219, 0.2548484802246094, 3.4416656494140625, 4.136436462402344, -2.8828125, -0.3809356689453125, 2.412006378173828, 0.23232269287109375, 0.5076408386230469, 0.308807373046875, -2.8926620483398438, 1.726461410522461, -0.291656494140625, -3.481121063232422, -4.146827697753906, 5.0987396240234375, 1.2746353149414062, 1.4070549011230469, 1.9348087310791016], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000089.npy"}
|
||||
{"epoch": 0.1345427059712774, "step": 90, "batch_size": 64, "mean": 1.2589483261108398, "std": 2.5005600452423096, "min": -4.755401611328125, "p10": -1.8515823364257813, "median": 0.9938087463378906, "p90": 3.88408203125, "max": 6.7032623291015625, "pos_frac": 0.75, "sample": [-1.0657024383544922, 2.7724533081054688, 0.8819580078125, 2.4231529235839844, 3.412282943725586, 0.35601043701171875, 3.288602828979492, 6.7032623291015625, -3.70947265625, -4.755401611328125, 3.695770263671875, 3.2461814880371094, 3.142995834350586, 0.9959907531738281, 2.715423583984375, -1.1890907287597656, 1.6254043579101562, 6.22979736328125, -1.8708038330078125, 1.2442436218261719, 3.6774978637695312, -0.7986602783203125, 6.63116455078125, 3.8924179077148438, 5.9533233642578125, 0.8087177276611328, 0.9079055786132812, 3.81170654296875, 0.2881050109863281, 0.1141204833984375, 0.3296852111816406, -0.9843978881835938, 1.8914737701416016, 6.4422760009765625, 1.496734619140625, 3.7419662475585938, -2.1886520385742188, -0.9780502319335938, 4.548683166503906, 1.674612045288086, -1.3798370361328125, 1.3901824951171875, 1.96173095703125, -1.806732177734375, 3.8646316528320312, 0.2765979766845703, 0.6559410095214844, 0.7674903869628906, 0.03546714782714844, -0.5161399841308594, -3.0060081481933594, 1.4451980590820312, 1.7716522216796875, 0.02072906494140625, 0.3274040222167969, 0.9916267395019531, 0.241851806640625, -3.045024871826172, -2.0318679809570312, 2.6565704345703125, 2.4265060424804688, 2.471954345703125, -0.5250396728515625, 0.1741199493408203], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000090.npy"}
|
||||
{"epoch": 0.1360544217687075, "step": 91, "batch_size": 64, "mean": 1.1339049339294434, "std": 2.8037497997283936, "min": -6.0162506103515625, "p10": -1.754723358154297, "median": 0.920867919921875, "p90": 4.353787994384766, "max": 11.80877685546875, "pos_frac": 0.625, "sample": [3.16082763671875, 3.6077423095703125, 1.3405628204345703, -1.4907550811767578, 0.501251220703125, 6.64361572265625, 0.31363677978515625, 1.672119140625, 2.8282699584960938, 7.082008361816406, 11.80877685546875, -0.83319091796875, 1.1916770935058594, -1.4525184631347656, 1.6717510223388672, -1.7858257293701172, -0.840850830078125, -0.6866302490234375, 3.0531234741210938, -3.0210723876953125, 2.0347671508789062, 3.524383544921875, 0.8193130493164062, 0.4422950744628906, 1.3463516235351562, 2.118724822998047, 3.187183380126953, -1.7799758911132812, -3.1840744018554688, 3.8087158203125, 0.9530029296875, -1.3722801208496094, -2.3058319091796875, 1.3478870391845703, 5.16615104675293, -1.1112442016601562, -0.9297332763671875, 2.9556732177734375, -0.8327655792236328, -6.0162506103515625, 2.2544326782226562, 1.1553955078125, -2.3695068359375, -1.2399444580078125, 0.5860385894775391, 0.88873291015625, 0.07472610473632812, 4.82478141784668, 1.982879638671875, 1.6228408813476562, -0.11405181884765625, -1.5544910430908203, -0.13229942321777344, -0.22228622436523438, 4.446441650390625, 3.053190231323242, 5.678314208984375, -0.10198593139648438, 4.137596130371094, 2.5053634643554688, -1.69580078125, -0.16072463989257812, 1.6727485656738281, 0.3407154083251953], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000091.npy"}
|
||||
{"epoch": 0.13756613756613756, "step": 92, "batch_size": 64, "mean": 1.5650835037231445, "std": 3.1517252922058105, "min": -5.731189727783203, "p10": -1.5757919311523436, "median": 0.9597854614257812, "p90": 5.716004753112794, "max": 11.265396118164062, "pos_frac": 0.703125, "sample": [3.1950912475585938, -0.48665809631347656, -2.838134765625, 0.055908203125, -0.7404251098632812, -0.9041023254394531, 0.12732887268066406, 0.45993804931640625, 1.0248336791992188, -1.086639404296875, -1.4908103942871094, 1.4882678985595703, 3.4995880126953125, -1.597412109375, 4.659402847290039, 2.466106414794922, 2.426860809326172, 0.10259819030761719, 5.384553909301758, 0.6182289123535156, 3.7209320068359375, -0.5879440307617188, 1.0954627990722656, 0.8947372436523438, 4.862579345703125, -0.1637744903564453, -3.4822006225585938, 1.263214111328125, -3.6249618530273438, 2.7837753295898438, -1.8981304168701172, 2.9228134155273438, 6.422966003417969, 0.8741359710693359, 3.2398681640625, 1.1073875427246094, 0.6418533325195312, 3.3117523193359375, -3.9874420166015625, 2.514251708984375, 0.028106689453125, 0.007808685302734375, 5.858055114746094, 2.902984619140625, 1.9442615509033203, -1.5253448486328125, 4.4715576171875, 5.991859436035156, 3.9228363037109375, -1.4226455688476562, -0.48874664306640625, 9.826629638671875, 1.608001708984375, -0.21625518798828125, 4.492706298828125, -5.731189727783203, 11.265396118164062, 4.678245544433594, 0.47746849060058594, 7.42730712890625, 5.881843566894531, 0.4467658996582031, 0.40667724609375, -0.36478424072265625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000092.npy"}
|
||||
{"epoch": 0.13907785336356765, "step": 93, "batch_size": 64, "mean": 1.0032142400741577, "std": 3.22245192527771, "min": -5.389381408691406, "p10": -2.5993440628051756, "median": 0.37667083740234375, "p90": 4.96754608154297, "max": 14.091239929199219, "pos_frac": 0.609375, "sample": [0.7815952301025391, 2.816997528076172, 4.7143707275390625, 3.6939239501953125, -2.3234291076660156, 3.6874313354492188, -2.3387069702148438, -0.9572982788085938, -0.21599960327148438, 2.7495460510253906, 0.8251323699951172, 2.316692352294922, 14.091239929199219, 0.3135261535644531, -0.5846824645996094, 0.2673377990722656, -0.025485992431640625, 3.4870948791503906, -0.7727451324462891, -5.389381408691406, 2.1710586547851562, 7.24761962890625, -2.3422927856445312, 0.15032196044921875, 2.932432174682617, -0.25313568115234375, 4.61676025390625, -0.042087554931640625, -1.658487319946289, 1.464202880859375, -0.259918212890625, 0.05082511901855469, -0.9160442352294922, 5.718467712402344, -1.7212600708007812, -0.6768417358398438, 0.056793212890625, 0.3625335693359375, -4.8751373291015625, -0.2808074951171875, 0.8514289855957031, 0.3880577087402344, 1.9341621398925781, 1.6640434265136719, 1.8618621826171875, -2.876251220703125, 0.5734462738037109, 0.3652839660644531, 5.4416961669921875, 1.2309989929199219, 4.345905303955078, 1.5076446533203125, 5.0760498046875, 2.08404541015625, -2.1017379760742188, -2.7095088958740234, 5.483982086181641, -5.1695556640625, -3.2644195556640625, 5.551948547363281, 3.954833984375, -3.101287841796875, -0.6544990539550781, 2.8854141235351562], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000093.npy"}
{"epoch": 0.14058956916099774, "step": 94, "batch_size": 64, "mean": 1.8816919326782227, "std": 4.06487512588501, "min": -10.556961059570312, "p10": -1.1358375549316404, "median": 2.0014848709106445, "p90": 5.722311401367188, "max": 14.798690795898438, "pos_frac": 0.765625, "sample": [3.74078369140625, 0.5666580200195312, 2.3325424194335938, 3.265972137451172, 1.3350372314453125, -9.025169372558594, 0.1283588409423828, 0.7015705108642578, 1.7138442993164062, 2.2284812927246094, 0.7361125946044922, 5.5408172607421875, 2.22320556640625, -8.524942398071289, 2.201263427734375, 5.365837097167969, -1.8186302185058594, -0.5780506134033203, 5.8000946044921875, 2.7099990844726562, 2.390726089477539, 2.3017120361328125, 3.723245620727539, 1.8573951721191406, 2.483264923095703, 1.0575027465820312, 0.4998607635498047, 0.762298583984375, -2.9528236389160156, 3.1288528442382812, 3.4780426025390625, -0.9764041900634766, 0.5041389465332031, -0.2824249267578125, 3.0998382568359375, 3.2164573669433594, 3.7839813232421875, 2.738006591796875, 3.628437042236328, 0.2655029296875, 6.086494445800781, 6.5204010009765625, -0.3185615539550781, 12.48223876953125, 4.180980682373047, 4.385765075683594, -1.1905975341796875, 10.838775634765625, 3.7841339111328125, 2.573099136352539, 14.798690795898438, 0.9112377166748047, -0.7228012084960938, -0.7972126007080078, 1.9263935089111328, 0.7090225219726562, -10.556961059570312, 2.0765762329101562, 9.460281372070312, 1.6105670928955078, 0.20592689514160156, -1.0080642700195312, -2.2391300201416016, -0.6403694152832031], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000094.npy"}
{"epoch": 0.1421012849584278, "step": 95, "batch_size": 64, "mean": 1.5964727401733398, "std": 3.4026076793670654, "min": -5.6966552734375, "p10": -2.479621124267578, "median": 1.2141799926757812, "p90": 6.263985443115235, "max": 9.094223022460938, "pos_frac": 0.65625, "sample": [5.5966033935546875, -2.9655838012695312, 1.217742919921875, -0.5251884460449219, -2.5356216430664062, 0.1575946807861328, 9.094223022460938, 0.9789810180664062, 2.6111984252929688, 1.5896987915039062, -1.2790374755859375, 1.1656723022460938, 0.05181884765625, -0.8035011291503906, -0.6871719360351562, 1.2106170654296875, -3.5980682373046875, -1.8074798583984375, -0.45742034912109375, 1.3782005310058594, 7.4686279296875, 6.6816253662109375, -0.2288074493408203, 0.8719596862792969, -5.628173828125, -0.5879898071289062, 7.76715087890625, 6.1943817138671875, -3.0113067626953125, -5.6966552734375, 2.0544662475585938, 6.9462738037109375, 1.3965530395507812, 3.0958251953125, -0.6853313446044922, 1.487945556640625, 5.5348968505859375, -0.7678298950195312, -3.6374435424804688, 2.8070907592773438, 3.572784423828125, 0.38385772705078125, 6.293815612792969, 2.58477783203125, -2.3489532470703125, 4.9392242431640625, 4.930877685546875, 0.04953575134277344, 2.1136245727539062, 5.752960205078125, -0.5745162963867188, 4.716148376464844, -0.5880489349365234, 3.4677886962890625, 5.393852233886719, 8.39959716796875, -1.3859481811523438, 0.36685943603515625, 1.9383563995361328, 3.3158493041992188, 0.4712562561035156, 1.9161205291748047, 6.00567626953125, -1.997772216796875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000095.npy"}
{"epoch": 0.1436130007558579, "step": 96, "batch_size": 64, "mean": 1.8977141380310059, "std": 4.605284690856934, "min": -7.835052490234375, "p10": -2.569322967529297, "median": 1.2808513641357422, "p90": 7.748275756835938, "max": 15.608642578125, "pos_frac": 0.6875, "sample": [3.1483726501464844, -2.625396728515625, 7.807373046875, -0.01642608642578125, 12.92041015625, -7.835052490234375, 3.470184326171875, 0.17141151428222656, 2.5928115844726562, 0.4361724853515625, 5.3291168212890625, -0.5571994781494141, 1.3705902099609375, 10.526748657226562, 2.788677215576172, 1.6542129516601562, -5.350311279296875, -0.2750396728515625, 1.0551738739013672, -0.6776695251464844, 1.9428825378417969, 3.0353927612304688, 7.610382080078125, -5.8296356201171875, 0.68975830078125, 0.7388591766357422, 5.4955596923828125, -0.328887939453125, -2.4384841918945312, 0.11051177978515625, 2.6207313537597656, 8.23785400390625, 15.608642578125, 4.826560974121094, -1.050741195678711, -1.8721694946289062, -7.445281982421875, 0.47098731994628906, -3.0708694458007812, 1.7583427429199219, 3.438751220703125, 1.1911125183105469, 1.4924068450927734, 0.25701904296875, 13.1669921875, -1.7873077392578125, 3.9926986694335938, 0.156402587890625, 2.0286026000976562, -0.5626869201660156, 4.241458892822266, -1.4952735900878906, 1.0537147521972656, 7.406822204589844, 10.712448120117188, 2.9325790405273438, -1.3985710144042969, 0.6514892578125, 1.73956298828125, 3.8900985717773438, 7.246612548828125, 1.5540962219238281, -1.711151123046875, -5.788730621337891], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000096.npy"}
{"epoch": 0.14512471655328799, "step": 97, "batch_size": 64, "mean": 1.0474696159362793, "std": 5.05444860458374, "min": -17.5645751953125, "p10": -2.827281188964844, "median": 0.9145803451538086, "p90": 6.302619552612309, "max": 16.952117919921875, "pos_frac": 0.609375, "sample": [-10.101165771484375, -2.369159698486328, -0.031221389770507812, 7.562246322631836, 1.0243053436279297, 13.317626953125, -2.8039321899414062, -3.322265625, -2.3907394409179688, -17.5645751953125, -2.590963363647461, -1.0438919067382812, 0.8748855590820312, 2.762714385986328, -3.6926498413085938, -9.533950805664062, 0.2585430145263672, 0.9542751312255859, 3.147613525390625, 0.18362045288085938, 1.036031723022461, 3.4911651611328125, 1.5981216430664062, -2.3207015991210938, 3.9010543823242188, -1.1666030883789062, -0.7136154174804688, 5.200279235839844, -1.3884506225585938, 11.657745361328125, 1.7049942016601562, -2.69287109375, 3.320098876953125, 0.3637580871582031, 0.8345413208007812, 4.807708740234375, 7.82611083984375, 6.775051116943359, -0.9512882232666016, 4.0688629150390625, 2.8412017822265625, -5.091064453125, 16.952117919921875, 2.500530242919922, 2.9375076293945312, -1.2344112396240234, 3.6769466400146484, 3.1439056396484375, 9.455963134765625, -2.8372879028320312, 0.16486358642578125, 1.9469528198242188, 1.190338134765625, -1.883453369140625, -1.8168563842773438, 2.8792343139648438, 3.912393569946289, -1.7886886596679688, 2.9058303833007812, 2.9285812377929688, 0.4990882873535156, -0.20948028564453125, 2.929546356201172, -0.959014892578125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000097.npy"}
{"epoch": 0.14663643235071808, "step": 98, "batch_size": 64, "mean": 0.680237352848053, "std": 3.328080177307129, "min": -9.224334716796875, "p10": -2.759770584106445, "median": 1.101694107055664, "p90": 4.878152084350589, "max": 7.0157012939453125, "pos_frac": 0.625, "sample": [-0.02399444580078125, 1.2856864929199219, 5.140506744384766, 0.7287750244140625, 2.2662429809570312, 2.49822998046875, 1.9968032836914062, 1.6674728393554688, -1.2009201049804688, 6.621589660644531, -8.721893310546875, -2.5233497619628906, -1.643890380859375, -1.3827781677246094, -2.5620651245117188, 0.088165283203125, -0.09814453125, 1.7978858947753906, 1.228250503540039, -0.2685699462890625, 3.6841773986816406, 6.385993957519531, -3.6348876953125, 5.556549072265625, 0.4963264465332031, -2.0577449798583984, 3.6689834594726562, 0.9318809509277344, 1.3175926208496094, 1.0538291931152344, 1.9045944213867188, 1.29547119140625, 5.7395477294921875, 2.9020538330078125, -2.844501495361328, 1.3972244262695312, 1.48944091796875, 2.8150482177734375, 0.9044609069824219, -0.4272193908691406, 2.8413238525390625, -0.29471588134765625, -8.649070739746094, 1.5928382873535156, -1.4011383056640625, -3.690645217895508, 1.7956600189208984, 1.1495590209960938, -1.3007049560546875, 5.375450134277344, -2.2346935272216797, 4.257347106933594, -1.9872970581054688, -2.1922760009765625, -9.224334716796875, 4.190116882324219, 2.1641464233398438, 0.08585548400878906, 0.9468307495117188, 7.0157012939453125, 2.7692832946777344, 4.2659912109375, -0.5230255126953125, -2.8898353576660156], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000098.npy"}
{"epoch": 0.14814814814814814, "step": 99, "batch_size": 64, "mean": 0.9784270524978638, "std": 4.635390281677246, "min": -12.97943115234375, "p10": -3.4913734436035155, "median": 0.766901969909668, "p90": 6.425466537475587, "max": 14.73760986328125, "pos_frac": 0.609375, "sample": [8.297264099121094, 7.739013671875, 6.2570648193359375, 11.115135192871094, -0.20459747314453125, -2.753437042236328, -0.7371101379394531, -1.5484905242919922, 1.272796630859375, 6.497638702392578, 3.459442138671875, 0.6277923583984375, 2.0769996643066406, 4.700740814208984, -0.5280342102050781, 0.8972930908203125, 1.0253753662109375, 0.005680084228515625, -0.8210487365722656, 1.8887996673583984, 5.74981689453125, 0.048549652099609375, -5.422943115234375, -2.8930816650390625, -6.0095672607421875, -3.4330062866210938, 14.73760986328125, -9.489669799804688, -2.556427001953125, 6.540447235107422, 1.6481704711914062, -3.516387939453125, -0.5650863647460938, 0.5810699462890625, 0.7949943542480469, 1.1949386596679688, 1.0566940307617188, -3.010000228881836, 0.315093994140625, 5.603099822998047, 1.2589740753173828, -12.97943115234375, 5.641937255859375, 11.03656005859375, -0.4829292297363281, 1.5408172607421875, 5.9520416259765625, 0.7388095855712891, -8.38763427734375, 1.3880367279052734, -0.3979454040527344, 3.5672950744628906, 2.0350093841552734, 0.6873741149902344, 2.052764892578125, -0.24053573608398438, -0.12447738647460938, 4.770225524902344, 1.6066246032714844, -0.888092041015625, 1.4106903076171875, -2.20233154296875, -1.7141647338867188, -4.292919158935547], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000099.npy"}
{"epoch": 0.14965986394557823, "step": 100, "batch_size": 64, "mean": 1.4460135698318481, "std": 6.240384578704834, "min": -17.120620727539062, "p10": -3.575221252441406, "median": 0.9681224822998047, "p90": 7.245265197753909, "max": 26.143402099609375, "pos_frac": 0.65625, "sample": [0.4371337890625, -2.171356201171875, -4.250816345214844, 2.569122314453125, 7.5118255615234375, -1.0037841796875, 0.7146511077880859, 3.5733070373535156, 6.09771728515625, 4.317024230957031, 1.6272430419921875, 1.4798507690429688, 4.116218566894531, 12.639076232910156, 1.762847900390625, 4.460723876953125, -1.7230148315429688, 1.0397453308105469, 0.8204345703125, 1.0644073486328125, 0.7194194793701172, 0.22088623046875, -5.606842041015625, 0.2943382263183594, -4.0485382080078125, 4.569129943847656, -1.5289325714111328, 8.520317077636719, 1.0848236083984375, -15.151580810546875, -1.3590469360351562, -0.0361785888671875, -3.14251708984375, 0.8585433959960938, 0.7808837890625, -0.6644058227539062, 4.030609130859375, 0.2967987060546875, -12.908119201660156, 5.579442977905273, 1.628173828125, 17.772567749023438, -3.7606658935546875, 0.8964996337890625, -1.2665901184082031, -0.6744956970214844, 2.9953575134277344, 1.3368358612060547, -0.9219093322753906, -0.8904190063476562, 6.623291015625, 9.490814208984375, -0.5414886474609375, 3.078348159790039, 2.342987060546875, 1.2208404541015625, -3.03424072265625, 12.104965209960938, 4.630836486816406, 1.266357421875, 26.143402099609375, -0.099212646484375, -17.120620727539062, 1.731842041015625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000100.npy"}
{"epoch": 0.15117157974300832, "step": 101, "batch_size": 64, "mean": 2.0524988174438477, "std": 4.237483501434326, "min": -11.776763916015625, "p10": -2.271822166442871, "median": 1.554835319519043, "p90": 7.120146179199219, "max": 12.611503601074219, "pos_frac": 0.765625, "sample": [-1.4051170349121094, -0.013275146484375, 3.3010177612304688, 0.5291900634765625, 6.591461181640625, 6.966156005859375, 4.273681640625, 0.91619873046875, 0.0468292236328125, -0.2690010070800781, 0.7614459991455078, 4.6688690185546875, -11.776763916015625, 8.302349090576172, 4.551918029785156, 0.3714103698730469, 4.7071533203125, 2.1381759643554688, -0.9939613342285156, -2.761951446533203, 4.447643280029297, 1.2231216430664062, 5.8313446044921875, -1.9358024597167969, 3.5714492797851562, -2.051319122314453, 4.054973602294922, 3.9881591796875, 0.8667316436767578, 6.474721908569336, 1.2839508056640625, 10.0103759765625, 5.28931999206543, 0.48832130432128906, -0.66796875, -2.36761474609375, 1.2035903930664062, -6.895347595214844, -2.366323471069336, 4.063072204589844, -4.38250732421875, 3.8174667358398438, 1.1661720275878906, 1.697540283203125, 7.1861419677734375, 1.79522705078125, 0.362518310546875, -9.221084594726562, 0.10561370849609375, 1.412130355834961, 7.508674621582031, 0.4211463928222656, 7.594106674194336, 3.0335559844970703, 3.4304351806640625, 0.297088623046875, 12.611503601074219, 2.532012939453125, -0.6643142700195312, 2.1064796447753906, 5.9494781494140625, 2.9350204467773438, 10.887336730957031, 1.3599872589111328], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000101.npy"}
{"epoch": 0.15268329554043839, "step": 102, "batch_size": 64, "mean": 0.8108617663383484, "std": 5.484974384307861, "min": -10.184158325195312, "p10": -4.428290176391601, "median": 0.2243947982788086, "p90": 6.902362442016602, "max": 21.104949951171875, "pos_frac": 0.53125, "sample": [7.2914581298828125, -3.4784317016601562, -1.9798145294189453, 1.2265777587890625, -6.931037902832031, -1.200979232788086, -4.8260040283203125, 2.8607025146484375, -7.939247131347656, -0.24325942993164062, 2.5575180053710938, 3.9804153442382812, 0.0573883056640625, 1.6029129028320312, 6.9495086669921875, -3.3746566772460938, -2.2985286712646484, 17.107498168945312, -3.4280624389648438, 9.521598815917969, -2.9919376373291016, 0.7523555755615234, -2.4221878051757812, -0.9409637451171875, 2.8874645233154297, 2.3998794555664062, 0.9353179931640625, 6.792354583740234, -9.480384826660156, 1.0352764129638672, 6.357536315917969, 0.8483772277832031, -4.467201232910156, 0.3241767883300781, -3.5575103759765625, -0.8010101318359375, 4.071971893310547, 7.358375549316406, 11.624313354492188, -2.7821826934814453, -0.4474754333496094, 1.8369140625, -10.184158325195312, -1.0386810302734375, -1.6613578796386719, -0.2869873046875, 0.1724853515625, 6.0331268310546875, 0.2763042449951172, -1.427276611328125, 2.110462188720703, 4.977836608886719, 1.9174041748046875, -4.337497711181641, -1.6449661254882812, -4.040443420410156, -1.7860260009765625, 21.104949951171875, -2.882701873779297, -7.7547149658203125, 3.970775604248047, 4.9032440185546875, 5.755577087402344, 0.9287815093994141], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000102.npy"}
{"epoch": 0.15419501133786848, "step": 103, "batch_size": 64, "mean": 1.5441327095031738, "std": 3.7237093448638916, "min": -6.5787811279296875, "p10": -2.394597625732422, "median": 1.011183738708496, "p90": 6.808670425415039, "max": 10.357437133789062, "pos_frac": 0.640625, "sample": [-3.2848129272460938, -0.44775390625, 7.0818328857421875, -6.540348052978516, -0.1768627166748047, 3.670757293701172, -0.97515869140625, -1.1245975494384766, 1.1871910095214844, 6.818973541259766, -2.1956329345703125, 2.0687713623046875, -2.2620086669921875, 1.469696044921875, -1.7471923828125, 5.304447174072266, 2.5250930786132812, -0.9741134643554688, -3.777587890625, 6.202728271484375, 1.4419937133789062, 4.648197174072266, -0.039340972900390625, 2.4764442443847656, 2.089191436767578, -0.8411216735839844, -0.9889106750488281, 9.36395263671875, -2.4412002563476562, 2.677104949951172, 3.9838180541992188, 6.784629821777344, 8.506072998046875, -1.7552204132080078, 0.8241806030273438, 4.288665771484375, -6.5787811279296875, 3.68316650390625, 0.563751220703125, 0.6139392852783203, 5.594276428222656, 0.7949008941650391, 0.9074077606201172, 2.04144287109375, 3.298421859741211, 0.63153076171875, 6.394618988037109, -2.285858154296875, 6.848920822143555, 0.24055099487304688, 9.412567138671875, 2.8453903198242188, -4.8251953125, 0.6169929504394531, -3.2595977783203125, 0.07248687744140625, -1.4075698852539062, -0.9536361694335938, -1.087738037109375, 1.114959716796875, 3.076251983642578, 3.4671897888183594, 10.357437133789062, 2.8047866821289062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000103.npy"}
{"epoch": 0.15570672713529857, "step": 104, "batch_size": 64, "mean": 2.2608299255371094, "std": 6.499909400939941, "min": -11.512413024902344, "p10": -4.3815979003906245, "median": 1.0457763671875, "p90": 8.6472713470459, "max": 28.670623779296875, "pos_frac": 0.65625, "sample": [8.72698974609375, 5.5505218505859375, 3.9218101501464844, -1.9531917572021484, -4.748298645019531, 2.81787109375, -1.8643627166748047, 1.0505218505859375, 4.6419830322265625, -1.145050048828125, 0.7809257507324219, -11.512413024902344, 28.670623779296875, 7.779266357421875, -6.2366485595703125, -2.2219696044921875, 13.111724853515625, 2.3633365631103516, -0.9650001525878906, 7.5758056640625, -5.2219085693359375, -10.013198852539062, 6.5589599609375, 8.458145141601562, -0.1443328857421875, -0.6719341278076172, 0.7182884216308594, -7.452415466308594, 5.021757125854492, -0.3439750671386719, 0.3482799530029297, 0.3330860137939453, -0.9984226226806641, 2.8388729095458984, 8.461261749267578, 2.9540576934814453, 1.0410308837890625, -1.0425167083740234, 3.3309478759765625, 2.5075302124023438, 0.786590576171875, -3.0547542572021484, 18.71502685546875, 0.8935642242431641, 4.130710601806641, 10.801315307617188, 3.4715576171875, 0.9211044311523438, -9.828636169433594, 0.30621337890625, 1.8959846496582031, 3.101011276245117, 6.259136199951172, 4.432109832763672, 1.3607177734375, -3.5259628295898438, -0.0499114990234375, -1.497650146484375, 0.5588092803955078, 1.602081298828125, -2.9064865112304688, 5.813915252685547, 12.522552490234375, 14.956146240234375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000104.npy"}
{"epoch": 0.15721844293272866, "step": 105, "batch_size": 64, "mean": 1.8436133861541748, "std": 5.01878023147583, "min": -8.04791259765625, "p10": -4.5366798400878885, "median": 1.2180900573730469, "p90": 7.871994781494142, "max": 18.336563110351562, "pos_frac": 0.625, "sample": [1.7208499908447266, 5.158073425292969, 2.3809661865234375, 3.450939178466797, 1.7475395202636719, 4.8929443359375, -7.8137359619140625, 10.112396240234375, -2.847503662109375, 2.3247509002685547, -0.3097686767578125, 4.244913101196289, -1.613861083984375, 8.5106201171875, 0.4626655578613281, 8.028556823730469, -0.9166431427001953, -5.8321533203125, -0.7004318237304688, 3.77447509765625, 2.8338775634765625, -1.7773818969726562, 1.2404251098632812, -5.241706848144531, 0.576263427734375, 4.4529571533203125, 0.9008636474609375, -5.2631988525390625, 3.2639923095703125, -6.8191375732421875, -2.1082992553710938, 0.6884860992431641, -2.2143478393554688, 1.1957550048828125, -8.04791259765625, 18.336563110351562, 1.2981395721435547, -0.699005126953125, 2.366973876953125, -0.04113578796386719, -0.2532081604003906, -1.1675796508789062, 0.43907928466796875, 2.0716171264648438, -5.4119720458984375, 5.155277252197266, 5.71331787109375, -0.9014205932617188, 6.7514801025390625, -2.3064651489257812, -2.8916168212890625, 2.3191184997558594, 1.0974979400634766, 11.0118408203125, 14.887771606445312, 0.6825637817382812, 7.491035461425781, 6.177764892578125, -0.293426513671875, 2.4172286987304688, -0.4249267578125, 10.619712829589844, 5.582122802734375, 7.506683349609375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000105.npy"}
{"epoch": 0.15873015873015872, "step": 106, "batch_size": 64, "mean": 1.566506266593933, "std": 5.349547386169434, "min": -16.633224487304688, "p10": -5.354207992553709, "median": 1.3381147384643555, "p90": 6.087954330444337, "max": 15.825881958007812, "pos_frac": 0.6875, "sample": [-6.112419128417969, 5.550285339355469, 3.447458267211914, 5.533943176269531, -6.9265594482421875, -0.8636054992675781, 8.92593002319336, 3.036436080932617, 0.9167442321777344, 1.0068702697753906, 4.101936340332031, 6.142910003662109, 11.021575927734375, 3.3297119140625, 0.5205612182617188, -3.5850486755371094, 1.653900146484375, 5.959724426269531, 3.6633834838867188, 0.952484130859375, -0.7211341857910156, 0.0883026123046875, -7.79638671875, 2.7551422119140625, -0.8979921340942383, -0.6413116455078125, -0.31682395935058594, -3.4285125732421875, 1.282501220703125, 15.825881958007812, -0.7383651733398438, -0.043140411376953125, -2.2850570678710938, -2.93634033203125, 5.6389007568359375, 5.051967620849609, 0.7487888336181641, -1.1916389465332031, 5.8823089599609375, 15.39837646484375, 4.309516906738281, 2.767772674560547, 4.670186996459961, 0.9621658325195312, -6.370185852050781, 1.1456680297851562, 1.93804931640625, 1.745361328125, -16.633224487304688, 1.348623275756836, -9.565277099609375, 1.327606201171875, 5.160240173339844, 4.331964492797852, 5.6501007080078125, 8.171585083007812, 0.8060569763183594, 2.7316322326660156, -0.82196044921875, 10.526229858398438, -8.073333740234375, 1.4971542358398438, 0.330718994140625, 2.3480587005615234], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000106.npy"}
{"epoch": 0.1602418745275888, "step": 107, "batch_size": 64, "mean": 0.8254714012145996, "std": 6.356660842895508, "min": -18.227783203125, "p10": -5.065654754638672, "median": 1.2518959045410156, "p90": 7.191637420654298, "max": 19.414825439453125, "pos_frac": 0.640625, "sample": [1.258453369140625, -5.0885772705078125, 2.5211944580078125, -0.03546905517578125, 6.8830108642578125, 4.10528564453125, -16.736541748046875, 6.696144104003906, -0.610687255859375, 1.77398681640625, -18.227783203125, 0.00911712646484375, 1.7089576721191406, 1.2453384399414062, 3.6778316497802734, -3.9060211181640625, -3.668914794921875, 5.0133056640625, 3.547384262084961, 0.5108489990234375, -1.5572891235351562, 11.076446533203125, 9.134712219238281, 0.6464443206787109, 4.584682464599609, -10.134498596191406, 4.268962860107422, -2.82965087890625, -0.7483749389648438, 1.143280029296875, 2.913951873779297, -7.189258575439453, 1.8918914794921875, 5.779850006103516, 0.6206436157226562, 5.4293060302734375, 6.1202850341796875, 1.23797607421875, 3.02899169921875, 9.4805908203125, 11.92279052734375, 7.323905944824219, -4.993072509765625, -2.0639190673828125, 3.2243690490722656, 0.8636627197265625, -0.4146690368652344, 2.7568206787109375, 2.7034912109375, -11.59234619140625, 8.92083740234375, 3.2611770629882812, -2.414937973022461, -3.7877769470214844, -0.1308002471923828, 0.6866302490234375, 2.652618408203125, 19.414825439453125, 2.9449405670166016, -3.1371383666992188, -5.012168884277344, -14.836990356445312, 2.2702102661132812, -3.3080978393554688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000107.npy"}
{"epoch": 0.1617535903250189, "step": 108, "batch_size": 64, "mean": 1.8423789739608765, "std": 5.410542964935303, "min": -11.61883544921875, "p10": -4.863829803466796, "median": 1.6969184875488281, "p90": 8.13376083374024, "max": 18.505386352539062, "pos_frac": 0.609375, "sample": [2.9743881225585938, 5.2038116455078125, 10.041107177734375, -1.3420791625976562, -0.010894775390625, 5.287017822265625, 3.998809814453125, 3.837047576904297, 6.379486083984375, -5.0064849853515625, 2.4812850952148438, 6.1461944580078125, 3.1256866455078125, -1.2563228607177734, 5.7983551025390625, 1.720794677734375, -3.7341690063476562, -0.46116065979003906, 0.6060600280761719, 0.9909896850585938, 6.8558197021484375, -0.7946624755859375, -0.48572540283203125, 5.462127685546875, 1.2331924438476562, -6.3073272705078125, 3.767597198486328, -11.61883544921875, 13.015396118164062, 0.5953445434570312, -9.57568359375, -1.8132858276367188, 6.180877685546875, -0.22574615478515625, 8.681449890136719, 5.214826583862305, 0.07065200805664062, -0.19423294067382812, -6.0968475341796875, 4.66363525390625, -3.8947982788085938, 8.841163635253906, 2.5963897705078125, -1.5310955047607422, -0.6640586853027344, 1.9255180358886719, -1.465963363647461, 2.8333663940429688, 2.4247512817382812, -4.530967712402344, -7.619110107421875, 3.4949417114257812, -5.3877410888671875, 2.076416015625, 11.523750305175781, 0.01720428466796875, 3.08062744140625, 18.505386352539062, 1.6730422973632812, -0.47757720947265625, -0.4453392028808594, 5.554683685302734, -0.9034652709960938, 14.876632690429688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000108.npy"}
{"epoch": 0.16326530612244897, "step": 109, "batch_size": 64, "mean": 3.0105397701263428, "std": 5.512070178985596, "min": -5.3126068115234375, "p10": -1.4720300674438476, "median": 1.4968490600585938, "p90": 7.990365982055665, "max": 24.3323974609375, "pos_frac": 0.703125, "sample": [0.36171722412109375, 13.433235168457031, 16.851593017578125, 0.8328514099121094, 4.23554801940918, 3.959951400756836, 7.162487030029297, -0.37249755859375, 2.2366294860839844, 4.915729522705078, -0.1317901611328125, 1.5184249877929688, 1.9684906005859375, 5.820713043212891, 3.0018157958984375, -5.0174560546875, 2.540546417236328, 0.3330059051513672, -1.2807121276855469, 1.0124740600585938, 1.2315177917480469, 4.958560943603516, -3.5909271240234375, -0.6449441909790039, -0.69561767578125, 24.3323974609375, 0.17311859130859375, -0.48088645935058594, 4.098155975341797, 20.51068115234375, 3.6408653259277344, 1.184906005859375, 1.064788818359375, 0.058246612548828125, 0.4247875213623047, 2.0345401763916016, -1.395792007446289, 5.4964752197265625, -0.9340972900390625, 4.6643524169921875, -1.5047035217285156, 4.243186950683594, 1.4752731323242188, 6.26513671875, -0.6415805816650391, -1.3428802490234375, -0.057018280029296875, 1.6238937377929688, -5.3126068115234375, -3.5801849365234375, 1.0193042755126953, 3.5216140747070312, 5.162895202636719, 7.839988708496094, 8.054813385009766, 14.572105407714844, 11.487445831298828, 1.3658447265625, 5.780204772949219, 3.5308914184570312, -3.0957717895507812, -0.7758445739746094, -2.3874359130859375, 5.916084289550781], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000109.npy"}
{"epoch": 0.16477702191987906, "step": 110, "batch_size": 64, "mean": 0.2910418212413788, "std": 5.4693522453308105, "min": -20.43829345703125, "p10": -5.43427734375, "median": 0.9285697937011719, "p90": 6.649275970458985, "max": 13.08123779296875, "pos_frac": 0.578125, "sample": [1.9956645965576172, -0.0718841552734375, -3.4884796142578125, 1.1264686584472656, -0.6847686767578125, 6.75128173828125, 3.7295913696289062, -1.2385673522949219, -9.94256591796875, 1.585845947265625, 5.3162689208984375, 3.134326934814453, 1.0033950805664062, 6.879302978515625, 0.868896484375, -2.708831787109375, 0.4394187927246094, 10.636505126953125, -0.9446010589599609, -4.063409805297852, 2.5569801330566406, -2.1028213500976562, 1.9847393035888672, 6.411262512207031, 0.4232921600341797, 2.6192703247070312, 1.0924568176269531, 1.3510093688964844, 0.97271728515625, -5.59625244140625, -3.1960525512695312, 2.0228271484375, 2.571514129638672, -1.7513885498046875, -5.681632995605469, 0.8844223022460938, -5.05633544921875, 5.51715087890625, -1.6298904418945312, 1.716989517211914, 8.409408569335938, 4.5489349365234375, -1.2076339721679688, -3.0935516357421875, -1.2955665588378906, 2.9225387573242188, 4.598701477050781, -7.090309143066406, -8.379241943359375, -16.87884521484375, -0.62994384765625, 1.3384552001953125, -0.4693450927734375, -0.5814743041992188, -0.8644866943359375, 13.08123779296875, 8.701702117919922, 3.7361698150634766, 1.9291000366210938, 7.3538665771484375, -4.107337951660156, 1.1368331909179688, 0.47164154052734375, -20.43829345703125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000110.npy"}
{"epoch": 0.16628873771730915, "step": 111, "batch_size": 64, "mean": 1.9606516361236572, "std": 5.230652332305908, "min": -7.075584411621094, "p10": -3.5555179595947264, "median": 1.1118583679199219, "p90": 9.360696411132814, "max": 19.257415771484375, "pos_frac": 0.59375, "sample": [-0.8889827728271484, -2.867155075073242, -0.15996551513671875, 11.630477905273438, 2.20550537109375, 7.78607177734375, -1.3695487976074219, 3.768402099609375, 2.482372283935547, 0.47150230407714844, 5.351318359375, -1.3642730712890625, -4.134552001953125, 2.1106491088867188, -7.075584411621094, 0.007099151611328125, -0.30426788330078125, 1.2762413024902344, -2.21112060546875, 2.688901901245117, 9.457275390625, 5.8134613037109375, 11.163551330566406, 9.763797760009766, 12.832046508789062, 4.126981735229492, 3.899169921875, 6.139122009277344, 2.2288589477539062, -6.891029357910156, -0.6023426055908203, -2.726409912109375, -4.42974853515625, -5.9365386962890625, -0.39110565185546875, -2.5959854125976562, 0.3945808410644531, 5.015960693359375, -2.3815231323242188, 5.792488098144531, 9.135345458984375, 8.923927307128906, 9.572723388671875, -1.5108051300048828, -2.002941131591797, -0.4491844177246094, 3.107748031616211, 9.121803283691406, -2.6424026489257812, 1.0570564270019531, 5.891700744628906, 3.596485137939453, 2.9918270111083984, -3.1765499114990234, -6.052742004394531, 0.4237174987792969, -3.4384231567382812, 3.4485015869140625, -3.605701446533203, 1.1666603088378906, 19.257415771484375, -2.1461143493652344, 2.248615264892578, 0.487335205078125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000111.npy"}
{"epoch": 0.16780045351473924, "step": 112, "batch_size": 64, "mean": 2.4020986557006836, "std": 4.750843048095703, "min": -10.050079345703125, "p10": -2.4799850463867186, "median": 2.45709228515625, "p90": 8.358224868774416, "max": 13.028045654296875, "pos_frac": 0.71875, "sample": [11.979034423828125, 7.2603912353515625, 8.884445190429688, 3.7505226135253906, -0.09293746948242188, 1.795318603515625, 12.577301025390625, 3.4027328491210938, 4.7972259521484375, 2.9021530151367188, 0.36798095703125, 6.3548583984375, -5.522129058837891, 2.660989761352539, 1.2269554138183594, -7.1044464111328125, -1.8939971923828125, 3.9900283813476562, -4.854667663574219, -0.38704681396484375, 3.5461883544921875, -2.48602294921875, 3.400196075439453, 5.819694519042969, 2.4276123046875, 3.080974578857422, 5.5213775634765625, 1.7362785339355469, 13.028045654296875, 3.098917007446289, 7.925811767578125, 1.7533111572265625, -2.4658966064453125, 1.0411415100097656, 6.041164398193359, -0.7560577392578125, 2.762186050415039, -5.846839904785156, 8.54354476928711, 2.7981185913085938, -1.5537147521972656, 2.4498291015625, 4.952968597412109, 2.4435195922851562, 2.6072235107421875, 10.992843627929688, 5.8401947021484375, 5.5641326904296875, -0.8995227813720703, 0.7817153930664062, -8.489509582519531, 2.46435546875, -0.5898895263671875, 1.7025909423828125, 7.742767333984375, 1.080810546875, 1.7764625549316406, -2.400594711303711, -1.6298599243164062, 4.908477783203125, -1.0335502624511719, -10.050079345703125, 1.7448501586914062, 10.263832092285156], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000112.npy"}
{"epoch": 0.1693121693121693, "step": 113, "batch_size": 64, "mean": 1.4066444635391235, "std": 4.714938640594482, "min": -15.18359375, "p10": -3.407194709777832, "median": 1.099069595336914, "p90": 6.371261978149414, "max": 14.534820556640625, "pos_frac": 0.640625, "sample": [-2.2399444580078125, -5.851799011230469, -2.3242111206054688, 1.2501258850097656, -0.32122039794921875, 6.290637969970703, 2.321674346923828, -8.847610473632812, 0.29781341552734375, 3.922639846801758, 1.7730636596679688, -1.0834426879882812, -3.4491653442382812, 2.6984100341796875, -3.309263229370117, -0.6308135986328125, 1.72467041015625, 8.967105865478516, -0.49056243896484375, -3.6969871520996094, -2.4092178344726562, 3.44317626953125, 6.405815124511719, -5.4838104248046875, -0.6527843475341797, 3.4154815673828125, 4.761466979980469, 3.0062408447265625, 7.194080352783203, -0.5327777862548828, 4.01708984375, 14.534820556640625, 0.6246757507324219, -0.7871017456054688, 4.006252288818359, 7.001516342163086, -15.18359375, 2.7408447265625, 5.3853759765625, -1.3495407104492188, 1.5095977783203125, 5.490468978881836, -4.0778045654296875, 5.858421325683594, -3.014446258544922, 9.601119995117188, 0.9480133056640625, 0.16045570373535156, -0.1609172821044922, 4.816877365112305, 2.886383056640625, -3.0363006591796875, 1.698659896850586, 11.890243530273438, 5.6781158447265625, 0.017772674560546875, 3.1080398559570312, 0.3910026550292969, 5.199806213378906, 0.3367767333984375, 0.3966789245605469, 0.3686943054199219, -3.127826690673828, 5.946281433105469], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000113.npy"}
{"epoch": 0.1708238851095994, "step": 114, "batch_size": 64, "mean": 2.358133554458618, "std": 4.746648788452148, "min": -5.712791442871094, "p10": -3.7208007812499995, "median": 1.5808486938476562, "p90": 8.202000999450688, "max": 16.391204833984375, "pos_frac": 0.6875, "sample": [-5.712791442871094, -4.2903289794921875, 0.8138236999511719, 6.5187225341796875, 4.598409652709961, 1.74346923828125, 3.6947288513183594, 7.3070220947265625, 4.471256256103516, -2.284452438354492, -3.99102783203125, 1.1445655822753906, 3.135974884033203, -3.09027099609375, -0.6245880126953125, 1.6906890869140625, 3.3010406494140625, -0.35384368896484375, 0.3764610290527344, -0.7427902221679688, 2.847320556640625, -2.5929031372070312, 16.391204833984375, -3.054607391357422, 0.6244029998779297, 7.104705810546875, 4.357732772827148, 5.375139236450195, 1.47100830078125, 6.361072540283203, -1.738992691040039, 1.4450416564941406, 8.554244995117188, 3.3693084716796875, 2.6667404174804688, 1.093719482421875, 6.8627777099609375, 1.8098297119140625, 5.3213043212890625, -4.976589202880859, 15.781219482421875, 7.380098342895508, 6.1842041015625, 0.09368324279785156, 4.661285400390625, 1.1241817474365234, 11.464813232421875, -2.5864944458007812, -1.832489013671875, -4.111537933349609, 1.1473770141601562, -1.683746337890625, 3.6420459747314453, 9.6444091796875, 9.300769805908203, 10.2513427734375, 1.7248268127441406, -0.5422840118408203, -4.363700866699219, -3.9977569580078125, 5.671665191650391, -1.3219757080078125, 1.0751152038574219, 1.2149677276611328], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000114.npy"}
{"epoch": 0.17233560090702948, "step": 115, "batch_size": 64, "mean": 1.881299614906311, "std": 4.393327236175537, "min": -7.785133361816406, "p10": -3.06246337890625, "median": 1.51806640625, "p90": 7.088592529296877, "max": 15.656326293945312, "pos_frac": 0.671875, "sample": [-1.6215744018554688, -2.3294830322265625, 4.292121887207031, -6.120513916015625, 15.656326293945312, 2.218740463256836, 3.5485000610351562, 4.898921966552734, 0.626434326171875, 13.1319580078125, 1.8072547912597656, -1.4603958129882812, 9.728073120117188, -5.754234313964844, -2.99957275390625, -3.08941650390625, 0.21584129333496094, 6.0892333984375, 1.4018402099609375, 3.0212173461914062, 4.614715576171875, 3.392822265625, 1.9112319946289062, -0.022182464599609375, -3.340036392211914, 2.0118179321289062, 4.074470520019531, 1.4112701416015625, -7.785133361816406, 7.547145843505859, -3.1225738525390625, -0.284576416015625, 0.3217792510986328, 7.322052001953125, 1.4415359497070312, 4.456062316894531, -0.6329727172851562, 3.6123504638671875, -2.8784255981445312, 2.2606735229492188, 2.993175506591797, -6.205005645751953, 6.1151123046875, 3.7028961181640625, 10.800323486328125, 2.8160228729248047, 1.5945968627929688, 1.8451404571533203, -1.2454376220703125, 5.603546142578125, 0.9645538330078125, 0.2561798095703125, 6.543853759765625, -0.16818618774414062, 0.5165634155273438, -0.930328369140625, -1.9560317993164062, -0.869110107421875, 0.49190521240234375, 1.2621498107910156, 4.165252685546875, -0.021823883056640625, 2.4870128631591797, 10.067512512207031], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000115.npy"}
{"epoch": 0.17384731670445955, "step": 116, "batch_size": 64, "mean": 2.8025131225585938, "std": 4.069322109222412, "min": -5.450340270996094, "p10": -2.13831729888916, "median": 3.5070133209228516, "p90": 7.279079818725587, "max": 12.69970703125, "pos_frac": 0.703125, "sample": [6.6454010009765625, -2.5454940795898438, 0.2593269348144531, 8.671035766601562, 5.3148193359375, 4.330802917480469, 3.715179443359375, 1.4562606811523438, 5.078977584838867, 7.4360504150390625, 6.16925048828125, -2.486917495727539, -5.01263427734375, 10.281402587890625, 0.21460533142089844, -1.3070964813232422, 0.4779510498046875, -0.0029315948486328125, 3.7332687377929688, 5.6518096923828125, -5.1255340576171875, 2.0819854736328125, 5.470684051513672, 3.5780982971191406, -0.4381256103515625, 6.0352020263671875, 5.453971862792969, 6.912815093994141, 5.568572998046875, 1.3217430114746094, 2.1064491271972656, -1.126211166381836, -2.2284698486328125, 1.0894813537597656, 9.870529174804688, -0.0909423828125, 4.066764831542969, 5.365898132324219, -0.515289306640625, -1.5241432189941406, 5.246326446533203, 4.263420104980469, 3.9542808532714844, 0.8673629760742188, 12.69970703125, 4.3557891845703125, 11.464057922363281, -3.6113739013671875, 0.8189468383789062, 3.4359283447265625, -0.37553977966308594, 6.306327819824219, 0.827911376953125, -5.450340270996094, -0.6723899841308594, 5.7000885009765625, -1.6074867248535156, 4.367279052734375, 1.3748588562011719, 9.360549926757812, 5.827690124511719, -0.20132827758789062, 6.382198333740234, -1.9279613494873047], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000116.npy"}
{"epoch": 0.17535903250188964, "step": 117, "batch_size": 64, "mean": 1.8969154357910156, "std": 4.618406772613525, "min": -11.791107177734375, "p10": -3.006532287597656, "median": 1.8311691284179688, "p90": 7.128590393066409, "max": 16.362716674804688, "pos_frac": 0.703125, "sample": [-0.19652557373046875, 5.059175491333008, 3.6823883056640625, 0.8620948791503906, 9.417770385742188, 2.9703330993652344, 5.325220108032227, 1.8121604919433594, -2.19073486328125, 4.126016616821289, 0.778045654296875, -1.682464599609375, 6.4229736328125, -3.926361083984375, 0.8458938598632812, 3.40570068359375, -3.1808624267578125, 2.481781005859375, 2.649059295654297, 5.557136535644531, 2.2622222900390625, 3.559661865234375, 4.809383392333984, 3.0459823608398438, -1.5807209014892578, -4.959308624267578, 1.755523681640625, -0.34328460693359375, 3.421548843383789, 10.657058715820312, -2.550769805908203, 9.625907897949219, 7.393013000488281, 0.1521320343017578, 8.245315551757812, 1.8501777648925781, 4.131067276000977, 9.163925170898438, 0.5781402587890625, 6.511604309082031, 2.3432159423828125, -0.5893898010253906, 1.6703109741210938, 3.4722061157226562, -1.940216064453125, -10.675071716308594, 1.5807723999023438, 1.1129684448242188, 3.1591033935546875, 4.6946868896484375, 1.0571784973144531, -0.4485626220703125, 5.793212890625, -2.599761962890625, 16.362716674804688, -6.293785095214844, -3.73272705078125, 1.9264335632324219, -11.791107177734375, -0.7157821655273438, 4.744903564453125, 0.07511138916015625, 0.42575645446777344, -0.17696762084960938], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000117.npy"}
{"epoch": 0.17687074829931973, "step": 118, "batch_size": 64, "mean": 2.6522059440612793, "std": 3.899268865585327, "min": -7.428834915161133, "p10": -1.9792533874511717, "median": 2.5516090393066406, "p90": 8.018861389160156, "max": 10.836532592773438, "pos_frac": 0.734375, "sample": [2.5702285766601562, 1.8354568481445312, 5.161407470703125, -3.5280609130859375, -7.428834915161133, -1.804962158203125, 5.0097808837890625, 2.532989501953125, 4.776029586791992, -1.4880638122558594, 2.9272842407226562, 3.5890655517578125, 1.062479019165039, 7.94427490234375, 0.8300704956054688, 0.3822784423828125, -1.857696533203125, 5.419624328613281, 0.3218116760253906, 0.9156932830810547, -1.437255859375, 8.050827026367188, 3.540090560913086, 4.8530731201171875, 3.673828125, -2.018962860107422, 0.1805572509765625, -1.7752838134765625, 4.733800888061523, -0.2473602294921875, 4.133575439453125, 2.7867774963378906, 8.98944091796875, -2.094390869140625, 1.2294921875, 4.6872100830078125, 2.1721572875976562, 10.836532592773438, 9.1575927734375, -2.806060791015625, -1.9202747344970703, 7.7764434814453125, 0.262908935546875, 5.797515869140625, -1.064727783203125, 7.878990173339844, -1.6964950561523438, 8.990005493164062, 4.197990417480469, 8.450645446777344, 4.606269836425781, 0.6696281433105469, 3.9036788940429688, 9.843414306640625, 6.665699005126953, 1.622610092163086, -2.17138671875, 4.060834884643555, 6.029693603515625, 7.718137741088867, 2.5210418701171875, -2.0045299530029297, 1.1460590362548828, -1.3594589233398438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000118.npy"}
{"epoch": 0.17838246409674982, "step": 119, "batch_size": 64, "mean": 1.855197787284851, "std": 5.4828386306762695, "min": -12.885147094726562, "p10": -4.022219085693359, "median": 1.6185417175292969, "p90": 8.88108139038086, "max": 22.010345458984375, "pos_frac": 0.65625, "sample": [-1.7078895568847656, 4.51312255859375, -3.702932357788086, 0.34133148193359375, -4.2342529296875, -0.5209426879882812, 4.775970458984375, 3.778411865234375, -1.2998275756835938, 0.9361076354980469, -4.0825042724609375, -0.83807373046875, 1.9151458740234375, 9.725410461425781, 7.132541656494141, -12.885147094726562, 0.2974205017089844, -3.8815536499023438, 1.1303958892822266, 3.655181884765625, -1.4449501037597656, 5.977237701416016, -2.26629638671875, 2.766366958618164, 9.123764038085938, 4.653411865234375, -4.670536041259766, 2.4263267517089844, 9.047943115234375, 11.558334350585938, 6.642852783203125, -1.8767623901367188, 3.4440250396728516, 4.315315246582031, 1.3187103271484375, 22.010345458984375, 2.232664108276367, -0.39281463623046875, 0.8683662414550781, 8.491737365722656, 2.639575958251953, 7.6776885986328125, -2.4618358612060547, 0.292449951171875, -6.231292724609375, -5.078424453735352, 10.629135131835938, 5.310142517089844, 2.5239715576171875, 3.9573211669921875, 1.6061248779296875, -12.24334716796875, -3.8361663818359375, -1.2389907836914062, 1.2071113586425781, 1.6309585571289062, 4.377721786499023, 10.512825012207031, 5.4333648681640625, 1.256927490234375, 2.372232437133789, 1.7358474731445312, -1.7309379577636719, -0.8837051391601562], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000119.npy"}
{"epoch": 0.17989417989417988, "step": 120, "batch_size": 64, "mean": 2.7568416595458984, "std": 3.9008655548095703, "min": -4.665624618530273, "p10": -1.6384010314941404, "median": 2.0036163330078125, "p90": 6.9586528778076175, "max": 14.089866638183594, "pos_frac": 0.765625, "sample": [1.922760009765625, 0.6565303802490234, -3.0944595336914062, 6.766490936279297, 4.2167816162109375, 0.38626861572265625, 5.432258605957031, 0.9071121215820312, 1.0901947021484375, -4.102996826171875, 8.09075927734375, 1.6688385009765625, 3.3654937744140625, 3.939472198486328, 4.747169494628906, 13.14996337890625, 3.6567306518554688, -0.4821891784667969, 2.1262588500976562, 5.1775054931640625, 6.257606506347656, -1.4690780639648438, 4.632118225097656, 3.0515899658203125, 1.1440277099609375, 1.5280933380126953, 7.041007995605469, 0.7775192260742188, 5.300506591796875, 5.029136657714844, 14.089866638183594, -2.733856201171875, 8.785148620605469, -1.2038707733154297, 8.605094909667969, 0.2593803405761719, 1.9862823486328125, 0.7184410095214844, 6.000083923339844, -1.710968017578125, -0.23799514770507812, 2.2284812927246094, 1.304656982421875, 12.633163452148438, -1.3737850189208984, -1.8997573852539062, -4.665624618530273, 6.72528076171875, 4.358642578125, 1.2999801635742188, -0.16786956787109375, 0.23248672485351562, 6.13629150390625, 0.90625, 3.6430416107177734, -0.022613525390625, 2.0209503173828125, 4.302757263183594, -1.0681438446044922, -3.338603973388672, 5.2256011962890625, 1.9617252349853516, 5.65814208984375, 2.8657302856445312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000120.npy"}
{"epoch": 0.18140589569160998, "step": 121, "batch_size": 64, "mean": 3.0023269653320312, "std": 5.044544696807861, "min": -4.341880798339844, "p10": -1.8286594390869138, "median": 1.7056264877319336, "p90": 9.061322021484377, "max": 22.77716064453125, "pos_frac": 0.78125, "sample": [1.0041656494140625, 13.521697998046875, 0.6464080810546875, -2.259828567504883, 5.913045883178711, 13.996917724609375, 7.838520050048828, -2.170686721801758, 2.479816436767578, 5.204093933105469, -0.7542190551757812, -1.9346466064453125, 12.480606079101562, 3.458864212036133, -0.75579833984375, 0.30850791931152344, 4.67869758605957, 0.05770111083984375, 5.362335205078125, -4.341880798339844, 9.380401611328125, -0.028347015380859375, 6.8450927734375, 3.011993408203125, 4.436614990234375, 2.5429115295410156, 2.14324951171875, -2.3047561645507812, 2.9319915771484375, 1.4480781555175781, -0.6165084838867188, 22.77716064453125, 2.255645751953125, 2.8811569213867188, 0.8693656921386719, 0.4542083740234375, -0.894378662109375, 1.0216598510742188, 2.6569976806640625, 0.16479873657226562, -0.10993385314941406, 0.2731494903564453, 3.5321502685546875, 5.632362365722656, 0.96893310546875, 1.8781967163085938, -3.493968963623047, 16.522186279296875, 0.132171630859375, 2.7489147186279297, 1.9631576538085938, 1.4479179382324219, 2.2051315307617188, 4.221654891967773, 1.0555667877197266, 15.719146728515625, 1.5330562591552734, -1.5813560485839844, 0.8735408782958984, 0.2986183166503906, 0.2528839111328125, 3.4145164489746094, 8.316802978515625, -2.367521286010742], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000121.npy"}
{"epoch": 0.18291761148904007, "step": 122, "batch_size": 64, "mean": 2.6941189765930176, "std": 5.781430244445801, "min": -7.155242919921875, "p10": -3.797168731689453, "median": 1.573481559753418, "p90": 8.907952880859378, "max": 23.15814208984375, "pos_frac": 0.671875, "sample": [23.15814208984375, -3.8298873901367188, 0.4110107421875, 6.850135803222656, -3.7208251953125, -2.4574127197265625, 13.79302978515625, 9.335838317871094, -0.9898815155029297, 6.126468658447266, 4.441028594970703, 1.2953643798828125, 0.4249706268310547, 2.2829132080078125, 0.6897449493408203, 17.4468994140625, -2.350341796875, 2.5553207397460938, -4.603084564208984, 6.34904670715332, 0.9170417785644531, 11.248014450073242, 4.9364776611328125, 5.766279220581055, 7.909553527832031, 6.0111083984375, -2.0336666107177734, -3.9035911560058594, 4.8553466796875, 4.855171203613281, -7.155242919921875, 4.871253967285156, -0.5153007507324219, -2.7072601318359375, -4.6595458984375, 3.9858970642089844, 2.004487991333008, -1.4098052978515625, 3.061037063598633, -0.16342926025390625, -0.22272109985351562, -0.2238922119140625, -4.109691619873047, 11.66229248046875, -3.000396728515625, 5.704011917114258, 1.94354248046875, 1.9468345642089844, 0.13542556762695312, 1.6616058349609375, 1.4675407409667969, 1.4853572845458984, 0.2430591583251953, 1.2020072937011719, -4.002597808837891, 2.899097442626953, 0.30681610107421875, -0.8252105712890625, -0.585235595703125, 6.291156768798828, 3.6631317138671875, 4.778171539306641, 3.6206817626953125, 21.300308227539062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000122.npy"}
{"epoch": 0.18442932728647016, "step": 123, "batch_size": 64, "mean": 2.5633208751678467, "std": 4.464794158935547, "min": -8.115966796875, "p10": -1.9222602844238281, "median": 1.958268165588379, "p90": 7.765492630004884, "max": 19.42413330078125, "pos_frac": 0.6875, "sample": [7.454166412353516, 6.860679626464844, 2.7061996459960938, 19.42413330078125, -4.412435531616211, -1.5445613861083984, 2.4464111328125, 1.2270278930664062, 11.133541107177734, 5.2747344970703125, 1.7206573486328125, 9.05865478515625, -0.6238479614257812, -1.456787109375, 5.609855651855469, 5.7156829833984375, 2.844026565551758, 4.0746002197265625, -1.2517166137695312, 7.898918151855469, 1.7795028686523438, 6.754310607910156, 1.398590087890625, 7.281894683837891, 4.037055969238281, 10.979110717773438, 2.818399429321289, -1.8712577819824219, 6.682952880859375, 1.9480056762695312, 1.2431392669677734, 10.427993774414062, -1.8369522094726562, 6.364297866821289, -0.8589820861816406, -5.310333251953125, 1.6012954711914062, -2.526155471801758, -0.2056121826171875, 1.7713489532470703, 1.6757125854492188, -1.9441184997558594, 2.1338882446289062, -0.73895263671875, 1.9685306549072266, 6.09637451171875, -0.40302467346191406, -1.6552162170410156, 1.0770511627197266, 5.394062042236328, -1.772125244140625, 2.3064498901367188, 1.1865005493164062, 3.2743148803710938, -8.115966796875, 8.059906005859375, -0.6719341278076172, 2.456218719482422, -1.9738006591796875, 2.560302734375, 1.4266529083251953, 2.4405784606933594, 4.934417724609375, -2.3018264770507812], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000123.npy"}
{"epoch": 0.18594104308390022, "step": 124, "batch_size": 64, "mean": 3.63966703414917, "std": 4.9041666984558105, "min": -4.986686706542969, "p10": -2.5703048706054688, "median": 3.3811111450195312, "p90": 10.56845626831055, "max": 16.713638305664062, "pos_frac": 0.78125, "sample": [-1.2116355895996094, 16.713638305664062, 5.568595886230469, -4.277107238769531, 4.085039138793945, 4.534095764160156, 15.85015869140625, 5.479885101318359, 13.111106872558594, 10.953857421875, 7.750816345214844, 7.150348663330078, 1.4031333923339844, 13.566207885742188, 0.8901710510253906, 1.9760990142822266, 2.2380599975585938, 6.14695930480957, 1.4890975952148438, -2.0424747467041016, -4.621185302734375, 3.5144805908203125, 3.5519332885742188, 2.0377197265625, 2.3255958557128906, 4.644145965576172, 1.6239604949951172, 0.1042938232421875, -4.31036376953125, -2.677471160888672, 4.302646636962891, 5.412872314453125, -2.6386032104492188, 9.953910827636719, 4.816738128662109, 7.891326904296875, 12.539421081542969, -3.192169189453125, 6.822540283203125, 3.24774169921875, 0.42604827880859375, 9.085197448730469, -0.8960800170898438, 4.757423400878906, 3.087186813354492, 3.8470001220703125, 1.7980117797851562, 6.655708312988281, -2.4109420776367188, 3.7590198516845703, 10.831832885742188, 1.2276649475097656, 0.2772655487060547, 3.160083770751953, 3.081451416015625, 3.6275100708007812, 3.119049072265625, -2.1357421875, -4.986686706542969, 5.066680908203125, -0.4926033020019531, 5.085163116455078, -0.9459381103515625, 9.188796997070312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000124.npy"}
{"epoch": 0.1874527588813303, "step": 125, "batch_size": 64, "mean": 3.129549980163574, "std": 5.137041091918945, "min": -6.750907897949219, "p10": -3.1246097564697264, "median": 2.9683895111083984, "p90": 8.890753173828127, "max": 20.868438720703125, "pos_frac": 0.734375, "sample": [-0.848052978515625, 7.3480072021484375, 6.583492279052734, 3.120147705078125, 0.7383193969726562, -1.6304988861083984, 11.79086685180664, 0.4850044250488281, 7.08343505859375, 3.347127914428711, 20.868438720703125, -4.06005859375, 1.0329360961914062, 3.274749755859375, 2.1235828399658203, 9.034133911132812, 9.220535278320312, 4.449737548828125, 3.0571746826171875, -1.1510848999023438, 8.556198120117188, 0.5610618591308594, 3.4102096557617188, 3.9045677185058594, -1.3481979370117188, 1.9399604797363281, 8.033210754394531, -3.8312110900878906, -1.8049468994140625, -0.3392677307128906, 7.901702880859375, 4.311912536621094, 0.964080810546875, 12.777923583984375, 6.244382858276367, -3.2557716369628906, 5.565135955810547, 1.5703506469726562, 11.667373657226562, 16.916580200195312, 6.40789794921875, 3.1601638793945312, 6.7113494873046875, 5.322422027587891, 2.938457489013672, -6.750907897949219, -0.0004119873046875, 8.074169158935547, 0.4857044219970703, 0.9729156494140625, 2.998321533203125, 3.250387191772461, 4.217430114746094, 2.5609397888183594, -2.3048934936523438, 4.576381683349609, 1.1438560485839844, -2.8185653686523438, 1.3207778930664062, -0.4187793731689453, 2.378255844116211, -3.58856201171875, -6.581626892089844, -3.3777313232421875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000125.npy"}
{"epoch": 0.1889644746787604, "step": 126, "batch_size": 64, "mean": 1.7365506887435913, "std": 4.744752883911133, "min": -10.313629150390625, "p10": -3.07413330078125, "median": 1.8815803527832031, "p90": 6.771679687500002, "max": 16.61773681640625, "pos_frac": 0.625, "sample": [3.3639163970947266, 0.5580883026123047, 6.310518264770508, 2.740030288696289, 0.11198997497558594, 6.4551544189453125, 6.31768798828125, -0.09148216247558594, 3.3074722290039062, -2.194581985473633, 5.996620178222656, 6.9073333740234375, -9.267875671386719, 0.6896514892578125, 6.161521911621094, 1.8942794799804688, -2.4584503173828125, -2.084308624267578, -0.09173583984375, -3.957305908203125, 2.4285888671875, 6.90771484375, 1.4071598052978516, 3.177340507507324, -1.2987861633300781, -0.23414993286132812, 5.095481872558594, -5.726898193359375, -2.3159332275390625, -1.9224414825439453, 0.21484756469726562, -3.0123748779296875, -2.7562713623046875, 3.255155563354492, -4.24589729309082, -1.3011245727539062, -2.527738571166992, 16.61773681640625, 4.583696365356445, -10.313629150390625, 4.365959167480469, 4.0224609375, 4.032255172729492, 5.141170501708984, 4.8014373779296875, 4.866539001464844, 7.3500823974609375, 13.341781616210938, -0.7019844055175781, 1.3651580810546875, 0.8982505798339844, 8.366592407226562, 3.3624229431152344, 3.7156944274902344, 1.8688812255859375, -1.314849853515625, -3.1006011962890625, 5.586368560791016, -2.2282562255859375, -1.6046829223632812, 2.7618560791015625, 9.235763549804688, -6.28172492980957, 2.5876617431640625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000126.npy"}
{"epoch": 0.19047619047619047, "step": 127, "batch_size": 64, "mean": 3.3200788497924805, "std": 4.390384674072266, "min": -6.6210479736328125, "p10": -1.951674652099609, "median": 2.821125030517578, "p90": 8.817316818237305, "max": 14.599227905273438, "pos_frac": 0.765625, "sample": [-1.2306900024414062, 14.33642578125, 7.857139587402344, 3.5074424743652344, -2.39251708984375, 8.856674194335938, 3.2835006713867188, 8.38936996459961, 1.6495513916015625, 11.258552551269531, 6.955436706542969, 3.4314804077148438, 4.507728576660156, 8.283624649047852, 1.3980560302734375, 1.4307708740234375, 5.927604675292969, 4.259380340576172, 1.4893875122070312, 1.707977294921875, 7.275489807128906, -6.6210479736328125, 4.4807586669921875, 7.179786682128906, 3.7832183837890625, -5.360221862792969, 4.0139923095703125, 1.2500762939453125, 2.7355499267578125, 1.3502120971679688, 0.3373908996582031, 3.8378334045410156, -0.42017555236816406, 9.026947021484375, -0.2064361572265625, 8.277132034301758, 1.6074066162109375, 8.725482940673828, 9.965587615966797, 14.599227905273438, -2.5597305297851562, 6.009956359863281, 2.3383560180664062, -2.2367725372314453, 2.4335784912109375, 2.9067001342773438, -2.076751708984375, -1.6514129638671875, 6.90765380859375, -3.774932861328125, -0.00202178955078125, 7.446533203125, 2.548809051513672, 1.5775604248046875, 1.0136184692382812, -1.0295906066894531, -1.6598281860351562, 1.7324447631835938, 3.9087982177734375, 3.438800811767578, 4.235553741455078, 0.7535381317138672, 10.111404418945312, -0.63232421875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000127.npy"}
{"epoch": 0.19198790627362056, "step": 128, "batch_size": 64, "mean": 3.62443470954895, "std": 6.564926624298096, "min": -13.166397094726562, "p10": -3.6875238418579093, "median": 3.1215457916259766, "p90": 11.93785858154297, "max": 20.275390625, "pos_frac": 0.765625, "sample": [3.61181640625, 11.647171020507812, 3.0493812561035156, 3.4638633728027344, 3.4885902404785156, 1.56365966796875, 4.697998046875, 12.06243896484375, 1.9070358276367188, 17.0050048828125, 1.6467666625976562, -13.166397094726562, 11.404556274414062, -1.1693992614746094, 4.025627136230469, -7.22955322265625, 0.7877216339111328, -1.3844375610351562, -6.693260192871094, 12.869270324707031, 17.4853515625, -5.028511047363281, -2.586383819580078, 20.275390625, 19.49456787109375, 7.696868896484375, 0.5236053466796875, 10.457176208496094, -1.8739986419677734, 6.8667755126953125, 1.49383544921875, 4.504608154296875, 3.4521026611328125, 8.664321899414062, 7.6187286376953125, 6.4980010986328125, 0.1965484619140625, 9.970043182373047, -6.803018569946289, 1.9068756103515625, -2.374469757080078, 2.6494808197021484, 2.4379730224609375, 0.3715934753417969, -4.159440994262695, 2.1101150512695312, 2.1437301635742188, 3.4204330444335938, 3.289276123046875, -1.960428237915039, 5.03533935546875, 3.1937103271484375, 8.785797119140625, -0.938262939453125, 5.2805023193359375, 0.944061279296875, 1.5442733764648438, 6.798431396484375, 16.564956665039062, -1.7357006072998047, -7.9272918701171875, 2.002124786376953, 5.102592468261719, 4.984283447265625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000128.npy"}
{"epoch": 0.19349962207105065, "step": 129, "batch_size": 64, "mean": 2.0446584224700928, "std": 5.972801685333252, "min": -15.529098510742188, "p10": -4.013859367370605, "median": 2.111879348754883, "p90": 8.527902412414551, "max": 19.557525634765625, "pos_frac": 0.59375, "sample": [-0.6157913208007812, -0.8123073577880859, 5.566173553466797, -5.202232360839844, -0.5151958465576172, 7.429332733154297, -0.1772308349609375, -1.649606704711914, 2.7768325805664062, 9.82070541381836, 7.338645935058594, 2.712615966796875, 6.66827392578125, -2.3689651489257812, -2.8756256103515625, -1.3515167236328125, 7.9372100830078125, 0.4158477783203125, 9.49420166015625, 7.571971893310547, 1.956085205078125, -12.760208129882812, 1.7321929931640625, 4.3866119384765625, -3.0847320556640625, 19.557525634765625, 4.737386703491211, -1.8483314514160156, 3.915647506713867, 4.113250732421875, 5.925117492675781, -6.9238433837890625, -5.7846221923828125, 8.561071395874023, 0.4887046813964844, 7.689788818359375, -7.9018707275390625, -2.0558013916015625, -3.7048282623291016, 9.588443756103516, 3.9746780395507812, -1.5271987915039062, -2.5559158325195312, 1.3265762329101562, 5.016994476318359, 3.719511032104492, -1.40325927734375, 16.10907745361328, 6.601600646972656, 8.12603759765625, 2.8539772033691406, 8.450508117675781, 9.389839172363281, -1.3163490295410156, -1.240072250366211, -3.4062557220458984, 4.196773529052734, -15.529098510742188, 4.722076416015625, 0.8510684967041016, 2.2676734924316406, -4.14630126953125, -0.04589653015136719, 3.671173095703125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000129.npy"}
{"epoch": 0.19501133786848074, "step": 130, "batch_size": 64, "mean": 3.442401647567749, "std": 4.992996692657471, "min": -5.7024993896484375, "p10": -3.8629638671874993, "median": 3.5214004516601562, "p90": 9.430727386474612, "max": 17.220352172851562, "pos_frac": 0.796875, "sample": [1.6367301940917969, 2.305011749267578, 10.638046264648438, 8.985275268554688, 5.5844268798828125, 0.4633026123046875, -2.5783538818359375, 8.181718826293945, 3.991128921508789, 0.8874282836914062, -4.46466064453125, 7.343208312988281, 8.705345153808594, 1.9463138580322266, -5.183967590332031, 7.021629333496094, 2.495290756225586, -4.075828552246094, 0.7554473876953125, 12.056831359863281, 3.3008193969726562, 3.7419815063476562, 6.640663146972656, 8.011322021484375, -2.9265403747558594, -3.3662796020507812, -4.4768524169921875, 1.4580078125, -2.7218780517578125, 0.7241744995117188, 8.793930053710938, 9.621635437011719, 7.691192626953125, 5.244293212890625, 1.8644943237304688, 14.476615905761719, 8.298263549804688, 4.291412353515625, 17.220352172851562, 2.987720489501953, -1.5802803039550781, 3.814441680908203, 0.21004486083984375, -5.352607727050781, 2.0017967224121094, 3.9899349212646484, 5.786705017089844, 0.8285675048828125, 4.7012481689453125, -1.8357009887695312, -5.172506332397461, 2.183076858520508, 5.37432861328125, 5.465124130249023, -5.7024993896484375, 5.725406646728516, 4.723960876464844, 2.5230979919433594, 3.1335906982421875, 11.100341796875, 9.913536071777344, 3.7791881561279297, 1.8622703552246094, 5.2709808349609375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000130.npy"}
{"epoch": 0.1965230536659108, "step": 131, "batch_size": 64, "mean": 4.3565826416015625, "std": 6.462599277496338, "min": -8.005882263183594, "p10": -3.316257858276366, "median": 3.1792144775390625, "p90": 14.03850555419922, "max": 18.434341430664062, "pos_frac": 0.75, "sample": [3.1553878784179688, 7.705331802368164, -0.64556884765625, -1.6645336151123047, -1.9789295196533203, 18.434341430664062, 3.5982894897460938, 4.582345962524414, -2.061464309692383, 13.811103820800781, 3.3160781860351562, 1.8289222717285156, -3.6865577697753906, -4.05584716796875, 1.5540657043457031, 6.446510314941406, -1.912322998046875, 3.2030410766601562, 9.362411499023438, -7.05950927734375, -4.795631408691406, -2.4522247314453125, 16.777511596679688, 6.04132080078125, 1.816314697265625, 0.1617431640625, 1.8065643310546875, -4.713592529296875, -1.5798606872558594, 0.8808135986328125, 6.385974884033203, 14.135963439941406, 11.505935668945312, 6.110258102416992, 12.806350708007812, 16.892303466796875, 2.3527259826660156, 15.48681640625, 4.6294403076171875, 13.4322509765625, 2.1480331420898438, 14.896812438964844, -2.0070838928222656, 7.690879821777344, 2.3315200805664062, 0.6804046630859375, 0.8059406280517578, 5.765533447265625, -5.327415466308594, 10.173828125, 0.2964916229248047, -0.019214630126953125, 8.212772369384766, 0.16745376586914062, -8.005882263183594, 11.79718017578125, 2.6418724060058594, 2.260530471801758, 13.217132568359375, 3.5838165283203125, 3.284820556640625, 9.274513244628906, 14.476343154907227, 8.860950469970703], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000131.npy"}
{"epoch": 0.1980347694633409, "step": 132, "batch_size": 64, "mean": 3.5960710048675537, "std": 5.402273178100586, "min": -14.090789794921875, "p10": -2.66489028930664, "median": 4.4375457763671875, "p90": 9.426112365722657, "max": 17.319915771484375, "pos_frac": 0.765625, "sample": [9.479377746582031, 2.665904998779297, 6.1212615966796875, 6.1435699462890625, -2.2373523712158203, -1.6204891204833984, 7.864860534667969, 0.3293800354003906, -2.8079071044921875, 5.832237243652344, 5.588836669921875, 8.603906631469727, 4.830802917480469, 4.159839630126953, -0.3281364440917969, 1.6788406372070312, -2.0054359436035156, 15.242919921875, 5.25762939453125, -0.1634368896484375, -2.3311843872070312, -14.090789794921875, 8.579669952392578, -0.9687423706054688, -3.7632102966308594, -6.7324066162109375, 6.1959381103515625, 10.163650512695312, 10.192686080932617, -1.6014251708984375, 6.582374572753906, 0.7937278747558594, -11.021093368530273, 4.735626220703125, 2.4144973754882812, 4.6100311279296875, 17.319915771484375, 6.908525466918945, 9.028694152832031, 5.2573699951171875, 0.00962066650390625, 9.878772735595703, 3.8141860961914062, 9.301826477050781, 7.920082092285156, 4.2650604248046875, 4.678005218505859, 2.4816665649414062, 2.5477657318115234, 4.13578987121582, 5.803550720214844, 3.29931640625, 3.9664154052734375, 5.636175155639648, 8.935615539550781, 5.3153839111328125, 9.59527587890625, -2.973217010498047, 1.53668212890625, 2.6731109619140625, -4.465171813964844, 5.6275177001953125, 3.0928497314453125, 6.16180419921875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000132.npy"}
{"epoch": 0.19954648526077098, "step": 133, "batch_size": 64, "mean": 2.83449649810791, "std": 5.917599678039551, "min": -14.949249267578125, "p10": -3.775817489624023, "median": 2.300405502319336, "p90": 10.485234069824221, "max": 17.373611450195312, "pos_frac": 0.671875, "sample": [5.250114440917969, -5.643669128417969, 3.433361053466797, 10.766319274902344, 0.09680938720703125, -1.7030105590820312, 3.8027915954589844, 11.777839660644531, 0.6871185302734375, 1.65679931640625, 2.3882293701171875, 15.080230712890625, 8.714052200317383, 1.4031181335449219, -6.128749847412109, 2.7695770263671875, 2.3598403930664062, -14.949249267578125, 7.923900604248047, 1.9702682495117188, -0.43807220458984375, 7.1619415283203125, -1.653289794921875, 7.072303771972656, 4.487804412841797, -0.24340438842773438, 8.064712524414062, -0.7497100830078125, 3.81573486328125, -0.8307685852050781, 2.2409706115722656, 0.8301963806152344, -9.100860595703125, 9.829368591308594, -3.927928924560547, 2.1229629516601562, -0.4784049987792969, -3.4208908081054688, 9.0009765625, 9.531494140625, -1.6674575805664062, 3.0537948608398438, 5.056755065917969, -6.755563735961914, 3.214984893798828, 15.06317138671875, 6.764823913574219, 8.3011474609375, 1.323516845703125, 17.373611450195312, 1.4098892211914062, -0.2377471923828125, -2.1629714965820312, 13.177978515625, 6.03900146484375, -0.259185791015625, 5.266914367675781, 10.987319946289062, 4.5435943603515625, -1.6629886627197266, 2.7066421508789062, -1.981597900390625, -4.646522521972656, 1.5278472900390625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000133.npy"}
{"epoch": 0.20105820105820105, "step": 134, "batch_size": 64, "mean": 4.8859405517578125, "std": 6.263420581817627, "min": -11.950210571289062, "p10": -1.637946510314941, "median": 4.070915222167969, "p90": 13.27979736328125, "max": 21.625137329101562, "pos_frac": 0.828125, "sample": [-1.40673828125, 1.337738037109375, 3.552001953125, 4.120018005371094, 4.021812438964844, 3.8064212799072266, 3.1730499267578125, 6.300148010253906, 13.414169311523438, 12.82061767578125, 4.5477142333984375, 9.542091369628906, 4.684165954589844, 3.688617706298828, 4.29847526550293, 0.1199798583984375, 10.419696807861328, 7.871330261230469, -3.3219871520996094, 7.040058135986328, 7.818264007568359, 5.3954010009765625, 8.953958511352539, -0.8693885803222656, 4.697360992431641, 8.803009033203125, 0.19635009765625, 1.66424560546875, -1.7370357513427734, 6.976812362670898, -3.3684635162353516, -0.2848377227783203, 12.966262817382812, 1.1691741943359375, 0.7228507995605469, -7.245729446411133, 4.230739593505859, 2.9506187438964844, 1.2729339599609375, 4.668230056762695, 21.606788635253906, 9.971866607666016, -11.950210571289062, 3.03643798828125, 6.348243713378906, 1.8458824157714844, 13.86065673828125, -3.417510986328125, 16.890594482421875, 3.05572509765625, 1.8560066223144531, 6.898918151855469, 5.5357208251953125, 8.3155517578125, 18.47320556640625, 1.462127685546875, -3.076465606689453, 2.9124069213867188, 3.2441043853759766, 10.179759979248047, 14.685592651367188, 21.625137329101562, 1.6242351531982422, -1.2947273254394531], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000134.npy"}
{"epoch": 0.20256991685563114, "step": 135, "batch_size": 64, "mean": 3.1143112182617188, "std": 5.880242824554443, "min": -12.771499633789062, "p10": -3.0288457870483394, "median": 2.696957588195801, "p90": 9.976100158691409, "max": 19.714614868164062, "pos_frac": 0.734375, "sample": [4.828544616699219, -2.1700401306152344, 7.514862060546875, 2.5115184783935547, 2.3185386657714844, -2.4515628814697266, -0.278900146484375, -4.940162658691406, -0.49913597106933594, 11.3544921875, 7.501190185546875, -3.2762527465820312, 5.990753173828125, -0.7619647979736328, 1.8332595825195312, -6.404743194580078, 11.919029235839844, 4.01666259765625, -0.570404052734375, 2.628030776977539, 2.5319290161132812, 2.3951263427734375, -11.955009460449219, 2.4099349975585938, 4.155998229980469, 8.822879791259766, 2.20355224609375, 4.3368072509765625, 1.3356552124023438, 4.243267059326172, 4.1443939208984375, 0.2765655517578125, 10.187797546386719, 2.1561222076416016, 5.139213562011719, 1.3775596618652344, 2.7658843994140625, 2.982372283935547, 7.5254669189453125, -6.605043411254883, 0.132781982421875, 19.714614868164062, 0.21036148071289062, 4.312705993652344, 3.2806949615478516, 3.47705078125, -0.14231109619140625, -5.836225509643555, 7.586589813232422, 7.8408966064453125, 5.911548614501953, -12.771499633789062, 0.8993759155273438, 7.269565582275391, 13.360527038574219, 15.006729125976562, -2.2170944213867188, 6.785219192504883, -2.0994930267333984, 7.833263397216797, 3.1863861083984375, 9.482139587402344, 15.430477142333984, -0.8325653076171875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000135.npy"}
{"epoch": 0.20408163265306123, "step": 136, "batch_size": 64, "mean": 4.121640205383301, "std": 8.111139297485352, "min": -14.180328369140625, "p10": -4.663221740722656, "median": 3.579833984375, "p90": 15.91175651550293, "max": 23.547943115234375, "pos_frac": 0.671875, "sample": [15.928085327148438, 11.082405090332031, -9.737627029418945, -1.1130790710449219, -5.0017242431640625, 9.811225891113281, -3.5177078247070312, 0.6348037719726562, -6.595603942871094, 7.25847053527832, 3.2171173095703125, -14.180328369140625, -0.741485595703125, 5.557096481323242, -2.081878662109375, 7.86322021484375, 16.55167579650879, 4.028690338134766, 1.8643550872802734, -3.873382568359375, 10.220993041992188, 5.681365966796875, 10.813255310058594, 19.385948181152344, -1.8278350830078125, -0.07006072998046875, 16.07379913330078, 4.055248260498047, 1.45172119140625, 23.547943115234375, 10.74212646484375, -0.8071975708007812, 15.90618896484375, 13.488998413085938, 4.1607513427734375, 3.23699951171875, -0.35797882080078125, 5.063783645629883, 6.518836975097656, 0.0828094482421875, -12.287328720092773, 3.92266845703125, 1.59332275390625, 10.814949035644531, 5.182445526123047, 7.754146575927734, -3.6329612731933594, 6.3735809326171875, -3.8362274169921875, 18.438758850097656, 3.1553802490234375, -8.65521240234375, -1.1548309326171875, 9.811111450195312, 2.32684326171875, 15.90289306640625, -2.2993927001953125, -0.109405517578125, 13.085577011108398, 15.914142608642578, 7.853546142578125, 0.3942108154296875, 1.1936912536621094, -12.278961181640625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000136.npy"}
{"epoch": 0.20559334845049132, "step": 137, "batch_size": 64, "mean": 4.146416664123535, "std": 6.492955207824707, "min": -11.45001220703125, "p10": -2.134831047058105, "median": 2.6184844970703125, "p90": 11.76124496459961, "max": 23.97052001953125, "pos_frac": 0.703125, "sample": [-1.6111907958984375, -11.45001220703125, 6.1396942138671875, 11.209426879882812, 6.592475891113281, 11.0552978515625, 12.413753509521484, 6.6065826416015625, -0.3886070251464844, 23.97052001953125, 9.523164749145508, 2.29266357421875, 12.061359405517578, 2.944305419921875, 11.330177307128906, -0.6798553466796875, 3.0460357666015625, 11.817176818847656, -1.517608642578125, 1.7206077575683594, -1.0901298522949219, 6.059711456298828, 10.061187744140625, -6.392387390136719, 1.5278186798095703, 0.1969451904296875, 7.568183898925781, 21.20745086669922, -1.2980937957763672, 1.2957077026367188, -1.8166656494140625, -1.9146804809570312, -1.1886978149414062, 5.187957763671875, 14.998809814453125, 1.6934471130371094, 0.23703956604003906, -3.7469863891601562, 6.247016906738281, -2.2291812896728516, 2.1443862915039062, 2.9962158203125, 5.885805130004883, 12.123832702636719, 6.132846832275391, -3.7507495880126953, 1.6764278411865234, 9.265485763549805, 9.055145263671875, 11.6307373046875, 2.0857505798339844, -4.368095397949219, -0.10784149169921875, -7.101337432861328, 10.46571159362793, 3.6541709899902344, 1.0231876373291016, -1.7293548583984375, -0.00530242919921875, 7.962120056152344, 2.2446746826171875, 11.106681823730469, 0.94012451171875, 8.359634399414062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000137.npy"}
{"epoch": 0.20710506424792138, "step": 138, "batch_size": 64, "mean": 3.651109457015991, "std": 7.298638343811035, "min": -16.459869384765625, "p10": -3.1880924224853513, "median": 4.591917037963867, "p90": 11.396847534179688, "max": 22.684860229492188, "pos_frac": 0.71875, "sample": [0.9916839599609375, 8.070709228515625, 4.82073974609375, 8.96335220336914, 5.852165222167969, 3.6988353729248047, 0.122589111328125, 4.508918762207031, -8.455276489257812, 4.674915313720703, 3.6829662322998047, -16.459869384765625, 13.824630737304688, -0.21756744384765625, 22.684860229492188, -8.411468505859375, -2.8829345703125, 8.415140151977539, 9.578483581542969, 4.922603607177734, 7.090396881103516, 1.7009735107421875, 19.808273315429688, 9.471893310546875, 2.487274169921875, 7.824811935424805, 10.667919158935547, 6.9768218994140625, 1.0618095397949219, 1.4762039184570312, 7.708158493041992, -0.42400360107421875, 11.576793670654297, -1.117431640625, -3.3878860473632812, -2.8723297119140625, 0.11733055114746094, -13.764144897460938, -0.4280052185058594, 7.337299346923828, 6.2921295166015625, 11.128387451171875, 0.5716400146484375, 5.455299377441406, -1.9973564147949219, -14.968040466308594, 16.557449340820312, 4.166543960571289, 5.3318328857421875, -0.3311767578125, 5.054130554199219, 7.43463134765625, 9.317207336425781, -2.3996810913085938, 0.2362213134765625, -1.3301734924316406, -3.3188743591308594, 6.107265472412109, -1.7614994049072266, 6.3511962890625, 16.864906311035156, 4.682300567626953, 11.51190185546875, 1.0171356201171875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000138.npy"}
{"epoch": 0.20861678004535147, "step": 139, "batch_size": 64, "mean": 4.259641170501709, "std": 6.38577127456665, "min": -14.868606567382812, "p10": -2.6950849533081054, "median": 4.0037126541137695, "p90": 11.714567947387696, "max": 20.344879150390625, "pos_frac": 0.75, "sample": [3.323467254638672, 11.867753982543945, 5.721282958984375, 3.9654998779296875, -7.67742919921875, 8.734661102294922, 8.135673522949219, 11.542060852050781, 6.971599578857422, -1.4354686737060547, -1.3708839416503906, 17.127229690551758, 9.469890594482422, -2.718048095703125, 5.0404510498046875, 11.6004638671875, 19.35308837890625, 4.041925430297852, 2.263641357421875, 11.763469696044922, -2.3702239990234375, -14.868606567382812, -3.3452987670898438, 3.2101783752441406, 8.545581817626953, 3.7958641052246094, 7.6509857177734375, 5.433872222900391, 5.925167083740234, 0.0464630126953125, 6.414665222167969, 8.232933044433594, 3.7695693969726562, 0.09661865234375, -0.7830734252929688, 7.8857879638671875, 7.03924560546875, 1.382537841796875, 7.355224609375, 1.6414146423339844, 9.844318389892578, -1.2656974792480469, 12.463315963745117, -0.6179237365722656, -2.6415042877197266, -5.923885345458984, 5.090972900390625, 4.663330078125, 20.344879150390625, 3.6501312255859375, 9.06927490234375, 3.475250244140625, 2.2237014770507812, -7.7895050048828125, -2.6310653686523438, 7.239326477050781, 0.5696678161621094, 2.7835540771484375, -4.216396331787109, 7.6432952880859375, 5.632560729980469, 14.966110229492188, 3.8626747131347656, -0.5985870361328125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000139.npy"}
{"epoch": 0.21012849584278157, "step": 140, "batch_size": 64, "mean": 5.968560218811035, "std": 9.314580917358398, "min": -23.977249145507812, "p10": -6.126542091369628, "median": 5.796398162841797, "p90": 18.159553909301756, "max": 22.980758666992188, "pos_frac": 0.734375, "sample": [19.237167358398438, 11.695556640625, 0.049755096435546875, -0.692840576171875, -0.24941253662109375, 11.160995483398438, -8.869834899902344, -8.378631591796875, 10.157482147216797, 13.614967346191406, -23.977249145507812, 9.131866455078125, 5.590919494628906, 5.48895263671875, 17.757278442382812, 7.547508239746094, -6.536077499389648, 2.0919570922851562, 0.8726959228515625, 14.943405151367188, 3.9734745025634766, 19.977493286132812, 12.186763763427734, 8.907394409179688, 0.9205856323242188, 10.262542724609375, 3.3096065521240234, 19.56870460510254, 2.383392333984375, -6.6956024169921875, -5.17095947265625, -3.8731765747070312, 11.1702880859375, -3.9126815795898438, -3.632669448852539, 6.654693603515625, 13.049747467041016, 18.634689331054688, 16.084617614746094, 22.980758666992188, 8.154548645019531, 17.467838287353516, 17.81285858154297, 18.08432388305664, 5.077674865722656, -7.40155029296875, 18.191795349121094, -0.1556243896484375, 20.715560913085938, -3.267915725708008, -7.892574310302734, 10.734458923339844, 3.8669891357421875, 8.301944732666016, 9.715995788574219, 2.006305694580078, 9.512489318847656, -2.8281631469726562, 6.0018768310546875, 4.595741271972656, 0.2145538330078125, 15.653907775878906, -4.1956024169921875, 4.2042999267578125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000140.npy"}
{"epoch": 0.21164021164021163, "step": 141, "batch_size": 64, "mean": 5.30136775970459, "std": 8.943147659301758, "min": -14.26287841796875, "p10": -6.5569732666015605, "median": 5.089363098144531, "p90": 16.385496520996096, "max": 28.6490478515625, "pos_frac": 0.75, "sample": [-14.26287841796875, 0.5915260314941406, 16.66851806640625, 4.658885955810547, 8.670562744140625, -0.20925140380859375, 3.3336830139160156, 14.00424575805664, 3.361419677734375, -8.51192855834961, 6.4675140380859375, 0.5170516967773438, 8.485918045043945, -7.4508209228515625, 10.287406921386719, 6.737436294555664, 6.576454162597656, 1.3450355529785156, -1.5306930541992188, -13.681640625, 4.0799713134765625, 0.6088809967041016, 0.59259033203125, 7.334400177001953, 7.130912780761719, 2.274383544921875, 2.2081146240234375, 8.710433959960938, -2.034027099609375, -9.1312255859375, -1.5106582641601562, -8.78570556640625, 9.25445556640625, 9.701622009277344, 11.151798248291016, -2.6043834686279297, 10.888416290283203, 5.099708557128906, 23.56619644165039, 19.785202026367188, 15.725112915039062, 12.161426544189453, 12.769889831542969, 9.193870544433594, 21.657752990722656, 1.2867965698242188, -0.583038330078125, 5.079017639160156, 17.95285415649414, 15.096954345703125, -10.059341430664062, 2.6314544677734375, 6.209957122802734, 23.351898193359375, 9.251373291015625, 28.6490478515625, -4.262781143188477, 13.768817901611328, 1.5511550903320312, 8.156654357910156, 10.279212951660156, 3.260021209716797, -4.4713287353515625, -3.7487564086914062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000141.npy"}
{"epoch": 0.21315192743764172, "step": 142, "batch_size": 64, "mean": 4.381509304046631, "std": 6.84030818939209, "min": -14.037300109863281, "p10": -2.790606880187988, "median": 3.5603113174438477, "p90": 13.585836029052736, "max": 22.453536987304688, "pos_frac": 0.75, "sample": [5.175273895263672, -10.798500061035156, 2.3749160766601562, 10.463424682617188, 12.318550109863281, 19.376556396484375, 11.394515991210938, -2.0698585510253906, 2.2856483459472656, 6.4622650146484375, 1.9850921630859375, -1.3394050598144531, 6.674739837646484, -14.037300109863281, -0.15931320190429688, 8.878341674804688, 5.516429901123047, -6.454084396362305, 1.7798652648925781, 3.5797672271728516, 2.6605911254882812, 10.65814208984375, 7.149814605712891, 3.7271766662597656, 3.2717132568359375, 14.343185424804688, -0.9320526123046875, -1.6507415771484375, 12.033729553222656, -2.2364463806152344, 2.4767074584960938, 2.08489990234375, 15.100631713867188, 22.453536987304688, 2.0537033081054688, -2.8874969482421875, 3.098957061767578, -7.076423645019531, 0.6430244445800781, 4.572116851806641, 16.301414489746094, 13.669723510742188, 3.5408554077148438, 1.6669387817382812, -0.27081298828125, 11.666618347167969, -2.8412704467773438, 4.060150146484375, -0.04564666748046875, 9.558082580566406, 3.6513671875, -3.030366897583008, 1.805877685546875, 14.695114135742188, 4.928325653076172, 4.871437072753906, 13.390098571777344, 10.552545547485352, 5.724700927734375, 0.01421356201171875, 0.762054443359375, -2.672391891479492, 9.800010681152344, 3.6658554077148438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000142.npy"}
{"epoch": 0.2146636432350718, "step": 143, "batch_size": 64, "mean": 3.7343831062316895, "std": 9.436310768127441, "min": -27.843414306640625, "p10": -6.07621841430664, "median": 3.8336143493652344, "p90": 13.651156997680665, "max": 21.494003295898438, "pos_frac": 0.6875, "sample": [4.129768371582031, 9.594329833984375, 13.362640380859375, 10.448139190673828, 4.618919372558594, -1.8363876342773438, 2.468280792236328, -0.6277999877929688, 18.545333862304688, -10.313800811767578, 12.132902145385742, -2.237945556640625, 21.494003295898438, 3.5374603271484375, 8.064903259277344, 6.374553680419922, -14.146774291992188, 0.86187744140625, 20.176010131835938, 1.4351768493652344, -3.00537109375, -0.4997367858886719, 20.924102783203125, 13.155826568603516, 9.968994140625, 3.2117576599121094, 5.0929412841796875, -15.390594482421875, 2.3346214294433594, -4.097309112548828, -0.4178314208984375, 4.420461654663086, -13.338882446289062, 7.625274658203125, 17.415687561035156, -2.6494407653808594, -5.311431884765625, 0.5350208282470703, 0.525665283203125, 7.9120635986328125, 3.134387969970703, -27.843414306640625, 12.720222473144531, 2.5756072998046875, -2.375621795654297, 9.764663696289062, 13.18743896484375, 10.722610473632812, -0.20191192626953125, -1.8151016235351562, -6.403984069824219, 4.843282699584961, 13.77480697631836, 7.598075866699219, -18.504615783691406, 16.623058319091797, 10.730644226074219, 4.513031005859375, 12.020820617675781, 10.284004211425781, 2.464303970336914, -2.065896987915039, 4.331081390380859, 2.429597854614258], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000143.npy"}
{"epoch": 0.2161753590325019, "step": 144, "batch_size": 64, "mean": 4.878805160522461, "std": 9.226908683776855, "min": -20.060836791992188, "p10": -5.833259391784668, "median": 3.3472299575805664, "p90": 17.673271560668944, "max": 26.143295288085938, "pos_frac": 0.640625, "sample": [17.700328826904297, 7.151374816894531, -6.920082092285156, 1.0310230255126953, -2.1377639770507812, 3.783670425415039, 0.2068195343017578, 12.695453643798828, 1.2789421081542969, -0.08661460876464844, -6.056083679199219, 8.292312622070312, -0.48607635498046875, 3.6396961212158203, 8.509963989257812, 2.7911834716796875, 17.610137939453125, -0.156707763671875, -5.195716857910156, 10.320549011230469, 11.8734130859375, 4.304828643798828, -1.55792236328125, -2.400543212890625, -2.5856285095214844, 2.0658340454101562, 6.563011169433594, 15.229942321777344, -0.5071697235107422, -1.55035400390625, 9.916397094726562, 3.0547637939453125, -5.547344207763672, -2.9053726196289062, 14.408309936523438, -0.7087631225585938, -1.385721206665039, 19.99092674255371, 26.070083618164062, 19.115219116210938, 12.2325439453125, 6.829444885253906, -8.362525939941406, 15.591690063476562, -8.253433227539062, 7.4231109619140625, 4.46630859375, -5.911592483520508, 12.841056823730469, 2.4662704467773438, 0.8262767791748047, -5.650482177734375, 4.554222106933594, 20.842235565185547, -7.67840576171875, 17.006853103637695, 21.60546112060547, 26.143295288085938, -0.02484893798828125, -20.060836791992188, 5.873500823974609, 2.7142181396484375, 9.522197723388672, 9.830673217773438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000144.npy"}
{"epoch": 0.21768707482993196, "step": 145, "batch_size": 64, "mean": 5.204442024230957, "std": 9.74425220489502, "min": -24.082504272460938, "p10": -4.942203140258789, "median": 5.462804794311523, "p90": 18.037800598144536, "max": 27.799205780029297, "pos_frac": 0.734375, "sample": [6.711334228515625, 7.808216094970703, -4.025611877441406, 6.8250885009765625, 0.445587158203125, -24.082504272460938, 10.711349487304688, -1.648681640625, 27.799205780029297, 1.8175201416015625, 8.291431427001953, 8.543289184570312, 3.465005874633789, 9.466815948486328, 3.6559829711914062, 9.67572021484375, -0.5162200927734375, 19.672622680664062, -7.3016357421875, 7.00726318359375, 2.9066505432128906, 2.8627662658691406, -4.763408660888672, 6.45208740234375, 19.303363800048828, -2.038055419921875, 1.7352981567382812, 7.83984375, 7.1225433349609375, -0.9370269775390625, 20.74958038330078, 14.077018737792969, 5.430320739746094, 23.690216064453125, -5.018829345703125, 18.523345947265625, 3.588163375854492, 10.412233352661133, 8.98916244506836, 13.629884719848633, -16.334762573242188, 8.110687255859375, 16.182647705078125, 5.7938690185546875, 0.5159702301025391, -0.2505950927734375, 3.057558059692383, 5.495288848876953, -3.2825546264648438, 9.595672607421875, 13.6702880859375, 10.811429977416992, 13.558845520019531, -2.564746856689453, 1.6624984741210938, 26.866607666015625, 5.2259979248046875, -5.992130279541016, -18.647079467773438, 4.570899963378906, 16.904861450195312, -7.560386657714844, -3.71392822265625, 0.5303878784179688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000145.npy"}
{"epoch": 0.21919879062736206, "step": 146, "batch_size": 64, "mean": 5.246752738952637, "std": 8.768335342407227, "min": -12.412208557128906, "p10": -7.170144653320312, "median": 4.749874114990234, "p90": 17.66120529174805, "max": 26.353534698486328, "pos_frac": 0.75, "sample": [15.000244140625, 9.386154174804688, -6.9813232421875, 5.7244873046875, 2.6582870483398438, -7.291534423828125, -3.3430404663085938, 8.255775451660156, 6.550178527832031, 5.446746826171875, -10.077835083007812, -7.251068115234375, 14.249542236328125, 7.348850250244141, 2.415119171142578, -6.2196807861328125, 10.561906814575195, -1.39288330078125, 14.795757293701172, -12.412208557128906, 7.3524169921875, 16.968399047851562, 2.468017578125, -1.2937984466552734, -9.510488510131836, 0.7267570495605469, 3.9704952239990234, 10.872848510742188, -8.6982421875, 3.135498046875, 7.524396896362305, -8.748458862304688, 22.749893188476562, 6.2379913330078125, 9.884147644042969, -1.2355194091796875, 3.7721633911132812, 0.38616180419921875, 5.7595672607421875, 0.23043441772460938, 17.95812225341797, 6.3851165771484375, 3.6853580474853516, 13.096611022949219, 12.837554931640625, 21.167152404785156, 14.058536529541016, 6.925464630126953, 15.818069458007812, 3.7812728881835938, 7.274290084838867, 0.778472900390625, 18.65806007385254, 1.9052543640136719, -5.179058074951172, -3.5171661376953125, 26.353534698486328, -3.735198974609375, 3.369813919067383, 3.9048900604248047, 18.128341674804688, 9.26816177368164, 4.053001403808594, 18.84038543701172], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000146.npy"}
{"epoch": 0.22071050642479215, "step": 147, "batch_size": 64, "mean": 4.191573619842529, "std": 9.066277503967285, "min": -24.60675048828125, "p10": -3.910424995422363, "median": 3.819141387939453, "p90": 16.066708564758308, "max": 24.098373413085938, "pos_frac": 0.625, "sample": [4.500009536743164, 7.90172004699707, 13.207138061523438, -3.897064208984375, 1.6410446166992188, 7.7385711669921875, -10.752784729003906, 3.395782470703125, -1.18963623046875, -3.9161510467529297, 8.598831176757812, 7.641910552978516, 12.5150146484375, 8.92498779296875, -2.7761001586914062, 8.490226745605469, 20.63726806640625, -24.60675048828125, 9.829627990722656, -1.7412548065185547, 6.603559494018555, 3.2901840209960938, -0.6848049163818359, 2.26727294921875, -0.7425765991210938, 16.820472717285156, -3.85430908203125, 9.238338470458984, 4.287944793701172, 4.242500305175781, -1.0275630950927734, 5.491004943847656, 7.419036865234375, -9.245979309082031, -1.1609573364257812, 9.879676818847656, 14.400869369506836, 24.098373413085938, 20.901512145996094, 1.266336441040039, 6.340423583984375, 2.8663711547851562, -1.6722869873046875, 12.615428924560547, 12.977874755859375, -5.926261901855469, 23.546287536621094, -2.949748992919922, 12.909355163574219, -1.3543434143066406, -0.9995536804199219, 5.4480438232421875, 9.830551147460938, -0.5649433135986328, -3.52410888671875, 1.040191650390625, 2.375265121459961, -0.6794052124023438, 16.7806396484375, -13.9473876953125, -12.674375534057617, 7.293479919433594, 19.148019790649414, -0.25209999084472656], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000147.npy"}
{"epoch": 0.2222222222222222, "step": 148, "batch_size": 64, "mean": 5.507094383239746, "std": 8.759076118469238, "min": -17.472457885742188, "p10": -7.071137237548827, "median": 5.917566299438477, "p90": 16.30733642578125, "max": 23.372085571289062, "pos_frac": 0.734375, "sample": [22.216842651367188, -9.637853622436523, 5.3054351806640625, -8.503339767456055, 15.37774658203125, 18.685142517089844, 9.779983520507812, -9.894830703735352, 17.480560302734375, 16.37683868408203, -2.450946807861328, 11.25362777709961, 3.6979598999023438, 4.62384033203125, 10.17196273803711, -6.0666656494140625, -2.722066879272461, 14.417251586914062, 12.583961486816406, -2.9853248596191406, 9.74160385131836, -7.501625061035156, 13.441116333007812, 2.5399932861328125, 10.982763290405273, 20.004150390625, 12.050712585449219, -1.0010528564453125, 20.904449462890625, 1.7343025207519531, 6.529697418212891, -1.381805419921875, 10.991086959838867, -8.259674072265625, 7.171031951904297, 15.26220703125, 1.797647476196289, 13.74300765991211, -17.472457885742188, 1.1571903228759766, 8.214347839355469, 1.6601295471191406, 9.361701965332031, 23.372085571289062, -11.199594497680664, 8.051412582397461, 7.000253677368164, 4.959558486938477, 16.145164489746094, -0.7014007568359375, 10.669609069824219, 3.14227294921875, 4.518606185913086, 1.6191749572753906, 0.3389778137207031, 6.5663909912109375, 8.25848388671875, 7.853527069091797, 8.17215347290039, -1.5441665649414062, 2.353883743286133, 5.2334136962890625, -2.47607421875, -1.2603683471679688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000148.npy"}
{"epoch": 0.2237339380196523, "step": 149, "batch_size": 64, "mean": 6.057290554046631, "std": 7.917817115783691, "min": -11.471328735351562, "p10": -2.9218822479248043, "median": 6.077868461608887, "p90": 16.707098388671877, "max": 24.889118194580078, "pos_frac": 0.78125, "sample": [11.733711242675781, 17.368606567382812, 12.671188354492188, -2.5547256469726562, 2.2430267333984375, 3.9699630737304688, -3.079235076904297, 12.167612075805664, -1.8226909637451172, 14.370153427124023, -11.336402893066406, 3.8873119354248047, 1.6468582153320312, 4.096263885498047, 8.223800659179688, 0.16109466552734375, 3.361103057861328, 2.2048492431640625, 1.646768569946289, 10.444900512695312, 2.3176116943359375, 11.746894836425781, 18.608789443969727, -4.083339691162109, 24.889118194580078, 6.033742904663086, 4.63787841796875, -2.4876937866210938, 8.420480728149414, -11.471328735351562, -2.2807254791259766, 16.27324676513672, -9.663383483886719, 20.906932830810547, 5.068794250488281, 17.8748722076416, 11.030908584594727, -6.700080871582031, 11.74713134765625, -4.306972503662109, 2.3095855712890625, 8.98193359375, 1.098642349243164, 16.507125854492188, 9.68935775756836, 14.622888565063477, 5.717475891113281, 7.030242919921875, 4.560092926025391, 6.444183349609375, 10.125465393066406, 6.1219940185546875, -1.043060302734375, -0.981658935546875, 0.8426666259765625, 9.77545166015625, 6.51544189453125, 21.94073486328125, 13.006256103515625, 6.185871124267578, 6.469085693359375, 16.792800903320312, -1.2629013061523438, 6.249855041503906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000149.npy"}
{"epoch": 0.2252456538170824, "step": 150, "batch_size": 64, "mean": 8.050765037536621, "std": 9.838495254516602, "min": -12.326011657714844, "p10": -2.725154876708983, "median": 6.462512969970703, "p90": 22.464382171630866, "max": 30.384384155273438, "pos_frac": 0.796875, "sample": [7.623897552490234, 4.051597595214844, 25.2750244140625, 7.0430755615234375, -10.358406066894531, -3.2288665771484375, 5.9574737548828125, 6.4626312255859375, 0.9410858154296875, 8.479215621948242, 2.497528076171875, 13.087516784667969, -0.5921173095703125, 6.314910888671875, 0.3215141296386719, 6.028221130371094, 4.7001953125, 30.36334228515625, -0.284454345703125, 5.910240173339844, 9.708663940429688, 1.9654579162597656, 25.752334594726562, -10.616912841796875, 1.57623291015625, 11.370046615600586, 4.441068649291992, 7.772426605224609, 6.666717529296875, 20.782794952392578, 9.710227966308594, 15.817535400390625, 20.669700622558594, 6.3206634521484375, 24.81817626953125, -5.3839874267578125, 1.4546947479248047, 9.488929748535156, 16.332748413085938, 24.775863647460938, 10.600479125976562, 18.586837768554688, -0.9051513671875, 11.106231689453125, 20.91131591796875, 1.4378204345703125, 14.739751815795898, 17.030048370361328, -12.326011657714844, 30.384384155273438, 5.6549072265625, -0.05954742431640625, 7.140342712402344, -5.437507629394531, 3.156108856201172, 23.129981994628906, -7.70440673828125, -1.5498275756835938, 15.528579711914062, -1.3805618286132812, 5.279243469238281, 17.74211883544922, 6.462394714355469, 11.704421997070312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000150.npy"}
{"epoch": 0.22675736961451248, "step": 151, "batch_size": 64, "mean": 9.110525131225586, "std": 9.875978469848633, "min": -9.123252868652344, "p10": -2.4843477249145507, "median": 9.152090072631836, "p90": 18.50576171875, "max": 40.182952880859375, "pos_frac": 0.765625, "sample": [7.752166748046875, 15.264263153076172, 16.369457244873047, -3.311573028564453, 2.935850143432617, 3.531496047973633, 40.182952880859375, 0.7431755065917969, 17.882232666015625, -0.6982612609863281, 10.518280029296875, 14.56332015991211, 8.520889282226562, -2.3418731689453125, 10.226024627685547, 12.705034255981445, 15.884401321411133, -1.5509910583496094, 7.642648696899414, -1.1705856323242188, -3.653411865234375, -2.545408248901367, 13.660743713378906, -3.3935890197753906, 31.617904663085938, -9.123252868652344, 12.898122787475586, 10.112476348876953, -2.0291519165039062, -1.7924346923828125, 6.087924957275391, -3.2078475952148438, 14.146072387695312, 15.14166259765625, 3.9656829833984375, -1.8385848999023438, 12.78432846069336, 7.8392486572265625, 7.467803955078125, 32.610321044921875, 0.5920429229736328, 10.763561248779297, 17.345890045166016, 6.39689826965332, 24.73387908935547, 21.38422393798828, 18.63884735107422, 17.536468505859375, 18.160446166992188, 9.21866226196289, 9.085517883300781, 15.646446228027344, 3.3700523376464844, 20.49211883544922, -7.762287139892578, 14.047269821166992, 12.726066589355469, 6.808074951171875, -1.9533767700195312, 0.3253173828125, 12.354179382324219, 17.83881378173828, 0.7317085266113281, 18.195228576660156], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000151.npy"}
{"epoch": 0.22826908541194255, "step": 152, "batch_size": 64, "mean": 5.812672138214111, "std": 10.571117401123047, "min": -17.250526428222656, "p10": -6.863438415527343, "median": 4.409473419189453, "p90": 19.5785364151001, "max": 27.30670166015625, "pos_frac": 0.734375, "sample": [-0.9758377075195312, 2.7527389526367188, 16.748613357543945, 2.6182174682617188, -17.250526428222656, 4.4707489013671875, 12.571479797363281, 8.012825012207031, 6.255867004394531, -7.7779388427734375, 5.813982009887695, -8.456031799316406, 10.930183410644531, -3.2359466552734375, 18.863143920898438, 17.177345275878906, 2.496915817260742, 8.855121612548828, 0.6484241485595703, -4.795429229736328, -16.777008056640625, 22.533458709716797, 10.578460693359375, 9.361083984375, -4.940864562988281, 1.1848163604736328, 15.350257873535156, 4.348197937011719, 1.954549789428711, 11.038810729980469, 2.7131214141845703, 19.68403434753418, 1.5850391387939453, 2.6225624084472656, -2.5180892944335938, -5.169677734375, 2.209444046020508, -5.905723571777344, 17.301002502441406, 1.654266357421875, -7.225456237792969, 20.439544677734375, 12.456619262695312, -6.018730163574219, 7.393379211425781, 27.30670166015625, 14.538551330566406, 0.6017646789550781, 2.0676956176757812, 9.114547729492188, -14.907981872558594, 14.902496337890625, 1.0271015167236328, 25.92890167236328, 16.320505142211914, 7.730377197265625, 24.017555236816406, 19.332374572753906, -2.87164306640625, 22.783004760742188, -1.1835098266601562, -12.116374969482422, 8.111442565917969, 17.73052978515625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000152.npy"}
{"epoch": 0.22978080120937264, "step": 153, "batch_size": 64, "mean": 6.849618434906006, "std": 8.973270416259766, "min": -13.598369598388672, "p10": -5.22487678527832, "median": 6.116879463195801, "p90": 19.281012344360352, "max": 24.437225341796875, "pos_frac": 0.765625, "sample": [11.508749008178711, 3.8911514282226562, 6.910400390625, 15.082015991210938, 16.63993263244629, 22.009132385253906, 11.097969055175781, 3.068309783935547, 3.778839111328125, 6.633628845214844, 6.151453018188477, 17.730426788330078, 3.15557861328125, 16.14685821533203, 6.082305908203125, -5.526023864746094, 6.465415954589844, 0.8640365600585938, 2.2982635498046875, 21.03099822998047, 20.300994873046875, 13.991840362548828, -5.694042205810547, 8.696197509765625, 22.108287811279297, -11.201990127563477, 14.185188293457031, 1.9594917297363281, 0.8799209594726562, 17.212474822998047, -5.3946990966796875, 4.967229843139648, -13.598369598388672, -0.7101325988769531, 12.267066955566406, 4.88947868347168, -1.7396011352539062, 19.21075439453125, 8.822998046875, 5.225627899169922, -3.222442626953125, -1.2236251831054688, 22.66241455078125, 10.136302947998047, 11.59476089477539, -1.0117664337158203, 10.56939697265625, 7.921958923339844, 10.097465515136719, -2.7405738830566406, 0.1564788818359375, 14.693992614746094, 5.461139678955078, -0.09926605224609375, 5.8413848876953125, -8.370956420898438, -6.736236572265625, 17.176992416381836, 1.9495086669921875, 24.437225341796875, 19.31112289428711, 1.3580265045166016, -4.828624725341797, 11.842735290527344], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000153.npy"}
{"epoch": 0.23129251700680273, "step": 154, "batch_size": 64, "mean": 5.053785800933838, "std": 10.143854141235352, "min": -20.571571350097656, "p10": -9.969337654113767, "median": 6.019001007080078, "p90": 16.720418739318852, "max": 26.38933753967285, "pos_frac": 0.671875, "sample": [-12.085823059082031, -0.1793670654296875, -11.442581176757812, -20.571571350097656, 21.96717071533203, 14.185884475708008, -10.755548477172852, 6.151285171508789, 4.2993621826171875, -12.997200012207031, 5.559564590454102, -0.665618896484375, 14.059663772583008, 6.025543212890625, -7.286750793457031, -6.729301452636719, -2.5233497619628906, 6.122356414794922, 0.7499141693115234, -0.929107666015625, 13.984092712402344, 26.38933753967285, 24.318248748779297, 6.012458801269531, -11.768402099609375, 8.891561508178711, 11.979530334472656, -2.1600265502929688, 14.905517578125, -17.859909057617188, 8.64103889465332, 3.3558826446533203, 15.564002990722656, 15.759780883789062, 11.938594818115234, -0.050434112548828125, 10.74506950378418, 15.992462158203125, 8.925355911254883, 6.334953308105469, -1.6792068481445312, -0.59600830078125, 11.586286544799805, -1.760549545288086, 2.9925918579101562, 20.884016036987305, 8.9927978515625, 5.930458068847656, 13.960296630859375, -8.134845733642578, 2.5900421142578125, -3.3017501831054688, 2.108715057373047, 4.236045837402344, 9.37640380859375, 6.912696838378906, 18.11209487915039, -0.6939945220947266, 17.032400131225586, 21.00567626953125, 7.300407409667969, 9.289434432983398, 2.076934814453125, 10.36770248413086], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000154.npy"}
{"epoch": 0.2328042328042328, "step": 155, "batch_size": 64, "mean": 7.028270721435547, "std": 12.238687515258789, "min": -24.668636322021484, "p10": -5.506732177734375, "median": 6.7075653076171875, "p90": 24.001742172241215, "max": 32.70054626464844, "pos_frac": 0.71875, "sample": [6.585624694824219, -1.20233154296875, 31.47216796875, 6.441047668457031, 4.339439392089844, 8.484840393066406, 0.7930183410644531, 21.43736457824707, -2.4888782501220703, 14.649211883544922, 10.834358215332031, 15.229837417602539, 1.1605987548828125, -5.436492919921875, 0.09986495971679688, -3.30804443359375, 11.65216064453125, 27.696197509765625, 9.703178405761719, 26.68414306640625, 24.54128646850586, -2.0390987396240234, 12.495536804199219, -13.540212631225586, 13.54718017578125, 26.323875427246094, -0.94464111328125, 18.657852172851562, -3.8157501220703125, 14.538475036621094, -16.760330200195312, 10.565835952758789, 22.27309799194336, 9.258918762207031, 32.70054626464844, 11.029617309570312, -1.2188911437988281, 2.0194168090820312, -17.734130859375, 18.594497680664062, -3.5546875, 0.5570335388183594, 12.854988098144531, 1.6074256896972656, 28.188201904296875, 8.412296295166016, 4.8893280029296875, 20.37628936767578, -11.201644897460938, 6.829505920410156, 7.762504577636719, 3.738800048828125, 2.5844383239746094, 2.29888916015625, 3.557027816772461, -0.6437225341796875, 7.688194274902344, -12.090003967285156, -24.668636322021484, -0.3080902099609375, 9.2777099609375, 22.74280548095703, 19.127111434936523, -5.536834716796875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000155.npy"}
{"epoch": 0.23431594860166288, "step": 156, "batch_size": 64, "mean": 8.273383140563965, "std": 10.130899429321289, "min": -13.841625213623047, "p10": -3.360412979125976, "median": 6.8435516357421875, "p90": 22.446410751342775, "max": 30.15237045288086, "pos_frac": 0.78125, "sample": [5.82147216796875, -3.6123008728027344, -4.421541213989258, -13.841625213623047, 10.791366577148438, 27.789871215820312, 30.15237045288086, 11.188850402832031, 13.132781982421875, 22.122913360595703, 2.4814910888671875, 21.10614013671875, -6.2615509033203125, 15.991378784179688, 17.6964111328125, 21.722808837890625, 21.783004760742188, 19.491554260253906, -0.2935047149658203, 13.506851196289062, 3.2010726928710938, 2.5102157592773438, 7.326602935791016, 17.235301971435547, -2.6694488525390625, 4.750762939453125, 0.1851329803466797, 17.55908966064453, 4.229854583740234, 13.266014099121094, 8.725013732910156, 6.124542236328125, -2.772674560546875, 11.073165893554688, 10.547157287597656, 28.238327026367188, -9.971389770507812, 7.089790344238281, -4.8512420654296875, 25.112709045410156, 12.041343688964844, 22.585052490234375, 3.5566558837890625, 2.5194091796875, 0.6610469818115234, 4.5360107421875, -0.49335479736328125, -0.189788818359375, 8.571693420410156, -2.5708999633789062, 7.823535919189453, 3.058380126953125, 11.53579330444336, 0.03536224365234375, 6.597312927246094, 4.030387878417969, -2.4639968872070312, 17.014442443847656, 23.630279541015625, 2.395832061767578, 7.357275009155273, 25.737777709960938, 4.856372833251953, -6.588142395019531], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000156.npy"}
{"epoch": 0.23582766439909297, "step": 157, "batch_size": 64, "mean": 6.533997535705566, "std": 10.536174774169922, "min": -19.842315673828125, "p10": -4.896387481689453, "median": 6.168727874755859, "p90": 20.121787452697756, "max": 32.911956787109375, "pos_frac": 0.734375, "sample": [26.387313842773438, 25.32779884338379, -4.141563415527344, 13.279281616210938, 17.089492797851562, -19.842315673828125, 5.5225372314453125, -4.616973876953125, 13.705368041992188, -5.016136169433594, 0.8199691772460938, 6.699653625488281, 16.98101806640625, 5.396167755126953, -17.9193115234375, 0.21349716186523438, -0.10774993896484375, 6.030799865722656, 14.615293502807617, 20.40964698791504, 6.9452972412109375, 19.450115203857422, 32.911956787109375, 10.305294036865234, 9.283163070678711, 15.274993896484375, -6.471885681152344, -0.7374477386474609, 8.388198852539062, -2.2063064575195312, 16.35867691040039, -0.5576400756835938, 10.478599548339844, 6.5367584228515625, 12.894416809082031, 21.894203186035156, -8.351348876953125, 22.879302978515625, 25.615928649902344, 2.9368743896484375, -2.8197250366210938, 3.4609451293945312, -0.8408966064453125, 15.531883239746094, 8.091552734375, 3.9328746795654297, 1.9421234130859375, 10.848808288574219, 9.285432815551758, 2.7142810821533203, 1.96038818359375, -2.560436248779297, 10.367591857910156, 1.4444694519042969, 5.2768096923828125, 13.76746940612793, 3.2462539672851562, 13.98769760131836, -7.942649841308594, 13.383426666259766, 0.730255126953125, -3.3705596923828125, -15.231719970703125, 6.3066558837890625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000157.npy"}
{"epoch": 0.23733938019652306, "step": 158, "batch_size": 64, "mean": 6.945442199707031, "std": 11.061383247375488, "min": -17.713592529296875, "p10": -5.542582321166992, "median": 4.598094940185547, "p90": 23.541204833984377, "max": 31.44354248046875, "pos_frac": 0.703125, "sample": [-1.9357986450195312, 13.604812622070312, 1.80877685546875, 8.436622619628906, -17.713592529296875, 15.35427474975586, 24.517990112304688, 11.157150268554688, -0.595367431640625, 0.8815078735351562, 20.672775268554688, 10.620895385742188, 10.012222290039062, 17.031545639038086, -17.198657989501953, 13.776365280151367, 31.44354248046875, 4.496217727661133, 4.280130386352539, 6.36407470703125, -2.2460708618164062, 4.64935302734375, 13.309728622436523, 1.7501564025878906, 23.33765411376953, 13.338348388671875, -1.4156494140625, 23.628440856933594, 17.992828369140625, 12.589881896972656, -5.761566162109375, 10.535499572753906, 23.901260375976562, -1.060333251953125, 22.739112854003906, 24.77311897277832, 14.736892700195312, -4.8043975830078125, 2.534992218017578, -8.003631591796875, 20.706298828125, 9.135871887207031, 1.7598648071289062, 24.14776611328125, 13.043312072753906, 14.725303649902344, -3.9681949615478516, 4.197639465332031, 26.907073974609375, -5.031620025634766, 4.546836853027344, -4.18304443359375, -7.589427947998047, 1.1159095764160156, 1.6356964111328125, -0.1776294708251953, -4.505193710327148, 5.383922576904297, 11.82830810546875, 4.025066375732422, -9.994895935058594, 2.6415328979492188, -0.19437217712402344, -9.1888427734375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000158.npy"}
{"epoch": 0.23885109599395313, "step": 159, "batch_size": 64, "mean": 7.900569915771484, "std": 11.643502235412598, "min": -32.221435546875, "p10": -4.741639900207519, "median": 6.414640426635742, "p90": 23.88414001464844, "max": 33.11738586425781, "pos_frac": 0.765625, "sample": [17.83066177368164, 13.702308654785156, 2.2672882080078125, 20.51215362548828, 10.093132019042969, -12.344291687011719, -6.597747802734375, 24.115036010742188, 2.3200645446777344, 33.11738586425781, 25.834075927734375, 5.064857482910156, 20.04559898376465, 6.567985534667969, -2.736083984375, 2.0928077697753906, 8.120384216308594, 30.666748046875, 21.775421142578125, -0.03260231018066406, 1.9101982116699219, -8.082374572753906, 6.071985244750977, 1.5021286010742188, 6.0102996826171875, 3.344327926635742, 1.7466793060302734, -8.898536682128906, 19.727401733398438, 23.345382690429688, -4.925041198730469, 15.495651245117188, 14.501550674438477, -0.1133880615234375, 4.447944641113281, 10.115144729614258, 21.328384399414062, 9.290962219238281, 4.5230712890625, -0.26349639892578125, 6.261295318603516, -9.778244018554688, 25.68364715576172, 0.6647624969482422, 11.310539245605469, 25.14708709716797, -4.313703536987305, 14.633176803588867, -0.456634521484375, 7.2681732177734375, 0.4543018341064453, 26.312419891357422, 9.054336547851562, -2.4652862548828125, -0.2716941833496094, -32.221435546875, 18.090070724487305, 13.426910400390625, 0.5731582641601562, 7.654428482055664, 16.972503662109375, 6.113929748535156, 12.261756896972656, 9.767509460449219], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000159.npy"}
{"epoch": 0.24036281179138322, "step": 160, "batch_size": 64, "mean": 6.634532928466797, "std": 11.82649040222168, "min": -16.697044372558594, "p10": -5.987269020080566, "median": 3.7108116149902344, "p90": 22.199938392639165, "max": 36.57383728027344, "pos_frac": 0.734375, "sample": [1.8161735534667969, -3.615833282470703, 14.961280822753906, 30.485488891601562, 1.0833892822265625, 23.06401824951172, 3.675140380859375, -5.325372695922852, 11.073654174804688, 11.513107299804688, 36.57383728027344, -5.929004669189453, 21.017683029174805, 17.65581512451172, 7.8187103271484375, 17.78509521484375, 1.1934528350830078, -16.697044372558594, 12.981334686279297, 10.807378768920898, -4.098411560058594, -8.466758728027344, 3.0360336303710938, 1.3667449951171875, 11.750288009643555, 1.9195976257324219, 10.793510437011719, 1.7612972259521484, 4.508148193359375, 1.598785400390625, 13.989606857299805, 8.812032699584961, 35.578399658203125, 28.451702117919922, 2.854433059692383, -5.906940460205078, 13.292304992675781, 1.6381645202636719, -0.5178165435791016, 22.706619262695312, 2.649127960205078, 13.651447296142578, 1.7012939453125, -12.167289733886719, -4.7398681640625, 8.991264343261719, -12.678634643554688, -6.012239456176758, 13.37667465209961, 11.906147003173828, -7.461631774902344, -4.3051605224609375, -3.1979751586914062, 12.827499389648438, 11.465065002441406, 3.6923294067382812, 1.26300048828125, 25.809547424316406, -15.775436401367188, 3.7292938232421875, 20.892166137695312, 20.199386596679688, 7.709197998046875, -5.9211578369140625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000160.npy"}
{"epoch": 0.2418745275888133, "step": 161, "batch_size": 64, "mean": 7.34462833404541, "std": 11.287602424621582, "min": -28.443252563476562, "p10": -5.27568941116333, "median": 7.730113983154297, "p90": 21.5956735610962, "max": 34.78108215332031, "pos_frac": 0.765625, "sample": [8.863960266113281, -15.201587677001953, 3.1466598510742188, 10.189453125, 2.3093719482421875, -1.0488739013671875, 17.280059814453125, 27.80779266357422, -6.902055740356445, -4.841734886169434, -9.22723388671875, 7.6446533203125, 11.328330993652344, 12.700363159179688, 25.25326156616211, 11.307348251342773, 8.075450897216797, -28.443252563476562, 16.018688201904297, -8.818161010742188, -3.72076416015625, 2.2830657958984375, 7.152313232421875, 2.1022872924804688, 17.487876892089844, -2.822345733642578, 5.5938873291015625, 13.663686752319336, 13.18841552734375, 2.9813690185546875, 20.171815872192383, 15.71258544921875, 10.277925491333008, 16.926727294921875, 17.86772918701172, -3.5336990356445312, -9.693157196044922, 34.78108215332031, 25.075424194335938, 22.20589828491211, 4.006122589111328, 14.119186401367188, -0.6838722229003906, 6.210601806640625, 10.47994613647461, 3.0206756591796875, 7.815574645996094, 8.269264221191406, 6.010498046875, 18.979080200195312, -3.9235382080078125, 7.849403381347656, 26.370742797851562, 7.625068664550781, 0.5872573852539062, 0.6721420288085938, -5.461669921875, 3.1716785430908203, 10.714607238769531, 11.09649658203125, 30.846420288085938, -2.0196456909179688, 0.8277034759521484, 8.327831268310547], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000161.npy"}
{"epoch": 0.24338624338624337, "step": 162, "batch_size": 64, "mean": 9.570235252380371, "std": 11.484976768493652, "min": -20.396713256835938, "p10": -5.714031219482421, "median": 8.937728881835938, "p90": 23.856363677978518, "max": 42.70106506347656, "pos_frac": 0.8125, "sample": [13.74566650390625, 7.126609802246094, -20.396713256835938, 4.581329345703125, 4.841636657714844, 19.888851165771484, 13.367803573608398, 6.777984619140625, 19.690765380859375, 26.08902931213379, 14.85858154296875, 16.504474639892578, 5.166505813598633, 42.70106506347656, 21.26390838623047, 7.390665054321289, 9.5992431640625, 26.644630432128906, 16.586517333984375, 25.478618621826172, 24.78085708618164, 20.698360443115234, 5.246162414550781, -5.0769805908203125, 10.751976013183594, -1.2456588745117188, 8.276214599609375, 2.5253753662109375, -12.503662109375, 15.063720703125, 5.724279403686523, 4.447998046875, 1.7075576782226562, 17.58865737915039, 11.329132080078125, 10.82098388671875, 15.945472717285156, 16.518659591674805, -0.26780128479003906, -9.862417221069336, -9.656600952148438, -0.37718963623046875, 24.193504333496094, 19.72855567932129, 18.69658851623535, 7.4950408935546875, -10.673965454101562, 13.809982299804688, -5.987052917480469, 0.527252197265625, 7.287723541259766, -0.1987457275390625, 10.898483276367188, 19.0865421295166, 23.0697021484375, 14.334121704101562, 2.6332473754882812, -11.967525482177734, 4.849761962890625, 4.769193649291992, 3.9325618743896484, 21.646621704101562, 25.335535049438477, 4.685646057128906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000162.npy"}
{"epoch": 0.24489795918367346, "step": 163, "batch_size": 64, "mean": 9.510485649108887, "std": 11.873640060424805, "min": -17.45188331604004, "p10": -7.9711975097656245, "median": 8.205589294433594, "p90": 23.850957107543945, "max": 31.333934783935547, "pos_frac": 0.765625, "sample": [-1.7831220626831055, 10.43701171875, -7.51861572265625, 17.187313079833984, 7.537971496582031, 2.919872283935547, -17.45188331604004, 5.967742919921875, 27.141448974609375, -8.296737670898438, 7.517112731933594, 18.237625122070312, 28.11547088623047, 7.210456848144531, 5.943244934082031, 16.212745666503906, 19.910724639892578, -2.8202667236328125, 29.68572998046875, 13.306610107421875, 3.2797088623046875, 21.9647216796875, 21.673498153686523, -8.1651611328125, -7.466266632080078, 25.848236083984375, 12.136798858642578, 23.313491821289062, 6.841947555541992, 7.1320037841796875, 14.003179550170898, 4.899471282958984, 16.78014373779297, -3.024658203125, 16.162750244140625, 18.663742065429688, -8.667232513427734, 19.3963623046875, -9.483253479003906, 2.9864234924316406, 14.129352569580078, 31.333934783935547, 14.351058959960938, 23.891490936279297, 9.489288330078125, -2.616556167602539, -0.6889991760253906, 8.341537475585938, 4.727666854858398, -0.7789306640625, -13.721992492675781, 23.51212501525879, 16.922988891601562, 1.2761192321777344, 1.6913414001464844, -9.434432983398438, 23.447425842285156, 23.756378173828125, 27.29632568359375, 21.557514190673828, 14.889657974243164, 2.922700881958008, 6.569074630737305, 8.06964111328125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000163.npy"}
{"epoch": 0.24640967498110355, "step": 164, "batch_size": 64, "mean": 9.104915618896484, "std": 12.5341157913208, "min": -24.96053695678711, "p10": -5.193962287902831, "median": 8.536632537841797, "p90": 26.007312774658207, "max": 34.45997619628906, "pos_frac": 0.796875, "sample": [-0.37718963623046875, 0.7093925476074219, -1.8307952880859375, 6.112125396728516, 4.812171936035156, 9.981605529785156, 32.20746612548828, 19.05915069580078, -0.214324951171875, 4.8838043212890625, 11.656829833984375, -4.061805725097656, 17.879894256591797, -12.675350189208984, 9.609428405761719, 13.696334838867188, 17.458839416503906, 9.524044036865234, 34.30145263671875, -5.373298645019531, 5.569969177246094, -11.574623107910156, 8.437705993652344, 8.63555908203125, 0.12841796875, 8.707630157470703, 25.055282592773438, 1.8866653442382812, 14.903167724609375, 3.76409912109375, 24.657373428344727, 28.110031127929688, -4.775510787963867, 11.783733367919922, 3.3915138244628906, -12.789909362792969, 27.264625549316406, 34.45997619628906, 28.089542388916016, -11.625450134277344, 8.399955749511719, 4.207618713378906, 3.2385597229003906, 18.246564865112305, 14.481529235839844, 12.970499038696289, 22.489463806152344, 2.1458358764648438, 21.002960205078125, 3.797748565673828, 5.867584228515625, 0.10516738891601562, 25.115623474121094, -11.03639030456543, -0.4585742950439453, -24.96053695678711, 7.760566711425781, 23.083921432495117, 23.95604705810547, 1.7800979614257812, 26.38946533203125, 12.080657958984375, 11.549131393432617, 9.061492919921875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000164.npy"}
{"epoch": 0.24792139077853365, "step": 165, "batch_size": 64, "mean": 8.175054550170898, "std": 12.426295280456543, "min": -22.03217315673828, "p10": -6.252104949951172, "median": 8.172661781311035, "p90": 22.869635963439944, "max": 31.386390686035156, "pos_frac": 0.75, "sample": [31.386390686035156, 8.455022811889648, 10.995321273803711, 18.9482421875, 8.864662170410156, 2.17193603515625, 5.401691436767578, -1.951873779296875, 12.26315689086914, 4.145084381103516, 2.763904571533203, 9.63161849975586, 27.163490295410156, 0.19754981994628906, -1.74395751953125, 7.713199615478516, 7.964897155761719, 11.493125915527344, -20.003599166870117, -4.683357238769531, -17.392242431640625, -2.2504425048828125, 19.804244995117188, -3.128721237182617, 18.918197631835938, 22.015836715698242, 8.380426406860352, 14.20832633972168, 27.185596466064453, 27.99584197998047, 6.674400329589844, 0.9801883697509766, 27.535919189453125, 25.808547973632812, 16.344581604003906, -22.03217315673828, 1.61553955078125, 15.336944580078125, 19.136058807373047, 19.286087036132812, -6.672765731811523, -2.428985595703125, -1.842041015625, 4.519559860229492, 19.192651748657227, -6.245658874511719, 3.192676544189453, 1.583709716796875, 15.90272331237793, -2.2791213989257812, -6.2548675537109375, 18.94011688232422, -21.519813537597656, 21.666278839111328, 4.767160415649414, -7.806659698486328, 8.800102233886719, 7.241046905517578, 18.200498580932617, 7.9141845703125, 22.00202178955078, 23.235549926757812, 17.06867218017578, 16.426795959472656], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000165.npy"}
{"epoch": 0.2494331065759637, "step": 166, "batch_size": 64, "mean": 6.286128997802734, "std": 10.5748872756958, "min": -10.909603118896484, "p10": -5.995008850097656, "median": 5.592048645019531, "p90": 21.380439758300785, "max": 31.72808074951172, "pos_frac": 0.65625, "sample": [15.718612670898438, -0.8866119384765625, -3.6534576416015625, 14.193817138671875, 20.015426635742188, 1.9553680419921875, -5.686454772949219, 22.086883544921875, 6.549468994140625, 5.65643310546875, -3.9347381591796875, 6.45244026184082, 25.551090240478516, -5.30767822265625, 18.5423583984375, 6.6617431640625, -0.08591461181640625, 14.067289352416992, -5.1921539306640625, 3.1624526977539062, -3.809907913208008, 17.594181060791016, -6.893379211425781, 5.0385894775390625, 23.606475830078125, 7.728132247924805, -6.872467041015625, 17.218902587890625, 10.371383666992188, 7.888084411621094, -8.723983764648438, -10.909603118896484, 3.769124984741211, 21.746421813964844, 17.08147430419922, -1.1071128845214844, 20.52648162841797, 2.9462432861328125, 11.42266845703125, 12.288782119750977, 5.5276641845703125, 27.5850830078125, -5.806678771972656, -6.075721740722656, 9.7596435546875, -4.825111389160156, -3.1840591430664062, 0.6229476928710938, 8.534126281738281, 31.72808074951172, 10.765312194824219, -8.6361083984375, 0.4461669921875, 11.949142456054688, -8.026077270507812, -2.3770294189453125, 12.773460388183594, -4.450239181518555, 27.08404541015625, -2.603424072265625, 8.091934204101562, 13.424224853515625, 1.589437484741211, 1.6385650634765625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000166.npy"}
{"epoch": 0.2509448223733938, "step": 167, "batch_size": 64, "mean": 9.522764205932617, "std": 10.926082611083984, "min": -13.284492492675781, "p10": -1.8192535400390621, "median": 7.2781572341918945, "p90": 24.967391204833984, "max": 31.45186996459961, "pos_frac": 0.828125, "sample": [4.898872375488281, 4.630573272705078, 3.7422313690185547, 9.085857391357422, -9.953300476074219, 21.980941772460938, 11.150550842285156, 4.8926849365234375, 7.001768112182617, 15.741104125976562, 18.825468063354492, 4.779937744140625, 7.577598571777344, -1.9474868774414062, -0.9232501983642578, 0.21218490600585938, 24.815841674804688, 21.489044189453125, 17.50762939453125, -0.090301513671875, -13.284492492675781, -0.24794387817382812, 4.887731552124023, 1.6960296630859375, 9.859399795532227, 1.0971221923828125, 4.019248962402344, 0.14500999450683594, 4.739017486572266, 13.54510498046875, 5.2415008544921875, 17.40447235107422, 3.4679622650146484, 12.350208282470703, 13.348190307617188, 10.757537841796875, 29.758251190185547, 30.666427612304688, 19.776092529296875, 18.12146759033203, 2.752012252807617, 21.964874267578125, 0.22229766845703125, 7.554546356201172, 18.588247299194336, -8.94207763671875, -9.246505737304688, 30.26836395263672, -1.5200424194335938, 17.59587860107422, 19.51870346069336, 0.7517890930175781, 3.7753849029541016, 31.45186996459961, 15.84674072265625, 25.052650451660156, -3.2316665649414062, 29.4892578125, 19.273956298828125, 25.03234100341797, 11.76507568359375, 1.2623176574707031, 2.0243301391601562, -4.559698104858398], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000167.npy"}
{"epoch": 0.25245653817082386, "step": 168, "batch_size": 64, "mean": 7.4220194816589355, "std": 10.882966041564941, "min": -12.45782470703125, "p10": -4.625502014160156, "median": 4.642917633056641, "p90": 23.592224311828616, "max": 28.717308044433594, "pos_frac": 0.671875, "sample": [7.328941345214844, -5.5028076171875, 25.639606475830078, -3.807170867919922, 5.00987434387207, 22.549354553222656, 13.222824096679688, -2.8625030517578125, -12.45782470703125, -9.370452880859375, 6.643180847167969, -2.6095504760742188, 4.3990020751953125, 27.093353271484375, 9.899826049804688, 4.588569641113281, 14.452068328857422, 4.697265625, 0.32311248779296875, 13.95068359375, 12.470710754394531, -1.7634830474853516, 2.1063575744628906, 5.922414779663086, -0.8247241973876953, 10.30984878540039, 24.885154724121094, 2.345508575439453, -5.844688415527344, 14.827545166015625, 1.0599212646484375, -0.1730194091796875, 8.187911987304688, 23.854736328125, 22.708221435546875, -3.240509033203125, 19.84210205078125, -0.5365943908691406, -0.27457427978515625, 1.1326751708984375, 2.782461166381836, 28.717308044433594, -4.001060485839844, 22.45861053466797, 17.730575561523438, -1.03509521484375, 14.83123779296875, -5.907928466796875, 3.42633056640625, -2.7838897705078125, 12.342544555664062, 11.617355346679688, 21.300289154052734, 3.186695098876953, -3.5358657836914062, 22.97969627380371, -4.893119812011719, -6.733936309814453, -1.1962738037109375, 1.151601791381836, 21.82605743408203, 27.16167449951172, 27.19000244140625, 6.211067199707031], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000168.npy"}
{"epoch": 0.25396825396825395, "step": 169, "batch_size": 64, "mean": 8.366128921508789, "std": 13.639642715454102, "min": -26.29315185546875, "p10": -5.334801101684569, "median": 7.45419979095459, "p90": 25.640818023681646, "max": 48.99336242675781, "pos_frac": 0.75, "sample": [6.552146911621094, 23.984535217285156, 20.630157470703125, -10.652656555175781, 7.411935806274414, 2.948993682861328, -5.804286956787109, -0.4376792907714844, 9.7698974609375, 12.134048461914062, 7.918525695800781, 0.2216339111328125, 18.802749633789062, 24.531478881835938, 5.331146240234375, -3.470479965209961, 5.190052032470703, 13.863876342773438, 14.197502136230469, -0.7924461364746094, -4.2393341064453125, 21.360794067382812, 26.116249084472656, 9.4058837890625, -2.9651031494140625, 44.05046844482422, 2.320220947265625, 8.929283142089844, -2.0187339782714844, 10.57925033569336, 5.944780349731445, 4.3767242431640625, 19.782215118408203, -2.575725555419922, 21.298446655273438, 4.998619079589844, 17.120450973510742, -26.29315185546875, 9.428789138793945, 7.3461761474609375, 0.2157440185546875, 7.496463775634766, 2.1873703002929688, 0.988616943359375, 7.890743255615234, 10.357177734375, 8.894664764404297, 4.91888427734375, 27.60101318359375, 13.948944091796875, -0.17411041259765625, 10.753118515014648, 9.088874816894531, -17.530517578125, -10.906600952148438, 48.99336242675781, 28.498062133789062, 28.439231872558594, -15.05504035949707, 33.72604751586914, 3.3317337036132812, 20.470502853393555, -14.923187255859375, -1.0763092041015625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000169.npy"}
{"epoch": 0.25547996976568405, "step": 170, "batch_size": 64, "mean": 8.350889205932617, "std": 13.500554084777832, "min": -29.83429527282715, "p10": -11.239326667785642, "median": 9.016425132751465, "p90": 24.144585418701173, "max": 37.54969787597656, "pos_frac": 0.765625, "sample": [10.706306457519531, 0.98675537109375, 8.990989685058594, 8.67706298828125, 2.4148941040039062, 31.810211181640625, 7.566686630249023, 9.041860580444336, 4.537559509277344, 28.86571502685547, -15.9027099609375, 3.6000137329101562, 16.823341369628906, 32.53825378417969, 26.95745849609375, -8.660024642944336, 2.7884445190429688, 12.363525390625, 19.054466247558594, -2.3307876586914062, 3.7821998596191406, 23.68170166015625, 15.979604721069336, 1.9668560028076172, -12.344741821289062, 11.52960205078125, -16.646812438964844, -3.4221267700195312, -29.83429527282715, 15.83258056640625, 21.408946990966797, 16.503799438476562, 20.102256774902344, 8.971590042114258, -13.45814323425293, 11.160537719726562, -7.77197265625, 13.89175033569336, 21.79198455810547, -3.4168243408203125, 11.221099853515625, -0.3668937683105469, -0.43376922607421875, 17.235382080078125, 9.127166748046875, 37.54969787597656, 10.444580078125, 23.329116821289062, 7.475048065185547, 1.7459259033203125, 3.7183380126953125, -4.238615036010742, 1.8666000366210938, 29.97313690185547, -12.821723937988281, 4.831691741943359, 18.08094024658203, 24.34296417236328, 0.5534820556640625, 9.74482536315918, 21.30856704711914, 21.673995971679688, 13.015754699707031, -15.458976745605469], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000170.npy"}
{"epoch": 0.25699168556311414, "step": 171, "batch_size": 64, "mean": 9.011910438537598, "std": 14.256095886230469, "min": -32.69062042236328, "p10": -4.344390869140624, "median": 6.916728973388672, "p90": 27.5221420288086, "max": 33.42684555053711, "pos_frac": 0.703125, "sample": [9.024345397949219, 7.830883026123047, 25.907485961914062, 7.197681427001953, 20.294692993164062, 29.794849395751953, 17.01199722290039, 25.029296875, 1.1436576843261719, 4.17811393737793, -1.736459732055664, -0.019439697265625, 27.970443725585938, 14.009632110595703, 33.42684555053711, -3.5999679565429688, 20.626712799072266, 5.299251556396484, 19.87709617614746, -1.2554702758789062, -0.5935688018798828, 6.473535537719727, 16.031864166259766, -0.555999755859375, 23.219768524169922, 0.6019287109375, 10.914566040039062, 1.32427978515625, 31.49205780029297, -24.598318099975586, 15.325027465820312, 15.2828369140625, 33.30517578125, -7.8684539794921875, 13.665096282958984, 25.110088348388672, -4.663429260253906, -12.282745361328125, 32.469905853271484, 26.476104736328125, -32.69062042236328, 3.660249710083008, 17.708683013916016, 3.035125732421875, 4.4706573486328125, 25.903839111328125, 19.57867431640625, -0.10880279541015625, -2.7303390502929688, 13.97918701171875, 6.635776519775391, 29.933082580566406, 23.692216873168945, 1.7511062622070312, -1.4288444519042969, -4.719207763671875, 11.130340576171875, -0.06657791137695312, -2.4042320251464844, -23.596588134765625, 17.065948486328125, 0.3209991455078125, 3.417144775390625, -0.9169082641601562], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000171.npy"}
{"epoch": 0.2585034013605442, "step": 172, "batch_size": 64, "mean": 8.390253067016602, "std": 12.77292537689209, "min": -25.69481086730957, "p10": -5.633728027343749, "median": 7.639862060546875, "p90": 26.234125328063964, "max": 39.24267578125, "pos_frac": 0.796875, "sample": [-10.441390991210938, 7.421272277832031, 1.6089324951171875, 7.858451843261719, 1.99957275390625, 1.2299385070800781, -0.23252105712890625, 13.410720825195312, 16.959304809570312, 28.64606285095215, 39.24267578125, 8.880523681640625, 8.246719360351562, -0.8687591552734375, 15.399505615234375, 35.902008056640625, 12.215606689453125, -11.66314697265625, 0.50579833984375, 8.403640747070312, 9.246902465820312, 2.7244720458984375, -15.984968185424805, 0.400482177734375, 8.292625427246094, 7.242591857910156, 3.9800643920898438, 15.16668701171875, 16.325050354003906, -1.2904205322265625, 13.181377410888672, 17.450340270996094, 8.486434936523438, 19.710098266601562, 1.49993896484375, 0.6096000671386719, 26.181116104125977, 16.5341796875, 26.25684356689453, 2.2608108520507812, -1.5349788665771484, 9.054420471191406, 6.057701110839844, 21.099611282348633, 21.054523468017578, -17.237804412841797, 1.493927001953125, -4.7052154541015625, 2.2510452270507812, 11.135459899902344, -7.6428070068359375, 25.19232177734375, 28.622238159179688, -25.69481086730957, 2.0339298248291016, 18.28424072265625, 1.2060165405273438, 6.829902648925781, 1.8309059143066406, 27.497282028198242, -6.0316619873046875, -0.6283168792724609, 29.47008514404297, 24.33905029296875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000172.npy"}
{"epoch": 0.2600151171579743, "step": 173, "batch_size": 64, "mean": 8.837799072265625, "std": 15.64268684387207, "min": -25.12849235534668, "p10": -9.44524574279785, "median": 10.252440452575684, "p90": 30.557076263427742, "max": 41.42401123046875, "pos_frac": 0.6875, "sample": [-13.940820693969727, 1.5234832763671875, -1.7200889587402344, 11.590301513671875, 17.25970458984375, 24.7415771484375, 10.800065994262695, -3.1073074340820312, -2.0851173400878906, 18.46112060546875, 3.016857147216797, 12.894882202148438, 28.53522491455078, -11.352607727050781, -6.563825607299805, 28.036537170410156, 35.11900329589844, 3.2937355041503906, 12.686674118041992, 1.5028152465820312, 12.736328125, 32.80235290527344, -4.155220031738281, 27.448822021484375, -25.12849235534668, -22.882003784179688, 12.834419250488281, 20.032867431640625, 4.049884796142578, 1.1438446044921875, -6.583869934082031, 32.12583923339844, -8.54043197631836, 31.423583984375, 11.726585388183594, 3.049135208129883, 23.91619873046875, -5.983371734619141, 18.091896057128906, 19.569473266601562, 11.033721923828125, 1.5870742797851562, 4.45805549621582, 27.76567840576172, 12.540229797363281, -22.563919067382812, -2.8577346801757812, -6.509620666503906, 41.42401123046875, 22.275516510009766, 32.762908935546875, -15.90770149230957, 10.463325500488281, 20.54460906982422, 24.552181243896484, -3.5794754028320312, -0.6620330810546875, -9.833023071289062, 10.041555404663086, 36.65234375, 1.862945556640625, 19.315494537353516, -0.25543975830078125, 2.138336181640625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000173.npy"}
{"epoch": 0.2615268329554044, "step": 174, "batch_size": 64, "mean": 9.184125900268555, "std": 13.119674682617188, "min": -23.80059051513672, "p10": -4.216256713867187, "median": 8.387231826782227, "p90": 27.517774963378912, "max": 36.09403991699219, "pos_frac": 0.78125, "sample": [6.867622375488281, 12.822174072265625, 19.342613220214844, 5.545125961303711, 10.933523178100586, -3.416900634765625, 3.8685531616210938, 19.80731201171875, 3.048248291015625, 8.441822052001953, 28.659873962402344, 25.884674072265625, 3.2060470581054688, -1.9012680053710938, 17.179931640625, -3.1113147735595703, 23.146080017089844, 18.944997787475586, 16.520671844482422, -23.80059051513672, 29.317276000976562, 8.449676513671875, -2.705585479736328, 15.227058410644531, 35.584068298339844, -1.2089672088623047, 8.7928466796875, -3.318572998046875, 34.256439208984375, 6.840660095214844, 1.019989013671875, 36.09403991699219, 32.79591369628906, -4.558837890625, 14.667404174804688, 12.6072998046875, 6.290092468261719, 9.172714233398438, 2.293020248413086, 25.238494873046875, -7.994140625, 26.258819580078125, 8.420970916748047, 0.9164829254150391, 0.02738189697265625, 21.59217071533203, 15.372566223144531, -9.527664184570312, 28.057327270507812, 8.353492736816406, -22.189125061035156, 17.590255737304688, 1.4032440185546875, 0.9104347229003906, 16.74779510498047, -3.1078033447265625, 11.511344909667969, -8.196014404296875, 18.089107513427734, -12.481613159179688, 5.433477401733398, 3.8686065673828125, 7.069786071777344, 0.8129119873046875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000174.npy"}
{"epoch": 0.26303854875283444, "step": 175, "batch_size": 64, "mean": 10.22334098815918, "std": 14.192386627197266, "min": -18.1165771484375, "p10": -6.490357208251953, "median": 7.407447814941406, "p90": 29.409215736389164, "max": 41.434356689453125, "pos_frac": 0.71875, "sample": [-12.651845932006836, 17.415369033813477, -3.9837646484375, 3.0934619903564453, -6.412422180175781, 20.994461059570312, 0.4891338348388672, 18.398193359375, 7.9901885986328125, -2.0861549377441406, 4.5543212890625, 29.80347442626953, 12.876876831054688, 17.35395050048828, -6.5237579345703125, 28.48927879333496, 10.615966796875, 24.981117248535156, 25.87201690673828, 2.9949302673339844, 5.299285888671875, 5.905534744262695, 5.303565979003906, 26.300338745117188, 21.514144897460938, 9.971725463867188, -0.3755149841308594, -1.9781417846679688, 30.634775161743164, -18.1165771484375, 26.40875244140625, -0.7763748168945312, 9.669273376464844, 3.7805099487304688, 3.08807373046875, -1.1467666625976562, 13.45186996459961, 34.081520080566406, -2.519468307495117, -15.217140197753906, 8.402420043945312, -6.5413970947265625, 1.58233642578125, 6.82470703125, -5.5284576416015625, -2.5285186767578125, 25.440643310546875, -7.989238739013672, 25.747161865234375, 28.083080291748047, 10.567991256713867, 27.37226676940918, 4.128440856933594, -8.691741943359375, 3.6777801513671875, 25.106971740722656, -6.313442230224609, 34.637733459472656, 19.459304809570312, 41.434356689453125, 1.5738372802734375, 31.396041870117188, 14.467208862304688, 32.44016647338867], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000175.npy"}
{"epoch": 0.26455026455026454, "step": 176, "batch_size": 64, "mean": 6.519006729125977, "std": 14.160381317138672, "min": -27.298797607421875, "p10": -8.375358581542967, "median": 5.9065446853637695, "p90": 25.744810485839846, "max": 41.626495361328125, "pos_frac": 0.65625, "sample": [33.10150146484375, -21.47393798828125, 14.914222717285156, 17.243255615234375, 0.470123291015625, 4.8453826904296875, -4.406440734863281, 15.049720764160156, 5.860208511352539, 3.173370361328125, -21.037857055664062, -8.895011901855469, -7.162834167480469, 9.914230346679688, -17.02660369873047, 25.805015563964844, -0.3329353332519531, 19.073047637939453, -2.398529052734375, -1.8155364990234375, -3.1194305419921875, -2.1765289306640625, 6.300561904907227, 35.31675338745117, 17.41217041015625, 10.97735595703125, -6.837089538574219, 17.880416870117188, 0.7581939697265625, 18.949928283691406, 9.45709228515625, -0.9653854370117188, -2.499298095703125, 20.695457458496094, 13.459531784057617, 31.165813446044922, 1.1269989013671875, 13.76971435546875, 0.5044403076171875, -9.211441040039062, 2.0576705932617188, 5.952880859375, 12.058320999145508, 26.595027923583984, 7.098045349121094, 10.823314666748047, -5.2185211181640625, -4.092922210693359, 17.33026123046875, 25.604331970214844, -2.0525970458984375, 20.095870971679688, 1.8559951782226562, 15.116981506347656, 2.0084686279296875, 6.699756622314453, 41.626495361328125, -0.7262954711914062, -2.3151168823242188, 32.14593505859375, 6.674079895019531, -27.298797607421875, 7.193904876708984, -19.882308959960938], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000176.npy"}
{"epoch": 0.2660619803476946, "step": 177, "batch_size": 64, "mean": 12.537084579467773, "std": 12.976811408996582, "min": -22.030548095703125, "p10": -1.8451749801635737, "median": 10.95036506652832, "p90": 29.613881301879882, "max": 39.65177917480469, "pos_frac": 0.859375, "sample": [4.9313507080078125, 3.0194664001464844, 18.404294967651367, 39.65177917480469, 19.36516571044922, -1.291299819946289, 8.921867370605469, 10.592803955078125, 35.0223388671875, 13.371688842773438, 37.856483459472656, 29.010089874267578, 22.173458099365234, -0.8355903625488281, 11.307926177978516, -9.398845672607422, 31.783767700195312, 4.084934234619141, -4.299106597900391, 12.459611892700195, 12.005844116210938, 11.831024169921875, 4.039606094360352, 28.40730857849121, 15.366235733032227, 2.959320068359375, 6.436712265014648, 23.78707504272461, -4.304136276245117, 3.6087703704833984, 12.642730712890625, 7.588783264160156, 28.837081909179688, 11.67156982421875, 4.019487380981445, 26.485034942626953, 28.22206687927246, 3.3841018676757812, 8.7060546875, 20.142803192138672, 2.2878456115722656, 9.082101821899414, 20.343101501464844, 8.142601013183594, 19.436012268066406, -11.089530944824219, 2.9788818359375, -3.804656982421875, 29.379085540771484, 0.3670654296875, 29.714508056640625, 1.0129032135009766, -22.030548095703125, 36.10052490234375, 7.785736083984375, 14.979682922363281, 14.017955780029297, 34.38282775878906, 18.8873291015625, 9.918375015258789, 6.427089691162109, 26.448200225830078, -2.082550048828125, 7.719200134277344], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000177.npy"}
{"epoch": 0.2675736961451247, "step": 178, "batch_size": 64, "mean": 7.1955084800720215, "std": 11.586197853088379, "min": -17.100753784179688, "p10": -7.152301788330078, "median": 6.437282562255859, "p90": 24.234580230712893, "max": 32.33345031738281, "pos_frac": 0.734375, "sample": [6.79417610168457, 20.423110961914062, 6.985569000244141, -0.868011474609375, 16.792152404785156, 0.8140163421630859, -1.338775634765625, 13.671039581298828, 18.83271026611328, 9.570369720458984, -7.274101257324219, 6.506561279296875, 7.284904479980469, -15.36961555480957, 6.75518798828125, 3.802001953125, 28.345733642578125, -0.8547611236572266, -6.86810302734375, -0.6639747619628906, 26.988445281982422, 3.2578601837158203, 7.099273681640625, -15.52972412109375, 3.5133018493652344, 4.34759521484375, 4.4193878173828125, 7.742034912109375, 4.711753845214844, 4.791540145874023, 16.121368408203125, -0.057247161865234375, 6.368003845214844, 32.33345031738281, 6.8195343017578125, 10.983589172363281, 8.761016845703125, 4.571868896484375, 28.44414520263672, 4.402488708496094, -11.786514282226562, 20.904220581054688, 24.374900817871094, 5.706058502197266, -17.100753784179688, 26.4483642578125, -1.3251876831054688, -10.320087432861328, 26.417694091796875, 4.288972854614258, 22.020217895507812, 23.90716552734375, 9.354907989501953, 18.447593688964844, 5.951940536499023, 1.8635787963867188, 14.6046142578125, -3.9388885498046875, -16.387298583984375, -2.2684173583984375, 11.190582275390625, 13.739112854003906, 13.661018371582031, -2.6711254119873047], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000178.npy"}
{"epoch": 0.2690854119425548, "step": 179, "batch_size": 64, "mean": 8.756677627563477, "std": 13.864158630371094, "min": -28.405746459960938, "p10": -7.26568374633789, "median": 9.08315658569336, "p90": 24.501072120666507, "max": 43.27384948730469, "pos_frac": 0.765625, "sample": [13.850669860839844, -4.541259765625, -2.32171630859375, 11.169769287109375, 0.27751922607421875, 11.344100952148438, 11.402956008911133, 14.302337646484375, 33.541229248046875, 14.127071380615234, 24.1535701751709, 24.650001525878906, 23.05044937133789, -15.168296813964844, 8.724594116210938, -9.582138061523438, -28.405746459960938, -7.081428527832031, 3.5581588745117188, 9.441719055175781, -3.360475540161133, 20.584430694580078, 5.5361785888671875, 19.62718963623047, 28.065364837646484, -17.301063537597656, -2.8300933837890625, 20.460357666015625, 4.885650634765625, 7.029191970825195, 10.939273834228516, 19.21739959716797, 3.9396018981933594, -5.5355682373046875, 1.553436279296875, -10.64434814453125, 4.3903350830078125, 41.80140686035156, 24.117080688476562, 0.7011260986328125, 15.778875350952148, 15.972999572753906, 15.691965103149414, 11.046985626220703, -7.3446502685546875, 17.540283203125, 30.245147705078125, 6.298805236816406, 28.83251190185547, 43.27384948730469, 15.715579986572266, 14.893035888671875, 20.337509155273438, 3.620269775390625, 0.19020843505859375, 14.601486206054688, 5.991279602050781, 0.1549530029296875, 6.929910659790039, -1.8903961181640625, -16.289392471313477, 0.00649261474609375, 13.694488525390625, -4.534854888916016], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000179.npy"}
{"epoch": 0.2705971277399849, "step": 180, "batch_size": 64, "mean": 7.216020107269287, "std": 13.164663314819336, "min": -23.77497100830078, "p10": -9.454264831542966, "median": 6.1182861328125, "p90": 26.810099601745605, "max": 35.99208068847656, "pos_frac": 0.765625, "sample": [14.983673095703125, 1.0554656982421875, 6.5201263427734375, 4.833427429199219, 3.4076080322265625, 11.426902770996094, -11.757102966308594, 7.481960296630859, 0.9588146209716797, -2.2697296142578125, 26.6485538482666, 10.316226959228516, -6.7567596435546875, -0.6043548583984375, 6.0098724365234375, 1.3846015930175781, -10.357208251953125, 23.349075317382812, 1.1957015991210938, -21.342552185058594, 29.62860107421875, 9.046520233154297, 6.896638870239258, 20.2686767578125, 10.736236572265625, 18.551834106445312, 0.6843757629394531, 2.252960205078125, 6.000478744506836, 9.77975845336914, 8.857864379882812, 23.753049850463867, -12.172386169433594, 2.4152984619140625, 7.451488494873047, 12.244338989257812, 6.2266998291015625, 30.448829650878906, 0.038127899169921875, -2.236307144165039, 4.981027603149414, 18.855257034301758, 29.185583114624023, -0.2167510986328125, -3.2423019409179688, 2.1620845794677734, 32.829345703125, -2.073383331298828, 35.99208068847656, 1.8708209991455078, 24.977157592773438, -15.237483978271484, 4.912071228027344, -7.3473968505859375, 6.272834777832031, 8.23879623413086, 32.45227813720703, -23.77497100830078, 14.718074798583984, -15.768539428710938, 14.24285888671875, 10.087348937988281, 26.87933349609375, 3.471773147583008], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000180.npy"}
{"epoch": 0.272108843537415, "step": 181, "batch_size": 64, "mean": 10.40049934387207, "std": 11.450883865356445, "min": -12.060718536376953, "p10": -2.0254131317138664, "median": 8.613786697387695, "p90": 25.95672187805176, "max": 38.584007263183594, "pos_frac": 0.828125, "sample": [11.716445922851562, 14.531982421875, 0.3611297607421875, -0.11566734313964844, 4.525684356689453, 2.220827102661133, 9.369880676269531, -8.142227172851562, 3.423961639404297, 1.5274772644042969, 19.885948181152344, 6.120136260986328, 12.05624771118164, -5.0898284912109375, 13.903985977172852, 7.7161102294921875, 18.09415054321289, 5.9759674072265625, 22.010963439941406, -6.3997344970703125, 15.82672119140625, 27.856674194335938, 18.943035125732422, 13.364311218261719, 33.327667236328125, 19.8001708984375, 5.85870361328125, 38.584007263183594, 3.0130462646484375, -12.060718536376953, 21.031234741210938, 4.4585723876953125, 32.183074951171875, -7.25422477722168, 18.385589599609375, 0.8694419860839844, -1.3125877380371094, 31.370742797851562, 2.1514930725097656, 4.776611328125, 21.498477935791016, 4.063840866088867, 18.657608032226562, 25.885086059570312, 23.040592193603516, 1.9273242950439453, 2.038278579711914, 9.225086212158203, 11.211944580078125, -0.0003528594970703125, 3.2247314453125, 25.987422943115234, 34.14048767089844, 1.5137100219726562, 8.002487182617188, 20.258132934570312, 3.75164794921875, 9.881057739257812, 10.051340103149414, -2.3309097290039062, 12.723037719726562, -0.6066303253173828, 20.29193878173828, -3.6714324951171875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000181.npy"}
{"epoch": 0.273620559334845, "step": 182, "batch_size": 64, "mean": 10.551786422729492, "std": 14.212845802307129, "min": -27.768783569335938, "p10": -6.097832489013672, "median": 8.935894012451172, "p90": 28.68572235107422, "max": 35.50494384765625, "pos_frac": 0.75, "sample": [28.16278076171875, -10.091573715209961, -7.717098236083984, 11.822265625, 17.015174865722656, 17.2423095703125, 15.228788375854492, -8.856643676757812, -6.0845184326171875, 28.524703979492188, 15.088569641113281, 5.0677032470703125, 35.50494384765625, 1.1362457275390625, 25.688316345214844, 17.275604248046875, 9.357032775878906, 8.514755249023438, 28.426952362060547, 9.751899719238281, 17.133710861206055, 1.0146160125732422, 0.4105377197265625, -5.0650787353515625, 19.41405487060547, -0.407958984375, 25.455190658569336, 7.915302276611328, 27.551116943359375, 27.703020095825195, 33.040313720703125, 9.679222106933594, 7.46160888671875, 6.293422698974609, 29.32256317138672, 0.0278778076171875, -0.6573333740234375, 3.356353759765625, 28.754730224609375, -0.5315437316894531, 8.414329528808594, -16.445188522338867, 31.710769653320312, -2.5419998168945312, 26.78661346435547, 20.697654724121094, 4.882049560546875, -3.9431838989257812, 23.715866088867188, 3.0239028930664062, 19.979598999023438, 30.952171325683594, -6.103538513183594, 24.253225326538086, 31.191802978515625, 5.57963752746582, -3.4777069091796875, 9.724508285522461, 23.69240951538086, 3.0925350189208984, 4.237285614013672, -14.003280639648438, -1.2663154602050781, -27.768783569335938], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000182.npy"}
{"epoch": 0.2751322751322751, "step": 183, "batch_size": 64, "mean": 4.446890830993652, "std": 11.647648811340332, "min": -30.68030548095703, "p10": -8.540689849853514, "median": 2.9869470596313477, "p90": 20.06326293945313, "max": 28.22052764892578, "pos_frac": 0.671875, "sample": [4.432403564453125, 23.334259033203125, -0.7711944580078125, 0.5659141540527344, 4.497285842895508, -3.85931396484375, -12.074203491210938, 23.77576446533203, -9.122634887695312, -2.5736083984375, -30.68030548095703, -2.1650314331054688, 15.797733306884766, -4.8456573486328125, 11.452888488769531, 15.725292205810547, -1.1512107849121094, 12.01666259765625, 16.6971435546875, 11.464824676513672, 14.784904479980469, -4.7111663818359375, 15.044136047363281, -7.312021255493164, 2.7809066772460938, -4.221412658691406, 7.2689208984375, 1.2016372680664062, -15.30657958984375, -8.945877075195312, 28.22052764892578, 8.764389038085938, 3.6226119995117188, 2.033538818359375, -7.240043640136719, 10.314727783203125, 10.018150329589844, 25.814838409423828, -24.157005310058594, 23.294536590576172, -5.436023712158203, 0.5495834350585938, 24.05091094970703, -1.9585113525390625, 2.2679786682128906, 7.706661224365234, 12.48472785949707, 3.1929874420166016, -7.595252990722656, -1.003274917602539, 7.784820556640625, 6.3715972900390625, 0.9961204528808594, 1.5874176025390625, 19.387252807617188, 0.7794399261474609, 12.147254943847656, -11.603302001953125, 2.5956859588623047, 0.6870040893554688, 8.084068298339844, 20.352981567382812, 15.957048416137695, 11.427108764648438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000183.npy"}
{"epoch": 0.2766439909297052, "step": 184, "batch_size": 64, "mean": 7.624444961547852, "std": 10.335906028747559, "min": -8.84665298461914, "p10": -1.9713539123535153, "median": 4.969243049621582, "p90": 23.379735374450686, "max": 31.73811149597168, "pos_frac": 0.734375, "sample": [1.237060546875, 21.65094757080078, 23.778528213500977, 0.21370315551757812, 14.165061950683594, 1.0353546142578125, -0.37464141845703125, 0.3717193603515625, -1.5265769958496094, 2.6131134033203125, -0.02480316162109375, 25.052391052246094, 0.014057159423828125, 6.6572113037109375, 18.251300811767578, -6.233131408691406, 19.211395263671875, 12.32900619506836, 6.717372894287109, 28.498611450195312, -6.6777191162109375, 31.73811149597168, 13.56796646118164, 29.57604217529297, -8.039981842041016, 21.224472045898438, 5.270212173461914, 11.758468627929688, 25.158538818359375, 2.0706787109375, 13.713432312011719, -1.0195770263671875, 4.531280517578125, -8.84665298461914, -0.01430511474609375, 6.794666290283203, 28.97802734375, 9.791749954223633, 1.2790145874023438, 19.615394592285156, 8.601240158081055, -1.6789627075195312, 15.792869567871094, 0.303741455078125, 12.844635009765625, 6.097394943237305, 14.745147705078125, 7.547119140625, 0.2667407989501953, 22.44921875, -7.962587356567383, -1.5475921630859375, 5.504219055175781, 4.66827392578125, 14.527374267578125, 10.862411499023438, -3.0901966094970703, 4.333290100097656, -0.5942192077636719, 3.0319671630859375, -2.0966644287109375, 1.8955078125, -1.5254535675048828, -1.1185455322265625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000184.npy"}
{"epoch": 0.2781557067271353, "step": 185, "batch_size": 64, "mean": 8.20844554901123, "std": 14.408135414123535, "min": -24.72026824951172, "p10": -10.0419038772583, "median": 7.915760040283203, "p90": 27.231465148925786, "max": 36.11114501953125, "pos_frac": 0.6875, "sample": [4.431276321411133, 32.90392303466797, -1.951995849609375, 15.03095817565918, 30.460037231445312, -18.32302474975586, 9.374752044677734, 7.399013519287109, -6.21893310546875, 2.7464752197265625, 28.214141845703125, 30.09972381591797, 0.06993865966796875, -10.393318176269531, 35.076759338378906, 5.347862243652344, 36.11114501953125, 8.121505737304688, 22.87698745727539, 25.851577758789062, -5.337699890136719, 23.530784606933594, 17.34359359741211, 7.4949951171875, 1.7048206329345703, 19.327861785888672, 11.73631477355957, 12.070541381835938, 7.710014343261719, 18.395492553710938, -6.996526718139648, -0.8995132446289062, 13.111818313598633, 19.009490966796875, -21.630104064941406, 17.920387268066406, 18.740554809570312, 15.295801162719727, 11.936878204345703, -2.3478660583496094, -24.72026824951172, 7.468559265136719, -4.682579040527344, 12.44610595703125, 19.848114013671875, -17.231948852539062, 23.767770767211914, -3.1370906829833984, 27.822845458984375, -3.731386184692383, 7.616363525390625, -11.281511306762695, 13.84033203125, -6.131111145019531, 0.6628646850585938, 10.521526336669922, -13.587772369384766, -9.22193717956543, 13.357574462890625, -1.5678482055664062, -4.546894073486328, 4.110118865966797, 24.303308486938477, 24.068937301635742], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000185.npy"}
{"epoch": 0.2796674225245654, "step": 186, "batch_size": 64, "mean": 9.002809524536133, "std": 13.624337196350098, "min": -27.916404724121094, "p10": -5.136941719055175, "median": 7.97238826751709, "p90": 26.93631381988526, "max": 38.5078125, "pos_frac": 0.71875, "sample": [15.7166748046875, 33.093017578125, -14.210609436035156, 10.697792053222656, 14.886619567871094, -3.8105316162109375, 7.444480895996094, -11.997505187988281, -5.47479248046875, 18.51818084716797, 9.673377990722656, -2.6658592224121094, 12.191314697265625, 18.277286529541016, 38.5078125, 21.445903778076172, 5.182048797607422, 6.251457214355469, 5.878116607666016, -19.990066528320312, 4.688323974609375, 17.953031539916992, 20.783720016479492, 31.605209350585938, -27.916404724121094, -4.1759185791015625, -4.348623275756836, -4.303400039672852, -2.2467575073242188, 1.9330825805664062, 4.6617889404296875, 20.02564811706543, 6.66253662109375, -3.3029937744140625, -4.016082763671875, -1.9331207275390625, 11.990264892578125, 25.10443878173828, 20.307151794433594, 2.749706268310547, -2.5656089782714844, -7.708473205566406, 3.2116470336914062, 5.310447692871094, 13.417465209960938, 7.056095123291016, 27.721403121948242, 13.842864990234375, -6.5168304443359375, 11.520576477050781, 35.368408203125, 8.500295639038086, 11.600090026855469, 38.28782653808594, 12.396965026855469, 9.891921997070312, 33.939178466796875, 23.014619827270508, 18.101730346679688, 14.001272201538086, 7.4053192138671875, 17.321273803710938, -2.146484375, 7.371416091918945], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000186.npy"}
{"epoch": 0.2811791383219955, "step": 187, "batch_size": 64, "mean": 8.059208869934082, "std": 13.458449363708496, "min": -21.908859252929688, "p10": -9.31581344604492, "median": 6.54984188079834, "p90": 25.487874603271486, "max": 39.26316833496094, "pos_frac": 0.734375, "sample": [9.346809387207031, -6.890899658203125, 23.094947814941406, 32.16459655761719, 39.26316833496094, 13.8961181640625, 14.556556701660156, 4.647192001342773, 3.6212310791015625, 23.71752166748047, -18.52288818359375, 12.340194702148438, 0.8826026916503906, 11.389480590820312, 11.989471435546875, 1.8127326965332031, 2.925569534301758, 18.272933959960938, 15.682022094726562, 6.680234909057617, -12.594635009765625, -21.908859252929688, 25.605247497558594, -0.18955230712890625, 3.260028839111328, -3.4779205322265625, 1.013671875, 6.4194488525390625, 4.975372314453125, 23.83319854736328, 21.628944396972656, 25.869400024414062, -10.122344970703125, -1.2948493957519531, 10.689506530761719, 12.701156616210938, 4.719215393066406, 0.605133056640625, -7.433906555175781, 13.631305694580078, -3.0661277770996094, -7.050537109375, 21.960338592529297, 0.33389854431152344, 5.2396087646484375, 11.398490905761719, 29.127342224121094, -7.310924530029297, 9.49871826171875, 18.244033813476562, 25.214004516601562, -10.470664978027344, 16.198974609375, 33.7591552734375, 2.8534183502197266, -3.0169525146484375, 33.090179443359375, 24.817298889160156, 2.048858642578125, 7.72784423828125, -3.5628433227539062, -10.963924407958984, -11.10601806640625, 12.026054382324219], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000187.npy"}
{"epoch": 0.28269085411942557, "step": 188, "batch_size": 64, "mean": 8.44560432434082, "std": 12.518075942993164, "min": -26.361839294433594, "p10": -3.9891994476318358, "median": 5.979667663574219, "p90": 25.747528076171875, "max": 35.184051513671875, "pos_frac": 0.8125, "sample": [3.6873855590820312, -2.6340503692626953, 30.116031646728516, -7.973020553588867, 0.65380859375, 22.721115112304688, 20.208816528320312, 4.132768630981445, 19.41253662109375, 7.1059417724609375, 3.719635009765625, 35.184051513671875, 25.158039093017578, 0.2055816650390625, 0.22611236572265625, 1.2762470245361328, 19.134597778320312, 28.194290161132812, 0.6987457275390625, -3.9492225646972656, 7.900800704956055, 1.4800567626953125, 2.2765884399414062, 0.492156982421875, 9.055091857910156, -1.257843017578125, -5.191028594970703, 20.067489624023438, 8.832149505615234, 22.059864044189453, 12.600709915161133, 2.7062530517578125, 6.179168701171875, 33.87779998779297, 19.836135864257812, 1.2871437072753906, 25.859390258789062, 5.7801666259765625, 4.60821533203125, 8.317306518554688, 28.905559539794922, 1.8982086181640625, 4.801422119140625, 12.063560485839844, -0.13565826416015625, -4.0063323974609375, 3.8379650115966797, 25.751373291015625, 3.5416793823242188, 0.5113544464111328, 25.738555908203125, -17.622146606445312, -4.111419677734375, 6.783294677734375, 7.9571533203125, 6.760932922363281, 7.277530670166016, -11.760427474975586, 19.967056274414062, 25.71246337890625, 24.446996688842773, 8.130477905273438, -3.616079330444336, -26.361839294433594], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000188.npy"}
{"epoch": 0.2842025699168556, "step": 189, "batch_size": 64, "mean": 8.14105224609375, "std": 11.966704368591309, "min": -19.001605987548828, "p10": -9.127962493896481, "median": 7.530143737792969, "p90": 26.573479461669923, "max": 32.154090881347656, "pos_frac": 0.8125, "sample": [0.6816520690917969, 10.707775115966797, 8.251335144042969, 13.599525451660156, 9.656143188476562, 29.13458251953125, 26.57831573486328, 13.069250106811523, 1.784231185913086, 9.325790405273438, -11.969573974609375, 23.879196166992188, -10.485252380371094, 3.1119842529296875, -0.2451934814453125, -1.9119682312011719, 29.983314514160156, 16.774532318115234, -11.32174301147461, 10.168048858642578, 4.999462127685547, 15.08204460144043, 4.260917663574219, 8.525283813476562, -0.1594390869140625, 2.5394229888916016, 11.764381408691406, 10.483345031738281, 4.247005462646484, 1.6178054809570312, 27.49736785888672, -5.9609527587890625, 3.8561019897460938, 28.54875946044922, 26.56219482421875, 9.863197326660156, 0.5218505859375, 2.6266326904296875, 3.8861541748046875, 11.727569580078125, 25.354175567626953, 6.365203857421875, 12.692667007446289, 6.9882659912109375, 9.746292114257812, 5.998828887939453, -4.417350769042969, -13.993301391601562, 8.072021484375, 4.450347900390625, 5.0600128173828125, 17.525583267211914, -15.09326171875, 32.154090881347656, 22.995513916015625, 5.268226623535156, -12.665473937988281, -19.001605987548828, 8.279338836669922, 9.18865966796875, 6.310909271240234, 2.6070709228515625, 30.761558532714844, 23.118562698364258], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000189.npy"}
{"epoch": 0.2857142857142857, "step": 190, "batch_size": 64, "mean": 9.855717658996582, "std": 12.001813888549805, "min": -13.194015502929688, "p10": -4.524910736083984, "median": 8.043618202209473, "p90": 26.863125610351563, "max": 37.342132568359375, "pos_frac": 0.765625, "sample": [6.671560287475586, 16.533233642578125, 35.960052490234375, -4.4196319580078125, 12.329269409179688, 22.701820373535156, -0.6243667602539062, 4.5404052734375, -9.120323181152344, 30.389923095703125, 7.031349182128906, 14.681167602539062, 8.545183181762695, 1.2821578979492188, 7.912952423095703, 0.38763427734375, -5.039836883544922, 25.9810791015625, 5.399349212646484, 8.174283981323242, 18.583911895751953, 30.002666473388672, -0.29730224609375, 3.111713409423828, -4.626796722412109, 8.961166381835938, -13.194015502929688, 4.549350738525391, 5.211616516113281, 11.233135223388672, 13.147773742675781, -0.9940299987792969, 6.7601470947265625, 9.84283447265625, 16.19538116455078, 1.5746192932128906, 23.412338256835938, -4.141925811767578, -2.2596778869628906, 12.055461883544922, -5.9570159912109375, 19.694229125976562, 21.546398162841797, 37.342132568359375, 3.114347457885742, 1.7922439575195312, -1.4216442108154297, 14.485107421875, -10.615570068359375, -4.570030212402344, 7.378602981567383, 8.419181823730469, 3.464191436767578, 26.748367309570312, 11.395416259765625, 30.690048217773438, 34.37904357910156, 3.4161758422851562, 20.949275970458984, 26.912307739257812, -0.17657852172851562, 22.8822021484375, 15.361968994140625, 15.089910507202148], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000190.npy"}
{"epoch": 0.2872260015117158, "step": 191, "batch_size": 64, "mean": 8.066210746765137, "std": 12.947985649108887, "min": -24.182357788085938, "p10": -3.54932632446289, "median": 8.031052589416504, "p90": 25.865300750732423, "max": 34.501853942871094, "pos_frac": 0.75, "sample": [25.91295623779297, -2.6430912017822266, 5.7515411376953125, 13.139278411865234, 34.013465881347656, -0.6736373901367188, 8.3438720703125, 25.99103546142578, 7.718233108520508, 3.9861297607421875, 3.6318206787109375, 34.501853942871094, 8.7783203125, -0.33991241455078125, 8.826362609863281, 2.854869842529297, -22.720550537109375, 4.609672546386719, 30.68829345703125, 9.060388565063477, 9.111167907714844, 22.0753173828125, 10.882450103759766, 23.677276611328125, -14.708198547363281, 20.849624633789062, 9.4215087890625, 13.91085433959961, -3.2393646240234375, -0.01102447509765625, 9.407148361206055, -1.3952789306640625, -0.010637283325195312, 3.086132049560547, 31.499832153320312, -20.716781616210938, 0.007602691650390625, -12.256851196289062, 8.402153015136719, 19.439682006835938, 11.688606262207031, 9.89553451538086, 6.827728271484375, 11.950576782226562, 17.07387924194336, 2.428844451904297, 22.96698760986328, 30.61825180053711, -3.6821670532226562, 1.8698501586914062, -2.821279525756836, 6.242153167724609, 20.6778564453125, 6.6243133544921875, -7.224666595458984, 14.29095458984375, 25.754104614257812, -24.182357788085938, 11.129512786865234, -1.3467941284179688, 0.878814697265625, 20.78656005859375, 0.754852294921875, 2.1718978881835938], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000191.npy"}
{"epoch": 0.2887377173091459, "step": 192, "batch_size": 64, "mean": 5.101117134094238, "std": 10.271121978759766, "min": -26.80352783203125, "p10": -4.635131454467773, "median": 3.8015060424804688, "p90": 17.58678951263428, "max": 32.054290771484375, "pos_frac": 0.65625, "sample": [-26.80352783203125, -9.15163803100586, 2.898448944091797, 12.863178253173828, 15.557979583740234, 2.0724620819091797, -4.129119873046875, 11.183578491210938, 29.997909545898438, -5.673675537109375, -6.851604461669922, -2.3068923950195312, -1.259857177734375, 1.0229301452636719, 0.243865966796875, 11.32330322265625, 17.82730484008789, 6.8286285400390625, -0.10860061645507812, -0.6016769409179688, 17.02558708190918, 15.084556579589844, -2.622692108154297, -2.5873184204101562, 24.695167541503906, -0.5961856842041016, 3.1494140625, 12.932785034179688, 4.5643157958984375, 3.6644134521484375, 7.30609130859375, 12.985015869140625, 8.993627548217773, 12.023168563842773, -2.8799076080322266, -2.997699737548828, 3.9385986328125, -0.433349609375, -1.7773971557617188, 7.433189392089844, 4.799539566040039, -2.556346893310547, -17.571277618408203, 32.054290771484375, 13.180374145507812, 6.083505630493164, -3.0902538299560547, 3.044647216796875, 6.169731140136719, -9.732383728027344, 9.645355224609375, 20.493663787841797, 2.1544570922851562, 18.35533905029297, 0.07391548156738281, 11.913383483886719, 9.83834457397461, -2.8311004638671875, 15.074066162109375, 20.593353271484375, 12.312658309936523, -4.851993560791016, 1.8016815185546875, 4.682161331176758], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000192.npy"}
{"epoch": 0.29024943310657597, "step": 193, "batch_size": 64, "mean": 8.865718841552734, "std": 11.599825859069824, "min": -20.173099517822266, "p10": -2.291218948364257, "median": 5.833247184753418, "p90": 24.251332092285157, "max": 40.02490997314453, "pos_frac": 0.796875, "sample": [1.4357223510742188, 4.258600234985352, 17.074867248535156, 5.9864349365234375, 5.680059432983398, -9.730728149414062, 0.1983795166015625, 23.689781188964844, -2.89642333984375, -0.9413852691650391, 19.28784942626953, 1.4065876007080078, 13.583763122558594, 9.2509765625, 15.93780517578125, 0.070098876953125, 14.340503692626953, -1.5958099365234375, 12.938730239868164, 18.871719360351562, -1.5965461730957031, 23.356857299804688, 40.02490997314453, -4.564090728759766, 18.39239501953125, 22.505401611328125, 1.556488037109375, 3.0108871459960938, -0.9763641357421875, 8.627885818481445, 24.49199676513672, 13.377532958984375, 4.512424468994141, 10.224151611328125, 9.082542419433594, -1.5675811767578125, -3.7493438720703125, 3.2697296142578125, 10.757575988769531, 11.73874282836914, -8.909454345703125, 4.689212799072266, 19.809017181396484, 34.573974609375, -2.5889358520507812, 2.9086456298828125, 8.495298385620117, -0.5504131317138672, 4.925739288330078, 27.692962646484375, 20.834617614746094, 0.3874702453613281, 29.888015747070312, 31.04490089416504, 25.120189666748047, 2.3162689208984375, 10.002609252929688, -20.173099517822266, 6.5160369873046875, 23.254995346069336, 1.3573150634765625, 2.9199256896972656, 1.0454654693603516, 0.5221443176269531], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000193.npy"}
{"epoch": 0.29176114890400606, "step": 194, "batch_size": 64, "mean": 7.725024700164795, "std": 12.870354652404785, "min": -23.442363739013672, "p10": -7.706082534790038, "median": 7.446359634399414, "p90": 25.019658470153814, "max": 33.81432342529297, "pos_frac": 0.71875, "sample": [27.456260681152344, 4.2710723876953125, 15.354835510253906, -7.940959930419922, 23.657623291015625, 3.506389617919922, -14.517602920532227, 2.3222694396972656, 0.8230972290039062, 6.85723876953125, 7.711116790771484, 12.565017700195312, 3.7100906372070312, 20.044357299804688, -10.243576049804688, -5.812324523925781, 18.58061981201172, -7.1580352783203125, 9.210685729980469, -1.7443733215332031, 13.922271728515625, 3.335796356201172, 14.323089599609375, 20.801788330078125, -2.454700469970703, 13.987503051757812, -5.0646209716796875, 13.526885986328125, -0.6237640380859375, 7.181602478027344, -0.5223731994628906, 22.538209915161133, -9.928146362304688, -5.335060119628906, -15.79595947265625, -1.06494140625, 33.022438049316406, 17.793548583984375, 10.88485336303711, 29.614892959594727, -23.442363739013672, 8.745132446289062, 4.331180572509766, 1.5510520935058594, 15.030113220214844, 31.493059158325195, 9.55521011352539, -18.830551147460938, 33.81432342529297, 20.025665283203125, 11.620691299438477, -1.0653305053710938, 14.222206115722656, 25.6033878326416, 19.694961547851562, 32.12824249267578, 8.88096809387207, 0.5851421356201172, 2.6931228637695312, 14.045967102050781, 5.815336227416992, 2.7391395568847656, -1.7963371276855469, 8.164148330688477], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000194.npy"}
{"epoch": 0.29327286470143615, "step": 195, "batch_size": 64, "mean": 8.665790557861328, "std": 12.034049034118652, "min": -20.666534423828125, "p10": -6.760335922241211, "median": 8.458599090576172, "p90": 23.677808380126955, "max": 33.75953674316406, "pos_frac": 0.71875, "sample": [3.7114486694335938, 10.392160415649414, -1.344085693359375, 11.914222717285156, 31.67961883544922, 16.50908660888672, 0.13138580322265625, 10.490852355957031, 10.27874755859375, 22.008255004882812, 17.853103637695312, 17.59003257751465, -10.432350158691406, -8.181533813476562, -7.7789764404296875, -20.666534423828125, -0.9352664947509766, 8.047744750976562, 8.032907485961914, 22.96674156188965, -6.8384246826171875, 11.033638000488281, 11.195228576660156, 13.705368041992188, 9.028709411621094, 8.869453430175781, 26.824888229370117, 31.29928970336914, 5.117637634277344, 17.1654052734375, 4.236789703369141, -0.9042816162109375, 29.02871322631836, 4.9481964111328125, 23.158065795898438, -1.4136734008789062, -2.972105026245117, 18.81932830810547, 18.73590850830078, 16.471054077148438, 7.82786750793457, -4.776054382324219, -1.7295074462890625, -0.578399658203125, 0.7900447845458984, 5.287038803100586, -3.0261077880859375, 23.747047424316406, -15.661087036132812, 0.003643035888671875, -6.578128814697266, -1.0091400146484375, 33.75953674316406, 20.72867774963379, 15.475296020507812, 14.415735244750977, 23.516250610351562, 12.39453125, -7.374519348144531, 24.172256469726562, 7.5930633544921875, 0.7182388305664062, 5.59173583984375, 19.54585075378418], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000195.npy"}
{"epoch": 0.2947845804988662, "step": 196, "batch_size": 64, "mean": 7.4355244636535645, "std": 10.688447952270508, "min": -30.631641387939453, "p10": -3.328127288818359, "median": 6.741462707519531, "p90": 22.427135467529297, "max": 28.8790283203125, "pos_frac": 0.796875, "sample": [6.7506103515625, -30.631641387939453, -2.919788360595703, 3.8893585205078125, 6.23033332824707, -6.544746398925781, 3.0953521728515625, 4.688873291015625, 8.666717529296875, -2.9429168701171875, 4.9513397216796875, 19.340316772460938, 22.644607543945312, 6.337921142578125, 6.7323150634765625, 7.705169677734375, -18.90987777709961, 16.56426239013672, -1.5514450073242188, 16.888031005859375, -1.627645492553711, 3.3421249389648438, 24.318103790283203, 0.6207447052001953, -3.787700653076172, 4.235755920410156, 16.02996826171875, 13.892936706542969, 7.8805694580078125, 28.34710693359375, 8.579574584960938, 15.280326843261719, 0.00739288330078125, -3.4932174682617188, -3.5250015258789062, 7.555015563964844, 3.0152034759521484, 9.985879898071289, 6.5020294189453125, 28.8790283203125, 9.118614196777344, 6.9516754150390625, 17.289634704589844, 3.4104537963867188, 15.99505615234375, -2.8702468872070312, 0.6680126190185547, 7.3537139892578125, 8.263420104980469, 22.390792846679688, 22.442710876464844, 20.412670135498047, -0.8238983154296875, 0.6929931640625, 7.226116180419922, 25.4739990234375, 13.56816291809082, 19.78875732421875, 3.0739173889160156, 3.534168243408203, 3.4716796875, -4.929119110107422, 11.289718627929688, 25.057586669921875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000196.npy"}
{"epoch": 0.2962962962962963, "step": 197, "batch_size": 64, "mean": 8.88310432434082, "std": 14.106135368347168, "min": -30.022994995117188, "p10": -9.180335807800292, "median": 7.504373550415039, "p90": 26.418833351135255, "max": 36.10139465332031, "pos_frac": 0.75, "sample": [-3.755329132080078, 31.62987518310547, 5.6812286376953125, 0.332977294921875, 8.7823486328125, 13.579090118408203, -9.657638549804688, -30.022994995117188, 1.6784820556640625, -0.7239837646484375, 18.526290893554688, 25.835844039916992, -10.290229797363281, -11.203128814697266, 6.224445343017578, 31.68486785888672, 7.862770080566406, 3.250133514404297, 36.10139465332031, 25.150497436523438, 7.082000732421875, 4.730018615722656, 19.303028106689453, 13.272926330566406, 8.831581115722656, 2.193756103515625, 5.075157165527344, -8.066629409790039, 24.816530227661133, 12.836448669433594, 26.668685913085938, 28.343887329101562, -15.468826293945312, 21.79694175720215, -3.1217498779296875, -26.377620697021484, 3.6664886474609375, 4.8023834228515625, 20.573532104492188, 11.706085205078125, -0.18677902221679688, 21.26665496826172, 24.764747619628906, 4.7984466552734375, -0.1456756591796875, 30.310012817382812, 8.107025146484375, -0.8688583374023438, 7.145977020263672, 20.89883041381836, -8.000741958618164, -1.639801025390625, 16.140182495117188, 17.786054611206055, 3.834761619567871, -13.404163360595703, 35.24861145019531, 14.731430053710938, 2.133535385131836, 15.911491394042969, 8.30645751953125, 20.561813354492188, 6.946146011352539, 20.54096221923828], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000197.npy"}
{"epoch": 0.29780801209372637, "step": 198, "batch_size": 64, "mean": 8.891267776489258, "std": 10.78958797454834, "min": -15.0147705078125, "p10": -5.074743461608886, "median": 9.125640869140625, "p90": 23.12096748352051, "max": 38.39642333984375, "pos_frac": 0.828125, "sample": [-1.5230712890625, 14.134513854980469, 0.5513477325439453, 11.444633483886719, 6.803518295288086, 10.537971496582031, -11.219232559204102, 1.9825897216796875, 6.177511215209961, 5.089813232421875, 21.165435791015625, 8.490158081054688, 1.3298110961914062, -12.53956413269043, 38.39642333984375, 28.32610321044922, 12.862350463867188, 13.340179443359375, 14.459884643554688, -4.39103889465332, 14.702407836914062, 15.391965866088867, 23.49582862854004, -6.009160995483398, 25.0751953125, -12.07623291015625, -5.367759704589844, 0.1662578582763672, 14.798629760742188, 14.732545852661133, 17.639549255371094, 10.499725341796875, 5.2747802734375, -15.0147705078125, 3.9477500915527344, -1.1991500854492188, 1.8154144287109375, 3.1317100524902344, -5.54437255859375, 9.494415283203125, 4.586799621582031, 23.316383361816406, 22.664997100830078, 8.302471160888672, 11.382658004760742, 9.119491577148438, 1.623321533203125, 25.358373641967773, 35.451290130615234, 8.298065185546875, 4.938056945800781, 0.08872222900390625, 13.920578002929688, 20.96299171447754, 9.131790161132812, 10.22186279296875, 10.119598388671875, 9.440608978271484, -1.0864715576171875, 8.492504119873047, 11.824615478515625, 8.437358856201172, 19.541046142578125, 12.52996826171875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000198.npy"}
{"epoch": 0.29931972789115646, "step": 199, "batch_size": 64, "mean": 7.101439476013184, "std": 12.348226547241211, "min": -24.178760528564453, "p10": -10.782207298278808, "median": 9.598516464233398, "p90": 22.617761230468755, "max": 32.34053421020508, "pos_frac": 0.703125, "sample": [16.63268280029297, 18.17799949645996, 3.696441650390625, 7.369331359863281, -9.799959182739258, -12.586349487304688, 13.038192749023438, 25.20587158203125, 9.724021911621094, 11.220733642578125, 13.949249267578125, -2.4847946166992188, 32.34053421020508, 16.072067260742188, 31.24405288696289, 23.368621826171875, 14.957111358642578, 18.578445434570312, 11.618301391601562, 23.050762176513672, -2.188720703125, -13.441389083862305, 16.112335205078125, 17.62103271484375, -1.3222026824951172, 0.7869110107421875, 9.471321105957031, -4.3459014892578125, 9.86257553100586, 10.986701965332031, 9.0341796875, 3.9672985076904297, 13.334247589111328, -1.5799636840820312, -1.151442527770996, 9.68267822265625, 6.569122314453125, -11.203170776367188, -12.376228332519531, 9.538063049316406, -24.178760528564453, 16.583343505859375, 4.840507507324219, 18.028053283691406, -18.147048950195312, 12.706672668457031, 9.65896987915039, 4.350849151611328, -5.9154510498046875, 3.1348648071289062, 0.45508384704589844, -17.6627197265625, -4.572540283203125, 14.812274932861328, -9.134109497070312, -0.22544097900390625, 21.607425689697266, -3.5927810668945312, 16.690017700195312, 10.008533477783203, 24.782821655273438, 6.1152496337890625, 27.441604614257812, 11.973907470703125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000199.npy"}
{"epoch": 0.30083144368858655, "step": 200, "batch_size": 64, "mean": 10.713313102722168, "std": 13.790858268737793, "min": -27.750030517578125, "p10": -4.668227195739744, "median": 8.572540283203125, "p90": 31.811827087402346, "max": 36.6856689453125, "pos_frac": 0.765625, "sample": [5.555021286010742, 22.336193084716797, 33.929473876953125, 5.530708312988281, 8.946578979492188, 2.2593936920166016, 23.994613647460938, 8.118629455566406, 6.9662017822265625, -1.6177291870117188, 2.044189453125, 20.840789794921875, -8.073963165283203, 31.46002197265625, 2.2978286743164062, 0.5703144073486328, 7.781257629394531, -1.0248680114746094, 29.306968688964844, -0.8966064453125, -1.2337417602539062, 0.09760284423828125, 15.446075439453125, 29.490230560302734, 2.0297985076904297, 14.905990600585938, 22.299097061157227, -2.4205780029296875, 21.491809844970703, -1.7923240661621094, 24.52690887451172, 12.858779907226562, 15.017745971679688, 32.87559509277344, 32.52201843261719, -1.3593006134033203, -27.750030517578125, 19.54488754272461, 31.962600708007812, 24.157852172851562, 29.4810791015625, 10.837779998779297, 36.6856689453125, 8.198501586914062, 11.13531494140625, -14.137130737304688, 2.1768264770507812, 10.063140869140625, 5.747709274291992, 33.858917236328125, -3.2767333984375, 11.204856872558594, 9.817953109741211, -6.3962554931640625, 7.939460754394531, 10.389801025390625, -5.264581680297852, -5.987068176269531, 9.593013763427734, 4.9188995361328125, 0.6323146820068359, -5.620469093322754, 25.602813720703125, 33.05418395996094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000200.npy"}
{"epoch": 0.30234315948601664, "step": 201, "batch_size": 64, "mean": 7.891414642333984, "std": 12.591506958007812, "min": -17.9864501953125, "p10": -8.076000595092772, "median": 5.244742393493652, "p90": 26.234106445312502, "max": 35.71002960205078, "pos_frac": 0.765625, "sample": [4.988983154296875, 17.51519775390625, 4.352882385253906, 7.134788513183594, 2.2570552825927734, 7.1630859375, 28.466426849365234, 5.071065902709961, 9.720794677734375, 1.281097412109375, 4.093162536621094, 15.427282333374023, 14.683137893676758, 20.785018920898438, 1.8481464385986328, -15.339813232421875, 35.71002960205078, 23.319679260253906, 0.565704345703125, 35.17747497558594, -9.485038757324219, 1.3855056762695312, 3.9372634887695312, -8.781017303466797, 1.7633323669433594, 27.127880096435547, 11.568511962890625, 8.996192932128906, -6.430961608886719, 20.77039337158203, 1.8422317504882812, 25.955307006835938, 3.1636791229248047, -0.5400810241699219, 18.560985565185547, 22.165050506591797, -0.23831939697265625, -13.021347045898438, 2.89990234375, 4.4723358154296875, 16.701766967773438, 9.600170135498047, -2.21893310546875, 17.335983276367188, -3.2059974670410156, 11.865570068359375, 5.418418884277344, -14.598236083984375, 6.217767715454102, 26.353591918945312, -2.415761947631836, 7.78179931640625, 10.755779266357422, 21.22238540649414, 11.328840255737305, 30.726417541503906, 16.841110229492188, -5.691459655761719, 0.543212890625, -10.349533081054688, 30.76858139038086, 0.83880615234375, -3.1163177490234375, -17.9864501953125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000201.npy"}
{"epoch": 0.30385487528344673, "step": 202, "batch_size": 64, "mean": 8.492841720581055, "std": 12.515378952026367, "min": -28.0885009765625, "p10": -5.1943008422851555, "median": 9.835716247558594, "p90": 20.883447265625005, "max": 43.57249450683594, "pos_frac": 0.8125, "sample": [-3.878459930419922, 4.9911346435546875, 35.20376205444336, 43.57249450683594, 19.3367862701416, -1.9102554321289062, 13.723949432373047, 11.070144653320312, 14.58652114868164, 9.763214111328125, 2.2497615814208984, 11.255992889404297, 1.7855567932128906, 5.753995895385742, -3.810760498046875, 14.190399169921875, 15.157333374023438, 1.3210983276367188, 10.097854614257812, -8.00677490234375, 5.00457763671875, 7.9252471923828125, 20.062835693359375, 4.191459655761719, -4.606719970703125, 11.490921020507812, 4.581865310668945, 37.326900482177734, 0.32648468017578125, 18.20909309387207, -2.3949966430664062, 8.844612121582031, 10.406734466552734, 14.087173461914062, 5.52581787109375, -28.0885009765625, 11.982421875, 11.496500015258789, 13.898712158203125, -12.970558166503906, 15.739990234375, 10.6527099609375, 12.955963134765625, 9.908218383789062, 18.93492889404297, 6.562408447265625, 18.18230438232422, 5.310083389282227, 21.235137939453125, -5.4461212158203125, 0.7801094055175781, -21.217575073242188, 0.4594917297363281, -5.473957061767578, 3.9721603393554688, 15.384201049804688, 31.372222900390625, 3.4699230194091797, 5.718994140625, 15.770353317260742, -14.320075988769531, 22.002525329589844, 10.366056442260742, 27.467544555664062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000202.npy"}
{"epoch": 0.30536659108087677, "step": 203, "batch_size": 64, "mean": 7.473361492156982, "std": 12.007231712341309, "min": -17.87885284423828, "p10": -7.278509521484374, "median": 6.963564872741699, "p90": 22.276165008544922, "max": 36.84886169433594, "pos_frac": 0.734375, "sample": [8.012336730957031, 3.5438156127929688, -17.87885284423828, -3.79205322265625, 3.3065032958984375, 10.957725524902344, 0.7341384887695312, 22.015838623046875, 3.2901458740234375, -8.338951110839844, 15.082763671875, 7.950042724609375, 8.710245132446289, 31.725189208984375, 2.045846939086914, 36.7908935546875, 18.030540466308594, 5.496253967285156, 21.235820770263672, 7.074249267578125, -17.581069946289062, -0.87994384765625, 12.138969421386719, -4.221652984619141, -0.3233814239501953, -8.9588623046875, 3.1689834594726562, 27.547653198242188, 5.851470947265625, 20.984169006347656, 7.93927001953125, 2.5818710327148438, 26.57699966430664, 6.852880477905273, 9.53436279296875, 2.6936187744140625, 22.387733459472656, 14.771194458007812, 20.20391845703125, 7.390378952026367, 8.324256896972656, -5.7236328125, 1.5086441040039062, 9.045341491699219, 14.930686950683594, -1.9872512817382812, 26.022354125976562, 20.22846221923828, -4.358100891113281, 36.84886169433594, 12.173664093017578, -0.19786453247070312, 1.0036544799804688, 4.447669982910156, 11.488815307617188, -15.2591552734375, -1.0511474609375, 5.267509460449219, 18.784812927246094, 9.861047744750977, 11.65756607055664, -7.94488525390625, -10.786060333251953, -0.6411476135253906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000203.npy"}
{"epoch": 0.30687830687830686, "step": 204, "batch_size": 64, "mean": 8.766094207763672, "std": 12.41862678527832, "min": -21.588409423828125, "p10": -5.267644882202148, "median": 7.7909393310546875, "p90": 27.289910697937014, "max": 37.054168701171875, "pos_frac": 0.75, "sample": [36.27946472167969, 23.681594848632812, -3.9820690155029297, 7.309530258178711, 6.529121398925781, 37.054168701171875, 1.6608734130859375, 9.763269424438477, -10.55694580078125, 7.529998779296875, 28.81121826171875, -10.016921997070312, -3.6738662719726562, 11.789676666259766, 4.171417236328125, 12.817230224609375, 14.774299621582031, 1.3757400512695312, -10.61905288696289, -0.39764404296875, 15.768651962280273, 7.105323791503906, 13.924591064453125, 25.77777862548828, -2.315980911254883, -0.8738861083984375, 9.220962524414062, 16.043018341064453, 2.968843460083008, 5.550331115722656, 27.11720848083496, -6.0563201904296875, 4.047706604003906, 13.889205932617188, 13.374082565307617, -2.4797935485839844, 7.874824523925781, 32.973812103271484, -5.436800003051758, 12.46234130859375, 1.8830909729003906, 8.150054931640625, 22.631629943847656, 22.515777587890625, -7.080894470214844, 10.368545532226562, 13.622688293457031, -1.3547210693359375, 13.9195556640625, 0.4607048034667969, 29.127960205078125, 29.5313720703125, 0.04758453369140625, 20.273574829101562, 12.584716796875, 27.36392593383789, 10.705741882324219, 0.6114730834960938, 0.6899261474609375, 7.707054138183594, -21.588409423828125, -4.872949600219727, 12.867788314819336, -4.373199462890625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000204.npy"}
{"epoch": 0.30839002267573695, "step": 205, "batch_size": 64, "mean": 8.19404411315918, "std": 13.969647407531738, "min": -30.95574951171875, "p10": -6.823074340820312, "median": 7.411073684692383, "p90": 23.95771026611328, "max": 42.64501953125, "pos_frac": 0.71875, "sample": [42.64501953125, 2.5740184783935547, -0.5043735504150391, 6.993000030517578, 6.625640869140625, 38.159732818603516, 8.59953498840332, 7.8291473388671875, -30.95574951171875, 18.319290161132812, 3.5315017700195312, 2.3552818298339844, 41.15947723388672, 10.191131591796875, 8.477422714233398, -0.1967926025390625, 3.8067779541015625, -1.9587860107421875, 1.73431396484375, 11.937263488769531, 6.226375579833984, 18.1729736328125, 2.2310619354248047, 14.833541870117188, 20.872222900390625, -0.7049369812011719, -7.144744873046875, -3.64190673828125, 3.1600303649902344, 32.631141662597656, -2.07958984375, -5.947443008422852, 15.58056640625, 6.281890869140625, -1.9210052490234375, 10.015533447265625, 10.988359451293945, 11.271682739257812, 1.5807647705078125, 18.051437377929688, -6.072509765625, 23.82013702392578, 15.30978775024414, 9.447311401367188, 31.550537109375, 23.068023681640625, 1.4376068115234375, 9.546226501464844, 22.13719940185547, -7.452018737792969, 3.632354736328125, 13.821708679199219, 21.719223022460938, 15.521385192871094, -16.3746337890625, 24.01667022705078, 14.615047454833984, -8.989566802978516, 28.12018585205078, -7.647178649902344, -1.8196334838867188, -25.167722702026367, -2.4524993896484375, 10.850383758544922], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000205.npy"}
{"epoch": 0.30990173847316704, "step": 206, "batch_size": 64, "mean": 6.790228843688965, "std": 13.42513370513916, "min": -23.355499267578125, "p10": -9.640133094787597, "median": 4.848716735839844, "p90": 24.29526290893555, "max": 32.50006103515625, "pos_frac": 0.6875, "sample": [-5.977235794067383, -2.9200439453125, 3.6005096435546875, -13.660781860351562, -8.262407302856445, -2.4619598388671875, 3.9199066162109375, 5.534049987792969, 18.501205444335938, 9.641050338745117, -0.6860141754150391, 32.50006103515625, 16.093502044677734, -4.351154327392578, -5.089561462402344, 23.05255126953125, -1.35577392578125, 31.17456817626953, -13.92803955078125, 8.58525276184082, 32.23252868652344, 8.67013931274414, 4.515453338623047, 5.4587249755859375, -4.099306106567383, 27.370750427246094, 5.817615509033203, -22.288881301879883, 13.941102981567383, 14.504188537597656, 21.473915100097656, 27.122838973999023, 3.700450897216797, 5.561429977416992, 22.906402587890625, 4.8779144287109375, -23.355499267578125, -13.174339294433594, -3.1636409759521484, 4.124565124511719, 12.69211196899414, 0.6600608825683594, 19.217010498046875, -4.992610931396484, -1.605133056640625, -10.230587005615234, 28.112945556640625, 10.3399658203125, 4.81951904296875, 4.782806396484375, 24.042861938476562, 19.101524353027344, 23.94165802001953, -3.6727333068847656, 12.078407287597656, 24.40343475341797, 3.715351104736328, 2.482786178588867, 6.22265625, 23.117172241210938, -18.865304946899414, 3.1264495849609375, 3.5858154296875, 17.39244842529297], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000206.npy"}
{"epoch": 0.31141345427059713, "step": 207, "batch_size": 64, "mean": 9.210400581359863, "std": 12.522029876708984, "min": -17.017860412597656, "p10": -5.597935485839843, "median": 7.281612396240234, "p90": 26.32785797119141, "max": 40.3594970703125, "pos_frac": 0.75, "sample": [40.3594970703125, 2.9095191955566406, -4.020484924316406, 4.8135833740234375, 11.514381408691406, 8.99911117553711, 3.340179443359375, 9.0238037109375, 1.2428665161132812, 21.07135009765625, 9.648185729980469, 5.091926574707031, -1.1912899017333984, 28.52239227294922, -2.6101856231689453, -6.135345458984375, 6.965850830078125, 14.283843994140625, 5.452049255371094, 22.566905975341797, 25.58932113647461, 8.672042846679688, 12.29275131225586, 5.5694580078125, -6.289880752563477, -3.8479461669921875, -4.088901519775391, 25.172531127929688, -7.370086669921875, 7.597373962402344, 25.850677490234375, 4.443096160888672, 35.03877258300781, -0.8597030639648438, 6.4607086181640625, -2.804981231689453, -10.225873947143555, 11.923774719238281, 10.637939453125, 2.5137672424316406, 26.532363891601562, 23.26927947998047, 2.923065185546875, 10.785537719726562, -0.4729156494140625, 13.463409423828125, 16.517776489257812, -14.611351013183594, 19.41498374938965, 18.762405395507812, -17.017860412597656, 21.768457412719727, 1.92059326171875, 14.001670837402344, 6.056976318359375, 31.613479614257812, 30.357009887695312, 5.8985748291015625, 13.529815673828125, 11.141231536865234, -8.227821350097656, -4.3439788818359375, 5.536231994628906, 32.52372360229492], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000207.npy"}
{"epoch": 0.3129251700680272, "step": 208, "batch_size": 64, "mean": 4.963096618652344, "std": 11.986795425415039, "min": -20.991851806640625, "p10": -8.977065658569334, "median": 4.456499099731445, "p90": 21.899916458129887, "max": 31.12420654296875, "pos_frac": 0.640625, "sample": [6.53605842590332, 1.433603286743164, 19.941465377807617, -6.683368682861328, 24.23804473876953, 23.22121810913086, 3.1342639923095703, -6.8762664794921875, -4.356964111328125, -20.127132415771484, -1.1741409301757812, -9.7801513671875, 18.45970916748047, 7.265113830566406, 23.10793685913086, 5.675760269165039, 4.909427642822266, 10.440521240234375, -2.595609664916992, -11.254653930664062, 15.040130615234375, -5.4849395751953125, 2.772981643676758, -5.4683837890625, 10.373016357421875, 22.312191009521484, 16.77831268310547, -5.059131622314453, 7.916481018066406, 17.81252098083496, -6.708234786987305, 2.676116943359375, -5.8857269287109375, 7.284149169921875, -16.48987579345703, 11.3319091796875, 23.448314666748047, 15.553245544433594, 4.003570556640625, 1.8192558288574219, 31.12420654296875, 7.706092834472656, 2.3077163696289062, -4.759120941162109, 20.085254669189453, -5.6069488525390625, 20.937942504882812, -3.6542091369628906, 11.519365310668945, 5.814304351806641, -9.666740417480469, 17.272178649902344, -1.4344406127929688, 14.928802490234375, -7.367824554443359, 15.329986572265625, 5.50048828125, -4.928256988525391, -13.763555526733398, 0.21085739135742188, -20.991851806640625, 9.846534729003906, 2.97607421875, 24.710609436035156], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000208.npy"}
{"epoch": 0.3144368858654573, "step": 209, "batch_size": 64, "mean": 9.701295852661133, "std": 14.070026397705078, "min": -30.67901611328125, "p10": -3.9774429321289055, "median": 7.372459411621094, "p90": 26.041554260253907, "max": 42.526451110839844, "pos_frac": 0.78125, "sample": [22.736705780029297, 25.217824935913086, 4.904289245605469, 5.702457427978516, -3.258544921875, 31.193084716796875, 9.139253616333008, 20.628067016601562, 5.218605041503906, 37.281646728515625, -2.7658843994140625, 4.4468841552734375, -7.23150634765625, 21.061203002929688, 6.374359130859375, 6.712615966796875, -4.222442626953125, -2.5134544372558594, -2.378803253173828, 37.52912902832031, 2.2759933471679688, -4.652244567871094, 0.01345062255859375, 15.448982238769531, 1.5891647338867188, 8.760429382324219, 23.119895935058594, 11.8358154296875, 29.909194946289062, 10.085922241210938, 12.477865219116211, 19.264617919921875, 6.417451858520508, 21.40503692626953, 26.11902618408203, -2.3126983642578125, 25.86078643798828, 7.395843505859375, -30.67901611328125, 4.43988037109375, -4.288642883300781, 3.546499252319336, -3.4057769775390625, -1.448577880859375, 20.353973388671875, 8.891979217529297, 42.526451110839844, 23.650325775146484, 7.3490753173828125, 22.67135238647461, 5.267946243286133, -14.395620346069336, 1.6191387176513672, 11.66998291015625, 1.640655517578125, 3.4166717529296875, 22.16791534423828, -27.7381591796875, 12.690521240234375, 13.284494400024414, 25.35595703125, 30.238969802856445, 9.614696502685547, 1.652191162109375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000209.npy"}
{"epoch": 0.31594860166288735, "step": 210, "batch_size": 64, "mean": 9.79769229888916, "std": 13.768986701965332, "min": -18.668838500976562, "p10": -8.480861663818354, "median": 8.194664001464844, "p90": 29.768180847167972, "max": 39.43024826049805, "pos_frac": 0.78125, "sample": [34.72338104248047, 5.092094421386719, 8.028076171875, 15.575263977050781, -2.382678985595703, 19.068801879882812, 39.43024826049805, 14.179672241210938, 32.88536071777344, 31.972625732421875, 4.285869598388672, 25.876220703125, 25.878265380859375, 4.21221923828125, 32.40819549560547, 2.402252197265625, -13.554033279418945, -12.870460510253906, 27.140796661376953, -10.569591522216797, 30.226333618164062, 8.361251831054688, 6.907541275024414, -0.858489990234375, 22.276458740234375, 0.0188446044921875, -0.973602294921875, 5.87841796875, 28.69915771484375, -2.2523651123046875, 27.965904235839844, 5.42396354675293, 12.988792419433594, 3.9093475341796875, 8.568950653076172, -1.5065231323242188, 1.0783195495605469, -3.607158660888672, 1.0407218933105469, 26.03462791442871, -13.917158126831055, 15.040191650390625, 8.76934814453125, 4.400459289550781, 17.58574676513672, -0.2306671142578125, 13.144750595092773, 3.7008800506591797, 8.928077697753906, 4.458717346191406, 4.346996307373047, 18.0865421295166, 11.41159439086914, -14.20140266418457, 1.9650344848632812, 34.632423400878906, 17.99748992919922, 15.019538879394531, 18.385902404785156, -18.668838500976562, 2.357158660888672, -13.796485900878906, 8.76422119140625, 14.908679962158203], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000210.npy"}
{"epoch": 0.31746031746031744, "step": 211, "batch_size": 64, "mean": 8.512306213378906, "std": 11.803450584411621, "min": -13.853912353515625, "p10": -4.124396324157714, "median": 7.663300514221191, "p90": 24.882565307617188, "max": 48.1539306640625, "pos_frac": 0.78125, "sample": [1.1142234802246094, -13.853912353515625, 11.952667236328125, 2.2364578247070312, -0.7126808166503906, 0.9988956451416016, -4.768951416015625, 1.1640625, 12.34906005859375, 3.388517379760742, 12.33489990234375, -1.8749504089355469, 8.1190185546875, 24.01654052734375, 0.9130363464355469, 15.52252197265625, 2.1816654205322266, -1.04559326171875, 21.59686279296875, -3.5007972717285156, 7.59990119934082, -9.848190307617188, 11.34527587890625, -12.501800537109375, 6.165897369384766, -9.253545761108398, 0.80804443359375, 11.269424438476562, 30.940425872802734, 12.14610481262207, 13.904930114746094, 7.7266998291015625, 9.296026229858398, 1.3565139770507812, 25.5623779296875, 9.240863800048828, 10.306774139404297, 4.2686614990234375, -3.0267906188964844, 11.415733337402344, 6.0946807861328125, 48.1539306640625, 33.8646240234375, 0.6650791168212891, -0.398590087890625, 14.554271697998047, 27.77532196044922, 9.062103271484375, 13.783531188964844, 24.998580932617188, -4.501373291015625, 9.056781768798828, 25.062957763671875, 21.35657501220703, 17.732498168945312, 23.15825843811035, 4.007270812988281, 16.51068115234375, -4.391653060913086, 24.611862182617188, 1.5267410278320312, -3.3377685546875, 2.4219284057617188, 2.1644248962402344], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000211.npy"}
{"epoch": 0.31897203325774753, "step": 212, "batch_size": 64, "mean": 9.100960731506348, "std": 12.992183685302734, "min": -27.625411987304688, "p10": -5.032395935058593, "median": 7.844882965087891, "p90": 26.718532371520997, "max": 42.818695068359375, "pos_frac": 0.765625, "sample": [29.23741912841797, 15.897335052490234, 19.22804069519043, 26.429424285888672, -0.487274169921875, 5.625709533691406, 36.86465072631836, 12.388603210449219, -4.693611145019531, 20.198707580566406, -2.451007843017578, -27.625411987304688, 21.18169593811035, 42.818695068359375, 4.0410614013671875, 21.06475830078125, 19.649715423583984, 11.817939758300781, -5.350791931152344, 13.208745956420898, 10.076873779296875, 2.7810134887695312, 0.7735195159912109, 7.6990509033203125, -14.568832397460938, 0.0034027099609375, 12.778411865234375, 0.3168201446533203, 26.842435836791992, 30.729110717773438, 12.953266143798828, 14.42409896850586, 14.445472717285156, 5.812416076660156, 10.775138854980469, 4.464118957519531, -4.386714935302734, 7.990715026855469, 0.5905284881591797, 6.623783111572266, -2.971424102783203, 4.530731201171875, 2.091899871826172, -7.0039520263671875, 9.829519271850586, 14.072296142578125, 11.34463882446289, 32.989173889160156, -10.634662628173828, -4.4625701904296875, -1.4935951232910156, 1.8946914672851562, 13.782135009765625, -5.177589416503906, 15.738327026367188, 28.08673095703125, 18.796607971191406, 25.83808135986328, 20.702056884765625, 1.6767654418945312, 6.6718292236328125, 6.5893096923828125, -9.736530303955078, -0.862030029296875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000212.npy"}
{"epoch": 0.3204837490551776, "step": 213, "batch_size": 64, "mean": 10.106489181518555, "std": 14.190141677856445, "min": -32.06447219848633, "p10": -8.605328750610349, "median": 9.130884170532227, "p90": 29.69294395446778, "max": 47.49072265625, "pos_frac": 0.8125, "sample": [17.680339813232422, 5.885955810546875, 5.013954162597656, 21.519363403320312, 9.354190826416016, 6.270162582397461, -9.305923461914062, -2.5974998474121094, -11.776947021484375, 23.645790100097656, 24.2135009765625, 2.881061553955078, 1.0555191040039062, -6.970607757568359, -32.06447219848633, -12.510955810546875, 17.71510124206543, 47.49072265625, -0.7543601989746094, 13.435432434082031, 22.256099700927734, -13.28073501586914, 25.056434631347656, 36.13794708251953, 25.643661499023438, 9.70022201538086, -1.7785835266113281, -11.645797729492188, 3.887697219848633, 4.330272674560547, -9.527233123779297, 8.907577514648438, 12.305122375488281, 9.76254653930664, 12.946517944335938, 9.66607666015625, 3.4794864654541016, 14.636497497558594, 28.428367614746094, 4.3110504150390625, 5.1621246337890625, 12.238838195800781, 32.48250961303711, 1.917490005493164, 20.80925750732422, 13.785446166992188, 4.778175354003906, 20.729305267333984, 31.250457763671875, 2.713623046875, 0.09734344482421875, 11.5400390625, -2.748870849609375, 0.3322601318359375, 4.560665130615234, 9.951957702636719, 6.7725830078125, 33.91265869140625, 6.925262451171875, 5.3289642333984375, 23.726036071777344, 30.234905242919922, 24.441184997558594, 30.469539642333984], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000213.npy"}
|
||||
{"epoch": 0.3219954648526077, "step": 214, "batch_size": 64, "mean": 9.935011863708496, "std": 13.932735443115234, "min": -17.25182342529297, "p10": -7.086174583435058, "median": 8.981660842895508, "p90": 30.074831008911136, "max": 45.24896240234375, "pos_frac": 0.71875, "sample": [17.68520164489746, 8.628414154052734, -1.8610992431640625, 8.13321304321289, 12.552566528320312, 21.52055549621582, 1.9883613586425781, 4.421604156494141, 16.968364715576172, 22.39362335205078, 30.528932571411133, 18.78493309020996, 14.531366348266602, 3.5432891845703125, 18.337722778320312, 24.051239013671875, 16.426034927368164, -17.25182342529297, 17.17357635498047, 16.59379768371582, -1.6695747375488281, 13.799190521240234, 3.33758544921875, 1.1923675537109375, -0.8825969696044922, 22.106229782104492, -9.4407958984375, 10.77802848815918, 31.93503189086914, 21.681060791015625, 5.787757873535156, -1.34979248046875, -9.226285934448242, 0.31726837158203125, 17.48681640625, 21.640583038330078, -13.0709228515625, -3.4404144287109375, 29.015260696411133, 45.24896240234375, -0.54888916015625, -6.445240020751953, 26.69677734375, 16.199539184570312, 31.898174285888672, 36.26246643066406, 0.010181427001953125, 0.039348602294921875, 6.5093231201171875, -5.871601104736328, -10.65826416015625, 32.807342529296875, 7.438970565795898, -3.5184764862060547, 4.830831527709961, -1.5847759246826172, 31.9266357421875, 9.334907531738281, -17.018569946289062, 14.593124389648438, 15.637344360351562, 17.451061248779297, -3.1842002868652344, -7.360860824584961], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000214.npy"}
|
||||
{"epoch": 0.3235071806500378, "step": 215, "batch_size": 64, "mean": 10.433313369750977, "std": 13.683329582214355, "min": -19.742591857910156, "p10": -6.810447692871093, "median": 9.269916534423828, "p90": 28.885032844543456, "max": 36.33392333984375, "pos_frac": 0.703125, "sample": [10.919696807861328, -3.3617935180664062, -19.742591857910156, 13.84539794921875, 11.757545471191406, -10.095321655273438, 6.850982666015625, -1.4627609252929688, 28.87169075012207, 28.890750885009766, 3.208160400390625, -1.3285942077636719, -10.674797058105469, 5.360065460205078, 11.633941650390625, 31.245040893554688, -0.22068023681640625, 25.24016571044922, -0.2925262451171875, 3.1720123291015625, 9.282272338867188, 6.3408966064453125, -0.5638313293457031, 18.50574493408203, 25.81976318359375, 2.7535934448242188, -1.5328598022460938, 17.628402709960938, 18.9705810546875, -9.30532455444336, 0.3651313781738281, 25.130722045898438, 1.4785118103027344, -2.302520751953125, -6.16802978515625, -11.220016479492188, 20.6290225982666, 16.06725311279297, 20.9002685546875, 33.1318359375, 9.257560729980469, 6.3195343017578125, -5.954498291015625, -0.3255462646484375, 21.23077392578125, 27.937759399414062, 30.769058227539062, 35.39645004272461, 24.723590850830078, 36.33392333984375, 15.71096420288086, 26.650527954101562, 17.548675537109375, -7.0857696533203125, 4.330728530883789, 2.8451061248779297, 15.8836669921875, 31.841766357421875, 19.47777557373047, -1.0134468078613281, 24.308151245117188, 2.218463897705078, 16.801246643066406, -7.202262878417969], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000215.npy"}
|
||||
{"epoch": 0.3250188964474679, "step": 216, "batch_size": 64, "mean": 10.606414794921875, "std": 13.753684997558594, "min": -23.086463928222656, "p10": -3.888172531127929, "median": 10.607137680053711, "p90": 28.94301147460938, "max": 54.596099853515625, "pos_frac": 0.765625, "sample": [1.859832763671875, 23.35155487060547, 22.50811767578125, 27.541580200195312, 15.517313003540039, 7.530632019042969, 9.313201904296875, 5.36260986328125, 15.486320495605469, 25.269954681396484, 14.04962158203125, 15.688247680664062, 1.3627700805664062, 0.3779144287109375, 15.775550842285156, -6.9719696044921875, 4.0700531005859375, 9.790557861328125, -0.181182861328125, 30.490493774414062, 17.655487060546875, 29.543624877929688, 12.669212341308594, 21.13213348388672, 17.205623626708984, 7.129405975341797, 31.23416519165039, 0.8319664001464844, 2.4683303833007812, 32.29258728027344, 16.66783905029297, -23.086463928222656, 4.983968734741211, 11.423717498779297, -0.895233154296875, 15.554595947265625, -9.484199523925781, 1.3871326446533203, 13.195911407470703, -3.267536163330078, -1.9291915893554688, 54.596099853515625, 31.860977172851562, -6.452095031738281, 4.165924072265625, -10.219718933105469, 11.75323486328125, 2.353700637817383, -1.2047386169433594, 21.5784912109375, 37.81624984741211, -13.143016815185547, -4.1541595458984375, 0.49533843994140625, 24.596900939941406, -1.7663345336914062, 6.20622444152832, 13.256919860839844, -2.3321304321289062, 24.050750732421875, -2.1850318908691406, 22.48072052001953, 14.18316650390625, 15.966827392578125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000216.npy"}
|
||||
{"epoch": 0.32653061224489793, "step": 217, "batch_size": 64, "mean": 11.332113265991211, "std": 13.585955619812012, "min": -14.450180053710938, "p10": -4.184050369262695, "median": 9.3711519241333, "p90": 27.91789016723633, "max": 49.65110778808594, "pos_frac": 0.796875, "sample": [18.243667602539062, 28.020843505859375, -4.0709075927734375, 2.988800048828125, -0.2380695343017578, -4.689067840576172, 4.062122344970703, 49.65110778808594, -1.7036590576171875, 6.956756591796875, 35.87046813964844, 37.1051025390625, 5.065574645996094, -4.232540130615234, 7.8104400634765625, 2.2578582763671875, 20.738571166992188, 10.22896957397461, 4.154571533203125, 24.809738159179688, 3.7605762481689453, 14.964014053344727, 4.4868011474609375, 24.8983154296875, 11.508506774902344, 27.374406814575195, -14.314697265625, 2.7301559448242188, 2.5781707763671875, 12.15414810180664, 27.67766571044922, 9.351905822753906, 10.323318481445312, -4.499183654785156, 15.499191284179688, -8.792760848999023, 5.276538848876953, 19.76371955871582, -0.984100341796875, -8.311813354492188, -14.450180053710938, 22.437402725219727, 14.223686218261719, -2.5801315307617188, 0.20438385009765625, 5.741874694824219, 5.783119201660156, 5.744781494140625, 13.525344848632812, 43.43278503417969, 9.390398025512695, 25.308311462402344, 8.008460998535156, 1.9949321746826172, 31.376625061035156, 19.5933837890625, 33.697181701660156, 14.85687255859375, 11.339706420898438, 25.886775970458984, 14.459945678710938, 27.268203735351562, 10.732536315917969, -1.1963882446289062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000217.npy"}
|
||||
{"epoch": 0.328042328042328, "step": 218, "batch_size": 64, "mean": 6.9664506912231445, "std": 15.006710052490234, "min": -35.94684600830078, "p10": -9.454423141479491, "median": 7.769981384277344, "p90": 25.94330749511719, "max": 48.147613525390625, "pos_frac": 0.703125, "sample": [8.423957824707031, 1.1661090850830078, 7.671470642089844, 11.502889633178711, 8.794281005859375, -8.033035278320312, 3.8341827392578125, 1.9205474853515625, -1.151458740234375, 23.697486877441406, 7.868492126464844, 8.300163269042969, 26.19275665283203, -20.194488525390625, -8.678653717041016, -15.821104049682617, 5.024467468261719, 29.73192596435547, -19.74371337890625, -2.9715805053710938, 12.691688537597656, 9.098724365234375, 6.398193359375, 6.508115768432617, 16.180545806884766, 48.147613525390625, 25.36125946044922, 34.21522521972656, 14.329437255859375, 16.966400146484375, -0.45102882385253906, 14.45172119140625, 14.20538330078125, -9.786895751953125, -5.631244659423828, -6.377216339111328, 5.159515380859375, 9.649358749389648, -0.542449951171875, 21.363807678222656, 26.26251983642578, 15.954803466796875, -1.4533710479736328, 0.05777740478515625, 8.17098617553711, -8.226421356201172, 16.84122657775879, 1.1346302032470703, -16.017593383789062, -7.44781494140625, 20.41792869567871, -21.298721313476562, 10.594581604003906, 0.3660106658935547, 12.503562927246094, 19.07208251953125, 7.077747344970703, 32.31843566894531, -35.94684600830078, -3.7651901245117188, 10.165870666503906, 5.933021545410156, 18.341461181640625, 35.323326110839844], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000218.npy"}
|
||||
{"epoch": 0.3295540438397581, "step": 219, "batch_size": 64, "mean": 6.829412460327148, "std": 11.13228702545166, "min": -17.728229522705078, "p10": -7.480522918701172, "median": 7.1429033279418945, "p90": 21.267715835571295, "max": 34.402122497558594, "pos_frac": 0.703125, "sample": [10.823974609375, 22.37982940673828, 15.51620101928711, 6.96208381652832, 11.150596618652344, 0.8400497436523438, 1.1306571960449219, 1.5907840728759766, 11.177406311035156, -8.747398376464844, -1.1913528442382812, 30.21588897705078, 16.02362060546875, -9.237131118774414, -12.907012939453125, 14.880096435546875, 8.579265594482422, 3.578035354614258, 8.079280853271484, 0.47747230529785156, 2.449258804321289, -7.6150665283203125, -7.166587829589844, 4.876415252685547, -4.0007476806640625, -0.4622955322265625, 6.9272308349609375, -1.42938232421875, 9.8446044921875, 7.454723358154297, 16.257164001464844, 24.205223083496094, 26.225723266601562, 28.569551467895508, 10.520103454589844, 21.7230224609375, -2.728057861328125, 5.7819976806640625, -4.0285186767578125, 7.9288330078125, 11.55912971496582, 3.7352523803710938, -15.6669921875, 11.544609069824219, -1.8398513793945312, -3.9590072631835938, 13.93027114868164, -3.3217620849609375, 10.08687973022461, 34.402122497558594, 13.860618591308594, 18.71625518798828, 9.960803985595703, -0.9803237915039062, -17.728229522705078, 2.3406982421875, 6.655067443847656, 7.323722839355469, 18.412628173828125, -3.147958755493164, 20.205333709716797, -8.035003662109375, 19.799930572509766, 12.572677612304688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000219.npy"}
|
||||
{"epoch": 0.3310657596371882, "step": 220, "batch_size": 64, "mean": 8.409795761108398, "std": 13.226707458496094, "min": -21.36181640625, "p10": -6.195392417907715, "median": 6.943351745605469, "p90": 25.260806274414062, "max": 40.727325439453125, "pos_frac": 0.6875, "sample": [6.776163101196289, -1.8699417114257812, 11.639175415039062, 5.151607513427734, 4.147642135620117, 14.751968383789062, 34.7630615234375, 7.7660064697265625, 9.19985580444336, 6.461761474609375, -4.3577423095703125, 34.07654571533203, 4.0458221435546875, -12.69830322265625, -5.765106201171875, -3.7115707397460938, 25.370628356933594, 0.83367919921875, -3.7842979431152344, 0.6910476684570312, 20.545963287353516, 11.497062683105469, -3.175374984741211, -15.90284538269043, -5.545249938964844, -7.922204971313477, -2.775754928588867, -8.251670837402344, 2.94598388671875, 15.406463623046875, 15.874732971191406, 7.8288421630859375, 14.19573974609375, 23.07024383544922, 28.65484046936035, 4.750171661376953, 19.538604736328125, 15.008445739746094, 24.80709457397461, -2.8062286376953125, 1.2278671264648438, 13.454885482788086, -1.4287490844726562, 11.410835266113281, 38.361572265625, -3.0909957885742188, 25.004554748535156, -0.10768508911132812, 13.148786544799805, 23.536155700683594, 19.63130760192871, -6.379800796508789, 25.372163772583008, -1.294443130493164, 7.075675964355469, 6.811027526855469, -21.36181640625, 12.334800720214844, 16.560317993164062, 2.226369857788086, 40.727325439453125, -7.330259323120117, 21.208850860595703, 9.895278930664062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000220.npy"}
|
||||
{"epoch": 0.3325774754346183, "step": 221, "batch_size": 64, "mean": 9.09065055847168, "std": 13.836665153503418, "min": -21.341506958007812, "p10": -7.666125106811522, "median": 7.203767776489258, "p90": 28.226015472412126, "max": 42.2764892578125, "pos_frac": 0.71875, "sample": [4.686269760131836, -8.102535247802734, 5.504241943359375, 7.219696044921875, -1.123769760131836, 23.712295532226562, 0.10268974304199219, 3.7098541259765625, 11.899787902832031, -8.269859313964844, -0.026714324951171875, 8.209808349609375, 7.187839508056641, 9.442588806152344, -6.647834777832031, 4.045354843139648, -1.4617233276367188, -0.9407901763916016, 13.426250457763672, 12.118186950683594, 21.93516731262207, 19.79308319091797, 9.20284652709961, 18.90355682373047, 42.2764892578125, 7.411674499511719, 18.90894317626953, 38.457767486572266, -2.6682586669921875, 3.4833641052246094, 15.7225341796875, 30.05219268798828, 1.4667320251464844, 11.786617279052734, 41.2093505859375, 5.819583892822266, 17.537994384765625, -8.181983947753906, 10.799537658691406, 23.964935302734375, 14.793182373046875, -10.667911529541016, 1.9183883666992188, -0.1978759765625, -0.2200641632080078, -2.3520584106445312, 1.5305557250976562, 36.70782470703125, 3.0590972900390625, 1.6037940979003906, -8.847564697265625, -21.341506958007812, 1.8808364868164062, 38.42437744140625, 19.5654296875, -11.830648422241211, 9.171615600585938, 8.375579833984375, 40.221649169921875, 21.06134796142578, -0.17948150634765625, -0.11772918701171875, 19.347003936767578, 7.322013854980469], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000221.npy"}
|
||||
{"epoch": 0.3340891912320484, "step": 222, "batch_size": 64, "mean": 11.885089874267578, "std": 13.393060684204102, "min": -14.7998046875, "p10": -3.90005874633789, "median": 12.487850189208984, "p90": 27.396181869506837, "max": 45.255096435546875, "pos_frac": 0.78125, "sample": [-3.002452850341797, -14.503639221191406, 14.19894790649414, 12.143295288085938, 21.81344223022461, 25.11056137084961, 15.469181060791016, -4.047218322753906, 25.537734985351562, 33.578575134277344, 17.01025390625, 5.197990417480469, -7.277030944824219, 2.2128334045410156, 16.238637924194336, 3.7132492065429688, 5.540771484375, 5.508373260498047, 3.1097793579101562, 19.402061462402344, 12.832405090332031, -0.4336109161376953, 16.579238891601562, 24.732688903808594, -2.0401458740234375, 25.793655395507812, -1.5763015747070312, 45.255096435546875, 0.06794166564941406, 17.409404754638672, 7.79399299621582, 10.75335693359375, 27.070404052734375, 23.27051544189453, 10.390792846679688, 17.993072509765625, 13.232376098632812, -3.5566864013671875, -6.032388687133789, -0.6891937255859375, -14.7998046875, 17.42444610595703, 3.2135391235351562, 12.126350402832031, 2.692768096923828, 42.107666015625, 5.009208679199219, -12.121467590332031, 0.19234466552734375, 19.104747772216797, 21.185035705566406, -2.865215301513672, 27.613601684570312, 34.15911102294922, 15.766441345214844, 7.151908874511719, 27.53580093383789, 19.766250610351562, 10.523025512695312, 25.509796142578125, 15.743843078613281, 17.86957550048828, 36.421783447265625, -6.486968994140625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000222.npy"}
|
||||
{"epoch": 0.3356009070294785, "step": 223, "batch_size": 64, "mean": 10.319726943969727, "std": 12.236457824707031, "min": -10.093734741210938, "p10": -4.371919822692871, "median": 10.042067527770996, "p90": 26.53026237487794, "max": 43.69831085205078, "pos_frac": 0.703125, "sample": [17.28852081298828, 9.432144165039062, 17.83013916015625, 6.409278869628906, -3.0401077270507812, 17.58005142211914, -0.3088226318359375, 10.069717407226562, 9.305587768554688, 14.526090621948242, 2.0631542205810547, 27.572479248046875, 17.528146743774414, 18.17926025390625, -2.3805580139160156, -5.598186492919922, -1.1708202362060547, 18.235000610351562, 13.753387451171875, -4.276119232177734, 24.09842300415039, 11.96612548828125, -6.343971252441406, 13.645822525024414, -4.150665283203125, 5.6572113037109375, -0.09901618957519531, 21.822250366210938, -4.41297721862793, -8.825206756591797, 17.434627532958984, 10.01441764831543, -10.093734741210938, 6.224308013916016, 4.5311126708984375, 35.64552307128906, 23.4302978515625, 18.179088592529297, -3.296783447265625, -7.251800537109375, -3.654693603515625, -2.8914661407470703, 16.49863052368164, 12.470260620117188, 28.01007080078125, 8.449512481689453, 17.131500244140625, 20.823549270629883, 8.2967529296875, 29.748046875, 43.69831085205078, -5.74615478515625, 35.49139404296875, 29.157325744628906, 19.579544067382812, 10.499565124511719, 18.12091827392578, 16.94561004638672, 8.196102142333984, -2.591337203979492, 0.20642852783203125, 19.248916625976562, 2.8211593627929688, -1.2208099365234375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000223.npy"}
|
||||
{"epoch": 0.3371126228269085, "step": 224, "batch_size": 64, "mean": 6.189866065979004, "std": 11.605639457702637, "min": -16.52828598022461, "p10": -7.599504661560058, "median": 3.921062469482422, "p90": 20.74807395935059, "max": 40.751468658447266, "pos_frac": 0.6875, "sample": [6.938434600830078, 37.4722900390625, 12.607324600219727, 12.515670776367188, 2.278076171875, 26.137638092041016, 15.103973388671875, 11.669937133789062, -0.6609935760498047, 2.9667892456054688, 21.464832305908203, 6.253864288330078, 19.027450561523438, 25.210399627685547, -8.836551666259766, -0.2630596160888672, 3.1183128356933594, 8.335062026977539, 15.10421371459961, 1.556671142578125, 2.3361587524414062, 13.539596557617188, 6.9203948974609375, -12.679811477661133, -2.5518646240234375, -1.4185752868652344, -4.978008270263672, -6.36773681640625, 8.75335693359375, 6.751373291015625, 13.996978759765625, -16.52828598022461, -7.789209365844727, -4.08538818359375, -4.782928466796875, 0.17508888244628906, -0.8771591186523438, -12.174686431884766, 4.600334167480469, 10.284225463867188, -9.146621704101562, 9.849052429199219, 16.367895126342773, 0.38939666748046875, -7.1568603515625, 3.0733489990234375, 11.217369079589844, 8.88479232788086, -8.077468872070312, -0.999664306640625, -3.559001922607422, 10.649730682373047, 15.14356803894043, 19.075637817382812, 40.751468658447266, 9.114334106445312, 3.2404861450195312, 12.378753662109375, 0.5345172882080078, 29.146087646484375, -5.6849212646484375, 24.806686401367188, 3.241790771484375, 1.7868385314941406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000224.npy"}
|
||||
{"epoch": 0.3386243386243386, "step": 225, "batch_size": 64, "mean": 9.818340301513672, "std": 13.215198516845703, "min": -18.313873291015625, "p10": -4.612552070617674, "median": 6.797847747802734, "p90": 28.65045204162598, "max": 42.23918151855469, "pos_frac": 0.78125, "sample": [2.1209659576416016, 6.848480224609375, -0.024356842041015625, 2.7801513671875, 0.911346435546875, 21.629150390625, -2.548198699951172, -0.8623580932617188, 10.522268295288086, 11.585559844970703, 3.1404037475585938, 23.825515747070312, 2.8121490478515625, 23.579727172851562, -5.133533477783203, 2.1597328186035156, 6.704902648925781, 6.829154968261719, 17.032451629638672, 28.470626831054688, 31.75739097595215, 16.390869140625, 37.489830017089844, 10.897109985351562, 42.23918151855469, 6.76654052734375, 3.0825653076171875, 16.551231384277344, 28.727519989013672, -3.3969287872314453, 0.5587005615234375, -11.179000854492188, 14.829063415527344, -6.434917449951172, 31.859390258789062, 16.50762939453125, 2.7687225341796875, 17.245513916015625, 8.284881591796875, 3.856822967529297, 1.0024795532226562, 6.008720397949219, 26.79498291015625, 0.8107070922851562, 24.25617218017578, -18.313873291015625, -11.747406005859375, -1.5840682983398438, 18.044822692871094, 4.200035095214844, 17.670631408691406, -0.011095046997070312, 5.4532318115234375, 30.225494384765625, 23.658634185791016, -1.900991439819336, 3.6394729614257812, 13.539215087890625, 33.270809173583984, 11.527494430541992, -15.97479248046875, 11.615585327148438, 22.772193908691406, -7.7709808349609375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000225.npy"}
|
||||
{"epoch": 0.3401360544217687, "step": 226, "batch_size": 64, "mean": 8.040156364440918, "std": 12.984953880310059, "min": -19.843048095703125, "p10": -8.259468460083006, "median": 7.446342468261719, "p90": 26.918226051330578, "max": 33.12855529785156, "pos_frac": 0.75, "sample": [6.608833312988281, -8.796989440917969, 8.365119934082031, 15.185529708862305, 12.0841064453125, 7.985382080078125, 14.446739196777344, 3.6370372772216797, 10.463342666625977, -7.005252838134766, 18.360769271850586, 4.546775817871094, 10.756645202636719, 24.28412437438965, -11.979141235351562, 0.16156768798828125, 5.205390930175781, -15.550704956054688, 28.0592041015625, 20.043968200683594, 30.692195892333984, -2.6556396484375, 20.857505798339844, 2.1676406860351562, 10.231376647949219, 1.0329570770263672, 2.9269561767578125, 28.04712677001953, 20.796173095703125, 20.229888916015625, -5.258161544799805, 19.862472534179688, 6.866420745849609, 8.690483093261719, 33.12855529785156, 6.9073028564453125, 8.657585144042969, -13.15106201171875, -3.3442153930664062, 2.0908870697021484, 2.421051025390625, 11.342132568359375, -4.027870178222656, -19.843048095703125, 32.19078826904297, 30.12457275390625, -16.95220947265625, 9.80096435546875, -15.749900817871094, 11.971794128417969, 6.638034820556641, 4.737443923950195, 5.742343902587891, 13.676637649536133, 0.6187286376953125, 32.874542236328125, 21.53778076171875, 18.975479125976562, -4.896156311035156, -2.1409358978271484, 19.110305786132812, -1.9784717559814453, -4.41197395324707, 17.1690673828125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000226.npy"}
|
||||
{"epoch": 0.3416477702191988, "step": 227, "batch_size": 64, "mean": 8.263538360595703, "std": 13.904458999633789, "min": -24.70569610595703, "p10": -8.27801475524902, "median": 5.835563659667969, "p90": 27.318684387207032, "max": 35.238525390625, "pos_frac": 0.6875, "sample": [20.822914123535156, 8.19111442565918, 0.27529144287109375, 22.432952880859375, 1.426727294921875, -2.6339454650878906, 15.372535705566406, 33.35788345336914, -24.70569610595703, -5.178112030029297, 34.55250549316406, -1.6806716918945312, -2.4123096466064453, 28.69072914123535, 13.763145446777344, 5.792236328125, 0.886871337890625, -0.9801521301269531, 27.470909118652344, -2.3263397216796875, 14.530715942382812, 17.97039031982422, 6.434700012207031, 4.903964996337891, 34.895782470703125, 9.694412231445312, -1.3061752319335938, 18.625289916992188, -1.6533145904541016, -1.6459197998046875, 16.034027099609375, 35.238525390625, 12.495147705078125, -1.96490478515625, -0.4796600341796875, 18.924057006835938, 34.57478713989258, 1.2294464111328125, 13.738914489746094, 26.96349334716797, 3.3285751342773438, 5.8788909912109375, -3.6042938232421875, 7.5789337158203125, 18.00848388671875, 3.9477310180664062, 2.1374359130859375, 5.154991149902344, -10.559188842773438, -0.9024772644042969, -15.729854583740234, 15.559074401855469, 18.576934814453125, 17.472686767578125, 2.0075244903564453, -21.29949951171875, 6.545402526855469, 1.2460250854492188, 25.997865676879883, -10.960617065429688, -9.606544494628906, -10.920631408691406, 23.39647102355957, 23.290306091308594], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000227.npy"}
|
||||
{"epoch": 0.3431594860166289, "step": 228, "batch_size": 64, "mean": 8.67706298828125, "std": 13.132659912109375, "min": -20.798385620117188, "p10": -8.221243286132813, "median": 8.826620101928711, "p90": 29.097596740722658, "max": 41.12108612060547, "pos_frac": 0.734375, "sample": [-1.3224239349365234, 24.682167053222656, 3.3138427734375, 30.751272201538086, 12.475013732910156, 7.5793609619140625, 6.861991882324219, 31.195714950561523, 6.4390106201171875, -13.917457580566406, 11.396711349487305, -0.724639892578125, 4.60003662109375, -20.798385620117188, 6.103187561035156, 2.6178150177001953, 16.602676391601562, -4.2598876953125, 11.387104034423828, 10.216583251953125, 0.2612285614013672, 31.506256103515625, -10.569705963134766, 15.461700439453125, -8.260940551757812, 17.124217987060547, 15.423974990844727, 41.12108612060547, 7.997785568237305, 18.73272705078125, -1.787862777709961, -8.128616333007812, 11.347850799560547, 1.31591796875, 14.624017715454102, -6.090635299682617, 8.951961517333984, -1.6613807678222656, -14.026885986328125, -7.617488861083984, 7.662040710449219, 13.72613525390625, 31.18853759765625, -11.536293029785156, 11.673751831054688, 2.9290847778320312, 29.180740356445312, 28.903594970703125, -12.053550720214844, 2.4248123168945312, -2.2217330932617188, 9.68563461303711, 20.779327392578125, 15.877799987792969, -3.1489105224609375, 14.204879760742188, 24.473175048828125, 30.446510314941406, 20.631282806396484, 8.701278686523438, 4.25537109375, 11.101715087890625, 13.927003860473633, 11.594902038574219], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000228.npy"}
|
||||
{"epoch": 0.34467120181405897, "step": 229, "batch_size": 64, "mean": 7.492367744445801, "std": 12.854479789733887, "min": -27.11581039428711, "p10": -8.123320007324219, "median": 7.191784858703613, "p90": 24.509689331054695, "max": 34.17919921875, "pos_frac": 0.78125, "sample": [11.907135009765625, 22.958158493041992, 17.403091430664062, 10.832778930664062, 5.31036376953125, 11.981773376464844, 17.892534255981445, 7.902307510375977, 9.214488983154297, -11.546485900878906, 15.517242431640625, 11.96759033203125, 17.636512756347656, -3.5880279541015625, 9.2010498046875, 4.507617950439453, -22.6844482421875, -7.534015655517578, 20.810699462890625, 19.86008071899414, 10.04940414428711, 2.069061279296875, 4.553134918212891, 15.709434509277344, -8.204193115234375, 9.420703887939453, -8.413307189941406, -1.4086799621582031, 2.599050521850586, -5.841346740722656, 4.9083251953125, 34.17919921875, -27.11581039428711, 25.724775314331055, -7.9346160888671875, 3.183176040649414, 18.871559143066406, -12.74094009399414, 6.48126220703125, 10.484405517578125, 27.301971435546875, 15.396759033203125, 0.21449661254882812, -2.3470935821533203, 20.603675842285156, 0.25322723388671875, 25.174631118774414, 12.464555740356445, 20.240564346313477, 6.390779495239258, 0.740692138671875, -1.4459476470947266, 11.115303039550781, 0.7340011596679688, 26.031219482421875, 6.07177734375, 1.0881423950195312, 1.2340831756591797, 25.340171813964844, 31.09747314453125, 21.774520874023438, 4.824668884277344, 1.7442970275878906, -22.65747833251953], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000229.npy"}
|
||||
{"epoch": 0.34618291761148906, "step": 230, "batch_size": 64, "mean": 8.806828498840332, "std": 12.642942428588867, "min": -22.153247833251953, "p10": -3.256143951416014, "median": 6.42828893661499, "p90": 25.07278442382813, "max": 44.871337890625, "pos_frac": 0.8125, "sample": [-3.8769264221191406, 7.5839385986328125, 15.631467819213867, 1.198699951171875, 9.936786651611328, -20.07623291015625, 2.6202869415283203, 0.4931068420410156, 8.827102661132812, 25.49999237060547, 13.434661865234375, 31.80760955810547, 19.549163818359375, 30.4625244140625, 8.792999267578125, 8.569892883300781, 15.318881034851074, 23.325180053710938, -7.329673767089844, 3.0333251953125, -22.153247833251953, 3.4499359130859375, 11.133087158203125, 12.09677505493164, 14.091407775878906, -1.384552001953125, 23.637794494628906, 2.3025741577148438, 24.075965881347656, 18.636093139648438, 13.347572326660156, 4.377635955810547, -4.222627639770508, 5.051275253295898, 4.2830810546875, -0.0566864013671875, 1.6240692138671875, 3.1857547760009766, 44.871337890625, 7.506645202636719, 5.4224395751953125, 6.7941436767578125, -0.7242774963378906, 6.0525970458984375, 36.678199768066406, 19.79175567626953, -9.61726188659668, 31.53813934326172, 3.8262176513671875, 0.7549610137939453, 29.639373779296875, -15.825225830078125, 19.826974868774414, 19.004409790039062, 6.538431167602539, 9.913215637207031, 2.445840835571289, -0.8340549468994141, 3.6394271850585938, 1.9986686706542969, 6.318146705627441, 16.319290161132812, -1.8076515197753906, 5.286579132080078], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000230.npy"}
|
||||
{"epoch": 0.3476946334089191, "step": 231, "batch_size": 64, "mean": 12.482011795043945, "std": 12.3740234375, "min": -21.101181030273438, "p10": -3.3337383270263663, "median": 12.793190002441406, "p90": 28.247006797790537, "max": 39.78813934326172, "pos_frac": 0.828125, "sample": [22.405715942382812, 5.179878234863281, 19.184009552001953, 21.146656036376953, 1.8285903930664062, 9.796791076660156, 1.1365013122558594, 23.630508422851562, 4.246543884277344, 4.267633438110352, 12.048891067504883, 1.79510498046875, 11.729217529296875, -2.1517066955566406, -4.6913909912109375, 12.219100952148438, 14.116134643554688, 23.458694458007812, 10.471664428710938, -1.7568702697753906, 24.074676513671875, -3.846294403076172, 10.035099029541016, 19.53201675415039, 4.152565002441406, 31.214141845703125, -11.766342163085938, -8.21725845336914, 33.88866424560547, 18.065561294555664, 30.03270721435547, -8.658153533935547, 17.3939208984375, 29.044769287109375, 20.295604705810547, 2.264129638671875, 3.0786380767822266, 13.812076568603516, 8.326324462890625, -0.39006805419921875, 39.78813934326172, 5.8628082275390625, 11.634811401367188, 15.598800659179688, 16.989967346191406, 5.981910705566406, 23.348167419433594, -3.668975830078125, 8.082134246826172, -2.5515174865722656, 22.927459716796875, 23.792938232421875, 13.958915710449219, 20.63530731201172, 18.730918884277344, 32.70372009277344, 19.458267211914062, 19.556360244750977, 13.367279052734375, 20.352401733398438, -21.101181030273438, 10.613851547241211, 26.385560989379883, 34.00629425048828], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000231.npy"}
|
||||
{"epoch": 0.3492063492063492, "step": 232, "batch_size": 64, "mean": 8.543130874633789, "std": 15.11335277557373, "min": -40.594573974609375, "p10": -8.561944580078125, "median": 6.299592971801758, "p90": 27.381002807617197, "max": 54.4610595703125, "pos_frac": 0.78125, "sample": [0.9040985107421875, 28.243301391601562, 15.848526000976562, 34.235076904296875, 6.370456695556641, 21.689651489257812, 5.5927886962890625, 25.264312744140625, 0.7034187316894531, -9.048133850097656, -7.173564910888672, 3.4724197387695312, -8.934951782226562, 4.528606414794922, 7.111423492431641, 21.724773406982422, 0.9451179504394531, 3.955282211303711, 15.85272216796875, 2.0249900817871094, 0.8624153137207031, 17.640182495117188, 15.758377075195312, 9.74755859375, 7.487945556640625, 13.636362075805664, -7.095983505249023, 32.2750244140625, -4.834079742431641, 14.861740112304688, 8.642416000366211, 25.368972778320312, -10.578229904174805, -1.9650611877441406, 6.228729248046875, 1.1840286254882812, 1.1925525665283203, -10.741310119628906, 35.57487487792969, 7.1755523681640625, -0.3693809509277344, 23.246673583984375, 1.3860702514648438, 43.859588623046875, 35.246307373046875, 54.4610595703125, -8.8175048828125, -8.929512023925781, -3.8650283813476562, 5.691873550415039, 7.418060302734375, 22.05762481689453, 21.581314086914062, 10.145244598388672, 4.6732025146484375, -7.96563720703125, 0.8250389099121094, 16.317710876464844, 6.599945068359375, 7.181453704833984, 1.3564682006835938, -40.594573974609375, 13.479644775390625, 6.042333602905273], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000232.npy"}
|
||||
{"epoch": 0.3507180650037793, "step": 233, "batch_size": 64, "mean": 7.049837589263916, "std": 12.94009780883789, "min": -21.063011169433594, "p10": -7.791995811462401, "median": 4.8550519943237305, "p90": 22.97743759155274, "max": 40.443336486816406, "pos_frac": 0.640625, "sample": [40.443336486816406, 2.0727500915527344, 29.339614868164062, -4.645931243896484, 7.186534881591797, 14.931783676147461, 12.956016540527344, 14.169258117675781, -1.8018798828125, -0.9138107299804688, 8.296817779541016, 39.01795959472656, -0.9353752136230469, -1.0532150268554688, -5.0208892822265625, 16.401962280273438, 18.207481384277344, 3.9832515716552734, -2.1724624633789062, 1.487213134765625, 20.642074584960938, 5.7268524169921875, 17.797569274902344, 19.02569580078125, -11.752822875976562, 9.767921447753906, -11.329421997070312, 12.734249114990234, -8.744224548339844, -8.109527587890625, 14.73134994506836, 1.9271621704101562, 15.458404541015625, 10.129634857177734, 23.377479553222656, -21.063011169433594, 11.319229125976562, 2.5347938537597656, -2.5439605712890625, 1.0775413513183594, 6.762077331542969, -13.153236389160156, 22.04400634765625, -1.2479248046875, 16.357383728027344, -5.84735107421875, -0.699798583984375, 9.518753051757812, 16.397354125976562, 17.868724822998047, -5.490139007568359, -5.268922805786133, 3.052644729614258, 34.12487030029297, 25.629043579101562, 30.772811889648438, 3.7057876586914062, -6.073062896728516, 15.69580078125, -0.5378570556640625, 6.1763916015625, -8.589492797851562, -7.051088333129883, 2.3854293823242188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000233.npy"}
|
||||
{"epoch": 0.35222978080120937, "step": 234, "batch_size": 64, "mean": 6.008461952209473, "std": 12.839128494262695, "min": -23.660621643066406, "p10": -8.968844604492187, "median": 4.227685928344727, "p90": 24.390642166137702, "max": 35.067527770996094, "pos_frac": 0.640625, "sample": [6.193229675292969, -8.329483032226562, 0.7058563232421875, -9.221282958984375, 8.64095687866211, -1.6060523986816406, 3.4701385498046875, -14.838493347167969, 13.536590576171875, -8.37982177734375, 10.609222412109375, 8.68798828125, 16.98603057861328, 12.55007553100586, 26.11878204345703, 8.57870864868164, -7.5715179443359375, -15.834953308105469, 15.477294921875, -3.1266841888427734, 4.775947570800781, 3.0667171478271484, 27.05286407470703, 35.067527770996094, -0.29997825622558594, -5.6768341064453125, 26.252220153808594, -0.6341552734375, -23.660621643066406, 3.6685943603515625, 14.606147766113281, 25.163570404052734, 28.507097244262695, 22.537708282470703, -0.8768939971923828, -16.56733512878418, -0.4996185302734375, -1.2854766845703125, 5.885019302368164, -6.659549713134766, 19.684112548828125, 13.985336303710938, 3.9406089782714844, 22.58599090576172, 20.732913970947266, 3.761198043823242, 3.167560577392578, 16.720745086669922, 22.587142944335938, -5.225503921508789, -2.1188507080078125, 4.838106155395508, -12.57998275756836, 15.127517700195312, -11.232574462890625, 0.9626655578613281, 16.913406372070312, 3.0213088989257812, -1.0451889038085938, 4.514762878417969, 5.162925720214844, 29.17753028869629, 13.523185729980469, -6.732898712158203], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000234.npy"}
|
||||
{"epoch": 0.35374149659863946, "step": 235, "batch_size": 64, "mean": 11.060816764831543, "std": 10.744832992553711, "min": -10.755714416503906, "p10": -1.2412324905395509, "median": 8.883941650390625, "p90": 24.655611419677737, "max": 36.458961486816406, "pos_frac": 0.84375, "sample": [1.9597339630126953, 14.723838806152344, 6.968292236328125, 13.38287353515625, 17.489181518554688, 7.898193359375, 6.214759826660156, 36.458961486816406, 28.344940185546875, 18.145828247070312, 1.048614501953125, 14.036300659179688, 4.813240051269531, 11.690460205078125, 16.545989990234375, 19.31791877746582, 13.299261093139648, 3.7263755798339844, 17.515958786010742, 14.269107818603516, 7.529533386230469, 6.11248779296875, -1.227081298828125, 19.042327880859375, 23.0120849609375, 17.26972198486328, -2.8454437255859375, -4.642951965332031, 30.91421127319336, -10.755714416503906, 20.123756408691406, 2.4363670349121094, 7.736982345581055, 15.3857421875, 23.506389617919922, 6.313749313354492, 16.123920440673828, 2.549022674560547, 23.5609130859375, -10.142135620117188, -1.2472972869873047, 24.809051513671875, 9.25048828125, 7.424659729003906, 20.6005859375, 2.141124725341797, 0.46176910400390625, 20.04910659790039, 34.90513610839844, 6.743709564208984, 8.51739501953125, -0.582672119140625, 25.745498657226562, 7.032779693603516, 4.115726470947266, -6.7845306396484375, 24.810020446777344, 20.54471206665039, 24.297584533691406, 0.9842300415039062, 15.696762084960938, -0.3531036376953125, 1.3365669250488281, -2.460742950439453], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000235.npy"}
|
||||
{"epoch": 0.35525321239606955, "step": 236, "batch_size": 64, "mean": 9.029207229614258, "std": 13.852867126464844, "min": -25.223007202148438, "p10": -4.799645042419433, "median": 7.353437423706055, "p90": 28.222344779968264, "max": 46.99208068847656, "pos_frac": 0.703125, "sample": [15.593124389648438, -3.5031585693359375, 0.08847808837890625, 20.137958526611328, 10.07281494140625, 3.9771575927734375, 16.10254669189453, 24.053451538085938, -2.706939697265625, 1.8672046661376953, 11.522651672363281, 6.39813232421875, -4.48246955871582, 6.03546142578125, -6.60203742980957, 4.689491271972656, 14.889326095581055, 19.754150390625, 14.897972106933594, 27.814470291137695, -0.7858543395996094, 1.73114013671875, 5.404563903808594, -1.6238136291503906, 15.393051147460938, 8.910438537597656, 12.840744018554688, 19.4405517578125, 31.598529815673828, 17.855674743652344, 29.900009155273438, -0.7215499877929688, -0.4460296630859375, 8.403404235839844, 40.05628967285156, -4.935577392578125, -1.6156768798828125, 46.99208068847656, 3.7071571350097656, 8.521238327026367, -11.784591674804688, 7.6890869140625, 17.080974578857422, 24.526294708251953, 5.469596862792969, 7.449806213378906, -1.4059791564941406, -6.649986267089844, 28.39714813232422, 7.257068634033203, -0.5985221862792969, -25.197174072265625, -25.223007202148438, 31.49936294555664, 31.569068908691406, 21.776901245117188, 14.559478759765625, -9.027624130249023, -2.392375946044922, -1.1675453186035156, 24.177566528320312, 0.9048538208007812, 10.626955032348633, 7.1057586669921875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000236.npy"}
|
||||
{"epoch": 0.35676492819349964, "step": 237, "batch_size": 64, "mean": 9.305397033691406, "std": 13.789247512817383, "min": -23.73436737060547, "p10": -5.738035583496092, "median": 10.076177597045898, "p90": 25.168772125244143, "max": 47.811370849609375, "pos_frac": 0.796875, "sample": [17.453258514404297, 20.922706604003906, 10.535308837890625, 30.22071075439453, 0.07483673095703125, -4.1896209716796875, 8.139114379882812, 12.352859497070312, 4.389102935791016, 16.308977127075195, 11.732025146484375, 18.156646728515625, -8.548629760742188, 7.155792236328125, 16.660789489746094, 16.931724548339844, -2.0761489868164062, 20.763534545898438, -10.816207885742188, 18.763282775878906, -21.465927124023438, 17.9379940032959, 12.086135864257812, 14.004093170166016, 33.04539489746094, -3.1825428009033203, 11.315704345703125, 20.651775360107422, -1.6060791015625, 5.170156478881836, 17.276288986206055, 4.250312805175781, 12.119880676269531, -23.715362548828125, 6.521909713745117, 3.472412109375, 6.107246398925781, -23.73436737060547, 37.49891662597656, 1.4173049926757812, 12.28713607788086, 12.532722473144531, 1.9428443908691406, 5.427032470703125, 3.1869735717773438, 0.5798683166503906, 15.617353439331055, 10.270004272460938, 25.220703125, 25.04759979248047, 22.09424591064453, 8.248321533203125, 28.89502716064453, 47.811370849609375, -22.995941162109375, 9.88235092163086, -1.6178646087646484, 17.78980827331543, 4.569480895996094, 9.466400146484375, -0.4103584289550781, 5.71728515625, -6.401641845703125, 26.28335952758789], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000237.npy"}
|
||||
{"epoch": 0.35827664399092973, "step": 238, "batch_size": 64, "mean": 9.301795959472656, "std": 12.846216201782227, "min": -15.369457244873047, "p10": -7.196720123291016, "median": 6.74200439453125, "p90": 26.86894588470459, "max": 33.497406005859375, "pos_frac": 0.734375, "sample": [-2.563945770263672, 5.18994140625, 27.823902130126953, -9.284355163574219, 22.385379791259766, 33.497406005859375, 2.173107147216797, 5.786949157714844, 23.971534729003906, -2.4841537475585938, 7.216827392578125, -7.25372314453125, 26.959930419921875, 6.013191223144531, 10.32948112487793, 11.246192932128906, -1.3129825592041016, -5.283697128295898, 10.775802612304688, 26.656648635864258, 20.1275634765625, 32.1094970703125, 18.542850494384766, 14.952220916748047, 2.5718822479248047, 18.810945510864258, 15.061485290527344, -7.063713073730469, -11.427047729492188, 21.389114379882812, 21.014392852783203, 17.978160858154297, 3.2561779022216797, 14.777297973632812, -0.6989860534667969, 4.388608932495117, 20.549835205078125, -2.676177978515625, 23.72551727294922, 31.218582153320312, 4.921642303466797, -3.4309616088867188, 0.7393341064453125, 2.926727294921875, 23.13372802734375, 6.267181396484375, 32.585174560546875, 21.887893676757812, 8.060922622680664, 4.606475830078125, 7.25262451171875, 16.492481231689453, 24.02311134338379, -11.202999114990234, -7.823329925537109, 1.2544517517089844, -10.601409912109375, 5.268119812011719, 28.754261016845703, -4.507720947265625, 14.283447265625, -5.610160827636719, -15.369457244873047, 0.9517841339111328], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000238.npy"}
|
||||
{"epoch": 0.35978835978835977, "step": 239, "batch_size": 64, "mean": 10.014331817626953, "std": 13.634687423706055, "min": -22.59857749938965, "p10": -6.200550079345702, "median": 9.77511978149414, "p90": 27.049618339538576, "max": 44.350921630859375, "pos_frac": 0.78125, "sample": [7.7836761474609375, 9.337051391601562, 8.504852294921875, 2.136425018310547, 31.31640625, 14.907196044921875, -5.725799560546875, 3.6843338012695312, -1.7255706787109375, 6.717475891113281, -6.404014587402344, 8.948013305664062, 0.3018684387207031, 30.865966796875, 19.273757934570312, 3.378387451171875, -8.023826599121094, -12.372962951660156, 15.16766357421875, 21.59283447265625, 3.0846309661865234, 8.874763488769531, 22.312828063964844, 26.22930145263672, 34.576839447021484, 7.2904815673828125, 12.144718170166016, 14.967525482177734, 11.451972961425781, 10.565689086914062, 20.534866333007812, 6.855125427246094, -8.884239196777344, -5.644506454467773, 9.675872802734375, 27.278488159179688, -2.178049087524414, 2.6507396697998047, -5.39892578125, 10.58428955078125, -2.592334747314453, 10.361923217773438, 11.41290283203125, 13.862106323242188, -17.081390380859375, 13.156509399414062, 35.67840576171875, 44.350921630859375, 40.11747741699219, -12.686180114746094, 0.1518096923828125, 4.240211486816406, -22.59857749938965, 23.37866973876953, 18.475555419921875, 9.932418823242188, 26.515588760375977, -0.22410964965820312, 10.141921997070312, 24.151092529296875, 4.035341262817383, 19.019210815429688, 20.607200622558594, 9.874366760253906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000239.npy"}
|
||||
{"epoch": 0.36130007558578986, "step": 240, "batch_size": 64, "mean": 12.062172889709473, "std": 13.3220853805542, "min": -25.083251953125, "p10": -3.6701511383056626, "median": 12.053388595581055, "p90": 30.28308372497559, "max": 40.69816589355469, "pos_frac": 0.828125, "sample": [12.040813446044922, -6.315299987792969, 21.481658935546875, 1.90673828125, 12.065963745117188, 40.69816589355469, 15.653961181640625, 22.735946655273438, -8.441394805908203, -4.196697235107422, 7.556846618652344, -1.2318878173828125, 13.196929931640625, 21.901277542114258, 30.869869232177734, 35.85334777832031, 19.532114028930664, -10.000066757202148, 27.166202545166016, 20.921783447265625, -4.601970672607422, 16.94403076171875, 11.104814529418945, -12.376686096191406, 14.926681518554688, 33.787315368652344, 13.472702026367188, 16.965011596679688, 33.57464599609375, 19.72711944580078, 23.300643920898438, 26.7392578125, 4.254371643066406, 7.40887451171875, 1.4811935424804688, 18.736549377441406, 8.189895629882812, 11.849090576171875, 25.224998474121094, 0.9714908599853516, 20.669387817382812, 7.053661346435547, -2.3203887939453125, 18.32025909423828, 33.960594177246094, -25.083251953125, 0.40524864196777344, 1.3604621887207031, 10.319669723510742, 2.5295486450195312, 12.250267028808594, 26.497093200683594, -1.5084648132324219, 8.454055786132812, 0.3715629577636719, 4.399631500244141, 28.913917541503906, 32.14563751220703, 9.836784362792969, 3.088815689086914, 21.23052215576172, 2.4240798950195312, -2.4415435791015625, 14.02520751953125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000240.npy"}
|
||||
{"epoch": 0.36281179138321995, "step": 241, "batch_size": 64, "mean": 6.591043472290039, "std": 12.251733779907227, "min": -16.88128662109375, "p10": -7.3034515380859375, "median": 4.896890640258789, "p90": 21.80318832397461, "max": 38.238380432128906, "pos_frac": 0.625, "sample": [-10.39837646484375, -9.296749114990234, 19.55403709411621, -6.841524124145508, 14.085731506347656, -8.132080078125, 2.8693084716796875, 19.541725158691406, 15.061609268188477, 21.150436401367188, -5.254119873046875, 24.43145751953125, -6.084465026855469, 6.716348648071289, 0.488555908203125, 21.916954040527344, 1.6462326049804688, 2.7780284881591797, -2.6944198608398438, 21.537734985351562, 16.211444854736328, 13.241622924804688, -4.39276123046875, 5.685523986816406, 6.180213928222656, -3.0220069885253906, 17.536617279052734, -1.5507011413574219, -0.10063934326171875, -7.257255554199219, 19.261503219604492, 31.205902099609375, 34.2281494140625, 2.303203582763672, -3.5733108520507812, -12.824508666992188, -3.640838623046875, 16.490386962890625, 0.8545913696289062, 4.576801300048828, -11.592613220214844, -0.012020111083984375, -0.28221702575683594, 22.250831604003906, 6.909276962280273, 10.614387512207031, -16.88128662109375, 38.238380432128906, 12.200263977050781, -7.323249816894531, 20.7119140625, 5.2175750732421875, 18.76580047607422, 5.21697998046875, 22.57537078857422, -2.685504913330078, -6.183704376220703, 17.9730167388916, 13.817710876464844, 10.743528366088867, 1.6910572052001953, 9.296529769897461, -3.2847232818603516, -0.6408748626708984], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000241.npy"}
|
||||
{"epoch": 0.36432350718065004, "step": 242, "batch_size": 64, "mean": 10.835296630859375, "std": 14.615829467773438, "min": -23.565269470214844, "p10": -1.4857238769531247, "median": 9.616718292236328, "p90": 30.325737762451176, "max": 54.0994873046875, "pos_frac": 0.796875, "sample": [-17.2867431640625, 8.165603637695312, 11.767723083496094, 5.733188629150391, 18.634864807128906, -12.101882934570312, 24.65229034423828, 18.104345321655273, 14.982418060302734, 27.77991485595703, 5.504188537597656, 15.83929443359375, -0.27392005920410156, 42.017066955566406, 29.002246856689453, -1.583465576171875, 30.892948150634766, -0.6350631713867188, 4.090278625488281, -0.9981422424316406, 6.55718994140625, 20.944900512695312, 33.37489318847656, 3.3856277465820312, -1.257659912109375, 15.248687744140625, 2.867431640625, 1.696380615234375, 19.097457885742188, 21.596878051757812, 14.598182678222656, 0.6041393280029297, 42.125823974609375, 7.36700439453125, 19.090045928955078, 11.067832946777344, 0.33487701416015625, -0.8241157531738281, 54.0994873046875, 2.966522216796875, -14.774703979492188, 16.13683319091797, 13.051040649414062, 1.6316070556640625, 18.359542846679688, 0.35643768310546875, 15.926139831542969, 11.094795227050781, 17.677841186523438, 23.119991302490234, 36.54240798950195, 12.856334686279297, 35.71708679199219, -4.142789840698242, 0.1004486083984375, 11.909370422363281, 14.286575317382812, 1.8706512451171875, 7.9504547119140625, -23.565269470214844, 3.7738800048828125, -0.5817985534667969, -12.781103134155273, 7.714441299438477], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000242.npy"}
|
||||
{"epoch": 0.36583522297808013, "step": 243, "batch_size": 64, "mean": 9.294159889221191, "std": 12.920491218566895, "min": -15.514236450195312, "p10": -5.861220932006835, "median": 8.066396713256836, "p90": 27.7948148727417, "max": 34.36322021484375, "pos_frac": 0.78125, "sample": [1.5419921875, 26.754241943359375, 17.970787048339844, -14.2847900390625, 27.637231826782227, 3.252531051635742, -15.514236450195312, 0.7488861083984375, 15.715370178222656, -2.2631301879882812, 27.37591552734375, 8.466445922851562, 3.3963623046875, 5.865203857421875, 3.8672103881835938, 31.800697326660156, -1.4738025665283203, 4.523902893066406, 25.34987449645996, -3.54156494140625, 4.61956787109375, 34.36322021484375, 10.423572540283203, -4.0826568603515625, 7.8562469482421875, 16.526365280151367, 1.8893966674804688, 15.850439071655273, 22.196746826171875, 13.651084899902344, -6.5535125732421875, -9.622730255126953, 14.1817626953125, -3.8293628692626953, 22.541292190551758, 14.55459976196289, 19.78418731689453, 23.36931610107422, -13.5673828125, 7.66387939453125, 33.117401123046875, 9.221900939941406, 11.949138641357422, 31.643295288085938, 8.951858520507812, 0.16352081298828125, 0.3041343688964844, 9.618635177612305, 27.862350463867188, 13.046180725097656, 1.7595977783203125, 3.0513687133789062, 2.7365188598632812, 28.084182739257812, -6.340312957763672, 21.655548095703125, 8.276546478271484, 3.816448211669922, -4.210954666137695, -4.743339538574219, -15.44378662109375, 29.678298950195312, 6.2635650634765625, 15.358963012695312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000243.npy"}
|
||||
{"epoch": 0.3673469387755102, "step": 244, "batch_size": 64, "mean": 12.21798324584961, "std": 11.764215469360352, "min": -19.643096923828125, "p10": 0.4073110580444339, "median": 10.548473358154297, "p90": 27.970796203613283, "max": 39.41471862792969, "pos_frac": 0.921875, "sample": [8.921211242675781, 12.409568786621094, 14.376922607421875, 5.710273742675781, 8.254669189453125, 8.961896896362305, -1.2270755767822266, 26.444923400878906, 12.74415397644043, -4.976814270019531, 3.714832305908203, 17.37411880493164, 3.799896240234375, 24.250534057617188, 9.417734146118164, 6.3734283447265625, 11.931266784667969, 31.302459716796875, -19.643096923828125, 22.9366512298584, 14.265121459960938, 7.911323547363281, -14.519634246826172, 17.375354766845703, 6.351692199707031, 20.682044982910156, 14.089860916137695, 27.935195922851562, 30.447853088378906, 5.453367233276367, 33.39784240722656, 10.408058166503906, 0.2363567352294922, 0.7426071166992188, 3.1885604858398438, 4.584611892700195, 0.2636127471923828, 17.643909454345703, 30.372772216796875, 39.41471862792969, 2.063383102416992, 27.986053466796875, 1.8609962463378906, 7.1132965087890625, 0.8283271789550781, 31.480300903320312, 15.677543640136719, 24.35670280456543, 1.1339950561523438, 8.01034927368164, 2.6474056243896484, 12.873687744140625, 25.341190338134766, 12.608753204345703, 27.016952514648438, 14.689918518066406, -1.763031005859375, 10.688888549804688, 3.601551055908203, 4.803276062011719, 16.714004516601562, 4.407051086425781, 26.63311767578125, 27.854446411132812], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000244.npy"}
|
||||
{"epoch": 0.3688586545729403, "step": 245, "batch_size": 64, "mean": 13.484872817993164, "std": 12.971802711486816, "min": -18.0458984375, "p10": -1.2753293991088859, "median": 13.052188873291016, "p90": 31.518756866455085, "max": 44.07804870605469, "pos_frac": 0.875, "sample": [15.122421264648438, 12.709503173828125, 13.666744232177734, 32.56513977050781, 12.270608901977539, 1.6864643096923828, 3.9680213928222656, 25.75169563293457, 13.0087890625, 15.107917785644531, 2.061809539794922, 11.00872802734375, 21.62617301940918, 14.102771759033203, 17.08885955810547, 32.21111297607422, 6.58734130859375, 20.936973571777344, -10.83331298828125, 44.07804870605469, 36.07283020019531, 4.708763122558594, -1.8094024658203125, 17.567739486694336, 20.77829360961914, 19.965240478515625, 10.225814819335938, -0.459320068359375, 18.699935913085938, 0.4080390930175781, 29.227645874023438, -10.82931137084961, 0.43143463134765625, 14.670585632324219, 5.8404083251953125, 17.000022888183594, -1.9283447265625, 6.141595840454102, 17.241146087646484, 39.73133087158203, 12.394935607910156, 3.53509521484375, 39.07665252685547, 4.435325622558594, -1.6250476837158203, 11.327718734741211, 33.12201690673828, 29.90325927734375, 6.0292205810546875, 7.15802001953125, 20.962005615234375, 18.74007797241211, -12.26959228515625, 22.418529510498047, 29.61644744873047, 13.095588684082031, -18.0458984375, 29.19962501525879, 8.207157135009766, 6.963842391967773, 7.4628143310546875, 23.153167724609375, 5.854328155517578, 13.906291961669922], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000245.npy"}
|
||||
{"epoch": 0.37037037037037035, "step": 246, "batch_size": 64, "mean": 10.877462387084961, "std": 15.950652122497559, "min": -21.45047378540039, "p10": -5.812055015563965, "median": 8.971477508544922, "p90": 30.895558166503907, "max": 49.098388671875, "pos_frac": 0.765625, "sample": [21.796367645263672, 24.687660217285156, 31.206045150756836, 31.630809783935547, 7.86944580078125, 3.4024391174316406, -21.45047378540039, 29.849960327148438, 12.036247253417969, -5.736444473266602, 26.564544677734375, 30.470245361328125, 10.00518798828125, -4.188104629516602, 21.67804718017578, 16.778263092041016, 8.709403991699219, -5.69941520690918, -7.585243225097656, -1.4376983642578125, 8.15234375, 22.79092788696289, 2.7633895874023438, 31.077835083007812, 40.64977264404297, -19.974166870117188, 5.7752838134765625, 49.098388671875, -5.515083312988281, 9.233551025390625, 1.9008769989013672, 26.615432739257812, 7.259613037109375, 5.283845901489258, 6.54669189453125, 15.21435546875, -20.565414428710938, 11.505409240722656, 32.97666549682617, 0.8913192749023438, 13.419334411621094, -20.50006866455078, -5.844459533691406, 25.52544403076172, 13.4857177734375, -3.8018341064453125, 16.590293884277344, 21.932083129882812, 27.946029663085938, -20.3226318359375, 5.5105133056640625, 6.523967742919922, 13.672262191772461, 46.528724670410156, 4.8375701904296875, 1.0427055358886719, 20.748516082763672, 27.85625457763672, -2.95501708984375, 6.62286376953125, 11.925712585449219, 6.258018493652344, 17.85717010498047, -0.96990966796875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000246.npy"}
|
||||
{"epoch": 0.37188208616780044, "step": 247, "batch_size": 64, "mean": 6.412394046783447, "std": 12.790541648864746, "min": -31.226959228515625, "p10": -9.544288635253906, "median": 6.496849060058594, "p90": 20.8168212890625, "max": 45.10107421875, "pos_frac": 0.703125, "sample": [-8.753448486328125, 6.973407745361328, 5.389671325683594, 7.691135406494141, 5.159397125244141, -9.664268493652344, -0.5568084716796875, 26.384613037109375, 30.428905487060547, 0.1906890869140625, 23.961349487304688, 11.71856689453125, 21.58805274963379, 14.261331558227539, 6.4781646728515625, 20.8505859375, 14.588781356811523, 13.546802520751953, -3.191965103149414, -14.4200439453125, 45.10107421875, 19.981857299804688, -20.872177124023438, -7.019330978393555, 3.9020462036132812, 6.380989074707031, 14.190959930419922, -0.7937850952148438, -9.264335632324219, 14.749755859375, 16.867904663085938, 10.365821838378906, 6.515533447265625, 4.6253204345703125, 5.909759521484375, 5.2728118896484375, 16.813858032226562, -31.226959228515625, -12.093841552734375, -0.6448841094970703, -1.1461715698242188, 4.437816619873047, -1.581024169921875, 24.741790771484375, 20.738037109375, 9.647296905517578, 2.4942359924316406, -4.416259765625, 6.302684783935547, 10.235267639160156, 9.781219482421875, 12.175722122192383, -3.9474411010742188, 8.530252456665039, -16.161230087280273, 12.885515213012695, 14.303031921386719, 19.56123161315918, 1.0566177368164062, -5.1942138671875, -10.78759765625, 14.182174682617188, 14.179595947265625, 6.98736572265625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000247.npy"}
|
||||
{"epoch": 0.37339380196523053, "step": 248, "batch_size": 64, "mean": 10.455678939819336, "std": 15.783217430114746, "min": -19.101112365722656, "p10": -9.817336082458494, "median": 6.956785202026367, "p90": 30.55037651062012, "max": 49.73096466064453, "pos_frac": 0.796875, "sample": [7.702857971191406, 16.187255859375, -16.440902709960938, -16.742645263671875, 41.76941680908203, 24.33587646484375, 29.10235595703125, 16.876602172851562, 0.7097911834716797, -6.078517913818359, 0.9949779510498047, 26.92080307006836, 15.86724853515625, -8.10453987121582, 28.488685607910156, 9.816543579101562, 33.69596862792969, -1.784759521484375, -13.828550338745117, 11.320945739746094, -13.466056823730469, 3.0450706481933594, 3.168731689453125, 4.211614608764648, 28.970481872558594, 13.37722396850586, 22.16892433166504, 6.765514373779297, 11.371395111083984, 30.457809448242188, -19.101112365722656, 29.26323699951172, 0.12621498107910156, -4.501726150512695, 32.02486038208008, 23.52474021911621, 7.1480560302734375, 5.625831604003906, 4.777780532836914, 14.505340576171875, 17.5184326171875, 25.926254272460938, 4.994220733642578, 37.971717834472656, 33.434722900390625, 8.092594146728516, 2.848165512084961, 1.2304039001464844, 5.693258285522461, 17.72222137451172, 49.73096466064453, 3.311016082763672, 18.392684936523438, 3.9507827758789062, -16.564453125, 6.551271438598633, 1.091217041015625, 30.507099151611328, 30.568923950195312, -5.502498626708984, 1.021017074584961, 2.3955459594726562, -10.5513916015625, -5.444061279296875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000248.npy"}
|
||||
{"epoch": 0.3749055177626606, "step": 249, "batch_size": 64, "mean": 4.365830421447754, "std": 13.832294464111328, "min": -30.464401245117188, "p10": -14.505099487304685, "median": 7.030521392822266, "p90": 18.917801666259766, "max": 34.265777587890625, "pos_frac": 0.6875, "sample": [10.87890625, -12.7740478515625, 6.3599090576171875, 7.0278472900390625, 2.4086647033691406, -10.638320922851562, -17.1951904296875, 11.055778503417969, 10.649993896484375, 13.584068298339844, 8.509590148925781, 16.604251861572266, 2.1928138732910156, 16.755294799804688, -21.6746826171875, -10.891769409179688, 8.490463256835938, -16.23114776611328, -12.100616455078125, -9.134952545166016, 2.902799606323242, -4.44244384765625, -10.013145446777344, 7.984598159790039, 9.33230209350586, 5.741539001464844, 1.0745220184326172, -3.1891555786132812, 34.265777587890625, -5.279914855957031, -9.239723205566406, 18.812850952148438, 2.622722625732422, 7.038675308227539, 22.068058013916016, -24.69732666015625, 5.229484558105469, -0.861724853515625, 9.087715148925781, -19.076171875, 6.014787673950195, 28.404621124267578, 19.254409790039062, 32.19465637207031, 15.995040893554688, 7.143314361572266, -15.246978759765625, 18.835086822509766, 18.102142333984375, 7.033195495605469, 10.181497573852539, 14.92214584350586, 11.26483154296875, 1.9327163696289062, 12.848308563232422, 8.826934814453125, -4.358602523803711, 16.84677505493164, 29.743656158447266, -30.464401245117188, 18.953250885009766, -11.031444549560547, 1.0615768432617188, 7.717338562011719], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000249.npy"}
|
||||
{"epoch": 0.3764172335600907, "step": 250, "batch_size": 64, "mean": 11.281606674194336, "std": 14.730340003967285, "min": -35.99598693847656, "p10": -4.311057472229003, "median": 10.61890697479248, "p90": 29.866773796081546, "max": 44.4437255859375, "pos_frac": 0.78125, "sample": [11.642351150512695, 8.254827499389648, -3.8947582244873047, 7.246921539306641, 0.1941375732421875, 18.2315673828125, 15.361183166503906, 38.179359436035156, 28.8594970703125, 4.249748229980469, 18.206188201904297, -4.489471435546875, 26.723251342773438, 4.778270721435547, 17.4385986328125, 1.9258708953857422, 0.808990478515625, -35.99598693847656, 22.883771896362305, 6.239097595214844, 9.595462799072266, -8.127138137817383, 7.861297607421875, -14.223190307617188, 21.577394485473633, 2.91961669921875, 33.29096603393555, 13.631233215332031, 4.198631286621094, 22.0062255859375, -4.603031158447266, 21.099884033203125, 30.298463821411133, 0.17644691467285156, 16.538040161132812, -0.169158935546875, -0.056415557861328125, 17.41353988647461, 44.4437255859375, 15.90625, -10.002399444580078, -9.41537857055664, 12.438358306884766, 0.8805618286132812, 36.61430358886719, 13.295829772949219, 33.83013916015625, 22.446651458740234, 28.539012908935547, 23.841659545898438, -1.59100341796875, 0.5565414428710938, 21.667869567871094, 2.81805419921875, -0.132568359375, 7.635200500488281, 42.837799072265625, 23.173927307128906, 11.95859146118164, -0.066802978515625, -3.232830047607422, 24.108726501464844, 6.100055694580078, 13.098831176757812], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000250.npy"}
|
||||
{"epoch": 0.3779289493575208, "step": 251, "batch_size": 64, "mean": 7.386477470397949, "std": 13.886478424072266, "min": -24.889816284179688, "p10": -7.693423080444335, "median": 6.6252288818359375, "p90": 23.340425109863286, "max": 43.65027618408203, "pos_frac": 0.6875, "sample": [1.9276103973388672, 37.572784423828125, 37.68683624267578, -0.9359264373779297, 5.556617736816406, 9.160682678222656, -24.889816284179688, 9.350269317626953, 14.342010498046875, 43.65027618408203, -0.07840728759765625, -13.472824096679688, 6.279541015625, -4.602561950683594, 3.763082504272461, -6.912525177001953, 8.009536743164062, 29.829681396484375, 5.623952865600586, -20.616836547851562, -3.9168853759765625, 35.58500671386719, 5.743303298950195, -16.7625732421875, 5.24578857421875, -2.5034942626953125, 18.06548309326172, 16.17376708984375, -6.298377990722656, -0.038299560546875, 15.474098205566406, 16.477066040039062, 2.8713111877441406, 7.117170333862305, 14.143386840820312, 9.11572265625, 16.607986450195312, 6.7890625, -7.002536773681641, -0.88812255859375, 23.691543579101562, 1.693084716796875, 16.703506469726562, 17.547840118408203, 6.480247497558594, 6.070610046386719, 12.493011474609375, 7.356929779052734, 13.479385375976562, 22.521148681640625, 19.800445556640625, 6.770210266113281, 25.043888092041016, -6.595390319824219, -2.31689453125, -1.914407730102539, 16.413745880126953, -13.69195556640625, 9.197996139526367, -16.47051239013672, -7.9895172119140625, 14.956527709960938, 21.917160034179688, 6.333110809326172], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000251.npy"}
|
||||
{"epoch": 0.3794406651549509, "step": 252, "batch_size": 64, "mean": 7.06661319732666, "std": 15.511738777160645, "min": -39.778594970703125, "p10": -9.616470718383786, "median": 7.360647201538086, "p90": 26.386022186279302, "max": 43.325836181640625, "pos_frac": 0.75, "sample": [9.787443161010742, -35.87025451660156, -5.275045394897461, -31.189208984375, 21.66302490234375, 0.84454345703125, 7.099246978759766, 26.897964477539062, 1.9182071685791016, 25.191490173339844, 30.760318756103516, 12.110977172851562, 5.629768371582031, 13.544853210449219, 35.177032470703125, 0.05010986328125, 32.77728271484375, -14.744058609008789, 7.622047424316406, 0.9232654571533203, -13.235347747802734, 5.267057418823242, 18.544830322265625, 9.388038635253906, 10.499908447265625, 5.952117919921875, 3.5207977294921875, -5.993335723876953, 23.372594833374023, -2.5305328369140625, 12.02177619934082, 37.647216796875, 22.528057098388672, 9.751007080078125, 9.470664978027344, 3.6072425842285156, 12.7852783203125, 9.339483261108398, 8.407831192016602, -4.338336944580078, 9.490482330322266, -3.737293243408203, -1.6983108520507812, 13.717691421508789, 4.534332275390625, 43.325836181640625, -11.169242858886719, 4.055572509765625, -39.778594970703125, -12.626766204833984, -3.6351184844970703, 4.063507080078125, 11.205560684204102, -1.0264053344726562, 8.802207946777344, 18.260398864746094, 6.711280822753906, 21.85834503173828, 6.209434509277344, 9.526542663574219, 27.7640380859375, 18.285858154296875, -5.906194686889648, 3.1047439575195312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000252.npy"}
|
||||
{"epoch": 0.38095238095238093, "step": 253, "batch_size": 64, "mean": 7.683012962341309, "std": 12.633442878723145, "min": -18.99197769165039, "p10": -7.995293617248535, "median": 7.4367265701293945, "p90": 22.62368144989014, "max": 35.036563873291016, "pos_frac": 0.703125, "sample": [13.760055541992188, -7.740020751953125, 33.01555252075195, 1.340606689453125, -0.7605133056640625, 9.814815521240234, 8.045364379882812, 22.402767181396484, 14.768569946289062, 7.932229995727539, 17.128082275390625, -3.4754867553710938, 19.35608673095703, 2.2555313110351562, 1.8255939483642578, -13.362743377685547, -11.690673828125, 10.896463394165039, -8.137794494628906, 15.310531616210938, 25.558975219726562, 15.133161544799805, -18.99197769165039, 15.203353881835938, 8.813039779663086, 2.987884521484375, 22.718358993530273, 15.198860168457031, -0.07752227783203125, 14.398300170898438, 3.805267333984375, 4.34906005859375, 16.993898391723633, -0.4788169860839844, 4.743804931640625, 34.393463134765625, 5.61199951171875, -15.848426818847656, 13.067390441894531, 12.621255874633789, 5.337921142578125, -5.97320556640625, 6.94122314453125, -8.104696273803711, 21.68273162841797, -4.8102264404296875, -1.4253120422363281, 12.97182846069336, 35.036563873291016, 4.223020553588867, 9.35516357421875, 30.46642303466797, 34.109535217285156, 4.353157043457031, 21.134048461914062, -6.7545318603515625, 12.182723999023438, -6.03154182434082, 18.45458221435547, -12.180595397949219, -5.764743804931641, -2.4986934661865234, 6.130580902099609, 9.990543365478516], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000253.npy"}
|
||||
{"epoch": 0.382464096749811, "step": 254, "batch_size": 64, "mean": 8.387495040893555, "std": 13.542383193969727, "min": -17.592391967773438, "p10": -6.62212905883789, "median": 7.075714111328125, "p90": 27.26704864501954, "max": 40.59635925292969, "pos_frac": 0.640625, "sample": [-7.533699035644531, 15.474044799804688, 32.83880615234375, 13.349884033203125, -0.5669403076171875, 12.087196350097656, 24.27291488647461, 18.86968994140625, 1.459878921508789, 0.19248580932617188, -5.853080749511719, 8.55953598022461, 19.191268920898438, 14.275915145874023, 17.990509033203125, 19.37118148803711, 28.217132568359375, 12.538291931152344, -10.701225280761719, 16.316871643066406, 2.08697509765625, 40.59635925292969, -13.604988098144531, 29.500244140625, -1.8386459350585938, -15.843677520751953, -3.7039794921875, 24.551183700561523, 18.709821701049805, -3.3876724243164062, 36.23328399658203, -5.495941162109375, 10.07305908203125, -0.9811668395996094, -5.232028961181641, 7.7910308837890625, -0.3795433044433594, 19.26703643798828, 0.18123817443847656, 9.320793151855469, 5.317649841308594, -0.7828578948974609, 11.153690338134766, -8.417129516601562, -1.663461685180664, 31.183258056640625, -2.2594070434570312, 24.139205932617188, 19.128021240234375, 5.195669174194336, 6.3603973388671875, -5.7118377685546875, -5.709724426269531, 13.201797485351562, 28.47076416015625, -1.5953598022460938, 8.725135803222656, 2.5518264770507812, 24.179244995117188, 25.050186157226562, 4.762128829956055, -6.95172119140625, -0.129425048828125, -17.592391967773438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000254.npy"}
|
||||
{"epoch": 0.3839758125472411, "step": 255, "batch_size": 64, "mean": 10.852192878723145, "std": 14.606003761291504, "min": -20.59600830078125, "p10": -5.584725189208984, "median": 8.189242362976074, "p90": 33.19599075317383, "max": 41.51458740234375, "pos_frac": 0.734375, "sample": [30.665363311767578, 6.057228088378906, 41.51458740234375, 13.951019287109375, 22.79010009765625, 4.183624267578125, -2.5064239501953125, -0.9534111022949219, 3.0139808654785156, 17.99011993408203, -10.372264862060547, -0.9233989715576172, 6.70930290222168, 21.1678466796875, -5.5234527587890625, 7.207370758056641, 15.695106506347656, 9.087226867675781, 1.2570037841796875, -20.59600830078125, 25.034961700439453, -0.24195098876953125, 8.785049438476562, 6.224552154541016, 32.67311096191406, 16.934188842773438, -7.7710723876953125, 2.2536087036132812, 35.362281799316406, -2.2980880737304688, 33.420082092285156, 22.96363067626953, -4.298728942871094, 9.700210571289062, 13.604736328125, 7.593435287475586, 11.743715286254883, 30.357872009277344, 36.18199920654297, 10.728872299194336, 4.355081558227539, 16.535598754882812, 24.86505126953125, -12.665390014648438, 3.9685916900634766, -3.26251220703125, 5.372871398925781, 39.308990478515625, -6.861900329589844, 13.182823181152344, 2.248668670654297, 30.037269592285156, 9.447296142578125, 39.073577880859375, 6.892814636230469, 18.41948699951172, 2.135883331298828, 23.167255401611328, 35.583892822265625, -12.200557708740234, -3.16314697265625, -5.610984802246094, 16.53490447998047, -2.1926040649414062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000255.npy"}
|
||||
{"epoch": 0.3854875283446712, "step": 256, "batch_size": 64, "mean": 9.19884967803955, "std": 14.750246047973633, "min": -29.3731689453125, "p10": -6.265751647949219, "median": 8.165740966796875, "p90": 29.487322235107428, "max": 47.8929443359375, "pos_frac": 0.71875, "sample": [-4.623016357421875, 3.86370849609375, 17.06565284729004, 10.963741302490234, 27.183975219726562, -6.1544647216796875, -3.820138931274414, 30.429763793945312, 8.075246810913086, 0.6285076141357422, -7.594135284423828, 3.068532943725586, -6.09893798828125, 36.281700134277344, 9.798828125, 37.286155700683594, -2.5514488220214844, 4.834403991699219, -29.3731689453125, 18.756027221679688, 1.2207527160644531, 21.58831787109375, -4.465372085571289, 6.18023681640625, -10.300331115722656, -14.739933013916016, 8.25694465637207, -1.5863723754882812, 10.810310363769531, 13.47146224975586, 2.5924072265625, 15.851287841796875, 11.186538696289062, 22.160449981689453, 22.501861572265625, 21.119659423828125, 28.417903900146484, 22.936065673828125, -4.8556671142578125, 29.94564437866211, 2.5068588256835938, 8.256235122680664, 37.04937744140625, -2.50347900390625, 31.605546951293945, 3.0906333923339844, 17.354949951171875, 3.1243057250976562, 10.513849258422852, 47.8929443359375, 0.152984619140625, 21.842796325683594, -6.313446044921875, 2.3152923583984375, 15.618484497070312, -6.009063720703125, 21.310302734375, -4.508262634277344, 0.07865333557128906, 28.194503784179688, -6.965854644775391, 11.455917358398438, -9.500900268554688, 11.850631713867188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000256.npy"}
|
||||
{"epoch": 0.3869992441421013, "step": 257, "batch_size": 64, "mean": 12.18323802947998, "std": 16.587337493896484, "min": -27.054746627807617, "p10": -2.9863510131835938, "median": 9.317878723144531, "p90": 35.21692390441895, "max": 56.053863525390625, "pos_frac": 0.765625, "sample": [-1.012308120727539, 3.75872802734375, -3.0042724609375, 56.053863525390625, 39.54138946533203, 15.738101959228516, 0.6646671295166016, 20.824201583862305, 3.301555633544922, 20.211193084716797, 35.742897033691406, 8.503623962402344, 4.481349945068359, -2.9445343017578125, 23.025848388671875, 29.26056671142578, 32.30171203613281, -2.507293701171875, -20.6578369140625, -1.2943572998046875, -0.6569194793701172, 11.19281005859375, 18.69244384765625, 2.5683250427246094, 11.888229370117188, 2.593963623046875, 2.483264923095703, 35.20918273925781, 20.456443786621094, -1.9254684448242188, 22.980270385742188, 35.22024154663086, 12.446250915527344, 3.7145309448242188, 17.35535430908203, 12.128265380859375, 3.721099853515625, 27.222122192382812, 48.398685455322266, -7.785289764404297, -27.054746627807617, 54.421783447265625, 9.147132873535156, 9.488624572753906, 3.9601898193359375, -2.0483226776123047, 6.982170104980469, 2.54150390625, 10.437225341796875, 13.516212463378906, 20.41739845275879, 6.4866943359375, 30.258323669433594, 3.085742950439453, -6.019317626953125, 41.805728912353516, -2.246297836303711, -8.218940734863281, 35.07762908935547, 11.3056640625, 13.423198699951172, 11.174217224121094, 6.1162567138671875, -4.223762512207031], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000257.npy"}
|
||||
{"epoch": 0.3885109599395314, "step": 258, "batch_size": 64, "mean": 8.973379135131836, "std": 10.80665397644043, "min": -18.135774612426758, "p10": -2.2133653640747046, "median": 6.943994522094727, "p90": 21.752603149414067, "max": 36.186737060546875, "pos_frac": 0.890625, "sample": [12.132858276367188, 2.3849220275878906, 3.4552993774414062, -3.1891613006591797, 4.769355773925781, 30.000999450683594, 30.9359130859375, 20.77935791015625, 17.616165161132812, 1.5865859985351562, 7.360288619995117, 4.845680236816406, -10.100311279296875, 12.600820541381836, 5.470573425292969, 22.169708251953125, 2.522247314453125, -11.014484405517578, 9.740623474121094, -18.135774612426758, 3.666576385498047, 5.84039306640625, -12.073905944824219, 9.618240356445312, 6.858856201171875, 10.37786865234375, 17.878768920898438, 11.957437515258789, 4.482387542724609, 5.561088562011719, 9.129623413085938, 7.029132843017578, 14.363327026367188, 15.345867156982422, 5.3008270263671875, 0.3473701477050781, 6.692569732666016, 2.0772247314453125, 6.3585662841796875, 36.186737060546875, 14.888565063476562, -5.6402740478515625, 28.79296875, 13.735733032226562, 18.018341064453125, 23.59354591369629, 19.6165771484375, 11.934803009033203, 3.71832275390625, 20.262853622436523, 0.0634918212890625, 3.3864288330078125, 9.983451843261719, 5.207008361816406, 4.492774963378906, 3.319711685180664, 20.44500732421875, -13.53533935546875, 7.248542785644531, 1.3283920288085938, 15.614837646484375, 19.19149398803711, 29.728601455688477, 5.969755172729492], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000258.npy"}
|
||||
{"epoch": 0.3900226757369615, "step": 259, "batch_size": 64, "mean": 13.668660163879395, "std": 14.283196449279785, "min": -17.41126251220703, "p10": -6.4847389221191385, "median": 14.590885162353516, "p90": 32.760305404663086, "max": 46.62770080566406, "pos_frac": 0.84375, "sample": [-0.39306640625, -8.524200439453125, -17.41126251220703, 20.57640838623047, 14.013427734375, 22.630943298339844, 34.585472106933594, 29.747848510742188, 15.523086547851562, -12.188629150390625, 22.897430419921875, 0.8019943237304688, 6.662160873413086, 32.1705322265625, 20.266128540039062, 2.032745361328125, -13.190462112426758, 1.1489601135253906, 35.700775146484375, 19.788299560546875, 11.436012268066406, 22.24951171875, 27.14462661743164, 28.985889434814453, 21.544898986816406, -4.570278167724609, -7.179901123046875, 5.771820068359375, 7.00360107421875, 10.87297248840332, 20.471675872802734, 17.312057495117188, -4.862693786621094, 8.725624084472656, 10.283721923828125, 11.808242797851562, 37.38880920410156, 1.7208404541015625, 21.256027221679688, 25.679855346679688, 31.741256713867188, 15.678813934326172, 1.2532310485839844, 46.62770080566406, 5.682014465332031, 10.181365966796875, -12.608104705810547, 2.0124683380126953, -8.094741821289062, 1.5056686401367188, 36.114044189453125, 33.013065338134766, 18.659759521484375, 11.5045166015625, 15.878456115722656, 29.064430236816406, 10.441558837890625, 14.372467041015625, 3.91876220703125, 14.809303283691406, 19.518173217773438, 34.63926696777344, 19.580963134765625, 19.417879104614258], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000259.npy"}
|
||||
{"epoch": 0.3915343915343915, "step": 260, "batch_size": 64, "mean": 10.318174362182617, "std": 14.745997428894043, "min": -20.515132904052734, "p10": -8.497006607055663, "median": 11.182961463928223, "p90": 28.91758079528809, "max": 41.83478546142578, "pos_frac": 0.734375, "sample": [28.583637237548828, -12.639968872070312, 20.948610305786133, -7.349697113037109, -3.7947235107421875, 38.967559814453125, 14.179420471191406, 12.575447082519531, 30.997360229492188, -12.528976440429688, 28.274505615234375, 11.804418563842773, 25.88880157470703, 19.93389129638672, 23.10149383544922, -8.711868286132812, 30.470672607421875, 15.236244201660156, 14.11598014831543, 41.83478546142578, 27.68732452392578, -2.1418914794921875, 26.234817504882812, 31.436172485351562, -3.31756591796875, 1.3095436096191406, 24.91202163696289, 29.060699462890625, 0.284027099609375, 5.7579498291015625, 12.416845321655273, 9.5521240234375, 23.584571838378906, 6.684272766113281, 4.056356430053711, 11.03647232055664, 0.6316299438476562, -5.5203399658203125, -20.515132904052734, 12.276191711425781, 2.099149703979492, 16.217979431152344, -0.21634292602539062, -1.5916595458984375, 9.258018493652344, 17.121355056762695, 33.91807556152344, -4.9883575439453125, 28.471778869628906, -7.740875244140625, -11.796049118041992, 11.278879165649414, 13.730667114257812, 6.11859130859375, 17.376665115356445, 11.087043762207031, 0.50164794921875, 19.92755889892578, 23.573745727539062, -11.217880249023438, 0.7132759094238281, -14.815597534179688, -7.995662689208984, 2.0175018310546875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000260.npy"}
|
||||
{"epoch": 0.3930461073318216, "step": 261, "batch_size": 64, "mean": 8.2984037399292, "std": 13.600400924682617, "min": -16.488555908203125, "p10": -9.27085952758789, "median": 7.555146217346191, "p90": 29.664928054809575, "max": 42.905906677246094, "pos_frac": 0.734375, "sample": [0.02142333984375, 7.65118408203125, 10.224985122680664, 0.0764923095703125, -9.60675048828125, 20.955997467041016, -1.5867061614990234, 31.953765869140625, 10.436172485351562, -1.2705211639404297, -14.472724914550781, -6.027862548828125, -4.3175048828125, 28.661388397216797, 13.732500076293945, -3.1069793701171875, 3.001983642578125, 34.43217468261719, 3.4823665618896484, 30.78354263305664, 2.8929977416992188, 7.968742370605469, -2.206005096435547, -13.883636474609375, 26.35552978515625, 14.577964782714844, 42.905906677246094, 3.758441925048828, 1.2249984741210938, 7.459108352661133, 16.185028076171875, 16.42983055114746, 14.905071258544922, 23.678916931152344, 9.225446701049805, -0.9793453216552734, 34.76581573486328, -8.41265869140625, -12.612258911132812, 17.656469345092773, 9.495880126953125, 1.9786815643310547, 17.323081970214844, 14.492359161376953, 5.024375915527344, -14.55596923828125, 2.0123367309570312, -16.488555908203125, -9.731597900390625, 7.444671630859375, 9.244369506835938, 0.7654342651367188, 24.966201782226562, 30.735870361328125, 7.228096008300781, -8.487113952636719, 11.450420379638672, 14.259002685546875, -2.946563720703125, 30.095016479492188, 7.902656555175781, 14.510810852050781, 11.769935607910156, 5.687093734741211], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000261.npy"}
|
||||
{"epoch": 0.3945578231292517, "step": 262, "batch_size": 64, "mean": 9.823917388916016, "std": 12.484959602355957, "min": -12.419639587402344, "p10": -6.033965110778809, "median": 6.308947563171387, "p90": 27.601391983032226, "max": 33.686439514160156, "pos_frac": 0.8125, "sample": [-9.751197814941406, 16.55414581298828, 8.476142883300781, 1.696371078491211, -5.05322265625, 24.265331268310547, 8.95440673828125, 2.1402511596679688, 5.799781799316406, 31.45383644104004, -3.894603729248047, 27.466651916503906, -4.127189636230469, 1.0960121154785156, 27.659137725830078, -10.657630920410156, 6.40861701965332, 33.686439514160156, 16.80804443359375, 12.58968734741211, -7.3120269775390625, 0.4210853576660156, 2.5659103393554688, 0.8585662841796875, 15.042121887207031, 28.606639862060547, 3.1864776611328125, 33.16868591308594, -6.09588623046875, 2.0684051513671875, 25.453384399414062, 11.930313110351562, 10.013481140136719, 18.41790771484375, -8.259056091308594, 28.73902130126953, 5.225473403930664, 19.57412338256836, 25.703567504882812, 5.965545654296875, -4.38676643371582, 4.145233154296875, -5.889482498168945, 17.194488525390625, 31.65740966796875, 11.22361946105957, 6.468589782714844, 2.9096851348876953, -12.419639587402344, 5.005252838134766, 14.136329650878906, 3.2121429443359375, 26.011703491210938, 1.6158447265625, 16.0330867767334, 5.506570816040039, 26.37590789794922, 27.126663208007812, 2.6673507690429688, 19.87298583984375, 6.209278106689453, 5.3661956787109375, 19.01911163330078, -7.175590515136719], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000262.npy"}
|
||||
{"epoch": 0.3960695389266818, "step": 263, "batch_size": 64, "mean": 7.785467147827148, "std": 12.75522518157959, "min": -25.03692054748535, "p10": -7.308759307861327, "median": 7.438039779663086, "p90": 25.596742248535158, "max": 34.05366516113281, "pos_frac": 0.71875, "sample": [6.944465637207031, 18.036136627197266, -14.264389038085938, -1.961822509765625, -4.935922622680664, -2.8447837829589844, 9.765609741210938, 0.091156005859375, 19.575084686279297, 27.200910568237305, 14.686126708984375, 1.4949836730957031, -4.072868347167969, 17.533321380615234, 4.0970916748046875, -17.4598388671875, 23.291748046875, 1.9309234619140625, 11.662849426269531, -6.473274230957031, 15.842658996582031, 17.78595733642578, 31.47821044921875, 6.1758575439453125, -0.5986099243164062, 1.6471843719482422, 26.498687744140625, 10.393989562988281, 18.48269271850586, 15.977989196777344, 34.05366516113281, 9.695594787597656, -7.895538330078125, 4.935821533203125, -5.654655456542969, 3.4057159423828125, -16.100788116455078, 25.708873748779297, 10.087989807128906, 16.90918731689453, 11.137664794921875, -2.8455123901367188, -2.4362640380859375, 27.064468383789062, -7.6668243408203125, 1.9347686767578125, 13.473159790039062, 20.269119262695312, 16.788070678710938, 4.375709533691406, -0.661407470703125, 32.813385009765625, 1.336090087890625, -1.0083274841308594, 9.4202880859375, 0.47499847412109375, 15.410636901855469, 18.907323837280273, 7.931613922119141, -9.475484848022461, 6.883338928222656, -25.03692054748535, 25.335102081298828, 10.716878890991211], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000263.npy"}
|
||||
{"epoch": 0.3975812547241119, "step": 264, "batch_size": 64, "mean": 9.597256660461426, "std": 14.814565658569336, "min": -23.951431274414062, "p10": -9.144959640502929, "median": 9.4736328125, "p90": 27.567691040039065, "max": 46.42420959472656, "pos_frac": 0.71875, "sample": [23.807937622070312, -5.323974609375, 7.175041198730469, 1.6573448181152344, 2.6439132690429688, -5.626890182495117, 3.2356033325195312, -0.08889007568359375, 20.19363021850586, 11.066314697265625, -9.369983673095703, 5.4581756591796875, -23.951431274414062, 19.532981872558594, -7.907783508300781, 19.068588256835938, 20.089439392089844, 11.8187255859375, 1.013153076171875, 26.851455688476562, 14.9439697265625, 14.361566543579102, 19.177444458007812, 20.41741180419922, 15.327301025390625, 42.80648422241211, -0.12351417541503906, 11.275604248046875, -2.15252685546875, 11.179018020629883, 11.364410400390625, -2.0129165649414062, 29.4849853515625, 2.7344970703125, -2.9061851501464844, 6.586919784545898, 34.74359893798828, -8.619903564453125, 13.832168579101562, 31.901222229003906, -10.416328430175781, -12.277183532714844, 6.645637512207031, 27.874649047851562, 24.69782257080078, 21.271343231201172, -11.821914672851562, 18.923202514648438, -18.290054321289062, 6.969409942626953, -10.31439208984375, 4.980842590332031, 41.530052185058594, 12.479145050048828, 8.675827026367188, -7.625774383544922, 4.136138916015625, 21.8975830078125, 10.271438598632812, 0.11451148986816406, -0.252593994140625, 46.42420959472656, 18.096397399902344, 24.569530487060547], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000264.npy"}
|
||||
{"epoch": 0.39909297052154197, "step": 265, "batch_size": 64, "mean": 10.07077407836914, "std": 15.917012214660645, "min": -27.542938232421875, "p10": -10.573503875732419, "median": 9.473340034484863, "p90": 31.05825729370117, "max": 35.098411560058594, "pos_frac": 0.75, "sample": [4.2541961669921875, 3.0110511779785156, 1.614044189453125, -22.7769775390625, 9.346406936645508, 2.7982406616210938, 12.420614242553711, 14.431709289550781, -15.32281494140625, 31.781185150146484, 4.888557434082031, 3.2555770874023438, 14.998908996582031, 15.306800842285156, -27.542938232421875, -6.524383544921875, 8.909004211425781, 6.173164367675781, 14.118307113647461, -23.558448791503906, 5.73112678527832, 26.7113037109375, 26.9407958984375, 11.882383346557617, -0.25559234619140625, -0.4789924621582031, 27.963159561157227, 1.2394485473632812, 9.600273132324219, 0.48822593688964844, 34.37712860107422, 28.78631591796875, -5.2731170654296875, 33.737281799316406, 21.0, 33.96233367919922, 4.573270797729492, 4.896602630615234, -1.8615989685058594, -7.547325134277344, 11.302459716796875, 21.486595153808594, 19.501195907592773, 31.146507263183594, 35.098411560058594, 26.417770385742188, 34.94673156738281, 16.33929443359375, 5.322166442871094, -14.853363037109375, 12.12554931640625, 7.6580352783203125, -3.5402259826660156, 29.826011657714844, 28.084640502929688, 30.802167892456055, 30.852340698242188, -20.412799835205078, 15.91139030456543, -4.737724304199219, 19.199508666992188, -1.8147430419921875, 17.682815551757812, -11.870437622070312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000265.npy"}
|
||||
{"epoch": 0.40060468631897206, "step": 266, "batch_size": 64, "mean": 9.457446098327637, "std": 13.840380668640137, "min": -27.664962768554688, "p10": -5.210266685485839, "median": 6.937082290649414, "p90": 27.406082153320312, "max": 46.37110900878906, "pos_frac": 0.734375, "sample": [-3.1557998657226562, 27.149627685546875, 32.081756591796875, 4.879190444946289, 15.728202819824219, -1.7848529815673828, 0.46070098876953125, -11.01971435546875, 5.148838043212891, 3.8236732482910156, 14.330875396728516, 6.3054962158203125, 2.876251220703125, 12.374589920043945, 16.524141311645508, 16.279598236083984, 2.16412353515625, 46.37110900878906, -1.0388908386230469, -0.022388458251953125, 36.83578872680664, 9.194015502929688, 0.9497146606445312, 31.121522903442383, 19.79429817199707, 28.0350341796875, -1.8618392944335938, 27.5159912109375, 17.205331802368164, -1.5026016235351562, 21.160825729370117, 5.696067810058594, 24.85326385498047, -3.8338279724121094, 13.728984832763672, -7.468772888183594, 15.557144165039062, 24.621551513671875, 19.050762176513672, 1.4420051574707031, -0.32398223876953125, -8.498428344726562, -1.4759368896484375, 7.568668365478516, -27.664962768554688, 3.0870361328125, 10.764564514160156, 9.572860717773438, 9.700103759765625, 11.77099609375, -21.14696502685547, 3.931852340698242, -6.7900238037109375, 1.18255615234375, 25.10852813720703, 12.919559478759766, 37.29998016357422, -5.800168991088867, 19.067691802978516, 5.3055572509765625, 20.421653747558594, 5.59998893737793, 23.062862396240234, -0.9592094421386719], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000266.npy"}
|
||||
{"epoch": 0.4021164021164021, "step": 267, "batch_size": 64, "mean": 7.398887634277344, "std": 13.955391883850098, "min": -35.12527084350586, "p10": -7.228026962280273, "median": 4.946300506591797, "p90": 23.40802555084229, "max": 39.291969299316406, "pos_frac": 0.703125, "sample": [11.556015014648438, 1.2646980285644531, 2.1616859436035156, -20.781112670898438, 6.2676849365234375, -6.683082580566406, 21.625885009765625, 27.760360717773438, 19.744224548339844, 3.222686767578125, -3.154726028442383, 14.991470336914062, -1.3514671325683594, 39.291969299316406, 2.7992210388183594, -13.434104919433594, 4.94207763671875, -11.37020206451416, 0.1155853271484375, -18.467910766601562, -7.461574554443359, -8.327644348144531, 3.8289108276367188, 15.584186553955078, 24.738479614257812, 32.46576690673828, -1.0368118286132812, 1.2220191955566406, 18.730377197265625, -5.78546142578125, 12.595813751220703, 2.818258285522461, 23.97511863708496, 14.28005599975586, -2.0945510864257812, 10.493244171142578, 18.322662353515625, 0.11761474609375, 4.950523376464844, 36.94490051269531, 8.558395385742188, -0.3287925720214844, 35.194854736328125, -4.310977935791016, 20.740333557128906, 14.488170623779297, 21.69416046142578, 9.465126037597656, -35.12527084350586, 19.407066345214844, 13.135589599609375, -1.9619293212890625, -3.380756378173828, 4.386072158813477, 1.7947311401367188, -1.2288093566894531, 15.078495025634766, 18.153894424438477, 22.084808349609375, 4.6944732666015625, 5.398395538330078, 21.413333892822266, 8.924110412597656, -1.6095008850097656], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000267.npy"}
|
||||
{"epoch": 0.4036281179138322, "step": 268, "batch_size": 64, "mean": 8.824535369873047, "std": 12.014403343200684, "min": -12.366485595703125, "p10": -4.346648788452148, "median": 6.570737838745117, "p90": 28.211846160888683, "max": 35.1871337890625, "pos_frac": 0.75, "sample": [5.3187408447265625, 35.1871337890625, 1.9780826568603516, -4.3312225341796875, 10.980484008789062, 23.73912811279297, 8.46575927734375, -0.600311279296875, -4.353260040283203, 16.284725189208984, 20.974960327148438, 1.0609970092773438, -0.21565628051757812, 3.1200714111328125, 2.6310043334960938, 1.0000152587890625, 11.50084114074707, -0.9408721923828125, -0.9173774719238281, 16.91101837158203, 19.645263671875, 29.578292846679688, -1.1887340545654297, 0.1447906494140625, -9.63751220703125, 6.8843536376953125, 5.494720458984375, 13.29791259765625, 14.392776489257812, 6.1873779296875, 19.191146850585938, 6.257122039794922, 3.3503990173339844, 12.266128540039062, -3.2403793334960938, 25.1673583984375, 0.00101470947265625, -11.473037719726562, -12.366485595703125, 7.900287628173828, 15.163726806640625, 1.7472171783447266, -0.36176300048828125, 29.298995971679688, 25.67516326904297, 13.041778564453125, 29.31036376953125, -4.523954391479492, 7.771232604980469, -10.209632873535156, -6.16888427734375, -3.9814987182617188, 20.697986602783203, 1.0604915618896484, 32.22767639160156, 14.463041305541992, 7.840370178222656, 17.15341567993164, 32.20344924926758, 11.948514938354492, 0.43233680725097656, 32.77904510498047, 17.092243194580078, 0.46189117431640625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000268.npy"}
|
||||
{"epoch": 0.4051398337112623, "step": 269, "batch_size": 64, "mean": 9.793499946594238, "std": 12.667573928833008, "min": -22.380584716796875, "p10": -2.3654533386230465, "median": 8.486092567443848, "p90": 29.346600532531742, "max": 35.969390869140625, "pos_frac": 0.765625, "sample": [0.6226348876953125, -11.937248229980469, 6.9380035400390625, 20.500064849853516, 14.851531982421875, -1.9602813720703125, 7.8467559814453125, 28.775659561157227, 10.108291625976562, -0.033660888671875, -7.4188690185546875, 13.072765350341797, 17.763540267944336, 12.265296936035156, 29.591289520263672, 26.52672576904297, 1.5100059509277344, 10.865310668945312, 31.645828247070312, -15.663520812988281, 3.4454574584960938, -7.369140625, 26.879962921142578, 0.7150726318359375, -2.4222412109375, 29.65058135986328, -0.8944988250732422, 35.969390869140625, 12.849113464355469, 24.0423583984375, 7.29408073425293, 7.702098846435547, -1.896392822265625, 13.601058959960938, -0.0460968017578125, 2.484640121459961, 3.985393524169922, 9.762588500976562, -2.2329483032226562, 23.992746353149414, 31.680307388305664, 18.514373779296875, 5.2074432373046875, 9.125429153442383, 20.215465545654297, -5.211652755737305, 12.826770782470703, 5.25872802734375, -2.098033905029297, 1.9473724365234375, 30.869140625, 0.8646030426025391, 31.739151000976562, -1.0552291870117188, 3.115997314453125, -22.380584716796875, 9.354354858398438, 19.8951416015625, 1.9571685791015625, 16.669082641601562, 13.403091430664062, 13.152099609375, 22.723129272460938, 5.62725830078125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000269.npy"}
|
||||
{"epoch": 0.40665154950869237, "step": 270, "batch_size": 64, "mean": 8.024534225463867, "std": 11.860662460327148, "min": -25.056926727294922, "p10": -5.947998809814453, "median": 7.608497619628906, "p90": 20.813617515563966, "max": 39.591339111328125, "pos_frac": 0.796875, "sample": [16.42071533203125, 27.902265548706055, -6.288120269775391, 31.058250427246094, 5.797603607177734, 31.5814208984375, 7.676616668701172, 4.5727386474609375, 12.3697509765625, 3.0191497802734375, 39.591339111328125, 18.117279052734375, 11.776466369628906, 17.448699951171875, 3.302135467529297, -8.783378601074219, 8.085403442382812, 16.850914001464844, 0.9517326354980469, 17.88720703125, 6.314453125, 12.91131591796875, 1.5399894714355469, -6.644649505615234, 3.0326995849609375, 9.442184448242188, 15.254331588745117, -22.9754638671875, 3.75213623046875, 3.781667709350586, 10.798904418945312, 5.708465576171875, 0.02809906005859375, 20.7415771484375, 1.4458084106445312, 9.284996032714844, 22.420818328857422, -2.5522689819335938, 7.540378570556641, -25.056926727294922, 26.42340087890625, 20.124847412109375, 18.07233428955078, -1.736948013305664, 8.332683563232422, 12.362380981445312, -5.889717102050781, 5.731636047363281, 16.61528778076172, 11.709207534790039, 20.844491958618164, -0.06782150268554688, -5.9729766845703125, 14.209091186523438, 4.906333923339844, -2.3998947143554688, 3.3419418334960938, -17.716163635253906, 15.76055908203125, 7.083976745605469, 10.275222778320312, -0.12129974365234375, 5.362113952636719, 10.2127685546875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000270.npy"}
|
||||
{"epoch": 0.40816326530612246, "step": 271, "batch_size": 64, "mean": 7.064091682434082, "std": 15.408188819885254, "min": -20.897192001342773, "p10": -9.767679786682129, "median": 2.060821533203125, "p90": 28.706693267822267, "max": 50.779205322265625, "pos_frac": 0.640625, "sample": [10.620376586914062, 14.110931396484375, -4.4583282470703125, 2.1139678955078125, 31.931228637695312, 24.33551788330078, 7.943977355957031, 0.9537715911865234, 28.40772247314453, -9.90170669555664, 25.155044555664062, -9.454950332641602, 38.17787170410156, -3.5632247924804688, 11.567176818847656, 28.834823608398438, 2.7431087493896484, 2.0076751708984375, 1.5194511413574219, 34.25469970703125, 5.5625457763671875, 7.484630584716797, 20.431074142456055, -6.547933578491211, -2.6019287109375, 0.7222824096679688, -4.249320983886719, 16.42707061767578, -1.3348846435546875, -4.141458511352539, 2.171895980834961, 16.17376708984375, 0.34046173095703125, -12.652633666992188, -3.04583740234375, 30.97600555419922, 5.856426239013672, 1.5711669921875, 18.419639587402344, 25.838876724243164, -0.0461578369140625, -16.476470947265625, 50.779205322265625, -12.521430969238281, 0.71954345703125, 1.8649444580078125, -0.48877716064453125, -20.897192001342773, -18.810302734375, -0.6943359375, 0.4740028381347656, -2.95989990234375, 26.182762145996094, 23.620288848876953, 16.184431076049805, 8.08416748046875, -14.821746826171875, 14.978416442871094, -3.0289993286132812, 2.6689453125, -3.846681594848633, -4.2552642822265625, 43.59153747558594, 7.0998687744140625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000271.npy"}
|
||||
{"epoch": 0.40967498110355255, "step": 272, "batch_size": 64, "mean": 11.822696685791016, "std": 12.333123207092285, "min": -10.067695617675781, "p10": -1.3271774291992182, "median": 9.48865032196045, "p90": 34.27629013061524, "max": 38.374732971191406, "pos_frac": 0.84375, "sample": [2.2943763732910156, -3.2205657958984375, -5.095542907714844, 12.237159729003906, 7.6682586669921875, 35.2076416015625, 13.349273681640625, 24.984909057617188, 35.646934509277344, 7.517570495605469, 18.328590393066406, 13.054271697998047, 1.8451862335205078, -0.11144256591796875, 11.618289947509766, 9.660232543945312, 16.49933624267578, 0.6013565063476562, 23.87000274658203, 23.693920135498047, 22.74997901916504, 4.690828323364258, 17.44620132446289, 11.978830337524414, 36.36181640625, 5.108177185058594, 13.554481506347656, -8.306724548339844, 5.8444366455078125, 28.521392822265625, 8.038887023925781, 38.374732971191406, 10.577857971191406, 0.514617919921875, 16.364334106445312, 2.2975502014160156, -0.3377361297607422, 1.7821807861328125, 14.34014892578125, -7.542575836181641, 22.285659790039062, 35.116737365722656, 34.61560821533203, 8.211681365966797, 1.5470809936523438, 2.9059791564941406, 34.43968200683594, 7.736213684082031, 16.3411865234375, 19.38556671142578, 9.317068099975586, 19.30023956298828, -10.067695617675781, 1.492095947265625, 5.608787536621094, 7.322662353515625, 12.864120483398438, 18.641036987304688, -4.366687774658203, -0.7528438568115234, -1.5733203887939453, 2.54559326171875, 33.895042419433594, 7.831926345825195], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000272.npy"}
|
||||
{"epoch": 0.41118669690098264, "step": 273, "batch_size": 64, "mean": 9.706923484802246, "std": 14.77440357208252, "min": -22.153621673583984, "p10": -7.555210876464842, "median": 9.12143325805664, "p90": 26.46603775024414, "max": 53.18731689453125, "pos_frac": 0.765625, "sample": [-1.9325675964355469, 14.71771240234375, 18.872085571289062, -13.424880981445312, 25.108154296875, 0.1338958740234375, -8.405364990234375, 20.694698333740234, -14.708599090576172, 9.496971130371094, 6.4881134033203125, -5.5715179443359375, 13.098098754882812, -0.1629962921142578, 10.646102905273438, 15.038623809814453, 27.61119842529297, 2.86126708984375, 26.63972282409668, 12.7989501953125, 22.371376037597656, 8.115642547607422, -2.7083282470703125, -3.811931610107422, -19.39093017578125, 20.015762329101562, 10.959587097167969, 6.021415710449219, -4.045387268066406, 3.390758514404297, 10.70391845703125, 53.18731689453125, 21.985130310058594, -22.153621673583984, 0.487884521484375, 14.107551574707031, 14.451911926269531, 20.349971771240234, 4.464988708496094, 16.221923828125, -14.179756164550781, 9.764389038085938, 19.116966247558594, 2.55914306640625, 8.745895385742188, 50.45429992675781, 36.962928771972656, 18.347850799560547, 7.4362640380859375, 17.843719482421875, 0.6383552551269531, 5.554786682128906, 1.8279151916503906, -5.0693359375, 5.421546936035156, 26.48863983154297, 3.8232688903808594, 19.7431640625, 7.299457550048828, -2.760711669921875, 26.413299560546875, 35.55751037597656, -12.079849243164062, 16.608741760253906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000273.npy"}
{"epoch": 0.4126984126984127, "step": 274, "batch_size": 64, "mean": 9.445707321166992, "std": 13.220765113830566, "min": -22.06707000732422, "p10": -4.6599693298339835, "median": 7.429143905639648, "p90": 26.719614219665534, "max": 36.43077850341797, "pos_frac": 0.765625, "sample": [15.601181030273438, -3.951568603515625, -2.8543701171875, 8.869010925292969, 1.9632110595703125, -19.827407836914062, 22.42058563232422, 23.76348114013672, 6.584129333496094, 9.401800155639648, 19.207012176513672, 18.49921417236328, 1.8993072509765625, -3.996002197265625, 10.081689834594727, 28.665298461914062, 36.43077850341797, -16.55196189880371, 4.956302642822266, 10.02341079711914, 24.083633422851562, 21.19269561767578, 14.991329193115234, 25.396509170532227, 13.828826904296875, 1.329111099243164, 27.286659240722656, 34.381927490234375, 5.598993301391602, -2.6291580200195312, 1.9923954010009766, 5.3272552490234375, 19.77458953857422, -2.3603515625, -1.1493644714355469, 19.66724395751953, -12.377555847167969, -10.082420349121094, 20.256732940673828, 3.5500030517578125, -3.4207324981689453, 12.641204833984375, 3.0101585388183594, 6.6071319580078125, -1.2366962432861328, 12.15557861328125, 5.511619567871094, -22.06707000732422, 9.380592346191406, 4.404041290283203, 29.045799255371094, 23.985801696777344, 5.803050994873047, 3.69146728515625, 7.9043121337890625, -4.944526672363281, -5.294956207275391, 6.953975677490234, 32.61553955078125, 24.743257522583008, 6.283451080322266, 10.804756164550781, 33.117469787597656, 21.58591651916504], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000274.npy"}
{"epoch": 0.41421012849584277, "step": 275, "batch_size": 64, "mean": 10.407403945922852, "std": 13.438794136047363, "min": -29.54095458984375, "p10": -3.144145393371582, "median": 8.260669708251953, "p90": 27.622198486328127, "max": 39.32817459106445, "pos_frac": 0.796875, "sample": [15.910598754882812, 9.75301742553711, -3.0927810668945312, 22.57636833190918, 27.86834716796875, 7.513607025146484, 8.483909606933594, -29.54095458984375, 36.80351257324219, 25.210586547851562, 3.496185302734375, 8.037429809570312, 3.2493972778320312, 36.473785400390625, -1.0608940124511719, 7.298847198486328, 22.66046142578125, 15.64809799194336, 32.98395919799805, -7.165596008300781, -17.499176025390625, 39.32817459106445, 3.6027374267578125, 4.05487060546875, 18.50487518310547, 0.8514175415039062, -4.037422180175781, 2.2324752807617188, 4.383449554443359, 4.422828674316406, 11.667877197265625, 39.25171661376953, 3.9329299926757812, 9.456153869628906, 24.61193084716797, 18.772384643554688, 12.620597839355469, 9.030515670776367, -0.325836181640625, -0.6079254150390625, 17.193099975585938, 0.5306396484375, 3.7746353149414062, -0.4583015441894531, 6.973667144775391, 26.390060424804688, 18.6627197265625, 15.60772705078125, -10.593433380126953, 7.523387908935547, 2.6701087951660156, 0.48465728759765625, -3.166158676147461, 12.066680908203125, 31.981889724731445, 27.0478515625, -1.4617042541503906, 12.50592041015625, 13.74441146850586, 23.024187088012695, 17.948379516601562, 5.761070251464844, -4.67881965637207, 15.178680419921875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000275.npy"}
{"epoch": 0.41572184429327286, "step": 276, "batch_size": 64, "mean": 10.096281051635742, "std": 11.069599151611328, "min": -16.867286682128906, "p10": -2.147782897949218, "median": 9.432159423828125, "p90": 24.087039756774907, "max": 39.3140754699707, "pos_frac": 0.875, "sample": [19.43122673034668, 4.7956695556640625, 1.0633258819580078, 15.290550231933594, 6.160026550292969, 10.193855285644531, 39.3140754699707, -8.537445068359375, 7.259941101074219, 6.736295700073242, 11.120067596435547, 3.781209945678711, 12.951896667480469, 13.090003967285156, 14.661720275878906, 24.994338989257812, 1.6051826477050781, 1.4420433044433594, 8.772659301757812, 4.948207855224609, 6.014240264892578, 23.34064483642578, 20.07291030883789, 28.944665908813477, 0.91949462890625, -10.420846939086914, 13.267208099365234, -2.4381561279296875, 17.29311180114746, 24.406923294067383, 5.21795654296875, 9.896881103515625, -1.470245361328125, -10.789392471313477, 22.205612182617188, -2.954437255859375, 31.66168785095215, 1.2462787628173828, 11.581085205078125, 15.670967102050781, -16.867286682128906, 7.397468566894531, 29.251434326171875, 1.394418716430664, 0.08635711669921875, 0.20006561279296875, 22.31498908996582, 13.370010375976562, 21.422203063964844, 19.699127197265625, -3.0205154418945312, 1.78289794921875, 9.189140319824219, 3.757823944091797, 9.675178527832031, 27.93006134033203, 21.687179565429688, 6.849658966064453, 12.326484680175781, 22.67273712158203, 2.7286853790283203, 12.748085021972656, 15.909435272216797, 0.9149322509765625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000276.npy"}
{"epoch": 0.41723356009070295, "step": 277, "batch_size": 64, "mean": 5.939116477966309, "std": 12.612419128417969, "min": -18.99112319946289, "p10": -7.428957366943359, "median": 4.819378852844238, "p90": 21.608845520019536, "max": 39.370361328125, "pos_frac": 0.703125, "sample": [15.972305297851562, -13.980010986328125, 16.57392120361328, 2.1721878051757812, 13.5169677734375, -6.517543792724609, 5.656856536865234, 5.131767272949219, 21.938642501831055, 14.954864501953125, -6.82489013671875, -2.8375396728515625, 2.057098388671875, 5.372772216796875, -2.97723388671875, 7.918989181518555, 7.203912734985352, -3.7329559326171875, 2.4418563842773438, 3.7519187927246094, 8.97569465637207, 14.581855773925781, 3.2808799743652344, 5.562217712402344, 2.0995216369628906, 2.446044921875, 0.94989013671875, -7.4812774658203125, -12.083267211914062, 19.027923583984375, -0.8795967102050781, 6.290565490722656, -17.4153995513916, 22.148006439208984, 7.166080474853516, 16.567283630371094, -7.306877136230469, -14.056098937988281, 12.9580078125, 3.4493446350097656, 4.732177734375, -6.787197113037109, 36.74749755859375, 25.686317443847656, 7.4512481689453125, 4.906579971313477, 3.0508041381835938, 8.505989074707031, -15.424110412597656, 39.370361328125, 10.62186050415039, -4.895221710205078, 25.633087158203125, 15.678688049316406, -5.74029541015625, 1.2882499694824219, 20.839319229125977, -18.99112319946289, 19.322998046875, 37.40707778930664, -2.434947967529297, 1.9059257507324219, -0.0013561248779296875, 17.15482521057129], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000277.npy"}
{"epoch": 0.41874527588813304, "step": 278, "batch_size": 64, "mean": 9.671895980834961, "std": 13.092865943908691, "min": -31.761390686035156, "p10": -6.243297958374022, "median": 9.022477149963379, "p90": 25.409178161621096, "max": 35.94770812988281, "pos_frac": 0.796875, "sample": [-6.705757141113281, -7.207916259765625, 19.15130615234375, 14.60211181640625, 8.532974243164062, -1.3143081665039062, -19.592727661132812, 4.971427917480469, 28.14153289794922, 35.94770812988281, 7.148895263671875, -0.4922065734863281, 24.698699951171875, 34.037353515625, 20.2347412109375, -0.8581752777099609, 6.580406188964844, 11.395374298095703, 11.827465057373047, 20.47852325439453, 9.109159469604492, 12.252029418945312, 6.804069519042969, 8.797607421875, -31.761390686035156, 34.52354431152344, 1.4188995361328125, 20.449609756469727, 13.59930419921875, 7.708290100097656, 5.608921051025391, 14.317642211914062, -8.54510498046875, -3.1495513916015625, 19.71051025390625, 2.9530487060546875, 15.999706268310547, 11.305267333984375, 16.975372314453125, 20.59803009033203, 34.374298095703125, 32.601951599121094, -5.164226531982422, 13.975425720214844, 8.935794830322266, 15.284683227539062, 8.839813232421875, 1.3751144409179688, -14.080673217773438, 2.4323577880859375, 9.954696655273438, 19.163253784179688, -13.652664184570312, -1.993581771850586, 3.2061309814453125, 25.713668823242188, 14.56011962890625, 24.532867431640625, 10.763446807861328, 8.021339416503906, 4.759725570678711, 3.091550827026367, 16.253273010253906, 5.80059814453125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000278.npy"}
{"epoch": 0.42025699168556313, "step": 279, "batch_size": 64, "mean": 12.073814392089844, "std": 12.203908920288086, "min": -21.423534393310547, "p10": -2.8875816345214833, "median": 9.474262237548828, "p90": 30.01488037109375, "max": 39.87966537475586, "pos_frac": 0.859375, "sample": [-1.7933101654052734, 6.109382629394531, 39.87966537475586, 6.253089904785156, -21.423534393310547, 25.014419555664062, 5.703922271728516, -4.01519775390625, -3.749053955078125, 14.20620346069336, 12.642295837402344, 25.244918823242188, 8.1905517578125, 30.10797119140625, 16.177078247070312, 34.59931945800781, 9.373321533203125, 23.18325424194336, -5.308095932006836, 2.1035614013671875, -7.50230598449707, 22.278045654296875, 6.099494934082031, 6.877525329589844, 8.118881225585938, 10.733810424804688, 9.748529434204102, 0.14231491088867188, 7.789421081542969, 26.462905883789062, 5.6899261474609375, 17.451202392578125, 24.003807067871094, 7.277809143066406, -3.2707366943359375, 21.39424705505371, 17.436023712158203, 21.664947509765625, -1.9935531616210938, 33.052032470703125, 30.299598693847656, 9.190641403198242, 30.268726348876953, 5.5364837646484375, 1.5731658935546875, 10.005096435546875, 4.664516448974609, 5.881862640380859, 29.79766845703125, 25.154563903808594, 17.147266387939453, -5.573554992675781, 10.79345703125, 26.743045806884766, 2.235370635986328, 4.408565521240234, 7.297637939453125, 2.1338653564453125, 16.635696411132812, 17.075632095336914, 16.375751495361328, 9.575202941894531, 33.88386154174805, 5.6659088134765625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000279.npy"}
{"epoch": 0.4217687074829932, "step": 280, "batch_size": 64, "mean": 12.225912094116211, "std": 11.663192749023438, "min": -11.132976531982422, "p10": -0.8826211929321287, "median": 9.173351287841797, "p90": 29.66898269653321, "max": 41.683135986328125, "pos_frac": 0.859375, "sample": [14.272254943847656, 36.761810302734375, 3.929140090942383, 13.922882080078125, 4.2205352783203125, 33.6126708984375, 1.3536815643310547, -1.39288330078125, 6.309825897216797, 17.548561096191406, 11.116386413574219, 12.842952728271484, 13.447822570800781, 12.666351318359375, 25.969261169433594, 27.323585510253906, 6.994434356689453, 9.099891662597656, 28.130966186523438, 3.5165786743164062, -0.06864166259765625, 6.054924011230469, 0.1394500732421875, 1.7814369201660156, 30.32813262939453, 12.006744384765625, 9.172752380371094, 11.6561279296875, 28.064788818359375, 41.683135986328125, 11.572132110595703, -2.7514801025390625, 7.947959899902344, 6.5399932861328125, 2.6692161560058594, 16.504791259765625, 38.78413391113281, -3.916107177734375, 33.79503631591797, 20.071876525878906, -2.454803466796875, -11.132976531982422, -0.6487960815429688, -0.9828319549560547, 7.4453582763671875, 21.761051177978516, 9.248403549194336, -1.8491897583007812, 5.199516296386719, 6.856281280517578, 31.09554100036621, 6.295696258544922, 7.460056304931641, 23.341684341430664, 14.280691146850586, 8.06903076171875, 11.705238342285156, 5.294288635253906, 7.434724807739258, 9.1739501953125, 2.5863113403320312, 21.703529357910156, 24.10594940185547, 22.786483764648438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000280.npy"}
{"epoch": 0.42328042328042326, "step": 281, "batch_size": 64, "mean": 10.813520431518555, "std": 12.749458312988281, "min": -18.690885543823242, "p10": -5.16210308074951, "median": 9.972647666931152, "p90": 27.379944610595704, "max": 38.511322021484375, "pos_frac": 0.796875, "sample": [-5.9167327880859375, 21.32699203491211, 4.419921875, 0.9296112060546875, 28.155162811279297, 19.26898193359375, 9.613359451293945, 24.626998901367188, 13.355392456054688, 38.511322021484375, 6.455081939697266, 23.458688735961914, 13.826074600219727, 10.6787109375, 2.2102508544921875, 5.320703506469727, 8.379362106323242, 27.403457641601562, -3.4013004302978516, 19.6912841796875, 25.835124969482422, 4.826841354370117, 9.383026123046875, 15.954093933105469, 16.478363037109375, 10.33193588256836, 8.779312133789062, -1.6007804870605469, 7.97088623046875, -18.690885543823242, 8.769218444824219, 0.8271255493164062, -2.280731201171875, 10.537891387939453, 27.32508087158203, -12.40512466430664, 20.11180877685547, 33.88331604003906, 0.9731693267822266, -2.1561050415039062, 32.932044982910156, 16.112388610839844, 33.55854797363281, 18.08367347717285, 23.93138313293457, 16.046363830566406, 14.669689178466797, 7.186305999755859, 21.388071060180664, 8.28693962097168, -0.22078704833984375, 3.2259674072265625, 27.901439666748047, -15.822797775268555, 7.809528350830078, 24.33089828491211, 19.379817962646484, -7.142608642578125, 10.991813659667969, 16.941890716552734, -2.56195068359375, -7.958414077758789, 1.0687637329101562, -11.240522384643555], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000281.npy"}
{"epoch": 0.42479213907785335, "step": 282, "batch_size": 64, "mean": 10.371776580810547, "std": 13.78734302520752, "min": -15.929317474365234, "p10": -5.963351821899414, "median": 8.636750221252441, "p90": 30.8449935913086, "max": 41.47377014160156, "pos_frac": 0.75, "sample": [4.743019104003906, -6.110115051269531, -7.537982940673828, 6.994514465332031, 3.2434310913085938, 18.719928741455078, 13.296867370605469, 7.343620300292969, -4.9318695068359375, 35.51020812988281, 24.065200805664062, 28.886764526367188, 9.208415985107422, 11.933990478515625, 26.256607055664062, 10.516571044921875, 36.43140411376953, 10.14664077758789, -2.72454833984375, 13.224361419677734, 5.1943206787109375, 2.1714019775390625, 8.065084457397461, 25.253128051757812, 2.8715457916259766, 1.218963623046875, -2.4837417602539062, 18.75801658630371, 41.47377014160156, 12.685726165771484, -5.984256744384766, -5.150484085083008, -15.929317474365234, 24.73219871520996, 2.357706069946289, -10.184730529785156, 26.6702880859375, 4.193796157836914, -9.24127197265625, 11.665847778320312, 16.20960235595703, 22.6260929107666, 7.119728088378906, 14.604938507080078, 32.869327545166016, 38.646636962890625, 15.591472625732422, 2.023923873901367, -1.7767181396484375, 9.984481811523438, -5.543365478515625, -4.9330902099609375, -1.2349700927734375, 4.081874847412109, -5.914573669433594, 38.36968231201172, 31.684234619140625, 14.558809280395508, -10.512418746948242, 7.42938232421875, 13.139993667602539, 15.122734069824219, 24.50199317932129, 7.588890075683594], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000282.npy"}
{"epoch": 0.42630385487528344, "step": 283, "batch_size": 64, "mean": 8.761629104614258, "std": 14.224546432495117, "min": -20.38262939453125, "p10": -9.464636993408202, "median": 6.472480773925781, "p90": 29.720067596435552, "max": 37.092018127441406, "pos_frac": 0.734375, "sample": [22.915115356445312, -9.689146041870117, -11.782745361328125, -10.607368469238281, 7.6078643798828125, 8.786506652832031, 2.3401031494140625, 0.01007843017578125, 34.640174865722656, -0.9852218627929688, -6.629997253417969, 13.328704833984375, 7.1136322021484375, 0.8736839294433594, 23.391624450683594, 5.746013641357422, -0.076690673828125, 28.08600616455078, -18.628860473632812, 9.445037841796875, 25.601829528808594, 11.839485168457031, 23.92718505859375, 2.8172874450683594, 14.267562866210938, 31.882038116455078, 30.420379638671875, 11.220672607421875, 37.092018127441406, 9.346122741699219, 4.119781494140625, 22.599925994873047, -6.60137939453125, 8.770858764648438, -5.598457336425781, 18.4420166015625, -12.695159912109375, -1.702728271484375, 12.0938720703125, 24.303749084472656, 27.77133560180664, -8.94078254699707, 23.65894317626953, 31.634445190429688, 2.0887451171875, 1.8577537536621094, 5.831329345703125, -20.38262939453125, 1.7881908416748047, -3.640228271484375, 19.8045597076416, 2.4352493286132812, 32.8145751953125, 2.0371131896972656, 24.98468017578125, 15.72796630859375, 0.25408935546875, -1.6611785888671875, 3.0409412384033203, 2.0265274047851562, -12.210968017578125, 31.028589248657227, -2.34088134765625, 13.104347229003906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000283.npy"}
{"epoch": 0.42781557067271353, "step": 284, "batch_size": 64, "mean": 9.827235221862793, "std": 12.808740615844727, "min": -20.563270568847656, "p10": -1.2190752029418939, "median": 6.906131744384766, "p90": 27.27147064208985, "max": 46.150177001953125, "pos_frac": 0.859375, "sample": [6.1833038330078125, 3.815155029296875, 9.622701644897461, 10.445755004882812, 33.92278289794922, 11.07928466796875, -0.1292591094970703, -4.0989837646484375, 28.225326538085938, 3.3183212280273438, 2.545185089111328, 12.448421478271484, 1.1710281372070312, 8.25592041015625, 18.706632614135742, -4.725379943847656, 4.363521575927734, 20.40726089477539, 0.5966720581054688, 46.150177001953125, 2.08941650390625, 15.09161376953125, 6.2437896728515625, -6.874603271484375, -3.4398117065429688, 11.847122192382812, 2.1830902099609375, 5.651329040527344, 6.895923614501953, 0.29807281494140625, 19.270893096923828, 12.765144348144531, 15.390060424804688, 4.0904693603515625, 8.081298828125, 4.386173248291016, 17.190750122070312, 43.91702651977539, 26.143829345703125, 0.28511810302734375, 15.370582580566406, 38.51301574707031, 8.256050109863281, 5.102642059326172, -1.5168704986572266, 0.22499465942382812, 20.700881958007812, 11.52705192565918, 10.348533630371094, 27.754745483398438, 2.381795883178711, 6.916339874267578, 23.464096069335938, 4.6343231201171875, 5.4967803955078125, 10.340866088867188, 38.50440979003906, 6.129650115966797, 12.954290390014648, -20.563270568847656, 17.91246795654297, -19.35021209716797, 0.553558349609375, -0.5242195129394531], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000284.npy"}
{"epoch": 0.4293272864701436, "step": 285, "batch_size": 64, "mean": 8.347532272338867, "std": 12.424674034118652, "min": -14.061393737792969, "p10": -6.970315551757812, "median": 5.377508163452148, "p90": 24.868084144592288, "max": 37.8170166015625, "pos_frac": 0.75, "sample": [12.011512756347656, 4.052928924560547, -0.086761474609375, 24.389951705932617, 1.8184280395507812, -7.497509002685547, -12.861427307128906, 2.3297348022460938, 10.483444213867188, -8.051437377929688, 12.699638366699219, 29.951236724853516, 19.501495361328125, 11.890106201171875, 29.19432830810547, -10.957159042358398, 14.460639953613281, 34.112159729003906, -0.9222831726074219, 5.273021697998047, 5.48199462890625, 0.5201606750488281, -6.508228302001953, -11.452972412109375, 1.5878524780273438, -6.96435546875, -3.858917236328125, 7.074615478515625, 21.674560546875, -3.441974639892578, -3.7574234008789062, 22.486572265625, -14.061393737792969, 24.197853088378906, 25.668861389160156, 8.249446868896484, 8.384658813476562, 9.067718505859375, 18.71634864807129, 20.901874542236328, 12.415122985839844, 3.5548477172851562, 2.6219329833984375, 32.04914855957031, 23.449745178222656, 3.8994522094726562, 2.6874923706054688, 37.8170166015625, 1.255584716796875, 0.7893333435058594, 9.657896041870117, 17.06385040283203, 3.8097991943359375, 21.254486083984375, 2.7688751220703125, 2.361326217651367, 25.072998046875, 22.930198669433594, -6.972869873046875, 4.562252044677734, -0.943756103515625, 5.670806884765625, -0.5697479248046875, 11.2769775390625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000285.npy"}
{"epoch": 0.4308390022675737, "step": 286, "batch_size": 64, "mean": 13.323205947875977, "std": 13.647178649902344, "min": -28.124046325683594, "p10": -1.5389642715454093, "median": 11.159696578979492, "p90": 33.78894538879395, "max": 39.91119384765625, "pos_frac": 0.84375, "sample": [14.542839050292969, -0.6249980926513672, 2.7452564239501953, 12.318244934082031, 9.438591003417969, 16.53307342529297, 6.618316650390625, 4.279275894165039, 11.067941665649414, -1.9306640625, 19.806488037109375, 15.554706573486328, 9.759254455566406, 26.076841354370117, 20.807899475097656, 38.5755615234375, 24.373592376708984, 11.25145149230957, -14.09756088256836, 7.0048980712890625, 15.53036117553711, -2.3990936279296875, 9.431964874267578, 33.9030876159668, 5.046390533447266, -11.109298706054688, 21.946014404296875, 15.727886199951172, -2.1408843994140625, 22.489288330078125, 24.333791732788086, 19.940418243408203, 28.58652114868164, 26.588592529296875, 17.755889892578125, 35.89369201660156, 3.948467254638672, 24.903905868530273, -5.172260284423828, 12.987030029296875, 33.522613525390625, 39.91119384765625, 5.0675201416015625, 5.3797607421875, 11.044818878173828, 10.948272705078125, -0.15009689331054688, 5.5340576171875, 38.79613494873047, 34.40008544921875, 29.933837890625, 7.9443359375, 5.673488616943359, 2.564352035522461, -0.028964996337890625, 14.993240356445312, 34.333824157714844, 7.765163421630859, -28.124046325683594, 31.209823608398438, 10.515726089477539, 3.0263633728027344, 11.807411193847656, 4.323535919189453], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000286.npy"}
{"epoch": 0.4323507180650038, "step": 287, "batch_size": 64, "mean": 12.83158016204834, "std": 14.385025978088379, "min": -19.76613426208496, "p10": -1.2488388061523434, "median": 10.852407455444336, "p90": 36.0221435546875, "max": 46.34622573852539, "pos_frac": 0.828125, "sample": [16.61687469482422, 2.1787872314453125, 2.995941162109375, 19.30147933959961, 40.32115173339844, 33.11566162109375, -0.1427001953125, 8.00693130493164, 10.650405883789062, -0.8873138427734375, 12.130054473876953, 11.720405578613281, 7.2440643310546875, 14.131591796875, 34.593475341796875, 40.267738342285156, 36.634429931640625, 5.562952041625977, 22.125263214111328, 15.132068634033203, 12.585098266601562, 13.87152099609375, 22.424480438232422, 2.699037551879883, 5.399982452392578, 21.79608917236328, -0.6461429595947266, -10.949882507324219, 18.306785583496094, 4.338451385498047, 20.694801330566406, 3.4823074340820312, -19.76613426208496, 1.1609649658203125, -2.2493057250976562, 24.040069580078125, 4.499153137207031, 4.048847198486328, 7.433677673339844, 23.862937927246094, 26.964332580566406, -1.403778076171875, 33.82298278808594, 28.406064987182617, -9.473945617675781, 1.9006423950195312, 46.34622573852539, 4.829124450683594, 25.69097137451172, 20.284238815307617, 0.36139488220214844, 3.3388214111328125, 13.034765243530273, 3.9705257415771484, -1.7480964660644531, 5.063301086425781, 38.05994415283203, 11.71002197265625, 11.05440902709961, 1.4670352935791016, -6.271018981933594, 38.7685546875, 36.88875961303711, -0.5761489868164062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000287.npy"}
{"epoch": 0.43386243386243384, "step": 288, "batch_size": 64, "mean": 8.597015380859375, "std": 14.13460636138916, "min": -22.365066528320312, "p10": -9.32310218811035, "median": 8.46834659576416, "p90": 24.46177825927735, "max": 41.549964904785156, "pos_frac": 0.71875, "sample": [12.858280181884766, 2.8880844116210938, -9.99114990234375, -22.365066528320312, 22.204025268554688, 2.44598388671875, 8.132078170776367, -6.039890289306641, 23.126205444335938, 9.311647415161133, 7.634449005126953, 11.40866470336914, 2.0500106811523438, 25.025344848632812, -11.995849609375, 19.3594970703125, -14.610305786132812, 11.395809173583984, -1.3552703857421875, 11.166637420654297, -7.764324188232422, 32.45105743408203, 41.549964904785156, 39.443817138671875, -4.7646636962890625, 34.83143615722656, 7.71837043762207, 17.232330322265625, -1.2070541381835938, 2.7692203521728516, 41.204345703125, -10.255599975585938, 12.56170654296875, 13.44482421875, 23.14678955078125, 10.469657897949219, 6.9958038330078125, -6.5369873046875, 6.611625671386719, -3.4655227661132812, -7.323543548583984, -17.886306762695312, 28.47417449951172, 5.035133361816406, 6.931121826171875, -14.887557983398438, 18.410255432128906, -2.8499679565429688, 14.936017990112305, 18.67156982421875, 18.233871459960938, 1.8494796752929688, 17.140655517578125, 9.430511474609375, -3.0229034423828125, -1.9697189331054688, 14.266433715820312, 17.623939514160156, 0.3982372283935547, 21.494033813476562, 3.1648902893066406, 16.00402069091797, 18.194015502929688, 8.804615020751953], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000288.npy"}
{"epoch": 0.43537414965986393, "step": 289, "batch_size": 64, "mean": 9.236154556274414, "std": 15.077129364013672, "min": -22.71868133544922, "p10": -8.728539848327635, "median": 7.636557579040527, "p90": 30.923453903198244, "max": 47.51153564453125, "pos_frac": 0.734375, "sample": [8.516231536865234, 5.005470275878906, 32.91056823730469, -7.405557632446289, -3.7098922729492188, -9.2955322265625, -2.5704345703125, 9.343704223632812, 14.569427490234375, 10.491889953613281, 3.5905418395996094, 8.298667907714844, 16.578943252563477, 16.646469116210938, 12.57379150390625, 3.9929447174072266, 22.41473388671875, -17.145957946777344, 0.4844856262207031, 28.577178955078125, 4.3990631103515625, 38.13201904296875, -2.0199203491210938, -5.594482421875, 17.71147918701172, 6.333429336547852, 30.97271728515625, 1.8361320495605469, 0.0960235595703125, 38.621734619140625, 6.4291534423828125, 7.362775802612305, -9.735275268554688, -22.71868133544922, 44.698890686035156, -3.8788070678710938, -0.4384899139404297, 47.51153564453125, 6.8813018798828125, 11.722000122070312, 8.457876205444336, 14.206291198730469, -7.066764831542969, 4.584173202514648, 11.223854064941406, -10.078119277954102, 9.09891128540039, 0.4166145324707031, -3.80767822265625, -18.908191680908203, 22.243452072143555, 16.371368408203125, 7.91033935546875, 13.092056274414062, 33.022552490234375, 30.80850601196289, 23.783065795898438, 24.239273071289062, 23.846214294433594, 4.298057556152344, 4.472599029541016, -9.525857925415039, 20.572925567626953, -4.3379364013671875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000289.npy"}
{"epoch": 0.436885865457294, "step": 290, "batch_size": 64, "mean": 11.707275390625, "std": 14.84900188446045, "min": -30.005340576171875, "p10": -6.118110847473145, "median": 9.610454559326172, "p90": 33.84928932189941, "max": 39.08074951171875, "pos_frac": 0.84375, "sample": [26.107593536376953, 16.65146255493164, 17.67646026611328, 39.08074951171875, 13.783309936523438, 5.694303512573242, -6.170011520385742, 2.298215866088867, 4.189544677734375, 27.350242614746094, 10.090858459472656, 4.6504669189453125, 2.281646728515625, -13.214927673339844, 22.779930114746094, 5.018592834472656, -6.494239807128906, -2.5275802612304688, 1.7355537414550781, 0.20858001708984375, 12.178972244262695, 23.8671875, 37.94810104370117, 23.975563049316406, 3.21466064453125, 0.9763908386230469, 0.17324447631835938, -7.14495849609375, 10.779729843139648, -13.128555297851562, 27.03632354736328, 33.99663543701172, 18.268754959106445, 5.372074127197266, 4.738197326660156, 37.14783477783203, 37.198753356933594, 7.281558990478516, 9.130050659179688, -5.99700927734375, 31.548858642578125, 36.78539276123047, 1.408050537109375, -30.005340576171875, 4.435829162597656, 2.3500518798828125, 0.04577827453613281, 36.23616027832031, 22.328353881835938, 27.277189254760742, 3.9688148498535156, 13.8199462890625, 6.327198028564453, 7.754981994628906, -11.103450775146484, 30.1015625, 16.83655548095703, 33.5054817199707, 14.626800537109375, 14.459732055664062, 16.771602630615234, 17.042327880859375, 16.971786499023438, -2.4323043823242188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000290.npy"}
{"epoch": 0.4383975812547241, "step": 291, "batch_size": 64, "mean": 11.751852035522461, "std": 16.017637252807617, "min": -28.16059684753418, "p10": -7.007069396972656, "median": 9.985696792602539, "p90": 36.081529998779295, "max": 49.38665771484375, "pos_frac": 0.78125, "sample": [35.906951904296875, 36.156349182128906, -0.8693466186523438, -9.94891357421875, 11.047500610351562, 23.622161865234375, -7.089508056640625, 0.13593673706054688, 40.663673400878906, 0.11941909790039062, 15.968128204345703, 23.220285415649414, -13.374893188476562, -3.089052200317383, 25.907854080200195, 23.82720184326172, -0.9515113830566406, -1.4943618774414062, 7.8120880126953125, 13.62115478515625, 1.492431640625, 0.9338455200195312, 18.179473876953125, 19.37157440185547, 18.508907318115234, 20.002662658691406, 42.6348876953125, 0.38167762756347656, 36.50783157348633, 6.8218841552734375, 12.442821502685547, 38.88580322265625, 2.086578369140625, 0.042449951171875, -6.8147125244140625, 0.9818077087402344, 25.90298080444336, 6.6354827880859375, 28.028968811035156, 9.185905456542969, 3.9869308471679688, 38.49708557128906, -28.16059684753418, 11.398574829101562, 30.160133361816406, 15.38482666015625, 16.14232635498047, 29.959571838378906, 10.422029495239258, 49.38665771484375, -10.703411102294922, 6.6180877685546875, -9.768718719482422, 13.396293640136719, -13.004243850708008, 4.521350860595703, 3.2733993530273438, 34.8240966796875, 7.8436737060546875, 12.636123657226562, 17.594146728515625, 9.54936408996582, -2.4226913452148438, -2.8208465576171875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000291.npy"}
{"epoch": 0.4399092970521542, "step": 292, "batch_size": 64, "mean": 9.71307373046875, "std": 14.143044471740723, "min": -31.408676147460938, "p10": -6.82669620513916, "median": 8.640029907226562, "p90": 24.637131118774416, "max": 48.645904541015625, "pos_frac": 0.75, "sample": [-31.408676147460938, -6.477439880371094, 17.952791213989258, -0.010747909545898438, 31.652883529663086, 18.8535213470459, 6.632953643798828, 48.645904541015625, 14.404569625854492, -1.033233642578125, -0.7320022583007812, 3.935586929321289, 13.484323501586914, 19.677513122558594, -9.293609619140625, -2.706146240234375, 16.782188415527344, 17.575439453125, 20.86590576171875, -1.0398406982421875, 12.007587432861328, 16.354190826416016, 14.218143463134766, -10.1640625, 8.272659301757812, 3.8605728149414062, 4.718624114990234, 24.169662475585938, -2.4426116943359375, 30.11506462097168, 6.503425598144531, 15.95759391784668, 4.874422073364258, 1.2203598022460938, -0.2670612335205078, -15.403961181640625, 24.837474822998047, -14.509872436523438, 1.4933738708496094, 14.682411193847656, 1.23126220703125, 21.36016082763672, 16.96390724182129, -6.976377487182617, 40.61198425292969, 17.63044548034668, 8.587722778320312, 13.579015731811523, 5.851606369018555, 8.692337036132812, 20.841522216796875, 17.948474884033203, 3.0190792083740234, 17.867935180664062, 14.003402709960938, 2.5744247436523438, 14.567630767822266, 10.908130645751953, 40.64624786376953, 6.4062042236328125, 4.7183837890625, -4.924163818359375, -9.05624008178711, 36.32374572753906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000292.npy"}
{"epoch": 0.4414210128495843, "step": 293, "batch_size": 64, "mean": 13.139215469360352, "std": 15.884710311889648, "min": -34.77838134765625, "p10": -6.250070190429687, "median": 14.731998443603516, "p90": 33.46279964447023, "max": 47.507659912109375, "pos_frac": 0.765625, "sample": [1.1987800598144531, 6.9475555419921875, 10.378673553466797, 34.842041015625, 22.678062438964844, 16.432796478271484, -9.79013442993164, 8.243532180786133, 12.792617797851562, 19.057052612304688, 30.189804077148438, -2.5686473846435547, 28.197593688964844, 13.131126403808594, 25.491432189941406, 2.5733413696289062, 26.329668045043945, 17.420761108398438, 25.778310775756836, 21.642745971679688, 24.920757293701172, -7.433315277099609, -5.605857849121094, -9.603462219238281, 22.861717224121094, 4.255882263183594, 18.939762115478516, -6.526161193847656, 7.498283386230469, 45.43292236328125, 2.312715530395508, 41.587852478027344, 38.47210693359375, -12.7783203125, 13.23046875, -4.585224151611328, 30.244569778442383, -34.77838134765625, -1.6717948913574219, 1.3779296875, 15.798042297363281, 38.236968994140625, -12.12643051147461, -3.09942626953125, -1.4709930419921875, -3.602264404296875, 15.767814636230469, 18.4882869720459, 47.507659912109375, 22.03704071044922, 27.38784408569336, 13.696182250976562, 17.586122512817383, 6.470149993896484, 20.968116760253906, -3.9150428771972656, 18.07275390625, 7.4168701171875, 11.146753311157227, 36.36090850830078, 19.72113037109375, 7.330944061279297, 20.504318237304688, 21.506439208984375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000293.npy"}
{"epoch": 0.4429327286470144, "step": 294, "batch_size": 64, "mean": 11.398343086242676, "std": 14.342095375061035, "min": -14.300682067871094, "p10": -4.160767364501953, "median": 7.9565887451171875, "p90": 33.165135955810555, "max": 52.011375427246094, "pos_frac": 0.765625, "sample": [31.591102600097656, 10.952075958251953, 17.228317260742188, 7.2292022705078125, 30.032075881958008, 6.7316436767578125, 39.0111198425293, -0.3385467529296875, -7.503265380859375, 5.405830383300781, 18.632171630859375, -14.300682067871094, -5.5622406005859375, 21.662002563476562, 14.566164016723633, 18.250076293945312, 35.24089050292969, 10.133567810058594, 28.113849639892578, 15.198997497558594, 8.683975219726562, 2.078472137451172, 4.253873825073242, 16.316856384277344, -3.8112106323242188, 45.646583557128906, -7.286224365234375, 11.348974227905273, -4.310577392578125, 52.011375427246094, 33.8397216796875, -0.17038345336914062, -2.0979137420654297, 41.91035461425781, -0.7685260772705078, 2.7848892211914062, 4.103790283203125, -1.9609527587890625, 2.7551193237304688, 14.569374084472656, 6.021697998046875, 0.41705322265625, 24.71451759338379, 1.8614826202392578, 12.652311325073242, 19.19152069091797, 10.836864471435547, 25.00060272216797, 2.766693115234375, 34.78843688964844, -4.552478790283203, 27.588916778564453, 23.2913818359375, 1.8112564086914062, 4.34815788269043, 9.985389709472656, 2.6549072265625, -1.6109790802001953, 5.499011993408203, 0.7300224304199219, 13.732688903808594, -4.532709121704102, 11.59646224975586, -1.4711456298828125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000294.npy"}
{"epoch": 0.4444444444444444, "step": 295, "batch_size": 64, "mean": 9.488086700439453, "std": 14.411123275756836, "min": -13.681961059570312, "p10": -6.985038757324219, "median": 5.197429656982422, "p90": 28.592950439453126, "max": 46.48137664794922, "pos_frac": 0.734375, "sample": [28.910675048828125, -5.345924377441406, -1.1618423461914062, 1.93292236328125, 11.151100158691406, 25.166290283203125, 23.503639221191406, 3.030731201171875, 6.926734924316406, 7.4786376953125, 46.48137664794922, -1.0585651397705078, 44.859710693359375, 3.4346771240234375, 16.774662017822266, 12.59130859375, 4.980964660644531, 4.420013427734375, -5.063018798828125, 16.915302276611328, 25.463302612304688, -9.30302619934082, 3.10235595703125, 8.115026473999023, 26.87872314453125, -1.378143310546875, 5.012701034545898, -7.865245819091797, 29.383804321289062, 5.5657196044921875, -13.681961059570312, -1.8609676361083984, 45.63032531738281, 18.463722229003906, 12.71697998046875, -9.392574310302734, 15.692378997802734, 8.44113540649414, 16.659175872802734, 32.87737274169922, 15.649871826171875, -13.529476165771484, 4.674919128417969, -4.834228515625, 22.606903076171875, 27.851593017578125, -6.723136901855469, -7.097282409667969, -12.905858993530273, 10.967033386230469, 0.17562294006347656, 3.602823257446289, 3.6708297729492188, 3.3837127685546875, 5.030660629272461, 5.364198684692383, 21.723106384277344, 1.33062744140625, 2.9927902221679688, -5.041290283203125, 18.601646423339844, 27.314117431640625, 29.584606170654297, -3.606464385986328], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000295.npy"}
{"epoch": 0.4459561602418745, "step": 296, "batch_size": 64, "mean": 11.707871437072754, "std": 14.920954704284668, "min": -21.0921630859375, "p10": -6.502852630615234, "median": 11.409368515014648, "p90": 33.145970153808605, "max": 46.50011444091797, "pos_frac": 0.796875, "sample": [21.237396240234375, 31.009963989257812, 23.104042053222656, 8.323320388793945, -2.12933349609375, 5.5971221923828125, 3.325399398803711, 37.49267578125, 1.2243766784667969, 14.606494903564453, 19.575565338134766, 15.065532684326172, -10.285276412963867, 39.197784423828125, 3.5520401000976562, 13.808639526367188, 24.745948791503906, 16.849428176879883, 10.36783218383789, 17.110267639160156, 5.476997375488281, 9.09963607788086, 17.518898010253906, -21.0921630859375, -10.274383544921875, -5.3896942138671875, 2.5326919555664062, 12.85826301574707, 8.123558044433594, 11.40639877319336, 4.1601715087890625, -6.093780517578125, 12.97802734375, -0.5181121826171875, 42.764556884765625, 1.458282470703125, 0.07144927978515625, -4.200981140136719, 21.198976516723633, -8.672332763671875, 11.412338256835938, 10.578521728515625, 43.323150634765625, -16.655197143554688, 29.094253540039062, 14.5838623046875, 18.661197662353516, 19.792192459106445, 16.385549545288086, 25.7208251953125, 21.219894409179688, 2.3036041259765625, 29.68134307861328, 3.0713577270507812, 46.50011444091797, -0.439361572265625, 3.4861907958984375, 34.0614013671875, 2.2634658813476562, -10.77923583984375, -6.678169250488281, 34.21759796142578, 18.315505981445312, 11.997695922851562], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000296.npy"}
{"epoch": 0.4474678760393046, "step": 297, "batch_size": 64, "mean": 11.997785568237305, "std": 12.767616271972656, "min": -16.40949821472168, "p10": -3.1267169952392564, "median": 10.535984992980957, "p90": 28.95415515899658, "max": 39.11760711669922, "pos_frac": 0.8125, "sample": [20.757678985595703, -5.912086486816406, 0.7430343627929688, 21.790939331054688, 9.336273193359375, 20.172531127929688, 4.7552947998046875, 29.381702423095703, -1.8801116943359375, -16.40949821472168, 14.844802856445312, 5.5180816650390625, 12.080059051513672, 11.32485580444336, 3.787994384765625, 34.087371826171875, 15.915473937988281, 15.820556640625, 1.7886905670166016, -1.2215118408203125, 17.825576782226562, -0.81683349609375, -9.314834594726562, -8.590030670166016, -4.810054779052734, 33.876441955566406, 12.406867980957031, 1.4037628173828125, 24.815200805664062, 3.3961868286132812, 24.501026153564453, 39.11760711669922, 1.5465240478515625, -1.4567127227783203, 0.8664588928222656, 9.747114181518555, 4.661155700683594, 1.90496826171875, 28.451515197753906, 15.888557434082031, 4.81842041015625, 4.402366638183594, 21.454879760742188, 19.931894302368164, 6.7930145263671875, 25.41405487060547, 6.0719146728515625, -0.023581504821777344, -4.515472412109375, 28.78291893005371, 13.123706817626953, 21.435779571533203, 24.32821273803711, 23.324935913085938, -3.6609764099121094, 8.451620101928711, 26.30339813232422, 29.027542114257812, 1.8921890258789062, 32.221107482910156, 6.993034362792969, 22.246826171875, 31.577896118164062, 25.359909057617188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000297.npy"}
{"epoch": 0.4489795918367347, "step": 298, "batch_size": 64, "mean": 9.74111270904541, "std": 14.977656364440918, "min": -21.10283088684082, "p10": -9.805010604858397, "median": 8.817173957824707, "p90": 32.103292846679686, "max": 50.2314338684082, "pos_frac": 0.75, "sample": [11.508453369140625, -19.77836799621582, 50.2314338684082, 8.179595947265625, 28.65652084350586, 2.3363189697265625, 19.228023529052734, 0.46360015869140625, -7.655303955078125, 13.802101135253906, 37.65132141113281, 32.16029357910156, -11.129249572753906, 6.40093994140625, 2.260009765625, 13.35097885131836, 15.829254150390625, -2.3693923950195312, 23.820932388305664, 19.177322387695312, 2.7136611938476562, -7.687259674072266, 25.33367919921875, 18.273544311523438, 11.785736083984375, 7.9815826416015625, -13.838565826416016, 13.272588729858398, 14.556011199951172, 34.19529724121094, 3.765798568725586, 35.64514923095703, 1.582000732421875, 15.914535522460938, 11.626091003417969, 16.501850128173828, 6.100492477416992, 34.323448181152344, 7.290502548217773, 5.432884216308594, -1.375823974609375, -5.34266471862793, -9.954048156738281, 11.763565063476562, 7.020956039428711, 21.793426513671875, 27.962554931640625, 3.2950897216796875, -1.7993125915527344, 9.844482421875, 9.454751968383789, -3.784149169921875, -4.635555267333984, -10.888847351074219, 2.3012619018554688, 18.356590270996094, -9.457256317138672, 15.790878295898438, -12.573074340820312, -21.10283088684082, 31.970291137695312, 17.539554595947266, 5.747032165527344, 32.61054229736328], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000298.npy"}
{"epoch": 0.4504913076341648, "step": 299, "batch_size": 64, "mean": 8.929571151733398, "std": 15.338687896728516, "min": -22.068893432617188, "p10": -9.473002243041991, "median": 7.462772369384766, "p90": 29.623545074462896, "max": 45.06916809082031, "pos_frac": 0.71875, "sample": [19.927371978759766, 16.877201080322266, -0.8225536346435547, -11.60235595703125, -0.8316707611083984, 24.790836334228516, 10.936180114746094, 6.983879089355469, 45.06916809082031, 1.8589153289794922, -10.533231735229492, 9.508499145507812, -17.797386169433594, 2.3225326538085938, -22.068893432617188, 1.2186965942382812, 2.46435546875, 28.000816345214844, 23.66347885131836, -8.513397216796875, 30.319000244140625, -3.567211151123047, 8.200054168701172, 20.261096954345703, -6.6728057861328125, -8.975418090820312, 0.0192108154296875, 2.479595184326172, 42.41596984863281, 0.6178512573242188, 27.82867431640625, 12.819801330566406, 18.532699584960938, -18.575687408447266, 1.3423309326171875, 4.58209228515625, 13.381973266601562, 35.410343170166016, 4.9488677978515625, 7.948493957519531, 16.164138793945312, -7.544273376464844, 20.125091552734375, 20.370580673217773, 42.48085021972656, -9.68625259399414, 11.29780387878418, 31.748130798339844, 21.23272705078125, 17.09143829345703, 1.4276371002197266, 3.6877880096435547, 17.65161895751953, -6.0208740234375, 7.9416656494140625, 18.117630004882812, -4.699188232421875, 19.956741333007812, -0.22261619567871094, 1.4155426025390625, -0.5182247161865234, 8.304668426513672, -9.923233032226562, 36.32384490966797], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000299.npy"}
{"epoch": 0.4520030234315949, "step": 300, "batch_size": 64, "mean": 10.220560073852539, "std": 14.03746223449707, "min": -19.40424156188965, "p10": -7.598167419433594, "median": 10.56218147277832, "p90": 26.846800231933596, "max": 41.55850601196289, "pos_frac": 0.78125, "sample": [4.1394195556640625, 31.877227783203125, 1.626190185546875, 22.3883056640625, 5.912181854248047, 1.2847442626953125, 15.5284423828125, 18.7496337890625, 3.1299781799316406, -16.615468978881836, 36.68302917480469, 22.020774841308594, 19.75873374938965, 21.167556762695312, 41.55850601196289, 18.293258666992188, 28.210601806640625, -1.6368980407714844, -6.02569580078125, 9.229629516601562, 1.569833755493164, 16.513023376464844, 14.322738647460938, -5.968301773071289, -8.81396484375, 9.926055908203125, 17.32332992553711, -7.181308746337891, 9.475883483886719, 14.888874053955078, 4.336860656738281, -14.477615356445312, 19.610733032226562, 1.3792686462402344, -18.697052001953125, 11.198307037353516, 5.979574203491211, 41.124237060546875, 1.889892578125, 27.042343139648438, 22.832000732421875, 15.74801254272461, -4.0003509521484375, 33.02824401855469, 14.307380676269531, -8.969825744628906, 2.7719154357910156, 16.428348541259766, -1.6407928466796875, 22.248952865600586, 16.68011474609375, 7.585426330566406, 12.3214111328125, 5.8282318115234375, 0.226837158203125, -7.776821136474609, 18.96392059326172, 23.58380126953125, 22.693988800048828, -19.40424156188965, 2.6327438354492188, 15.777275085449219, 26.390533447265625, -2.8640785217285156], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000300.npy"}
{"epoch": 0.45351473922902497, "step": 301, "batch_size": 64, "mean": 9.543923377990723, "std": 15.494001388549805, "min": -25.684410095214844, "p10": -8.053623962402344, "median": 7.476099014282227, "p90": 30.487757873535163, "max": 55.32928466796875, "pos_frac": 0.75, "sample": [17.728309631347656, 14.842483520507812, 15.956352233886719, 6.3150482177734375, 14.75531005859375, -5.320091247558594, 20.905982971191406, 32.15531539916992, 24.004079818725586, 5.140235900878906, 6.358396530151367, 5.218965530395508, 11.730354309082031, -6.630035400390625, -16.78234100341797, 7.6344757080078125, 2.7806320190429688, 10.4835205078125, 20.97844696044922, 1.9771690368652344, -9.52474594116211, 7.656070709228516, 39.04170608520508, -7.032936096191406, 20.21149253845215, 1.7415847778320312, 1.991922378540039, 28.018089294433594, 20.026702880859375, -8.5037841796875, -5.659816741943359, 2.3120040893554688, 23.593006134033203, 29.326385498046875, -8.02099609375, 3.0217018127441406, -8.054168701171875, 55.32928466796875, -17.426530838012695, 7.317722320556641, -0.10512351989746094, 3.764371871948242, 13.312042236328125, -16.336891174316406, 2.2055130004882812, 25.8692569732666, 22.772249221801758, 9.051864624023438, 0.951629638671875, 33.46879577636719, 10.496078491210938, 30.985488891601562, 12.440544128417969, -0.498321533203125, -8.052352905273438, 15.98361587524414, 25.826698303222656, 36.732093811035156, 6.434700012207031, 1.9399566650390625, -0.5158615112304688, -25.684410095214844, 36.04449462890625, 8.127388000488281], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000301.npy"}
{"epoch": 0.455026455026455, "step": 302, "batch_size": 64, "mean": 6.531844615936279, "std": 13.575113296508789, "min": -24.12506103515625, "p10": -9.624831199645996, "median": 5.44737434387207, "p90": 24.04635181427002, "max": 45.4434814453125, "pos_frac": 0.75, "sample": [8.644538879394531, 37.67311477661133, -4.6394500732421875, 45.4434814453125, 19.049522399902344, 1.8094654083251953, 34.34067916870117, -6.464881896972656, 29.953781127929688, 8.513198852539062, 9.612358093261719, 8.705028533935547, 3.9884109497070312, 0.5339851379394531, 19.905696868896484, 31.11473274230957, 2.9082107543945312, 3.4974594116210938, 2.986652374267578, -4.0888214111328125, 6.442584991455078, -15.628089904785156, -16.444686889648438, 12.560562133789062, 15.687576293945312, -9.550630569458008, 5.2963714599609375, 11.155166625976562, 0.3194236755371094, -0.9536819458007812, -4.993721008300781, 14.172103881835938, 13.96978759765625, 5.613916397094727, -11.671072006225586, 19.280879974365234, 23.18701171875, 0.0264892578125, 5.026618957519531, 6.460361480712891, 12.401460647583008, 24.414640426635742, 1.0941238403320312, 4.307868957519531, 9.8004150390625, 6.1838531494140625, -21.746654510498047, -1.6351318359375, 6.501190185546875, 5.0628814697265625, 17.060163497924805, 0.17559814453125, 0.28310394287109375, 1.923126220703125, 5.598377227783203, -13.070053100585938, -9.656631469726562, 31.416000366210938, -1.907135009765625, -3.9259300231933594, -24.12506103515625, 12.387481689453125, 10.99102783203125, 11.059188842773438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000302.npy"}
{"epoch": 0.4565381708238851, "step": 303, "batch_size": 64, "mean": 12.781694412231445, "std": 16.30950355529785, "min": -16.52886199951172, "p10": -10.121866416931152, "median": 14.873838424682617, "p90": 34.16325073242188, "max": 49.72992706298828, "pos_frac": 0.734375, "sample": [-3.7076416015625, 15.722152709960938, 16.943191528320312, 13.776655197143555, 46.09111022949219, 27.536483764648438, 23.990798950195312, -9.162027359008789, 22.05613136291504, 13.0096435546875, 31.294692993164062, -16.52886199951172, 9.918994903564453, -0.3749408721923828, -13.834548950195312, -10.307716369628906, -14.517807006835938, 20.94708251953125, -14.90814208984375, -3.11663818359375, -0.12016868591308594, 9.058340072631836, 34.2451171875, 19.512027740478516, -10.187362670898438, 15.374771118164062, 25.857168197631836, -13.504226684570312, -2.9134273529052734, 19.003021240234375, 14.372905731201172, 5.54388427734375, 8.181671142578125, -1.6268768310546875, 6.900184631347656, 3.651050567626953, 36.458858489990234, 4.138862609863281, 19.825450897216797, 22.11749267578125, 3.2217330932617188, 19.555328369140625, 15.7808837890625, 33.97222900390625, 2.4976043701171875, 25.32666015625, 43.32484436035156, 10.209945678710938, 18.061447143554688, 29.150230407714844, -3.0938262939453125, 0.6005916595458984, -6.116352081298828, 18.434722900390625, 33.86576843261719, 18.657241821289062, 8.371652603149414, 15.7891845703125, 22.97197723388672, 37.89327621459961, 49.72992706298828, -9.96904182434082, 21.237625122070312, 37.837432861328125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000303.npy"}
{"epoch": 0.4580498866213152, "step": 304, "batch_size": 64, "mean": 12.559429168701172, "std": 12.507630348205566, "min": -19.256866455078125, "p10": -1.781603622436523, "median": 12.692352294921875, "p90": 30.6490291595459, "max": 38.36076354980469, "pos_frac": 0.84375, "sample": [30.547657012939453, -1.3089332580566406, 20.533523559570312, 21.360092163085938, 3.733856201171875, 11.899051666259766, 16.924942016601562, 17.862422943115234, 0.1292400360107422, 12.088605880737305, 6.8972625732421875, 27.630165100097656, 26.482757568359375, 5.5243072509765625, 25.64490509033203, 13.395513534545898, -1.9841766357421875, 13.814048767089844, 33.45690155029297, -19.256866455078125, 2.7993316650390625, 19.282432556152344, 7.0637664794921875, 9.60185432434082, 6.109550476074219, 5.048042297363281, 1.0312213897705078, 14.782272338867188, 17.239089965820312, 27.414865493774414, 22.02362060546875, 1.5204906463623047, 30.914703369140625, 18.38946533203125, 0.5398635864257812, -11.90155029296875, 15.060356140136719, 13.582916259765625, 7.223918914794922, 8.703460693359375, -0.32363128662109375, -6.333002090454102, 38.36076354980469, 12.47314453125, 4.337709426879883, 5.0620880126953125, 13.164512634277344, 30.692474365234375, 25.449249267578125, 36.39484405517578, -3.5800552368164062, 1.5767364501953125, 30.83026123046875, -1.1896743774414062, 21.067550659179688, -6.861690521240234, 9.740167617797852, 37.31266784667969, 18.01720428466797, 12.109413146972656, 14.558456420898438, -3.742473602294922, 12.91156005859375, 19.97024917602539], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000304.npy"}
{"epoch": 0.4595616024187453, "step": 305, "batch_size": 64, "mean": 9.37647819519043, "std": 15.378641128540039, "min": -33.448387145996094, "p10": -8.059568786621094, "median": 6.694644927978516, "p90": 29.805263900756838, "max": 42.5395393371582, "pos_frac": 0.71875, "sample": [24.866859436035156, 5.0928497314453125, -1.429107666015625, -14.07061767578125, -2.9972991943359375, 2.214519500732422, 42.5395393371582, -9.54473876953125, 15.5843505859375, 28.59002685546875, 4.702693939208984, 31.956645965576172, -9.8524169921875, 12.432621002197266, -6.538818359375, -7.491788864135742, 6.752418518066406, 2.9763259887695312, -8.254722595214844, 20.913284301757812, 3.250476837158203, 17.482107162475586, 6.636871337890625, 12.273223876953125, 12.31976318359375, 10.491401672363281, 19.529644012451172, -4.087242126464844, 23.4708251953125, -33.448387145996094, 3.6989898681640625, 17.43793296813965, 30.007633209228516, 23.49730682373047, -5.25360107421875, 5.622753143310547, 39.43708038330078, 1.321014404296875, 8.993579864501953, -4.750640869140625, 2.3337249755859375, 3.7854995727539062, 27.76354217529297, 38.28746032714844, 29.131202697753906, -1.3002243041992188, 5.682373046875, -11.915542602539062, 29.33306884765625, 16.370864868164062, 14.636802673339844, 6.9295196533203125, 6.259761810302734, 33.07167434692383, 19.77861785888672, 30.0335693359375, -7.604209899902344, 11.356391906738281, -17.994197845458984, 21.183507919311523, -4.928834915161133, 3.5537967681884766, 23.193801879882812, -5.220970153808594], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000305.npy"}
{"epoch": 0.46107331821617537, "step": 306, "batch_size": 64, "mean": 8.720699310302734, "std": 12.44642162322998, "min": -21.161216735839844, "p10": -5.320017623901366, "median": 7.289965629577637, "p90": 22.62921142578125, "max": 55.20185852050781, "pos_frac": 0.796875, "sample": [37.33619689941406, 19.924789428710938, 1.5212383270263672, 26.73145294189453, 6.394187927246094, 17.062179565429688, 2.210784912109375, 12.251510620117188, 7.444604873657227, 18.026222229003906, 22.674484252929688, 12.614662170410156, -12.397453308105469, 7.135326385498047, -6.3507232666015625, -21.161216735839844, 18.156341552734375, 3.569793701171875, 14.347286224365234, 5.343616485595703, 4.099945068359375, 27.919891357421875, 12.823692321777344, 1.3398075103759766, 11.306291580200195, 3.783252716064453, -1.1657772064208984, 15.866111755371094, 2.7229385375976562, 2.687042236328125, -9.908782958984375, 22.523574829101562, -1.9472160339355469, 0.03402137756347656, 10.340629577636719, 9.093650817871094, 25.989673614501953, 26.6837158203125, -1.1828994750976562, 8.924797058105469, 3.410825729370117, 4.3144989013671875, 7.030704498291016, 15.680168151855469, -3.8008384704589844, 1.3007431030273438, -3.9069652557373047, 21.31591033935547, 0.9220123291015625, 14.389930725097656, 20.282485961914062, 16.623838424682617, 8.964767456054688, 5.692649841308594, 9.208145141601562, -2.017589569091797, 6.900306701660156, -9.544792175292969, 13.667030334472656, 55.20185852050781, -12.451610565185547, 10.6240234375, 15.472579956054688, -5.92561149597168], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000306.npy"}
{"epoch": 0.46258503401360546, "step": 307, "batch_size": 64, "mean": 7.320440292358398, "std": 15.187078475952148, "min": -17.580162048339844, "p10": -11.42298355102539, "median": 3.2616806030273438, "p90": 26.32190170288086, "max": 46.60014343261719, "pos_frac": 0.5625, "sample": [24.53490447998047, -3.1999969482421875, 7.520729064941406, -4.193412780761719, -16.319778442382812, 18.10791778564453, 46.60014343261719, -0.134185791015625, 25.65972900390625, 22.219879150390625, -3.8080902099609375, 15.901130676269531, 8.947650909423828, 31.729995727539062, -17.580162048339844, 20.555465698242188, -0.2905426025390625, -1.7034530639648438, 15.291130065917969, -9.712417602539062, 11.494476318359375, -13.98832893371582, -5.72210693359375, -7.9283447265625, 9.951641082763672, -13.568790435791016, -10.26910400390625, 13.833381652832031, 26.552978515625, -5.243513107299805, -1.0473175048828125, 15.571439743041992, -5.010471343994141, -4.103199005126953, -11.917503356933594, 16.397735595703125, -3.406707763671875, 1.1154861450195312, 11.814844131469727, -6.1597442626953125, 4.696403503417969, -4.410530090332031, 0.272491455078125, 33.16040802001953, 29.001148223876953, -3.9189453125, 9.52037239074707, 23.97716522216797, 33.168209075927734, 20.17357635498047, -13.082794189453125, 25.78272247314453, 24.530227661132812, 1.8269577026367188, 23.232330322265625, 21.14792823791504, 0.5699844360351562, 31.50652313232422, -6.624778747558594, -3.1826248168945312, -13.179283142089844, 20.870079040527344, 16.268890380859375, -5.2917633056640625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000307.npy"}
{"epoch": 0.46409674981103555, "step": 308, "batch_size": 64, "mean": 9.3497314453125, "std": 13.398571014404297, "min": -24.886077880859375, "p10": -8.664563751220703, "median": 8.19736385345459, "p90": 27.878855514526368, "max": 32.49681091308594, "pos_frac": 0.765625, "sample": [28.204126358032227, 14.759319305419922, 14.982147216796875, -7.941747665405273, 20.345924377441406, 23.721527099609375, -4.3429718017578125, 14.724044799804688, 3.024322509765625, 1.570465087890625, 32.49681091308594, 13.185165405273438, 28.340309143066406, 12.13142204284668, 4.5851898193359375, -8.743610382080078, 8.094032287597656, 16.668487548828125, 22.929340362548828, 11.55386734008789, 18.439300537109375, -12.126712799072266, 31.712417602539062, 20.851638793945312, 11.343284606933594, 21.572067260742188, 1.7515506744384766, 4.157958984375, 16.212890625, -10.526542663574219, 31.08782196044922, -2.8289718627929688, 5.466117858886719, -8.480121612548828, 19.276809692382812, 1.6702041625976562, 20.881431579589844, 3.5257644653320312, 6.834175109863281, -24.886077880859375, 4.3850860595703125, 6.312732696533203, 24.89617919921875, 27.914173126220703, 5.654144287109375, 6.6862030029296875, 11.851264953613281, -5.6962890625, 17.3369083404541, -0.22708511352539062, 27.346689224243164, 0.652313232421875, -2.1836700439453125, 24.9725341796875, 32.36236572265625, -0.9538726806640625, 27.79644775390625, -15.625680923461914, -9.200037002563477, 8.300695419311523, 1.8782482147216797, -10.939773559570312, 8.457908630371094, 0.18218040466308594], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000308.npy"}
{"epoch": 0.4656084656084656, "step": 309, "batch_size": 64, "mean": 8.874654769897461, "std": 13.91922664642334, "min": -22.154510498046875, "p10": -7.516173553466795, "median": 8.784786224365234, "p90": 26.201259231567384, "max": 44.09855270385742, "pos_frac": 0.703125, "sample": [15.0360107421875, 44.09855270385742, 2.071096420288086, 28.557382583618164, 14.65986442565918, 0.16889190673828125, 16.216651916503906, -9.75539779663086, 8.032760620117188, 2.1821441650390625, 20.205352783203125, -0.5055637359619141, -10.3599853515625, -5.699201583862305, 7.685737609863281, -19.233638763427734, 9.264909744262695, 25.96334457397461, -5.122520446777344, -1.280832290649414, 4.139801025390625, 25.05496597290039, 27.81656837463379, -3.457979202270508, 19.60198211669922, 15.90740966796875, -0.7010002136230469, 6.456258773803711, -5.878608703613281, -1.214447021484375, 5.035009384155273, -10.96466064453125, -1.6403045654296875, 14.650360107421875, 10.055618286132812, 9.770130157470703, 14.996894836425781, 9.55783462524414, -0.8714389801025391, 17.35468292236328, -3.4287872314453125, 26.30322265625, 24.431438446044922, 18.521038055419922, 35.68345642089844, -16.1234130859375, 25.696212768554688, 12.854389190673828, 4.257287979125977, 15.887825012207031, -5.297306060791016, 5.359992980957031, 8.570144653320312, 14.73893928527832, 40.50862121582031, 8.999427795410156, 12.572750091552734, 19.616409301757812, -8.217987060546875, -22.154510498046875, 7.76129150390625, 0.287139892578125, 9.87297248840332, 33.42271423339844], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000309.npy"}
{"epoch": 0.4671201814058957, "step": 310, "batch_size": 64, "mean": 5.561444282531738, "std": 13.033235549926758, "min": -13.307632446289062, "p10": -10.69955825805664, "median": 3.472623825073242, "p90": 21.785087585449226, "max": 40.02105712890625, "pos_frac": 0.625, "sample": [15.854278564453125, 0.6149215698242188, 14.181842803955078, 6.94207763671875, 10.346420288085938, -12.927871704101562, -3.0451812744140625, 17.636154174804688, 22.61565399169922, 0.8317546844482422, -3.613832473754883, -13.299562454223633, 13.235523223876953, 34.561004638671875, 6.4898681640625, -1.7881965637207031, 0.07706832885742188, 3.674407958984375, 36.333274841308594, -10.7139892578125, 3.277019500732422, -0.8432865142822266, -9.369331359863281, 40.02105712890625, -5.5085906982421875, -10.520538330078125, 19.84709930419922, -5.740180969238281, 3.6682281494140625, 19.174985885620117, -11.674972534179688, 11.872207641601562, 7.9648284912109375, 13.08038330078125, -3.0474700927734375, 11.01873779296875, 1.4443817138671875, 10.246391296386719, 10.265541076660156, 12.299407958984375, -3.1241455078125, 11.454605102539062, -11.820533752441406, -12.530017852783203, 15.395637512207031, 6.49896240234375, 1.2116851806640625, 15.540822982788086, -2.612335205078125, -5.91192626953125, -1.6049118041992188, 29.680158615112305, 15.069757461547852, -13.307632446289062, 0.21262741088867188, -4.69029426574707, -1.62969970703125, 31.622955322265625, 18.92439079284668, -9.258712768554688, -10.665885925292969, 24.87030792236328, 0.5562381744384766, 6.568855285644531], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000310.npy"}
{"epoch": 0.46863189720332576, "step": 311, "batch_size": 64, "mean": 8.477435111999512, "std": 14.548373222351074, "min": -30.878829956054688, "p10": -7.0915588378906245, "median": 7.397723197937012, "p90": 26.175694274902344, "max": 56.63861083984375, "pos_frac": 0.765625, "sample": [17.729286193847656, 0.1660919189453125, 2.5528640747070312, -9.809326171875, -5.47332763671875, -6.7429046630859375, 6.218669891357422, 23.70818328857422, 3.232545852661133, 26.363510131835938, 6.218498229980469, -30.878829956054688, 8.1016845703125, 17.274497985839844, 13.2149658203125, -4.185939788818359, 19.416786193847656, 2.0212020874023438, 25.737457275390625, 7.6694183349609375, 2.216583251953125, 3.2997894287109375, 4.051733016967773, 4.057010650634766, 22.319435119628906, 3.5159759521484375, -12.385772705078125, -6.188140869140625, 3.3645782470703125, 12.735076904296875, -14.055633544921875, 29.824935913085938, 47.77696228027344, 7.126028060913086, 11.8330078125, 16.832807540893555, 12.148780822753906, 9.863788604736328, 10.528823852539062, 9.623138427734375, 56.63861083984375, 4.3302459716796875, -7.2409820556640625, 28.537872314453125, -7.715032577514648, 5.4279632568359375, 3.28155517578125, -4.228363037109375, 20.88513946533203, 11.201446533203125, 35.20848083496094, 18.53814697265625, 7.8483123779296875, -15.539505004882812, 7.976890563964844, 15.129463195800781, 3.371978759765625, -5.6187591552734375, -2.774627685546875, 12.85870361328125, -4.1657562255859375, 16.58885955810547, 29.95465087890625, 11.036331176757812], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000311.npy"}
{"epoch": 0.47014361300075586, "step": 312, "batch_size": 64, "mean": 11.443854331970215, "std": 12.852249145507812, "min": -21.868988037109375, "p10": -2.5113109588623024, "median": 9.21727180480957, "p90": 30.31490306854248, "max": 46.29161834716797, "pos_frac": 0.859375, "sample": [8.374385833740234, 5.854515075683594, 2.224853515625, 8.664321899414062, 17.334259033203125, 6.1083526611328125, 9.11129379272461, -0.2686614990234375, 7.4579925537109375, 1.3153839111328125, 39.22743225097656, 17.3865966796875, 20.56329345703125, 22.70229148864746, 1.2578887939453125, 15.184677124023438, 6.2306060791015625, 1.5215682983398438, 5.934028625488281, 25.082305908203125, 33.961875915527344, 10.139167785644531, -4.321155548095703, 10.544342041015625, 7.111843109130859, 29.867767333984375, 4.457832336425781, -7.48345947265625, 4.4919891357421875, 30.506532669067383, 36.199134826660156, -21.868988037109375, 18.635841369628906, 23.491554260253906, -12.812286376953125, 7.731334686279297, 22.287612915039062, 9.437549591064453, 17.318527221679688, 31.082427978515625, 46.29161834716797, 4.067485809326172, 9.323249816894531, 10.5028076171875, 21.626907348632812, 3.568939208984375, 6.785484313964844, 21.78302001953125, 13.766395568847656, -3.4724464416503906, -0.11917877197265625, 14.281206130981445, 32.201416015625, 13.161125183105469, 0.462799072265625, 8.5513916015625, 12.008636474609375, 18.23973274230957, -14.036151885986328, 21.261581420898438, 12.530513763427734, -5.621273040771484, 4.8433837890625, 8.351211547851562], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000312.npy"}
{"epoch": 0.47165532879818595, "step": 313, "batch_size": 64, "mean": 7.513920783996582, "std": 14.756089210510254, "min": -22.591367721557617, "p10": -8.386252212524415, "median": 6.777618408203125, "p90": 25.577771949768067, "max": 54.57176208496094, "pos_frac": 0.6875, "sample": [-2.7760391235351562, 15.910324096679688, 14.657903671264648, 9.054256439208984, 10.907150268554688, 15.312646865844727, 8.598709106445312, 2.553682327270508, 26.959007263183594, -8.446941375732422, 0.9401340484619141, 15.451005935668945, 12.748844146728516, 3.5224533081054688, 6.564201354980469, 1.274017333984375, -2.8606796264648438, -4.83551025390625, 9.687942504882812, -4.725563049316406, 15.828010559082031, 11.530136108398438, -0.47850608825683594, 39.59840393066406, -22.591367721557617, -16.41274070739746, -2.0822525024414062, 1.361114501953125, 21.426971435546875, 10.279716491699219, 34.61073303222656, 25.59200668334961, -15.659774780273438, -2.8396549224853516, 32.37133026123047, 24.21209716796875, 2.1760692596435547, -0.6135177612304688, -1.7761001586914062, 38.6905517578125, 5.068096160888672, 3.8720779418945312, 14.629425048828125, -6.03216552734375, 54.57176208496094, 4.523719787597656, -3.3138275146484375, 0.39818572998046875, 9.769378662109375, 7.029396057128906, 24.778915405273438, 11.885002136230469, 6.991035461425781, 0.3531913757324219, -4.81353759765625, -21.158309936523438, 9.287742614746094, -11.682182312011719, 23.65603256225586, -8.244644165039062, -12.772235870361328, 10.95233154296875, 9.876213073730469, 25.544557571411133], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000313.npy"}
{"epoch": 0.47316704459561604, "step": 314, "batch_size": 64, "mean": 11.607396125793457, "std": 12.604023933410645, "min": -11.8321533203125, "p10": -4.025990295410155, "median": 9.97976303100586, "p90": 29.844432067871097, "max": 40.220703125, "pos_frac": 0.84375, "sample": [19.63280487060547, 5.033557891845703, 7.687644958496094, 35.828643798828125, 15.48443603515625, 14.079994201660156, 7.641334533691406, 12.771827697753906, -5.170127868652344, 18.438383102416992, 0.7578125, -4.556999206542969, 39.01356506347656, 8.49639892578125, -2.007396697998047, 36.61625671386719, 15.787517547607422, 13.903264999389648, 28.362594604492188, 0.36107635498046875, 0.7537765502929688, 4.9521331787109375, 23.779329299926758, 25.757972717285156, 25.91205596923828, -11.8321533203125, 4.931018829345703, 27.33563232421875, 10.81842041015625, 8.7901611328125, 20.103628158569336, 10.139190673828125, 5.195232391357422, 3.227325439453125, 1.2098541259765625, 4.062953948974609, 23.772369384765625, -1.2307891845703125, 4.404539108276367, 10.674488067626953, 0.9658889770507812, -6.8561553955078125, 30.070228576660156, 7.315059661865234, 16.226722717285156, 15.312393188476562, 17.485519409179688, 40.220703125, 1.0743560791015625, 29.31757354736328, 9.820335388183594, 30.655593872070312, 15.0628662109375, 6.886466979980469, 18.66699981689453, 10.29058837890625, -4.3585968017578125, 11.988304138183594, 5.9754638671875, -3.249908447265625, 8.46159553527832, -8.957124710083008, -11.472358703613281, 31.051101684570312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000314.npy"}
{"epoch": 0.47467876039304613, "step": 315, "batch_size": 64, "mean": 9.52238655090332, "std": 12.439586639404297, "min": -19.591304779052734, "p10": -5.4460723876953105, "median": 9.72073745727539, "p90": 24.897836112976076, "max": 39.55005645751953, "pos_frac": 0.71875, "sample": [-6.343719482421875, 10.536956787109375, 25.115694046020508, 6.829618453979492, -1.7869491577148438, 11.464658737182617, -19.591304779052734, -2.4507083892822266, -6.7945098876953125, 9.872505187988281, -10.58966064453125, 21.0515079498291, -0.6530799865722656, 24.341041564941406, -0.8572502136230469, 13.5048828125, 11.976112365722656, 0.5946884155273438, 3.668537139892578, -1.2678031921386719, 10.622577667236328, 20.093124389648438, 1.8013114929199219, 7.7121734619140625, 8.60650634765625, 12.758075714111328, 30.588470458984375, -2.482879638671875, 5.866977691650391, 21.75496482849121, 15.652755737304688, 10.645755767822266, 27.447357177734375, 8.223167419433594, 4.671306610107422, 19.77295684814453, 15.202774047851562, -3.3515625, -15.979019165039062, 7.48553466796875, 22.0992431640625, -0.47833251953125, 16.009841918945312, 22.019149780273438, -2.3309326171875, 13.972763061523438, 19.624591827392578, 39.55005645751953, -14.684967041015625, 8.211746215820312, 20.06714630126953, 2.409210205078125, 18.694183349609375, -1.7228164672851562, 17.018238067626953, 8.793212890625, 24.572206497192383, 13.34515380859375, 29.86544418334961, -9.259939193725586, 32.23078536987305, -0.8931598663330078, 25.037391662597656, 9.5689697265625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000315.npy"}
{"epoch": 0.47619047619047616, "step": 316, "batch_size": 64, "mean": 11.016056060791016, "std": 13.494481086730957, "min": -17.339462280273438, "p10": -5.564591979980468, "median": 9.976703643798828, "p90": 32.22959861755371, "max": 44.34416198730469, "pos_frac": 0.75, "sample": [-4.6584014892578125, 5.902824401855469, 12.666435241699219, -5.7901458740234375, -17.339462280273438, -2.083831787109375, 8.508636474609375, 11.127883911132812, 1.1233673095703125, 18.703857421875, 27.320693969726562, 33.708740234375, 17.18630599975586, 33.337677001953125, 16.101898193359375, -3.684661865234375, -9.243833541870117, 7.818382263183594, -7.882244110107422, 24.260427474975586, 14.054445266723633, 33.79644012451172, 8.747098922729492, 9.956771850585938, 9.996635437011719, 6.630268096923828, 17.058258056640625, -0.2597198486328125, -6.553596496582031, 13.547966003417969, 23.16592788696289, 32.60508728027344, 16.10372543334961, 44.34416198730469, -4.140474319458008, 11.13800048828125, -5.038299560546875, 18.57464599609375, 19.532180786132812, 10.659019470214844, 23.0435791015625, 10.512138366699219, 8.922462463378906, 1.3553314208984375, 25.108795166015625, 0.7565078735351562, -1.4894771575927734, 35.103973388671875, -1.8868408203125, 29.943252563476562, -5.889869689941406, -9.691377639770508, 0.9564590454101562, -4.469064712524414, 6.460350036621094, 7.0451507568359375, 5.854984283447266, 34.40841293334961, 8.726089477539062, 15.431228637695312, 22.37405776977539, 9.024192810058594, 31.353458404541016, 11.0706787109375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000316.npy"}
{"epoch": 0.47770219198790626, "step": 317, "batch_size": 64, "mean": 13.200772285461426, "std": 12.854412078857422, "min": -10.825820922851562, "p10": -3.6935760498046877, "median": 12.664478302001953, "p90": 31.926076889038086, "max": 40.559913635253906, "pos_frac": 0.828125, "sample": [19.67365264892578, 20.376644134521484, 14.86928939819336, 27.251861572265625, 8.891983032226562, 40.559913635253906, -3.6812286376953125, 9.145713806152344, -2.602285385131836, 10.182403564453125, 3.5233154296875, 5.407360076904297, -4.217658996582031, -1.3755874633789062, 12.352149963378906, 1.3176460266113281, 24.927574157714844, 6.12548828125, 25.725967407226562, 32.031494140625, -5.31219482421875, 2.1408309936523438, 17.598655700683594, -10.825820922851562, 16.427770614624023, 11.201370239257812, 19.535606384277344, 18.168651580810547, 25.109676361083984, -5.6102294921875, 40.47489929199219, 6.577728271484375, 16.986379623413086, 37.36652755737305, 32.68457794189453, -7.6718902587890625, 6.1702728271484375, -3.422008514404297, 13.812496185302734, 32.684288024902344, 5.897190093994141, 12.976806640625, 37.44355773925781, -6.1123046875, 4.787422180175781, 11.858896255493164, 14.205879211425781, 23.502593994140625, 8.616348266601562, 2.8779678344726562, 22.07012176513672, 15.562519073486328, 15.38984489440918, 28.03040313720703, 2.3601913452148438, 14.802780151367188, -3.6988677978515625, 17.81597900390625, 29.22117805480957, 31.680103302001953, 2.7006683349609375, 22.16046905517578, 7.6524200439453125, 8.463966369628906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000317.npy"}
{"epoch": 0.47921390778533635, "step": 318, "batch_size": 64, "mean": 8.06229019165039, "std": 16.977603912353516, "min": -26.14012908935547, "p10": -16.467693710327147, "median": 7.0745439529418945, "p90": 30.565275573730474, "max": 43.383575439453125, "pos_frac": 0.671875, "sample": [10.087093353271484, 20.806396484375, 9.579360961914062, -11.533767700195312, 23.852371215820312, 22.047653198242188, 13.122772216796875, -18.236053466796875, 9.749238967895508, 42.3578987121582, 1.7213706970214844, -16.15502166748047, 20.904476165771484, 22.159164428710938, 26.283802032470703, 4.646114349365234, 4.275138854980469, -9.243631362915039, 12.015050888061523, 0.7660884857177734, -2.0252456665039062, 19.853012084960938, 33.25279998779297, -16.922237396240234, 4.5494384765625, 20.812808990478516, 31.180267333984375, -26.14012908935547, 13.585899353027344, 16.253738403320312, -2.5566253662109375, 15.58306884765625, -11.038200378417969, 2.424793243408203, 0.3946380615234375, 43.383575439453125, 41.72259521484375, -6.7680816650390625, 29.130294799804688, -1.8696441650390625, -3.0829315185546875, 6.8628387451171875, 18.74095916748047, -25.73908233642578, -0.6868896484375, 22.914749145507812, -4.314567565917969, 37.713653564453125, 1.5682048797607422, 19.60367202758789, 1.604024887084961, 34.20768737792969, 7.245172500610352, 12.325271606445312, -16.601696014404297, -2.6256103515625, 15.422348022460938, -19.184619903564453, 6.9039154052734375, -0.7420883178710938, 21.04551887512207, -19.099895477294922, -10.596626281738281, 18.490249633789062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000318.npy"}
{"epoch": 0.48072562358276644, "step": 319, "batch_size": 64, "mean": 11.5479736328125, "std": 14.695748329162598, "min": -16.851383209228516, "p10": -5.660868072509766, "median": 10.131866455078125, "p90": 30.396272277832036, "max": 50.80320739746094, "pos_frac": 0.75, "sample": [12.036481857299805, 37.34722137451172, 7.4353790283203125, 18.946060180664062, 4.150735855102539, 28.578506469726562, 6.400379180908203, 8.560943603515625, 8.278213500976562, 10.613555908203125, 24.693222045898438, 22.161392211914062, -2.295370101928711, -0.6964187622070312, 9.667854309082031, 19.4281005859375, 1.9999656677246094, 50.80320739746094, 5.100240707397461, 22.42852020263672, 5.024166107177734, -6.961784362792969, 8.852706909179688, 12.862434387207031, 39.50846862792969, -2.9075050354003906, 34.55976867675781, 16.826560974121094, 29.016799926757812, 24.045242309570312, -4.140960693359375, -5.604530334472656, 11.14349365234375, 48.7989501953125, -9.38107681274414, 11.241556167602539, -3.6122589111328125, 17.191970825195312, 19.84972381591797, -16.851383209228516, 30.81002426147461, 9.013351440429688, -1.09527587890625, 4.404014587402344, -0.02741241455078125, 4.246891021728516, -12.436874389648438, -2.2440567016601562, -9.055601119995117, 29.430850982666016, 0.8077507019042969, 8.6688232421875, 9.725692749023438, -5.6850128173828125, 16.14330291748047, 23.558502197265625, 14.417831420898438, 21.1827392578125, 15.851478576660156, 12.893762588500977, 10.538040161132812, -14.438789367675781, 36.5976676940918, 10.662094116210938], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000319.npy"}
{"epoch": 0.48223733938019653, "step": 320, "batch_size": 64, "mean": 8.551827430725098, "std": 15.342541694641113, "min": -24.297378540039062, "p10": -7.436042022705078, "median": 5.670494079589844, "p90": 28.694967269897468, "max": 54.271728515625, "pos_frac": 0.671875, "sample": [18.77305793762207, -6.749088287353516, 1.365610122680664, 11.499210357666016, 6.0121307373046875, 3.6996097564697266, 29.51132583618164, -1.4244098663330078, 18.51856231689453, -1.7440567016601562, -5.697639465332031, 23.836198806762695, 0.8739471435546875, -0.7724571228027344, -14.65594482421875, 19.074005126953125, 12.382949829101562, -1.8891983032226562, -2.761058807373047, 22.521080017089844, -16.856197357177734, 7.582897186279297, -24.297378540039062, 1.5288238525390625, -1.4837646484375, 54.271728515625, 49.589012145996094, 23.600730895996094, 15.823951721191406, -7.447479248046875, 4.724235534667969, -11.57343864440918, 13.191627502441406, 11.740936279296875, -0.19022369384765625, 37.15568542480469, 5.140361785888672, -3.8042755126953125, -7.714443206787109, -3.0615158081054688, 15.113475799560547, 31.54909896850586, 5.328857421875, 6.2236480712890625, 6.10174560546875, 3.355649948120117, 35.054832458496094, 13.069137573242188, 0.1417999267578125, -7.409355163574219, 1.2795028686523438, -11.651695251464844, 30.76140594482422, -5.316009521484375, 7.86810302734375, 20.762035369873047, 12.255943298339844, 26.44864845275879, 26.790130615234375, 5.307838439941406, 10.764982223510742, -6.271659851074219, 20.317481994628906, 19.176233291625977], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000320.npy"}
{"epoch": 0.4837490551776266, "step": 321, "batch_size": 64, "mean": 9.220213890075684, "std": 14.204230308532715, "min": -26.308563232421875, "p10": -9.145271682739256, "median": 8.772453308105469, "p90": 29.695787048339845, "max": 38.470462799072266, "pos_frac": 0.796875, "sample": [12.418830871582031, 18.295822143554688, 0.009735107421875, 13.225296020507812, 13.399909973144531, -0.39084625244140625, 35.169185638427734, 10.289236068725586, 32.120731353759766, 5.13444709777832, 33.872100830078125, 5.491756439208984, 11.594017028808594, 21.724079132080078, -15.3685302734375, 1.3268623352050781, 0.15413665771484375, 8.620216369628906, 14.357086181640625, 24.86595916748047, 11.17291259765625, 7.460380554199219, -21.919479370117188, -5.124290466308594, 8.924690246582031, -9.973644256591797, -14.488077163696289, -14.024084091186523, 18.808420181274414, 12.965011596679688, 6.8710479736328125, 20.38851547241211, 21.339210510253906, 13.425212860107422, 3.0737552642822266, 19.965301513671875, 15.118309020996094, -3.6975059509277344, 3.7596473693847656, 18.271766662597656, 1.1622428894042969, -1.9965324401855469, 1.18572998046875, 27.022994995117188, 4.264961242675781, -26.308563232421875, 24.222869873046875, 29.562026977539062, 31.187252044677734, 16.25818634033203, 6.8521575927734375, 6.279916763305664, 2.7322006225585938, 29.861316680908203, 29.75311279296875, 8.055412292480469, 8.034217834472656, 4.8763275146484375, 38.470462799072266, -18.651031494140625, 11.598968505859375, -5.100616455078125, 9.355331420898438, -7.21240234375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000321.npy"}
{"epoch": 0.4852607709750567, "step": 322, "batch_size": 64, "mean": 12.559789657592773, "std": 12.469807624816895, "min": -21.291698455810547, "p10": -0.2218606948852537, "median": 11.415281295776367, "p90": 29.346384048461918, "max": 41.84300994873047, "pos_frac": 0.875, "sample": [-2.8853397369384766, 16.179397583007812, 29.684711456298828, 4.416534423828125, 19.857547760009766, 7.712472915649414, 24.877792358398438, 19.74009132385254, 17.733108520507812, 11.142377853393555, 7.350303649902344, 14.913986206054688, 34.36373519897461, 34.10496139526367, 15.481231689453125, 8.36368179321289, 9.829360961914062, 12.2867431640625, 2.219451904296875, 5.229248046875, 12.887325286865234, 1.494903564453125, -9.840421676635742, 30.684770584106445, 24.01401138305664, 16.97228240966797, 4.931770324707031, 12.162660598754883, -0.017400741577148438, 11.547359466552734, 0.78533935546875, 9.9193115234375, 11.283203125, 26.569442749023438, 27.042354583740234, 10.034788131713867, 6.777870178222656, 34.09889221191406, 3.6563568115234375, -4.934074401855469, 14.702018737792969, 20.49974822998047, 1.971242904663086, 28.396163940429688, 10.995445251464844, 35.58525848388672, 25.267791748046875, 15.73291015625, -12.301826477050781, -21.291698455810547, 8.410503387451172, 4.499870300292969, 4.571094512939453, 11.823493957519531, 28.55695343017578, 7.0208740234375, 15.567054748535156, 5.1396636962890625, 14.937835693359375, -10.283746719360352, 41.84300994873047, -0.30948638916015625, 23.398557662963867, 6.421670913696289], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000322.npy"}
{"epoch": 0.48677248677248675, "step": 323, "batch_size": 64, "mean": 8.570770263671875, "std": 17.129331588745117, "min": -24.85467529296875, "p10": -8.0576602935791, "median": 5.297660827636719, "p90": 36.21750793457033, "max": 56.958160400390625, "pos_frac": 0.65625, "sample": [-0.29807281494140625, -9.514816284179688, 5.7516937255859375, 1.7560958862304688, -1.9732589721679688, 18.58258056640625, 13.821304321289062, 9.604436874389648, -0.04572296142578125, 38.292205810546875, 12.143363952636719, -4.166135787963867, 30.635719299316406, 10.814628601074219, 31.789710998535156, -7.885601043701172, -18.493209838867188, 23.587158203125, 19.16413116455078, 7.1337890625, -4.684295654296875, 20.29840087890625, 4.447601318359375, 5.533233642578125, 45.093894958496094, -2.4190902709960938, -13.6781005859375, 9.75157356262207, 0.385406494140625, 4.4532470703125, 56.958160400390625, -7.94476318359375, -0.29547882080078125, -7.234466552734375, 24.01428985595703, 1.6828231811523438, -8.926971435546875, 19.345436096191406, 12.6051025390625, 1.8347854614257812, 8.451789855957031, 13.869766235351562, -4.0750274658203125, 2.3466720581054688, 38.115135192871094, 50.07609176635742, 42.91584014892578, -24.85467529296875, 40.838226318359375, -3.0672531127929688, -24.13263702392578, -1.2960205078125, 11.504814147949219, 5.0620880126953125, 8.339996337890625, 19.38658332824707, 13.705047607421875, 17.963233947753906, -5.23028564453125, -8.10604476928711, 7.802837371826172, 1.6638298034667969, 0.3811492919921875, -5.052650451660156], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000323.npy"}
{"epoch": 0.48828420256991684, "step": 324, "batch_size": 64, "mean": 10.580099105834961, "std": 11.254307746887207, "min": -14.506317138671875, "p10": -0.9863906860351557, "median": 8.490480422973633, "p90": 24.317388153076173, "max": 38.59803771972656, "pos_frac": 0.859375, "sample": [20.40167236328125, 7.8654022216796875, 6.101356506347656, 2.33197021484375, -7.335422515869141, 28.33466339111328, 13.694635391235352, 10.2833251953125, -1.2601699829101562, -0.34757232666015625, 24.36156463623047, 21.956907272338867, 12.024246215820312, 15.389019012451172, 15.50314712524414, 4.350627899169922, 22.329940795898438, -11.818984985351562, 12.004203796386719, 1.601837158203125, 15.331703186035156, 5.805908203125, 4.760860443115234, 17.4801025390625, 14.526725769042969, 6.555850982666016, -6.859916687011719, 2.6812744140625, 38.59803771972656, 8.447879791259766, 5.274196624755859, -14.506317138671875, 19.21397590637207, 21.692432403564453, 8.5330810546875, 8.967853546142578, 3.7343215942382812, -3.635345458984375, 17.385265350341797, 18.49799346923828, 15.157417297363281, -5.0708465576171875, 29.8087158203125, 2.00567626953125, 35.623268127441406, 1.118621826171875, 1.2862510681152344, 17.89715576171875, 10.587383270263672, 29.63461685180664, 35.26072692871094, 3.0325279235839844, 3.4313316345214844, 12.835166931152344, 17.47564697265625, 7.2091522216796875, 3.6758346557617188, 0.2686004638671875, 4.886878967285156, 7.803050994873047, 24.214309692382812, 21.775413513183594, 7.175575256347656, -0.2243785858154297], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000324.npy"}
{"epoch": 0.4897959183673469, "step": 325, "batch_size": 64, "mean": 9.72203254699707, "std": 14.656831741333008, "min": -38.41624450683594, "p10": -6.522586059570313, "median": 8.451053619384766, "p90": 30.910672760009767, "max": 47.589149475097656, "pos_frac": 0.78125, "sample": [9.032951354980469, 31.390480041503906, 2.6708297729492188, 17.357114791870117, 4.736381530761719, 7.8691558837890625, 18.692657470703125, 17.048858642578125, 11.560371398925781, 9.305854797363281, 17.124176025390625, 18.442136764526367, 1.2623233795166016, -6.6879119873046875, 7.1020660400390625, -4.017559051513672, 30.695655822753906, 11.464332580566406, 10.024398803710938, -7.147970199584961, 47.589149475097656, -11.766448974609375, 25.944883346557617, 2.102842330932617, 0.23937225341796875, 36.8892822265625, -0.23266220092773438, 15.90108871459961, -6.441291809082031, 10.998334884643555, -2.7253570556640625, 7.823642730712891, 18.660667419433594, 9.524688720703125, -12.420906066894531, 22.040679931640625, -7.10711669921875, 21.285255432128906, 9.594694137573242, 38.72418212890625, 5.480705261230469, 1.442911148071289, 5.3792572021484375, 33.231658935546875, -38.41624450683594, 3.881591796875, 0.9535369873046875, 3.1477508544921875, 30.498069763183594, 12.506315231323242, 44.368553161621094, -3.1129722595214844, 6.429967880249023, -6.557426452636719, 3.8980026245117188, -0.58599853515625, 10.257209777832031, 15.930652618408203, 31.002822875976562, 4.5724029541015625, 11.30517578125, 3.6657180786132812, 9.540725708007812, -1.1616134643554688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000325.npy"}
{"epoch": 0.491307634164777, "step": 326, "batch_size": 64, "mean": 5.123998165130615, "std": 12.84668254852295, "min": -22.69500732421875, "p10": -11.863950538635253, "median": 6.000591278076172, "p90": 20.080769157409673, "max": 31.555004119873047, "pos_frac": 0.609375, "sample": [-11.559394836425781, -1.8082237243652344, 8.781742095947266, 2.6700172424316406, 17.829696655273438, -20.86077880859375, 28.29791259765625, 17.617095947265625, -3.8725128173828125, 23.318687438964844, -2.3386688232421875, 8.71194839477539, 5.526233673095703, -1.8155670166015625, 9.361612319946289, 16.615882873535156, 1.926504135131836, 12.987884521484375, -3.6777801513671875, 14.761383056640625, 19.024951934814453, 1.5930328369140625, 11.364341735839844, 18.86150360107422, -9.955734252929688, -0.258819580078125, -19.347305297851562, -4.028289794921875, -0.284088134765625, -1.091400146484375, 31.555004119873047, -19.147064208984375, 11.030784606933594, -2.197338104248047, 21.03003692626953, 11.253231048583984, 5.13629150390625, 1.7851333618164062, -9.356285095214844, -10.211715698242188, 20.533262252807617, 26.966224670410156, -13.20599365234375, -0.6806640625, 17.461071014404297, 6.474948883056641, 13.03396987915039, -11.994474411010742, 11.751810073852539, 27.47199249267578, 0.4204521179199219, -4.1024627685546875, -22.69500732421875, 15.847389221191406, -4.194267272949219, 11.047332763671875, -14.28111457824707, 10.283050537109375, 12.5830078125, 16.240386962890625, 16.440034866333008, -7.414844512939453, 6.838531494140625, 13.881282806396484], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000326.npy"}
{"epoch": 0.4928193499622071, "step": 327, "batch_size": 64, "mean": 8.558960914611816, "std": 15.35792064666748, "min": -32.16865539550781, "p10": -8.791468429565429, "median": 8.948577880859375, "p90": 27.035942077636726, "max": 53.14190673828125, "pos_frac": 0.71875, "sample": [16.808311462402344, -4.32720947265625, -7.9798583984375, 53.14190673828125, 4.30517578125, 19.308197021484375, -16.873056411743164, 12.902992248535156, -13.943174362182617, 18.400333404541016, 29.505762100219727, 3.6645545959472656, -9.331018447875977, 24.670284271240234, 10.78558349609375, 23.090669631958008, 21.603179931640625, 1.8434829711914062, 1.2159881591796875, -8.931476593017578, 0.9993381500244141, 9.586441040039062, 13.331350326538086, 27.598968505859375, 25.722213745117188, -1.5579071044921875, 12.649703979492188, 11.349769592285156, 17.132492065429688, -3.361072540283203, 16.62641143798828, 3.0286827087402344, -8.46478271484375, 18.473388671875, 15.539474487304688, 1.6208419799804688, 9.756103515625, 7.019775390625, -4.618534088134766, 0.013454437255859375, 32.53273010253906, 3.03973388671875, -32.16865539550781, 5.2470703125, 16.066024780273438, 49.626922607421875, -5.1860809326171875, 13.444732666015625, 37.584083557128906, 12.457969665527344, 29.26355743408203, -2.3627471923828125, 8.310714721679688, 4.0731353759765625, -12.409839630126953, -2.7565860748291016, 23.077430725097656, 13.934394836425781, 13.708038330078125, -1.5806007385253906, 1.1125564575195312, -16.14275550842285, 9.88665771484375, -5.291755676269531], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000327.npy"}
{"epoch": 0.4943310657596372, "step": 328, "batch_size": 64, "mean": 9.565893173217773, "std": 11.824197769165039, "min": -12.241348266601562, "p10": -4.297257041931152, "median": 7.1379241943359375, "p90": 28.461089134216312, "max": 37.616912841796875, "pos_frac": 0.765625, "sample": [36.05531311035156, 3.824159622192383, 8.858821868896484, 6.492576599121094, 17.105472564697266, 3.1124610900878906, -5.425090789794922, 19.206649780273438, 11.728630065917969, 10.543144226074219, -3.704681396484375, 20.339950561523438, 31.230133056640625, 10.205181121826172, 6.409912109375, -8.709884643554688, 31.612220764160156, 12.766532897949219, 13.009445190429688, -7.381137847900391, 5.7954254150390625, -3.7017974853515625, 2.4743194580078125, 10.818470001220703, -1.2359161376953125, 1.486358642578125, 31.86834716796875, 5.5348663330078125, -12.241348266601562, 13.751846313476562, -6.464508056640625, 17.19402503967285, 23.35958480834961, 6.9272308349609375, 19.198814392089844, -0.9847183227539062, 1.813995361328125, -2.256786346435547, 37.616912841796875, 1.525146484375, 24.512481689453125, 17.310745239257812, -4.551218032836914, -1.2299442291259766, 7.3486175537109375, 14.636070251464844, 1.4383525848388672, 2.0042877197265625, 3.5858078002929688, 6.873554229736328, 10.331262588500977, -2.5614700317382812, 12.745071411132812, 13.377853393554688, -4.9624176025390625, 10.541004180908203, 16.813430786132812, 6.078210830688477, 27.835847854614258, 6.337165832519531, 14.078729629516602, -2.1852455139160156, 33.36980438232422, 28.729049682617188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000328.npy"}
{"epoch": 0.4958427815570673, "step": 329, "batch_size": 64, "mean": 10.009060859680176, "std": 14.261083602905273, "min": -17.95721435546875, "p10": -5.620869827270508, "median": 6.498897552490234, "p90": 25.43480453491211, "max": 47.41381072998047, "pos_frac": 0.734375, "sample": [6.238269805908203, 1.6817359924316406, 9.398506164550781, 12.096832275390625, -4.785530090332031, 13.63642692565918, 34.330047607421875, -0.08121490478515625, 23.410430908203125, 4.306884765625, 17.24756622314453, 11.577178955078125, -5.135377883911133, 19.197708129882812, -1.5124549865722656, 0.35746002197265625, 5.9243621826171875, 18.628738403320312, -5.62884521484375, 10.782028198242188, 4.1042938232421875, 1.351552963256836, 41.31916046142578, 17.860260009765625, -17.212913513183594, 31.601974487304688, 47.41381072998047, 13.768638610839844, 1.9910125732421875, 0.3621063232421875, 21.48431396484375, 18.876182556152344, -11.600631713867188, 23.801025390625, 18.458242416381836, -1.8217620849609375, 34.73670959472656, -3.1627063751220703, 23.017480850219727, 20.857177734375, 24.85791015625, 21.030410766601562, -0.3521537780761719, -7.606986999511719, 6.587013244628906, 19.63129425048828, -0.013387680053710938, 4.209293365478516, 6.3104095458984375, 12.397899627685547, 1.1531105041503906, 18.362457275390625, 6.4107818603515625, -5.602260589599609, 45.918701171875, 24.299636840820312, -5.702096939086914, 25.682044982910156, 1.272552490234375, -17.95721435546875, 7.4143524169921875, 2.3774490356445312, -0.8065967559814453, -8.171371459960938], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000329.npy"}
{"epoch": 0.4973544973544973, "step": 330, "batch_size": 64, "mean": 9.910335540771484, "std": 14.979886054992676, "min": -32.485252380371094, "p10": -8.183676719665524, "median": 6.76017951965332, "p90": 27.25750198364258, "max": 41.17404556274414, "pos_frac": 0.8125, "sample": [2.4381561279296875, -5.657096862792969, -1.5462875366210938, -20.90409278869629, 4.9257049560546875, -32.485252380371094, 26.475196838378906, 6.549858093261719, -5.131248474121094, 29.06766128540039, 24.98974609375, 7.398719787597656, -15.418542861938477, 3.9747886657714844, 15.638362884521484, 22.331268310546875, 32.14918518066406, -10.697578430175781, 40.865966796875, 24.280567169189453, 23.760066986083984, -3.3745861053466797, 17.596710205078125, 0.8894157409667969, 24.359291076660156, 27.374893188476562, 6.295890808105469, 1.8141288757324219, -5.585540771484375, 4.280670166015625, 4.369367599487305, 18.664085388183594, 1.3522567749023438, 6.970500946044922, -15.373414993286133, 26.98358917236328, 19.518047332763672, 15.5340576171875, 19.151092529296875, 9.977401733398438, 3.297088623046875, 12.518905639648438, 4.888374328613281, 16.914440155029297, 26.06884765625, 25.047340393066406, 4.079805374145508, 6.442909240722656, 3.9143524169921875, 2.3117218017578125, 1.3119888305664062, 10.39544677734375, 33.676788330078125, 16.635879516601562, 6.0284576416015625, 3.2890777587890625, 4.544914245605469, -9.266496658325195, 36.360015869140625, -15.49072265625, 21.937410354614258, 15.774742126464844, 8.603187561035156, 41.17404556274414], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000330.npy"}
{"epoch": 0.4988662131519274, "step": 331, "batch_size": 64, "mean": 13.904380798339844, "std": 16.066940307617188, "min": -28.779449462890625, "p10": -5.18310852050781, "median": 13.68868637084961, "p90": 34.48415298461914, "max": 54.14271545410156, "pos_frac": 0.828125, "sample": [38.03666305541992, 5.668708801269531, 7.030815124511719, 17.020259857177734, 19.215579986572266, 25.882606506347656, 14.318283081054688, 25.895286560058594, 13.605606079101562, 1.5173873901367188, 0.7523765563964844, 39.69232177734375, 34.34125518798828, 34.54539489746094, 21.037445068359375, -2.425811767578125, 50.36536407470703, 15.634178161621094, 12.622856140136719, 43.049598693847656, 17.00445556640625, 5.793859481811523, 7.249275207519531, 4.751701354980469, -0.3748359680175781, 28.373085021972656, 19.706201553344727, 54.14271545410156, 1.15655517578125, 16.820205688476562, 20.459915161132812, 12.070878982543945, 26.788135528564453, 25.543807983398438, 2.7874832153320312, 0.0738525390625, 13.771766662597656, -12.754661560058594, -9.647287368774414, 25.395000457763672, -6.36480712890625, 38.17517852783203, 12.172143936157227, -1.001953125, 9.877511978149414, -28.779449462890625, 10.385551452636719, -8.551979064941406, -14.199310302734375, 3.059101104736328, 1.4321670532226562, 16.61431121826172, 29.906768798828125, 10.470008850097656, 9.70574951171875, 1.5670700073242188, -0.6876735687255859, 21.70667266845703, 16.9754638671875, -10.575557708740234, 26.982994079589844, 15.324542999267578, 24.969390869140625, 33.79819107055664], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000331.npy"}
{"epoch": 0.5003779289493575, "step": 332, "batch_size": 64, "mean": 10.228164672851562, "std": 13.783428192138672, "min": -12.951671600341797, "p10": -3.662150192260741, "median": 6.929269790649414, "p90": 31.39419250488282, "max": 42.12024688720703, "pos_frac": 0.8125, "sample": [-4.056400299072266, 2.0345840454101562, 3.2323226928710938, 0.6262130737304688, 8.227346420288086, 17.278213500976562, 8.688613891601562, 6.124696731567383, 2.696746826171875, 7.038951873779297, 3.3577117919921875, 42.12024688720703, 32.13294982910156, 16.959503173828125, 34.89293670654297, 2.0681838989257812, 9.274215698242188, 1.1805477142333984, 15.641372680664062, 2.3930816650390625, 29.670425415039062, 4.343894958496094, -9.979522705078125, 38.8568000793457, 15.325088500976562, 16.42650604248047, -10.307065963745117, 4.408256530761719, 7.616180419921875, 10.345382690429688, -1.0584640502929688, 29.211448669433594, 4.04705810546875, 1.2392654418945312, 3.2781982421875, 5.830833435058594, -0.9458236694335938, 41.71906280517578, -10.949886322021484, -0.5084781646728516, 10.112030029296875, 27.148590087890625, 11.716669082641602, 7.893373489379883, 5.7331390380859375, -2.7422332763671875, 14.628913879394531, 20.779476165771484, 14.90643310546875, 10.5240478515625, 28.991409301757812, 5.184608459472656, 7.202388763427734, -0.79168701171875, 3.1260147094726562, -8.480270385742188, 39.06050109863281, 6.819587707519531, 2.9389495849609375, 28.43433952331543, 40.191619873046875, 14.097412109375, -12.951671600341797, -10.402275085449219], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000332.npy"}
{"epoch": 0.5018896447467877, "step": 333, "batch_size": 64, "mean": 13.17949104309082, "std": 14.092615127563477, "min": -14.490371704101562, "p10": -1.9522825241088864, "median": 11.169082641601562, "p90": 32.638989067077645, "max": 55.251129150390625, "pos_frac": 0.828125, "sample": [7.068878173828125, -1.5331287384033203, 25.721649169921875, 29.500534057617188, 11.333526611328125, 0.24821853637695312, 19.64870834350586, 15.5091552734375, 4.853404998779297, 11.004638671875, 29.267318725585938, 24.558761596679688, 11.938926696777344, 17.160423278808594, 25.65687370300293, 8.95306396484375, 7.9898681640625, 39.10888671875, 33.37347412109375, 39.75825119018555, 0.00072479248046875, 7.4993896484375, 6.41021728515625, 2.9614486694335938, 11.891399383544922, -8.0360107421875, -4.052387237548828, 14.29983139038086, 7.644012451171875, 14.568519592285156, 22.262298583984375, 34.82395935058594, -0.33710479736328125, 7.478904724121094, 15.811958312988281, 22.730457305908203, 55.251129150390625, -1.2153968811035156, 1.3171768188476562, 2.711334228515625, -0.5321292877197266, 6.75213623046875, -14.490371704101562, 5.554830551147461, 8.389541625976562, -5.602664947509766, 33.190399169921875, 1.2697525024414062, -2.1319198608398438, 13.135490417480469, 7.0964508056640625, -7.3291778564453125, 15.590036392211914, 36.27851867675781, 3.222034454345703, -8.492660522460938, 15.061935424804688, 2.2056198120117188, 31.352365493774414, 20.955806732177734, 28.82904052734375, 18.1500244140625, 29.85131072998047, 30.03778839111328], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000333.npy"}
{"epoch": 0.5034013605442177, "step": 334, "batch_size": 64, "mean": 8.60505485534668, "std": 15.63957691192627, "min": -29.18695068359375, "p10": -9.28688507080078, "median": 6.71342658996582, "p90": 29.75369338989258, "max": 41.1226806640625, "pos_frac": 0.765625, "sample": [-12.115638732910156, -29.18695068359375, 12.126880645751953, 11.565074920654297, 13.726287841796875, 4.316856384277344, -22.184326171875, 26.40387725830078, 4.447315216064453, -25.823074340820312, 0.39365386962890625, 41.1226806640625, -8.003046035766602, 4.198493957519531, 3.83709716796875, 14.00124740600586, -2.70623779296875, -3.8455142974853516, 9.814910888671875, 28.637489318847656, 25.893386840820312, 36.94551086425781, 6.001289367675781, 13.163911819458008, 24.70458221435547, -14.95855712890625, 9.809234619140625, 1.8644428253173828, 36.02889633178711, 5.188159942626953, 0.32868003845214844, -3.9877185821533203, 30.99895477294922, -9.73223876953125, 19.423301696777344, 35.86692810058594, 22.51458740234375, 6.3677825927734375, 6.839202880859375, 1.1720733642578125, 6.587650299072266, -5.606868743896484, -0.5924263000488281, 3.1302337646484375, 29.952484130859375, 0.17739105224609375, 1.6420059204101562, 3.8432788848876953, -19.335418701171875, 22.295120239257812, 20.64893341064453, 8.504203796386719, 7.3861083984375, 9.248449325561523, -5.985940933227539, 9.716888427734375, 15.010749816894531, 29.28984832763672, -8.247726440429688, 3.68487548828125, 18.493179321289062, 14.105884552001953, 34.70793914794922, 26.907203674316406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000334.npy"}
{"epoch": 0.5049130763416477, "step": 335, "batch_size": 64, "mean": 10.360504150390625, "std": 15.219330787658691, "min": -19.88143539428711, "p10": -8.649274253845215, "median": 7.403475761413574, "p90": 32.874244689941406, "max": 45.17142105102539, "pos_frac": 0.78125, "sample": [18.213760375976562, 24.012474060058594, -8.795679092407227, 39.984649658203125, 3.069427490234375, 34.774261474609375, 21.960433959960938, 0.498138427734375, 19.684234619140625, 10.872528076171875, 30.191814422607422, 25.655284881591797, 5.2904510498046875, 8.280128479003906, 35.055870056152344, 30.60761260986328, 4.6747283935546875, 7.390226364135742, 3.257293701171875, 43.78673553466797, -14.007362365722656, 7.416725158691406, 10.127696990966797, 33.085113525390625, -7.3028106689453125, 13.530029296875, 18.914833068847656, -13.022674560546875, -10.383296966552734, 20.39574432373047, 21.638469696044922, 0.2920989990234375, 3.1134872436523438, 14.317378997802734, 20.627147674560547, -3.436382293701172, 34.17936706542969, -3.2858619689941406, 3.9966354370117188, 45.17142105102539, 4.087516784667969, 4.609640121459961, 0.9598922729492188, 19.382614135742188, 8.976921081542969, 32.38221740722656, -0.41927337646484375, -8.307662963867188, 4.0482177734375, 0.16104888916015625, 2.95196533203125, -4.88031005859375, 0.3873443603515625, -14.822870254516602, -19.88143539428711, -9.571632385253906, 22.178314208984375, 13.464359283447266, 7.457729339599609, 18.06076431274414, -2.7974395751953125, 1.6290206909179688, 7.021781921386719, 22.161407470703125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000335.npy"}
{"epoch": 0.5064247921390779, "step": 336, "batch_size": 64, "mean": 9.76156997680664, "std": 16.585277557373047, "min": -18.40054702758789, "p10": -7.712940979003905, "median": 5.780699729919434, "p90": 33.06481246948243, "max": 68.08120727539062, "pos_frac": 0.703125, "sample": [-8.353515625, 13.804794311523438, 27.573562622070312, -2.444854736328125, 45.1126708984375, 21.758468627929688, 13.136852264404297, -8.117050170898438, 46.776893615722656, 1.265960693359375, 0.20882606506347656, -2.1521644592285156, 38.958396911621094, -4.6787109375, 27.12225341796875, 6.285854339599609, 0.34760284423828125, 14.792160034179688, 39.032196044921875, -12.361808776855469, -4.4756011962890625, 9.286300659179688, 34.211082458496094, 1.3411102294921875, -1.911895751953125, 22.02511215209961, 5.732082366943359, -18.40054702758789, -8.482124328613281, -13.369606018066406, 38.55122375488281, -6.77001953125, 4.404356002807617, 11.276092529296875, 0.11224746704101562, 0.1899700164794922, 6.802070617675781, -8.497905731201172, 22.942548751831055, 17.71664047241211, 3.8552932739257812, -6.709739685058594, 4.6149749755859375, 3.9943161010742188, 23.287715911865234, 8.88409423828125, 8.922225952148438, -1.7467041015625, 5.829317092895508, 30.390182495117188, 4.720024108886719, -1.51611328125, 14.338859558105469, 27.800201416015625, 68.08120727539062, 14.781051635742188, -0.464569091796875, 3.198699951171875, 18.97321319580078, -5.355377197265625, 6.235748291015625, 16.266643524169922, 11.880813598632812, -6.273078918457031], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000336.npy"}
{"epoch": 0.5079365079365079, "step": 337, "batch_size": 64, "mean": 8.101402282714844, "std": 13.851225852966309, "min": -23.372386932373047, "p10": -6.569030761718749, "median": 6.305061340332031, "p90": 25.8607988357544, "max": 52.873046875, "pos_frac": 0.71875, "sample": [6.28558349609375, -2.8301467895507812, -14.683483123779297, -6.852775573730469, 10.362556457519531, 17.933258056640625, -2.794330596923828, 26.494626998901367, -2.33880615234375, 1.8802261352539062, 31.877471923828125, 3.0484695434570312, 12.369735717773438, -18.620115280151367, -4.53912353515625, 1.3041763305664062, 9.89569091796875, 24.381866455078125, 3.6127700805664062, -3.3880081176757812, 6.045783996582031, -1.0535831451416016, -12.5936279296875, -23.372386932373047, 15.908273696899414, 10.559627532958984, -5.906959533691406, 11.800582885742188, 10.0477294921875, 36.50514221191406, 8.393638610839844, 1.968994140625, 23.1684513092041, -4.441127777099609, 23.74078369140625, 16.638092041015625, 17.822662353515625, 0.6407470703125, 3.0257186889648438, 17.557706832885742, 27.37287139892578, 9.389705657958984, 17.228500366210938, 11.721633911132812, 5.529998779296875, -4.1623687744140625, 7.28099250793457, 4.3099212646484375, 17.1734619140625, 2.846874237060547, 18.198226928710938, 52.873046875, -0.8658294677734375, 3.020111083984375, 6.3245391845703125, 27.201889038085938, -0.9446620941162109, -7.847251892089844, 21.696866989135742, 16.619253158569336, -14.9931640625, 3.8767547607421875, 34.0684928894043, 10.713993072509766], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000337.npy"}
{"epoch": 0.509448223733938, "step": 338, "batch_size": 64, "mean": 8.800281524658203, "std": 13.654952049255371, "min": -26.109268188476562, "p10": -4.9122938156127915, "median": 6.889888763427734, "p90": 26.976210021972662, "max": 39.8663215637207, "pos_frac": 0.75, "sample": [-8.7188720703125, 3.276020050048828, 19.01165771484375, 18.865951538085938, 22.43901824951172, 14.058670043945312, -20.357009887695312, 18.29979705810547, 4.445899963378906, 31.56707191467285, 39.8663215637207, 7.297218322753906, -2.5631484985351562, 11.072799682617188, 3.835865020751953, -8.958663940429688, -2.7565765380859375, -17.850378036499023, 0.89630126953125, 4.499015808105469, 7.498283386230469, 22.728729248046875, 2.162914276123047, -0.9092578887939453, 11.367103576660156, 9.333038330078125, -26.109268188476562, 27.494979858398438, 12.938995361328125, 19.464147567749023, 15.857086181640625, 27.781158447265625, 1.1638908386230469, 4.352809906005859, 10.433927536010742, 14.926750183105469, 12.037630081176758, 6.4825592041015625, 25.7657470703125, 0.6359710693359375, -0.0760345458984375, 24.026430130004883, -5.392425537109375, 17.539371490478516, 2.555011749267578, 10.770591735839844, -2.5768280029296875, -6.007345199584961, 3.048675537109375, -2.5606765747070312, 4.814445495605469, 37.744972229003906, 18.18838119506836, 0.8593254089355469, 8.13724136352539, 20.0609130859375, 0.741912841796875, 32.685089111328125, 23.1710205078125, -3.598499298095703, -3.7919864654541016, 1.93609619140625, 37.68037414550781, -0.372222900390625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000338.npy"}
{"epoch": 0.5109599395313681, "step": 339, "batch_size": 64, "mean": 9.900163650512695, "std": 13.102482795715332, "min": -13.5140380859375, "p10": -5.689186859130859, "median": 7.790920257568359, "p90": 28.103121566772465, "max": 42.37004852294922, "pos_frac": 0.796875, "sample": [14.516695022583008, 9.727691650390625, 3.8617401123046875, -3.06463623046875, -3.0295562744140625, 27.180561065673828, 7.5575714111328125, 7.611381530761719, 4.571048736572266, -7.198205947875977, 19.442764282226562, -4.9120025634765625, 3.863513946533203, 5.292942047119141, 6.975978851318359, 3.121826171875, -11.954450607299805, 16.507781982421875, 31.104759216308594, 0.06108283996582031, 28.498504638671875, 4.303546905517578, 13.229061126708984, 16.55431365966797, 1.9188079833984375, 23.689266204833984, 4.396411895751953, 5.273357391357422, -5.375999450683594, 18.3856258392334, 23.688705444335938, 31.788467407226562, 36.29347229003906, 42.37004852294922, 14.682868957519531, 3.7302780151367188, 14.136436462402344, -4.7826995849609375, 20.0118408203125, 14.947998046875, 35.028045654296875, 9.35711669921875, -8.272211074829102, -5.8234100341796875, 10.423324584960938, -1.3352737426757812, 4.5096588134765625, 14.096435546875, -12.38226318359375, -10.979217529296875, 3.150745391845703, 3.2717418670654297, 15.995162963867188, 18.396583557128906, -13.5140380859375, 40.94933319091797, 0.37664794921875, 8.011270523071289, 7.970458984375, 17.23326873779297, 10.772769927978516, 17.884334564208984, 2.537670135498047, 26.973526000976562], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000339.npy"}
{"epoch": 0.5124716553287982, "step": 340, "batch_size": 64, "mean": 10.753141403198242, "std": 17.111642837524414, "min": -25.384876251220703, "p10": -10.960356712341309, "median": 8.811408042907715, "p90": 34.132561492919926, "max": 42.64470672607422, "pos_frac": 0.703125, "sample": [13.310615539550781, 29.548776626586914, 15.947418212890625, 15.412227630615234, -3.0319900512695312, 42.64470672607422, -1.5054683685302734, -2.3231029510498047, -2.9747848510742188, 33.9080810546875, 4.475532531738281, 22.790283203125, 8.592422485351562, 19.185300827026367, 37.793128967285156, 22.097270965576172, 1.611795425415039, -22.386085510253906, 24.04925537109375, 27.78448486328125, -8.181421279907227, -4.18663215637207, -18.702228546142578, 10.252681732177734, 3.0915298461914062, 40.86579895019531, -3.3257179260253906, 28.261932373046875, 6.972145080566406, 32.771697998046875, -12.841209411621094, 22.32305145263672, 2.7315673828125, 12.532806396484375, 32.78791809082031, 8.903417587280273, 37.759735107421875, 0.6662921905517578, 1.2116851806640625, 32.78853225708008, 29.1179256439209, -4.063230514526367, -1.259429931640625, 34.22876739501953, 27.678264617919922, -8.597467422485352, -11.987274169921875, -1.3392448425292969, 5.070554733276367, 25.050003051757812, 8.052078247070312, 9.277816772460938, 14.260765075683594, -11.212381362915039, -25.384876251220703, -10.372299194335938, 38.52018737792969, 0.32074737548828125, 13.105588912963867, 36.66820526123047, 9.765918731689453, 0.6051101684570312, -11.63751220703125, 8.719398498535156], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000340.npy"}
{"epoch": 0.5139833711262283, "step": 341, "batch_size": 64, "mean": 10.754829406738281, "std": 15.414800643920898, "min": -22.2625732421875, "p10": -7.626309967041014, "median": 8.145318031311035, "p90": 29.249808692932127, "max": 42.09217071533203, "pos_frac": 0.734375, "sample": [-13.60976791381836, 23.112598419189453, 21.02930450439453, 19.634065628051758, 26.187463760375977, 21.46266746520996, 27.619667053222656, 36.631591796875, 27.33068084716797, -14.078887939453125, 33.13850784301758, 3.9347877502441406, -12.957443237304688, -2.948648452758789, 3.1197128295898438, 3.0963058471679688, 19.924095153808594, 6.63629150390625, 29.28016471862793, 39.159759521484375, 22.362808227539062, 29.178977966308594, 33.263404846191406, 12.54507827758789, -22.2625732421875, 42.09217071533203, 23.948638916015625, -1.8912467956542969, -2.4486007690429688, -3.651092529296875, 10.720123291015625, 6.053901672363281, -5.1582183837890625, -3.6046600341796875, 15.49200439453125, 26.813907623291016, 8.33218765258789, 7.95844841003418, 6.487098693847656, -14.563667297363281, -6.052116394042969, -4.098182678222656, 6.618400573730469, 19.17236328125, 3.3358001708984375, 37.68695068359375, 19.758575439453125, 21.082061767578125, 16.055320739746094, 26.089393615722656, 13.549453735351562, 7.270111083984375, 23.047210693359375, -8.30096435546875, 3.689441680908203, 22.63701629638672, 12.790809631347656, 4.658000946044922, 2.337007522583008, -5.4499053955078125, -1.8073806762695312, 0.7803115844726562, 4.809688568115234, -20.721900939941406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000341.npy"}
{"epoch": 0.5154950869236583, "step": 342, "batch_size": 64, "mean": 12.426074981689453, "std": 15.14612102508545, "min": -18.81795310974121, "p10": -4.223802566528319, "median": 7.9964752197265625, "p90": 33.26382904052735, "max": 47.801273345947266, "pos_frac": 0.75, "sample": [7.855293273925781, 4.069732666015625, -4.796356201171875, -3.0877838134765625, 1.3040828704833984, 28.605823516845703, 3.7092361450195312, 25.323196411132812, -5.4537506103515625, 24.772789001464844, 2.8542919158935547, 8.137657165527344, -0.049800872802734375, 32.570335388183594, 7.5191192626953125, -6.6676177978515625, 1.8346977233886719, -18.81795310974121, -1.3738670349121094, -2.6562767028808594, 7.4051361083984375, 26.219379425048828, 24.464614868164062, 32.891868591308594, -3.5072174072265625, 15.922969818115234, 10.350234985351562, 12.069219589233398, 21.420852661132812, 27.22116470336914, 9.64581298828125, 9.47222900390625, 5.285064697265625, 20.789260864257812, 20.792381286621094, -4.902732849121094, 24.054946899414062, -3.4933319091796875, 6.7644195556640625, 31.843822479248047, -3.2693099975585938, 36.88629913330078, 27.935699462890625, 11.830772399902344, -4.94883918762207, 44.39186096191406, 5.689483642578125, -2.5663833618164062, 43.81123352050781, 35.92344665527344, 12.290687561035156, 0.7697715759277344, 7.68695068359375, 47.801273345947266, -4.530910491943359, 33.423240661621094, 15.802738189697266, 2.0022735595703125, 4.117530822753906, 12.03048324584961, 18.273649215698242, 6.622463226318359, -0.41236114501953125, 43.34381103515625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000342.npy"}
{"epoch": 0.5170068027210885, "step": 343, "batch_size": 64, "mean": 7.081644058227539, "std": 13.055013656616211, "min": -23.00322723388672, "p10": -7.044904708862302, "median": 7.172637939453125, "p90": 21.295089149475107, "max": 39.0841064453125, "pos_frac": 0.703125, "sample": [12.059593200683594, 2.8022079467773438, 0.8137969970703125, -2.6013641357421875, -2.6591567993164062, 18.676389694213867, 7.8020477294921875, -1.213418960571289, -2.5650787353515625, -2.9292526245117188, -15.979095458984375, -2.875762939453125, 11.511611938476562, 11.873634338378906, 38.00860595703125, -1.4709281921386719, 24.868392944335938, 39.0841064453125, 32.7337760925293, 18.385051727294922, 7.8592529296875, 8.62057876586914, 16.185958862304688, 8.998138427734375, 0.4603271484375, 11.400371551513672, 3.8811874389648438, -13.895328521728516, 13.430816650390625, -1.3733463287353516, -17.54845428466797, 6.5432281494140625, 17.979354858398438, 4.746379852294922, 0.39447021484375, 4.507743835449219, 18.16424560546875, 13.751708984375, -23.00322723388672, 31.763458251953125, 0.236785888671875, 9.805557250976562, 14.736396789550781, 22.417388916015625, 8.202285766601562, 6.0852508544921875, 30.425384521484375, 8.57260513305664, 0.6815223693847656, 17.32904815673828, 18.50055694580078, -3.7439956665039062, 5.5430755615234375, 18.470748901367188, 10.175025939941406, 13.636627197265625, -4.509746551513672, -0.074859619140625, 17.017601013183594, -1.3823623657226562, -20.642257690429688, -12.554824829101562, -8.131401062011719, 3.236766815185547], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000343.npy"}
{"epoch": 0.5185185185185185, "step": 344, "batch_size": 64, "mean": 10.392600059509277, "std": 13.660822868347168, "min": -20.697507858276367, "p10": -5.55404109954834, "median": 10.006515502929688, "p90": 27.48864669799805, "max": 46.223358154296875, "pos_frac": 0.78125, "sample": [-9.486175537109375, 23.32651710510254, 6.695747375488281, 7.542926788330078, 26.432350158691406, -0.12262535095214844, 6.587766647338867, 11.082122802734375, 9.08709716796875, 17.92364501953125, 1.316009521484375, 2.5017471313476562, 12.826231002807617, 19.882755279541016, 11.571617126464844, 19.60916519165039, 16.363494873046875, 46.223358154296875, 7.5892333984375, 22.274093627929688, -3.8688087463378906, -3.7175445556640625, -11.400222778320312, 31.845352172851562, 0.7856884002685547, -6.083808898925781, -10.981918334960938, 1.641998291015625, 32.09657287597656, 13.67154312133789, -5.626930236816406, 36.19746398925781, 0.20717620849609375, 10.008132934570312, 18.16858673095703, 27.94134521484375, 31.037250518798828, 5.3583984375, 10.004898071289062, -15.527290344238281, 3.0394630432128906, 2.6877403259277344, -2.483428955078125, -3.209474563598633, 3.0691452026367188, -5.383966445922852, 36.07600402832031, 23.358116149902344, 13.415252685546875, 14.998035430908203, 16.804590225219727, 5.164234161376953, 23.867591857910156, 11.253345489501953, 20.17719841003418, 0.4613933563232422, -2.85150146484375, 24.22230339050293, 0.2745819091796875, 18.99114227294922, -20.697507858276367, 20.08875274658203, 17.826854705810547, 22.9915771484375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000344.npy"}
{"epoch": 0.5200302343159486, "step": 345, "batch_size": 64, "mean": 10.14161491394043, "std": 14.635433197021484, "min": -23.731674194335938, "p10": -8.259581947326659, "median": 8.285926818847656, "p90": 28.797155761718752, "max": 50.084808349609375, "pos_frac": 0.734375, "sample": [9.215377807617188, 3.4788665771484375, 41.383872985839844, -19.89111328125, 4.632652282714844, -1.028839111328125, 2.125835418701172, -13.93255615234375, 2.1890716552734375, 3.5589218139648438, 16.54062271118164, 30.365230560302734, 22.196395874023438, 17.38509178161621, -9.040138244628906, 13.06622314453125, -9.754814147949219, 12.217170715332031, 5.6953887939453125, 26.487213134765625, 50.084808349609375, 5.28013801574707, 17.5401611328125, 17.555667877197266, 21.400968551635742, 1.4222564697265625, -2.5472183227539062, 7.68421745300293, 2.7136001586914062, 8.702140808105469, 16.87537384033203, 28.434402465820312, 22.866127014160156, 13.613161087036133, 13.7545166015625, -11.752168655395508, 40.24468994140625, 30.7147216796875, 22.427127838134766, -0.8886795043945312, 24.913314819335938, -0.31850433349609375, -9.804988861083984, 30.23870849609375, 14.530288696289062, -6.438283920288086, 5.809288024902344, -4.008056640625, 7.869712829589844, 5.616851806640625, 17.971664428710938, 7.066703796386719, -1.2591400146484375, 11.6649169921875, 1.3386154174804688, 24.720458984375, -1.1255073547363281, -23.731674194335938, -3.235105514526367, -0.26139068603515625, 21.37723731994629, 11.378936767578125, 22.780223846435547, 28.952621459960938], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000345.npy"}
{"epoch": 0.5215419501133787, "step": 346, "batch_size": 64, "mean": 11.988626480102539, "std": 15.684505462646484, "min": -24.11591339111328, "p10": -5.243314170837402, "median": 8.532549858093262, "p90": 32.70193328857422, "max": 42.905704498291016, "pos_frac": 0.78125, "sample": [26.892248153686523, 32.130741119384766, 18.642227172851562, 32.662818908691406, 32.383270263671875, 13.607040405273438, 4.4710845947265625, 16.186946868896484, 37.936038970947266, 31.324291229248047, 32.62957000732422, 3.7344741821289062, 17.13421630859375, 27.27375030517578, -4.42591667175293, 42.905704498291016, -13.14166259765625, 3.8821144104003906, -2.176105499267578, 6.977386474609375, 11.304244995117188, 13.451065063476562, 11.384765625, 4.950191497802734, 20.60282325744629, 36.79536437988281, -24.11591339111328, 12.683712005615234, 0.2618064880371094, -18.412338256835938, 2.827117919921875, -11.695121765136719, 22.429668426513672, -4.828149795532227, 5.170101165771484, 6.631134033203125, 4.20684814453125, -5.421241760253906, 33.758338928222656, 31.37860107421875, 35.078460693359375, 27.902812957763672, 2.4213104248046875, 27.189315795898438, 6.003192901611328, 6.625846862792969, 32.71869659423828, 10.67340087890625, -7.206871032714844, 17.726531982421875, 4.103912353515625, 1.194122314453125, 8.774847030639648, -1.7070388793945312, -4.047554016113281, 3.618316650390625, 6.456573486328125, 19.449668884277344, -13.078773498535156, 40.24491882324219, -2.605754852294922, -1.0953254699707031, 8.290252685546875, 24.14801025390625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000346.npy"}
{"epoch": 0.5230536659108088, "step": 347, "batch_size": 64, "mean": 9.012718200683594, "std": 15.03288459777832, "min": -25.841156005859375, "p10": -9.034562873840331, "median": 6.4019670486450195, "p90": 31.016534423828134, "max": 50.539329528808594, "pos_frac": 0.734375, "sample": [6.656526565551758, 6.147407531738281, 15.191230773925781, 17.5828857421875, -25.841156005859375, 3.556976318359375, 15.028106689453125, 11.301115036010742, 15.96650505065918, 28.172283172607422, 0.207305908203125, 32.80662155151367, 4.12237548828125, -10.991630554199219, -12.992759704589844, 48.182029724121094, 8.840469360351562, 10.959461212158203, -5.516143798828125, 4.443824768066406, 3.8298187255859375, 5.223747253417969, 20.588706970214844, 2.1954193115234375, -10.88077163696289, 3.1033897399902344, 20.40447235107422, 25.413724899291992, 32.12812805175781, -9.735626220703125, 0.1287384033203125, -2.619718551635742, -12.991828918457031, 12.258125305175781, -7.398748397827148, 5.984783172607422, -5.989906311035156, 12.823406219482422, 8.38775634765625, 6.658428192138672, 16.375471115112305, -2.315765380859375, -1.8423423767089844, 14.891143798828125, 9.117000579833984, 36.594146728515625, -5.250802993774414, -10.145133972167969, 50.539329528808594, 38.23102569580078, 31.749725341796875, 17.612567901611328, 12.005081176757812, 6.0527191162109375, -7.30198860168457, 17.665321350097656, 9.186391830444336, 5.304569244384766, 4.242351531982422, -1.3392200469970703, -4.688564300537109, 4.734275817871094, 29.305755615234375, 22.75545310974121], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000347.npy"}
{"epoch": 0.5245653817082389, "step": 348, "batch_size": 64, "mean": 12.638562202453613, "std": 17.778974533081055, "min": -14.503799438476562, "p10": -4.947331809997558, "median": 6.216001510620117, "p90": 36.61789703369141, "max": 60.08609390258789, "pos_frac": 0.6875, "sample": [34.738494873046875, 33.87078094482422, 0.311920166015625, 1.1212615966796875, 31.1444091796875, 16.298240661621094, 3.2681884765625, -4.029361724853516, -14.503799438476562, -0.08725738525390625, 45.42510223388672, -0.174560546875, 11.783203125, 20.32440185546875, 31.829254150390625, 27.76235008239746, 5.342414855957031, 48.25959014892578, 7.089588165283203, 43.22526168823242, 2.0823898315429688, 23.97811508178711, 30.12194061279297, 1.3771514892578125, 8.903099060058594, 26.761608123779297, -0.2669525146484375, 29.55398178100586, -5.145692825317383, -11.103160858154297, 15.829231262207031, 0.6734542846679688, 3.756204605102539, 50.91571044921875, 9.187278747558594, -8.02619743347168, 13.607158660888672, -5.422214508056641, -0.3174285888671875, 8.090192794799805, 37.42335510253906, 60.08609390258789, 4.3339080810546875, -2.14117431640625, 25.76740264892578, -1.2328910827636719, -5.929439544677734, -4.484489440917969, 0.26251220703125, 10.786861419677734, -2.087116241455078, 55.67887878417969, -7.873619079589844, 19.152360916137695, 25.252458572387695, -0.07784271240234375, 22.702322006225586, 4.2796478271484375, -4.125499725341797, 1.7096004486083984, -2.6765823364257812, 11.270034790039062, 24.945526123046875, -1.7096290588378906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000348.npy"}
{"epoch": 0.5260770975056689, "step": 349, "batch_size": 64, "mean": 11.49850082397461, "std": 14.406288146972656, "min": -29.469390869140625, "p10": -3.877226638793945, "median": 11.366545677185059, "p90": 27.884056854248048, "max": 47.288482666015625, "pos_frac": 0.765625, "sample": [-3.9744224548339844, 15.46405029296875, 27.064605712890625, 11.606803894042969, 9.665287017822266, 11.595682144165039, 14.5872802734375, 25.567081451416016, 10.240873336791992, 13.441255569458008, -3.1987228393554688, 3.1289615631103516, 18.883426666259766, 4.866783142089844, 12.4486083984375, 21.356374740600586, 27.790815353393555, -12.89788818359375, -29.469390869140625, 26.53598976135254, 11.267890930175781, 27.869674682617188, -2.623199462890625, 6.35498046875, 22.39958953857422, 3.00091552734375, 0.9730148315429688, 16.221582412719727, 3.543792724609375, 23.38165283203125, 23.44455337524414, 19.56869125366211, -3.6504364013671875, 16.669538497924805, -1.1795463562011719, 33.27210235595703, 6.558401107788086, 46.83750915527344, -7.378608703613281, 11.539081573486328, 37.97345733642578, -1.6526718139648438, 3.9457244873046875, 6.42633056640625, 11.465200424194336, 33.43529510498047, 17.09295654296875, 2.2829971313476562, -4.603446960449219, -2.9187850952148438, 18.309310913085938, 10.915374755859375, -4.289886474609375, -1.2515335083007812, 7.982458114624023, 39.874046325683594, 0.1717987060546875, 2.8535308837890625, -7.816295623779297, 27.890220642089844, -0.09012985229492188, 13.185409545898438, 14.659530639648438, 47.288482666015625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000349.npy"}
{"epoch": 0.527588813303099, "step": 350, "batch_size": 64, "mean": 13.908271789550781, "std": 15.557916641235352, "min": -20.938697814941406, "p10": -4.577455902099609, "median": 12.85484504699707, "p90": 33.64725036621094, "max": 47.26781463623047, "pos_frac": 0.828125, "sample": [40.463417053222656, 2.9662246704101562, 16.058937072753906, -4.726776123046875, 16.193634033203125, 6.177026748657227, 31.950515747070312, 30.735137939453125, 12.850955963134766, 28.273073196411133, 6.780853271484375, -4.11712646484375, -1.300750732421875, 21.361614227294922, 12.695947647094727, 19.730815887451172, 27.511573791503906, 1.0161857604980469, -4.785167694091797, -4.229042053222656, 16.077789306640625, 24.661575317382812, -14.350120544433594, 11.977462768554688, 13.582361221313477, -2.6519126892089844, -20.938697814941406, 28.909561157226562, 0.2661457061767578, 1.3802299499511719, 22.000747680664062, -7.289604187011719, 24.78350830078125, 2.0934066772460938, 25.929187774658203, 33.535888671875, 10.106414794921875, 33.694976806640625, 36.13056945800781, 46.95832824707031, 11.862092971801758, 47.26781463623047, 46.95931625366211, 1.8100357055664062, 7.3289031982421875, 4.463956832885742, 17.446678161621094, 2.4023399353027344, -8.979732513427734, 24.36414337158203, 8.853256225585938, 15.812503814697266, 1.1098880767822266, 14.52042007446289, 6.23345947265625, -9.234397888183594, 12.858734130859375, 12.816764831542969, 3.7549362182617188, 16.77056884765625, 45.732086181640625, 13.588947296142578, 23.12785530090332, 26.793949127197266], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000350.npy"}
{"epoch": 0.5291005291005291, "step": 351, "batch_size": 64, "mean": 10.911720275878906, "std": 15.251737594604492, "min": -36.825286865234375, "p10": -6.965396499633788, "median": 12.470296859741211, "p90": 30.519207572937013, "max": 35.68743896484375, "pos_frac": 0.78125, "sample": [-1.1003894805908203, 14.364219665527344, 32.63288879394531, 21.63207244873047, -1.9490470886230469, 20.123733520507812, -5.382724761962891, 19.288101196289062, 31.16686248779297, 7.938934326171875, 17.617294311523438, 28.118629455566406, 10.543977737426758, 4.7240142822265625, 10.03133773803711, -7.643684387207031, 6.144660949707031, 0.6536941528320312, 24.047767639160156, 10.81656265258789, 15.582794189453125, -10.642478942871094, 32.092105865478516, 7.9789581298828125, 26.883468627929688, 23.949844360351562, 16.721527099609375, 20.37646484375, -29.370826721191406, 12.366661071777344, 26.416141510009766, -9.519710540771484, 30.574310302734375, 0.05950927734375, 16.703777313232422, -12.606704711914062, -3.2607574462890625, 33.11602783203125, 12.146598815917969, 12.573932647705078, 17.89501190185547, 35.68743896484375, 12.355255126953125, -2.15374755859375, 22.86841583251953, 10.120285034179688, 13.622608184814453, 17.575401306152344, 0.4440155029296875, 13.342880249023438, -5.204616546630859, -0.9902915954589844, 22.252634048461914, 14.833185195922852, 30.390634536743164, 7.7024383544921875, 11.273441314697266, 33.74156188964844, -36.825286865234375, 9.521078109741211, -28.23345184326172, 13.278190612792969, 17.37468147277832, 1.5677642822265625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000351.npy"}
{"epoch": 0.5306122448979592, "step": 352, "batch_size": 64, "mean": 9.951467514038086, "std": 13.990104675292969, "min": -15.6988525390625, "p10": -5.45386962890625, "median": 7.575897216796875, "p90": 28.363077354431155, "max": 55.471893310546875, "pos_frac": 0.796875, "sample": [19.235977172851562, 55.471893310546875, 42.159942626953125, 2.9229888916015625, 23.30999755859375, -15.6988525390625, -6.086643218994141, 3.0125865936279297, 6.349620819091797, 8.2667236328125, 10.826375961303711, 17.32245635986328, -5.59039306640625, -0.9529457092285156, 16.898727416992188, 3.332489013671875, -5.13531494140625, -1.92864990234375, -6.892860412597656, 7.6026153564453125, 24.215599060058594, 8.620864868164062, 11.64862060546875, 32.503108978271484, 27.439403533935547, 13.464599609375, 8.933441162109375, -12.622098922729492, 6.732280731201172, -0.5011882781982422, 1.8527069091796875, 3.9023971557617188, 1.8626251220703125, 8.222373962402344, 7.5491790771484375, 16.66971206665039, 5.462739944458008, 22.36419677734375, 28.911911010742188, 8.984123229980469, 24.261917114257812, 0.5996932983398438, 46.30291748046875, 1.8777027130126953, 8.363433837890625, 28.74785804748535, -5.765285491943359, 3.5876312255859375, -2.9328250885009766, 18.34334945678711, 0.256103515625, 6.638580322265625, 12.543212890625, 31.293373107910156, 20.60816192626953, 0.4227142333984375, -4.968711853027344, 27.465255737304688, -11.175609588623047, 0.0619964599609375, 0.5537071228027344, 12.818071365356445, 4.286767959594727, 12.060579299926758], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000352.npy"}
{"epoch": 0.5321239606953893, "step": 353, "batch_size": 64, "mean": 11.268505096435547, "std": 15.612439155578613, "min": -38.775978088378906, "p10": -4.871797943115233, "median": 8.669316291809082, "p90": 33.07772216796875, "max": 54.90790557861328, "pos_frac": 0.796875, "sample": [18.785879135131836, 32.07362365722656, -12.172981262207031, 12.455562591552734, -11.43637466430664, 7.153331756591797, 7.761375427246094, -9.212661743164062, 34.89836120605469, 10.298797607421875, 0.5323371887207031, 16.030067443847656, 19.824108123779297, 4.902801513671875, 37.426239013671875, -2.3463191986083984, 14.92169189453125, 17.43115997314453, 24.746551513671875, 32.326904296875, 31.045516967773438, 36.602325439453125, 1.4084339141845703, 21.745803833007812, 0.7862167358398438, -3.3443984985351562, 34.617347717285156, 17.216079711914062, 33.30762481689453, 8.836893081665039, 11.588829040527344, 2.6957359313964844, 12.557231903076172, -0.6225547790527344, -38.775978088378906, 24.82193946838379, -11.196144104003906, -6.750873565673828, 20.658164978027344, 32.541282653808594, 9.599658966064453, 30.905975341796875, 54.90790557861328, 7.465950012207031, 8.501739501953125, 7.220615386962891, 12.0924072265625, 4.218595504760742, 1.3536033630371094, 4.5336456298828125, 6.989404678344727, 2.0609283447265625, -2.2157516479492188, 8.841087341308594, 0.8108558654785156, 5.7279205322265625, 5.3985748291015625, 4.5181121826171875, -3.156829833984375, -0.16106414794921875, 20.33050537109375, -5.526397705078125, 35.17075729370117, 15.456230163574219], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000353.npy"}
{"epoch": 0.5336356764928194, "step": 354, "batch_size": 64, "mean": 12.10396957397461, "std": 15.149526596069336, "min": -21.418275833129883, "p10": -9.327078247070311, "median": 10.271709442138672, "p90": 31.609371185302734, "max": 51.35395431518555, "pos_frac": 0.796875, "sample": [42.97106170654297, 22.26618194580078, -10.504631042480469, 15.504989624023438, -13.298774719238281, 7.041652679443359, 11.749229431152344, 13.562103271484375, 25.836990356445312, 0.5787010192871094, 20.4218807220459, 29.50436019897461, 18.25985336303711, 2.727031707763672, 3.4686203002929688, 1.2228240966796875, 21.59728240966797, -7.898990631103516, 33.32868957519531, 51.35395431518555, 27.099027633666992, 20.328907012939453, 10.117660522460938, 14.555793762207031, -21.418275833129883, 1.3540802001953125, -9.868728637695312, 26.089580535888672, 10.286834716796875, 10.256584167480469, 19.336212158203125, 31.770572662353516, 26.98577117919922, 18.25701904296875, 8.703231811523438, 3.6879310607910156, 13.66579818725586, -12.377901077270508, 34.43547058105469, 0.479949951171875, 16.168685913085938, 33.46415328979492, -9.728111267089844, -8.391334533691406, 29.739593505859375, -10.434600830078125, 24.0799560546875, 32.42364501953125, 7.081779479980469, 1.7121429443359375, 6.4721221923828125, 9.072484970092773, 24.90544891357422, 31.233234405517578, 9.899837493896484, 8.563533782958984, 3.8597335815429688, -0.3019866943359375, -3.1076736450195312, 17.63076400756836, 6.758613586425781, 27.179466247558594, -5.1175079345703125, -1.9484825134277344], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000354.npy"}
{"epoch": 0.5351473922902494, "step": 355, "batch_size": 64, "mean": 11.743728637695312, "std": 15.987975120544434, "min": -25.34861946105957, "p10": -8.684160995483397, "median": 11.927513122558594, "p90": 33.17920875549317, "max": 42.865264892578125, "pos_frac": 0.75, "sample": [12.682743072509766, 39.784690856933594, 0.5892791748046875, 11.452125549316406, 27.332916259765625, 16.086753845214844, 8.440864562988281, -10.036506652832031, -7.657779693603516, 3.3626327514648438, 20.72018051147461, -3.7857208251953125, 17.819320678710938, -11.0948486328125, 18.156394958496094, 23.407535552978516, -5.966861724853516, 15.503337860107422, -14.199405670166016, 11.509735107421875, -18.215972900390625, 13.769039154052734, 21.078445434570312, 29.877548217773438, 29.707426071166992, 10.548093795776367, 9.755224227905273, 12.715181350708008, 26.388507843017578, 42.865264892578125, -9.124038696289062, 8.211740493774414, -1.9408245086669922, 31.5784912109375, 7.920074462890625, -16.808181762695312, 18.390838623046875, -3.850555419921875, 12.345291137695312, 24.548677444458008, 10.448944091796875, 3.493560791015625, 39.109718322753906, -0.0214080810546875, 6.203470230102539, 33.998573303222656, 24.298126220703125, 2.522472381591797, 40.83714294433594, -25.34861946105957, 33.865230560302734, 21.126808166503906, 24.083786010742188, -5.4696807861328125, 40.322662353515625, -1.619882583618164, -7.3153076171875, 0.4253063201904297, 30.969703674316406, 1.34521484375, 18.756332397460938, 5.6245574951171875, 12.916580200195312, 17.15771484375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000355.npy"}
{"epoch": 0.5366591080876795, "step": 356, "batch_size": 64, "mean": 9.678832054138184, "std": 17.974163055419922, "min": -36.98518753051758, "p10": -11.15340366363525, "median": 8.409072875976562, "p90": 32.302255630493164, "max": 46.68231201171875, "pos_frac": 0.765625, "sample": [42.70606994628906, 32.11946105957031, 5.38446044921875, -26.44013214111328, 27.156579971313477, 2.2778472900390625, 4.475379943847656, 31.240249633789062, 15.474746704101562, 1.9450206756591797, -23.448326110839844, -33.35228729248047, 6.4930572509765625, -12.619529724121094, 33.21861267089844, 36.480369567871094, 4.727210998535156, 8.199787139892578, 32.64990997314453, 0.8976764678955078, 22.277435302734375, -36.98518753051758, 22.465316772460938, 10.554656982421875, 9.057426452636719, 28.524032592773438, -6.450756072998047, -7.1066131591796875, 8.618358612060547, 11.245994567871094, 7.6208343505859375, 3.214740753173828, -6.693788528442383, 1.1686687469482422, -16.206390380859375, 17.593421936035156, 20.19573974609375, -7.732442855834961, 16.308921813964844, 26.8189697265625, 7.554164886474609, 4.7512054443359375, 30.153423309326172, -0.5663909912109375, 11.137107849121094, 10.697637557983398, 32.38059616088867, 4.6156158447265625, -5.487249374389648, 14.658668518066406, -0.8573875427246094, 14.155157089233398, 37.20747375488281, 29.97052001953125, 0.5094451904296875, 15.67384147644043, 6.9192657470703125, 0.215057373046875, 30.374679565429688, -2.7236480712890625, -23.313961029052734, 46.68231201171875, 15.6275634765625, 25.03460693359375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000356.npy"}
{"epoch": 0.5381708238851096, "step": 357, "batch_size": 64, "mean": 8.651976585388184, "std": 15.039231300354004, "min": -26.42682647705078, "p10": -9.452446365356446, "median": 7.257089614868164, "p90": 29.53173904418946, "max": 42.581024169921875, "pos_frac": 0.765625, "sample": [34.102745056152344, -0.6341705322265625, 22.832672119140625, -4.468223571777344, -23.957626342773438, 40.54546356201172, 2.71044921875, 12.003658294677734, 5.6349334716796875, 5.7957305908203125, -6.97674560546875, 27.678489685058594, -6.272666931152344, 9.438858032226562, 42.581024169921875, 7.862335205078125, 31.763566970825195, 10.060134887695312, 20.845685958862305, 6.729846954345703, 15.380546569824219, 8.7935791015625, 8.359134674072266, 3.9628772735595703, 5.3524017333984375, 10.954620361328125, 2.9939632415771484, 0.7371158599853516, -9.531219482421875, 10.618885040283203, -15.015701293945312, 1.2573528289794922, -9.26864242553711, 10.017532348632812, 21.079689025878906, 34.21158218383789, 23.212387084960938, 18.405593872070312, 24.413070678710938, 35.21736526489258, -2.7152481079101562, 12.534507751464844, 4.595184326171875, -0.34067535400390625, 26.069549560546875, 8.371986389160156, 1.8129196166992188, -2.6023483276367188, -15.582561492919922, 7.784332275390625, 4.492191314697266, 20.621994018554688, 20.87908172607422, 3.024730682373047, 22.810928344726562, 0.6343765258789062, 2.386636734008789, 5.874122619628906, 30.32598876953125, -16.633098602294922, -14.537689208984375, -26.42682647705078, 20.2152099609375, 0.7028980255126953], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000357.npy"}
{"epoch": 0.5396825396825397, "step": 358, "batch_size": 64, "mean": 12.478787422180176, "std": 15.798869132995605, "min": -16.809101104736328, "p10": -4.982125091552733, "median": 10.400166511535645, "p90": 32.89564514160157, "max": 47.867069244384766, "pos_frac": 0.765625, "sample": [29.17596435546875, 13.221153259277344, 23.986812591552734, 31.59820556640625, -2.414682388305664, 29.064971923828125, 31.396160125732422, -5.509407043457031, 5.439723968505859, 5.7800750732421875, -5.6532135009765625, 15.111637115478516, 8.422645568847656, 17.419326782226562, 33.71842956542969, 4.51446533203125, -12.97271728515625, 16.994491577148438, 26.946914672851562, 11.045495986938477, 23.903961181640625, 2.011180877685547, -3.751800537109375, 27.69282341003418, 20.3971004486084, -3.6908493041992188, -0.33707427978515625, 1.3453903198242188, -2.6273651123046875, 3.0294837951660156, -13.66701889038086, 9.754837036132812, -2.90032958984375, 2.6878623962402344, -3.0345382690429688, 16.212890625, 12.55398178100586, 21.7486572265625, 4.694908142089844, -1.3755455017089844, 40.327728271484375, 33.451690673828125, 47.75341796875, 2.146595001220703, 45.103965759277344, 47.867069244384766, 24.230079650878906, -6.988887786865234, 4.536712646484375, 25.556106567382812, 3.1321945190429688, -16.13245391845703, -16.809101104736328, 26.01849365234375, 2.089569091796875, 20.144031524658203, 5.447681427001953, 9.283939361572266, 2.747528076171875, 17.073482513427734, 12.492650985717773, 24.165916442871094, 12.090465545654297, 40.97846221923828], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000358.npy"}
{"epoch": 0.5411942554799698, "step": 359, "batch_size": 64, "mean": 6.921560764312744, "std": 13.538834571838379, "min": -23.60821533203125, "p10": -9.194308471679687, "median": 6.715694427490234, "p90": 26.04004421234131, "max": 36.39397430419922, "pos_frac": 0.734375, "sample": [-7.619781494140625, 20.501792907714844, -4.453727722167969, 8.188995361328125, 9.906448364257812, 3.5107345581054688, -10.317401885986328, 5.233236312866211, -7.243099212646484, 15.132110595703125, 36.049407958984375, 7.831071853637695, 2.7689437866210938, 14.552017211914062, -4.139219284057617, 9.94720458984375, 25.17474937438965, -6.095729827880859, 7.861305236816406, -21.923498153686523, 2.7892532348632812, 3.0627212524414062, 0.363555908203125, -13.915664672851562, 6.74163818359375, 26.410884857177734, 8.604694366455078, 9.935001373291016, 10.287628173828125, -19.429534912109375, 29.097412109375, 35.14666748046875, -11.732734680175781, 24.82378387451172, 5.15185546875, 2.9233856201171875, 9.464393615722656, 12.324359893798828, 6.2937164306640625, 18.07251739501953, -7.6665191650390625, 20.015581130981445, 27.552934646606445, 21.702667236328125, 6.689750671386719, 1.67950439453125, -3.6660842895507812, -23.60821533203125, -3.3492774963378906, 10.223358154296875, 4.021125793457031, -4.921968460083008, -9.849075317382812, -1.7371845245361328, 19.17742919921875, 0.2778167724609375, 19.945819854736328, 8.555902481079102, 7.714935302734375, 26.697677612304688, 14.356693267822266, 36.39397430419922, 0.0707855224609375, 1.421173095703125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000359.npy"}
{"epoch": 0.5427059712773998, "step": 360, "batch_size": 64, "mean": 9.974839210510254, "std": 14.58853816986084, "min": -13.78564453125, "p10": -6.361642456054687, "median": 6.802595138549805, "p90": 30.92546653747559, "max": 51.19507598876953, "pos_frac": 0.765625, "sample": [5.246673583984375, 49.11133575439453, -5.2390899658203125, 5.211555480957031, 5.744289398193359, 20.214263916015625, -5.653095245361328, 7.210147857666016, 0.7269439697265625, 17.122528076171875, -2.6377735137939453, -6.663116455078125, 0.7517929077148438, -9.538867950439453, 25.619747161865234, -5.658203125, 35.933013916015625, -10.57900619506836, 7.555856704711914, 23.537826538085938, 8.524044036865234, 2.5285511016845703, -13.78564453125, 1.492095947265625, 5.008544921875, 4.0069122314453125, 23.586502075195312, 11.246627807617188, 10.751873016357422, 10.827896118164062, 11.0125732421875, 6.861789703369141, 11.50040054321289, -8.73590087890625, 5.178375244140625, 15.608268737792969, -9.175933837890625, 12.254829406738281, 24.752769470214844, 3.4966583251953125, 51.19507598876953, 2.2168426513671875, 17.322921752929688, -3.5620784759521484, 32.557525634765625, 14.629753112792969, 31.541637420654297, 6.743400573730469, 3.687183380126953, 47.58500289916992, 30.191017150878906, 17.007919311523438, 31.240230560302734, -0.27715110778808594, -0.8437404632568359, 11.554061889648438, 1.4575176239013672, -12.880428314208984, 2.981721878051758, 23.243438720703125, 5.0736846923828125, -0.573974609375, 12.607894897460938, 18.732208251953125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000360.npy"}
{"epoch": 0.54421768707483, "step": 361, "batch_size": 64, "mean": 12.034027099609375, "std": 14.22981071472168, "min": -26.343154907226562, "p10": -3.078435134887695, "median": 11.848313331604004, "p90": 27.26442108154297, "max": 52.54810333251953, "pos_frac": 0.828125, "sample": [11.617225646972656, 18.271900177001953, 0.9441909790039062, -11.332828521728516, 31.138046264648438, 10.490203857421875, -10.81528091430664, -13.095420837402344, 7.764011383056641, 8.246883392333984, 9.157028198242188, -13.878658294677734, -0.8449325561523438, 25.108081817626953, 24.515098571777344, 19.389911651611328, -2.8424644470214844, 3.2296066284179688, 4.151969909667969, 4.60650634765625, 14.39666748046875, 21.410491943359375, 11.187232971191406, 30.795242309570312, 17.717926025390625, 14.986038208007812, 17.51766586303711, -26.343154907226562, 32.73827362060547, 12.283672332763672, 18.847129821777344, 1.2433090209960938, 23.085750579833984, 52.42671203613281, 27.093971252441406, 18.768951416015625, 25.497791290283203, 12.799911499023438, -3.1795654296875, -2.8400039672851562, -0.5997714996337891, 2.0222320556640625, 26.80959701538086, 22.380573272705078, 18.25577163696289, 27.33747100830078, 16.412904739379883, 3.7340011596679688, -6.4603729248046875, 13.099845886230469, 24.07172393798828, 7.1502838134765625, 14.707771301269531, 9.359691619873047, 52.54810333251953, 6.866613388061523, 32.072669982910156, 9.507621765136719, 1.6391754150390625, 14.745061874389648, 7.08306884765625, 12.079401016235352, 2.0431900024414062, 7.0560302734375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000361.npy"}
{"epoch": 0.54572940287226, "step": 362, "batch_size": 64, "mean": 11.748510360717773, "std": 14.721344947814941, "min": -18.969192504882812, "p10": -3.975801467895507, "median": 10.00875473022461, "p90": 30.326184463500983, "max": 62.8763427734375, "pos_frac": 0.78125, "sample": [-2.7055892944335938, 12.261465072631836, 2.306976318359375, 0.8156986236572266, 7.306549072265625, -18.969192504882812, 21.04596710205078, 2.861846923828125, 62.8763427734375, 16.653411865234375, 1.1672286987304688, -4.288791656494141, 14.092239379882812, 3.4091720581054688, -0.6092014312744141, -1.5219612121582031, 11.071113586425781, 19.6751708984375, 9.224163055419922, 25.847503662109375, -1.007772445678711, 5.3420257568359375, 7.024175643920898, 8.85223388671875, 27.890625, 8.212852478027344, 28.831279754638672, 4.5925750732421875, -0.23974609375, 19.174278259277344, 43.429168701171875, 4.129005432128906, -0.113250732421875, 20.104324340820312, 3.9407806396484375, 25.09881591796875, 15.675033569335938, 0.6786651611328125, 10.527961730957031, -3.2454910278320312, 2.900949478149414, 38.98027801513672, 28.477325439453125, 31.043399810791016, 12.136772155761719, 7.270263671875, 22.94580078125, 32.370635986328125, 21.0607852935791, 16.100357055664062, -8.04423713684082, 12.986968994140625, 19.147048950195312, -11.985549926757812, 9.489547729492188, 11.494930267333984, 30.96685791015625, -7.98089599609375, -5.983028411865234, 16.142471313476562, 38.868675231933594, -12.656181335449219, 23.16070556640625, 11.593132019042969], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000362.npy"}
{"epoch": 0.54724111866969, "step": 363, "batch_size": 64, "mean": 11.839499473571777, "std": 14.310311317443848, "min": -19.961585998535156, "p10": -5.844679450988769, "median": 8.910959243774414, "p90": 33.04799461364746, "max": 49.12823486328125, "pos_frac": 0.796875, "sample": [19.670711517333984, -13.685600280761719, 36.375526428222656, 8.619117736816406, 8.166351318359375, 8.817569732666016, 5.178443908691406, 36.285552978515625, 3.7834014892578125, -7.706409454345703, -7.9631500244140625, 21.339599609375, 49.12823486328125, 28.432998657226562, 2.2768325805664062, 37.77733612060547, 5.688226699829102, 5.674293518066406, 9.673133850097656, 27.384254455566406, 3.011859893798828, 12.036628723144531, 14.182016372680664, 35.76802062988281, 29.314743041992188, 13.43577766418457, 5.4639892578125, 7.338287353515625, -3.3202896118164062, 4.328174591064453, -5.628667831420898, -19.961585998535156, 14.046600341796875, -10.346981048583984, 32.18662643432617, 14.558847427368164, 4.523199081420898, 17.441368103027344, 8.028387069702148, -0.4918994903564453, 5.388912200927734, 5.428035736083984, 16.868797302246094, 25.591567993164062, 11.408075332641602, 36.695655822753906, 23.99199676513672, 9.004348754882812, 24.041410446166992, -1.7761383056640625, 33.417152404785156, 3.2617645263671875, 13.612314224243164, 15.350423812866211, -8.760671615600586, -5.937255859375, 25.4073486328125, -2.0639610290527344, 30.07916259765625, 7.687839508056641, -1.4163570404052734, 6.377006530761719, 13.063751220703125, 10.175312042236328], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000363.npy"}
{"epoch": 0.5487528344671202, "step": 364, "batch_size": 64, "mean": 12.194845199584961, "std": 17.377119064331055, "min": -32.03185272216797, "p10": -5.117801666259765, "median": 10.13686752319336, "p90": 34.9759204864502, "max": 51.97840881347656, "pos_frac": 0.734375, "sample": [35.05987548828125, -0.7705154418945312, 1.1013565063476562, 46.0184326171875, 37.75104522705078, 2.4887008666992188, 14.966987609863281, 17.45116424560547, -4.6174163818359375, 31.333145141601562, 51.97840881347656, 12.450881958007812, -4.267486572265625, -7.84173583984375, 26.650604248046875, 42.02489471435547, 9.8785400390625, -1.0854969024658203, 48.70072937011719, -3.6142654418945312, 33.608123779296875, 8.173599243164062, 14.980842590332031, 22.63177490234375, -23.285079956054688, -13.130208969116211, 40.56504821777344, -2.6110897064208984, 0.566986083984375, 16.064430236816406, 26.1424503326416, 19.687355041503906, -0.7192535400390625, 10.395195007324219, 7.372444152832031, 2.7145156860351562, 22.779422760009766, 22.58843994140625, -3.139862060546875, 2.5457820892333984, 14.31326675415039, 16.14289093017578, -2.0947952270507812, 4.1224212646484375, 31.375396728515625, 3.400440216064453, 3.244579315185547, 17.034805297851562, 3.10748291015625, 34.780025482177734, -5.332252502441406, 23.352903366088867, -17.84181022644043, 22.24706268310547, -32.03185272216797, -0.8006439208984375, 19.187217712402344, 18.128372192382812, 2.0191612243652344, 8.821098327636719, 6.785602569580078, -5.702232360839844, 34.43254089355469, 18.18962860107422], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000364.npy"}
{"epoch": 0.5502645502645502, "step": 365, "batch_size": 64, "mean": 11.4912109375, "std": 12.582695960998535, "min": -19.1407470703125, "p10": -1.9064281463623045, "median": 10.29754638671875, "p90": 27.476770782470705, "max": 42.189754486083984, "pos_frac": 0.875, "sample": [0.8444900512695312, 12.822122573852539, 11.55914306640625, 7.679534912109375, 10.186134338378906, 5.132852554321289, 18.515228271484375, 19.400781631469727, 27.583999633789062, 18.63433837890625, 15.130142211914062, 11.030509948730469, -19.1407470703125, 7.725410461425781, 10.408958435058594, 18.33203125, -1.7405338287353516, 9.075294494628906, 0.0523529052734375, 0.45403099060058594, 16.299575805664062, 6.978263854980469, 15.49072265625, 3.5277538299560547, 0.2917442321777344, -11.162345886230469, 7.905853271484375, 41.56712341308594, 5.212394714355469, 22.91323471069336, 13.336532592773438, 7.048986434936523, 18.71053695678711, 7.235912322998047, 27.8883056640625, 12.527111053466797, 30.459197998046875, -1.9775257110595703, 3.3509864807128906, 11.87542724609375, 26.349531173706055, 20.0181884765625, 20.56781768798828, 4.5222015380859375, 16.319580078125, 41.04090881347656, 5.20208740234375, -8.137287139892578, 41.44659423828125, 7.9553680419921875, 27.22657012939453, -3.607105255126953, 16.488304138183594, 6.3726806640625, 3.4232959747314453, 15.322410583496094, 0.17755126953125, -12.56668472290039, 9.187019348144531, 2.787200927734375, 42.189754486083984, 19.846961975097656, -2.905731201171875, 13.044418334960938], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000365.npy"}
{"epoch": 0.5517762660619804, "step": 366, "batch_size": 64, "mean": 9.330782890319824, "std": 15.334517478942871, "min": -18.645164489746094, "p10": -10.019235801696775, "median": 8.43372917175293, "p90": 29.767144012451173, "max": 44.824432373046875, "pos_frac": 0.703125, "sample": [-1.1284198760986328, 24.446624755859375, 23.071731567382812, 4.347858428955078, -5.6171417236328125, 27.788244247436523, -14.463623046875, -6.851762771606445, 3.20916748046875, -11.376724243164062, -2.104156494140625, 18.337509155273438, 10.37655258178711, 15.909122467041016, 18.535430908203125, 8.403961181640625, -18.168060302734375, 13.2020263671875, 3.7217063903808594, 13.457191467285156, 10.442901611328125, 8.463497161865234, 1.4971084594726562, -2.4708328247070312, 4.838918685913086, 13.578033447265625, 11.746932983398438, 31.42593765258789, -18.645164489746094, -1.8809223175048828, 44.824432373046875, -3.4816970825195312, -5.441385269165039, 32.06846618652344, -17.594942092895508, 1.653615951538086, 0.9265708923339844, -13.904510498046875, 9.516399383544922, 8.26826286315918, 3.374837875366211, -4.429594039916992, 4.507862091064453, 13.88397216796875, -3.8839874267578125, -0.12892913818359375, 1.1648540496826172, 29.985275268554688, 41.39415740966797, 7.98211669921875, 10.895111083984375, 25.28512954711914, 19.373886108398438, -18.571796417236328, 20.645654678344727, -2.360231399536133, 42.724830627441406, 21.427032470703125, 33.192291259765625, 24.930709838867188, 29.25817108154297, 19.37291717529297, 25.054214477539062, 11.162757873535156], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000366.npy"}
{"epoch": 0.5532879818594104, "step": 367, "batch_size": 64, "mean": 8.905525207519531, "std": 12.487512588500977, "min": -28.534465789794922, "p10": -5.440667724609374, "median": 8.585445404052734, "p90": 25.162942504882814, "max": 37.28184509277344, "pos_frac": 0.796875, "sample": [3.9308624267578125, 31.61370849609375, 10.567577362060547, -1.5150985717773438, 12.225284576416016, 0.4456939697265625, 27.164443969726562, 26.453060150146484, 8.223106384277344, 1.8256607055664062, -1.3890151977539062, 3.7331085205078125, 5.654937744140625, 5.340484619140625, 37.28184509277344, 8.183990478515625, 13.750862121582031, 0.13806915283203125, -6.612890243530273, 18.821338653564453, 5.525421142578125, 11.3082275390625, -2.262350082397461, -9.529621124267578, 16.74072265625, -28.534465789794922, -10.406639099121094, 22.968994140625, 8.969429016113281, 16.19598388671875, 24.063232421875, 14.969575881958008, 6.678922653198242, 17.428863525390625, 9.00954818725586, 4.578695297241211, 8.947784423828125, -4.317352294921875, 9.242874145507812, 25.25904083251953, 3.3929367065429688, -22.818805694580078, 4.697420120239258, 29.155540466308594, 13.000770568847656, -5.922088623046875, 13.355422973632812, 8.044940948486328, 15.046592712402344, 0.8291969299316406, 15.678070068359375, 4.722513198852539, -10.813705444335938, 20.558731079101562, 24.93871307373047, 23.07929229736328, 3.909332275390625, -3.230854034423828, 13.55902099609375, 29.799156188964844, -0.165313720703125, 9.543642044067383, 4.7079010009765625, 22.211267471313477], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000367.npy"}
{"epoch": 0.5547996976568406, "step": 368, "batch_size": 64, "mean": 8.295562744140625, "std": 13.28518009185791, "min": -20.456024169921875, "p10": -9.896392059326171, "median": 9.356555938720703, "p90": 26.657438278198242, "max": 38.819366455078125, "pos_frac": 0.71875, "sample": [-20.456024169921875, 12.111885070800781, -5.142730712890625, 0.8286342620849609, 7.078521728515625, 19.151771545410156, 21.197240829467773, 12.487319946289062, -1.7396163940429688, -6.131050109863281, 24.881668090820312, 1.8502540588378906, 7.8398284912109375, 12.33856201171875, 31.42632293701172, 12.100639343261719, -2.3005447387695312, -10.098709106445312, -9.776458740234375, 23.54074478149414, 5.291358947753906, 11.82493782043457, 14.622909545898438, -10.251953125, 28.403406143188477, 7.800983428955078, -11.29205322265625, 16.110244750976562, 38.819366455078125, 19.61545181274414, -1.852081298828125, 7.60809326171875, 26.76375961303711, -0.3314990997314453, 1.0457725524902344, 14.506954193115234, 1.7454681396484375, 10.870719909667969, 4.511283874511719, 14.031257629394531, 1.8977775573730469, 0.5693416595458984, 26.40935516357422, 17.887924194335938, -15.376815795898438, -8.345130920410156, 10.999900817871094, -2.5056838989257812, 27.050498962402344, -9.947792053222656, 29.610916137695312, 29.337017059326172, 19.130035400390625, 17.89270782470703, 1.9968452453613281, 14.14996337890625, 13.596305847167969, -5.951446533203125, -9.098672866821289, 13.614547729492188, 18.241653442382812, 17.182336807250977, -16.300552368164062, 7.8423919677734375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000368.npy"}
{"epoch": 0.5563114134542706, "step": 369, "batch_size": 64, "mean": 8.570233345031738, "std": 13.124995231628418, "min": -19.175466537475586, "p10": -8.212321281433104, "median": 8.336705207824707, "p90": 24.7231990814209, "max": 37.86209487915039, "pos_frac": 0.75, "sample": [21.463096618652344, -15.3857421875, 10.551969528198242, 21.4027099609375, -19.175466537475586, 28.672210693359375, 22.99356460571289, 4.405662536621094, 14.722808837890625, 10.580547332763672, 6.3113861083984375, 21.108642578125, 23.062881469726562, 26.916397094726562, 25.45038414001465, 2.877704620361328, 9.480010986328125, 9.471527099609375, 10.395538330078125, 5.836589813232422, 11.780105590820312, -17.042322158813477, 3.7436141967773438, -16.252302169799805, 7.356803894042969, 22.264495849609375, -3.0020904541015625, 12.839599609375, 14.22079086303711, -5.98808479309082, 21.465606689453125, 0.6547470092773438, 9.463462829589844, 19.470497131347656, 16.127086639404297, 24.91666030883789, 34.112335205078125, -3.928192138671875, -2.338104248046875, -12.099601745605469, 1.7981414794921875, 6.167243957519531, -2.725933074951172, 7.601293563842773, 1.4727249145507812, 3.2510910034179688, 34.9935302734375, 9.07211685180664, 24.27178955078125, -0.8478546142578125, -0.224365234375, 4.305244445800781, -0.05655670166015625, 22.156600952148438, 0.7510147094726562, -5.155656814575195, 3.890230178833008, 37.86209487915039, 4.393669128417969, 12.178153991699219, -9.165565490722656, 11.57391357421875, -14.367195129394531, 16.391687393188477], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000369.npy"}
{"epoch": 0.5578231292517006, "step": 370, "batch_size": 64, "mean": 11.482163429260254, "std": 16.573627471923828, "min": -34.052616119384766, "p10": -5.831216812133789, "median": 7.985856056213379, "p90": 34.43070526123047, "max": 45.266441345214844, "pos_frac": 0.78125, "sample": [7.913789749145508, 35.578758239746094, 17.19074058532715, 5.305992126464844, 34.49127197265625, 3.42242431640625, 43.31617736816406, 6.1486358642578125, 45.266441345214844, 3.0968894958496094, 2.1280288696289062, 2.7118911743164062, -1.1433563232421875, 23.0645751953125, 44.37101745605469, 17.893306732177734, 7.4283599853515625, -10.113479614257812, -12.415962219238281, 21.7823486328125, 13.746017456054688, 16.849166870117188, 5.1769561767578125, 18.4603271484375, 23.369991302490234, 13.24761962890625, 26.39476776123047, 1.6451148986816406, 6.0757293701171875, -4.6187896728515625, -1.8446998596191406, 0.7300949096679688, -2.28265380859375, -15.025665283203125, -2.5615005493164062, 29.650962829589844, 35.78765869140625, 16.70636749267578, 36.94142532348633, 23.15500259399414, 34.28938293457031, -28.9837646484375, 16.118791580200195, 15.397323608398438, 3.0942440032958984, 29.389678955078125, 14.349861145019531, 21.716421127319336, 23.11240577697754, 26.214088439941406, -34.052616119384766, 7.452999114990234, 28.793930053710938, -4.436983108520508, -5.919639587402344, 7.092720031738281, 8.05792236328125, 0.40032958984375, -7.595762252807617, 1.1541023254394531, 10.437385559082031, 30.834060668945312, -5.624897003173828, 4.52473258972168], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000370.npy"}
{"epoch": 0.5593348450491308, "step": 371, "batch_size": 64, "mean": 10.79281997680664, "std": 15.216870307922363, "min": -20.415491104125977, "p10": -8.949004745483398, "median": 10.52782917022705, "p90": 27.961364173889166, "max": 45.78631591796875, "pos_frac": 0.703125, "sample": [20.269515991210938, -20.415491104125977, 40.18431854248047, 9.527557373046875, 25.488086700439453, -11.05120849609375, 22.89139175415039, -0.8005886077880859, -3.0863037109375, 3.755809783935547, 2.8978118896484375, 19.375492095947266, -8.89969253540039, 43.142364501953125, 3.7330188751220703, 31.41930389404297, -0.14403915405273438, 28.70741081237793, 10.348764419555664, 41.853458404541016, 11.180328369140625, -5.973014831542969, -2.105745315551758, 5.802909851074219, 17.903831481933594, 25.662982940673828, 17.4147891998291, 45.78631591796875, 23.013710021972656, 21.307331085205078, 24.049652099609375, 19.52886199951172, -11.828262329101562, -1.9289321899414062, 12.670289993286133, 10.706893920898438, -2.9055938720703125, -9.727340698242188, -8.970138549804688, 33.937530517578125, -8.881420135498047, 18.711421966552734, 0.9555950164794922, 8.73398208618164, 19.431900024414062, -7.682182312011719, 8.949968338012695, 14.82098388671875, -2.9366226196289062, 7.663963317871094, 12.890901565551758, -10.688278198242188, 1.4505443572998047, 17.45819091796875, 25.979888916015625, 0.5301418304443359, 15.189155578613281, 8.252655029296875, 26.22058868408203, 16.19482421875, 25.260948181152344, 23.14905548095703, -11.014739990234375, -4.624431610107422], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000371.npy"}
{"epoch": 0.5608465608465608, "step": 372, "batch_size": 64, "mean": 6.941670894622803, "std": 16.330583572387695, "min": -26.40398406982422, "p10": -12.36216926574707, "median": 7.944232940673828, "p90": 25.874143600463867, "max": 45.919830322265625, "pos_frac": 0.734375, "sample": [-12.513031005859375, 11.41982650756836, 16.133834838867188, 10.292160034179688, 14.851264953613281, 10.468223571777344, 32.26385498046875, -5.020027160644531, 24.028152465820312, 18.71862030029297, 33.7103271484375, 0.09027481079101562, 12.841773986816406, 44.693397521972656, 13.350067138671875, 3.3376007080078125, 15.341636657714844, -25.521984100341797, 3.8135452270507812, 20.000137329101562, 2.159149169921875, 5.831842422485352, 9.710723876953125, 25.561466217041016, 2.281524658203125, -2.4590911865234375, -26.226486206054688, 4.3583831787109375, 45.919830322265625, -9.35881233215332, 18.33037567138672, 16.56102752685547, 2.873310089111328, -7.363460540771484, -12.01015853881836, 11.545965194702148, -23.71697235107422, 8.47089958190918, 22.020843505859375, 13.153097152709961, 26.008148193359375, 11.271114349365234, 3.903656005859375, 19.8455810546875, -8.754226684570312, 10.81352424621582, -24.096527099609375, 0.177276611328125, 4.207159042358398, 7.715827941894531, 2.972076416015625, -26.40398406982422, 13.654769897460938, 2.9726505279541016, -5.103130340576172, -9.728788375854492, -18.513439178466797, 9.411308288574219, 29.1087703704834, 43.670108795166016, -5.759181976318359, -6.51470947265625, 5.2931976318359375, 8.172637939453125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000372.npy"}
{"epoch": 0.562358276643991, "step": 373, "batch_size": 64, "mean": 9.437311172485352, "std": 16.31985855102539, "min": -20.401626586914062, "p10": -12.069407653808593, "median": 7.304495811462402, "p90": 27.705449295043948, "max": 56.16065979003906, "pos_frac": 0.734375, "sample": [-17.313072204589844, 22.188705444335938, 30.660003662109375, 11.946990966796875, 21.797317504882812, -11.694862365722656, -2.296527862548828, 25.128681182861328, 12.194099426269531, 7.249809265136719, 16.62504768371582, -18.322837829589844, 17.77362060546875, -1.9622344970703125, -16.189851760864258, -6.434440612792969, 7.359182357788086, 9.1424560546875, -20.401626586914062, -5.0391845703125, 14.147518157958984, 0.9361572265625, 17.00627899169922, -10.285484313964844, 18.01788330078125, 24.680816650390625, 2.6167144775390625, 6.112476348876953, 20.08887481689453, 11.623006820678711, 42.13006591796875, 2.773468017578125, 2.042217254638672, 25.49542236328125, -12.229927062988281, 1.1359062194824219, -0.701019287109375, 27.222076416015625, -0.17455291748046875, 27.912609100341797, 14.302112579345703, -17.228179931640625, -9.80877685546875, -0.3664741516113281, 1.5126876831054688, 16.438129425048828, -16.909513473510742, 32.281497955322266, 7.130332946777344, 6.92218017578125, 0.8985671997070312, 37.91456604003906, 4.103679656982422, 56.16065979003906, 8.897684097290039, 22.286418914794922, 14.901981353759766, 20.291580200195312, 6.631717681884766, 48.806060791015625, 7.455835342407227, 6.613101959228516, 4.97186279296875, 26.818458557128906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000373.npy"}
|
||||
{"epoch": 0.563869992441421, "step": 374, "batch_size": 64, "mean": 12.644262313842773, "std": 12.791444778442383, "min": -15.209869384765625, "p10": -1.5773223876953124, "median": 13.289407730102539, "p90": 29.930352401733405, "max": 39.89484405517578, "pos_frac": 0.828125, "sample": [36.26934814453125, 18.125991821289062, 15.762763977050781, -0.7090187072753906, 6.900199890136719, 23.912731170654297, 30.48782730102539, 22.32190704345703, 35.176116943359375, 13.584047317504883, 21.011119842529297, 0.6176910400390625, 19.80284881591797, 25.54736328125, 12.501258850097656, -1.3924102783203125, 27.039146423339844, 8.774284362792969, 2.3668060302734375, 10.942909240722656, 13.364448547363281, 14.851303100585938, 4.852474212646484, -5.5780181884765625, 10.62628173828125, 13.258079528808594, 5.212028503417969, 36.63134002685547, -0.18382644653320312, 16.40502166748047, 5.780364990234375, 17.32172393798828, 4.674591064453125, 0.9700679779052734, 11.477127075195312, 20.776416778564453, 18.48187255859375, -10.33770751953125, 28.62957763671875, 17.226848602294922, 14.27878189086914, -9.594701766967773, -1.6565704345703125, 37.084442138671875, 31.797515869140625, 3.2283401489257812, -12.832794189453125, 8.494522094726562, 9.885313034057617, 14.632102966308594, 39.89484405517578, 26.525726318359375, 22.442031860351562, 9.601921081542969, 13.03145980834961, 2.3611087799072266, 18.146503448486328, 16.314743041992188, 13.320735931396484, 21.793258666992188, -9.816030502319336, -1.2369384765625, 3.2633438110351562, -15.209869384765625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000374.npy"}
|
||||
{"epoch": 0.5653817082388511, "step": 375, "batch_size": 64, "mean": 7.614285469055176, "std": 13.226885795593262, "min": -20.850984573364258, "p10": -7.805801773071287, "median": 6.538882255554199, "p90": 24.76507034301758, "max": 41.33148193359375, "pos_frac": 0.71875, "sample": [-11.142219543457031, -12.251815795898438, 3.4924468994140625, 9.35470199584961, 12.14521598815918, 5.2389984130859375, 31.4019775390625, 3.9385032653808594, -2.9569854736328125, 10.720422744750977, -5.1103363037109375, 20.337966918945312, -4.6487274169921875, 35.574928283691406, 6.575841903686523, -5.567211151123047, 9.488836288452148, -3.7342376708984375, 10.77047348022461, -4.466022491455078, 7.783649444580078, 23.30168914794922, 25.740737915039062, -9.1451416015625, 10.062171936035156, 25.11566925048828, 0.004795074462890625, -5.3862457275390625, 5.6965484619140625, 18.733722686767578, 1.8278656005859375, -1.5466499328613281, 0.48810577392578125, -12.689170837402344, 3.20916748046875, 9.999998092651367, 10.337966918945312, 6.501922607421875, 41.33148193359375, 4.993827819824219, 17.697052001953125, 14.750240325927734, 21.800575256347656, -20.850984573364258, 7.62017822265625, 1.480072021484375, 29.286895751953125, 4.009391784667969, 31.981903076171875, 10.667316436767578, 18.881752014160156, 23.947006225585938, -5.054294586181641, 21.13774871826172, 4.481071472167969, 6.340339660644531, 9.444114685058594, -19.08749008178711, -3.9301223754882812, 19.900062561035156, -4.578269958496094, 23.446699142456055, 7.183349609375, -8.76519775390625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000375.npy"}
|
||||
{"epoch": 0.5668934240362812, "step": 376, "batch_size": 64, "mean": 13.454113006591797, "std": 16.284040451049805, "min": -19.571693420410156, "p10": -8.81042022705078, "median": 12.709505081176758, "p90": 36.40951766967773, "max": 57.886016845703125, "pos_frac": 0.796875, "sample": [12.837684631347656, 6.866798400878906, 4.104433059692383, 9.854560852050781, 6.181648254394531, -8.777164459228516, -3.2831878662109375, 29.800048828125, 3.6472549438476562, 10.98897933959961, 17.55255126953125, 22.579544067382812, -15.154415130615234, -1.1414947509765625, -12.250808715820312, 3.4773178100585938, 38.42060089111328, 19.53701400756836, 12.58132553100586, 57.886016845703125, 13.655418395996094, 33.06178283691406, 3.0667343139648438, 23.130775451660156, 36.83221435546875, -1.9967727661132812, 32.19956970214844, 34.452293395996094, 19.10254669189453, 19.63617706298828, 26.272117614746094, 11.103038787841797, 40.93581771850586, 20.948379516601562, 10.310688018798828, 27.034042358398438, 24.029815673828125, -10.516395568847656, 38.466773986816406, -12.875526428222656, 6.9741973876953125, -8.82467269897461, 27.217761993408203, 14.837345123291016, 12.968378067016602, 36.47589111328125, -2.8457775115966797, 15.240928649902344, -1.8435745239257812, 29.654207229614258, -19.571693420410156, 15.037734985351562, 5.139259338378906, 8.554927825927734, 36.25464630126953, 3.314687728881836, 12.904556274414062, 4.850139617919922, 40.40502166748047, 20.72088623046875, 4.057701110839844, 6.652044296264648, 1.9714775085449219, -13.641090393066406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000376.npy"}
|
||||
{"epoch": 0.5684051398337112, "step": 377, "batch_size": 64, "mean": 9.340099334716797, "std": 13.017889976501465, "min": -17.012954711914062, "p10": -6.395375442504882, "median": 11.025805473327637, "p90": 24.07273807525635, "max": 38.67316436767578, "pos_frac": 0.71875, "sample": [15.311908721923828, -8.021064758300781, 20.003435134887695, 17.019323348999023, 9.128860473632812, -3.6198577880859375, -15.587348937988281, -10.897216796875, 8.326866149902344, 19.62970733642578, 33.132781982421875, -1.46563720703125, 28.881954193115234, 2.0210189819335938, 17.420692443847656, 18.080238342285156, -4.551048278808594, 17.22625732421875, -9.299581527709961, 4.3366241455078125, 23.307239532470703, -0.888153076171875, 20.803546905517578, 12.115234375, 11.016851425170898, -17.012954711914062, 11.034759521484375, 34.36952209472656, 5.937557220458984, 4.762937545776367, 19.501739501953125, -5.037654876708984, 14.961458206176758, 2.359956741333008, 3.5528106689453125, -6.111459732055664, 14.2982177734375, 20.733154296875, 17.10124969482422, 14.967987060546875, -3.2103538513183594, 32.052181243896484, 6.632837295532227, 13.042167663574219, 11.216705322265625, 31.857009887695312, 12.750797271728516, -5.476131439208984, -6.018363952636719, 10.043737411499023, 18.045684814453125, 1.9172744750976562, 2.499307632446289, 38.67316436767578, -2.6271896362304688, -6.517053604125977, 24.22884178161621, 14.123844146728516, -16.106687545776367, 5.068763732910156, 17.845022201538086, -0.6407470703125, 23.70849609375, 15.80514144897461], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000377.npy"}
|
||||
{"epoch": 0.5699168556311414, "step": 378, "batch_size": 64, "mean": 8.184558868408203, "std": 14.295930862426758, "min": -27.194446563720703, "p10": -5.470322608947752, "median": 5.6744537353515625, "p90": 27.561573410034192, "max": 49.810546875, "pos_frac": 0.75, "sample": [14.863960266113281, 5.721809387207031, 9.152624130249023, 5.222187042236328, 20.10053253173828, 17.55622100830078, -9.665725708007812, -1.006256103515625, 9.477447509765625, 28.879661560058594, -27.194446563720703, 5.627098083496094, 40.66154479980469, 1.8436603546142578, 16.327774047851562, 10.982963562011719, 2.7020606994628906, 7.862949371337891, 39.52814483642578, -1.2759590148925781, 22.012489318847656, 8.953857421875, -1.6131401062011719, 11.223297119140625, 24.486034393310547, 6.1614227294921875, 3.4404220581054688, 18.9437255859375, -15.590866088867188, 1.690460205078125, 16.47027587890625, 3.972412109375, -10.792411804199219, 1.6462345123291016, 22.446258544921875, 0.9905891418457031, 0.2314605712890625, 0.48975372314453125, 20.719100952148438, -6.094274520874023, 49.810546875, 5.783843994140625, 5.9770050048828125, 29.489459991455078, 4.8311767578125, 13.275543212890625, -1.524343490600586, 36.159793853759766, 12.903297424316406, 1.2662601470947266, 4.134044647216797, -4.014434814453125, 3.3384532928466797, -14.546897888183594, -1.2240447998046875, 21.79834747314453, -2.481344223022461, 8.9619140625, 3.2571048736572266, 9.963432312011719, -3.7725677490234375, -19.274288177490234, -0.4085674285888672, 32.95262908935547], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000378.npy"}
|
||||
{"epoch": 0.5714285714285714, "step": 379, "batch_size": 64, "mean": 10.972753524780273, "std": 14.245290756225586, "min": -17.810283660888672, "p10": -2.521435546875, "median": 8.405517578125, "p90": 28.198797988891602, "max": 50.00982666015625, "pos_frac": 0.8125, "sample": [13.431135177612305, 2.019134521484375, 28.159648895263672, -2.4497222900390625, 8.661666870117188, 7.9839019775390625, 17.06251335144043, -12.919855117797852, 5.700435638427734, 8.65380859375, 0.32453155517578125, 1.209808349609375, -10.2254638671875, 19.04864501953125, 2.0855445861816406, 50.00982666015625, 17.48236083984375, -2.2496910095214844, 21.534454345703125, 22.52429962158203, 37.79815673828125, 28.215576171875, 29.092655181884766, 18.96709442138672, 1.56414794921875, -14.41790771484375, 17.068302154541016, -4.7943878173828125, 17.850486755371094, -17.26617431640625, 40.434349060058594, 14.026968002319336, 7.545770645141602, 6.431888580322266, -2.5521697998046875, 20.007354736328125, 27.750486373901367, 4.56256103515625, 6.778221130371094, -17.810283660888672, -2.447315216064453, 13.050605773925781, 4.939430236816406, 33.491058349609375, 21.405799865722656, 3.9119796752929688, 42.04689025878906, 21.639068603515625, 1.3243865966796875, 4.624114990234375, 27.456710815429688, -1.8979339599609375, -1.3577880859375, 16.078054428100586, 24.28508186340332, 1.31475830078125, 8.1572265625, 15.754268646240234, 0.3277626037597656, 1.278350830078125, 12.355377197265625, 19.469528198242188, 12.846221923828125, 2.9024505615234375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000379.npy"}
|
||||
{"epoch": 0.5729402872260015, "step": 380, "batch_size": 64, "mean": 12.33914566040039, "std": 14.503582000732422, "min": -23.409210205078125, "p10": -7.550890350341795, "median": 12.455387115478516, "p90": 30.01914005279541, "max": 43.9302978515625, "pos_frac": 0.796875, "sample": [8.626005172729492, 28.880638122558594, 7.1388397216796875, 7.957191467285156, -8.456253051757812, -11.245498657226562, 24.215940475463867, 26.00390625, -9.011920928955078, 24.548141479492188, 14.408157348632812, 0.8312759399414062, -1.6616287231445312, 21.169071197509766, 5.142917633056641, -8.739730834960938, 1.6138038635253906, 9.51959228515625, 30.042770385742188, 29.96400260925293, 17.95367431640625, 10.519119262695312, -5.438377380371094, -23.409210205078125, 30.77220916748047, 4.714878082275391, 14.459095001220703, 21.662200927734375, -1.6435375213623047, 8.867223739624023, 20.819961547851562, 25.110504150390625, 12.786117553710938, 8.780380249023438, 37.24421691894531, 5.132717132568359, 32.850189208984375, 15.588623046875, -0.159515380859375, 7.538047790527344, 25.738845825195312, -22.759185791015625, 30.97870445251465, 43.9302978515625, 15.056135177612305, 15.67204475402832, 17.087024688720703, 12.124656677246094, -0.9140625, 25.25987434387207, 3.1727981567382812, 2.8958606719970703, -9.759353637695312, 42.51914978027344, 28.855911254882812, 19.602066040039062, 4.297906875610352, 18.35186767578125, 21.706588745117188, 22.44782257080078, 10.663505554199219, 17.6044864654541, 4.224521636962891, -4.1478271484375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000380.npy"}
|
||||
{"epoch": 0.5744520030234316, "step": 381, "batch_size": 64, "mean": 5.3151655197143555, "std": 13.172294616699219, "min": -36.61383056640625, "p10": -6.625308227539062, "median": 4.018678665161133, "p90": 20.71762924194336, "max": 42.07354736328125, "pos_frac": 0.625, "sample": [-3.67779541015625, 2.843658447265625, 2.765146255493164, -11.921869277954102, -36.61383056640625, 4.091968536376953, 4.546356201171875, 8.286251068115234, 40.05253601074219, 13.255069732666016, 8.637474060058594, -3.4188461303710938, -0.9071693420410156, 12.569766998291016, -2.980386734008789, 42.07354736328125, 15.450912475585938, 0.2506275177001953, 5.64501953125, -0.8070220947265625, 1.3890628814697266, -6.942726135253906, 9.791738510131836, 17.964767456054688, -12.112396240234375, 10.616920471191406, -5.60064697265625, -12.359519958496094, 13.017173767089844, -1.787271499633789, 8.586906433105469, 13.89984130859375, 8.041595458984375, 3.9453887939453125, -0.6040725708007812, 20.798179626464844, 8.402870178222656, -3.5819969177246094, -16.490026473999023, 1.4971160888671875, 21.565704345703125, -0.4244384765625, 15.595687866210938, 6.460533142089844, 8.182044982910156, 33.07151794433594, 17.905561447143555, 10.399528503417969, -1.794809341430664, -0.676177978515625, 2.3548126220703125, 19.905784606933594, 20.529678344726562, -5.884666442871094, 4.954597473144531, -1.0463333129882812, 24.59302520751953, -17.692626953125, -5.375816345214844, 5.06617546081543, 1.5077133178710938, -4.04931640625, -1.4580078125, 27.866107940673828], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000381.npy"}
|
||||
{"epoch": 0.5759637188208617, "step": 382, "batch_size": 64, "mean": 8.310380935668945, "std": 11.892104148864746, "min": -18.779829025268555, "p10": -3.9547828674316405, "median": 7.258705139160156, "p90": 23.355097007751475, "max": 45.651493072509766, "pos_frac": 0.734375, "sample": [-3.9049072265625, 1.8147754669189453, 28.560638427734375, 14.563791275024414, 26.353538513183594, -3.6581268310546875, 0.5413970947265625, 4.730777740478516, 18.55224609375, 10.188322067260742, 6.4091033935546875, 4.6320343017578125, -1.4358444213867188, 17.098419189453125, 17.795146942138672, 0.1201324462890625, -8.495624542236328, 7.13848876953125, -18.779829025268555, 14.6163330078125, -2.0605010986328125, 7.3789215087890625, 11.511825561523438, -1.900054931640625, -3.9761581420898438, 11.954216003417969, 13.63426399230957, 6.926078796386719, -6.2169036865234375, 45.651493072509766, 9.060195922851562, 10.454124450683594, -11.082195281982422, -9.689277648925781, 24.526718139648438, -3.1444091796875, -2.219676971435547, -1.7226829528808594, 8.75146484375, 24.251691818237305, 4.4038543701171875, 41.82976531982422, 7.0312652587890625, 15.049491882324219, 3.9958343505859375, 10.581871032714844, -5.080768585205078, -0.7138938903808594, 0.5322418212890625, 10.836910247802734, 6.632476806640625, 16.22118377685547, -2.4281845092773438, 10.231420516967773, 16.732601165771484, 1.584320068359375, 16.40765380859375, 6.66905403137207, 21.263042449951172, 33.515602111816406, 12.804664611816406, 12.992843627929688, 9.666324615478516, 12.174919128417969], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000382.npy"}
|
||||
{"epoch": 0.5774754346182918, "step": 383, "batch_size": 64, "mean": 10.567968368530273, "std": 15.432197570800781, "min": -23.010955810546875, "p10": -6.487496757507324, "median": 8.735191345214844, "p90": 31.793389892578126, "max": 51.346702575683594, "pos_frac": 0.71875, "sample": [1.0803909301757812, 20.07672882080078, 16.582992553710938, 5.547523498535156, 8.711860656738281, -6.610326766967773, 19.006999969482422, 9.6678466796875, -5.179023742675781, -10.848382949829102, 12.064868927001953, -4.952972412109375, 8.293575286865234, 16.50852394104004, 18.270015716552734, 7.3702392578125, 22.041351318359375, -8.622795104980469, 32.34136962890625, 10.182613372802734, 12.57672119140625, -3.1312255859375, 27.225631713867188, 5.1761474609375, 30.326623916625977, 3.329681396484375, 8.278709411621094, -1.329315185546875, 14.923049926757812, 33.185821533203125, 29.020261764526367, -5.2510833740234375, -3.5025787353515625, 51.346702575683594, -2.1207542419433594, 11.383529663085938, 31.239486694335938, -23.010955810546875, 5.132087707519531, 13.793296813964844, -11.816886901855469, 3.863128662109375, 14.392189025878906, 10.423349380493164, 28.23065185546875, -6.200893402099609, 5.437206268310547, 8.758522033691406, -11.16058349609375, 39.007896423339844, 2.4304676055908203, 45.365516662597656, 21.75164794921875, 3.802703857421875, -1.772125244140625, 32.76892852783203, 23.850982666015625, 6.966583251953125, -1.432037353515625, -18.459014892578125, 27.52635955810547, -5.5009002685546875, 32.03077697753906, 15.960260391235352], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000383.npy"}
|
||||
{"epoch": 0.5789871504157218, "step": 384, "batch_size": 64, "mean": 10.034626007080078, "std": 12.311689376831055, "min": -26.152313232421875, "p10": -4.998551177978515, "median": 10.49962329864502, "p90": 26.45299148559571, "max": 41.902984619140625, "pos_frac": 0.8125, "sample": [-7.638427734375, 24.878402709960938, 11.843254089355469, 8.097404479980469, 2.0013275146484375, 19.95934295654297, 10.689765930175781, 15.774124145507812, 20.6536865234375, 0.4163932800292969, 17.51446533203125, 14.002914428710938, 6.063446044921875, 5.020393371582031, -3.6116943359375, 2.421478271484375, 0.5293464660644531, 30.580299377441406, -1.7937469482421875, 28.109477996826172, -1.9679641723632812, 14.242599487304688, 9.061161041259766, -11.74359130859375, -5.592918395996094, 10.968204498291016, 32.50176239013672, 15.147354125976562, -8.68792724609375, 15.366912841796875, 34.0047607421875, 14.442428588867188, 2.7362403869628906, 8.732427597045898, 12.140857696533203, 0.4082450866699219, 41.902984619140625, 5.83607292175293, -26.152313232421875, 21.788650512695312, 10.3795166015625, 14.958770751953125, 27.253158569335938, -9.630882263183594, 27.12781524658203, 22.704345703125, 10.787117004394531, 4.10014533996582, 4.083576202392578, 10.619729995727539, 12.785125732421875, 3.0271377563476562, 8.719621658325195, 10.311233520507812, 4.6819915771484375, -0.9775142669677734, -6.16253662109375, 24.717880249023438, -3.378204345703125, 5.971282958984375, 23.992446899414062, 14.9625244140625, 18.128271102905273, 12.405914306640625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000384.npy"}
|
||||
{"epoch": 0.5804988662131519, "step": 385, "batch_size": 64, "mean": 11.081631660461426, "std": 12.890931129455566, "min": -17.046401977539062, "p10": -3.3224456787109373, "median": 11.328742980957031, "p90": 27.230022430419922, "max": 51.251590728759766, "pos_frac": 0.765625, "sample": [-3.1242218017578125, 3.689739227294922, 29.442110061645508, 17.936187744140625, 15.487091064453125, 12.753128051757812, 12.557113647460938, 11.349651336669922, 4.987407684326172, 7.6044464111328125, -8.26015853881836, 11.813804626464844, 10.747783660888672, 20.412738800048828, 13.894721984863281, 21.48912811279297, 31.58167266845703, 23.56529998779297, -1.5424537658691406, 10.24502944946289, -3.1337738037109375, -17.046401977539062, 1.077606201171875, 27.557510375976562, 24.37961196899414, 21.738143920898438, 12.050128936767578, 12.370674133300781, 2.3814697265625, 9.667232513427734, 8.920387268066406, -15.771934509277344, 19.923744201660156, -1.0236892700195312, 11.136234283447266, 25.577667236328125, 18.77984619140625, 11.600372314453125, 9.03005599975586, 15.076774597167969, -7.1323394775390625, -2.2153701782226562, 18.405120849609375, 18.490623474121094, 0.1298828125, 33.5831298828125, 27.417373657226562, -7.07574462890625, -1.8053474426269531, -3.4033050537109375, 0.171142578125, 11.30783462524414, -1.2929534912109375, 6.6481170654296875, 16.777828216552734, 23.114830017089844, -0.19278717041015625, 20.37685203552246, -3.546630859375, 51.251590728759766, 1.3437767028808594, 5.1050567626953125, 34.05096435546875, 26.792869567871094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000385.npy"}
|
||||
{"epoch": 0.582010582010582, "step": 386, "batch_size": 64, "mean": 8.81447982788086, "std": 13.002494812011719, "min": -24.20606231689453, "p10": -4.920871543884277, "median": 7.174394607543945, "p90": 25.746012115478518, "max": 39.28999328613281, "pos_frac": 0.71875, "sample": [39.28999328613281, -4.759790420532227, -0.18812942504882812, 15.40582275390625, 21.72216033935547, 6.3873748779296875, -0.18255615234375, 19.495445251464844, 6.646907806396484, 16.175796508789062, 3.6622238159179688, -1.965667724609375, -20.29007911682129, 7.228733062744141, -4.147491455078125, 26.02648162841797, 14.619964599609375, 28.334705352783203, 16.712032318115234, 8.376960754394531, 11.453458786010742, 24.460983276367188, 7.535346984863281, -3.9314422607421875, -4.989906311035156, 16.937435150146484, 27.613847732543945, -6.543006896972656, 12.970720291137695, 8.627479553222656, 6.752937316894531, 17.939666748046875, -1.1661300659179688, 7.12005615234375, -11.10788345336914, -0.5614242553710938, 2.5237884521484375, 5.917030334472656, 10.37666130065918, 24.196884155273438, 7.672142028808594, -7.581073760986328, 14.53276252746582, 25.091583251953125, 26.79114532470703, -4.641094207763672, 3.5306396484375, 12.380577087402344, -1.0236644744873047, 36.73509979248047, 3.205230712890625, -2.6583251953125, -14.056671142578125, 21.793075561523438, 34.62139892578125, 4.7186279296875, 4.401252746582031, 5.728935241699219, 24.764785766601562, 15.855392456054688, 11.19281005859375, 4.456562042236328, 6.144172668457031, -24.20606231689453], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000386.npy"}
|
||||
{"epoch": 0.5835222978080121, "step": 387, "batch_size": 64, "mean": 11.657045364379883, "std": 12.036148071289062, "min": -13.486770629882812, "p10": -2.2160772323608393, "median": 9.620269775390625, "p90": 28.178190803527833, "max": 48.60643005371094, "pos_frac": 0.859375, "sample": [17.234344482421875, 3.8231887817382812, 22.990276336669922, -3.1190433502197266, 2.8877410888671875, 26.119720458984375, 9.72991943359375, -1.6394004821777344, 13.12672233581543, 48.60643005371094, 9.350797653198242, -2.463224411010742, 2.3551101684570312, 9.5106201171875, 28.836322784423828, 5.667449951171875, 14.234115600585938, 25.475296020507812, 17.236454010009766, 12.21368408203125, 24.326507568359375, 28.506086349487305, 20.378753662109375, -3.492290496826172, -3.61212158203125, 17.56597137451172, 2.0327224731445312, 3.2629661560058594, 28.943851470947266, 13.483137130737305, 5.607250213623047, 0.2161407470703125, 6.904029846191406, 31.815399169921875, -1.6177978515625, 17.091812133789062, 8.502418518066406, 2.549875259399414, 5.312126159667969, 2.7059097290039062, 3.212533950805664, 14.8096923828125, 23.523292541503906, -13.486770629882812, -2.9051589965820312, 15.70450210571289, 16.39855194091797, 35.439430236816406, 3.8777217864990234, 25.0825252532959, -7.559970855712891, 13.439224243164062, 7.636604309082031, 10.658843994140625, 2.3601226806640625, 9.894058227539062, 8.608627319335938, 3.2629528045654297, 5.6993560791015625, 27.413101196289062, 40.05958557128906, 3.001401901245117, 13.840269088745117, 13.421131134033203], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000387.npy"}
|
||||
{"epoch": 0.5850340136054422, "step": 388, "batch_size": 64, "mean": 8.441862106323242, "std": 14.890864372253418, "min": -17.474517822265625, "p10": -9.006056976318359, "median": 6.760414123535156, "p90": 27.69255676269531, "max": 51.114532470703125, "pos_frac": 0.671875, "sample": [18.247901916503906, 26.464393615722656, 25.767440795898438, 41.37400436401367, 5.4947662353515625, 23.651336669921875, 9.518684387207031, 6.947021484375, -8.577598571777344, -3.9383926391601562, 12.694602966308594, 24.453369140625, -2.5100021362304688, 13.27232551574707, -9.189682006835938, 51.114532470703125, -4.360042572021484, 33.073455810546875, 3.073080062866211, 14.841079711914062, 22.235809326171875, 1.684417724609375, 24.723316192626953, -0.2316131591796875, 25.395896911621094, -17.474517822265625, 2.2552032470703125, 10.940567016601562, 3.9294891357421875, 3.6766300201416016, -3.4282073974609375, -7.246118545532227, -2.2434043884277344, 30.865726470947266, -5.736753463745117, 9.637954711914062, 5.350292205810547, -10.403106689453125, 19.277374267578125, -12.416732788085938, 17.80030059814453, 27.60772705078125, -12.415695190429688, 27.728912353515625, 19.448888778686523, 13.160396575927734, -1.3532943725585938, 8.87689208984375, 7.268890380859375, 32.03007507324219, 3.8466835021972656, 30.70450210571289, -7.595314025878906, 1.617645263671875, -17.187698364257812, -11.414253234863281, 6.5738067626953125, 10.174148559570312, -1.4913787841796875, 0.376373291015625, 8.070579528808594, -6.432334899902344, -7.0386962890625, 7.717510223388672], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000388.npy"}
|
||||
{"epoch": 0.5865457294028723, "step": 389, "batch_size": 64, "mean": 10.342015266418457, "std": 13.948179244995117, "min": -25.255474090576172, "p10": -5.803313827514648, "median": 11.522093772888184, "p90": 30.05045280456544, "max": 42.14884948730469, "pos_frac": 0.703125, "sample": [-10.876541137695312, -6.129140853881836, 14.401090621948242, 18.1348876953125, 11.50153923034668, 9.58251953125, 21.307804107666016, 27.478919982910156, 42.14884948730469, 19.21734619140625, -1.45623779296875, -8.1397705078125, 20.420875549316406, 12.910825729370117, 11.909637451171875, 20.80961799621582, 11.362457275390625, 20.868534088134766, 13.6197509765625, 11.830158233642578, 22.49951171875, 15.310836791992188, 3.9537181854248047, 11.542648315429688, 23.10401153564453, -2.0457229614257812, 20.077713012695312, -3.6922073364257812, 1.4072113037109375, 15.849754333496094, -2.2573814392089844, 20.33200454711914, -5.421844482421875, 32.75624084472656, 31.152538299560547, 9.5982666015625, 34.600074768066406, 16.75646209716797, -4.332862854003906, -24.319406509399414, 25.2366943359375, -3.019794464111328, 5.84698486328125, -0.8785476684570312, 13.477075576782227, 0.8468017578125, 17.523895263671875, -7.927021026611328, -25.255474090576172, 8.089828491210938, -0.6405487060546875, 8.877208709716797, 8.24853515625, -2.1186656951904297, -5.966800689697266, 34.43165588378906, 33.68697738647461, 12.932479858398438, 10.9068603515625, 33.015960693359375, -1.2582530975341797, 16.995773315429688, 4.262224197387695, -3.1995887756347656], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000389.npy"}
|
||||
{"epoch": 0.5880574452003023, "step": 390, "batch_size": 64, "mean": 9.992494583129883, "std": 12.310225486755371, "min": -17.75216293334961, "p10": -3.8202201843261707, "median": 8.840958595275879, "p90": 27.92817306518555, "max": 46.09221649169922, "pos_frac": 0.78125, "sample": [-8.505867004394531, 7.951423645019531, 10.010211944580078, 3.271392822265625, 31.564529418945312, 5.422882080078125, 11.664405822753906, 17.478601455688477, -5.926725387573242, 12.185554504394531, -0.5443439483642578, -6.5401153564453125, 17.271812438964844, 1.275045394897461, -1.8173370361328125, 46.09221649169922, 16.85289764404297, -2.8426437377929688, 11.478889465332031, -4.2391815185546875, 20.76123809814453, 1.9287643432617188, 13.49863052368164, 31.58898162841797, 21.019744873046875, 21.12298583984375, 4.5609283447265625, -11.829498291015625, 5.022350311279297, 28.90671157836914, -2.3540496826171875, 6.658730506896973, 9.938386917114258, 19.51123809814453, -7.033109664916992, -1.3173828125, -17.75216293334961, 24.337158203125, 26.86328887939453, 1.216939926147461, 32.65734100341797, 5.108283996582031, 12.469100952148438, 10.638818740844727, 0.2723369598388672, 17.174306869506836, 5.890634536743164, 10.75335693359375, 28.384552001953125, 8.433841705322266, 15.742919921875, 11.331287384033203, 5.539512634277344, 18.60052490234375, 2.3396968841552734, 9.111625671386719, 6.0215911865234375, -2.582122802734375, 8.570291519165039, -1.504781723022461, 20.896102905273438, 14.267715454101562, 5.599174499511719, 35.050025939941406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000390.npy"}
|
||||
{"epoch": 0.5895691609977324, "step": 391, "batch_size": 64, "mean": 11.379058837890625, "std": 16.004846572875977, "min": -25.531051635742188, "p10": -8.771478271484373, "median": 12.758748054504395, "p90": 32.063441848754884, "max": 56.79132080078125, "pos_frac": 0.734375, "sample": [-0.37831878662109375, -25.531051635742188, 32.775062561035156, 16.246292114257812, 13.429176330566406, 22.102882385253906, -1.7927055358886719, 26.010635375976562, 32.13190841674805, 44.0377197265625, 10.400863647460938, -16.2730712890625, 14.934539794921875, -4.312141418457031, 23.211910247802734, 37.519989013671875, -11.82783317565918, 11.05935287475586, -3.9137725830078125, 2.3788604736328125, 1.751190185546875, 16.634868621826172, 18.458175659179688, 19.808570861816406, 2.2520217895507812, -2.684751510620117, -16.028318405151367, -7.487037658691406, 17.25750732421875, 8.014305114746094, 56.79132080078125, 14.073654174804688, 8.808561325073242, -9.321952819824219, 11.65252685546875, -2.0424118041992188, 3.0827579498291016, 19.74706268310547, 24.108001708984375, 13.833839416503906, 14.854721069335938, 24.453277587890625, -15.288627624511719, -11.468753814697266, 13.303146362304688, 13.535839080810547, 32.199058532714844, 9.169578552246094, 27.22742462158203, 31.9036865234375, -5.2462615966796875, -4.9237823486328125, 30.556148529052734, 34.55112075805664, 30.255271911621094, 25.25983428955078, 14.796171188354492, 12.214349746704102, -6.191307067871094, 4.860786437988281, 5.2107391357421875, 14.681011199951172, 9.659570693969727, 1.76654052734375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000391.npy"}
|
||||
{"epoch": 0.5910808767951625, "step": 392, "batch_size": 64, "mean": 12.981614112854004, "std": 17.68175506591797, "min": -18.830358505249023, "p10": -6.5481893539428695, "median": 10.789993286132812, "p90": 36.70958328247071, "max": 68.8592529296875, "pos_frac": 0.78125, "sample": [-15.745025634765625, 2.022489547729492, 15.290359497070312, 11.814697265625, 4.265380859375, -1.16485595703125, 40.932159423828125, -2.4253463745117188, 21.490234375, 26.43097496032715, 5.44989013671875, -15.884780883789062, 13.645431518554688, 26.68039321899414, 35.023170471191406, 15.34674072265625, -0.6865692138671875, 20.345245361328125, 31.008163452148438, 18.263931274414062, 13.651786804199219, 11.228767395019531, 10.351219177246094, -18.830358505249023, 28.050880432128906, 14.084419250488281, 38.019039154052734, 13.664695739746094, 16.780975341796875, -18.660476684570312, 5.54644775390625, 35.59877014160156, 23.669036865234375, 2.223541259765625, -15.347969055175781, 18.02002716064453, 6.610982894897461, 13.89211654663086, 9.4061279296875, 28.251609802246094, 48.961280822753906, 44.811058044433594, -7.209470748901367, 50.121826171875, 1.2843189239501953, 3.568920135498047, 1.346832275390625, 2.4315338134765625, 37.185646057128906, 8.895919799804688, -5.005199432373047, 68.8592529296875, -1.0192146301269531, -0.945037841796875, -8.717414855957031, 3.548818588256836, 32.8958740234375, -2.4266357421875, 7.322784423828125, 24.829757690429688, 7.011781692504883, 12.969831466674805, 1.8630218505859375, 9.923446655273438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000392.npy"}
|
||||
{"epoch": 0.5925925925925926, "step": 393, "batch_size": 64, "mean": 12.172752380371094, "std": 13.440564155578613, "min": -11.349128723144531, "p10": -3.8703720092773426, "median": 10.104241371154785, "p90": 32.47179107666016, "max": 38.89045715332031, "pos_frac": 0.828125, "sample": [12.986679077148438, -1.6412982940673828, 29.965133666992188, 28.64740753173828, 34.88593292236328, 20.318252563476562, 6.758983612060547, 10.876091003417969, 3.8124160766601562, 3.779224395751953, 0.041584014892578125, 3.8974990844726562, 37.20081329345703, 22.09264373779297, 17.297414779663086, -4.3860931396484375, 3.0814285278320312, 18.337936401367188, -2.667022705078125, 16.64474105834961, 1.6711578369140625, 5.451755523681641, 1.400115966796875, 38.89045715332031, 13.211700439453125, 29.531204223632812, -0.9615192413330078, 33.03330993652344, 24.0010986328125, 24.790294647216797, 12.464691162109375, -5.154354095458984, 34.51014709472656, 32.68949890136719, 2.0826950073242188, 15.956680297851562, -5.947504043579102, 31.96380615234375, 1.3078079223632812, -1.5121822357177734, 9.332391738891602, 7.5213165283203125, 11.6634521484375, -6.689975738525391, 7.366634368896484, 16.3610897064209, 24.129127502441406, 5.3775482177734375, 4.1358184814453125, 29.827239990234375, 23.18105125427246, 37.40631866455078, 14.639730453491211, -11.349128723144531, 7.966796875, 19.187530517578125, 1.91619873046875, 0.8547515869140625, -10.410354614257812, -9.968597412109375, 11.482120513916016, 24.377939224243164, 5.149871826171875, 4.2866363525390625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000393.npy"}
{"epoch": 0.5941043083900227, "step": 394, "batch_size": 64, "mean": 9.990107536315918, "std": 12.154495239257812, "min": -15.183238983154297, "p10": -6.106134796142576, "median": 10.039074897766113, "p90": 24.570368766784675, "max": 38.337501525878906, "pos_frac": 0.8125, "sample": [5.9766693115234375, 35.26997375488281, 32.17809295654297, 3.625377655029297, 2.469573974609375, 10.77032470703125, 1.3902130126953125, 8.150146484375, 25.40216827392578, 9.704626083374023, 32.559967041015625, 12.924190521240234, 2.2381019592285156, -3.6423988342285156, -9.059452056884766, 11.416364669799805, 3.9097061157226562, -9.962015151977539, 4.693817138671875, 11.512504577636719, 12.250614166259766, -7.162021636962891, 17.082582473754883, 19.137542724609375, 19.33375358581543, 37.2426872253418, 22.62950325012207, 18.691905975341797, 8.66120719909668, -0.20835113525390625, -14.360706329345703, 1.8363494873046875, 36.52854919433594, 11.51531982421875, 7.220129013061523, 7.0539093017578125, -0.9034576416015625, 2.4292068481445312, 21.15716552734375, 4.744304656982422, 21.687213897705078, -10.229188919067383, -2.517669677734375, 11.523441314697266, 10.011234283447266, 38.337501525878906, -15.183238983154297, 18.626663208007812, -2.4600677490234375, 9.181022644042969, -10.318489074707031, 13.460609436035156, 11.240013122558594, 14.237258911132812, 10.366561889648438, 6.605447769165039, 10.482950210571289, 14.527326583862305, 10.066915512084961, 9.303070068359375, 7.801216125488281, 20.560171127319336, 11.422523498535156, 14.226242065429688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000394.npy"}
{"epoch": 0.5956160241874527, "step": 395, "batch_size": 64, "mean": 9.377607345581055, "std": 12.42612361907959, "min": -14.041267395019531, "p10": -4.865065383911133, "median": 6.643955230712891, "p90": 28.166323852539065, "max": 38.95036315917969, "pos_frac": 0.765625, "sample": [14.41408920288086, 20.978897094726562, 16.568038940429688, 2.0643043518066406, 22.586257934570312, 13.390380859375, -2.7543773651123047, 21.50444793701172, 29.80896759033203, -3.4270496368408203, 3.9011459350585938, 16.312835693359375, 27.711349487304688, 29.50426483154297, 3.798816680908203, 38.21082305908203, 1.8154029846191406, 12.97990608215332, -6.154541015625, 7.201698303222656, -1.7722511291503906, 5.090728759765625, 2.8859100341796875, 10.339115142822266, 14.483013153076172, -9.050226211547852, 14.067996978759766, 25.12946319580078, 10.074525833129883, -4.710746765136719, 16.76130485534668, 32.17047119140625, 8.415878295898438, 4.7783355712890625, -9.994903564453125, 12.863670349121094, -4.931201934814453, 2.728992462158203, 25.146629333496094, -3.38751220703125, 1.5403594970703125, 3.119173049926758, 35.87580108642578, 1.6096649169921875, 17.488887786865234, 15.125907897949219, 5.069334030151367, 28.361312866210938, -3.29132080078125, -1.281637191772461, 3.900735855102539, 38.95036315917969, -6.7101287841796875, 1.2514801025390625, 6.086212158203125, 17.98017120361328, -6.370639801025391, 4.055065155029297, 9.611135482788086, -14.041267395019531, 8.18701171875, -1.3272247314453125, 5.28144645690918, 8.190162658691406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000395.npy"}
{"epoch": 0.5971277399848829, "step": 396, "batch_size": 64, "mean": 11.1903657913208, "std": 13.797224044799805, "min": -15.73162841796875, "p10": -5.52315673828125, "median": 9.30767822265625, "p90": 31.487347602844245, "max": 48.86006164550781, "pos_frac": 0.78125, "sample": [37.415916442871094, 48.86006164550781, 26.49291229248047, 36.189781188964844, 29.76447105407715, 28.279813766479492, 10.07577133178711, 17.243432998657227, 4.337928771972656, 4.641885757446289, 9.23773193359375, 4.200408935546875, 23.473731994628906, -7.845537185668945, 5.93798828125, -0.2928638458251953, 24.138397216796875, 26.618247985839844, -0.6068286895751953, 17.980993270874023, 7.151611328125, -0.8718528747558594, 37.772865295410156, 10.78598403930664, 11.588188171386719, 14.583457946777344, 13.933712005615234, -15.053245544433594, 14.630523681640625, 12.80477523803711, 8.270408630371094, -8.808490753173828, 2.5720443725585938, 35.01806640625, 12.029693603515625, 13.878257751464844, 9.288162231445312, 5.367408752441406, 32.22572326660156, 26.6617431640625, -5.374523162841797, 13.572546005249023, -1.0342044830322266, 10.127693176269531, 8.996879577636719, 4.137260437011719, 9.327194213867188, -2.28741455078125, 17.59813690185547, -5.586856842041016, 6.366619110107422, -8.844615936279297, 8.794950485229492, 1.0930023193359375, -1.8998832702636719, 23.820323944091797, 12.109926223754883, 20.006513595581055, 4.386869430541992, 2.774229049682617, -15.73162841796875, -9.88894271850586, 1.3495407104492188, 32.39653778076172], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000396.npy"}
{"epoch": 0.5986394557823129, "step": 397, "batch_size": 64, "mean": 5.989149570465088, "std": 15.1456298828125, "min": -44.24433517456055, "p10": -9.589350128173828, "median": 7.423915863037109, "p90": 22.419902801513675, "max": 39.6138916015625, "pos_frac": 0.703125, "sample": [0.19682693481445312, 5.166648864746094, -17.072175979614258, 8.516120910644531, -5.8370513916015625, 13.005241394042969, 18.018932342529297, 3.49700927734375, 9.856170654296875, 5.991943359375, 4.742893218994141, 10.894237518310547, 6.793083190917969, 20.836517333984375, 14.861968994140625, 21.33521270751953, -27.468399047851562, -12.622222900390625, 4.950412750244141, -3.9575443267822266, 18.23381996154785, -5.610076904296875, 11.819978713989258, 11.948089599609375, 9.149787902832031, -28.324508666992188, -2.615875244140625, 4.570323944091797, 25.311264038085938, 21.48992156982422, 10.723129272460938, 3.9130916595458984, 24.608421325683594, 13.038589477539062, 30.898052215576172, 8.05474853515625, 11.796680450439453, 24.487014770507812, 20.546449661254883, 1.5402469635009766, -1.650146484375, 22.818466186523438, 20.74588394165039, 9.297431945800781, -44.24433517456055, 10.161758422851562, 36.429229736328125, -7.1490478515625, -9.257865905761719, -5.090095520019531, 9.48807144165039, 6.637977600097656, 2.6660614013671875, 3.2514305114746094, -0.43384361267089844, -2.6278038024902344, 16.100616455078125, 8.906085968017578, -6.903190612792969, 17.797561645507812, -28.025890350341797, -9.731414794921875, -2.780254364013672, 39.6138916015625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000397.npy"}
{"epoch": 0.600151171579743, "step": 398, "batch_size": 64, "mean": 11.813969612121582, "std": 14.055736541748047, "min": -9.741474151611328, "p10": -3.631851959228514, "median": 8.952964782714844, "p90": 34.69014663696289, "max": 52.64251708984375, "pos_frac": 0.796875, "sample": [14.2451171875, 11.121631622314453, 13.818408966064453, 4.2955169677734375, 25.087890625, 2.6923484802246094, 39.876739501953125, 33.3873291015625, -6.1519317626953125, 39.809043884277344, 17.205501556396484, 10.038894653320312, -1.7965011596679688, -2.054290771484375, 15.492019653320312, -2.23046875, 23.222801208496094, -9.001716613769531, 10.727157592773438, -9.741474151611328, 16.875289916992188, 3.064664840698242, 37.495399475097656, 37.305328369140625, -0.46984100341796875, 32.53410720825195, 35.248497009277344, 15.864288330078125, 14.125329971313477, 7.548728942871094, -5.55511474609375, 6.452381134033203, 2.6379013061523438, 16.901657104492188, 8.931716918945312, -6.626667022705078, 0.6580123901367188, 25.035850524902344, 1.2792892456054688, 8.974212646484375, -4.232444763183594, -0.45001220703125, 21.258880615234375, 7.0961456298828125, 17.604740142822266, 17.433151245117188, 3.350454330444336, -0.5137252807617188, 4.258350372314453, 18.02513313293457, 3.01263427734375, 52.64251708984375, 12.790657043457031, 4.779899597167969, 10.655593872070312, 0.40309906005859375, 30.045684814453125, -8.405403137207031, 8.251182556152344, 0.6921291351318359, 7.027744293212891, 22.760446548461914, 36.18516540527344, 3.0969791412353516], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000398.npy"}
{"epoch": 0.6016628873771731, "step": 399, "batch_size": 64, "mean": 13.584761619567871, "std": 16.662614822387695, "min": -11.735237121582031, "p10": -3.596644592285156, "median": 9.247852325439453, "p90": 35.837215423583984, "max": 79.63800048828125, "pos_frac": 0.828125, "sample": [6.924171447753906, 7.98126220703125, 13.2823486328125, 25.26776123046875, -3.6440963745117188, 26.892669677734375, 18.788955688476562, 8.602947235107422, -5.887432098388672, -3.4859237670898438, -7.40533447265625, 41.56817626953125, 10.31749153137207, -10.280078887939453, 6.9077301025390625, 6.2087249755859375, 7.683082580566406, 15.1951904296875, 8.327220916748047, 37.210479736328125, -10.433914184570312, 4.444427490234375, -11.735237121582031, 22.269588470458984, 52.224945068359375, 9.68121337890625, 9.723934173583984, 79.63800048828125, -2.055694580078125, 4.914558410644531, 5.6840667724609375, 0.2308788299560547, 9.050186157226562, 4.368091583251953, 6.834617614746094, -5.521331787109375, 9.692543029785156, 29.5203857421875, 10.762870788574219, 25.390968322753906, 22.836837768554688, -0.6580352783203125, 30.186935424804688, 4.752738952636719, 29.069595336914062, 13.262222290039062, 11.313613891601562, 8.371162414550781, 1.2989044189453125, 1.0157089233398438, 31.294492721557617, 37.74090576171875, 35.11583709716797, 8.760368347167969, 22.515390396118164, 24.235504150390625, 17.206741333007812, 47.97538757324219, 0.971832275390625, -1.56121826171875, 36.14637756347656, 0.7284030914306641, 9.445518493652344, 12.259040832519531], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000399.npy"}
{"epoch": 0.6031746031746031, "step": 400, "batch_size": 64, "mean": 8.245853424072266, "std": 14.318624496459961, "min": -20.463897705078125, "p10": -6.474594116210935, "median": 4.61566162109375, "p90": 29.34753456115723, "max": 47.34343719482422, "pos_frac": 0.734375, "sample": [1.4325675964355469, 8.964996337890625, 2.5561904907226562, -1.1270904541015625, 20.74689483642578, 31.10567855834961, 0.06066131591796875, 17.871103286743164, 15.426727294921875, 35.5540771484375, 4.760692596435547, -9.128969192504883, 3.635143280029297, 3.957050323486328, -1.1305618286132812, 1.2817535400390625, 2.8489990234375, 28.963298797607422, 14.036689758300781, 24.152746200561523, -3.962066650390625, 2.948314666748047, 15.13397216796875, -0.9923191070556641, -7.5513916015625, 19.717370986938477, 2.7016124725341797, 12.503597259521484, 4.910556793212891, -1.9572010040283203, -20.463897705078125, -2.4645423889160156, 13.16082763671875, 9.1123046875, 47.34343719482422, 2.857563018798828, -16.702194213867188, -1.1776809692382812, 24.457244873046875, 16.305347442626953, 29.51220703125, 22.876380920410156, -19.088539123535156, 15.077728271484375, 5.8267822265625, 6.5420379638671875, 3.35211181640625, -3.4183082580566406, 8.271163940429688, 42.71372985839844, 3.925983428955078, 11.475654602050781, -3.786151885986328, 5.031547546386719, -12.315807342529297, 36.53779602050781, 35.56843948364258, 0.9350433349609375, 4.470630645751953, 10.429534912109375, -12.6260986328125, -0.4711723327636719, 0.0432891845703125, 15.0010986328125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000400.npy"}
{"epoch": 0.6046863189720333, "step": 401, "batch_size": 64, "mean": 13.708871841430664, "std": 15.175922393798828, "min": -10.810283660888672, "p10": -1.7322658538818354, "median": 11.207996368408203, "p90": 33.93465194702149, "max": 54.73057556152344, "pos_frac": 0.796875, "sample": [-1.0126571655273438, 21.017593383789062, 20.901824951171875, 15.552894592285156, -5.078887939453125, 1.0115947723388672, 7.071807861328125, 12.779170989990234, -1.1765708923339844, -0.9157562255859375, 32.041786193847656, 0.814697265625, -10.810283660888672, -0.9529495239257812, 12.061126708984375, -1.9704208374023438, 30.92547607421875, 29.926483154296875, 29.309375762939453, 38.70397186279297, 9.241752624511719, 9.456432342529297, 2.2822952270507812, 16.759017944335938, -10.625709533691406, -6.733085632324219, 9.215042114257812, 15.265941619873047, 8.647445678710938, 12.429183959960938, 27.584396362304688, 12.497528076171875, 10.17184066772461, 30.426063537597656, 0.1029052734375, 34.745880126953125, 1.8664112091064453, 12.928016662597656, 0.6244373321533203, 23.97113037109375, -0.33463287353515625, 42.091270446777344, 22.968505859375, -6.378686904907227, 22.61016845703125, 3.018138885498047, 1.722930908203125, 46.17433166503906, 25.816818237304688, 23.458593368530273, 8.976943969726562, 2.7616329193115234, 10.354866027832031, 5.854988098144531, -6.104530334472656, 13.776350021362305, 20.769855499267578, 13.485517501831055, 42.486610412597656, 46.594696044921875, 10.010499954223633, -0.4496574401855469, 21.91482925415039, 54.73057556152344], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000401.npy"}
{"epoch": 0.6061980347694633, "step": 402, "batch_size": 64, "mean": 10.334466934204102, "std": 14.348188400268555, "min": -21.574371337890625, "p10": -6.0919298171997065, "median": 8.752740859985352, "p90": 29.90673522949219, "max": 40.7777099609375, "pos_frac": 0.796875, "sample": [23.980905532836914, 28.65320587158203, 16.907024383544922, 25.635343551635742, 16.632766723632812, 21.860288619995117, 40.7777099609375, 1.6696243286132812, 6.0061187744140625, 30.44396209716797, 2.427257537841797, 35.40229034423828, 10.924718856811523, -20.026016235351562, 18.0030517578125, 5.592586517333984, -0.1154632568359375, 12.207075119018555, 4.332435607910156, 9.505653381347656, 37.186763763427734, 1.68243408203125, 2.57421875, -15.415107727050781, 28.642135620117188, -0.3112373352050781, 7.673225402832031, 18.911376953125, 8.724254608154297, 6.1119232177734375, 37.36457824707031, 32.73033905029297, 3.15606689453125, 2.9492416381835938, 4.7757568359375, 3.134571075439453, 28.524978637695312, 31.208660125732422, 13.061500549316406, 8.054738998413086, -5.360296249389648, -10.94063949584961, -21.574371337890625, 21.204450607299805, 13.013687133789062, 1.8249034881591797, 10.796747207641602, -2.5700836181640625, 0.46661376953125, -1.0701560974121094, 16.45490074157715, 15.729568481445312, 20.110824584960938, 26.664993286132812, 1.6686248779296875, -0.25226402282714844, 1.3476791381835938, -12.59039306640625, -16.056549072265625, 18.54450225830078, 10.317977905273438, 19.738426208496094, -6.405487060546875, 8.781227111816406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000402.npy"}
{"epoch": 0.6077097505668935, "step": 403, "batch_size": 64, "mean": 14.558998107910156, "std": 16.411775588989258, "min": -22.335174560546875, "p10": -5.9872901916503904, "median": 13.57374382019043, "p90": 37.15793914794922, "max": 56.69085693359375, "pos_frac": 0.828125, "sample": [-2.5678253173828125, 7.88751220703125, 37.293670654296875, 16.02960205078125, 33.48762512207031, 4.2742156982421875, 52.30711364746094, 17.456531524658203, 5.433666229248047, 15.441764831542969, 27.06437110900879, -5.9305572509765625, 36.84123229980469, 3.3907127380371094, 41.818824768066406, 8.38677978515625, 10.131010055541992, 5.156126022338867, 9.047271728515625, 4.754526138305664, -9.48560905456543, -7.842678070068359, 56.69085693359375, 39.4862060546875, 0.8886871337890625, 25.032958984375, 29.73102569580078, 2.4953060150146484, 25.20938491821289, 13.621871948242188, 1.9692192077636719, 11.427131652832031, 13.980705261230469, 13.679973602294922, 13.525615692138672, -6.011604309082031, 5.244319915771484, 44.57105255126953, 9.858888626098633, 17.3367919921875, 7.136344909667969, 0.5030975341796875, 39.506290435791016, 9.463138580322266, 18.51921844482422, -2.531280517578125, 18.737773895263672, 20.30158233642578, 22.508106231689453, 9.839996337890625, 24.97332763671875, 30.11007308959961, -22.335174560546875, -7.229366302490234, 32.24432373046875, -14.27823257446289, 33.121559143066406, 20.63005828857422, -0.017864227294921875, 32.07359313964844, -10.124992370605469, 4.215599060058594, 18.506126403808594, 16.78825569152832], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000403.npy"}
{"epoch": 0.6092214663643235, "step": 404, "batch_size": 64, "mean": 11.497271537780762, "std": 15.02872085571289, "min": -20.657363891601562, "p10": -7.638591384887695, "median": 11.739837646484375, "p90": 30.140428924560556, "max": 45.83075714111328, "pos_frac": 0.765625, "sample": [1.663909912109375, 5.0852203369140625, 18.36724853515625, 5.7023468017578125, 9.33928108215332, -3.3200225830078125, 22.746253967285156, 3.2406158447265625, -4.749988555908203, -9.423843383789062, 28.061424255371094, 1.204010009765625, -16.348323822021484, -16.80885887145996, -0.4443225860595703, 18.210006713867188, -7.865009307861328, -9.471328735351562, 33.51417541503906, 28.21259307861328, 16.825298309326172, 30.966644287109375, 18.18133544921875, 45.83075714111328, -4.22637939453125, 22.720989227294922, -3.7368316650390625, 18.961578369140625, 12.77530288696289, 23.984561920166016, 3.6977615356445312, 23.98379135131836, -2.7021656036376953, 23.367843627929688, -0.9389266967773438, 5.424797058105469, 2.7582168579101562, 20.80225372314453, 7.809700012207031, 3.228221893310547, 5.572101593017578, 10.146949768066406, 19.305110931396484, 12.538604736328125, 36.18121337890625, 20.578998565673828, 15.5772705078125, 27.670455932617188, 42.85758590698242, 10.941070556640625, 1.6429023742675781, 44.58905029296875, 36.2139892578125, 12.984024047851562, 14.3441162109375, 15.022598266601562, 23.793563842773438, -7.110282897949219, -8.212448120117188, -20.657363891601562, 12.558219909667969, 20.776599884033203, 6.134225845336914, 5.7467041015625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000404.npy"}
{"epoch": 0.6107331821617535, "step": 405, "batch_size": 64, "mean": 10.638147354125977, "std": 14.15326976776123, "min": -18.995132446289062, "p10": -5.332022476196289, "median": 6.687801361083984, "p90": 31.631259155273444, "max": 47.537689208984375, "pos_frac": 0.75, "sample": [26.657608032226562, -2.426513671875, -3.4541358947753906, 36.5816650390625, 17.069236755371094, 3.3363189697265625, 28.114585876464844, 4.78253173828125, 1.0304679870605469, -18.995132446289062, -0.11734771728515625, 6.383342742919922, 0.7717304229736328, -6.099092483520508, 13.133819580078125, 5.976722717285156, 38.089691162109375, 0.8043193817138672, -1.4125823974609375, 21.846588134765625, 32.205894470214844, 30.13140869140625, 10.321533203125, 23.208724975585938, -4.1499786376953125, 6.572319030761719, 14.70404052734375, 33.114837646484375, -7.623199462890625, 15.779989242553711, 4.046649932861328, -0.2819786071777344, 14.42584228515625, 14.745872497558594, -1.6887054443359375, 22.92540740966797, 5.164546966552734, -13.383428573608398, 11.420085906982422, 27.53509521484375, 4.883966445922852, -5.456523895263672, 6.940361022949219, 5.696706771850586, 1.6957168579101562, 4.093753814697266, 4.02899169921875, 14.712100982666016, 34.31414794921875, 22.020343780517578, -7.622943878173828, 9.540496826171875, 17.87588119506836, 6.80328369140625, 36.26780700683594, 47.537689208984375, -6.241279602050781, 15.408809661865234, 18.24486541748047, -5.0415191650390625, 2.305389404296875, 30.290443420410156, 15.781261444091797, -4.487091064453125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000405.npy"}
{"epoch": 0.6122448979591837, "step": 406, "batch_size": 64, "mean": 12.121261596679688, "std": 15.139636993408203, "min": -27.376142501831055, "p10": -4.743716239929198, "median": 12.23813247680664, "p90": 33.4529312133789, "max": 43.65647888183594, "pos_frac": 0.78125, "sample": [23.315963745117188, 4.775238037109375, -4.238517761230469, -6.250335693359375, 11.848861694335938, 24.97040557861328, 33.48753356933594, 33.3721923828125, -10.715143203735352, 36.65867614746094, 16.330331802368164, 18.09795379638672, 5.4413604736328125, 0.6420097351074219, 21.440223693847656, 20.2081298828125, 31.320770263671875, 7.751853942871094, 10.315765380859375, 39.12697219848633, 18.74749755859375, 12.627403259277344, 16.83987808227539, 24.957542419433594, 11.000770568847656, 18.877182006835938, -4.271509170532227, 14.917755126953125, 0.6707839965820312, 2.6791305541992188, -1.3600521087646484, -0.5344696044921875, -16.989131927490234, -2.941743850708008, 18.868003845214844, 20.053802490234375, 5.938196182250977, 35.83210754394531, -8.843059539794922, 41.13275909423828, 15.650039672851562, 0.997161865234375, 4.675422668457031, 43.65647888183594, 2.3291778564453125, 31.368114471435547, -12.011913299560547, 4.352870941162109, 23.769237518310547, 38.409095764160156, 16.011980056762695, -4.9460906982421875, 24.130752563476562, -3.502197265625, 3.6085052490234375, -27.376142501831055, 8.09625244140625, 18.493637084960938, -3.044269561767578, 3.4155654907226562, 22.586898803710938, 6.098663330078125, 18.62908935546875, 14.259323120117188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000406.npy"}
{"epoch": 0.6137566137566137, "step": 407, "batch_size": 64, "mean": 9.148632049560547, "std": 14.957232475280762, "min": -35.1995849609375, "p10": -5.9129150390625, "median": 6.880474090576172, "p90": 27.956504440307622, "max": 61.81072998046875, "pos_frac": 0.765625, "sample": [3.786510467529297, 3.6262760162353516, 3.81072998046875, 23.610130310058594, 0.4350700378417969, 25.254596710205078, 30.145828247070312, -2.0950469970703125, -20.53176498413086, 16.96307945251465, 12.858169555664062, 4.038738250732422, 0.15225982666015625, -5.9550933837890625, -9.093757629394531, 4.006813049316406, 12.683494567871094, 17.516029357910156, 33.07688522338867, 28.622695922851562, 4.047454833984375, 2.1702919006347656, 16.154048919677734, 11.674072265625, 12.309417724609375, 8.807323455810547, 2.3744125366210938, -5.8144989013671875, 9.953704833984375, 61.81072998046875, 11.090667724609375, 1.1559524536132812, 24.2999267578125, 19.651222229003906, 25.560882568359375, 2.2837753295898438, -5.540863037109375, 29.23223876953125, -7.84013557434082, 6.148723602294922, 0.9143524169921875, 29.192794799804688, 18.417922973632812, 8.115707397460938, 18.732742309570312, 38.74128723144531, 19.657129287719727, 0.32607269287109375, 20.635986328125, -0.9246292114257812, 4.448356628417969, -35.1995849609375, -11.842376708984375, 4.0438232421875, -4.7121429443359375, 7.612224578857422, -2.0571346282958984, 26.402057647705078, -7.015913009643555, -1.5560226440429688, -2.7742538452148438, 13.423248291015625, 15.2235107421875, 13.26629638671875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000407.npy"}
{"epoch": 0.6152683295540439, "step": 408, "batch_size": 64, "mean": 13.449457168579102, "std": 14.67824935913086, "min": -13.197120666503906, "p10": -4.146756935119629, "median": 11.117152214050293, "p90": 32.49221343994141, "max": 47.01336669921875, "pos_frac": 0.796875, "sample": [4.510995864868164, 32.352867126464844, 25.805824279785156, 15.972831726074219, -1.7106456756591797, -13.197120666503906, -12.572257995605469, 3.633045196533203, 17.75045394897461, 6.277339935302734, 9.318977355957031, 27.87066650390625, -1.384561538696289, 2.6890106201171875, 36.69871520996094, 21.822097778320312, 6.196994781494141, -6.653541564941406, 19.305789947509766, 5.128944396972656, 13.752296447753906, -0.14760589599609375, -2.828573226928711, 19.64122200012207, 42.950897216796875, -8.416900634765625, 6.781333923339844, 47.01336669921875, 25.2562255859375, 22.213144302368164, 36.80910873413086, -3.5786705017089844, 20.868562698364258, 0.4467887878417969, 19.45948028564453, 30.901512145996094, 9.769573211669922, -7.410331726074219, 44.94081115722656, 8.941726684570312, 7.180088043212891, 18.081436157226562, 29.647056579589844, 40.987491607666016, 29.86492919921875, 10.630643844604492, 26.729034423828125, -4.390222549438477, 4.218273162841797, 10.178817749023438, 22.72116470336914, 16.975341796875, 32.55193328857422, -8.964879989624023, 12.888168334960938, 17.69202423095703, 6.901271820068359, -3.0613536834716797, 13.271045684814453, 3.332904815673828, 11.603660583496094, 7.122509002685547, 3.772186279296875, 23.65135383605957], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000408.npy"}
{"epoch": 0.6167800453514739, "step": 409, "batch_size": 64, "mean": 8.186318397521973, "std": 14.884572982788086, "min": -23.574249267578125, "p10": -8.750107955932616, "median": 4.863189697265625, "p90": 31.6594825744629, "max": 48.09428405761719, "pos_frac": 0.6875, "sample": [-0.17708206176757812, 1.3941230773925781, -4.4169769287109375, 1.27825927734375, 2.7561988830566406, 1.3209342956542969, 13.633855819702148, -6.308448791503906, 5.803485870361328, -8.096183776855469, -1.0964889526367188, 33.89646911621094, 1.6231975555419922, 48.09428405761719, 15.543460845947266, 6.2978668212890625, 29.5491943359375, -4.396244049072266, -9.03036117553711, 33.40306854248047, 29.63166046142578, -6.377628326416016, 19.250572204589844, 19.966537475585938, 8.49462890625, -6.8271331787109375, 5.786346435546875, 11.479476928710938, 15.181257247924805, -9.84661865234375, -1.96685791015625, 7.996803283691406, 18.658523559570312, -9.277908325195312, 15.587394714355469, -0.06413459777832031, 26.73676300048828, 32.52854919433594, 9.909934997558594, 5.909421920776367, -3.4821014404296875, -6.517364501953125, 4.287025451660156, -9.741222381591797, 7.480293273925781, 3.2918243408203125, 4.928474426269531, 4.212471008300781, -23.574249267578125, 32.64436721801758, 2.9494705200195312, 22.754955291748047, -0.7831230163574219, 35.874542236328125, -20.073362350463867, 4.444053649902344, 9.31378173828125, 3.9478302001953125, 16.24498748779297, 27.297340393066406, 39.6827392578125, -9.577390670776367, 19.690902709960938, 4.797904968261719], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000409.npy"}
{"epoch": 0.618291761148904, "step": 410, "batch_size": 64, "mean": 10.562816619873047, "std": 11.270023345947266, "min": -8.777816772460938, "p10": -3.0892982482910147, "median": 9.350418090820312, "p90": 23.101184463500978, "max": 42.017547607421875, "pos_frac": 0.8125, "sample": [7.827526092529297, 7.329343795776367, 10.224594116210938, -0.4611072540283203, 37.89306640625, -8.777816772460938, 18.807510375976562, 10.70083236694336, -4.7233428955078125, -8.012493133544922, 16.122459411621094, 14.508125305175781, 1.3552474975585938, -5.437141418457031, 18.591983795166016, 17.861547470092773, 22.620059967041016, -0.6772079467773438, 17.40901756286621, 29.065521240234375, 33.645965576171875, 10.386283874511719, -0.0530853271484375, 17.234519958496094, 3.3267669677734375, 10.305835723876953, -3.4108428955078125, 13.386642456054688, 11.937004089355469, -2.3390274047851562, 17.678749084472656, -1.5472793579101562, 6.655290603637695, 23.30738067626953, 5.761688232421875, 37.978607177734375, 17.817642211914062, 2.0902976989746094, 3.524372100830078, 19.587295532226562, 3.6399154663085938, -7.357978820800781, 7.600118637084961, 5.8065643310546875, 6.8617095947265625, 17.33978271484375, 21.016021728515625, 1.3731155395507812, 6.424751281738281, 11.667877197265625, 8.476242065429688, 11.285125732421875, 42.017547607421875, 18.87255859375, 4.092781066894531, 7.433723449707031, 5.3667755126953125, 1.569173812866211, 4.008369445800781, 21.828510284423828, 26.797149658203125, 13.205482482910156, -3.6146240234375, 10.807746887207031], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000410.npy"}
{"epoch": 0.6198034769463341, "step": 411, "batch_size": 64, "mean": 11.461427688598633, "std": 17.94147300720215, "min": -26.106842041015625, "p10": -8.150221252441407, "median": 10.893533706665039, "p90": 36.75500907897949, "max": 59.514312744140625, "pos_frac": 0.671875, "sample": [-7.6546478271484375, 28.599285125732422, 36.768150329589844, 37.5344123840332, 11.006149291992188, -0.835205078125, -22.687240600585938, -26.106842041015625, 12.080802917480469, -2.306446075439453, 15.145950317382812, 31.78356170654297, 19.187274932861328, 12.622066497802734, 40.441009521484375, 7.5883026123046875, 19.251575469970703, -6.545204162597656, 16.75080108642578, 8.544464111328125, 18.432334899902344, 27.56482696533203, 27.418014526367188, 9.746614456176758, -3.9325637817382812, 15.299041748046875, 1.5984077453613281, 29.759952545166016, -2.2962799072265625, -21.240310668945312, -4.758331298828125, -9.437776565551758, -1.8566055297851562, 19.370986938476562, 10.78091812133789, 19.03481674194336, -1.1113567352294922, 47.93994140625, -6.4116058349609375, 59.514312744140625, 14.593963623046875, 10.244024276733398, -8.36260986328125, 24.772085189819336, 26.292190551757812, 38.626136779785156, -14.994270324707031, 13.462615966796875, -0.5306510925292969, -5.6671600341796875, 3.2098026275634766, -0.6741485595703125, 36.72434616088867, 26.041168212890625, 9.044168472290039, 18.931594848632812, 10.616846084594727, 8.130788803100586, 49.69057083129883, 11.152275085449219, -3.0304336547851562, 0.8128471374511719, -15.165786743164062, 13.027473449707031], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000411.npy"}
{"epoch": 0.6213151927437641, "step": 412, "batch_size": 64, "mean": 11.312671661376953, "std": 14.445615768432617, "min": -10.016326904296875, "p10": -5.1510019302368155, "median": 7.983412742614746, "p90": 29.081434631347662, "max": 49.3262939453125, "pos_frac": 0.75, "sample": [-7.061347961425781, 29.79095458984375, 8.035202026367188, 49.3262939453125, 10.6014404296875, 15.112312316894531, 13.549758911132812, 17.104658126831055, -7.380710601806641, 2.9687461853027344, 13.242277145385742, 3.8461532592773438, 15.190174102783203, 48.725929260253906, -2.0377769470214844, -3.5881881713867188, 19.95305824279785, -7.867422103881836, -1.153533935546875, 1.0175704956054688, 23.36182403564453, 14.071792602539062, 6.1682281494140625, 5.674747467041016, 2.106658935546875, 3.3385658264160156, 16.4366455078125, 7.329891204833984, 19.143657684326172, 5.015625, 7.32377815246582, 27.425888061523438, 8.220783233642578, 47.60376739501953, -2.846874237060547, -2.1857147216796875, 13.2279052734375, -1.4667739868164062, 20.260046005249023, -1.886474609375, -5.423768997192383, -10.016326904296875, 25.836585998535156, -6.474185943603516, 3.2192211151123047, 23.600143432617188, 22.871856689453125, 26.19904327392578, 43.3973388671875, -2.2858314514160156, 8.667404174804688, 18.406810760498047, 33.73270034790039, -6.705474853515625, 10.528095245361328, 24.969276428222656, 7.931623458862305, 6.5101776123046875, -4.514545440673828, 7.555259704589844, 24.55244255065918, 0.3437938690185547, 32.32215881347656, 1.0876922607421875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000412.npy"}
{"epoch": 0.6228269085411943, "step": 413, "batch_size": 64, "mean": 10.459529876708984, "std": 16.504194259643555, "min": -24.085716247558594, "p10": -8.157576751708984, "median": 9.91086483001709, "p90": 28.722298431396492, "max": 60.78767395019531, "pos_frac": 0.796875, "sample": [0.4765167236328125, -8.730300903320312, 20.764198303222656, 8.24789810180664, 2.1495513916015625, 0.7880706787109375, 12.67730712890625, 43.705501556396484, -14.426870346069336, 11.399993896484375, -24.085716247558594, 19.4581298828125, 13.254600524902344, -6.483009338378906, -3.7720489501953125, 6.150093078613281, -3.288738250732422, 42.35850143432617, 0.48169898986816406, 9.924631118774414, 50.22639465332031, 11.089630126953125, 11.894424438476562, 4.661594390869141, 19.065475463867188, 7.502832412719727, 60.78767395019531, 36.348785400390625, 2.8860397338867188, 10.315170288085938, 10.060707092285156, 13.628301620483398, 16.781551361083984, 13.836181640625, 42.30717849731445, -13.8720703125, 12.91412353515625, 5.6942291259765625, 17.31232452392578, 2.3264236450195312, -18.356307983398438, 21.34986114501953, 17.917266845703125, 2.686370849609375, 18.072845458984375, -5.973169326782227, -23.34882354736328, 2.1931991577148438, 7.10899543762207, 8.493782043457031, 3.9489288330078125, -12.253982543945312, 24.032501220703125, -1.6949901580810547, 1.9001235961914062, 16.100570678710938, 9.437511444091797, 26.245452880859375, 9.897098541259766, 22.126258850097656, -6.821220397949219, 23.097625732421875, 29.480979919433594, 26.952041625976562], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000413.npy"}
{"epoch": 0.6243386243386243, "step": 414, "batch_size": 64, "mean": 12.347614288330078, "std": 15.61620044708252, "min": -21.855384826660156, "p10": -4.590320587158203, "median": 10.809525489807129, "p90": 33.34152145385743, "max": 52.14103698730469, "pos_frac": 0.796875, "sample": [27.0931339263916, -4.849327087402344, 12.939300537109375, 25.15399169921875, 1.76263427734375, 15.742929458618164, -6.859466552734375, 30.799758911132812, 9.883064270019531, 5.8722686767578125, 17.707801818847656, 13.941570281982422, 21.834266662597656, -12.907676696777344, 35.93384552001953, 19.754985809326172, -16.06945037841797, -11.731193542480469, 6.825080871582031, 13.158088684082031, 16.444482803344727, 7.558927536010742, 52.05919647216797, 7.406425476074219, 34.27130126953125, -1.8765411376953125, 18.22976303100586, 20.842269897460938, 10.791610717773438, 22.04566192626953, 39.95376968383789, 3.9301910400390625, 20.933273315429688, 0.8340835571289062, 31.075363159179688, 10.278810501098633, 25.972450256347656, 14.457130432128906, 13.521316528320312, 21.113231658935547, 10.240013122558594, -4.575553894042969, 2.175121307373047, 0.8217754364013672, 31.172035217285156, -2.754302978515625, 0.8717269897460938, 14.079551696777344, -21.855384826660156, 1.6896133422851562, 30.82806396484375, 0.06061553955078125, -4.596649169921875, 3.476886749267578, 40.18595504760742, 52.14103698730469, 3.3632049560546875, 0.7698516845703125, 11.679454803466797, 10.82744026184082, -1.14306640625, -0.9388942718505859, -0.5828132629394531, 36.48329162597656], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000414.npy"}
{"epoch": 0.6258503401360545, "step": 415, "batch_size": 64, "mean": 11.103410720825195, "std": 16.674846649169922, "min": -25.94300079345703, "p10": -9.916154479980468, "median": 9.966667175292969, "p90": 29.757396507263184, "max": 53.014549255371094, "pos_frac": 0.78125, "sample": [-2.656026840209961, 29.216400146484375, 17.927637100219727, 19.26821517944336, 4.761577606201172, 10.877883911132812, 20.342025756835938, 39.82423400878906, -1.818878173828125, 26.2747802734375, 10.837509155273438, -15.663196563720703, 23.856765747070312, 27.638574600219727, 1.2902755737304688, 42.071250915527344, 53.014549255371094, 6.590251922607422, 22.010757446289062, -1.178293228149414, 3.435302734375, 12.98370361328125, 9.772567749023438, 0.054714202880859375, 14.840141296386719, 44.77240753173828, -18.50342559814453, 3.6124038696289062, 23.153579711914062, 22.462860107421875, -8.77227783203125, 17.10163116455078, 8.728891372680664, -4.434791564941406, -21.801368713378906, 48.05120849609375, 17.588455200195312, 29.9892520904541, 24.434829711914062, -10.406387329101562, 8.709190368652344, -16.742034912109375, -10.929758071899414, -25.94300079345703, 12.115161895751953, 24.20740509033203, 2.4200286865234375, 6.139804840087891, 10.28896713256836, 1.3158740997314453, 0.34388160705566406, 10.167335510253906, 3.0235633850097656, 3.2031211853027344, 5.254173278808594, 8.935150146484375, -0.4026145935058594, 14.291091918945312, 10.1607666015625, -0.20772552490234375, 4.7324371337890625, 19.155776977539062, 41.535888671875, 27.29384994506836], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000415.npy"}
{"epoch": 0.6273620559334845, "step": 416, "batch_size": 64, "mean": 9.105339050292969, "std": 16.177913665771484, "min": -22.404678344726562, "p10": -10.65501251220703, "median": 8.101105690002441, "p90": 28.756188201904305, "max": 53.743980407714844, "pos_frac": 0.734375, "sample": [3.3211097717285156, -20.95160675048828, -7.582218170166016, 20.347763061523438, 4.111259460449219, 32.69508361816406, -8.916366577148438, -8.41162109375, 26.918560028076172, 40.41416931152344, 9.174491882324219, 40.23027038574219, 52.146453857421875, 5.269317626953125, 3.2482337951660156, 7.790901184082031, 6.560253143310547, 11.640691757202148, -19.817798614501953, -14.577911376953125, 12.40028190612793, 14.82383918762207, 13.940980911254883, 16.649734497070312, -1.7871112823486328, 21.94463348388672, 15.375457763671875, 53.743980407714844, 35.61260223388672, 14.653160095214844, 5.931449890136719, 12.458641052246094, 5.5649261474609375, -17.969175338745117, 10.729913711547852, -1.6530609130859375, -11.400146484375, -14.106134414672852, 11.739019393920898, 5.040550231933594, -7.6651153564453125, 17.684188842773438, -0.8572463989257812, 0.40354156494140625, 4.820648193359375, -2.9902305603027344, 25.50852394104004, -22.404678344726562, 1.153564453125, 19.261878967285156, 7.60052490234375, 6.775300979614258, -6.678829193115234, 19.07445526123047, -3.74835205078125, 8.411310195922852, 1.8431396484375, 16.113327026367188, 11.823188781738281, 14.378639221191406, 22.887939453125, 22.70922088623047, 29.543743133544922, 9.788434982299805], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000416.npy"}
{"epoch": 0.6288737717309146, "step": 417, "batch_size": 64, "mean": 11.362879753112793, "std": 15.519515991210938, "min": -29.08409881591797, "p10": -4.606177139282226, "median": 11.872261047363281, "p90": 30.02316627502442, "max": 51.357994079589844, "pos_frac": 0.78125, "sample": [7.620954513549805, -4.277008056640625, -2.7996292114257812, -29.08409881591797, 28.474700927734375, 21.367401123046875, -26.31866455078125, 30.68679428100586, 14.089740753173828, 12.457603454589844, 48.18513488769531, 6.6782989501953125, 17.12488555908203, 14.537866592407227, 18.11438751220703, -24.397811889648438, 32.810386657714844, 9.2669677734375, 0.4908599853515625, 25.821388244628906, 18.85601806640625, 18.463409423828125, 9.622339248657227, 4.492401123046875, 10.776264190673828, -3.860595703125, 23.469104766845703, 11.111885070800781, -0.03331756591796875, 16.64013671875, 1.734506607055664, 38.4998779296875, 37.16884994506836, -5.3207855224609375, 7.247650146484375, -5.8248291015625, 0.0200653076171875, 1.7154617309570312, 21.723167419433594, 11.486518859863281, 32.37449645996094, 6.290168762207031, 22.27081298828125, 12.886405944824219, -2.1018524169921875, 16.688278198242188, 9.683324813842773, 14.006248474121094, -3.353435516357422, 12.258003234863281, 4.7681427001953125, 12.817085266113281, 4.1623382568359375, -12.076286315917969, 18.259830474853516, -4.747249603271484, 14.92617416381836, 25.67611312866211, 6.1610107421875, -2.3836708068847656, 23.799386978149414, 16.343578338623047, 28.319133758544922, 51.357994079589844], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000417.npy"}
{"epoch": 0.6303854875283447, "step": 418, "batch_size": 64, "mean": 7.804815769195557, "std": 17.112911224365234, "min": -32.88189697265625, "p10": -10.959408569335936, "median": 6.741687774658203, "p90": 25.511115264892577, "max": 74.01213073730469, "pos_frac": 0.71875, "sample": [12.665084838867188, 25.038822174072266, 18.508041381835938, 6.7624053955078125, 5.312944412231445, 15.428466796875, 4.733562469482422, 16.529022216796875, 40.76484298706055, -9.504432678222656, 25.2579345703125, 41.26812744140625, -2.17254638671875, 15.999481201171875, 8.6141357421875, -11.582969665527344, 12.327629089355469, 1.8482666015625, 34.192142486572266, -0.5180435180664062, 1.0896224975585938, 74.01213073730469, 10.261680603027344, 4.842887878417969, 23.058143615722656, 1.9084091186523438, -5.517181396484375, 7.162609100341797, 6.185268402099609, 9.13690185546875, -32.88189697265625, -21.3115234375, 24.253890991210938, 7.024641036987305, 41.633445739746094, 0.7878646850585938, 1.6423110961914062, -4.708885192871094, 10.530998229980469, 1.6670455932617188, -21.65044403076172, -19.609031677246094, -19.807762145996094, -1.6010513305664062, 9.007146835327148, -7.214820861816406, 6.720970153808594, 10.931207656860352, -2.682352066040039, 2.0503463745117188, 3.6118316650390625, -6.6134185791015625, 26.43547821044922, 21.088653564453125, 15.34310531616211, -7.14495849609375, -12.147804260253906, 17.13433074951172, 11.99488639831543, 25.61962127685547, 11.50732421875, -4.0047760009765625, 12.55419921875, 5.734245300292969], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000418.npy"}
{"epoch": 0.6318972033257747, "step": 419, "batch_size": 64, "mean": 12.461854934692383, "std": 15.024937629699707, "min": -17.065750122070312, "p10": -3.824114990234374, "median": 10.942386627197266, "p90": 33.58150939941407, "max": 51.25474548339844, "pos_frac": 0.796875, "sample": [-2.5273780822753906, 34.965911865234375, 21.238388061523438, 3.9605560302734375, -15.051225662231445, 7.279735565185547, 36.064300537109375, 10.686180114746094, 0.37639617919921875, 5.142936706542969, -17.065750122070312, 30.895938873291016, 19.412185668945312, 14.925140380859375, -2.8153762817382812, 1.6530609130859375, 18.482784271240234, 7.124847412109375, 28.892242431640625, -0.71575927734375, 7.157524108886719, 15.452018737792969, -4.256431579589844, 23.177345275878906, 5.204017639160156, 1.8399467468261719, 27.089630126953125, -8.682159423828125, 21.787734985351562, 32.666358947753906, -1.1600914001464844, -8.815757751464844, -8.854106903076172, 33.973716735839844, 44.11088562011719, 18.130786895751953, 18.99393081665039, 7.920440673828125, 8.121711730957031, 13.540908813476562, 12.446630477905273, 35.46360778808594, 4.91180419921875, 1.960845947265625, 32.502113342285156, -0.4827728271484375, 25.815353393554688, 10.292972564697266, 2.0766334533691406, 51.25474548339844, 31.356014251708984, 0.4296245574951172, 12.419330596923828, 7.9315948486328125, 17.65947723388672, 0.1366710662841797, 15.347702026367188, 17.470577239990234, 35.89299011230469, -1.36676025390625, 11.512466430664062, 11.198593139648438, -14.967140197753906, 25.972129821777344], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000419.npy"}
{"epoch": 0.6334089191232048, "step": 420, "batch_size": 64, "mean": 7.538368225097656, "std": 17.010129928588867, "min": -36.834503173828125, "p10": -9.938254356384276, "median": 5.957975387573242, "p90": 31.542837142944336, "max": 51.63890075683594, "pos_frac": 0.671875, "sample": [-1.4126663208007812, -16.397014617919922, 33.29619598388672, -0.2641143798828125, 36.27318572998047, -14.745849609375, 13.395889282226562, 4.330852508544922, 7.115608215332031, -32.857940673828125, -1.9242324829101562, 14.106773376464844, 4.022636413574219, 18.661941528320312, -4.635993957519531, 20.705078125, 39.485595703125, 21.635116577148438, 42.768253326416016, 22.98870086669922, 2.489522933959961, -5.853443145751953, 2.3192806243896484, 11.553733825683594, 7.073305130004883, 31.55545425415039, -10.437185287475586, -21.42915916442871, 5.106599807739258, 6.420917510986328, -12.486328125, 5.495033264160156, 1.44073486328125, 2.2398414611816406, -3.6170005798339844, -5.868398666381836, 4.914207458496094, -8.27203369140625, 18.31298828125, 14.753631591796875, -2.6596450805664062, -36.834503173828125, -7.718414306640625, 8.022994995117188, 35.22922134399414, 9.187719345092773, 11.14010238647461, 30.299325942993164, 7.761924743652344, 8.73388671875, 24.06043243408203, 11.758583068847656, 14.887203216552734, -1.6150226593017578, 1.4833869934082031, 17.70149040222168, 31.513397216796875, -8.77408218383789, 19.73153305053711, 15.858184814453125, 4.24847412109375, 51.63890075683594, -8.35052490234375, -7.108707427978516], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000420.npy"}
{"epoch": 0.6349206349206349, "step": 421, "batch_size": 64, "mean": 8.942495346069336, "std": 18.435340881347656, "min": -40.40763854980469, "p10": -14.020944595336912, "median": 8.981414794921875, "p90": 30.737528991699218, "max": 60.298370361328125, "pos_frac": 0.671875, "sample": [-21.60943603515625, 30.8182373046875, -0.20599746704101562, 19.14639663696289, 15.951286315917969, 20.598159790039062, 22.36853790283203, -0.824951171875, 0.3140411376953125, 60.298370361328125, 43.72871398925781, -12.72100830078125, 32.015403747558594, 14.0074462890625, -6.718067169189453, 3.63702392578125, 7.309326171875, -5.092185974121094, 26.70030975341797, 12.795806884765625, 29.679100036621094, 8.591293334960938, 2.9950008392333984, -2.7120399475097656, 15.270063400268555, -3.9465980529785156, 22.134307861328125, -40.40763854980469, -20.730865478515625, -14.578060150146484, 11.312929153442383, 45.50733184814453, 30.549209594726562, 1.2585811614990234, 5.886756896972656, 25.369972229003906, 35.92823791503906, 25.153751373291016, -0.3054008483886719, 2.081623077392578, 35.663116455078125, -0.16587066650390625, 4.8978118896484375, -11.056015014648438, -15.564956665039062, 15.989953994750977, -7.026878356933594, 11.195686340332031, -22.116485595703125, -0.2746429443359375, 21.111907958984375, 29.697959899902344, 2.766674041748047, 12.57147216796875, 1.951324462890625, -19.758087158203125, 12.717887878417969, 10.209352493286133, 9.371536254882812, -11.3575439453125, 20.80109405517578, 26.28797149658203, 10.301124572753906, -7.449665069580078], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000421.npy"}
{"epoch": 0.636432350718065, "step": 422, "batch_size": 64, "mean": 11.161639213562012, "std": 14.725334167480469, "min": -15.71380615234375, "p10": -5.9644733428955075, "median": 11.359259605407715, "p90": 24.7713249206543, "max": 60.709129333496094, "pos_frac": 0.765625, "sample": [25.141685485839844, 44.590576171875, -14.718301773071289, -0.2933006286621094, 18.16204833984375, -5.866935729980469, 19.432281494140625, -15.71380615234375, 11.504535675048828, 60.709129333496094, 11.675537109375, 13.763650894165039, -3.6398696899414062, 13.740997314453125, 29.624656677246094, 4.210685729980469, 19.836654663085938, 23.35213851928711, 6.0760498046875, 16.605934143066406, 14.10015869140625, -2.086639404296875, -6.006275177001953, 11.05523681640625, 9.586761474609375, 20.553466796875, 14.687137603759766, 21.37887954711914, 42.52943420410156, 11.420578002929688, 12.140777587890625, 18.955886840820312, 9.061382293701172, -6.445072174072266, 4.7565155029296875, 8.682697296142578, 21.113529205322266, 4.017148971557617, 26.598052978515625, 0.6311264038085938, 1.1240310668945312, -10.68661117553711, 22.466079711914062, 18.586254119873047, 23.907150268554688, -9.732460021972656, -2.72259521484375, 13.526725769042969, 2.500885009765625, 16.154878616333008, -0.7963371276855469, 49.8514404296875, 5.284297943115234, -1.7141494750976562, 2.5973854064941406, 8.222396850585938, 1.1246814727783203, -12.433799743652344, 9.525678634643555, 20.393997192382812, 22.87722396850586, 11.297941207885742, -3.70196533203125, 11.766664505004883], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000422.npy"}
{"epoch": 0.6379440665154951, "step": 423, "batch_size": 64, "mean": 12.38492202758789, "std": 16.620140075683594, "min": -24.714111328125, "p10": -4.8508766174316404, "median": 10.050159454345703, "p90": 34.99904251098634, "max": 56.74702072143555, "pos_frac": 0.8125, "sample": [6.85382080078125, -4.8321685791015625, 17.77056884765625, -1.6445331573486328, 56.74702072143555, 1.5669612884521484, 4.198337554931641, -24.714111328125, 50.16410827636719, 5.002124786376953, -4.523193359375, 27.40993309020996, 8.416748046875, -4.858894348144531, 7.072349548339844, -0.2602386474609375, 0.6261520385742188, -9.586868286132812, 36.913116455078125, -7.4500274658203125, 10.394794464111328, 10.902412414550781, -15.987129211425781, 1.4521598815917969, 7.937202453613281, 30.979248046875, 31.659019470214844, 1.249847412109375, -21.281904220581055, 6.7458343505859375, 27.529205322265625, 36.43048095703125, 14.862899780273438, 14.397293090820312, 21.403749465942383, 19.21044921875, 23.812698364257812, 26.366439819335938, 54.62985610961914, 30.888641357421875, 10.261528015136719, 7.328218460083008, 3.0383071899414062, 11.760330200195312, 16.237449645996094, 27.333656311035156, 28.068321228027344, 6.708717346191406, 2.3972511291503906, 13.296058654785156, 36.48760986328125, 12.738014221191406, 1.7934894561767578, 12.493551254272461, 16.276214599609375, 37.475921630859375, 9.838790893554688, 0.960968017578125, -6.062644958496094, 18.12396240234375, 24.73333740234375, -0.116973876953125, 2.5942344665527344, 0.41427040100097656], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000423.npy"}
{"epoch": 0.6394557823129252, "step": 424, "batch_size": 64, "mean": 12.973922729492188, "std": 15.519699096679688, "min": -13.989456176757812, "p10": -5.811342620849609, "median": 9.019485473632812, "p90": 35.410984802246105, "max": 48.16400146484375, "pos_frac": 0.796875, "sample": [-3.24517822265625, 33.24842071533203, 3.3239059448242188, 27.49334144592285, -0.7982864379882812, 22.44859504699707, 48.16400146484375, 16.845481872558594, -9.04974365234375, -13.989456176757812, -5.9744415283203125, 6.003480911254883, 2.3591442108154297, -13.207992553710938, 36.278785705566406, 7.735771179199219, 25.878036499023438, -6.814701080322266, 10.290481567382812, 4.542514801025391, -0.6869716644287109, 18.67923355102539, 21.496726989746094, 4.407508850097656, -6.610942840576172, 15.296531677246094, 38.904815673828125, 5.233695983886719, 46.835540771484375, 7.579290390014648, 19.736427307128906, 12.829254150390625, 4.399986267089844, 5.531227111816406, 19.28099822998047, 6.601318359375, 15.849346160888672, 41.9454345703125, 15.033973693847656, 21.144744873046875, 5.19635009765625, 18.748809814453125, 8.09225082397461, 41.05523681640625, 9.429023742675781, 43.30445861816406, 27.235183715820312, 0.43811988830566406, -5.430778503417969, 0.36989402770996094, 31.130123138427734, 8.609947204589844, 15.68548583984375, 33.38611602783203, -0.6526412963867188, 3.1351699829101562, 15.2659912109375, -12.2171630859375, 32.35502624511719, 29.521564483642578, 0.641571044921875, 15.26409912109375, -0.8044967651367188, 5.5514068603515625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000424.npy"}
{"epoch": 0.6409674981103552, "step": 425, "batch_size": 64, "mean": 10.299787521362305, "std": 14.790627479553223, "min": -18.810020446777344, "p10": -9.099824714660643, "median": 7.734262466430664, "p90": 30.873355102539072, "max": 48.860572814941406, "pos_frac": 0.78125, "sample": [27.32518768310547, 24.49573516845703, 7.483650207519531, 5.077903747558594, 32.13368225097656, -2.701213836669922, 36.95250701904297, 2.2860069274902344, 13.104469299316406, 8.428386688232422, 11.352142333984375, -9.426523208618164, 15.201330184936523, 0.6475830078125, -11.616153717041016, 18.579803466796875, -1.7896041870117188, 37.310096740722656, 7.03070068359375, 4.343463897705078, -10.955228805541992, 10.745414733886719, 2.1328201293945312, 6.874183654785156, 18.599105834960938, 0.47930145263671875, 6.369882583618164, 35.437042236328125, -1.5658988952636719, 48.860572814941406, 27.147247314453125, -0.6899909973144531, 13.555374145507812, 4.895137786865234, 13.909347534179688, 6.347927093505859, 1.51025390625, 46.124664306640625, 14.540042877197266, 5.744159698486328, 18.74982452392578, -18.810020446777344, 10.181159973144531, 15.020034790039062, 13.596649169921875, -12.630769729614258, 7.95404052734375, 4.187808990478516, 13.014015197753906, 7.4723358154296875, 23.89798355102539, 2.0346641540527344, -9.319910049438477, 28.438201904296875, -6.618247985839844, 7.863773345947266, 28.072513580322266, 15.36050033569336, -7.333003997802734, 26.56793975830078, -8.586292266845703, 7.6047515869140625, 31.9169921875, -15.729080200195312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000425.npy"}
{"epoch": 0.6424792139077853, "step": 426, "batch_size": 64, "mean": 6.4043288230896, "std": 16.058971405029297, "min": -46.30827331542969, "p10": -9.645843315124509, "median": 4.583148956298828, "p90": 27.814408493041995, "max": 53.7537956237793, "pos_frac": 0.65625, "sample": [3.333148956298828, -0.08720779418945312, 15.886932373046875, 29.925010681152344, 9.537322998046875, 20.75359344482422, 0.4515533447265625, 53.7537956237793, 6.631134033203125, 7.2250518798828125, -13.105903625488281, -6.374542236328125, 11.835330963134766, -11.908916473388672, 3.68792724609375, 5.420879364013672, 14.504318237304688, -6.62139892578125, 4.455976486206055, -6.9109649658203125, -16.384841918945312, -0.985595703125, 3.2167205810546875, 1.9348373413085938, -6.8371734619140625, -2.122589111328125, 9.046689987182617, -46.30827331542969, 20.765621185302734, 13.527641296386719, 17.61669921875, 23.604639053344727, 11.897109985351562, -3.8126049041748047, -18.654617309570312, 9.24246597290039, -5.347564697265625, 4.710321426391602, 0.8795928955078125, 33.2318115234375, 36.890586853027344, 39.18556213378906, -14.221389770507812, -10.584182739257812, -5.964561462402344, 41.44548416137695, 0.2455596923828125, 2.670562744140625, -2.1846885681152344, -4.072998046875, 7.642553329467773, 26.698631286621094, 3.63043212890625, 23.3363037109375, 8.3316650390625, -5.389862060546875, 11.395435333251953, 11.537178039550781, 9.800460815429688, -1.7313079833984375, 10.369705200195312, 8.39577865600586, 28.292598724365234, -7.456384658813477], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000426.npy"}
{"epoch": 0.6439909297052154, "step": 427, "batch_size": 64, "mean": 10.857381820678711, "std": 16.276039123535156, "min": -25.3100643157959, "p10": -6.371303939819335, "median": 9.762567520141602, "p90": 32.933708190917976, "max": 68.40277099609375, "pos_frac": 0.75, "sample": [17.207172393798828, 4.589609146118164, -4.137676239013672, -12.683998107910156, -2.3461456298828125, 17.619979858398438, 33.40870666503906, -2.1451377868652344, 19.703262329101562, -10.260604858398438, 23.821372985839844, -2.436389923095703, -16.37492561340332, -6.235992431640625, 16.142852783203125, 20.860733032226562, 9.992279052734375, 14.793815612792969, 1.3849945068359375, 2.8625640869140625, -0.7738323211669922, 9.718242645263672, 23.126007080078125, 15.769035339355469, 9.806892395019531, 20.40801429748535, 7.685943603515625, 25.046791076660156, 4.120246887207031, 4.668663024902344, 35.20591735839844, 8.99700927734375, 16.46685791015625, 4.386543273925781, 11.390007019042969, 33.54610061645508, 1.1068611145019531, -25.3100643157959, 44.964027404785156, 27.686599731445312, 6.0125274658203125, -0.252166748046875, 14.660308837890625, -5.3497161865234375, -6.429294586181641, -10.097978591918945, 20.11774444580078, 24.34747314453125, 7.0409698486328125, -18.397220611572266, 37.421810150146484, 31.82537841796875, 2.72052001953125, 13.426460266113281, 12.572124481201172, 6.075225830078125, 10.089324951171875, 11.889312744140625, 10.216629028320312, 6.5802459716796875, 9.047294616699219, -5.240997314453125, 68.40277099609375, 44.411277770996094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000427.npy"}
{"epoch": 0.6455026455026455, "step": 428, "batch_size": 64, "mean": 13.07719612121582, "std": 17.252134323120117, "min": -24.588302612304688, "p10": -4.316696357727051, "median": 9.811370849609375, "p90": 37.79529151916504, "max": 65.86196899414062, "pos_frac": 0.765625, "sample": [-24.588302612304688, 27.769187927246094, 27.376617431640625, -3.1566085815429688, 1.1275978088378906, 5.163616180419922, -4.340473175048828, -10.93560791015625, -17.229568481445312, 3.8032760620117188, 37.868873596191406, -1.6968765258789062, 15.626335144042969, -5.348888397216797, 28.73914337158203, 11.193450927734375, 37.95298767089844, 4.526939392089844, -3.7021713256835938, 6.0143280029296875, 19.256134033203125, 9.346546173095703, 0.888946533203125, 15.715309143066406, 22.556854248046875, 13.903217315673828, 9.905143737792969, -12.539840698242188, 24.85821533203125, -14.061676025390625, 29.158920288085938, -2.8410720825195312, 5.231019973754883, 9.717597961425781, 16.71271514892578, 43.54877471923828, 38.477943420410156, 40.069732666015625, 8.773239135742188, -3.3823413848876953, 28.672042846679688, 65.86196899414062, 13.537284851074219, 29.86602783203125, 2.9559326171875, 3.4405670166015625, 27.315353393554688, 8.137290954589844, 2.838724136352539, 0.688629150390625, 5.607202529907227, 10.315834045410156, -0.2610664367675781, 1.2344932556152344, 38.5484504699707, 36.49250030517578, 37.623600006103516, 29.25713539123535, 20.832443237304688, 20.468257904052734, 31.423736572265625, 15.344396591186523, -0.45831298828125, -4.26121711730957], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000428.npy"}
{"epoch": 0.6470143613000756, "step": 429, "batch_size": 64, "mean": 10.786637306213379, "std": 15.497200012207031, "min": -27.69049835205078, "p10": -7.808487319946288, "median": 10.148231506347656, "p90": 31.67841892242432, "max": 55.44940185546875, "pos_frac": 0.796875, "sample": [23.763153076171875, -5.49749755859375, 0.9403610229492188, 11.769790649414062, 22.89330291748047, 11.617055892944336, 10.709430694580078, 24.447532653808594, -5.780353546142578, 4.60150146484375, 8.068328857421875, -8.41119384765625, 6.5617218017578125, 15.179519653320312, 5.192279815673828, 11.093292236328125, -27.69049835205078, 6.594490051269531, 6.214569091796875, 13.328725814819336, 20.467788696289062, 2.1545982360839844, 12.598419189453125, 26.55127716064453, 34.50273895263672, 33.17668151855469, 33.62682342529297, 5.375556945800781, 16.329696655273438, -14.777416229248047, -6.337898254394531, -13.99551773071289, 0.759368896484375, -13.769813537597656, 31.06989097595215, 1.9258804321289062, -9.288772583007812, 22.283252716064453, -3.8454818725585938, 0.6312217712402344, 27.109378814697266, 31.93921661376953, 6.03387451171875, 24.62531852722168, 32.40142059326172, -14.791458129882812, -6.332672119140625, 6.803016662597656, 4.619335174560547, 15.195688247680664, 0.096466064453125, 41.512916564941406, 17.144256591796875, -6.402172088623047, 9.587032318115234, 14.211013793945312, 29.289051055908203, 7.546245574951172, 55.44940185546875, 22.009475708007812, 5.13848876953125, 15.853818893432617, 11.51324462890625, 24.758621215820312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000429.npy"}
{"epoch": 0.6485260770975056, "step": 430, "batch_size": 64, "mean": 12.4488525390625, "std": 15.50989055633545, "min": -25.251800537109375, "p10": -2.072775650024414, "median": 9.805801391601562, "p90": 32.122814941406254, "max": 54.51991271972656, "pos_frac": 0.84375, "sample": [1.9378814697265625, 4.297063827514648, -1.1122665405273438, 6.2632904052734375, -8.954032897949219, 26.603439331054688, 0.13722610473632812, 50.65079879760742, 6.153505325317383, 12.137611389160156, -8.885204315185547, 28.566543579101562, 32.33228302001953, 5.0573272705078125, 42.42852020263672, 14.317146301269531, 22.622352600097656, 0.864593505859375, 39.42138671875, 3.8339290618896484, 5.069038391113281, 54.51991271972656, 9.493362426757812, 6.964290618896484, -1.1005001068115234, 36.95391845703125, 10.118240356445312, 27.004005432128906, 26.97399139404297, 16.291160583496094, 9.11083984375, 7.827362060546875, 12.150375366210938, 18.157745361328125, -2.0758323669433594, 48.24924850463867, 31.634056091308594, 14.449996948242188, -12.129817962646484, 26.69647979736328, 11.797592163085938, 11.767341613769531, 4.958703994750977, 6.2535858154296875, 7.493366241455078, 19.029590606689453, -2.2032089233398438, 0.8313426971435547, -25.251800537109375, 0.28186798095703125, 1.55780029296875, 14.386539459228516, 2.5388412475585938, 21.53192138671875, -2.065643310546875, 25.691261291503906, 20.51153564453125, 13.569650650024414, 16.633773803710938, 18.556289672851562, 10.319900512695312, -14.12674331665039, 3.062835693359375, 4.568878173828125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000430.npy"}
{"epoch": 0.6500377928949358, "step": 431, "batch_size": 64, "mean": 10.449111938476562, "std": 15.33355712890625, "min": -23.160354614257812, "p10": -6.678289794921874, "median": 8.17955207824707, "p90": 32.62090148925782, "max": 58.062198638916016, "pos_frac": 0.75, "sample": [-23.160354614257812, 13.688896179199219, 10.98366928100586, 3.2173500061035156, 1.2037353515625, 24.82599639892578, 33.418853759765625, 14.075752258300781, 22.017471313476562, -1.8472518920898438, -0.49204063415527344, 47.86855697631836, 14.31460952758789, -2.0782623291015625, 7.745548248291016, 12.028875350952148, 45.958702087402344, -4.743251800537109, 25.26177978515625, 6.926177978515625, 7.322593688964844, -9.92437744140625, 2.3412857055664062, 33.707916259765625, -8.379257202148438, 42.00374984741211, 33.005638122558594, -7.29652214050293, 7.079315185546875, -3.4041786193847656, 4.460760116577148, 9.81690788269043, 14.87646484375, 24.291595458984375, -7.91973876953125, 6.461158752441406, 31.723182678222656, 8.383544921875, -17.297210693359375, -7.1427001953125, 0.385833740234375, 9.303802490234375, -5.59466552734375, 18.03032684326172, -0.0004425048828125, 5.105155944824219, 14.647438049316406, 10.709304809570312, 9.250059127807617, 58.062198638916016, 2.11138916015625, 9.008136749267578, -0.260589599609375, 16.24703598022461, 12.01357650756836, 8.699790954589844, 7.564140319824219, -0.18231582641601562, 7.975559234619141, 23.2720947265625, 27.841323852539062, 7.90308952331543, 10.790306091308594, 0.5356521606445312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000431.npy"}
{"epoch": 0.6515495086923658, "step": 432, "batch_size": 64, "mean": 11.078676223754883, "std": 14.657334327697754, "min": -23.276351928710938, "p10": -4.797689819335936, "median": 9.761405944824219, "p90": 30.380646324157716, "max": 45.709800720214844, "pos_frac": 0.75, "sample": [-3.0254974365234375, -6.988306045532227, 1.405670166015625, -23.276351928710938, 2.5662384033203125, 4.7093353271484375, -6.715160369873047, 12.768997192382812, -2.0043563842773438, 37.7598876953125, 7.133565902709961, -21.57343292236328, 1.79498291015625, 12.169143676757812, 23.042495727539062, -8.260162353515625, 12.947013854980469, -5.251708984375, 31.988128662109375, 34.009490966796875, 0.5401992797851562, 19.942100524902344, 27.002357482910156, 6.96307373046875, -0.6616077423095703, 26.0457763671875, 5.776222229003906, 28.918014526367188, 22.353988647460938, -2.060192108154297, -1.9219474792480469, 5.476245880126953, 3.8773555755615234, -6.623756408691406, 18.539947509765625, 8.860969543457031, 34.184532165527344, 17.528282165527344, 10.661842346191406, 22.187782287597656, -0.9528541564941406, 13.448734283447266, 15.776748657226562, 2.5993499755859375, 20.4556884765625, 5.9163970947265625, 0.5611209869384766, 30.403162002563477, 1.6347618103027344, 45.709800720214844, 20.32002830505371, 13.369503021240234, 20.385498046875, 26.640655517578125, 21.3232421875, 45.39191436767578, -2.7387008666992188, 17.183002471923828, 16.542495727539062, -3.738311767578125, -1.8224029541015625, 30.328109741210938, 3.122528076171875, 14.38363265991211], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000432.npy"}
{"epoch": 0.6530612244897959, "step": 433, "batch_size": 64, "mean": 13.874216079711914, "std": 17.683834075927734, "min": -25.941673278808594, "p10": -4.770421409606933, "median": 9.347003936767578, "p90": 40.172603225708016, "max": 57.909523010253906, "pos_frac": 0.8125, "sample": [7.563331604003906, 21.00257110595703, 1.5882911682128906, 9.084915161132812, 11.112968444824219, 4.7823333740234375, 0.1910839080810547, 27.202484130859375, -3.5896034240722656, 10.374366760253906, 5.910316467285156, 8.050102233886719, 8.036468505859375, -3.454345703125, 50.004150390625, -15.255489349365234, 7.7407379150390625, 27.899433135986328, 23.75887680053711, -6.093746185302734, 31.15807342529297, 3.3869552612304688, 5.011474609375, 17.482873916625977, -25.941673278808594, 6.315889358520508, 26.402435302734375, 1.4185676574707031, 28.54131507873535, 22.779170989990234, 1.5035171508789062, 19.526084899902344, -6.240026473999023, 9.609092712402344, 14.402103424072266, 3.8703079223632812, -4.106639862060547, 9.838973999023438, 12.78851318359375, 57.909523010253906, 42.91499328613281, 38.6783447265625, 44.021148681640625, 4.35833740234375, 26.876800537109375, -16.72876739501953, 0.9486827850341797, 28.354637145996094, 0.14240264892578125, 40.8129997253418, -1.4738349914550781, 18.060646057128906, 12.441734313964844, -5.054899215698242, -1.3109512329101562, 34.74469757080078, 20.980669021606445, 2.95733642578125, 30.851905822753906, 36.497222900390625, 42.40876007080078, 7.015533447265625, 53.56934356689453, -5.683719635009766], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000433.npy"}
{"epoch": 0.654572940287226, "step": 434, "batch_size": 64, "mean": 12.23498249053955, "std": 17.2602481842041, "min": -32.821563720703125, "p10": -9.371756362915038, "median": 12.462017059326172, "p90": 35.4175594329834, "max": 55.552494049072266, "pos_frac": 0.796875, "sample": [-26.320213317871094, 12.592628479003906, 20.597734451293945, 3.3301544189453125, 16.152862548828125, 32.58174514770508, -9.611316680908203, 26.09113311767578, -9.66754150390625, 8.29461669921875, 0.1841754913330078, -7.5074462890625, -32.821563720703125, 17.290267944335938, 12.873062133789062, -0.04042816162109375, 26.566219329833984, 12.495536804199219, 35.499603271484375, 39.14237594604492, 9.472572326660156, -0.23882675170898438, 38.91455078125, 5.301246643066406, -15.158988952636719, 19.457534790039062, 38.928672790527344, 15.768119812011719, 5.9232635498046875, 15.540868759155273, 3.2654991149902344, 30.81725311279297, 12.428497314453125, 55.552494049072266, 48.517189025878906, 28.24713134765625, -8.812782287597656, 19.707046508789062, 22.4261474609375, -3.0796966552734375, 35.22612380981445, -14.826622009277344, 8.339872360229492, 13.48202896118164, 5.309318542480469, 4.669761657714844, 2.593608856201172, 29.69970703125, 19.766387939453125, 12.183418273925781, 0.7139663696289062, 37.004844665527344, -14.74508285522461, 1.0784645080566406, 0.1874847412109375, 8.868537902832031, 22.740734100341797, 29.117103576660156, 11.934850692749023, -6.872386932373047, 14.98503303527832, 14.869583129882812, 17.57891273498535, 8.431879043579102], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000434.npy"}
{"epoch": 0.656084656084656, "step": 435, "batch_size": 64, "mean": 10.593865394592285, "std": 14.43040657043457, "min": -24.065635681152344, "p10": -4.76915340423584, "median": 7.320978164672852, "p90": 30.542166900634776, "max": 48.86137008666992, "pos_frac": 0.796875, "sample": [36.20320129394531, 6.93815803527832, 1.3382759094238281, 6.054603576660156, -13.738199234008789, 0.333343505859375, 7.2171630859375, 2.391765594482422, 24.11937713623047, -1.8024063110351562, 14.423572540283203, 28.453697204589844, -4.6476593017578125, 24.133575439453125, 28.248851776123047, 21.02873992919922, 5.268348693847656, -8.917892456054688, -1.1269683837890625, 32.466835021972656, 13.851158142089844, 4.931318283081055, 21.554771423339844, 34.47675323486328, 2.395233154296875, 14.151908874511719, -0.934661865234375, 4.488983154296875, 33.8315544128418, 23.528358459472656, 3.0301055908203125, 23.79888916015625, -19.363067626953125, 12.131488800048828, 11.48577880859375, 21.059940338134766, 1.8966751098632812, 1.0936431884765625, 12.726194381713867, 24.12078857421875, 18.9654541015625, 32.869476318359375, 48.86137008666992, 21.939529418945312, 0.8483390808105469, 7.194211959838867, 8.556819915771484, 0.38030433654785156, -3.2670249938964844, 31.437225341796875, -9.370506286621094, 10.054794311523438, -4.821222305297852, 5.700691223144531, 7.424793243408203, 14.663501739501953, 25.74139404296875, -11.753509521484375, -24.065635681152344, 5.2188262939453125, -1.6035690307617188, 11.89642333984375, 22.44589614868164, 6.047576904296875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000435.npy"}
{"epoch": 0.6575963718820862, "step": 436, "batch_size": 64, "mean": 11.02175521850586, "std": 14.968390464782715, "min": -21.833267211914062, "p10": -7.678165435791016, "median": 12.478650093078613, "p90": 30.536671447753907, "max": 40.658714294433594, "pos_frac": 0.75, "sample": [16.46477508544922, 2.4461402893066406, 40.658714294433594, 3.428956985473633, 15.369596481323242, 2.827188491821289, 9.575496673583984, 24.316986083984375, -7.491004943847656, 4.6581878662109375, 16.879653930664062, 31.990659713745117, 8.803237915039062, 17.992279052734375, 1.8745384216308594, 29.935653686523438, 30.79425048828125, 38.52177429199219, -11.845909118652344, 22.146102905273438, 16.152305603027344, 12.497129440307617, -7.0254058837890625, 3.3300857543945312, 11.90431022644043, -0.633758544921875, 32.92059326171875, -2.600748062133789, 22.539886474609375, 24.057540893554688, 19.825672149658203, 1.781005859375, -20.0705623626709, 1.9788665771484375, -0.9220638275146484, 8.2135009765625, 22.49250030517578, -3.479459762573242, 1.8612136840820312, -6.66070556640625, 21.63385009765625, 16.413406372070312, 12.46017074584961, -1.7330780029296875, -10.124908447265625, 13.608880996704102, -9.770111083984375, 39.8155517578125, 13.426307678222656, 28.585205078125, 18.284339904785156, 1.4904537200927734, 24.945388793945312, 13.212181091308594, -4.501243591308594, 18.478023529052734, 26.77326774597168, -7.7583770751953125, 17.28075408935547, 0.7643470764160156, -21.833267211914062, 37.913726806640625, -10.006649017333984, 28.524925231933594], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000436.npy"}
{"epoch": 0.6591080876795162, "step": 437, "batch_size": 64, "mean": 10.476823806762695, "std": 18.794281005859375, "min": -37.098968505859375, "p10": -9.927384948730468, "median": 7.958288192749023, "p90": 35.47010498046877, "max": 63.8214111328125, "pos_frac": 0.703125, "sample": [46.49456787109375, 29.133697509765625, 23.60774803161621, 14.928855895996094, 3.6765899658203125, -15.555805206298828, 27.914825439453125, 18.74957275390625, 63.8214111328125, 6.730998992919922, 22.43036651611328, 5.215667724609375, -37.098968505859375, 14.509765625, -6.8238525390625, -6.8452301025390625, 13.282699584960938, 8.19668960571289, 10.44888687133789, -6.011299133300781, 51.47972106933594, 47.912559509277344, 37.36715316772461, -6.141666412353516, -13.447383880615234, 9.434135437011719, 25.24505615234375, -10.985504150390625, -19.089656829833984, -9.194572448730469, -4.997566223144531, -4.037660598754883, 0.0330047607421875, -0.9975509643554688, 12.477371215820312, 43.058555603027344, 13.701614379882812, -8.727119445800781, 9.201866149902344, 4.124114990234375, 3.6689071655273438, 1.0114097595214844, -4.726448059082031, 19.719390869140625, 12.813148498535156, 28.301761627197266, 5.513236999511719, 18.388755798339844, 26.502037048339844, -10.241447448730469, 6.100980758666992, 31.043659210205078, 15.927047729492188, 30.58038330078125, 16.9876708984375, 5.727386474609375, 2.89874267578125, 20.976531982421875, -19.238723754882812, -1.5359077453613281, -0.1418914794921875, 7.719886779785156, 0.521148681640625, 38.775390625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000437.npy"}
{"epoch": 0.6606198034769464, "step": 438, "batch_size": 64, "mean": 13.026363372802734, "std": 16.612516403198242, "min": -19.297197341918945, "p10": -6.3564002990722654, "median": 11.578666687011719, "p90": 34.84794654846191, "max": 59.755767822265625, "pos_frac": 0.796875, "sample": [43.77836608886719, -5.543426513671875, 59.755767822265625, -18.23786163330078, 0.9765701293945312, -6.392147064208984, 1.1600341796875, 14.71622085571289, 1.0639209747314453, 34.73338317871094, 15.991580963134766, 40.774208068847656, -6.272991180419922, -0.7947616577148438, 21.10179901123047, 0.9197406768798828, 23.585729598999023, -19.297197341918945, 19.508363723754883, 13.369577407836914, 29.428234100341797, 22.705158233642578, -6.554859161376953, 16.71038818359375, 17.448768615722656, 20.510766983032227, 14.252059936523438, 2.7584190368652344, 34.89704513549805, 34.52574157714844, 19.70294952392578, -1.46661376953125, 36.053619384765625, 2.812429428100586, -0.7636260986328125, 23.511024475097656, 21.743751525878906, -9.287397384643555, 11.163410186767578, 18.377748489379883, 28.50922203063965, 1.8290557861328125, 7.602733612060547, 38.811309814453125, 29.958473205566406, 10.713058471679688, 20.850929260253906, 5.637218475341797, 1.6467056274414062, 8.295303344726562, -3.157367706298828, 0.9701309204101562, 7.21723747253418, 32.380760192871094, 24.69892120361328, -15.82305908203125, 39.743080139160156, 1.8659439086914062, -15.770332336425781, 30.085622787475586, 9.574634552001953, 2.123514175415039, 11.99392318725586, 10.504375457763672], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000438.npy"}
{"epoch": 0.6621315192743764, "step": 439, "batch_size": 64, "mean": 11.690801620483398, "std": 14.457306861877441, "min": -24.84423828125, "p10": -4.086472702026367, "median": 9.522937774658203, "p90": 33.55199203491211, "max": 42.28172302246094, "pos_frac": 0.828125, "sample": [13.316783905029297, -0.8087005615234375, 3.9395904541015625, -14.61865234375, 15.781604766845703, 18.078441619873047, 8.413248062133789, 8.37548828125, 9.531852722167969, 26.151657104492188, 40.51856994628906, 0.31096458435058594, 32.33177185058594, 10.791671752929688, 8.867294311523438, 28.225372314453125, 42.28172302246094, 0.8726425170898438, 18.7236328125, -6.13348388671875, 2.8709373474121094, 21.19481086730957, 14.991527557373047, -6.509532928466797, 22.684131622314453, 8.024452209472656, 40.69189453125, -4.124111175537109, 3.3113956451416016, 35.1384391784668, -3.1984386444091797, 33.75849151611328, -3.9986495971679688, -1.472900390625, 20.016719818115234, 17.045166015625, 1.5063934326171875, 7.864234924316406, -24.84423828125, 2.9589405059814453, -10.292877197265625, 9.250259399414062, 25.704971313476562, 39.915618896484375, 10.719371795654297, 27.085525512695312, 10.255630493164062, 40.563873291015625, 4.446868896484375, 13.137939453125, 7.196079254150391, 8.33062744140625, 9.514022827148438, 16.889503479003906, 13.698083877563477, 33.070159912109375, 6.948333740234375, 3.3147735595703125, -12.066871643066406, 0.09299468994140625, 10.014415740966797, 12.011886596679688, 10.285804748535156, 5.263206481933594], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000439.npy"}
{"epoch": 0.6636432350718064, "step": 440, "batch_size": 64, "mean": 12.544342041015625, "std": 16.05022430419922, "min": -24.171266555786133, "p10": -5.413516044616698, "median": 10.339771270751953, "p90": 36.17706527709961, "max": 48.86863708496094, "pos_frac": 0.78125, "sample": [-4.890506744384766, 15.095230102539062, 31.56549072265625, 42.180023193359375, 35.87239074707031, 48.86863708496094, 40.75971603393555, -0.6004848480224609, 32.17859649658203, 5.5061798095703125, 11.502799987792969, 43.41673278808594, 6.3568878173828125, 26.446754455566406, 15.814159393310547, 6.037300109863281, 24.36604881286621, -8.088668823242188, -8.8939208984375, 8.614835739135742, 10.148948669433594, -0.7523651123046875, 3.067279815673828, 15.143821716308594, 14.163848876953125, 16.820283889770508, 7.299468994140625, 13.600128173828125, 2.460247039794922, 6.716400146484375, 5.883869171142578, 10.478965759277344, 18.114234924316406, 2.0535850524902344, 1.4309043884277344, 3.9791336059570312, 11.084150314331055, -24.171266555786133, 11.970354080200195, 36.307640075683594, 22.09682846069336, 31.357112884521484, 1.3689422607421875, 32.67396545410156, -21.002323150634766, 4.439197540283203, 33.863304138183594, 8.950096130371094, 17.80361557006836, -2.4354171752929688, -2.526081085205078, -5.637662887573242, 19.4130859375, -2.9644699096679688, 11.33946418762207, 5.999385833740234, -0.8706932067871094, -6.301568984985352, 18.26498794555664, -10.276077270507812, 10.200576782226562, 41.927581787109375, 42.28807067871094, 14.958152770996094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000440.npy"}
{"epoch": 0.6651549508692366, "step": 441, "batch_size": 64, "mean": 13.467504501342773, "std": 12.822663307189941, "min": -12.592144012451172, "p10": -1.7284927368164054, "median": 12.160597801208496, "p90": 31.372674751281746, "max": 44.67438507080078, "pos_frac": 0.859375, "sample": [16.536558151245117, 11.68110466003418, 32.051307678222656, 24.037872314453125, 0.1424884796142578, 9.132047653198242, 23.8068790435791, 2.3317947387695312, 17.576171875, -0.73968505859375, -5.2375335693359375, 25.526947021484375, 6.825927734375, 32.16041564941406, 4.425510406494141, 6.553924560546875, 15.420642852783203, 8.877227783203125, -12.592144012451172, 43.45790100097656, 9.090164184570312, 15.585224151611328, 27.434112548828125, 35.543846130371094, 8.75680160522461, 44.67438507080078, 4.577564239501953, 11.295013427734375, 41.453208923339844, 6.316925048828125, 3.53955078125, 13.439987182617188, -3.2533607482910156, 21.332904815673828, 16.988494873046875, -2.1522674560546875, -0.1835784912109375, 29.78919792175293, 23.749006271362305, 5.563316345214844, 16.576446533203125, 29.246131896972656, 1.8336067199707031, 10.164257049560547, 18.998306274414062, 12.640090942382812, 15.048412322998047, 5.514636993408203, 8.747299194335938, -3.786640167236328, 25.686859130859375, 3.197986602783203, 29.76793670654297, 32.50294494628906, -6.851011276245117, 15.859169006347656, 8.632789611816406, 15.719093322753906, 6.057258605957031, 12.725120544433594, -10.26995849609375, 15.045722961425781, 17.5175724029541, 5.830358505249023], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000441.npy"}
{"epoch": 0.6666666666666666, "step": 442, "batch_size": 64, "mean": 11.842670440673828, "std": 15.254429817199707, "min": -19.08641815185547, "p10": -6.595755767822264, "median": 10.218162536621094, "p90": 28.526647186279302, "max": 51.263023376464844, "pos_frac": 0.765625, "sample": [3.0427169799804688, 23.823410034179688, 13.817672729492188, -7.291252136230469, 8.995752334594727, 49.75868225097656, 40.015380859375, 35.59859848022461, 15.966651916503906, 24.899513244628906, 14.126285552978516, 18.99951934814453, 27.047515869140625, -4.972930908203125, 19.781410217285156, 12.979272842407227, -0.3023681640625, 2.9307937622070312, 6.0875701904296875, -4.8720855712890625, -0.691925048828125, 21.891517639160156, 14.998992919921875, 4.971855163574219, 26.446792602539062, 26.03319549560547, -19.08641815185547, 13.28656005859375, -1.0119647979736328, 22.551132202148438, 10.125564575195312, 9.236053466796875, 17.040206909179688, 26.972801208496094, 19.0146484375, 6.205820083618164, -8.495475769042969, 6.608737945556641, 51.263023376464844, 29.160560607910156, 8.821321487426758, -2.5797805786132812, 6.482481002807617, 11.850269317626953, -3.7552852630615234, -11.131996154785156, -15.717903137207031, 41.71272277832031, -11.800102233886719, 3.360565185546875, 25.504562377929688, 17.619483947753906, -17.31000518798828, 7.638069152832031, 5.277374267578125, -4.064077377319336, 8.000045776367188, 3.4425201416015625, 5.0860595703125, 26.981277465820312, 21.103790283203125, 31.636558532714844, 10.310760498046875, 12.508413314819336], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000442.npy"}
{"epoch": 0.6681783824640968, "step": 443, "batch_size": 64, "mean": 15.245677947998047, "std": 17.70081901550293, "min": -23.074417114257812, "p10": -3.144652175903318, "median": 11.789647102355957, "p90": 38.532902526855466, "max": 67.48007202148438, "pos_frac": 0.84375, "sample": [1.4834251403808594, 26.174087524414062, 21.391342163085938, 27.5009765625, 6.555931091308594, 24.308547973632812, 17.718505859375, -19.707321166992188, 11.727523803710938, 12.497306823730469, 0.37567901611328125, -4.115535736083984, 6.760265350341797, 23.16779136657715, 3.7706260681152344, 26.894180297851562, 1.5532703399658203, 13.874931335449219, 11.318794250488281, 24.102005004882812, 67.48007202148438, -23.074417114257812, -6.840538024902344, 28.03044891357422, -0.10615730285644531, 49.591896057128906, 42.663970947265625, 7.015863418579102, 50.403472900390625, 38.55204391479492, 20.0391845703125, 9.452037811279297, 11.851770401000977, 20.394207000732422, 25.609878540039062, 27.93426513671875, 29.391799926757812, 38.47160339355469, 30.76956558227539, 31.238800048828125, 5.528675079345703, 27.14618492126465, 0.27834129333496094, 18.109100341796875, 38.48823928833008, 31.249637603759766, -4.998313903808594, 7.2976837158203125, 6.398551940917969, 1.545907974243164, 1.050405502319336, -10.7366943359375, 11.261810302734375, 15.591766357421875, 6.247714996337891, 4.120708465576172, -22.680335998535156, 6.301177978515625, 38.855224609375, 9.119699478149414, -0.4792919158935547, -0.8792572021484375, 9.817466735839844, 40.86689758300781], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000443.npy"}
{"epoch": 0.6696900982615268, "step": 444, "batch_size": 64, "mean": 6.04442024230957, "std": 15.02802848815918, "min": -26.38718032836914, "p10": -13.038907623291015, "median": 3.874347686767578, "p90": 25.27651596069336, "max": 40.79718017578125, "pos_frac": 0.65625, "sample": [20.441272735595703, 1.7035331726074219, -4.854148864746094, 2.6512889862060547, 9.707351684570312, 16.907390594482422, 16.994924545288086, -12.371932983398438, -4.099964141845703, 26.47469139099121, -6.4123382568359375, 25.32111358642578, -6.186248779296875, -13.78253173828125, 8.318099975585938, 19.047531127929688, 20.055072784423828, 12.055540084838867, 3.512399673461914, -20.242568969726562, 33.925811767578125, 23.070823669433594, 1.1427421569824219, 2.2735443115234375, -5.5679931640625, 3.6707611083984375, 1.2775726318359375, -5.085973739624023, -13.119361877441406, -8.095123291015625, 25.172454833984375, 4.403928756713867, 15.988380432128906, 10.717864990234375, 6.341758728027344, 18.157791137695312, 4.077934265136719, 16.95223617553711, -3.876453399658203, -26.38718032836914, 5.344482421875, 40.79718017578125, -22.835487365722656, -13.822784423828125, 2.655771255493164, 24.668664932250977, 13.753936767578125, 34.0948486328125, 28.98705291748047, 9.049880981445312, 12.423131942749023, 6.3985443115234375, 2.0267906188964844, -0.9543609619140625, 15.562400817871094, -5.901700973510742, -6.102298736572266, 2.092947006225586, 14.492706298828125, -12.851181030273438, -18.95856475830078, 39.04432678222656, -2.7319412231445312, -0.6734523773193359], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000444.npy"}
{"epoch": 0.671201814058957, "step": 445, "batch_size": 64, "mean": 11.31997299194336, "std": 14.911118507385254, "min": -22.28083038330078, "p10": -8.841629219055175, "median": 8.103286743164062, "p90": 33.78619461059571, "max": 41.54234313964844, "pos_frac": 0.828125, "sample": [21.911060333251953, 36.713890075683594, 6.70977783203125, 3.6620941162109375, -10.351211547851562, 4.808876037597656, 29.987972259521484, 8.637115478515625, 12.040740966796875, 7.393573760986328, -1.6049346923828125, -13.184776306152344, -3.8408164978027344, 5.7445220947265625, -12.044403076171875, 23.048372268676758, -7.4590301513671875, 1.8808135986328125, 0.650177001953125, 5.872825622558594, 0.38372230529785156, 30.55889892578125, -18.379253387451172, 3.5407485961914062, -22.28083038330078, 31.822166442871094, 10.1689453125, 34.02072525024414, 2.1842880249023438, 2.1049575805664062, -13.427814483642578, -9.434171676635742, 4.21051025390625, 33.297569274902344, 41.54234313964844, 9.754547119140625, 20.476425170898438, 35.686431884765625, 7.658775329589844, 4.660491943359375, 23.543678283691406, 24.52375030517578, 16.525943756103516, 33.99560546875, 21.007020950317383, 23.363399505615234, 29.403152465820312, 3.9751815795898438, 14.594114303588867, -3.366342544555664, 7.078609466552734, 5.745277404785156, 3.490997314453125, 8.547798156738281, 18.85881805419922, 11.725448608398438, 34.43415069580078, 25.090469360351562, 5.085336685180664, 10.934059143066406, 19.36379623413086, 16.475019454956055, 34.78282928466797, 6.174049377441406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000445.npy"}
{"epoch": 0.672713529856387, "step": 446, "batch_size": 64, "mean": 15.477706909179688, "std": 14.095762252807617, "min": -7.456951141357422, "p10": -2.187261199951171, "median": 15.3084716796875, "p90": 35.48521308898926, "max": 54.520050048828125, "pos_frac": 0.84375, "sample": [1.5223388671875, 0.8155746459960938, -7.456951141357422, 24.26361083984375, 11.200889587402344, 23.181991577148438, 6.979644775390625, -4.518028259277344, 54.520050048828125, 36.00227355957031, 13.799911499023438, 15.970321655273438, 15.558074951171875, 19.28607940673828, 22.616302490234375, 44.18577575683594, 15.336502075195312, 20.154325485229492, 28.168601989746094, 16.460289001464844, 0.6860160827636719, 17.994564056396484, -3.84442138671875, 21.25567626953125, 35.85248947143555, 14.84042739868164, 3.844135284423828, 20.223106384277344, 34.62823486328125, -1.2544708251953125, 18.282655715942383, 3.1617813110351562, -4.0179290771484375, -1.5539169311523438, -0.7613945007324219, 4.432445526123047, 0.4845714569091797, -3.676593780517578, 20.024681091308594, 9.007026672363281, 11.862991333007812, 43.750152587890625, 31.376850128173828, 19.443479537963867, 37.0145263671875, 2.421426773071289, 12.906402587890625, 30.2145938873291, -3.524251937866211, 13.271881103515625, 22.96349334716797, 4.254730224609375, 18.410003662109375, 30.801910400390625, 30.883955001831055, 41.434879302978516, 6.622417449951172, 13.34499740600586, 2.6824989318847656, 26.818084716796875, -2.4586944580078125, 10.700157165527344, 22.43970489501953, 15.280441284179688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000446.npy"}
{"epoch": 0.674225245653817, "step": 447, "batch_size": 64, "mean": 15.309407234191895, "std": 16.141260147094727, "min": -40.06574249267578, "p10": -5.584954071044922, "median": 16.452211380004883, "p90": 36.8708610534668, "max": 46.868385314941406, "pos_frac": 0.828125, "sample": [37.26325225830078, 23.721405029296875, 18.628433227539062, 7.28509521484375, 4.063499450683594, -3.29254150390625, 31.708412170410156, -5.5207977294921875, -11.8291015625, 8.14727783203125, 16.805465698242188, 16.23900604248047, 9.008232116699219, 21.793243408203125, 18.338096618652344, 43.53467559814453, 9.708742141723633, 16.406536102294922, 10.446502685546875, 13.737655639648438, 6.524410247802734, -8.896965026855469, 9.621185302734375, -9.033348083496094, 4.2023162841796875, 31.4827880859375, 36.60188293457031, 22.691116333007812, 22.7169189453125, 7.5978546142578125, 25.509963989257812, 25.20945930480957, 17.60887908935547, 31.421295166015625, 22.606082916259766, 19.502403259277344, 38.19914627075195, 14.235931396484375, 43.84814453125, 1.1748466491699219, 22.276885986328125, 9.714466094970703, 28.792816162109375, -7.030632019042969, -0.11455726623535156, -9.5767822265625, 4.468620300292969, 45.34999084472656, 18.939773559570312, 17.30719757080078, -0.8751258850097656, 30.796491622924805, -5.612449645996094, 36.205169677734375, 16.497886657714844, 12.3939208984375, 15.503158569335938, -40.06574249267578, 12.502809524536133, 46.868385314941406, 36.98613739013672, 2.6253890991210938, 18.693161010742188, 18.137693405151367], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000447.npy"}
{"epoch": 0.6757369614512472, "step": 448, "batch_size": 64, "mean": 12.550762176513672, "std": 15.389115333557129, "min": -32.131317138671875, "p10": -4.054278564453124, "median": 13.097176551818848, "p90": 29.112733459472658, "max": 56.71908950805664, "pos_frac": 0.796875, "sample": [25.062942504882812, -19.0194149017334, 31.49786949157715, 5.1008453369140625, 19.614837646484375, 13.7513427734375, -0.9598121643066406, 13.902450561523438, 16.921844482421875, -1.7427902221679688, 26.190902709960938, 28.43682861328125, 5.7939605712890625, 3.3826370239257812, 13.291496276855469, 25.499473571777344, -3.5171661376953125, 35.29559326171875, 18.851722717285156, 5.580989837646484, 0.5208892822265625, 56.71908950805664, 10.614501953125, 13.093353271484375, 13.10099983215332, -0.9077568054199219, 25.966995239257812, 9.185455322265625, 16.25354766845703, 5.1662139892578125, -6.88581657409668, 4.316131591796875, 7.4741973876953125, 3.3335342407226562, 3.4723968505859375, 22.518817901611328, -4.2844696044921875, 18.68479347229004, 29.160675048828125, -6.466279983520508, 18.388975143432617, -32.131317138671875, 14.76727294921875, 4.9005584716796875, 16.708755493164062, 18.116012573242188, 19.701019287109375, 26.349273681640625, 19.680185317993164, 5.778886795043945, -4.542570114135742, 8.4346923828125, 9.0535888671875, 22.271957397460938, -0.4246845245361328, 39.26507568359375, -1.6508560180664062, -16.11060333251953, 39.663875579833984, 29.000869750976562, 22.84710693359375, 8.005104064941406, 3.0398635864257812, 48.16188049316406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000448.npy"}
{"epoch": 0.6772486772486772, "step": 449, "batch_size": 64, "mean": 7.638876914978027, "std": 13.2771635055542, "min": -16.236053466796875, "p10": -8.761068725585938, "median": 6.320662498474121, "p90": 22.209275054931645, "max": 50.52324295043945, "pos_frac": 0.71875, "sample": [0.25110816955566406, 13.687957763671875, -16.236053466796875, 18.038772583007812, -4.70623779296875, -4.913475036621094, 4.2034759521484375, 10.144966125488281, -1.9873924255371094, 26.405601501464844, 11.297142028808594, 5.64776611328125, 2.9520950317382812, -9.548681259155273, 1.392059326171875, 11.464889526367188, 50.52324295043945, 6.600515365600586, 7.07452392578125, 10.384456634521484, -1.7766990661621094, 11.413610458374023, 34.907859802246094, 39.327392578125, -13.78713607788086, 9.244640350341797, 12.656791687011719, 3.8977813720703125, 12.254035949707031, 2.6072349548339844, 10.581130981445312, -3.6845741271972656, 13.043983459472656, -15.063770294189453, 29.83203887939453, 11.190162658691406, 37.42198181152344, -1.1271018981933594, 6.040809631347656, 13.536918640136719, 1.581491470336914, -1.580078125, 2.9925270080566406, 16.116966247558594, -4.561614990234375, 20.183818817138672, -8.976226806640625, 22.575469970703125, 4.229984283447266, 9.057525634765625, 18.7047119140625, -2.1890411376953125, 3.3605422973632812, 6.667022705078125, -8.259033203125, 21.1612548828125, 17.847084045410156, -13.609123229980469, 13.568286895751953, 0.9943923950195312, 21.354820251464844, -9.237762451171875, 3.0336265563964844, -1.3223419189453125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000449.npy"}
{"epoch": 0.6787603930461074, "step": 450, "batch_size": 64, "mean": 8.44278335571289, "std": 14.440051078796387, "min": -20.136869430541992, "p10": -7.58793716430664, "median": 5.948906898498535, "p90": 30.290412139892585, "max": 46.11705780029297, "pos_frac": 0.71875, "sample": [-0.17601776123046875, -1.6395034790039062, -6.947826385498047, -3.3375167846679688, 33.045867919921875, 31.04816436767578, 9.261253356933594, 5.72784423828125, 20.990997314453125, 46.11705780029297, 18.29953384399414, 3.8170604705810547, 18.358951568603516, 1.9722824096679688, 1.6996269226074219, 35.823448181152344, -1.27166748046875, 4.6841278076171875, 13.486328125, 45.565032958984375, -0.7497634887695312, 14.431522369384766, 4.221282958984375, -5.143196105957031, 6.1644134521484375, -7.862270355224609, 18.539451599121094, 24.481979370117188, 9.479644775390625, -11.927236557006836, 1.9635467529296875, 25.36846923828125, 4.251134872436523, 11.69056510925293, -18.113187789916992, 28.522323608398438, 15.287956237792969, 0.1921710968017578, 18.027008056640625, 9.416027069091797, 7.387531280517578, 16.75121307373047, 6.670755386352539, -20.136869430541992, 8.636650085449219, 4.176019668579102, 12.091316223144531, 8.636421203613281, 10.859893798828125, -4.39141845703125, 2.6830692291259766, -10.988487243652344, 2.011941909790039, 5.733400344848633, 36.00367736816406, 10.060127258300781, -9.492691040039062, 15.093132019042969, -2.5474700927734375, 36.17656707763672, -6.943569183349609, 3.8609848022460938, -15.126852035522461, -1.6340866088867188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000450.npy"}
{"epoch": 0.6802721088435374, "step": 451, "batch_size": 64, "mean": 9.253421783447266, "std": 14.644659996032715, "min": -20.707725524902344, "p10": -7.6333992004394515, "median": 8.220773696899414, "p90": 26.771089172363286, "max": 49.375999450683594, "pos_frac": 0.734375, "sample": [27.28268051147461, 5.2447967529296875, 15.825035095214844, 6.973907470703125, 23.108367919921875, 7.1744384765625, -14.946266174316406, 12.63140869140625, 8.828088760375977, 18.478721618652344, -3.000457763671875, 8.390201568603516, 22.182708740234375, 8.04388427734375, 21.478878021240234, 7.12725830078125, 14.906761169433594, 14.99332046508789, 8.984468460083008, 19.94135856628418, 2.7847347259521484, -20.707725524902344, -4.244709014892578, 41.52851867675781, 20.83126449584961, -11.450408935546875, 15.563308715820312, -18.171707153320312, -10.461181640625, 1.2051734924316406, 12.628103256225586, 27.218795776367188, -2.6524429321289062, 1.9011669158935547, 14.776582717895508, 8.644783020019531, 37.77803039550781, 17.940078735351562, -0.10268974304199219, 9.737358093261719, 2.090587615966797, 25.7264404296875, 2.8568267822265625, -0.119415283203125, 15.754924774169922, -2.946319580078125, 7.544528961181641, 43.627967834472656, 36.64009094238281, 8.051345825195312, 3.889190673828125, 10.5093994140625, 9.681564331054688, -5.83935546875, -1.6039619445800781, -4.219139099121094, 22.14650535583496, 0.18575286865234375, -3.951526641845703, 18.569530487060547, 49.375999450683594, -16.29171371459961, -8.402275085449219, 0.5454940795898438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000451.npy"}
{"epoch": 0.6817838246409675, "step": 452, "batch_size": 64, "mean": 11.130670547485352, "std": 15.69097900390625, "min": -27.697555541992188, "p10": -6.568274307250976, "median": 10.511960983276367, "p90": 32.617712402343756, "max": 47.81727600097656, "pos_frac": 0.734375, "sample": [13.786087036132812, -6.753761291503906, 5.688352584838867, 26.607650756835938, 3.8416519165039062, 20.57670783996582, 13.048782348632812, 28.869125366210938, 25.719284057617188, 32.09223175048828, 9.946956634521484, 39.83171081542969, -23.04876708984375, 16.26886749267578, 32.842918395996094, 13.399200439453125, 40.36332702636719, 16.255172729492188, 19.403791427612305, 27.59906005859375, 3.1280670166015625, -2.1198196411132812, 25.014007568359375, 12.280139923095703, 1.5943412780761719, -2.9032840728759766, 13.73586654663086, 6.429290771484375, 1.9215011596679688, 0.6294174194335938, -2.3967247009277344, 47.81727600097656, -11.525218963623047, 25.486064910888672, -6.135471343994141, 23.71099090576172, 9.397674560546875, 11.07696533203125, -12.431282043457031, 5.876596450805664, 8.846782684326172, -0.479766845703125, 29.34217071533203, 9.4521484375, 16.642471313476562, -13.669172286987305, 38.90376281738281, 33.10608673095703, 17.81128692626953, -1.6894912719726562, 12.783843994140625, -5.7372894287109375, 8.974796295166016, 15.818164825439453, 2.222492218017578, -8.056503295898438, -1.467010498046875, 15.604560852050781, 23.168848037719727, -2.819854736328125, -1.2290496826171875, 33.34741973876953, -27.697555541992188, 2.2590789794921875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000452.npy"}
{"epoch": 0.6832955404383976, "step": 453, "batch_size": 64, "mean": 11.34100341796875, "std": 16.083253860473633, "min": -19.746383666992188, "p10": -8.750557708740233, "median": 7.490144729614258, "p90": 33.13493270874024, "max": 48.59230041503906, "pos_frac": 0.75, "sample": [13.351821899414062, 17.96686363220215, -6.027687072753906, 0.3046760559082031, 39.95826721191406, 5.099052429199219, 1.6755752563476562, 14.405078887939453, 22.8145751953125, -1.8026275634765625, -19.746383666992188, 5.85467529296875, 25.048675537109375, -4.116752624511719, 32.64024353027344, 29.538070678710938, 32.92143249511719, 3.9108810424804688, 30.51769256591797, 15.90243148803711, 0.6123332977294922, -2.4458770751953125, 10.8848876953125, 4.74835205078125, 0.23758697509765625, -8.948066711425781, -11.451011657714844, 31.003381729125977, -10.398784637451172, 18.026620864868164, 44.4583740234375, 3.816934585571289, 22.027950286865234, -14.726577758789062, -2.925373077392578, 6.177427291870117, 37.69207000732422, 17.498306274414062, -7.53057861328125, 48.59230041503906, 6.062767028808594, 35.46147155761719, -5.999454498291016, 1.7332267761230469, 1.3559646606445312, 16.384178161621094, -8.289703369140625, 6.366977691650391, 11.250411987304688, 16.146514892578125, 15.68313980102539, -1.7224845886230469, 5.872585296630859, 21.360769271850586, 14.173538208007812, -11.081890106201172, -8.981708526611328, 32.03825378417969, 23.25977325439453, 8.613311767578125, 36.825828552246094, 24.560205459594727, 33.22643280029297, 3.9573326110839844], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000453.npy"}
{"epoch": 0.6848072562358276, "step": 454, "batch_size": 64, "mean": 12.183794021606445, "std": 16.968210220336914, "min": -47.70435333251953, "p10": -6.791542816162108, "median": 10.987372398376465, "p90": 34.33298034667969, "max": 48.3717041015625, "pos_frac": 0.796875, "sample": [33.627532958984375, -14.176473617553711, -2.370635986328125, 29.590476989746094, -47.70435333251953, -7.100803375244141, 21.793182373046875, 2.438323974609375, 7.503261566162109, 7.7569732666015625, 14.021419525146484, 32.26226806640625, 14.008735656738281, -16.307907104492188, 26.548057556152344, 7.568981170654297, 36.357845306396484, 27.900053024291992, 16.221481323242188, 13.372848510742188, -4.0702056884765625, -14.401947021484375, -1.7045440673828125, 30.983489990234375, 35.73094940185547, 40.184600830078125, -9.973590850830078, 5.9209442138671875, 9.714683532714844, 1.7466888427734375, -1.4919052124023438, 13.950920104980469, 34.63531494140625, 2.0781784057617188, 10.955101013183594, 5.419609069824219, 48.3717041015625, -6.069934844970703, 46.16547393798828, 4.007192611694336, 7.754314422607422, 12.370635986328125, 0.4698638916015625, 1.7549676895141602, 5.95416259765625, 5.273189544677734, 32.85358810424805, 27.2685546875, 6.551836013793945, 16.75429916381836, 14.912979125976562, 21.539077758789062, 21.836292266845703, 34.910064697265625, 14.921470642089844, 4.1384429931640625, 11.019643783569336, 26.939315795898438, 12.553192138671875, 29.948623657226562, 19.234535217285156, 3.743762969970703, -0.10840988159179688, -8.295589447021484], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000454.npy"}
{"epoch": 0.6863189720332578, "step": 455, "batch_size": 64, "mean": 12.291560173034668, "std": 15.147902488708496, "min": -19.44845962524414, "p10": -3.3626159667968745, "median": 9.848495483398438, "p90": 32.65455131530762, "max": 57.122589111328125, "pos_frac": 0.75, "sample": [18.449798583984375, 9.484970092773438, 17.869186401367188, 25.64568328857422, 15.767265319824219, -2.957122802734375, 39.71147155761719, 41.75484085083008, -6.233856201171875, 5.158714294433594, 8.324272155761719, -10.383872985839844, 12.691238403320312, 9.352664947509766, 16.791141510009766, 57.122589111328125, 1.7417984008789062, -3.0279541015625, 8.256393432617188, -1.0537948608398438, 33.89805603027344, 1.6995658874511719, 30.923072814941406, -0.6469764709472656, -3.892274856567383, 27.23468017578125, 7.343299865722656, 17.98090362548828, 21.354080200195312, 0.26818084716796875, 12.43642807006836, 4.309392929077148, 26.788429260253906, 6.075325012207031, 16.493608474731445, 2.4422149658203125, -19.44845962524414, 15.738845825195312, 23.19554901123047, 31.939071655273438, -5.021076202392578, -1.7825088500976562, 12.19488525390625, -7.616729736328125, 32.961185455322266, 25.429367065429688, 10.212020874023438, 2.8668746948242188, 13.030204772949219, -2.23187255859375, 52.70456314086914, 18.80553436279297, -1.5917091369628906, 15.678180694580078, -3.50604248046875, 6.2043304443359375, 21.243640899658203, -1.9252700805664062, -2.7861328125, 7.1566619873046875, 5.851633071899414, 14.65433120727539, 14.89963150024414, 38.629730224609375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000455.npy"}
{"epoch": 0.6878306878306878, "step": 456, "batch_size": 64, "mean": 10.736598014831543, "std": 14.202646255493164, "min": -22.752334594726562, "p10": -5.414611816406249, "median": 9.41241455078125, "p90": 29.540682220458987, "max": 45.819305419921875, "pos_frac": 0.765625, "sample": [24.432388305664062, -8.454872131347656, -0.6534538269042969, 19.28844451904297, 45.819305419921875, -3.015949249267578, -0.5178165435791016, 13.069873809814453, -1.6170921325683594, 34.98090744018555, 15.230461120605469, 8.982982635498047, 29.022918701171875, 38.980079650878906, 16.982391357421875, 8.806488037109375, 1.2605972290039062, 15.097091674804688, 7.517795562744141, 16.2049560546875, -4.9425048828125, 4.559501647949219, 9.357208251953125, -1.3021240234375, 20.322776794433594, 1.4970474243164062, -13.295379638671875, -10.576492309570312, 12.13038444519043, -5.616943359375, 7.283458709716797, 9.467620849609375, 8.852861404418945, 14.587623596191406, 15.374343872070312, 18.309551239013672, 5.60992431640625, 5.853523254394531, 16.392173767089844, 2.395160675048828, -11.198034286499023, 2.0604515075683594, 2.0273075103759766, 11.166046142578125, -9.752286911010742, -22.752334594726562, 30.116043090820312, 13.68359375, -4.486137390136719, -3.7470130920410156, 24.69219207763672, 45.50448226928711, 2.2202529907226562, 28.374649047851562, 13.063697814941406, 24.9583740234375, 25.102493286132812, 5.86212158203125, 14.636283874511719, 36.168724060058594, 15.818288803100586, 3.0223541259765625, 13.160934448242188, 29.76258087158203], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000456.npy"}
{"epoch": 0.6893424036281179, "step": 457, "batch_size": 64, "mean": 9.899250984191895, "std": 12.993903160095215, "min": -15.918243408203125, "p10": -5.319770622253417, "median": 7.377819061279297, "p90": 27.801694488525392, "max": 39.0213623046875, "pos_frac": 0.765625, "sample": [8.456779479980469, -9.152381896972656, 1.8804054260253906, 13.766681671142578, 14.714811325073242, -7.76129150390625, 11.06406021118164, 5.9159698486328125, -0.3677330017089844, 1.2065582275390625, 6.385808944702148, 33.45196533203125, 14.234058380126953, 27.195640563964844, -8.065652847290039, -2.778533935546875, 24.08090591430664, 10.619415283203125, 0.8929119110107422, 18.0213623046875, 38.13711166381836, 13.456899642944336, 39.0213623046875, -1.459503173828125, -2.6660633087158203, 13.56236457824707, -4.62834358215332, 23.805747985839844, -15.25311279296875, 6.453052520751953, -5.616096496582031, -15.918243408203125, 4.502784729003906, 19.635093688964844, 7.528804779052734, 0.5745391845703125, -6.9726409912109375, 18.822219848632812, 0.9360198974609375, 7.226833343505859, 23.77001953125, -0.08551788330078125, 26.72251319885254, 33.34609603881836, -2.2928237915039062, 5.1238250732421875, 22.783309936523438, -1.6013946533203125, 14.057846069335938, 29.629043579101562, 23.564224243164062, 5.315898895263672, 28.061431884765625, 12.162752151489258, 12.756311416625977, 31.608179092407227, 26.75818634033203, 4.0827789306640625, 10.437047958374023, 1.6202812194824219, 9.286056518554688, 2.473236083984375, 6.076559066772461, 2.985624313354492], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000457.npy"}
{"epoch": 0.690854119425548, "step": 458, "batch_size": 64, "mean": 11.566278457641602, "std": 15.147212982177734, "min": -29.53106689453125, "p10": -6.4123971939086895, "median": 8.791075706481934, "p90": 30.873516082763675, "max": 45.878662109375, "pos_frac": 0.765625, "sample": [6.986690521240234, 18.187515258789062, 2.9845924377441406, 21.927719116210938, 14.672895431518555, 28.274150848388672, 7.347572326660156, -29.53106689453125, -1.5925960540771484, 1.4203834533691406, -3.34649658203125, -13.314224243164062, -9.729171752929688, 25.829605102539062, 28.2686767578125, 4.299201965332031, 14.648185729980469, 24.378061294555664, 11.339164733886719, 15.075881958007812, 8.501157760620117, 16.952011108398438, 16.66492462158203, -3.7130508422851562, 28.150588989257812, 4.421478271484375, 21.521743774414062, 15.277883529663086, 7.584083557128906, 22.615341186523438, 26.981666564941406, 3.4019222259521484, 4.453868865966797, -9.006706237792969, 8.174163818359375, -1.5076828002929688, 5.835304260253906, 8.682022094726562, 17.418975830078125, 41.677974700927734, 16.193605422973633, 6.286571502685547, -19.904991149902344, -0.1273212432861328, 29.9007568359375, 5.842750549316406, 8.900129318237305, -2.2066726684570312, 45.878662109375, 34.452423095703125, -8.695341110229492, 35.75761413574219, 22.88982391357422, 17.119714736938477, -7.395711898803711, 41.37384033203125, 3.4365463256835938, 31.29041290283203, -4.1179962158203125, -0.724517822265625, 7.040912628173828, 13.104789733886719, 35.93443298339844, 15.796966552734375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000458.npy"}
{"epoch": 0.6923658352229781, "step": 459, "batch_size": 64, "mean": 13.465246200561523, "std": 15.597167015075684, "min": -13.524137496948242, "p10": -5.661792564392089, "median": 12.933195114135742, "p90": 38.446834564208984, "max": 51.1695556640625, "pos_frac": 0.78125, "sample": [13.648017883300781, 23.1185245513916, -12.174903869628906, 12.897712707519531, -6.094394683837891, -7.78057861328125, 9.465364456176758, 30.272491455078125, 9.336639404296875, 15.502479553222656, 31.12969970703125, -4.289424896240234, 20.060588836669922, -3.8414154052734375, 42.29777145385742, 48.229705810546875, 10.229049682617188, 17.297149658203125, 8.828842163085938, 14.473966598510742, 18.91915512084961, 20.52752685546875, 5.369510650634766, 2.9595184326171875, 3.9462032318115234, 21.171485900878906, 4.790046691894531, 15.091392517089844, 14.083724975585938, 28.420289993286133, 41.133056640625, 1.6031494140625, -4.652387619018555, 6.127077102661133, 4.753623962402344, 11.508075714111328, 14.724700927734375, -10.3109130859375, 26.429275512695312, -6.544303894042969, 1.909475326538086, 23.6937255859375, 21.626712799072266, 5.603271484375, 38.63043212890625, -3.6814117431640625, 20.73888397216797, -1.8269500732421875, -12.976089477539062, 38.01844024658203, 1.2718467712402344, 9.238666534423828, -13.524137496948242, 22.62896728515625, 12.968677520751953, -1.0787353515625, 20.524124145507812, 12.764480590820312, 41.79100036621094, 25.381057739257812, 39.93968200683594, -0.7810173034667969, 51.1695556640625, 15.087623596191406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000459.npy"}
{"epoch": 0.6938775510204082, "step": 460, "batch_size": 64, "mean": 14.945837020874023, "std": 16.099214553833008, "min": -17.206832885742188, "p10": -2.307971954345703, "median": 10.007773399353027, "p90": 39.52042083740235, "max": 46.65043640136719, "pos_frac": 0.8125, "sample": [31.208724975585938, 23.496524810791016, 45.45330810546875, 3.4349517822265625, 7.009059906005859, 46.65043640136719, 45.40159606933594, 6.04620361328125, 37.94960021972656, -0.20629119873046875, -0.7168159484863281, 6.569908142089844, 6.576347351074219, 7.578399658203125, 2.7304611206054688, 1.4836807250976562, 10.42182731628418, 23.59162139892578, -2.262176513671875, 43.4541130065918, 5.42802619934082, 43.375732421875, 17.3270263671875, 25.730316162109375, 36.21269607543945, 21.640419006347656, 14.013118743896484, 7.692039489746094, 25.815391540527344, 8.597732543945312, 13.83892822265625, 39.165069580078125, 26.326641082763672, 8.369361877441406, 3.114351272583008, 9.593719482421875, 11.136327743530273, 3.281768798828125, 6.11224365234375, 6.722890853881836, 33.06978988647461, 21.819564819335938, -2.3275985717773438, -0.1785755157470703, 20.45534896850586, 39.67271423339844, -8.809562683105469, -8.708740234375, 13.526458740234375, -17.206832885742188, -3.403963088989258, 11.412109375, 30.37799072265625, 1.8757457733154297, -1.9557952880859375, 25.737014770507812, 2.6002197265625, 38.756439208984375, -4.168632507324219, -5.0033721923828125, 26.8533935546875, 20.36431884765625, 0.8706283569335938, 41.5395622253418], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000460.npy"}
{"epoch": 0.6953892668178382, "step": 461, "batch_size": 64, "mean": 12.89040756225586, "std": 15.378559112548828, "min": -19.173583984375, "p10": -4.115039443969726, "median": 11.350016593933105, "p90": 33.18500213623047, "max": 49.895416259765625, "pos_frac": 0.78125, "sample": [23.80516815185547, 10.696231842041016, 4.041404724121094, 10.672842025756836, 37.38007354736328, 33.47544860839844, 15.743431091308594, 21.94603729248047, 24.61907196044922, -3.6610984802246094, -19.173583984375, 12.682533264160156, 16.77752685546875, 4.6468353271484375, 20.739242553710938, 3.195068359375, -4.3095855712890625, 32.507293701171875, 3.5842514038085938, 7.964216232299805, 12.003801345825195, 28.356388092041016, 23.08885955810547, 17.687728881835938, 7.90576171875, 19.384340286254883, -10.967369079589844, 0.1299419403076172, -5.073707580566406, 9.002729415893555, 16.544483184814453, -2.838542938232422, -15.564157485961914, 10.145835876464844, 14.65390396118164, 26.398757934570312, 39.44805908203125, 12.201202392578125, 4.197959899902344, 49.895416259765625, 18.254745483398438, 34.51111602783203, 23.65656280517578, 8.391525268554688, 3.8399429321289062, -2.6796340942382812, 6.282550811767578, -3.644723892211914, 23.78889274597168, 14.141090393066406, 21.347335815429688, 20.455787658691406, 3.477630615234375, 1.4831924438476562, 29.85491180419922, -2.69189453125, 48.019927978515625, -7.973140716552734, -2.3217945098876953, -7.569925308227539, 30.51891326904297, 4.7673797607421875, 47.78437805175781, -2.6424522399902344], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000461.npy"}
{"epoch": 0.6969009826152683, "step": 462, "batch_size": 64, "mean": 14.608634948730469, "std": 15.192632675170898, "min": -20.919570922851562, "p10": -3.995447349548338, "median": 15.605876922607422, "p90": 34.104538726806645, "max": 62.02330017089844, "pos_frac": 0.859375, "sample": [34.462310791015625, 39.6490364074707, 37.411468505859375, 20.360107421875, -0.3953704833984375, 10.492095947265625, 18.325820922851562, 2.232128143310547, 1.8695755004882812, 1.1557846069335938, -4.874366760253906, 42.5304069519043, 29.659011840820312, 10.565483093261719, 21.975784301757812, 35.59703826904297, 0.6862716674804688, 30.732040405273438, 7.595218658447266, 21.80199432373047, 16.047584533691406, 10.175209045410156, 17.56646728515625, 14.905220031738281, 16.83673095703125, 3.9846038818359375, 17.363018035888672, 28.89301300048828, 5.199367523193359, 20.15481948852539, 6.1270751953125, -5.7209320068359375, 16.863754272460938, 22.00335693359375, 33.269737243652344, -11.520645141601562, -5.340009689331055, 21.978439331054688, 21.25868034362793, 7.665058135986328, 29.2515869140625, 25.28809356689453, 16.999725341796875, 3.8027877807617188, 15.164169311523438, 25.72665786743164, 7.4568023681640625, 14.737300872802734, 25.475940704345703, 18.89898681640625, 9.2393798828125, -20.919570922851562, 22.55675506591797, 35.75189208984375, 30.0946044921875, 1.6182861328125, 3.5285892486572266, 62.02330017089844, 10.316352844238281, 0.9440212249755859, -17.24258041381836, -5.3215179443359375, -1.9446353912353516, 1.9633293151855469], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000462.npy"}
{"epoch": 0.6984126984126984, "step": 463, "batch_size": 64, "mean": 10.091035842895508, "std": 12.849349975585938, "min": -19.448883056640625, "p10": -4.318197631835937, "median": 8.204368591308594, "p90": 26.561238861083986, "max": 44.687042236328125, "pos_frac": 0.78125, "sample": [0.3203773498535156, 22.6644287109375, 23.71800994873047, -19.448883056640625, 29.544414520263672, 14.483293533325195, -2.847532272338867, 34.6515998840332, 12.806739807128906, 10.232627868652344, -7.749790191650391, 1.7767181396484375, 4.9625091552734375, 25.055755615234375, 26.161224365234375, 0.617889404296875, 2.9218978881835938, 3.0956878662109375, -0.3554973602294922, -1.9466476440429688, 1.734903335571289, -0.11736297607421875, 1.98931884765625, 35.733741760253906, 14.652156829833984, 19.270782470703125, 6.17559814453125, 11.178291320800781, -8.798942565917969, -3.711833953857422, 6.272819519042969, 5.372978210449219, 23.1883544921875, 27.82549476623535, 4.873348236083984, 22.27843475341797, 17.94969940185547, 5.1953887939453125, 0.9534149169921875, 29.885757446289062, 11.62887191772461, 5.799125671386719, 2.829345703125, -0.2197113037109375, -4.578067779541016, -6.30743408203125, 44.687042236328125, 10.783782958984375, 26.73267364501953, 21.600112915039062, 15.103893280029297, -15.77392578125, 7.0664215087890625, -3.6277904510498047, 9.342315673828125, 21.6182861328125, 17.118629455566406, 18.865745544433594, 17.740211486816406, 13.815948486328125, -5.496791839599609, 13.724693298339844, 1.3881034851074219, 19.417621612548828], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000463.npy"}
{"epoch": 0.6999244142101285, "step": 464, "batch_size": 64, "mean": 11.514649391174316, "std": 15.790977478027344, "min": -27.8702392578125, "p10": -7.1408954620361325, "median": 8.814496994018555, "p90": 32.29069061279297, "max": 43.832794189453125, "pos_frac": 0.765625, "sample": [-4.967613220214844, 5.984138488769531, 24.515274047851562, -13.868911743164062, 27.624658584594727, 1.2562255859375, -3.6374740600585938, 30.411609649658203, 19.357402801513672, 10.7529296875, 24.370285034179688, -6.538673400878906, 30.03687286376953, 15.559133529663086, 7.4783477783203125, -8.583038330078125, 29.440841674804688, 8.865650177001953, 8.763343811035156, 19.947437286376953, 7.44232177734375, 32.17265319824219, -7.458234786987305, 14.591686248779297, 1.9382667541503906, 26.878570556640625, 7.872322082519531, 36.44883346557617, 23.833463668823242, 16.185150146484375, 8.543365478515625, 1.5944442749023438, 36.48030471801758, 29.786453247070312, -19.158523559570312, 4.9253387451171875, 30.91411018371582, 15.738334655761719, -14.64361572265625, -7.159053802490234, 38.19624328613281, 9.166996002197266, 6.320121765136719, 27.447959899902344, -1.1186904907226562, 0.17363739013671875, 2.7893600463867188, 8.454710960388184, -3.8469314575195312, 36.02824401855469, 13.744758605957031, 32.341278076171875, -27.8702392578125, -1.1693611145019531, 2.7243423461914062, -7.0985260009765625, 7.0850982666015625, 33.38097381591797, 13.035173416137695, 43.832794189453125, -4.4365997314453125, 14.817703247070312, 17.572410583496094, 1.67144775390625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000464.npy"}
{"epoch": 0.7014361300075586, "step": 465, "batch_size": 64, "mean": 10.417366981506348, "std": 15.76006031036377, "min": -18.549488067626953, "p10": -8.946302795410155, "median": 7.8054914474487305, "p90": 33.83498268127441, "max": 41.38336181640625, "pos_frac": 0.703125, "sample": [-14.49981689453125, 24.899871826171875, 14.891883850097656, -6.262542724609375, 22.886154174804688, 14.454429626464844, -1.633758544921875, 39.52764129638672, -7.1470184326171875, 15.807746887207031, 3.6025047302246094, 28.632606506347656, 25.81043815612793, 36.41184997558594, -10.758867263793945, 35.18675231933594, -12.831016540527344, 33.66966247558594, 22.6279296875, 7.721351623535156, 1.0689849853515625, -14.349983215332031, 22.66362190246582, 37.71330261230469, 23.388381958007812, 32.77937316894531, -4.187110900878906, -9.254386901855469, -4.415641784667969, 4.329717636108398, 4.7905120849609375, -8.227439880371094, 13.09613037109375, 12.331565856933594, 17.600372314453125, 2.9132766723632812, -7.368547439575195, -3.2970199584960938, -4.372682571411133, 13.776494979858398, -16.030818939208984, 6.7639007568359375, 6.454036712646484, 9.993843078613281, -3.9794845581054688, -18.549488067626953, -0.045421600341796875, 6.833528518676758, 11.983451843261719, 40.65360641479492, 1.7758941650390625, 27.522918701171875, 6.373313903808594, 33.90583419799805, 5.507667541503906, -1.4696483612060547, 10.344001770019531, 6.686891555786133, 16.98058319091797, 13.572380065917969, 41.38336181640625, 7.889631271362305, 23.989784240722656, 24.194992065429688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000465.npy"}
{"epoch": 0.7029478458049887, "step": 466, "batch_size": 64, "mean": 6.940884113311768, "std": 13.174407958984375, "min": -23.506500244140625, "p10": -7.944874000549315, "median": 6.632080078125, "p90": 23.57448234558106, "max": 40.204803466796875, "pos_frac": 0.734375, "sample": [-2.77099609375, 22.195858001708984, -21.543701171875, -14.719451904296875, 9.799367904663086, -1.2344627380371094, 1.1165771484375, 6.596099853515625, 8.625244140625, -23.506500244140625, 12.077491760253906, 6.500923156738281, 26.010948181152344, -10.51613998413086, 9.47021484375, -5.708526611328125, 15.238723754882812, 40.204803466796875, 12.162521362304688, 31.42236328125, 20.074710845947266, 10.880559921264648, 25.994773864746094, -0.7008743286132812, 6.668060302734375, 5.947669982910156, 20.005935668945312, -6.267993927001953, 34.22419738769531, 5.6649017333984375, 0.4404754638671875, 2.9713287353515625, -4.8133087158203125, 13.015609741210938, 12.642749786376953, 10.15296745300293, 19.14002227783203, 6.111839294433594, -8.476139068603516, 9.400146484375, 4.214912414550781, -0.028472900390625, -6.705255508422852, -18.074546813964844, 16.13422393798828, 5.024322509765625, -6.3410491943359375, 9.328529357910156, 39.257408142089844, 2.337879180908203, -17.497657775878906, 10.120498657226562, -2.8712234497070312, 17.317337036132812, 4.8300933837890625, 2.2477588653564453, 6.949728012084961, 3.9016895294189453, 7.138845443725586, 13.40224838256836, 24.165321350097656, 9.769790649414062, 11.381172180175781, 3.7140274047851562], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000466.npy"}
{"epoch": 0.7044595616024187, "step": 467, "batch_size": 64, "mean": 11.48172664642334, "std": 13.728860855102539, "min": -16.040597915649414, "p10": -4.6447505950927725, "median": 8.842238426208496, "p90": 29.205829048156744, "max": 56.290740966796875, "pos_frac": 0.8125, "sample": [3.19915771484375, 43.28425979614258, 15.082283020019531, 17.72735595703125, 0.6544876098632812, 7.0008544921875, -4.181842803955078, -5.311248779296875, 10.789119720458984, 18.637165069580078, 19.110633850097656, -5.504787445068359, 27.20159149169922, -2.098905563354492, 15.988876342773438, 33.352447509765625, 7.317821502685547, 6.419288635253906, 7.9432373046875, 4.189502716064453, 18.749298095703125, -10.803291320800781, -0.38555908203125, 33.908836364746094, 26.42646026611328, 3.136444091796875, 19.90070343017578, 20.913951873779297, -16.040597915649414, 20.12542724609375, -3.561065673828125, 20.261398315429688, 4.098020553588867, 12.769546508789062, -4.8431396484375, 8.294044494628906, 14.159523010253906, 29.63165283203125, 3.8167877197265625, 16.814987182617188, 0.17292022705078125, 0.34598350524902344, -6.066089630126953, 9.390432357788086, 3.8305797576904297, -1.597686767578125, 5.76080322265625, 0.18556976318359375, 17.035106658935547, 28.21224021911621, 56.290740966796875, 8.186859130859375, 14.587745666503906, 1.2018699645996094, 12.871513366699219, 12.183551788330078, -9.34759521484375, 36.443603515625, 6.926692962646484, 35.52241516113281, 7.9437713623046875, 16.254409790039062, 25.04507064819336, 15.275245666503906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000467.npy"}
{"epoch": 0.7059712773998488, "step": 468, "batch_size": 64, "mean": 16.208662033081055, "std": 16.297958374023438, "min": -11.336835861206055, "p10": -4.4358980178833, "median": 12.522747039794922, "p90": 35.49253692626953, "max": 63.55842208862305, "pos_frac": 0.828125, "sample": [11.479646682739258, 3.5909595489501953, 11.236774444580078, 35.56297302246094, 11.664836883544922, -4.119344711303711, 1.6130504608154297, 31.717788696289062, 18.737213134765625, 10.591590881347656, 25.25223159790039, 33.60980987548828, 25.925811767578125, -11.336835861206055, 12.113662719726562, 28.58466339111328, 6.809074401855469, -8.0931396484375, -7.566162109375, 16.23241424560547, 45.908905029296875, 9.7188720703125, 32.460792541503906, 39.2442626953125, 8.449073791503906, 16.096242904663086, 41.426673889160156, 7.111055374145508, -3.0091896057128906, -10.335166931152344, 36.84767532348633, -4.571563720703125, 0.539794921875, -5.686504364013672, 6.447029113769531, -6.925981521606445, 16.292823791503906, 15.108192443847656, -2.0103759765625, 28.774532318115234, 11.961566925048828, 11.0614013671875, 24.03357696533203, 35.32818603515625, 30.98611831665039, 6.597511291503906, 11.923093795776367, 23.68701934814453, 12.931831359863281, 13.36256217956543, 32.94584655761719, 30.414657592773438, 22.318462371826172, 11.81170654296875, 58.532745361328125, 4.156768798828125, 2.781646728515625, 63.55842208862305, 21.68390655517578, 21.303241729736328, 31.929096221923828, 24.426307678222656, 5.393058776855469, -1.2685317993164062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000468.npy"}
{"epoch": 0.7074829931972789, "step": 469, "batch_size": 64, "mean": 12.465350151062012, "std": 16.057851791381836, "min": -16.684675216674805, "p10": -2.375085639953613, "median": 9.214681625366211, "p90": 32.42687911987305, "max": 59.85797119140625, "pos_frac": 0.765625, "sample": [6.7688751220703125, 21.086105346679688, 26.561813354492188, 10.713035583496094, 5.235649108886719, 5.313365936279297, 4.34814453125, 10.608102798461914, 19.247303009033203, -12.665962219238281, 26.124488830566406, 18.049488067626953, 14.34759521484375, -6.391395568847656, -0.11296844482421875, 17.59160804748535, 13.824142456054688, 7.821260452270508, 18.02671241760254, 34.822265625, 32.25647735595703, 3.8097991943359375, 11.666053771972656, 2.3734817504882812, 22.53411865234375, -10.969047546386719, 5.988719940185547, 3.5819435119628906, -2.3602962493896484, 4.00372314453125, -4.172271728515625, -0.18221473693847656, 29.42816162109375, 56.456111907958984, 30.616044998168945, -2.3814239501953125, -0.30681610107421875, 0.5889205932617188, 31.79291534423828, 40.36460876464844, 32.499908447265625, 13.436325073242188, 4.1334686279296875, 15.562263488769531, 2.0234375, -1.8509445190429688, 6.75872802734375, -2.3380966186523438, 59.85797119140625, 14.56658935546875, -0.34458160400390625, 0.45497894287109375, 53.74589920043945, 1.13165283203125, 40.05181884765625, -1.2761611938476562, 18.881134033203125, 13.104618072509766, 17.906208038330078, -16.684675216674805, 11.273490905761719, 0.6930446624755859, -2.4212493896484375, 20.20794677734375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000469.npy"}
{"epoch": 0.708994708994709, "step": 470, "batch_size": 64, "mean": 11.667706489562988, "std": 14.86290454864502, "min": -39.332088470458984, "p10": -5.657162284851074, "median": 11.838233947753906, "p90": 29.338331604003912, "max": 46.50531768798828, "pos_frac": 0.75, "sample": [19.03636932373047, 23.383201599121094, 44.11484909057617, -39.332088470458984, 10.395111083984375, 30.127532958984375, 23.775901794433594, 4.183837890625, 31.03521728515625, 24.052688598632812, 18.385875701904297, -8.988037109375, 26.57568359375, 12.447662353515625, 23.578231811523438, 2.5442276000976562, 27.191062927246094, -2.298421859741211, 7.557365417480469, 23.5982723236084, -4.742404937744141, 27.597015380859375, 9.112550735473633, -6.448150634765625, 30.084609985351562, 11.777572631835938, 46.50531768798828, 19.731338500976562, 3.421642303466797, -7.237016677856445, 9.648185729980469, -0.9122905731201172, 25.995193481445312, 2.6083450317382812, -5.399801254272461, -1.4957351684570312, 4.702220916748047, 2.976163864135742, 10.679115295410156, -9.248104095458984, 18.113418579101562, 20.232620239257812, 25.788345336914062, 1.938751220703125, -5.133354187011719, -5.3971405029296875, 12.527502059936523, 24.01457977294922, 15.66064453125, 32.275184631347656, 12.432552337646484, 25.55664825439453, -5.767459869384766, 12.481243133544922, 5.78350830078125, 11.898895263671875, 6.640651702880859, 21.96661376953125, -7.251861572265625, 8.041543960571289, 31.057044982910156, -1.4369754791259766, 15.32479476928711, -0.7348747253417969], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000470.npy"}
{"epoch": 0.7105064247921391, "step": 471, "batch_size": 64, "mean": 13.961222648620605, "std": 16.12949562072754, "min": -37.479949951171875, "p10": -2.6061664581298825, "median": 11.976364135742188, "p90": 36.3091609954834, "max": 41.95337677001953, "pos_frac": 0.796875, "sample": [2.787628173828125, 34.791481018066406, 29.6817569732666, 20.26854133605957, 10.530189514160156, -2.7825164794921875, 25.55120849609375, 18.835033416748047, 33.739959716796875, 28.36681365966797, 25.036338806152344, 7.653541564941406, 37.64435577392578, 1.9095382690429688, -11.509876251220703, -1.815938949584961, 6.800874710083008, 41.95337677001953, 35.61539077758789, -8.912467956542969, 6.216901779174805, 18.998531341552734, 23.798141479492188, 18.205810546875, 21.010955810546875, -37.479949951171875, 23.76050567626953, 25.587326049804688, 38.2811393737793, 3.7950592041015625, 21.864803314208984, 10.989921569824219, 5.245708465576172, 3.4947242736816406, 9.991378784179688, 12.382553100585938, 36.60649108886719, 3.317474365234375, 3.0219650268554688, 32.973846435546875, 9.738327026367188, -6.4910888671875, 36.931983947753906, 30.190582275390625, -1.4972553253173828, 4.1750335693359375, 12.525588989257812, 24.112091064453125, -12.824028015136719, 19.439617156982422, -8.322410583496094, 26.628585815429688, 1.9101600646972656, 3.4689102172851562, 13.270523071289062, -1.7421741485595703, 11.570175170898438, 25.799964904785156, -1.9842681884765625, 41.218231201171875, 8.128028869628906, -0.07408905029296875, -2.194683074951172, 41.33192443847656], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000471.npy"}
{"epoch": 0.7120181405895691, "step": 472, "batch_size": 64, "mean": 12.367351531982422, "std": 16.697553634643555, "min": -20.76934051513672, "p10": -8.763375663757323, "median": 8.652376174926758, "p90": 37.63821182250977, "max": 51.63719940185547, "pos_frac": 0.796875, "sample": [8.051612854003906, 10.436820983886719, 9.117698669433594, 7.648773193359375, 28.5018310546875, 32.05259323120117, -1.7783050537109375, 26.858352661132812, 27.18597412109375, 37.98213195800781, 4.884769439697266, 48.58317565917969, 9.197649002075195, -9.18402099609375, 3.1876468658447266, 40.055572509765625, 8.276147842407227, 15.836418151855469, 51.63719940185547, 7.1191253662109375, -9.356925964355469, 8.218700408935547, -7.540531158447266, 30.693923950195312, -1.6774444580078125, 13.837478637695312, 6.5357666015625, 36.835731506347656, 38.691986083984375, -7.781869888305664, 3.8333206176757812, 16.441238403320312, 4.765743255615234, -11.558395385742188, -6.169788360595703, 14.59844970703125, 25.390304565429688, 38.798492431640625, 4.23985481262207, 7.306976318359375, 24.37074089050293, 40.441131591796875, -20.76934051513672, 32.175941467285156, -19.948684692382812, -10.806587219238281, 25.667682647705078, 1.1320266723632812, -14.9073486328125, 8.178825378417969, 9.421318054199219, 9.028604507446289, 12.344841003417969, -1.4225006103515625, 8.053466796875, 1.7060623168945312, 15.887115478515625, 9.220954895019531, 31.780179977416992, 1.6660842895507812, 16.41494369506836, 7.8741912841796875, 2.094085693359375, 30.152568817138672], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000472.npy"}
{"epoch": 0.7135298563869993, "step": 473, "batch_size": 64, "mean": 12.419994354248047, "std": 18.544708251953125, "min": -20.358352661132812, "p10": -10.919354820251465, "median": 8.221078872680664, "p90": 37.619161987304686, "max": 53.3365478515625, "pos_frac": 0.765625, "sample": [36.509857177734375, 37.13037109375, 1.878225326538086, -12.052240371704102, -19.580888748168945, 33.00892639160156, 5.213386535644531, 17.91387176513672, 4.009103775024414, 42.382469177246094, 38.24576187133789, 1.47662353515625, -2.700223922729492, 9.750656127929688, 6.466007232666016, -11.094322204589844, 50.9716796875, 36.187828063964844, 14.522140502929688, 27.383941650390625, -20.358352661132812, 2.8308944702148438, 27.630096435546875, -11.075286865234375, -10.382644653320312, 22.050125122070312, 10.306694030761719, 22.432861328125, 4.749788284301758, 15.485824584960938, 3.6017303466796875, -1.1686458587646484, -17.645034790039062, -9.053245544433594, -10.555513381958008, 25.248172760009766, 53.3365478515625, 38.878211975097656, 6.3605804443359375, 1.0415611267089844, 33.43719482421875, 14.822628021240234, 37.828643798828125, 49.77648162841797, 6.774532318115234, 9.780521392822266, 6.48921012878418, 3.872467041015625, 5.993011474609375, 29.158899307250977, 16.088489532470703, 2.448810577392578, 22.614816665649414, 9.667625427246094, -2.7370262145996094, 13.404729843139648, 32.991783142089844, -8.012908935546875, 0.9646759033203125, -0.3267059326171875, 1.3206634521484375, 36.882049560546875, 16.81182861328125, -16.510318756103516], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000473.npy"}
{"epoch": 0.7150415721844293, "step": 474, "batch_size": 64, "mean": 9.747077941894531, "std": 16.323074340820312, "min": -26.802383422851562, "p10": -8.810668182373046, "median": 8.177644729614258, "p90": 32.75975379943848, "max": 46.31895446777344, "pos_frac": 0.671875, "sample": [33.40808868408203, 21.937793731689453, 12.706233978271484, -13.629083633422852, -9.740863800048828, -0.3768043518066406, 13.809112548828125, 40.36498260498047, 19.796932220458984, -1.067352294921875, 5.950096130371094, 20.898765563964844, -17.572284698486328, 17.282485961914062, 10.428359985351562, 32.51083755493164, -2.7538681030273438, -0.6632194519042969, -2.6133880615234375, 12.837844848632812, -9.134220123291016, -1.297393798828125, 27.463733673095703, 15.561641693115234, 19.84539794921875, 2.7675724029541016, -1.9538803100585938, -26.802383422851562, 1.1985549926757812, -4.05487060546875, 38.025367736816406, 9.14035415649414, -18.391597747802734, 5.788299560546875, -7.901817321777344, 28.837142944335938, 2.3833045959472656, 23.510971069335938, 20.12273406982422, -2.4302215576171875, 5.800655364990234, 7.291023254394531, 26.964683532714844, 19.249778747558594, -4.550975799560547, 46.31895446777344, 28.788604736328125, 5.3538818359375, 28.606918334960938, -3.719606399536133, 35.957374572753906, 32.866432189941406, 1.32647705078125, 9.215465545654297, 8.076179504394531, -8.055713653564453, 35.13446044921875, -5.4829864501953125, 8.73843765258789, 14.967025756835938, 26.891590118408203, -21.873611450195312, 1.4754867553710938, 8.279109954833984], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000474.npy"}
{"epoch": 0.7165532879818595, "step": 475, "batch_size": 64, "mean": 16.779415130615234, "std": 16.023086547851562, "min": -10.271345138549805, "p10": -0.6674499511718741, "median": 13.822755813598633, "p90": 41.27140731811524, "max": 54.30387878417969, "pos_frac": 0.890625, "sample": [0.8174037933349609, 8.81625747680664, 13.823951721191406, 12.22036361694336, 23.05354118347168, 8.089164733886719, 32.00096130371094, 41.61149597167969, 8.442853927612305, 1.694091796875, 12.082374572753906, -7.0907135009765625, 29.031402587890625, 30.790008544921875, 15.599588394165039, 24.334632873535156, 19.19130516052246, 13.82155990600586, 9.496421813964844, 24.80016326904297, 9.814201354980469, 6.7638702392578125, 2.087858200073242, 10.992149353027344, 12.459068298339844, 1.3656272888183594, 4.852531433105469, 54.30387878417969, 13.908096313476562, -8.817001342773438, 49.358314514160156, 40.477867126464844, 33.045013427734375, 48.54314422607422, 8.643808364868164, 0.23086166381835938, 14.11906623840332, -1.0524406433105469, 24.731674194335938, 3.0507736206054688, 37.15594482421875, 23.644195556640625, 3.3550357818603516, 8.198043823242188, 19.472190856933594, -10.271345138549805, -1.2659950256347656, 12.928133010864258, 46.81805419921875, 4.5522918701171875, 44.24449157714844, 47.39167785644531, 28.396942138671875, 20.000152587890625, 27.40706443786621, 0.84112548828125, -8.381278991699219, 24.84075927734375, 38.1087646484375, 21.337146759033203, 6.085391998291016, 14.453514099121094, -8.607131958007812, 21.672237396240234], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000475.npy"}
{"epoch": 0.7180650037792895, "step": 476, "batch_size": 64, "mean": 10.166099548339844, "std": 16.569677352905273, "min": -22.121002197265625, "p10": -8.033135032653808, "median": 8.729812622070312, "p90": 31.762055587768558, "max": 51.20764923095703, "pos_frac": 0.703125, "sample": [35.23058319091797, 9.73184585571289, -3.0922794342041016, 10.488945007324219, -12.01080322265625, 20.440109252929688, -8.102142333984375, 37.26353073120117, 43.60466003417969, 4.4908447265625, 32.163787841796875, -0.5080165863037109, 49.44947052001953, 17.699851989746094, 19.595706939697266, 10.628105163574219, 6.05828857421875, 7.217315673828125, 13.987869262695312, 8.85479736328125, -7.87211799621582, 2.2243824005126953, 26.19062614440918, 21.615222930908203, 17.807891845703125, -17.9455623626709, -6.571926116943359, 4.4467926025390625, -0.4256744384765625, 4.844247817993164, 44.57191467285156, -17.819290161132812, 16.126358032226562, 0.6182365417480469, 12.33489990234375, -17.3677978515625, -5.95166015625, -15.656997680664062, 24.941444396972656, 2.8783493041992188, -1.2886276245117188, 12.968246459960938, 7.5924530029296875, 8.604827880859375, 14.00762939453125, 7.212806701660156, 9.051803588867188, -2.4337158203125, 0.5869369506835938, 28.04302978515625, 16.261409759521484, -22.121002197265625, 22.585792541503906, -0.8094482421875, 23.022205352783203, 51.20764923095703, -1.7570381164550781, 4.0665740966796875, 30.82468032836914, 16.18145751953125, -3.9680233001708984, 29.117563247680664, 16.63743019104004, -7.146089553833008], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000476.npy"}
{"epoch": 0.7195767195767195, "step": 477, "batch_size": 64, "mean": 11.35832691192627, "std": 17.44481658935547, "min": -26.877727508544922, "p10": -11.014076995849607, "median": 11.731487274169922, "p90": 32.51811676025391, "max": 46.16912841796875, "pos_frac": 0.75, "sample": [-14.863258361816406, -8.7664794921875, 26.360336303710938, 4.579887390136719, 19.56048583984375, 19.707477569580078, 0.7904891967773438, 2.619905471801758, 7.5953369140625, -2.793224334716797, 27.35095977783203, 12.4185791015625, -24.401630401611328, 2.9730567932128906, 13.582498550415039, 3.683462142944336, -5.541191101074219, 44.048370361328125, 15.067581176757812, -8.676713943481445, 18.80682373046875, 38.648345947265625, 13.125885009765625, 15.783378601074219, 28.097597122192383, -3.2392616271972656, -11.977333068847656, 11.044395446777344, 24.494060516357422, 3.803089141845703, -19.234909057617188, 7.498680114746094, 9.37618637084961, -3.1810874938964844, -26.877727508544922, -2.9496116638183594, -8.41009521484375, 10.677803039550781, 14.883071899414062, 10.066413879394531, 40.69175338745117, 0.493865966796875, 32.091575622558594, 46.16912841796875, 17.334463119506836, 32.70092010498047, -15.507122039794922, 22.085784912109375, 30.872055053710938, 0.9228134155273438, 38.41986083984375, 8.109182357788086, 15.738618850708008, 27.761425018310547, 40.64997863769531, 23.153228759765625, -22.7900390625, 20.086669921875, 14.9747314453125, 30.695663452148438, 2.946260452270508, 30.690876007080078, -0.0696563720703125, 22.979248046875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000477.npy"}
{"epoch": 0.7210884353741497, "step": 478, "batch_size": 64, "mean": 10.045768737792969, "std": 17.34210968017578, "min": -23.70909881591797, "p10": -9.151187133789062, "median": 8.075626373291016, "p90": 38.60493278503419, "max": 47.66845703125, "pos_frac": 0.703125, "sample": [-2.3418331146240234, 47.66845703125, 2.36279296875, 23.37005043029785, 9.823484420776367, 1.0135078430175781, 39.02288055419922, 34.556060791015625, -3.807558059692383, 14.363456726074219, 0.91961669921875, 1.7236557006835938, 14.863128662109375, 42.51213836669922, -9.21136474609375, 5.323081970214844, -8.13519287109375, 7.322174072265625, -5.343513488769531, 14.150505065917969, -3.161611557006836, -4.156307220458984, 19.944766998291016, 17.353160858154297, 29.834304809570312, -23.70909881591797, 19.762001037597656, 15.211454391479492, 27.45061492919922, -17.181541442871094, 15.81982421875, 1.3406562805175781, -21.950706481933594, -18.86437225341797, 13.62319564819336, 37.63811492919922, 31.830528259277344, -1.2697296142578125, 19.309654235839844, 8.764434814453125, 39.019283294677734, 0.3249015808105469, -12.592658996582031, 8.057579040527344, 39.51039123535156, 8.93951416015625, -0.00574493408203125, 26.677932739257812, 8.093673706054688, -4.669223785400391, 11.303333282470703, 26.106077194213867, -9.010772705078125, 14.712417602539062, -4.7666168212890625, 40.53558349609375, 43.52495193481445, -8.072990417480469, 2.5841140747070312, 7.144783020019531, 0.2413177490234375, -10.292333602905273, 16.03705596923828, 1.7817707061767578], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000478.npy"}
{"epoch": 0.7226001511715797, "step": 479, "batch_size": 64, "mean": 9.556758880615234, "std": 16.12399673461914, "min": -50.94584655761719, "p10": -4.486983489990234, "median": 7.78143310546875, "p90": 28.881662750244143, "max": 45.729305267333984, "pos_frac": 0.6875, "sample": [10.02517318725586, 34.82571029663086, 0.05804443359375, -3.6600914001464844, -1.1423282623291016, 8.629070281982422, 45.729305267333984, 16.80712890625, 19.228374481201172, -1.41259765625, 6.692867279052734, -6.355770111083984, -3.2434005737304688, -2.0425872802734375, -0.6983718872070312, 16.471633911132812, 28.567642211914062, 22.190223693847656, -2.1695175170898438, 1.6875667572021484, -9.861371994018555, 2.292491912841797, 44.699859619140625, 23.482219696044922, 6.669824600219727, -9.631282806396484, 10.812919616699219, 3.223785400390625, -9.37646484375, -4.740852355957031, 13.865753173828125, 15.489068984985352, 2.32794189453125, 37.7509765625, 22.537551879882812, 5.677181243896484, -17.120418548583984, -2.650970458984375, 12.347042083740234, 25.266971588134766, 37.895721435546875, -50.94584655761719, -3.894622802734375, 7.64250373840332, 24.97010040283203, 43.82833480834961, 20.89708709716797, -2.8279266357421875, 4.077762603759766, 5.101142883300781, 18.520050048828125, 29.01624298095703, -2.007955551147461, 19.019683837890625, 4.963315963745117, 15.47109603881836, 14.964181900024414, 7.92036247253418, 13.37137222290039, 14.37503433227539, -1.6168079376220703, -3.4561386108398438, 13.833869934082031, 17.26372718811035], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000479.npy"}
{"epoch": 0.7241118669690099, "step": 480, "batch_size": 64, "mean": 11.229856491088867, "std": 17.173276901245117, "min": -40.460609436035156, "p10": -5.940003204345703, "median": 12.985103607177734, "p90": 34.797796630859374, "max": 47.717987060546875, "pos_frac": 0.734375, "sample": [12.281604766845703, -5.397552490234375, 15.28830337524414, -15.591644287109375, 20.129037857055664, 3.495321273803711, 16.450925827026367, -39.35671615600586, 4.817497253417969, 47.717987060546875, -1.516702651977539, 9.31021499633789, 22.345382690429688, -1.9503631591796875, 6.877449035644531, -5.1020660400390625, -2.4935302734375, 15.970466613769531, -3.795684814453125, 2.8290061950683594, -14.441661834716797, 25.76909637451172, 22.94934844970703, -12.414024353027344, 34.81330108642578, 33.354042053222656, 40.791107177734375, 14.341194152832031, 38.12843322753906, 12.765426635742188, 19.11524200439453, -1.0692310333251953, 5.11090087890625, 13.204780578613281, 17.899991989135742, 18.508216857910156, 23.383621215820312, 14.550643920898438, -40.460609436035156, 13.562713623046875, 37.20210266113281, 26.048507690429688, 18.547475814819336, 17.496601104736328, 10.130203247070312, 28.77761459350586, 4.604248046875, -5.3160858154296875, 11.050167083740234, 11.556560516357422, 10.271970748901367, -5.9741363525390625, 38.513099670410156, 15.076400756835938, 13.985786437988281, 22.275169372558594, 35.76514434814453, 22.980979919433594, 4.929103851318359, -5.127342224121094, 0.8315658569335938, 34.761619567871094, -5.98699951171875, -5.860359191894531], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000480.npy"}
{"epoch": 0.7256235827664399, "step": 481, "batch_size": 64, "mean": 12.816054344177246, "std": 17.803773880004883, "min": -22.835193634033203, "p10": -5.884752082824707, "median": 10.493221282958984, "p90": 37.20188827514649, "max": 55.35961151123047, "pos_frac": 0.75, "sample": [20.09954071044922, 10.374183654785156, 55.35961151123047, 2.7050552368164062, -6.0738067626953125, 12.039060592651367, 24.921966552734375, 12.990482330322266, -7.906684875488281, -11.824653625488281, 27.586196899414062, 10.68157958984375, 25.622451782226562, 23.273353576660156, -4.665916442871094, 37.68955993652344, 16.078166961669922, 13.558364868164062, 12.332168579101562, 32.799583435058594, -22.835193634033203, -0.8066520690917969, 36.063987731933594, 6.367231369018555, 1.6194038391113281, -5.443624496459961, -0.4744720458984375, 21.871973037719727, 5.830730438232422, 23.52165985107422, -1.657501220703125, 26.411527633666992, 7.490837097167969, 2.118896484375, 7.415489196777344, 11.664627075195312, 18.760902404785156, 9.58962631225586, 7.021881103515625, 6.776317596435547, -9.795257568359375, 8.418956756591797, 31.697965621948242, 53.42843246459961, -2.6143951416015625, 39.18696594238281, 0.06099700927734375, 2.3717117309570312, 54.215423583984375, 25.445871353149414, 22.069053649902344, 10.612258911132812, 3.648591995239258, -0.133148193359375, -4.71833610534668, 42.31378173828125, -22.37861442565918, 27.32904052734375, 12.604215621948242, 52.04536437988281, 21.719755172729492, 2.836963653564453, -13.9329833984375, -5.153057098388672], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000481.npy"}
{"epoch": 0.72713529856387, "step": 482, "batch_size": 64, "mean": 10.592021942138672, "std": 18.434587478637695, "min": -25.694580078125, "p10": -10.144931793212889, "median": 9.272384643554688, "p90": 37.25528106689453, "max": 64.34710693359375, "pos_frac": 0.765625, "sample": [-7.626495361328125, -25.694580078125, -2.953887939453125, 15.160224914550781, 5.554924011230469, 11.211845397949219, 9.525138854980469, -7.391456604003906, 2.760936737060547, 36.61480712890625, 9.973657608032227, 33.3719482421875, 22.373428344726562, 3.4948043823242188, 16.519737243652344, 10.632575988769531, 43.903839111328125, 18.313804626464844, -22.526424407958984, 14.936561584472656, 40.16539001464844, 1.2770233154296875, 55.01239013671875, -20.042770385742188, -0.32965087890625, 16.821619033813477, 37.52976989746094, 15.76077651977539, -1.5912551879882812, 1.4641456604003906, 2.4829330444335938, 17.819089889526367, -12.751344680786133, 8.815223693847656, 42.4559326171875, 24.487350463867188, 18.7265625, 2.1490440368652344, 8.01016616821289, -6.4962158203125, 64.34710693359375, 11.389617919921875, 10.867164611816406, 3.7137680053710938, 7.173789978027344, 23.166744232177734, 9.412055969238281, 7.946161270141602, 45.28419494628906, -24.952831268310547, 15.922264099121094, 9.132713317871094, 15.627185821533203, 6.956878662109375, 17.417072296142578, 0.326934814453125, -15.952674865722656, 35.7352294921875, -10.994987487792969, -5.9627532958984375, 17.203468322753906, 1.3136444091796875, -8.161468505859375, 1.0565738677978516], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000482.npy"}
{"epoch": 0.7286470143613001, "step": 483, "batch_size": 64, "mean": 13.076276779174805, "std": 16.778644561767578, "min": -39.05513000488281, "p10": -2.9154899597167967, "median": 11.055846214294434, "p90": 34.89721450805664, "max": 53.19403076171875, "pos_frac": 0.765625, "sample": [-8.684684753417969, 33.78204345703125, -2.890827178955078, 6.577476501464844, 21.688682556152344, 45.53444290161133, 21.217010498046875, 10.3497314453125, -2.505279541015625, -5.659931182861328, 4.970396041870117, 17.68203353881836, 38.13041687011719, 40.77119445800781, -2.9260597229003906, 31.97601318359375, 9.096271514892578, -14.680473327636719, 3.360076904296875, 27.84687042236328, 20.460205078125, 36.445167541503906, 32.428375244140625, 2.2384033203125, 25.205154418945312, 11.864166259765625, 20.436019897460938, -0.271331787109375, 3.8036956787109375, 4.618072509765625, 9.680412292480469, -17.668548583984375, 43.33149719238281, 1.821340560913086, 7.947021484375, 27.959136962890625, 13.873088836669922, -39.05513000488281, 11.761960983276367, -6.845817565917969, 10.14520263671875, 17.580886840820312, 22.59417724609375, 2.6969642639160156, -2.6873016357421875, 11.86770248413086, 53.19403076171875, 30.915786743164062, 17.80498504638672, 31.950092315673828, 9.328994750976562, 18.776229858398438, 5.9389495849609375, 35.375144958496094, -0.4055500030517578, 2.502439498901367, -0.4183387756347656, 15.286867141723633, 26.93390655517578, 18.374534606933594, 0.4284210205078125, -2.8328781127929688, -0.15547752380371094, 26.01764678955078], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000483.npy"}
{"epoch": 0.7301587301587301, "step": 484, "batch_size": 64, "mean": 10.246103286743164, "std": 12.582767486572266, "min": -14.55419921875, "p10": -6.444997024536132, "median": 10.156850814819336, "p90": 25.01209411621094, "max": 44.6629638671875, "pos_frac": 0.765625, "sample": [10.051624298095703, 18.90752410888672, -14.55419921875, 8.698745727539062, 44.6629638671875, 4.874538421630859, 6.5721588134765625, -10.510723114013672, 25.334808349609375, -6.774528503417969, 10.262077331542969, -7.769264221191406, 7.451061248779297, 8.442249298095703, 15.111055374145508, 22.425350189208984, -1.2351646423339844, -5.377613067626953, 13.997596740722656, 17.78569793701172, -2.7929630279541016, 19.1976318359375, 22.774620056152344, 13.501897811889648, 16.6389217376709, 20.058897018432617, 29.14208984375, 0.239410400390625, 23.527774810791016, 0.9018344879150391, 3.0018272399902344, 17.17108154296875, 1.5318450927734375, -9.73685073852539, 29.985267639160156, -1.3149490356445312, 15.299455642700195, 15.520200729370117, 9.92245864868164, 2.4342613220214844, 13.079858779907227, -1.6957550048828125, 41.23173141479492, -3.374725341796875, 26.54389190673828, 7.131967544555664, 24.25909423828125, 1.1966743469238281, 15.339195251464844, 9.112747192382812, 33.8194580078125, 10.308036804199219, 14.693870544433594, -2.8878936767578125, -9.154518127441406, -5.676090240478516, 17.918331146240234, 3.134307861328125, 9.01662826538086, 16.59770965576172, 14.567092895507812, 17.778182983398438, 15.01340103149414, -7.56329345703125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000484.npy"}
{"epoch": 0.7316704459561603, "step": 485, "batch_size": 64, "mean": 11.878273010253906, "std": 16.221696853637695, "min": -24.2794189453125, "p10": -4.474009323120117, "median": 7.588918685913086, "p90": 37.621496582031256, "max": 48.70496368408203, "pos_frac": 0.75, "sample": [6.532493591308594, 9.603170394897461, 23.543651580810547, 15.942962646484375, 10.118354797363281, 13.597221374511719, 7.663396835327148, -0.8612918853759766, 27.666748046875, -4.358476638793945, -18.531009674072266, 3.934600830078125, 48.70496368408203, 24.500389099121094, -0.9330596923828125, 38.199798583984375, 8.99612808227539, 40.315162658691406, 43.497894287109375, 7.514440536499023, 1.8450298309326172, 14.683242797851562, -0.219207763671875, 41.43798828125, -1.5841598510742188, 0.07534599304199219, -24.2794189453125, 6.710166931152344, 8.559028625488281, 13.706123352050781, -7.0581817626953125, -1.3888359069824219, 33.95903015136719, -0.3331489562988281, 12.25347900390625, -7.3746490478515625, 35.21876525878906, 35.82952117919922, 3.435791015625, 13.050281524658203, 11.074665069580078, -9.081903457641602, 12.11236572265625, 2.3815765380859375, 7.2806243896484375, -4.785913467407227, -2.218158721923828, 27.19994354248047, 36.272125244140625, 6.952484130859375, 4.108020782470703, 11.265769958496094, 17.781494140625, 44.04955291748047, 7.1350555419921875, 33.90164566040039, 11.155731201171875, 44.15009307861328, -3.986663818359375, -4.523523330688477, 6.5812835693359375, 6.098674774169922, 7.1497650146484375, 3.9810104370117188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000485.npy"}
{"epoch": 0.7331821617535903, "step": 486, "batch_size": 64, "mean": 10.990509986877441, "std": 18.090343475341797, "min": -27.376819610595703, "p10": -6.422056388854979, "median": 7.24654483795166, "p90": 36.072709274291995, "max": 69.06289672851562, "pos_frac": 0.75, "sample": [-1.850250244140625, 13.501152038574219, 30.365386962890625, 14.790468215942383, 7.125602722167969, -0.3552818298339844, 10.451927185058594, 69.06289672851562, 17.198280334472656, -4.064727783203125, -4.462427139282227, 17.807411193847656, 12.166023254394531, 7.367486953735352, 23.380844116210938, 1.4108390808105469, 2.579620361328125, 36.160736083984375, -1.4312019348144531, 17.02947998046875, 10.518156051635742, 8.279922485351562, -12.703250885009766, 11.938308715820312, 43.60929870605469, -10.5477294921875, 24.187469482421875, -15.890392303466797, 25.246450424194336, -15.959121704101562, 33.114898681640625, 6.6060791015625, 0.988555908203125, -21.95606231689453, 8.546537399291992, 4.003997802734375, 11.851341247558594, 23.446544647216797, -5.14079475402832, 3.4409446716308594, 1.0964622497558594, 37.82318115234375, 12.268081665039062, 48.180694580078125, 35.867313385009766, -3.9514198303222656, 6.074066162109375, 14.416105270385742, 16.009033203125, 49.58666229248047, 4.852752685546875, 1.0919265747070312, 4.5816650390625, 34.06385803222656, -2.9323959350585938, 0.57403564453125, 1.9736251831054688, 5.992162704467773, 47.03523254394531, -0.7587661743164062, 16.98529052734375, -27.376819610595703, 5.095615386962891, -6.971168518066406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000486.npy"}
{"epoch": 0.7346938775510204, "step": 487, "batch_size": 64, "mean": 11.102409362792969, "std": 15.504838943481445, "min": -25.169618606567383, "p10": -5.979644775390624, "median": 7.720399856567383, "p90": 31.48772125244142, "max": 59.183685302734375, "pos_frac": 0.78125, "sample": [22.608657836914062, -5.3577880859375, 7.246040344238281, 12.265029907226562, 18.306434631347656, 5.099180221557617, 3.1148223876953125, 16.103107452392578, 34.29725646972656, 19.85443878173828, 14.316131591796875, -2.225006103515625, 3.264404296875, -4.6197509765625, -6.24615478515625, 49.67289352416992, 32.98457717895508, 23.991771697998047, 42.11517333984375, 13.338539123535156, -6.252052307128906, 10.37945556640625, 2.8122425079345703, 27.89510726928711, 8.436981201171875, -6.266300201416016, 3.5722999572753906, -14.0240478515625, 2.3807449340820312, 6.448699951171875, 47.138214111328125, 9.109786987304688, 15.045526504516602, -7.53535270690918, 25.939208984375, -1.5694732666015625, 11.457448959350586, -0.8171844482421875, 4.831869125366211, 11.31865119934082, 28.016143798828125, 2.0757484436035156, 5.0689697265625, 18.032283782958984, -10.08059310913086, 5.751033782958984, 6.289396286010742, 0.7250747680664062, 3.0096302032470703, -25.169618606567383, -3.002664566040039, 32.97554016113281, 7.559230804443359, 7.881568908691406, 5.0308990478515625, 20.471511840820312, 4.190883636474609, 14.924659729003906, -3.3386383056640625, 24.808578491210938, 24.79792022705078, 19.964445114135742, 59.183685302734375, 10.95693588256836], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000487.npy"}
{"epoch": 0.7362055933484505, "step": 488, "batch_size": 64, "mean": 13.650503158569336, "std": 15.794641494750977, "min": -12.609395980834961, "p10": -4.091012001037596, "median": 10.778961181640625, "p90": 36.02139968872071, "max": 62.16131591796875, "pos_frac": 0.796875, "sample": [1.6330108642578125, -1.051177978515625, -1.8301258087158203, -8.02284049987793, 0.5635833740234375, -4.707250595092773, -6.780551910400391, 22.34550666809082, 33.91535949707031, 15.815624237060547, 7.958698272705078, 6.076835632324219, 50.94312286376953, 7.298431396484375, 1.355560302734375, 62.16131591796875, 18.546981811523438, 37.62042236328125, 1.1250801086425781, 5.964344024658203, 13.22991943359375, 14.152359008789062, 4.282623291015625, 5.131683349609375, 5.0362701416015625, 9.497505187988281, -10.580764770507812, 30.308631896972656, 6.1480560302734375, 28.540203094482422, 12.271141052246094, -8.950584411621094, -1.1296672821044922, 42.670684814453125, 18.215164184570312, 7.51911735534668, 28.720792770385742, -1.0796451568603516, 36.923988342285156, 26.806854248046875, 30.258514404296875, 11.566085815429688, 17.263893127441406, 9.627090454101562, 29.678504943847656, 11.954582214355469, -1.4009628295898438, 14.425956726074219, -2.6531219482421875, 9.148536682128906, 9.991836547851562, 12.208921432495117, -12.609395980834961, 19.418182373046875, 17.94415283203125, 20.03565216064453, 33.58441162109375, 25.891429901123047, 41.346073150634766, 15.92983627319336, 7.795143127441406, 4.3489532470703125, -9.177330017089844, 38.409034729003906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000488.npy"}
{"epoch": 0.7377173091458806, "step": 489, "batch_size": 64, "mean": 7.184290409088135, "std": 14.340322494506836, "min": -21.07420539855957, "p10": -10.878813552856444, "median": 4.754909515380859, "p90": 25.264949035644534, "max": 41.638511657714844, "pos_frac": 0.703125, "sample": [-21.07420539855957, 13.815498352050781, 18.517417907714844, 17.002342224121094, 29.432430267333984, 25.506744384765625, -2.1333847045898438, 20.349456787109375, 18.544612884521484, 36.65849304199219, 19.1063232421875, 2.813323974609375, 13.335906982421875, -12.21844482421875, -3.2225894927978516, 2.72857666015625, -13.924461364746094, 0.09377670288085938, 22.187530517578125, 9.914796829223633, 41.638511657714844, 22.21747589111328, 8.769643783569336, 5.329341888427734, 17.926345825195312, 24.519134521484375, -5.389984130859375, -7.845268249511719, 18.034927368164062, 3.131519317626953, -10.657257080078125, -15.048686981201172, 33.77302551269531, 3.3895492553710938, -1.39617919921875, -4.472139358520508, 1.4070281982421875, 1.4160480499267578, 20.06891632080078, 6.85467529296875, 4.091154098510742, 9.62432861328125, 1.8810043334960938, 29.2518310546875, 12.244110107421875, -10.973766326904297, 24.700759887695312, 3.1600875854492188, 10.490825653076172, 17.918930053710938, -17.42920684814453, 0.20737457275390625, 7.285022735595703, 1.205270767211914, -5.203182220458984, -19.47314453125, -0.398712158203125, 4.312339782714844, -6.849601745605469, 5.197479248046875, -6.396923065185547, 33.801544189453125, -6.76824951171875, 6.8145599365234375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000489.npy"}
{"epoch": 0.7392290249433107, "step": 490, "batch_size": 64, "mean": 9.813450813293457, "std": 18.563215255737305, "min": -29.588424682617188, "p10": -11.407741546630858, "median": 10.500667572021484, "p90": 35.36553611755372, "max": 47.656166076660156, "pos_frac": 0.65625, "sample": [-2.9196414947509766, -14.236221313476562, -29.360137939453125, 24.05431365966797, -7.607229232788086, 13.560043334960938, -4.571311950683594, 23.590667724609375, 46.04225158691406, 40.137542724609375, -28.251380920410156, 35.6671028137207, -29.588424682617188, 6.501016616821289, 17.736061096191406, 17.635984420776367, 30.282081604003906, 15.501579284667969, 31.916778564453125, -0.52197265625, 0.20664215087890625, 30.196365356445312, 20.464529037475586, 24.57880401611328, -11.938735961914062, 8.900436401367188, 47.656166076660156, 1.114797592163086, 6.950992584228516, 10.395294189453125, -20.325149536132812, 15.111297607421875, 10.606040954589844, 18.846324920654297, -2.542827606201172, 0.39031982421875, -0.9396781921386719, 42.38667297363281, -8.628089904785156, -1.6036529541015625, 18.075641632080078, -7.3511199951171875, 8.768287658691406, 23.353723526000977, -2.1796703338623047, 15.827186584472656, -1.074462890625, -2.249784469604492, -10.168754577636719, -4.207855224609375, 12.564361572265625, 11.077241897583008, 0.794891357421875, 15.309627532958984, 46.386436462402344, 11.086345672607422, 31.265161514282227, 11.249809265136719, 8.11810302734375, 37.73041915893555, -16.704524993896484, -9.352474212646484, 34.66188049316406, 17.684757232666016], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000490.npy"}
{"epoch": 0.7407407407407407, "step": 491, "batch_size": 64, "mean": 10.529365539550781, "std": 13.56296443939209, "min": -19.154129028320312, "p10": -6.484710693359373, "median": 7.966623306274414, "p90": 29.486965942382824, "max": 34.96107482910156, "pos_frac": 0.796875, "sample": [30.896448135375977, -3.3697662353515625, 32.491127014160156, 7.4888153076171875, 15.535011291503906, 13.723915100097656, -1.3726081848144531, 20.690521240234375, 15.604835510253906, 0.3913536071777344, 7.1608428955078125, -11.841499328613281, 21.981914520263672, 0.9149551391601562, 31.655010223388672, -4.512657165527344, 1.6369857788085938, 19.2872314453125, 3.9240493774414062, 14.432540893554688, 3.861358642578125, 19.063446044921875, 27.03127098083496, 7.865901947021484, 11.826787948608398, 26.74770736694336, 7.0296783447265625, -1.2536773681640625, 9.758552551269531, 4.752216339111328, 30.537254333496094, 20.9677734375, -4.998748779296875, 8.067344665527344, 34.82337188720703, 14.246715545654297, -10.128509521484375, 34.96107482910156, 3.12030029296875, -7.121551513671875, 31.506492614746094, 1.5138473510742188, 3.7621803283691406, 3.2672176361083984, 5.987979888916016, 24.353858947753906, -7.546295166015625, -17.847328186035156, 27.036293029785156, 26.288558959960938, 12.976394653320312, 23.904926300048828, 24.65441131591797, 24.165218353271484, 3.235078811645508, 0.6370086669921875, 13.382980346679688, 3.6278533935546875, 21.34490394592285, -19.154129028320312, -11.055870056152344, 6.492240905761719, -4.979900360107422, 18.448211669921875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000491.npy"}
{"epoch": 0.7422524565381708, "step": 492, "batch_size": 64, "mean": 9.632694244384766, "std": 17.54473876953125, "min": -31.16836166381836, "p10": -10.883757400512694, "median": 8.263145446777344, "p90": 32.949589347839364, "max": 52.839019775390625, "pos_frac": 0.6875, "sample": [35.74537658691406, 15.5487060546875, -10.192716598510742, 6.234107971191406, 26.431243896484375, -15.523691177368164, 23.959243774414062, 7.50396728515625, 13.418027877807617, -1.3318042755126953, 20.704025268554688, 18.691024780273438, 9.022323608398438, 1.2748794555664062, 16.516300201416016, 12.576934814453125, -28.122421264648438, 5.224662780761719, 2.7464427947998047, 33.83299255371094, 19.360240936279297, -3.7817459106445312, 2.4444656372070312, -5.212608337402344, 15.477500915527344, 6.712451934814453, 14.921775817871094, 52.839019775390625, -3.7016067504882812, -12.659873962402344, 38.97087097167969, 26.474180221557617, -3.4616165161132812, 16.525009155273438, -3.065420150756836, -1.6757965087890625, -11.263046264648438, 50.80059051513672, -17.088531494140625, -8.286109924316406, 30.02203369140625, 15.864242553710938, 18.66857147216797, -1.2624092102050781, 0.3009033203125, 25.09717559814453, 34.7127571105957, 9.609306335449219, -4.8030548095703125, 30.888315200805664, -31.16836166381836, -9.812019348144531, 3.01983642578125, 20.179405212402344, 22.633399963378906, 4.015911102294922, 37.86861801147461, 13.533966064453125, 15.768417358398438, 0.146728515625, 3.2694644927978516, -10.463130950927734, -11.06402587890625, 30.876991271972656], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000492.npy"}
{"epoch": 0.7437641723356009, "step": 493, "batch_size": 64, "mean": 13.085746765136719, "std": 15.1661958694458, "min": -36.163291931152344, "p10": -1.983644866943358, "median": 10.861846923828125, "p90": 32.75201950073243, "max": 46.224910736083984, "pos_frac": 0.859375, "sample": [8.356758117675781, 2.7076759338378906, -3.368896484375, 23.75582504272461, 11.171173095703125, 6.2039947509765625, -28.465415954589844, 11.88681411743164, 5.799966812133789, 9.060783386230469, 4.3914337158203125, 23.946563720703125, 30.144805908203125, 39.19330596923828, 4.042087554931641, -2.7247962951660156, 35.35906982421875, 5.307407379150391, 30.852210998535156, 24.51313018798828, 5.28248405456543, 3.8906936645507812, -3.2215347290039062, 13.81011962890625, 8.175331115722656, -0.5472488403320312, 22.291549682617188, 9.498176574707031, 41.23113250732422, 25.603090286254883, 8.775482177734375, 21.90268325805664, -2.5992431640625, 17.30498504638672, -7.095558166503906, 11.690322875976562, 46.224910736083984, -0.517059326171875, 27.4725341796875, 6.886383056640625, 24.84532928466797, 22.733551025390625, 18.033523559570312, 10.552520751953125, 5.4847412109375, 40.72857666015625, 1.2568130493164062, 20.587493896484375, 24.635080337524414, 4.317268371582031, 0.662139892578125, 24.094789505004883, 10.277000427246094, 0.43589019775390625, 44.035064697265625, 18.018959045410156, 14.081024169921875, 14.560562133789062, 15.654754638671875, 33.56622314453125, 8.597221374511719, 3.2875328063964844, 15.011909484863281, -36.163291931152344], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000493.npy"}
|
||||
{"epoch": 0.745275888133031, "step": 494, "batch_size": 64, "mean": 9.566875457763672, "std": 17.267995834350586, "min": -35.12706756591797, "p10": -7.819282150268554, "median": 9.197097778320312, "p90": 32.29510211944583, "max": 55.13336944580078, "pos_frac": 0.71875, "sample": [24.58124542236328, 2.4032516479492188, 8.268394470214844, 11.266036987304688, 21.574722290039062, -0.5319061279296875, -7.156898498535156, 25.279020309448242, 15.188396453857422, 0.9970932006835938, 0.4138374328613281, -10.316810607910156, 1.284149169921875, 24.937911987304688, 2.9940643310546875, 9.945297241210938, 24.418210983276367, -0.088592529296875, 19.911785125732422, -8.773895263671875, 0.18437957763671875, 42.67961120605469, 14.73843002319336, 4.152503967285156, 19.598609924316406, 55.13336944580078, 16.7880859375, -1.6403961181640625, 16.27081298828125, 35.30199432373047, 12.436508178710938, -25.721786499023438, -27.937850952148438, 25.154726028442383, -6.826898574829102, -6.2845001220703125, -20.921951293945312, 20.533546447753906, 13.119911193847656, 10.193767547607422, 10.244598388671875, -0.1365375518798828, -35.12706756591797, -1.3179893493652344, 15.76186752319336, 6.450469970703125, 42.987579345703125, 9.372879028320312, 1.1867294311523438, 13.032768249511719, 36.970794677734375, 9.021316528320312, 19.279632568359375, 37.39678955078125, 3.3578338623046875, 2.2283248901367188, 1.128143310546875, -8.103160858154297, 24.373992919921875, -1.6954345703125, 22.425994873046875, -0.406951904296875, -3.9628334045410156, 44.2620735168457], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000494.npy"}
|
||||
{"epoch": 0.7467876039304611, "step": 495, "batch_size": 64, "mean": 7.907181262969971, "std": 14.322943687438965, "min": -24.425880432128906, "p10": -5.950306320190429, "median": 5.305854797363281, "p90": 26.030762672424324, "max": 46.83502197265625, "pos_frac": 0.6875, "sample": [-9.495513916015625, -3.2007293701171875, -11.70770263671875, -24.425880432128906, 30.247642517089844, -6.093692779541016, -8.04389762878418, 4.931365966796875, 11.796085357666016, -0.4853363037109375, -5.6157379150390625, -22.56128692626953, -3.3811492919921875, 8.14364242553711, 9.663383483886719, 43.20751190185547, -3.8289718627929688, 17.85773468017578, 9.270164489746094, 2.6843395233154297, 7.510265350341797, 14.66812515258789, 13.645172119140625, 2.0840015411376953, 19.13573455810547, 26.958349227905273, -3.7533607482910156, 14.110183715820312, 2.4599075317382812, 0.8209304809570312, 46.83502197265625, 18.939233779907227, 3.0817718505859375, -5.556795120239258, 20.355287551879883, 2.2648963928222656, -0.4237060546875, 10.613067626953125, 23.86639404296875, 17.53192138671875, -2.1744384765625, 5.2497406005859375, 23.329959869384766, 13.92669677734375, 35.4809455871582, 3.0230789184570312, 6.63493537902832, -0.8919677734375, -0.5832443237304688, 41.690940856933594, -3.2240543365478516, 21.725257873535156, 12.087196350097656, -4.559833526611328, -9.007545471191406, 31.772384643554688, 10.336395263671875, 0.5443115234375, 5.361968994140625, 2.029876708984375, 10.027841567993164, 5.431526184082031, 22.3741455078125, 1.3651008605957031], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000495.npy"}
|
||||
{"epoch": 0.7482993197278912, "step": 496, "batch_size": 64, "mean": 10.436321258544922, "std": 18.097488403320312, "min": -18.8046875, "p10": -9.065535736083982, "median": 8.441619873046875, "p90": 27.560669708251957, "max": 89.2288818359375, "pos_frac": 0.71875, "sample": [15.389862060546875, 9.245452880859375, 53.1795654296875, 14.568077087402344, -12.55352783203125, 11.064994812011719, 7.637786865234375, 19.498565673828125, 25.214529037475586, 34.31255340576172, -5.536537170410156, 1.2639350891113281, 16.099416732788086, 5.9426727294921875, 0.39345359802246094, -0.7994384765625, 0.19398117065429688, 10.59992790222168, 15.33292007446289, 26.209434509277344, -14.713310241699219, 89.2288818359375, -0.4542407989501953, 26.190818786621094, 13.195953369140625, -18.8046875, 41.252685546875, 19.941146850585938, -2.0476207733154297, 28.1397705078125, -7.014610290527344, 5.368610382080078, 37.21729278564453, 22.342498779296875, -4.846580505371094, 6.23082160949707, -11.074304580688477, -4.927553176879883, 5.355438232421875, 11.712135314941406, 22.982589721679688, 9.855751037597656, 17.88109588623047, 16.47228240966797, 10.389289855957031, 6.79783821105957, -6.9093170166015625, 9.269529342651367, -2.374784469604492, 5.756847381591797, 54.048362731933594, -12.659706115722656, -11.663665771484375, 1.331125259399414, 16.511474609375, 12.813220977783203, 2.3922042846679688, 23.138153076171875, -5.969814300537109, 4.8080902099609375, -1.430419921875, 11.907951354980469, -9.944503784179688, 2.9701995849609375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000496.npy"}
|
||||
{"epoch": 0.7498110355253212, "step": 497, "batch_size": 64, "mean": 14.204545021057129, "std": 17.420249938964844, "min": -23.30150604248047, "p10": -5.022790527343749, "median": 12.711773872375488, "p90": 33.2309455871582, "max": 58.891448974609375, "pos_frac": 0.78125, "sample": [58.23075866699219, 11.88656234741211, 58.891448974609375, 29.82977294921875, -0.057464599609375, -3.9420928955078125, 10.100418090820312, 5.480476379394531, 12.441308975219727, 16.339012145996094, 27.14290428161621, 7.92474365234375, 26.803701400756836, 11.781608581542969, 13.4501953125, -0.8025321960449219, 24.075531005859375, -23.30150604248047, 29.123546600341797, 22.66558074951172, -1.601593017578125, 22.483951568603516, 2.4734649658203125, -3.0629119873046875, 5.623744964599609, -7.41119384765625, -8.940078735351562, 19.557720184326172, 29.786415100097656, 34.530967712402344, 2.510324478149414, 12.98223876953125, 10.529739379882812, 15.467208862304688, -15.081085205078125, 15.88543701171875, 32.991127014160156, 24.87718391418457, 21.622344970703125, -3.9323081970214844, 33.33372497558594, 10.101419448852539, 1.8275642395019531, 18.06218719482422, 22.600830078125, -8.243637084960938, 49.33445739746094, 46.59156036376953, 1.406036376953125, 2.207977294921875, 15.595592498779297, 25.4722900390625, 29.5667781829834, 8.809207916259766, 5.4835662841796875, 56.32741928100586, -16.684127807617188, 3.800689697265625, -1.2082099914550781, 16.606842041015625, 20.75262451171875, 4.012054443359375, -5.4859466552734375, 19.463348388671875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000497.npy"}
|
||||
{"epoch": 0.7513227513227513, "step": 498, "batch_size": 64, "mean": 12.491800308227539, "std": 15.746021270751953, "min": -29.118621826171875, "p10": -2.2314086914062496, "median": 9.041820526123047, "p90": 34.45784187316895, "max": 51.870697021484375, "pos_frac": 0.78125, "sample": [-2.332611083984375, 10.252456665039062, 31.879013061523438, 14.482646942138672, -1.995269775390625, -10.474807739257812, 20.47252655029297, 33.17633056640625, -3.57379150390625, -1.5911102294921875, 15.853019714355469, 9.969245910644531, 10.728919982910156, 8.376060485839844, 38.212059020996094, 4.21258544921875, 2.5082626342773438, -3.1091175079345703, 0.9111404418945312, 17.005538940429688, 39.44218444824219, 2.56890869140625, 29.00815773010254, 3.750425338745117, 7.832710266113281, 8.582748413085938, -4.288078308105469, 39.184913635253906, 31.16350555419922, 1.64410400390625, 23.233108520507812, 35.00706100463867, 3.1640625, 14.436737060546875, 7.004310607910156, -5.996246337890625, -1.8967208862304688, 15.074792861938477, 0.5289382934570312, -0.7446117401123047, 27.106903076171875, 31.406518936157227, -29.118621826171875, 15.79936408996582, 9.85276985168457, 6.76513671875, 0.7260665893554688, 9.500892639160156, 11.33792495727539, 7.622261047363281, 30.85810089111328, 49.46925354003906, 27.772933959960938, 5.090309143066406, 41.102447509765625, 1.6358928680419922, 31.184402465820312, 12.554046630859375, -0.1599712371826172, -1.56854248046875, 1.3932647705078125, -1.0471649169921875, 14.656227111816406, 51.870697021484375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000498.npy"}
|
||||
{"epoch": 0.7528344671201814, "step": 499, "batch_size": 64, "mean": 11.773270606994629, "std": 17.379684448242188, "min": -25.83892822265625, "p10": -10.105386543273925, "median": 9.27505111694336, "p90": 35.23752136230469, "max": 56.037384033203125, "pos_frac": 0.734375, "sample": [22.033164978027344, 37.33027648925781, -25.83892822265625, 34.839691162109375, 11.899612426757812, 7.983737945556641, 6.459836959838867, -17.658737182617188, 5.191356658935547, 8.129867553710938, 25.536426544189453, 15.426567077636719, 41.07582092285156, -10.77011489868164, 20.936981201171875, 2.6492385864257812, 56.037384033203125, -16.817520141601562, 36.63861083984375, 35.40802001953125, 14.895584106445312, 0.574493408203125, -7.803012847900391, -8.100261688232422, -6.08906364440918, 22.05084228515625, 9.6170654296875, 26.990737915039062, -6.765569686889648, 7.151111602783203, -0.8485069274902344, -1.0137443542480469, 16.22119140625, 10.65203857421875, 16.845197677612305, -2.17205810546875, 3.9295578002929688, 19.728694915771484, -8.554353713989258, 7.113986968994141, 33.26481628417969, 3.7983627319335938, 23.710723876953125, 7.947511672973633, -17.9383544921875, 13.192298889160156, 5.940765380859375, 31.610937118530273, 26.148040771484375, 48.770545959472656, 18.9102783203125, 6.469017028808594, 27.23849105834961, -12.099990844726562, -10.846519470214844, 27.323883056640625, 20.277204513549805, 11.9705810546875, -5.63226318359375, 30.810012817382812, 4.127220153808594, 39.03319549560547, 8.933036804199219, -0.38571929931640625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000499.npy"}
|
||||
{"epoch": 0.7543461829176115, "step": 500, "batch_size": 64, "mean": 10.816570281982422, "std": 15.792787551879883, "min": -26.341796875, "p10": -6.9647186279296855, "median": 8.059928894042969, "p90": 33.354944038391125, "max": 48.8157844543457, "pos_frac": 0.8125, "sample": [-5.30621337890625, 9.433082580566406, 6.5710296630859375, 1.9752674102783203, 14.294052124023438, 6.409370422363281, 17.324386596679688, 38.287994384765625, 2.95648193359375, 36.87216567993164, 9.921234130859375, 34.483970642089844, 24.741195678710938, -0.00563812255859375, 29.06085205078125, 24.89942169189453, 48.8157844543457, -2.435943603515625, 7.4236602783203125, -1.5554351806640625, 28.676719665527344, 17.45917510986328, 22.29436492919922, 30.066925048828125, 1.1611251831054688, 39.024967193603516, 15.842178344726562, 3.047351837158203, 2.7155227661132812, -19.708641052246094, -7.675506591796875, 3.318035125732422, 2.217395782470703, 7.595947265625, -17.143415451049805, -4.760339736938477, 10.64306640625, 7.0646209716796875, 23.9844970703125, 1.3693618774414062, 15.820549011230469, 3.4983081817626953, 19.124725341796875, 6.486713409423828, -13.651554107666016, 23.643007278442383, 3.793956756591797, -26.341796875, 27.411758422851562, -18.858787536621094, 36.609710693359375, 2.284515380859375, 0.5483989715576172, 9.230117797851562, 15.976375579833984, 7.259548187255859, 8.523910522460938, 14.569366455078125, 17.539175033569336, 1.6725425720214844, -15.017166137695312, 14.777902603149414, 30.720548629760742, 35.27861785888672], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000500.npy"}
|
||||
{"epoch": 0.7558578987150416, "step": 501, "batch_size": 64, "mean": 10.493215560913086, "std": 12.197235107421875, "min": -19.282852172851562, "p10": -1.1828601837158197, "median": 7.189613342285156, "p90": 25.04252281188965, "max": 45.6258544921875, "pos_frac": 0.859375, "sample": [7.902717590332031, 19.663475036621094, 10.938980102539062, 16.087509155273438, 14.689340591430664, 16.516992568969727, 5.26531982421875, 10.775688171386719, 12.86672592163086, 41.15751647949219, 34.87385559082031, 2.755117416381836, 18.617496490478516, 3.9455642700195312, 0.9945602416992188, 0.6426906585693359, 16.71131134033203, 23.335914611816406, -13.018913269042969, -5.124143600463867, 5.764396667480469, -1.4314727783203125, -0.31014251708984375, 21.367332458496094, 45.6258544921875, 31.343589782714844, 7.772590637207031, 34.13490295410156, 9.001262664794922, 1.9055290222167969, 1.5766525268554688, 6.3650360107421875, -1.8037395477294922, 4.518787384033203, 14.292205810546875, 4.620819091796875, 18.43107032775879, -2.4753799438476562, 5.158576965332031, 3.5268726348876953, 2.646970748901367, 24.4320068359375, 1.9806556701660156, 6.217731475830078, 7.1742401123046875, -19.282852172851562, 0.9198074340820312, 20.965484619140625, -5.637248992919922, 13.403770446777344, 13.982261657714844, 20.68517303466797, 3.7614593505859375, 2.0593643188476562, 4.9980010986328125, -0.6027641296386719, 7.031951904296875, 5.305631637573242, 22.412445068359375, 25.30417251586914, 33.62892150878906, 15.061960220336914, 8.929231643676758, 7.204986572265625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000501.npy"}
|
||||
{"epoch": 0.7573696145124716, "step": 502, "batch_size": 64, "mean": 9.898885726928711, "std": 13.561397552490234, "min": -16.11590576171875, "p10": -6.766959381103512, "median": 8.041860580444336, "p90": 24.468042755126962, "max": 56.431976318359375, "pos_frac": 0.84375, "sample": [-3.1536941528320312, 3.958192825317383, 3.5351333618164062, -9.928947448730469, 18.962005615234375, 19.14804458618164, 4.328315734863281, 2.26141357421875, -9.706298828125, 10.366058349609375, 14.072681427001953, 34.55419921875, 0.20691680908203125, -9.553565979003906, 12.946807861328125, 42.879974365234375, 8.964157104492188, -1.4480438232421875, 7.4789581298828125, 3.5107269287109375, 9.11456298828125, 0.9826469421386719, 18.427309036254883, 13.64456558227539, 4.856605529785156, 4.370838165283203, 21.799575805664062, -3.2827835083007812, 0.9760303497314453, 15.378120422363281, 2.5002975463867188, 2.3048343658447266, 11.698577880859375, 5.19019889831543, 10.486019134521484, 4.898345947265625, 42.359249114990234, 56.431976318359375, -10.60992431640625, 0.2735328674316406, -12.25381851196289, 5.000465393066406, 12.707813262939453, 1.53741455078125, 6.538579940795898, 2.4266891479492188, 8.913848876953125, 21.1324462890625, 21.79602813720703, -8.260177612304688, 16.679780960083008, 18.480682373046875, 18.162338256835938, 5.5267333984375, 14.53546142578125, -16.11590576171875, 18.317520141601562, 25.611671447753906, 3.1338729858398438, 30.134490966796875, 8.60476303100586, 15.082893371582031, 33.51611328125, 17.135324478149414], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000502.npy"}
|
||||
{"epoch": 0.7588813303099018, "step": 503, "batch_size": 64, "mean": 10.574769020080566, "std": 15.689276695251465, "min": -24.79330825805664, "p10": -6.679472160339355, "median": 8.374431610107422, "p90": 31.46540489196778, "max": 42.532005310058594, "pos_frac": 0.8125, "sample": [15.857826232910156, -7.041399002075195, -5.3050994873046875, 2.205820083618164, 22.890403747558594, 6.839790344238281, 32.195655822753906, -24.259414672851562, 5.645195007324219, 21.02214813232422, 19.03863525390625, 6.057769775390625, 8.117271423339844, 39.87179946899414, -0.6833114624023438, 14.113155364990234, -5.183671951293945, 1.500732421875, 36.64189147949219, -11.434112548828125, 0.32480621337890625, 8.064823150634766, 42.532005310058594, 14.40631103515625, 11.60275650024414, 8.631591796875, 18.530418395996094, 27.690399169921875, 3.586528778076172, 20.59524154663086, -1.3036613464355469, 3.3492164611816406, 37.48501968383789, 27.368309020996094, 29.761486053466797, 0.23523712158203125, 15.249637603759766, -5.8349761962890625, -21.838241577148438, 0.5535507202148438, 7.721546173095703, 7.299278259277344, 0.9001235961914062, 13.246343612670898, 3.8085479736328125, 3.6735763549804688, 7.70355224609375, 26.51031494140625, 34.37409973144531, 13.59664535522461, 24.32845687866211, 25.941072463989258, -24.79330825805664, 3.216869354248047, 36.45709228515625, 29.33340835571289, 1.2444629669189453, -14.378320693969727, 12.503463745117188, 16.864112854003906, 9.263021469116211, 12.448402404785156, 26.454618453979492, -20.013683319091797], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000503.npy"}
|
||||
{"epoch": 0.7603930461073318, "step": 504, "batch_size": 64, "mean": 13.92115592956543, "std": 14.116393089294434, "min": -8.929573059082031, "p10": -1.2637546539306637, "median": 11.290643692016602, "p90": 32.09833602905274, "max": 54.95655059814453, "pos_frac": 0.828125, "sample": [-0.21258544921875, 32.68750762939453, -2.24273681640625, 26.479188919067383, 18.021888732910156, -2.0642967224121094, 20.207794189453125, 11.159353256225586, 47.09983825683594, -0.8959598541259766, 2.3642578125, 13.86187744140625, 6.869319915771484, 40.26700973510742, 7.520938873291016, 28.720439910888672, 15.194808959960938, -6.585746765136719, 18.64209747314453, 45.042945861816406, 19.600421905517578, 7.729866027832031, 4.315464019775391, 29.44306755065918, 30.723602294921875, -1.6500396728515625, 6.256551742553711, 34.50785446166992, 5.032674789428711, 54.95655059814453, 16.93572998046875, 16.8687744140625, -1.4213809967041016, 21.04251480102539, 21.58685302734375, -0.3507080078125, 5.080997467041016, -8.929573059082031, 26.361175537109375, 7.5113983154296875, 26.296737670898438, 1.8105316162109375, 0.2650566101074219, 13.0565185546875, -0.47467041015625, 20.273334503173828, 12.300983428955078, 9.464887619018555, 7.675506591796875, 7.553089141845703, 28.901103973388672, 3.8365936279296875, -7.979867935180664, 2.89935302734375, 11.421934127807617, 10.605987548828125, 3.9600963592529297, 24.665634155273438, 16.113998413085938, 3.393421173095703, 11.896135330200195, 44.03852844238281, 13.000007629394531, 8.239347457885742], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000504.npy"}
|
||||
{"epoch": 0.7619047619047619, "step": 505, "batch_size": 64, "mean": 9.950593948364258, "std": 16.557979583740234, "min": -28.37420654296875, "p10": -7.420875930786131, "median": 6.9246416091918945, "p90": 32.17185516357422, "max": 55.49956512451172, "pos_frac": 0.71875, "sample": [18.50088882446289, -1.5726661682128906, 4.2105560302734375, 6.087459564208984, -2.340984344482422, 32.3177490234375, -2.026123046875, 52.14863586425781, 18.04754066467285, 25.590198516845703, 35.93833923339844, -8.151927947998047, 3.8946762084960938, 2.64361572265625, 10.543983459472656, 3.3055572509765625, 5.960838317871094, 1.9606704711914062, 13.490079879760742, 9.458747863769531, -5.4897003173828125, 16.14599609375, 22.332874298095703, 13.465961456298828, 20.400428771972656, 55.49956512451172, 13.578544616699219, -5.715087890625, 10.628562927246094, -0.45063018798828125, 17.60369873046875, -2.7081298828125, 33.981781005859375, -18.87621307373047, 43.69927215576172, 26.425392150878906, 11.246574401855469, 20.430150985717773, -3.638914108276367, -2.5934982299804688, 13.498687744140625, 5.498931884765625, 0.864044189453125, -28.37420654296875, 7.9149932861328125, 6.136941909790039, 14.246406555175781, -8.476564407348633, 31.831436157226562, -20.735157012939453, -4.547065734863281, 21.38208770751953, 33.5195198059082, 7.189910888671875, 28.297428131103516, 2.732635498046875, 19.052734375, -12.834503173828125, 6.229213714599609, 30.626508712768555, 6.659372329711914, 1.1290760040283203, -5.432182312011719, -15.546722412109375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000505.npy"}
|
||||
{"epoch": 0.763416477702192, "step": 506, "batch_size": 64, "mean": 9.846002578735352, "std": 16.15572738647461, "min": -27.940040588378906, "p10": -7.506916427612303, "median": 9.0821533203125, "p90": 28.281248092651367, "max": 63.461212158203125, "pos_frac": 0.765625, "sample": [10.111654281616211, 24.70608901977539, 0.19232177734375, -1.258453369140625, 14.706878662109375, 9.4075927734375, 13.173595428466797, -2.8319549560546875, 12.819168090820312, -7.972625732421875, 1.787363052368164, 19.833709716796875, 7.524375915527344, 8.651046752929688, 5.58953857421875, 8.7567138671875, -15.615806579589844, 43.064910888671875, 63.461212158203125, 0.49623870849609375, 28.460594177246094, -14.499168395996094, 5.555877685546875, -21.891128540039062, -13.592033386230469, -27.940040588378906, 3.7471237182617188, 41.93872833251953, 28.686920166015625, 27.637924194335938, 48.65058517456055, -12.753768920898438, 30.344383239746094, -4.0320587158203125, 26.57265853881836, 1.8253860473632812, 27.862773895263672, 0.6495742797851562, 19.81964874267578, -3.186370849609375, -2.30059814453125, 0.08678817749023438, 3.8479576110839844, 15.469810485839844, 12.262428283691406, 10.365825653076172, -0.6736793518066406, 6.461181640625, 12.406465530395508, 26.792800903320312, 19.845666885375977, 15.300613403320312, 4.638635635375977, 18.928916931152344, 12.955028533935547, 12.188240051269531, 2.6579971313476562, 16.292720794677734, -2.0725326538085938, -6.420261383056641, 7.038887023925781, 10.311622619628906, 13.4613037109375, 9.837127685546875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000506.npy"}
|
||||
{"epoch": 0.764928193499622, "step": 507, "batch_size": 64, "mean": 11.408361434936523, "std": 16.526464462280273, "min": -16.389678955078125, "p10": -8.012922286987303, "median": 10.96171760559082, "p90": 33.35313873291016, "max": 60.50457763671875, "pos_frac": 0.734375, "sample": [21.675575256347656, 0.707916259765625, -8.406604766845703, 24.2159423828125, 60.50457763671875, 11.665813446044922, 25.237091064453125, -14.68023681640625, -13.559524536132812, 14.438461303710938, 4.543813705444336, 14.780258178710938, -5.1357574462890625, 34.090301513671875, -1.5122032165527344, 15.542633056640625, 2.970428466796875, -6.3475189208984375, 22.510555267333984, 4.7983245849609375, 5.7613067626953125, 22.68085479736328, 23.633060455322266, -6.703010559082031, 32.957679748535156, 1.070444107055664, 6.775888442993164, 18.949554443359375, 9.83909797668457, 14.643173217773438, 7.657379150390625, -2.27587890625, 54.10130310058594, -5.1641387939453125, 14.024711608886719, 17.51238441467285, -7.094329833984375, 21.443214416503906, -10.244117736816406, -5.8977813720703125, 0.5325469970703125, 14.865509033203125, 1.5686912536621094, 23.093271255493164, 15.722328186035156, 4.2274627685546875, 18.752593994140625, 7.9281768798828125, 38.98907470703125, 45.484222412109375, -16.389678955078125, 20.002532958984375, -3.7032318115234375, 10.257621765136719, 23.882171630859375, -2.2628097534179688, 19.305988311767578, -8.684242248535156, 15.214447021484375, -14.966907501220703, 14.881851196289062, 33.522621154785156, 6.366115570068359, 39.834129333496094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000507.npy"}
|
||||
{"epoch": 0.7664399092970522, "step": 508, "batch_size": 64, "mean": 12.621658325195312, "std": 15.87256145477295, "min": -17.194395065307617, "p10": -6.963643074035643, "median": 12.151611328125, "p90": 31.766323852539063, "max": 58.46418762207031, "pos_frac": 0.78125, "sample": [20.541828155517578, 1.7569732666015625, 31.80987548828125, 21.88128662109375, 6.390174865722656, -8.192436218261719, 58.46418762207031, -5.273317337036133, 25.74816131591797, 31.664703369140625, 54.470611572265625, 28.057178497314453, 6.370765686035156, 5.399116516113281, 4.8988037109375, 22.08684539794922, -17.194395065307617, 15.400932312011719, 5.7364959716796875, -10.933902740478516, 12.202133178710938, -0.2900199890136719, 18.912748336791992, -2.1802825927734375, -1.6602554321289062, -12.6793212890625, 13.321434020996094, 3.5376739501953125, 18.306427001953125, 17.261394500732422, 1.4795074462890625, 27.167770385742188, 7.5755462646484375, 14.463981628417969, 24.835140228271484, 23.9195556640625, 6.247798919677734, 16.163803100585938, 36.50762939453125, 3.9806747436523438, -2.5484619140625, -8.414215087890625, 26.015548706054688, 18.144283294677734, -4.245262145996094, 2.2208709716796875, 19.884567260742188, 36.839263916015625, 36.808738708496094, 4.572603225708008, 22.93936538696289, 1.4687862396240234, 3.003631591796875, -7.688068389892578, 8.065536499023438, -16.274150848388672, -4.996437072753906, 39.90262222290039, 14.683692932128906, 23.139404296875, 17.398286819458008, 12.101089477539062, 9.023979187011719, 27.58325958251953], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000508.npy"}
|
||||
{"epoch": 0.7679516250944822, "step": 509, "batch_size": 64, "mean": 10.063714981079102, "std": 14.2826509475708, "min": -22.31212615966797, "p10": -8.517967605590819, "median": 10.420276641845703, "p90": 26.343786621093756, "max": 43.71990966796875, "pos_frac": 0.734375, "sample": [19.201171875, -6.884243011474609, 5.289909362792969, 19.969650268554688, 6.055206298828125, 0.12157821655273438, 5.770380020141602, -3.2166824340820312, -18.749488830566406, 2.566385269165039, 19.13054656982422, 29.6451416015625, -2.832183837890625, 4.630990982055664, 26.913665771484375, -22.31212615966797, 43.71990966796875, 20.47002410888672, -1.2321929931640625, 6.0317535400390625, 13.836311340332031, -5.369361877441406, 16.92547607421875, -11.470672607421875, 22.695289611816406, 5.409515380859375, -2.4551315307617188, 3.705883026123047, 18.020309448242188, 7.167572021484375, 17.611679077148438, 30.284061431884766, 10.380447387695312, 10.460105895996094, 9.079742431640625, -0.5126819610595703, 20.24734878540039, 11.095527648925781, 14.3111572265625, 34.98844909667969, 19.287734985351562, -2.3773574829101562, 29.50975799560547, -14.185226440429688, 24.159591674804688, 16.110260009765625, 8.042564392089844, 15.02447509765625, 41.03722381591797, 24.420310974121094, 25.014068603515625, -0.5876541137695312, -9.161895751953125, -18.252593994140625, 3.6493072509765625, 19.518661499023438, 20.4741268157959, -7.015468597412109, -11.044586181640625, 7.310874938964844, 16.348793029785156, 17.85260772705078, 24.657424926757812, 13.584327697753906], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000509.npy"}
|
||||
{"epoch": 0.7694633408919124, "step": 510, "batch_size": 64, "mean": 11.428990364074707, "std": 15.367403984069824, "min": -17.90038299560547, "p10": -9.692417526245116, "median": 9.667641639709473, "p90": 31.4952865600586, "max": 46.489646911621094, "pos_frac": 0.8125, "sample": [4.625526428222656, 30.24134063720703, 21.321449279785156, 8.37896728515625, -16.715087890625, 3.311370849609375, 17.554187774658203, 46.21222686767578, 9.794822692871094, 7.4882354736328125, 1.12939453125, 2.027801513671875, 4.041080474853516, 6.864383697509766, -17.18557357788086, -0.4697608947753906, 2.1668243408203125, 9.028038024902344, 1.9688568115234375, 6.542299270629883, 9.222976684570312, 20.4799861907959, 29.269638061523438, -9.062915802001953, 2.0278167724609375, 26.498851776123047, 21.124732971191406, 0.39642333984375, 2.3435802459716797, 22.219043731689453, 20.150588989257812, 15.311630249023438, 35.495147705078125, 18.434152603149414, 11.989864349365234, 12.079620361328125, 13.787681579589844, 45.669029235839844, -1.2099761962890625, -9.962203979492188, 3.749736785888672, -12.325698852539062, 16.41036605834961, 20.789180755615234, 16.687362670898438, 5.504844665527344, 8.257339477539062, 46.489646911621094, 19.8494873046875, -5.4630126953125, 19.937828063964844, 9.757904052734375, 43.51301193237305, 10.936796188354492, -10.847831726074219, 14.93564224243164, 38.355438232421875, 32.032691955566406, -1.3175220489501953, 13.551481246948242, 26.606292724609375, 9.57737922668457, -12.224693298339844, -17.90038299560547], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000510.npy"}
|
||||
{"epoch": 0.7709750566893424, "step": 511, "batch_size": 64, "mean": 14.473926544189453, "std": 16.594026565551758, "min": -14.777664184570312, "p10": -6.422794342041016, "median": 11.97134780883789, "p90": 38.85139999389649, "max": 49.96331787109375, "pos_frac": 0.84375, "sample": [49.96331787109375, 29.632598876953125, 14.074159622192383, 44.786949157714844, 11.698898315429688, 5.161369323730469, 4.6090087890625, 1.2193145751953125, 34.597938537597656, 35.96595001220703, 14.846267700195312, -13.003257751464844, 16.780258178710938, 9.968879699707031, 36.527313232421875, 17.183135986328125, 18.72392463684082, 2.4600448608398438, 42.50348663330078, 20.4788818359375, 32.84455871582031, -6.066070556640625, 5.672698974609375, -14.686119079589844, -11.980339050292969, -4.9130859375, 42.4828987121582, 21.695968627929688, 11.947196960449219, 9.216156005859375, 0.8074817657470703, 7.2990264892578125, 15.319686889648438, 8.189178466796875, 14.9256591796875, 16.017837524414062, 30.012195587158203, 4.299617767333984, -8.152267456054688, 8.344308853149414, 3.35430908203125, 21.05257797241211, 24.894943237304688, 39.561668395996094, 35.763336181640625, 18.750289916992188, -11.094871520996094, 5.36187744140625, 2.361806869506836, 45.25133514404297, 9.390380859375, 4.904630661010742, 22.537914276123047, 13.028083801269531, 1.8041572570800781, 4.43016242980957, -6.575675964355469, 43.96617889404297, 11.995498657226562, 28.858829498291016, 37.19410705566406, -1.8682289123535156, -14.777664184570312, 4.730640411376953], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000511.npy"}
|
||||
{"epoch": 0.7724867724867724, "step": 512, "batch_size": 64, "mean": 11.295859336853027, "std": 20.635469436645508, "min": -23.92041015625, "p10": -7.498609161376952, "median": 7.136512756347656, "p90": 34.8848373413086, "max": 112.27423095703125, "pos_frac": 0.703125, "sample": [17.813987731933594, 10.124008178710938, 23.1297664642334, -14.666118621826172, -2.9668312072753906, 39.12293243408203, 2.1199569702148438, 1.8792095184326172, 0.6743545532226562, 12.563896179199219, 16.331317901611328, 25.569564819335938, -23.92041015625, -5.247257232666016, 14.548311233520508, -0.0225982666015625, -3.2904281616210938, -1.9969749450683594, 34.04491424560547, -0.17565345764160156, 41.986419677734375, 25.896085739135742, 2.4906997680664062, 12.297149658203125, 7.6736602783203125, 27.7572021484375, -19.002464294433594, 15.939338684082031, 16.31639862060547, 41.36741638183594, 0.8530006408691406, 18.053512573242188, 35.199729919433594, 0.38664817810058594, 15.257736206054688, 3.016998291015625, -0.3748149871826172, 19.100296020507812, -3.8292980194091797, 25.688751220703125, 1.3780670166015625, -12.332185745239258, -2.881641387939453, -6.5301971435546875, 7.964271545410156, 28.618148803710938, -0.11791229248046875, 6.352573394775391, 6.599365234375, 1.0126457214355469, 34.56092071533203, -2.941436767578125, 3.2813262939453125, -12.107803344726562, 2.1647491455078125, 15.571235656738281, 112.27423095703125, 35.023658752441406, 52.644325256347656, 26.58025360107422, 10.149770736694336, -19.692302703857422, 11.566139221191406, -7.913642883300781], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000512.npy"}
|
||||
{"epoch": 0.7739984882842026, "step": 513, "batch_size": 64, "mean": 11.771991729736328, "std": 14.384750366210938, "min": -15.460956573486328, "p10": -4.867130661010742, "median": 8.878032684326172, "p90": 34.62731361389161, "max": 52.47068405151367, "pos_frac": 0.828125, "sample": [35.193504333496094, 22.30156707763672, 44.54658126831055, 11.542194366455078, 15.977775573730469, 7.7165985107421875, 9.082305908203125, 10.799644470214844, 3.941740036010742, -5.445606231689453, 7.114067077636719, -14.581062316894531, 23.731948852539062, 5.234308242797852, 5.919923782348633, 17.19607162475586, -1.2329368591308594, 7.0853424072265625, 52.47068405151367, 25.724876403808594, 18.740280151367188, 41.47335433959961, 15.600112915039062, 6.526317596435547, 35.84159851074219, 2.2865562438964844, 10.073379516601562, 13.11807632446289, 4.204507827758789, 11.019718170166016, 6.956329345703125, 1.3536109924316406, 20.521194458007812, -15.460956573486328, 7.345207214355469, 10.178972244262695, -7.055015563964844, 6.183563232421875, 40.78929138183594, 5.338184356689453, -4.123514175415039, -8.678800582885742, 44.52143096923828, -0.4470977783203125, 23.10211181640625, 30.651233673095703, 33.30620193481445, 5.081474304199219, 17.809051513671875, 8.673759460449219, 9.458877563476562, 11.636363983154297, 12.779731750488281, 7.2860260009765625, 10.350677490234375, -5.185823440551758, 5.236045837402344, 6.952190399169922, 3.3853683471679688, 2.9640274047851562, 18.589595794677734, -10.514724731445312, -0.4890899658203125, 11.708498001098633], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000513.npy"}
{"epoch": 0.7755102040816326, "step": 514, "batch_size": 64, "mean": 11.144484519958496, "std": 14.885872840881348, "min": -28.569156646728516, "p10": -6.767321777343749, "median": 12.25276803970337, "p90": 31.801741790771487, "max": 44.21026611328125, "pos_frac": 0.75, "sample": [1.8675823211669922, 3.2627220153808594, 14.008541107177734, 22.581707000732422, 16.297073364257812, 7.63859748840332, 35.478370666503906, -3.934326171875, 8.1116943359375, 2.3016090393066406, 13.99686050415039, 24.380382537841797, 10.342781066894531, 14.759979248046875, -1.162832260131836, 4.501068115234375, -21.729568481445312, 15.561180114746094, -0.52911376953125, 10.390678405761719, -28.569156646728516, 2.297698974609375, 19.57834815979004, 17.965463638305664, 42.661964416503906, -3.1827239990234375, -1.7802047729492188, 19.52169418334961, 18.439979553222656, 25.404769897460938, -4.641334533691406, -10.02623176574707, 21.88996124267578, 13.969802856445312, 14.768852233886719, 15.068069458007812, 16.210067749023438, 31.9713134765625, -0.07619094848632812, 8.938373565673828, 5.668861389160156, -7.180091857910156, -8.967662811279297, 5.212982177734375, 15.806121826171875, 16.45899200439453, 11.6724853515625, 0.007659912109375, 12.833050727844238, 19.106590270996094, -5.804191589355469, 7.391422271728516, 9.788124084472656, 32.960105895996094, -12.0404052734375, 31.01031494140625, 34.92706298828125, 15.45758056640625, 31.40607452392578, -3.222156524658203, 44.21026611328125, 39.13513946533203, -7.4885101318359375, 26.361648559570312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000514.npy"}
{"epoch": 0.7770219198790628, "step": 515, "batch_size": 64, "mean": 11.970718383789062, "std": 14.557340621948242, "min": -14.634834289550781, "p10": -7.51922569274902, "median": 11.088534355163574, "p90": 30.294529724121098, "max": 44.853485107421875, "pos_frac": 0.796875, "sample": [1.1269187927246094, 11.729833602905273, 9.974647521972656, 18.98308563232422, 39.87034606933594, 27.28059196472168, 15.98220443725586, 25.81155776977539, 7.572044372558594, 21.131752014160156, 20.93524169921875, 15.064994812011719, 22.28605842590332, -4.546060562133789, 16.0706729888916, 10.447235107421875, -10.48675537109375, 22.859649658203125, -8.686668395996094, 5.4539947509765625, 13.886795043945312, 0.04326629638671875, 3.055206298828125, -1.4762039184570312, -13.277862548828125, 18.61477279663086, 44.853485107421875, 24.334022521972656, 9.719535827636719, 0.06302642822265625, 13.900337219238281, 15.364696502685547, 1.8666572570800781, 2.4894351959228516, 16.243865966796875, 30.7752685546875, -14.634834289550781, 18.741920471191406, 2.7737903594970703, 15.208053588867188, -4.795192718505859, 4.871551513671875, 32.08958435058594, -11.91278076171875, 8.82305908203125, 44.24971008300781, 6.238075256347656, -9.164594650268555, 28.847740173339844, 27.528322219848633, 15.774322509765625, 9.505630493164062, 5.558784484863281, 35.648040771484375, 29.172805786132812, -3.531475067138672, 25.578733444213867, 2.087677001953125, 37.12923049926758, -8.977928161621094, 3.5734214782714844, 24.008087158203125, -3.2585830688476562, -4.324790954589844], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000515.npy"}
{"epoch": 0.7785336356764928, "step": 516, "batch_size": 64, "mean": 9.980987548828125, "std": 14.468896865844727, "min": -27.016780853271484, "p10": -3.370866012573242, "median": 10.79269027709961, "p90": 28.251470565795906, "max": 43.88829040527344, "pos_frac": 0.75, "sample": [43.71117401123047, 43.88829040527344, 21.249420166015625, -6.759971618652344, -3.4997940063476562, 29.056896209716797, -2.5469436645507812, 8.457756042480469, 15.323789596557617, 11.802251815795898, 4.140556335449219, 15.746309280395508, 19.77978515625, 36.021881103515625, 9.363616943359375, 1.3781814575195312, 15.243560791015625, 20.466785430908203, 26.940185546875, 17.043609619140625, -3.8819847106933594, -27.016780853271484, 4.019783020019531, 14.781166076660156, 42.32538604736328, 13.491806030273438, 17.66585922241211, 1.291412353515625, 28.81344985961914, 16.794734954833984, -20.61406707763672, -1.7465400695800781, 1.4568424224853516, 1.8659744262695312, 12.915214538574219, 23.150405883789062, 13.275909423828125, -1.328155517578125, 5.285100936889648, -0.5856132507324219, 12.317054748535156, -26.28516387939453, 12.510368347167969, 18.82512855529785, 16.76433563232422, 6.7408905029296875, 20.376327514648438, 4.862339019775391, 2.4673385620117188, 10.317535400390625, 2.496002197265625, 11.267845153808594, 3.96875, -1.3101959228515625, -11.556411743164062, 22.208816528320312, -3.0700340270996094, 13.336385726928711, -1.9527587890625, -1.7552490234375, 22.321266174316406, 3.883298873901367, -0.05797576904296875, 31.340038299560547], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000516.npy"}
{"epoch": 0.780045351473923, "step": 517, "batch_size": 64, "mean": 10.533844947814941, "std": 15.740942001342773, "min": -19.338890075683594, "p10": -7.873977661132812, "median": 7.279667854309082, "p90": 31.75793914794922, "max": 58.82414245605469, "pos_frac": 0.71875, "sample": [16.170734405517578, -3.829315185546875, -2.060443878173828, 14.415706634521484, -4.786308288574219, 10.271160125732422, 4.502044677734375, 7.216520309448242, 40.414886474609375, 20.143634796142578, 9.777809143066406, -10.276754379272461, -3.0345382690429688, 4.957561492919922, 1.52032470703125, -3.67803955078125, 1.435220718383789, 12.028312683105469, 19.129961013793945, -4.514007568359375, 18.077190399169922, -11.078659057617188, 31.361572265625, 35.110084533691406, -8.375457763671875, 5.807809829711914, 6.313896179199219, 7.342815399169922, 16.876434326171875, 28.519550323486328, 21.99695587158203, 29.491348266601562, 28.469392776489258, 5.423488616943359, -6.703857421875, -11.454660415649414, 5.750213623046875, 33.73396301269531, 1.9579906463623047, -1.8323898315429688, 14.092430114746094, 26.247095108032227, 45.72804260253906, 3.678773880004883, 10.243942260742188, 31.927810668945312, 12.328723907470703, 15.351757049560547, -9.268123626708984, 4.167335510253906, -3.837747573852539, 13.69882583618164, -0.87799072265625, 58.82414245605469, 45.67988586425781, 23.20662498474121, 4.505241394042969, -8.793691635131836, 4.875982284545898, 7.917232513427734, 18.489112854003906, -19.338890075683594, 11.566204071044922, -2.8388519287109375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000517.npy"}
{"epoch": 0.781557067271353, "step": 518, "batch_size": 64, "mean": 9.157788276672363, "std": 15.573409080505371, "min": -18.175010681152344, "p10": -8.389799308776853, "median": 8.184972763061523, "p90": 30.378410339355476, "max": 56.93975830078125, "pos_frac": 0.671875, "sample": [11.776130676269531, -0.92822265625, 2.533946990966797, 24.736351013183594, -1.5381011962890625, -2.968658447265625, 5.46466064453125, 17.230419158935547, -16.389713287353516, 8.224929809570312, 3.7996177673339844, -4.16400146484375, 7.272247314453125, 28.806625366210938, 12.26907730102539, -15.99444580078125, 19.102279663085938, 36.640655517578125, -12.222082138061523, 15.432884216308594, 31.052032470703125, -18.175010681152344, 2.0952529907226562, 5.761791229248047, 14.033348083496094, 8.159130096435547, -3.1089935302734375, 10.776779174804688, -13.958023071289062, 28.370750427246094, -1.4982032775878906, -6.314874649047852, -2.455434799194336, 19.896406173706055, 27.879844665527344, -2.589141845703125, 13.558443069458008, -9.279052734375, 14.439079284667969, 56.93975830078125, 15.8154296875, -0.6402053833007812, 5.805166244506836, 51.03419494628906, 31.585403442382812, 23.35235595703125, -3.0633773803710938, 33.48552322387695, 12.466293334960938, -12.084999084472656, -3.549327850341797, 0.2010059356689453, 8.2108154296875, 0.09000778198242188, 8.620628356933594, 17.764328002929688, 13.986270904541016, 8.511085510253906, -0.6838016510009766, -1.1300048828125, 8.847297668457031, 1.7727298736572266, 39.38307189941406, 11.65008544921875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000518.npy"}
{"epoch": 0.783068783068783, "step": 519, "batch_size": 64, "mean": 6.495992660522461, "std": 17.606674194335938, "min": -46.0882568359375, "p10": -12.260530471801758, "median": 6.042370796203613, "p90": 30.426708984375, "max": 40.878326416015625, "pos_frac": 0.65625, "sample": [-17.564085006713867, 22.56728744506836, -8.886795043945312, -28.299026489257812, -8.82058334350586, 14.196815490722656, -2.0744762420654297, 38.239990234375, 7.657135009765625, 13.109542846679688, 3.270843505859375, 7.2141876220703125, 22.099288940429688, 4.263191223144531, 9.617218017578125, 3.7788333892822266, 23.00690460205078, 1.2136955261230469, -12.587459564208984, -19.243927001953125, 24.347196578979492, 40.878326416015625, -9.383651733398438, 11.495140075683594, 20.86541748046875, 39.509742736816406, -7.520038604736328, 13.42071533203125, 18.743263244628906, 6.4005126953125, -5.917991638183594, 27.841171264648438, 4.2324371337890625, -11.497695922851562, 11.748064041137695, -11.360336303710938, 7.3106689453125, 6.543537139892578, 30.626312255859375, -7.2654571533203125, 0.9816055297851562, 9.680858612060547, 23.512969970703125, 5.684228897094727, 4.993034362792969, -46.0882568359375, 2.0197906494140625, -7.3633270263671875, 19.841407775878906, -8.85875129699707, 32.543434143066406, -11.160438537597656, -14.315021514892578, 38.257537841796875, 1.6359024047851562, 10.779869079589844, 21.709548950195312, 29.960968017578125, 8.309764862060547, -9.272472381591797, -1.9665145874023438, -0.6624317169189453, 39.26068115234375, -17.51676368713379], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000519.npy"}
{"epoch": 0.7845804988662132, "step": 520, "batch_size": 64, "mean": 9.729334831237793, "std": 12.652044296264648, "min": -14.359115600585938, "p10": -6.417240142822263, "median": 8.73406982421875, "p90": 30.642374420166018, "max": 35.52225112915039, "pos_frac": 0.78125, "sample": [10.27458381652832, -1.7116718292236328, -2.078899383544922, 7.674114227294922, 23.917911529541016, -0.0061492919921875, 15.618982315063477, -0.4309234619140625, 5.393047332763672, 33.995635986328125, 18.281837463378906, 12.007791519165039, 1.1857566833496094, 4.0217437744140625, 21.0469970703125, 7.7881622314453125, -0.9481124877929688, -7.613250732421875, 2.914104461669922, 35.52225112915039, 21.422561645507812, 10.398082733154297, 10.543241500854492, 30.825523376464844, 8.914154052734375, 10.961708068847656, 32.49613952636719, -12.147293090820312, -10.319862365722656, 26.880020141601562, 10.750411987304688, 6.991847991943359, 2.7633533477783203, 13.957565307617188, 8.654655456542969, 3.327505111694336, 32.17127990722656, 12.843650817871094, -3.6265487670898438, -13.072517395019531, -10.438655853271484, 5.0087432861328125, 4.4334869384765625, -12.560287475585938, 4.542137145996094, 5.027957916259766, 6.7713470458984375, 7.527172088623047, 9.266366958618164, 8.813484191894531, 6.713249206542969, 30.21502685546875, 33.655242919921875, 10.647428512573242, 33.47010803222656, 22.775728225708008, 22.681640625, 14.522453308105469, 0.9628334045410156, -2.8311634063720703, 9.29922103881836, 17.531978607177734, 17.41164207458496, -14.359115600585938], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000520.npy"}
{"epoch": 0.7860922146636432, "step": 521, "batch_size": 64, "mean": 11.032337188720703, "std": 13.8711519241333, "min": -24.03905487060547, "p10": -3.499405479431151, "median": 10.081132888793945, "p90": 29.869639968872075, "max": 50.664573669433594, "pos_frac": 0.78125, "sample": [15.395599365234375, 7.350830078125, 28.364593505859375, 10.0113525390625, 30.473392486572266, 7.3204345703125, 34.000144958496094, -0.42250823974609375, -0.1949462890625, 17.261093139648438, 16.768470764160156, -0.3455924987792969, 17.68377685546875, 11.685569763183594, 24.04456329345703, 17.59668731689453, 15.478059768676758, 10.15091323852539, 5.4299468994140625, 41.36705017089844, 1.2402877807617188, 25.96746063232422, 5.94134521484375, -1.7387161254882812, 0.35700035095214844, 17.967185974121094, -8.722198486328125, 17.231056213378906, -0.42958831787109375, 10.977333068847656, 50.664573669433594, 41.921966552734375, -11.483207702636719, 4.549049377441406, -0.13640594482421875, -4.0580902099609375, 2.3150253295898438, 15.124885559082031, 4.311553955078125, 28.46088409423828, 23.507614135742188, 0.9839286804199219, -11.626712799072266, 9.521492004394531, -3.982088088989258, 10.192495346069336, 14.645301818847656, 17.701095581054688, 12.187742233276367, 30.6639404296875, 8.995670318603516, 11.880538940429688, -24.03905487060547, 36.4871826171875, -11.613113403320312, 3.750886917114258, 4.941545486450195, 3.0911941528320312, 15.454147338867188, -2.3731460571289062, 8.523956298828125, 5.7818450927734375, 14.036087036132812, 17.4761962890625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000521.npy"}
{"epoch": 0.7876039304610734, "step": 522, "batch_size": 64, "mean": 10.363337516784668, "std": 18.005126953125, "min": -40.61383056640625, "p10": -12.747959899902341, "median": 9.989940643310547, "p90": 29.476400375366214, "max": 54.694095611572266, "pos_frac": 0.75, "sample": [-6.525390625, 16.87641143798828, 3.0641422271728516, 12.646427154541016, 11.182731628417969, 25.99622344970703, 10.71514892578125, 28.40795135498047, 25.561660766601562, 7.329629898071289, -17.357112884521484, -10.86410903930664, 10.174087524414062, 43.38927459716797, 5.902717590332031, 25.909156799316406, -13.937454223632812, 16.364973068237305, 18.740875244140625, 4.277425765991211, 27.018169403076172, 5.800647735595703, 6.58782958984375, -40.61383056640625, 39.9063720703125, 1.7647628784179688, 13.803882598876953, -5.8532257080078125, -10.8367919921875, 5.038576126098633, -2.8807144165039062, 39.53654479980469, 2.0132598876953125, 29.934307098388672, 24.81477928161621, 7.457481384277344, -4.851890563964844, 14.102783203125, -5.627010345458984, 6.8519439697265625, 13.303277969360352, 10.721099853515625, 5.491127014160156, 9.805793762207031, 14.484012603759766, -2.6997814178466797, 33.93817138671875, -15.089679718017578, 54.694095611572266, -1.1355361938476562, 26.888450622558594, -22.998275756835938, 14.872238159179688, 52.598876953125, 6.945228576660156, 5.6858062744140625, -23.109848022460938, 28.195693969726562, 21.089458465576172, 21.02135467529297, 2.2121829986572266, -13.55532455444336, 24.372726440429688, 23.699851989746094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000522.npy"}
{"epoch": 0.7891156462585034, "step": 523, "batch_size": 64, "mean": 12.461475372314453, "std": 15.715434074401855, "min": -29.427356719970703, "p10": -8.777102279663085, "median": 12.982563018798828, "p90": 31.3895492553711, "max": 49.37782669067383, "pos_frac": 0.796875, "sample": [-3.1180763244628906, 33.25258255004883, 8.49985122680664, 6.852897644042969, 24.771278381347656, 23.45408821105957, 4.4151153564453125, 24.90289306640625, 23.52167510986328, 6.757133483886719, 5.38176155090332, 26.711837768554688, 27.70703125, -8.493648529052734, 27.382347106933594, 20.3380126953125, 15.740264892578125, 1.565866470336914, 11.646965026855469, -15.921642303466797, 10.998771667480469, 18.63330078125, -3.6014938354492188, 7.697685241699219, -15.925018310546875, 13.208412170410156, -4.2082977294921875, -1.8945884704589844, 17.05390167236328, 29.404029846191406, 14.782533645629883, 18.838088989257812, 25.2384033203125, 15.919382095336914, 16.699684143066406, 34.11300277709961, 35.10104751586914, 12.446945190429688, 6.961997985839844, -22.20111083984375, 12.7567138671875, -8.333625793457031, 49.37782669067383, 20.697509765625, 28.314422607421875, 10.385673522949219, 3.666332244873047, 4.2803955078125, 36.59800720214844, -17.049232482910156, 36.317970275878906, 11.054996490478516, 17.43059539794922, -11.340003967285156, 32.24048614501953, 17.56175994873047, 9.17854118347168, 18.125732421875, 28.220172882080078, -29.427356719970703, -8.898582458496094, 11.61729621887207, 7.706817626953125, 22.417037963867188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000523.npy"}
{"epoch": 0.7906273620559335, "step": 524, "batch_size": 64, "mean": 8.407295227050781, "std": 13.36708927154541, "min": -31.39696502685547, "p10": -5.573083114624023, "median": 9.398118019104004, "p90": 24.135361099243166, "max": 47.590843200683594, "pos_frac": 0.71875, "sample": [17.287857055664062, 12.808006286621094, 47.590843200683594, 4.528804779052734, 4.992523193359375, -5.973247528076172, -6.4800262451171875, -3.5914306640625, 18.162490844726562, 4.266071319580078, 20.41675567626953, -4.017482757568359, -15.47529411315918, 7.773017883300781, 2.3978042602539062, 21.329177856445312, -16.379486083984375, -8.859100341796875, 5.806854248046875, 24.481491088867188, 15.3399658203125, -31.39696502685547, 17.132522583007812, 1.6732044219970703, -2.3212127685546875, 10.898792266845703, 46.90251922607422, 13.434982299804688, 15.100435256958008, -4.639366149902344, -2.1779327392578125, 16.304222106933594, -1.3840560913085938, 7.540275573730469, 18.506439208984375, 15.325069427490234, 15.919271469116211, 13.169700622558594, 9.008544921875, -6.375423431396484, 0.980010986328125, -2.43292236328125, 3.6918487548828125, 26.047622680664062, 9.787691116333008, 3.1048049926757812, 28.3963680267334, -4.437984466552734, 15.691986083984375, 9.796844482421875, 10.11932373046875, 9.847549438476562, 14.526920318603516, 26.27197265625, 10.16757583618164, 11.075727462768555, 23.32772445678711, 7.647308349609375, 0.005096435546875, -1.2334060668945312, 25.65545654296875, -4.31585693359375, 16.300018310546875, -0.9814529418945312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000524.npy"}
{"epoch": 0.7921390778533636, "step": 525, "batch_size": 64, "mean": 8.153154373168945, "std": 12.413273811340332, "min": -15.42169189453125, "p10": -7.139464569091796, "median": 6.668052673339844, "p90": 24.574665260314944, "max": 40.30039978027344, "pos_frac": 0.71875, "sample": [15.500534057617188, -15.42169189453125, -7.535194396972656, -3.406658172607422, -9.056785583496094, 10.53240966796875, 9.147842407226562, -1.7950973510742188, 23.384490966796875, -6.216094970703125, 8.17633056640625, 6.2563018798828125, 6.975076675415039, 6.555717468261719, 4.856719970703125, 23.24102783203125, 2.8389339447021484, -13.07403564453125, 17.196456909179688, 8.381561279296875, 7.417854309082031, -1.7347373962402344, 32.39678192138672, 24.093782424926758, 2.6865234375, 13.29311752319336, -0.042266845703125, 7.955127716064453, 6.032279968261719, 14.491676330566406, 1.8077621459960938, 25.18328857421875, 5.425188064575195, 15.02096939086914, 14.411197662353516, -7.824737548828125, 5.109712600708008, 4.029502868652344, 40.30039978027344, -3.23974609375, -4.499839782714844, 6.780387878417969, 14.644817352294922, -10.053581237792969, 5.420207977294922, 32.42646789550781, 0.8406639099121094, 2.0361328125, 18.109817504882812, 24.780757904052734, -1.8052215576171875, 29.997909545898438, 10.627792358398438, 1.7803878784179688, 14.610595703125, 23.037384033203125, -2.8098297119140625, 12.019012451171875, 13.054595947265625, 13.630874633789062, -2.0422744750976562, 39.20292282104492, -0.9140853881835938, -12.425537109375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000525.npy"}
{"epoch": 0.7936507936507936, "step": 526, "batch_size": 64, "mean": 8.224380493164062, "std": 13.137300491333008, "min": -25.621627807617188, "p10": -7.1057689666748045, "median": 6.945918083190918, "p90": 26.430758666992194, "max": 38.061363220214844, "pos_frac": 0.765625, "sample": [35.81877136230469, 12.903430938720703, -0.8806362152099609, 5.898447036743164, -11.246932983398438, 11.577014923095703, 6.83930778503418, 23.409759521484375, -19.799942016601562, -6.5041656494140625, 8.654571533203125, 12.10977554321289, -12.253997802734375, 4.169677734375, 12.9058837890625, -3.7519073486328125, 11.264469146728516, 38.061363220214844, 4.696067810058594, 22.22987937927246, 13.838760375976562, 17.91845703125, 6.7322845458984375, 27.00042724609375, 25.101531982421875, 6.477870941162109, 8.440177917480469, 6.642738342285156, 4.0584716796875, 9.804901123046875, 0.3796272277832031, 5.102760314941406, 23.337379455566406, -6.810258865356445, 12.940696716308594, -1.3608646392822266, 14.777076721191406, -1.0480880737304688, 17.64134979248047, -1.3545379638671875, 38.05262756347656, 13.873096466064453, 4.9924468994140625, 7.414072036743164, 12.209564208984375, -18.42645263671875, -7.232416152954102, 1.7958755493164062, 27.216033935546875, -1.51397705078125, 9.368362426757812, 0.4849128723144531, 28.907562255859375, 6.698352813720703, 31.003597259521484, 4.105796813964844, -10.193586349487305, 16.431808471679688, 8.015830993652344, 4.744483947753906, 7.052528381347656, 4.091495513916016, 17.1683349609375, -25.621627807617188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000526.npy"}
{"epoch": 0.7951625094482238, "step": 527, "batch_size": 64, "mean": 7.224104881286621, "std": 14.480178833007812, "min": -29.223194122314453, "p10": -7.0228784561157225, "median": 5.223667144775391, "p90": 28.751052856445327, "max": 50.52956008911133, "pos_frac": 0.65625, "sample": [-5.528079986572266, 7.074195861816406, 5.270549774169922, 19.870025634765625, 5.962242126464844, 3.054668426513672, 32.00074005126953, 6.195247650146484, 22.366348266601562, 25.630126953125, 33.656593322753906, -6.9503173828125, 16.09688949584961, -4.286125183105469, 13.291030883789062, -6.131988525390625, 33.235328674316406, -1.420816421508789, 5.176784515380859, -4.826486587524414, -5.6940155029296875, -12.176155090332031, -11.9825439453125, -2.4476318359375, 14.976917266845703, 3.0446624755859375, 36.402732849121094, -2.06396484375, 18.819812774658203, 0.69805908203125, 9.986221313476562, 3.271007537841797, -0.52520751953125, 12.120857238769531, 5.377311706542969, 50.52956008911133, 19.215316772460938, 5.145477294921875, -2.117708206176758, 17.047985076904297, 14.98876953125, 0.17738723754882812, 20.665924072265625, -6.190113067626953, 5.84478759765625, -0.6025390625, -7.053976058959961, 30.323429107666016, 7.50830078125, -6.237844467163086, -7.144134521484375, 4.906660079956055, 16.112518310546875, 11.369913101196289, -10.093612670898438, -2.2351913452148438, 1.9780731201171875, 15.793098449707031, 22.030181884765625, 30.088592529296875, -29.223194122314453, -21.14190673828125, 8.455509185791016, 2.6564292907714844], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000527.npy"}
{"epoch": 0.7966742252456538, "step": 528, "batch_size": 64, "mean": 7.294772148132324, "std": 14.777105331420898, "min": -27.308380126953125, "p10": -9.604355812072752, "median": 8.442480087280273, "p90": 24.858634948730472, "max": 41.602630615234375, "pos_frac": 0.734375, "sample": [32.22611999511719, -26.0211181640625, 33.149681091308594, 0.04300117492675781, 19.02654266357422, 4.835319519042969, 18.430431365966797, 1.5474395751953125, 2.0497989654541016, 10.015087127685547, -5.91105842590332, -7.293098449707031, 15.641525268554688, -24.145301818847656, -5.351877212524414, -10.181953430175781, 28.764938354492188, 4.928918838500977, 8.787145614624023, 0.8245849609375, -2.5558929443359375, 29.316268920898438, 1.5477161407470703, -27.308380126953125, 1.6782073974609375, 8.907920837402344, -4.321372985839844, -5.826103210449219, 6.34351921081543, 19.377418518066406, 18.605209350585938, -2.66644287109375, 13.477352142333984, 35.78326416015625, 6.4740142822265625, -4.36053466796875, 41.602630615234375, 21.80889129638672, 22.198692321777344, 5.819877624511719, 15.2606201171875, 17.686622619628906, -1.3522891998291016, 14.376192092895508, -15.678306579589844, 9.10357666015625, 13.906131744384766, 1.7157516479492188, 23.971885681152344, 5.104772567749023, 9.845375061035156, -21.563318252563477, -15.674507141113281, 23.709259033203125, 18.43035888671875, 8.700679779052734, 1.4672966003417969, 8.987991333007812, 25.238670349121094, -8.256628036499023, 9.225728988647461, 11.679641723632812, 15.527267456054688, 8.184280395507812], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000528.npy"}
{"epoch": 0.7981859410430839, "step": 529, "batch_size": 64, "mean": 15.683242797851562, "std": 15.594358444213867, "min": -26.398723602294922, "p10": -2.7624769210815425, "median": 16.610828399658203, "p90": 34.546878814697266, "max": 48.11885070800781, "pos_frac": 0.78125, "sample": [24.50006866455078, -2.896106719970703, 17.91998291015625, 5.7891082763671875, 25.208049774169922, 5.458290100097656, 46.4683837890625, 39.96588897705078, -2.1265907287597656, 0.4614067077636719, -2.450674057006836, -7.414682388305664, 12.99139404296875, 11.752021789550781, 12.319419860839844, 8.720146179199219, -9.516433715820312, 26.744741439819336, -1.7213020324707031, 18.40062713623047, 32.170684814453125, 4.7453765869140625, -7.7818145751953125, 16.31787872314453, 23.841941833496094, 31.016021728515625, -4.0134124755859375, -1.2943649291992188, 12.939037322998047, 6.913116455078125, 34.360252380371094, 21.509456634521484, 30.100982666015625, 42.88078308105469, 17.63764190673828, 31.19786834716797, 28.91937828063965, 43.301048278808594, -1.9930992126464844, 29.134429931640625, 17.961212158203125, 20.783845901489258, 48.11885070800781, 12.00811767578125, 7.787330627441406, 21.212188720703125, 14.793586730957031, -1.3474903106689453, 19.44208526611328, 26.850234985351562, 7.594936370849609, 25.140586853027344, 4.084016799926758, 32.04997634887695, -6.9652862548828125, 38.420448303222656, 16.903778076171875, 24.49337387084961, 11.292732238769531, 19.67047119140625, -26.398723602294922, 13.731674194335938, -1.00421142578125, 34.626861572265625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000529.npy"}
{"epoch": 0.799697656840514, "step": 530, "batch_size": 64, "mean": 9.965425491333008, "std": 12.764376640319824, "min": -16.25847816467285, "p10": -4.532322120666503, "median": 8.447702407836914, "p90": 23.862064933776857, "max": 45.783172607421875, "pos_frac": 0.75, "sample": [45.783172607421875, 15.26828384399414, 16.425888061523438, -1.0975761413574219, 11.162605285644531, -3.3774681091308594, 39.15242004394531, -2.9274978637695312, 15.93341064453125, 17.441143035888672, 8.08578872680664, 15.092582702636719, -3.0113086700439453, 6.497377395629883, 16.63897132873535, 23.7653751373291, 3.0443267822265625, 8.809616088867188, 36.993560791015625, -10.250205993652344, 0.22644424438476562, -8.411052703857422, 3.30755615234375, 13.798587799072266, 19.34510040283203, 0.7770233154296875, 23.90350341796875, 13.607864379882812, 5.505680084228516, 39.18669128417969, 9.935508728027344, 7.198394775390625, 21.94671630859375, 29.892822265625, -5.313079833984375, 1.163747787475586, 11.999935150146484, 17.266170501708984, -4.749691009521484, 5.160591125488281, 12.353302001953125, 19.459426879882812, 3.9984664916992188, 7.2996978759765625, -4.6640625, 6.018226623535156, 21.95632553100586, -3.7927932739257812, -16.25847816467285, 16.527359008789062, -4.22492790222168, -7.338600158691406, 20.11774253845215, -1.6226043701171875, 27.633743286132812, 4.702766418457031, 19.007240295410156, -2.2245330810546875, 6.133903503417969, -1.0699462890625, 23.711715698242188, 2.816974639892578, 8.92367172241211, 13.14361572265625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000530.npy"}
{"epoch": 0.8012093726379441, "step": 531, "batch_size": 64, "mean": 8.250574111938477, "std": 14.103066444396973, "min": -24.125831604003906, "p10": -5.437086486816406, "median": 4.477563858032227, "p90": 26.798099327087403, "max": 44.9510498046875, "pos_frac": 0.75, "sample": [-17.010467529296875, 24.84145736694336, 25.28662109375, 2.5541343688964844, 6.55055046081543, 2.4629669189453125, 2.568340301513672, 26.137954711914062, 6.6137237548828125, 4.442089080810547, 5.554725646972656, 3.2299633026123047, 26.715377807617188, -3.5755462646484375, 28.927642822265625, 23.77013397216797, -12.714950561523438, -9.093864440917969, 4.501777648925781, 40.20802688598633, 2.1942291259765625, 20.619949340820312, -5.028190612792969, -5.612327575683594, -3.22528076171875, 8.010683059692383, 5.4180908203125, 7.153617858886719, 13.087905883789062, 2.0567169189453125, 4.806510925292969, -3.0422286987304688, -0.10590362548828125, 0.4794464111328125, -17.964736938476562, -1.7863502502441406, -0.1837615966796875, 14.190757751464844, 13.590507507324219, -24.125831604003906, 17.955467224121094, 8.439979553222656, 27.56237030029297, 25.19525146484375, -12.867805480957031, -1.3229598999023438, 35.76103591918945, 26.83355140686035, 10.431114196777344, 3.8178558349609375, 0.1581268310546875, 4.453350067138672, 10.223892211914062, 18.95561981201172, 1.164520263671875, 1.8238868713378906, 23.138824462890625, -0.32645416259765625, 21.778593063354492, 44.9510498046875, 3.2150344848632812, 2.232175827026367, 30.363494873046875, 1.5943222045898438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000531.npy"}
{"epoch": 0.8027210884353742, "step": 532, "batch_size": 64, "mean": 10.983474731445312, "std": 15.387964248657227, "min": -17.965476989746094, "p10": -7.360879516601562, "median": 9.88449478149414, "p90": 34.86874809265138, "max": 51.277427673339844, "pos_frac": 0.765625, "sample": [17.642959594726562, 16.61419105529785, 27.49097442626953, 37.93891143798828, -8.851249694824219, -0.3787498474121094, 11.413505554199219, 46.47071838378906, 3.2122764587402344, 15.38031005859375, -7.3975830078125, 15.832233428955078, 31.3076171875, 3.3554916381835938, 24.683929443359375, 7.135711669921875, 9.602458953857422, 8.95864486694336, 3.333486557006836, -14.853302001953125, -1.6380252838134766, 4.270313262939453, 23.279481887817383, 5.37690544128418, 12.25421142578125, 14.588577270507812, 12.213165283203125, 29.889511108398438, 19.830860137939453, 22.379981994628906, 18.632949829101562, 2.848865509033203, 10.853912353515625, 51.277427673339844, 6.892181396484375, 17.129379272460938, 13.471275329589844, -10.742130279541016, 0.741546630859375, 2.31536865234375, -5.081274032592773, -7.275238037109375, 2.328216552734375, -5.365453720092773, 39.80651092529297, 5.257331848144531, 36.49053955078125, -3.5741233825683594, 13.646560668945312, 8.807998657226562, 10.16653060913086, 13.17584228515625, -1.6215438842773438, 36.39494705200195, -10.67559814453125, 24.184513092041016, -17.965476989746094, 4.4551544189453125, 37.55479049682617, -16.246505737304688, 0.7860622406005859, -6.107580184936523, 23.095352172851562, 15.94650650024414], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000532.npy"}
{"epoch": 0.8042328042328042, "step": 533, "batch_size": 64, "mean": 10.23153305053711, "std": 14.718372344970703, "min": -33.58033752441406, "p10": -4.841221618652343, "median": 9.28034496307373, "p90": 30.253814697265625, "max": 46.36780548095703, "pos_frac": 0.75, "sample": [10.7421875, 23.2534236907959, -1.4375228881835938, 16.436416625976562, 13.058231353759766, 21.42039680480957, 3.2412261962890625, 0.7000808715820312, 24.268062591552734, 18.67853546142578, 4.10394287109375, 3.907501220703125, 8.710908889770508, -3.9290084838867188, 20.770429611206055, 25.88494873046875, 5.103250503540039, 24.75505828857422, -0.3772735595703125, -3.2572708129882812, 2.52777099609375, 7.603120803833008, 9.849781036376953, 5.949180603027344, -10.057136535644531, 12.620534896850586, 24.818763732910156, -1.8823089599609375, 6.388786315917969, -19.64793586730957, 0.8696136474609375, 0.004238128662109375, 29.630468368530273, 27.410789489746094, -7.664470672607422, 30.346694946289062, -5.786712646484375, 35.979705810546875, 4.589242935180664, -3.1098670959472656, 10.3875732421875, 11.509040832519531, 10.375484466552734, 15.291580200195312, 31.2034912109375, 46.36780548095703, 6.8119354248046875, -8.931941986083984, 5.777587890625, -3.3187522888183594, -3.7072505950927734, 30.037094116210938, 12.522968292236328, -0.496337890625, 2.044046401977539, -5.232170104980469, 10.504791259765625, 16.359542846679688, 11.381088256835938, 41.03427505493164, 31.875885009765625, 33.40663146972656, -33.58033752441406, 16.720306396484375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000533.npy"}
{"epoch": 0.8057445200302343, "step": 534, "batch_size": 64, "mean": 9.913549423217773, "std": 14.617988586425781, "min": -27.5484619140625, "p10": -6.922441482543945, "median": 10.441549301147461, "p90": 27.777883911132815, "max": 47.9652099609375, "pos_frac": 0.75, "sample": [3.8057785034179688, -8.116281509399414, 6.45904541015625, 11.142078399658203, 0.9227199554443359, 14.557151794433594, 4.3219451904296875, -27.5484619140625, 10.524036407470703, -12.060775756835938, -0.7324924468994141, 1.899749755859375, 23.850032806396484, 13.764289855957031, -1.7930831909179688, 47.9652099609375, 16.257347106933594, 14.312179565429688, 19.328140258789062, -14.080436706542969, 26.76831817626953, 28.564254760742188, 9.795085906982422, 12.161865234375, -5.196319580078125, 5.85797119140625, -24.66510009765625, 13.319887161254883, 29.14765167236328, 15.920631408691406, 10.359062194824219, -3.5463905334472656, 27.999588012695312, 32.53855895996094, 32.616153717041016, 23.498138427734375, -1.1817550659179688, 6.639701843261719, 38.01625061035156, 22.68618392944336, 22.91492462158203, 27.260574340820312, -6.1856689453125, 6.1721649169921875, 15.272363662719727, 13.930133819580078, -6.79962158203125, 16.02336883544922, 2.0199928283691406, -1.7914505004882812, 7.814487457275391, 26.78982925415039, 14.785369873046875, -6.975078582763672, 3.6310348510742188, 4.138336181640625, 19.701480865478516, -11.004829406738281, 10.73379135131836, 1.9178085327148438, 23.816253662109375, 5.0844879150390625, 25.259113311767578, -6.119632720947266], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000534.npy"}
{"epoch": 0.8072562358276644, "step": 535, "batch_size": 64, "mean": 9.840182304382324, "std": 16.139507293701172, "min": -34.74873352050781, "p10": -8.813952255249022, "median": 9.514720916748047, "p90": 29.568459320068364, "max": 53.20384979248047, "pos_frac": 0.75, "sample": [4.233940124511719, -8.949359893798828, 30.812355041503906, 18.69693374633789, 4.448997497558594, 16.89767074584961, -17.6727294921875, -3.9869956970214844, -5.803852081298828, 17.002304077148438, 23.0396728515625, -6.707937240600586, 10.790529251098633, 18.46503448486328, -5.15507698059082, 4.346649169921875, -8.498001098632812, 4.6790618896484375, 0.15302467346191406, -1.9815673828125, 13.876480102539062, 13.947799682617188, 16.052223205566406, 3.7918701171875, 15.380672454833984, -1.977752685546875, 38.76983642578125, -12.192245483398438, -12.21990966796875, 6.646841049194336, 24.544570922851562, 43.50885009765625, 17.007152557373047, -3.3164443969726562, 9.25568962097168, 11.741037368774414, 1.432647705078125, 51.97856140136719, -15.775909423828125, 13.966407775878906, 9.658973693847656, 3.1330795288085938, 53.20384979248047, 10.036975860595703, 2.0382614135742188, -34.74873352050781, -2.860137939453125, 19.44609832763672, 1.7718124389648438, 19.364425659179688, 23.95514678955078, -9.853351593017578, 5.658363342285156, 33.517845153808594, 19.906383514404297, 12.392879486083984, 11.51300048828125, 23.555633544921875, 28.428016662597656, 9.370468139648438, 1.3643035888671875, 4.628013610839844, 30.057220458984375, 23.004119873046875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000535.npy"}
{"epoch": 0.8087679516250945, "step": 536, "batch_size": 64, "mean": 11.637962341308594, "std": 12.550446510314941, "min": -7.521648406982422, "p10": -3.155397415161133, "median": 8.56766128540039, "p90": 28.775200653076173, "max": 45.70884704589844, "pos_frac": 0.8125, "sample": [11.105945587158203, 3.5879974365234375, -6.319725036621094, 42.63645935058594, 30.544677734375, 10.097419738769531, 1.0561599731445312, -4.766944885253906, 8.441818237304688, 5.463289260864258, -6.297172546386719, 4.283077239990234, 28.95948028564453, 19.15381622314453, 14.193492889404297, 15.651535034179688, 15.441680908203125, 30.191133499145508, 45.70884704589844, 11.269327163696289, 13.11109733581543, -0.2790813446044922, 23.017087936401367, 13.38397216796875, -3.226287841796875, 37.99787139892578, 8.693504333496094, 1.1620502471923828, 15.873298645019531, 6.581966400146484, 28.34521484375, -7.521648406982422, 3.470935821533203, 12.688358306884766, -1.1238861083984375, 5.687492370605469, 19.15411949157715, 1.3339157104492188, 5.1120758056640625, 1.8941020965576172, 20.45477294921875, 24.79890251159668, 20.508201599121094, -5.08953857421875, 2.4080238342285156, 9.820915222167969, 7.61712646484375, 18.347360610961914, 8.254703521728516, 25.136016845703125, -2.6316699981689453, 28.036651611328125, 6.959716796875, 27.316993713378906, 22.235702514648438, 8.39697265625, 5.923076629638672, 35.001312255859375, -2.9899864196777344, -4.4017333984375, 3.371042251586914, 12.826520919799805, -0.49272918701171875, 7.2627105712890625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000536.npy"}
{"epoch": 0.8102796674225246, "step": 537, "batch_size": 64, "mean": 10.277729034423828, "std": 15.937142372131348, "min": -29.935935974121094, "p10": -9.040626716613769, "median": 9.35964298248291, "p90": 32.15217819213867, "max": 51.0526123046875, "pos_frac": 0.71875, "sample": [8.022319793701172, 17.569190979003906, 8.808914184570312, 23.075347900390625, 32.88739013671875, 32.24863815307617, 20.72573471069336, -0.45789146423339844, 18.39398193359375, 11.78299331665039, 5.348535537719727, 13.185945510864258, 30.000694274902344, 22.238616943359375, 31.927104949951172, 21.70037841796875, 23.7775936126709, 9.184480667114258, -1.8381710052490234, -9.393878936767578, -3.6869544982910156, 5.944427490234375, 11.949188232421875, 2.339853286743164, -4.419153213500977, 28.628936767578125, 8.9930419921875, 2.2437667846679688, -29.935935974121094, -2.540342330932617, 16.76483154296875, 4.913532257080078, 6.77137565612793, -5.624298095703125, -8.731216430664062, -1.8943443298339844, 33.39886474609375, 17.353294372558594, 16.8074951171875, 0.6567726135253906, 15.363090515136719, 29.81714630126953, 1.4025726318359375, 6.7664031982421875, -3.0551280975341797, -3.670248031616211, 15.6815185546875, -12.393096923828125, 6.111915588378906, 23.290359497070312, 9.534805297851562, 18.002866744995117, -19.866121292114258, 35.52995300292969, 36.77153015136719, 51.0526123046875, -7.553333282470703, 16.517593383789062, -9.17323112487793, 13.195960998535156, 33.6981201171875, 17.803627014160156, -24.729557037353516, -11.445831298828125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000537.npy"}
{"epoch": 0.8117913832199547, "step": 538, "batch_size": 64, "mean": 12.854602813720703, "std": 14.36524772644043, "min": -16.633529663085938, "p10": -4.828701019287109, "median": 13.236699104309082, "p90": 29.189149475097658, "max": 54.44065856933594, "pos_frac": 0.8125, "sample": [-1.9713897705078125, 26.519929885864258, 6.47198486328125, 8.032360076904297, 16.640926361083984, -16.633529663085938, 15.217884063720703, -0.9220962524414062, 15.056324005126953, 15.958703994750977, 8.93069076538086, 27.35143280029297, 1.3829345703125, 22.1368408203125, 19.184619903564453, 16.6390380859375, 27.723730087280273, 35.98234558105469, 13.856834411621094, 20.70072364807129, 7.009910583496094, -1.7886486053466797, 35.033180236816406, 16.068893432617188, 7.453893661499023, -6.407497406005859, 9.978103637695312, 10.219734191894531, 44.49580383300781, 24.772476196289062, -4.8287353515625, 13.076499938964844, -9.381843566894531, 4.3629150390625, -11.888259887695312, 8.341514587402344, 10.924331665039062, 13.39689826965332, 2.7019882202148438, 9.712448120117188, 30.815162658691406, -16.482177734375, 29.497467041015625, 20.8409423828125, -4.161001205444336, 15.599777221679688, -4.828620910644531, 27.282663345336914, 12.942558288574219, 54.44065856933594, 16.458553314208984, 19.055709838867188, 28.469741821289062, -6.107513427734375, 13.539718627929688, 4.70147705078125, 45.757442474365234, 3.4816761016845703, 18.368175506591797, 3.2095489501953125, 3.6316757202148438, 17.658681869506836, 23.308616638183594, 3.699787139892578], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000538.npy"}
{"epoch": 0.8133030990173847, "step": 539, "batch_size": 64, "mean": 10.651832580566406, "std": 13.370071411132812, "min": -12.743667602539062, "p10": -4.950665283203123, "median": 7.944343566894531, "p90": 28.72385330200196, "max": 43.55482482910156, "pos_frac": 0.796875, "sample": [8.970632553100586, 22.84127426147461, 1.4832286834716797, 42.206520080566406, 20.482290267944336, 3.9806442260742188, 24.70825958251953, 17.2772216796875, 23.715240478515625, 4.870765686035156, 29.287628173828125, 6.528778076171875, 3.103240966796875, 8.397613525390625, -2.631011962890625, -2.1547279357910156, 18.852279663085938, -3.1505813598632812, 43.55482482910156, 9.203445434570312, 12.139978408813477, -7.348003387451172, 11.42950439453125, -12.743667602539062, 3.2104454040527344, -12.454849243164062, 0.17849159240722656, 9.5498046875, 36.66502380371094, 9.614730834960938, 6.276466369628906, 4.635459899902344, 6.068840026855469, 4.274803161621094, 19.73260498046875, 13.82581901550293, 2.6284408569335938, -0.1862640380859375, 27.40837860107422, 3.732851028442383, -5.722129821777344, 8.400871276855469, 7.180507659912109, -0.6629600524902344, 23.645118713378906, 7.4910736083984375, -7.920509338378906, 34.764892578125, 10.218154907226562, -7.79876708984375, 19.158985137939453, 24.035438537597656, 3.164081573486328, 4.96759033203125, 11.008171081542969, 32.424896240234375, -2.7086639404296875, 6.129377365112305, -8.818294525146484, 24.81597137451172, 9.993532180786133, 27.265655517578125, 37.9854736328125, 2.5323562622070312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000539.npy"}
{"epoch": 0.8148148148148148, "step": 540, "batch_size": 64, "mean": 13.654460906982422, "std": 14.6085844039917, "min": -16.94681167602539, "p10": -1.9901550292968746, "median": 11.644986152648926, "p90": 34.59872512817383, "max": 47.935028076171875, "pos_frac": 0.828125, "sample": [-9.055160522460938, 4.2952880859375, 47.935028076171875, 22.531688690185547, 5.709905624389648, -16.94681167602539, 18.661239624023438, 3.935272216796875, 6.043983459472656, 23.572002410888672, 3.170269012451172, 28.19549560546875, 26.978090286254883, 18.306907653808594, -6.9308013916015625, 10.854616165161133, 35.70793151855469, 10.818611145019531, -0.5703544616699219, 15.303436279296875, 20.968551635742188, -8.827072143554688, 21.606109619140625, 6.320657730102539, 36.88641357421875, 34.99737548828125, 3.2196826934814453, 9.752891540527344, -2.1331787109375, 16.310279846191406, -1.65643310546875, 31.616653442382812, 25.278182983398438, 12.613838195800781, 29.619979858398438, 1.9630279541015625, -0.29103851318359375, 17.85312271118164, -6.409942626953125, 1.8913955688476562, 0.9249763488769531, 24.584548950195312, 18.180435180664062, 41.542755126953125, 42.279563903808594, 10.531852722167969, 31.58802032470703, 12.721061706542969, 2.3672332763671875, -6.161041259765625, 12.435356140136719, 26.859176635742188, 2.2267074584960938, 1.1572113037109375, 3.863800048828125, 28.391433715820312, -1.1545333862304688, 37.64381408691406, 7.241180419921875, 19.11475372314453, 15.766433715820312, 3.4615097045898438, 4.55357551574707, 33.668540954589844], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000540.npy"}
{"epoch": 0.8163265306122449, "step": 541, "batch_size": 64, "mean": 12.510297775268555, "std": 14.964747428894043, "min": -16.7384033203125, "p10": -6.306347656249999, "median": 11.277594566345215, "p90": 33.85687408447266, "max": 41.084774017333984, "pos_frac": 0.765625, "sample": [37.98306655883789, 11.048004150390625, 23.174755096435547, 35.31838607788086, -0.8156280517578125, 32.923194885253906, 6.481620788574219, 24.146930694580078, 31.40348243713379, -4.3059234619140625, -6.92828369140625, 16.568511962890625, 11.759170532226562, 10.734792709350586, 3.4953842163085938, -9.558135986328125, 12.597846984863281, 35.61238098144531, 6.3848724365234375, 29.282520294189453, -4.85516357421875, 19.866870880126953, 41.084774017333984, 24.146202087402344, 33.39611053466797, 16.568424224853516, 22.229270935058594, -1.72705078125, -16.06394386291504, -3.220317840576172, 11.263381958007812, 11.291807174682617, 37.97941589355469, 23.717132568359375, 5.301963806152344, 23.77149772644043, 12.767820358276367, -14.892143249511719, 18.384296417236328, 1.903249740600586, 3.5999908447265625, -3.1870765686035156, 34.054344177246094, 25.807178497314453, 3.8973827362060547, 10.260581970214844, 19.22305679321289, 11.115402221679688, -16.7384033203125, -0.34772491455078125, 1.0926952362060547, 21.688793182373047, 14.935371398925781, 10.888969421386719, 0.80859375, -11.498077392578125, 7.908733367919922, 6.600177764892578, -0.3354206085205078, 15.15255355834961, 12.061349868774414, -8.43271255493164, 39.9539794921875, 31.928749084472656], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000541.npy"}
{"epoch": 0.817838246409675, "step": 542, "batch_size": 64, "mean": 12.850105285644531, "std": 15.975829124450684, "min": -20.314178466796875, "p10": -5.26184959411621, "median": 9.485832214355469, "p90": 33.960994720458984, "max": 54.583003997802734, "pos_frac": 0.828125, "sample": [24.554595947265625, 21.368488311767578, 7.039131164550781, 30.460479736328125, 7.963348388671875, 7.2064208984375, 9.802734375, 16.45685577392578, 6.2034912109375, 7.7445068359375, 12.784366607666016, 2.226642608642578, -1.5454483032226562, 16.686256408691406, 7.748289108276367, 14.260030746459961, 4.360359191894531, 32.78498840332031, -6.584632873535156, 31.37310218811035, 34.723358154296875, 34.26207733154297, 5.504951477050781, 0.5213432312011719, -20.292613983154297, -20.314178466796875, 13.966094970703125, 4.342338562011719, 7.738531112670898, -11.622383117675781, 42.80195617675781, 30.39986801147461, 1.9886589050292969, 20.51679229736328, -4.319694519042969, -5.630084991455078, -13.33135986328125, 9.639541625976562, -4.4026336669921875, 23.572784423828125, 12.424667358398438, 1.54327392578125, 37.3548469543457, -4.238788604736328, 18.806503295898438, 6.095130920410156, 23.563243865966797, 9.332122802734375, 2.1937103271484375, 38.602752685546875, 33.25846862792969, 44.41630935668945, -12.949455261230469, 12.863540649414062, 8.056591033935547, 16.885128021240234, 0.8663330078125, 54.583003997802734, 23.47303009033203, 13.728729248046875, 8.256317138671875, 32.8812255859375, 6.484397888183594, 30.96627426147461], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000542.npy"}
{"epoch": 0.8193499622071051, "step": 543, "batch_size": 64, "mean": 11.006534576416016, "std": 14.532954216003418, "min": -36.75696563720703, "p10": -3.130462074279784, "median": 10.133051872253418, "p90": 29.774551391601573, "max": 54.79234313964844, "pos_frac": 0.828125, "sample": [7.2324981689453125, 2.929964065551758, 4.941799163818359, -0.5143871307373047, 10.156463623046875, 10.699630737304688, 17.91754913330078, 31.621620178222656, -19.10927963256836, 24.146759033203125, 0.2485198974609375, 20.301076889038086, 13.461772918701172, 25.62405776977539, 36.80108642578125, 14.717437744140625, 54.79234313964844, -10.518146514892578, 6.754945755004883, 22.115699768066406, 5.559980392456055, 7.022022247314453, 1.0324783325195312, 5.00457763671875, 2.799457550048828, 24.081684112548828, -36.75696563720703, 0.5923423767089844, 8.221435546875, -3.7505645751953125, 12.682510375976562, 25.327308654785156, 37.66026306152344, 14.428115844726562, 20.479530334472656, 10.352767944335938, 41.17352294921875, -1.5835247039794922, -3.4999923706054688, -2.2682247161865234, 14.878734588623047, 10.109640121459961, 14.096588134765625, 19.1866455078125, 22.25640869140625, 21.241737365722656, 27.276390075683594, 30.845191955566406, -0.5535469055175781, 4.75689697265625, 0.6472511291503906, 5.820005416870117, 0.089508056640625, 6.3510284423828125, 11.800460815429688, 0.5642471313476562, 12.88519287109375, 4.996007919311523, 6.407096862792969, -4.364145278930664, 12.81549072265625, 31.960220336914062, -6.203727722167969, 13.67471694946289], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000543.npy"}
{"epoch": 0.8208616780045351, "step": 544, "batch_size": 64, "mean": 14.96635627746582, "std": 17.58794403076172, "min": -17.778470993041992, "p10": -7.375933837890623, "median": 10.478303909301758, "p90": 41.65829200744629, "max": 59.83338928222656, "pos_frac": 0.796875, "sample": [28.668052673339844, -10.602104187011719, 11.758148193359375, 6.960264205932617, 21.520538330078125, 20.282798767089844, -15.46706771850586, 8.570550918579102, 26.628433227539062, 8.89404296875, 7.339263916015625, 9.19845962524414, 6.109161376953125, 30.948829650878906, 14.245491027832031, 2.205953598022461, -17.778470993041992, 8.4993896484375, 27.05072784423828, 16.657989501953125, 8.605567932128906, 6.394401550292969, 17.223129272460938, 28.904598236083984, 30.552719116210938, 8.795206069946289, 6.831512451171875, -3.9462432861328125, -1.1615753173828125, -6.117950439453125, 51.81007385253906, -9.753837585449219, 18.55439567565918, -1.305459976196289, 41.20650863647461, 15.438201904296875, 50.272308349609375, -15.668067932128906, 7.021015167236328, 4.8251190185546875, 18.36285400390625, 12.045578002929688, 42.02732849121094, 3.0295963287353516, 4.494285583496094, 8.815475463867188, 30.13983154296875, 8.908906936645508, 41.85191345214844, -8.898223876953125, -0.8308010101318359, 39.39031982421875, 27.952285766601562, 59.83338928222656, -7.915069580078125, 19.39038848876953, 44.23103332519531, 7.3365020751953125, 42.381290435791016, 41.08758544921875, -0.7252902984619141, 16.500946044921875, 20.81957244873047, 17.445037841796875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000544.npy"}
{"epoch": 0.8223733938019653, "step": 545, "batch_size": 64, "mean": 9.980533599853516, "std": 15.21135425567627, "min": -25.538755416870117, "p10": -5.344876670837402, "median": 9.018893241882324, "p90": 31.07083015441895, "max": 51.66947937011719, "pos_frac": 0.734375, "sample": [12.789434432983398, 14.635398864746094, -5.6836395263671875, 31.636844635009766, 4.674205780029297, 5.358039855957031, -4.4155426025390625, -2.421628952026367, -1.0622844696044922, 12.366641998291016, 10.088592529296875, -8.532501220703125, 37.04058837890625, 3.9715709686279297, 3.557485580444336, 27.652328491210938, 6.2273406982421875, 12.676254272460938, -3.591348648071289, 12.533195495605469, 10.435997009277344, 27.482940673828125, -0.7039108276367188, 13.48974609375, 35.43609619140625, 17.18634796142578, 4.251884460449219, 2.614755630493164, 18.35041046142578, 8.408727645874023, 0.6820259094238281, -10.171504974365234, -16.3726806640625, 8.929378509521484, -1.53125, -4.55443000793457, 38.89939880371094, 43.371917724609375, 29.75012969970703, -0.2708740234375, -1.4109687805175781, 20.352996826171875, 16.95412826538086, 13.044681549072266, 3.23455810546875, 19.734764099121094, -2.5703811645507812, 9.204414367675781, 12.999557495117188, -25.538755416870117, 3.457305908203125, 16.192703247070312, 18.362258911132812, 51.66947937011719, 24.060806274414062, -14.098731994628906, 0.7868003845214844, 9.108407974243164, -17.725425720214844, 1.0668201446533203, 18.844871520996094, 39.130157470703125, 22.365097045898438, 4.342548370361328], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000545.npy"}
{"epoch": 0.8238851095993953, "step": 546, "batch_size": 64, "mean": 10.140081405639648, "std": 14.117212295532227, "min": -23.953277587890625, "p10": -8.109901428222654, "median": 9.422725677490234, "p90": 26.596793174743652, "max": 43.54631805419922, "pos_frac": 0.78125, "sample": [31.788787841796875, -6.4294891357421875, 27.569534301757812, 12.78851318359375, 14.010635375976562, 24.195215225219727, 4.1103515625, -6.2286376953125, -5.703025817871094, -2.484630584716797, -18.559295654296875, 11.154792785644531, 6.867973327636719, 7.655841827392578, -23.953277587890625, 16.280929565429688, 6.530622482299805, 7.620874404907227, -9.58694839477539, 6.359714508056641, -5.87774658203125, 18.30152702331543, 9.01654052734375, 15.4951171875, 17.222450256347656, 9.134941101074219, 21.78205680847168, 7.912384033203125, 37.76882553100586, 43.54631805419922, 8.108715057373047, 6.8138885498046875, 26.63605499267578, 9.71051025390625, 19.523632049560547, -10.344161987304688, 6.5563507080078125, 20.155303955078125, 7.952177047729492, 10.716766357421875, 12.221565246582031, 5.710420608520508, -15.424314498901367, 3.9532699584960938, 16.42497444152832, 20.579002380371094, -4.329734802246094, 10.424930572509766, 17.895536422729492, 20.716808319091797, 16.184188842773438, 5.467609405517578, 36.039344787597656, -8.830078125, -1.6545753479003906, 8.38599967956543, 18.190990447998047, 18.432117462158203, 41.9213981628418, 12.503591537475586, 21.73925018310547, -19.166248321533203, 26.50518226623535, 0.9538822174072266], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000546.npy"}
{"epoch": 0.8253968253968254, "step": 547, "batch_size": 64, "mean": 11.940618515014648, "std": 14.476734161376953, "min": -25.254005432128906, "p10": -4.308485984802245, "median": 12.779991149902344, "p90": 30.111683654785157, "max": 58.09821319580078, "pos_frac": 0.8125, "sample": [3.5578441619873047, 0.4093780517578125, 8.880277633666992, 7.891197204589844, 30.34765625, 3.619922637939453, -12.266263961791992, 58.09821319580078, 11.544475555419922, 16.0234375, 13.220489501953125, 15.663200378417969, 9.228933334350586, 11.498374938964844, 4.971796035766602, -0.10089111328125, 23.727432250976562, 42.96376037597656, 2.4896278381347656, 14.515220642089844, 18.057159423828125, 20.154769897460938, 36.86225891113281, 2.176116943359375, 29.561080932617188, 12.612174987792969, -1.0003204345703125, 13.987350463867188, 24.97027587890625, 7.1444854736328125, 13.686004638671875, -4.580423355102539, -22.90869903564453, 18.94439697265625, -8.602119445800781, 14.993003845214844, 20.562881469726562, 2.6919727325439453, 19.51800537109375, 18.700517654418945, 14.045799255371094, 38.827484130859375, 10.979911804199219, 18.37701416015625, -9.469459533691406, 20.335721969604492, -0.21382522583007812, -25.254005432128906, -3.6739654541015625, 14.419395446777344, 33.39918518066406, 12.332391738891602, 30.754226684570312, -6.3582305908203125, 12.947807312011719, 2.7691192626953125, 14.046802520751953, 23.248294830322266, 1.0572528839111328, 12.272529602050781, 19.79482078552246, 5.2680511474609375, 20.749610900878906, -0.2713165283203125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000547.npy"}
{"epoch": 0.8269085411942555, "step": 548, "batch_size": 64, "mean": 9.051241874694824, "std": 14.732340812683105, "min": -16.746517181396484, "p10": -7.061670875549315, "median": 7.749205589294434, "p90": 30.882634353637705, "max": 58.66621017456055, "pos_frac": 0.734375, "sample": [6.321502685546875, 15.200904846191406, 9.353591918945312, -8.426162719726562, -7.593877792358398, 18.829391479492188, -9.284019470214844, 48.07585525512695, 25.801069259643555, 10.475517272949219, -5.342658996582031, 34.38426971435547, 0.5885105133056641, 1.1814613342285156, 3.7869415283203125, 4.065816879272461, -11.111724853515625, 4.6629180908203125, 4.901836395263672, 10.965744018554688, 28.85033416748047, 11.339059829711914, -12.344474792480469, 5.661834716796875, 10.536972045898438, 3.9487037658691406, 42.213592529296875, 17.049156188964844, 13.032150268554688, 31.753620147705078, -0.8636283874511719, 11.124687194824219, -3.610870361328125, 10.259147644042969, -4.214225769042969, 2.9603729248046875, 35.208984375, 9.775646209716797, -5.819854736328125, 11.398902893066406, 12.188301086425781, 7.836790084838867, 15.797065734863281, 0.380706787109375, 39.23846435546875, 0.9259757995605469, -3.6492080688476562, 14.927993774414062, -7.650392532348633, 8.037345886230469, 9.954864501953125, -16.746517181396484, 9.949382781982422, 16.557209014892578, 7.66162109375, -3.3590469360351562, 19.556791305541992, -2.8090591430664062, 6.440895080566406, -1.766845703125, 58.66621017456055, 14.30392837524414, -4.951745986938477, 2.6917762756347656], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000548.npy"}
{"epoch": 0.8284202569916855, "step": 549, "batch_size": 64, "mean": 9.182695388793945, "std": 14.257761001586914, "min": -21.681961059570312, "p10": -10.030653762817384, "median": 10.658872604370117, "p90": 30.25193500518799, "max": 33.880313873291016, "pos_frac": 0.6875, "sample": [3.60595703125, 22.390235900878906, -7.14654541015625, 12.816574096679688, -3.4435653686523438, -10.100051879882812, 19.380210876464844, 5.919315338134766, 9.209457397460938, 30.845684051513672, 30.404190063476562, 7.255466461181641, 10.60432243347168, 32.45552062988281, -4.8529510498046875, 2.0696182250976562, -15.307796478271484, 16.255367279052734, 16.76783561706543, 14.761112213134766, 26.248886108398438, 21.613964080810547, 19.214569091796875, -9.586750030517578, 17.137435913085938, 0.2177581787109375, 4.378440856933594, 25.21849822998047, 9.59381103515625, -14.550407409667969, 21.559349060058594, 31.06348419189453, 17.420166015625, -2.3969993591308594, 10.713422775268555, 32.97040557861328, -0.3638801574707031, -12.336471557617188, -5.577545166015625, 15.43194580078125, -16.142852783203125, 11.054718017578125, 2.059713363647461, 2.3151016235351562, 26.955745697021484, 18.261157989501953, -11.11666488647461, 15.026603698730469, 14.104507446289062, -1.7452621459960938, -2.2647533416748047, 29.89667320251465, 17.640106201171875, 15.206695556640625, 11.19366455078125, -0.7898483276367188, 13.775030136108398, 8.774272918701172, -0.06526947021484375, 33.80493927001953, 33.880313873291016, -21.681961059570312, -9.868724822998047, -4.4414520263671875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000549.npy"}
{"epoch": 0.8299319727891157, "step": 550, "batch_size": 64, "mean": 8.02153205871582, "std": 13.532042503356934, "min": -21.455230712890625, "p10": -7.518287658691406, "median": 6.08011531829834, "p90": 25.485245513916027, "max": 39.77197265625, "pos_frac": 0.765625, "sample": [5.918769836425781, 37.415157318115234, 38.11333084106445, 9.489181518554688, -17.109891891479492, 0.7140655517578125, 1.5393543243408203, -1.1438140869140625, 0.14688873291015625, 7.886096954345703, 18.32594108581543, 9.855461120605469, 16.597898483276367, 7.842861175537109, 3.1279125213623047, 26.830894470214844, 39.77197265625, 18.444046020507812, 12.384468078613281, 8.005226135253906, 18.75336456298828, -2.3384742736816406, 17.533946990966797, 20.67369842529297, 19.490737915039062, -9.907341003417969, 22.34539794921875, 0.7896595001220703, 4.678905487060547, -8.819908142089844, -14.8919677734375, 1.3302230834960938, 1.2796478271484375, 2.7664566040039062, 1.784210205078125, 5.210765838623047, 5.603675842285156, -7.361114501953125, -2.7501068115234375, 1.4653091430664062, -4.0495452880859375, 12.04583740234375, 8.01681137084961, 2.8597679138183594, 8.244354248046875, 10.352283477783203, 17.901458740234375, 21.699983596801758, 6.8952789306640625, 22.070568084716797, -7.5856475830078125, 14.78799057006836, 28.789581298828125, 6.241460800170898, 33.1008186340332, 6.719459533691406, -21.455230712890625, -1.1305274963378906, 38.62046813964844, -17.2874813079834, 5.463966369628906, 4.100547790527344, -3.698719024658203, -1.1183662414550781], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000550.npy"}
{"epoch": 0.8314436885865457, "step": 551, "batch_size": 64, "mean": 9.569366455078125, "std": 13.966843605041504, "min": -16.5213623046875, "p10": -4.636013603210449, "median": 7.473529815673828, "p90": 26.607831573486333, "max": 62.18377685546875, "pos_frac": 0.71875, "sample": [11.921356201171875, 13.047103881835938, 17.100337982177734, 5.658538818359375, 13.952306747436523, 17.723432540893555, 62.18377685546875, -0.5094528198242188, 1.6226367950439453, 27.194847106933594, -4.081684112548828, 7.347377777099609, 36.501583099365234, 1.2211246490478516, 20.611984252929688, 7.599681854248047, 12.252716064453125, 10.231025695800781, -16.5213623046875, -2.4722824096679688, 7.619016647338867, 8.776771545410156, 13.873847961425781, -7.02398681640625, 4.671028137207031, 40.67774963378906, 3.1580963134765625, 6.9288177490234375, 16.96490478515625, 19.303497314453125, 9.357439041137695, -0.7715911865234375, 2.4618473052978516, 13.3951416015625, 1.5673675537109375, 31.192626953125, -3.3045501708984375, 33.791412353515625, 1.3744239807128906, 15.423049926757812, -4.3632049560546875, 8.022758483886719, -1.5136260986328125, 1.8773994445800781, 24.43914794921875, 21.3419189453125, 25.238128662109375, 13.376873016357422, -5.077667236328125, -4.752931594848633, 5.1568756103515625, 13.833839416503906, -1.29974365234375, -8.198871612548828, -7.7952880859375, -0.1054534912109375, -4.347240447998047, 24.442123413085938, -3.2255935668945312, 3.7433128356933594, -9.045280456542969, 1.5490951538085938, 23.775253295898438, 33.34564208984375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000551.npy"}
{"epoch": 0.8329554043839759, "step": 552, "batch_size": 64, "mean": 11.383108139038086, "std": 13.411713600158691, "min": -24.770503997802734, "p10": -4.980535316467285, "median": 11.560050964355469, "p90": 27.042625427246097, "max": 42.87552261352539, "pos_frac": 0.859375, "sample": [-4.586217880249023, 6.0213623046875, 12.122596740722656, 3.5113449096679688, 24.52250862121582, 15.000484466552734, 1.838796615600586, 3.031463623046875, 2.7266311645507812, 26.558822631835938, 23.512481689453125, 39.82643127441406, 12.64931869506836, 14.062713623046875, 3.8401947021484375, -9.331415176391602, -5.149528503417969, 20.448482513427734, 8.790916442871094, 0.25499725341796875, 13.102554321289062, 18.055526733398438, 9.046340942382812, 12.670442581176758, 20.980926513671875, 5.763343811035156, 25.100326538085938, 30.23345375061035, 16.340065002441406, -21.050506591796875, 8.225522994995117, 21.24298858642578, 27.249969482421875, -24.770503997802734, 17.187759399414062, -1.3325061798095703, 32.270782470703125, 15.544952392578125, 2.767120361328125, 7.120758056640625, 23.593002319335938, 10.111305236816406, -8.721839904785156, 34.39021301269531, 8.943115234375, 8.129959106445312, 19.0189208984375, 8.85357666015625, -18.076889038085938, 11.651542663574219, 42.87552261352539, 12.249130249023438, 8.986623764038086, -13.303962707519531, 23.411319732666016, 10.630058288574219, 6.009529113769531, 10.42927360534668, 12.0579833984375, 11.468559265136719, 33.968971252441406, 15.894729614257812, 13.556045532226562, 6.99052619934082], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000552.npy"}
{"epoch": 0.8344671201814059, "step": 553, "batch_size": 64, "mean": 9.903047561645508, "std": 18.037132263183594, "min": -31.624732971191406, "p10": -7.658935546874999, "median": 5.024211883544922, "p90": 36.35709991455078, "max": 51.872344970703125, "pos_frac": 0.65625, "sample": [44.20250701904297, -6.594627380371094, -3.2822189331054688, 16.74470329284668, 1.611663818359375, -1.083944320678711, 31.9737491607666, 14.554046630859375, -10.916168212890625, 0.9652614593505859, 5.505413055419922, 0.1165771484375, 25.887771606445312, 24.995803833007812, -6.8218536376953125, 31.930465698242188, 1.087656021118164, 19.354286193847656, 4.543010711669922, -31.624732971191406, -6.486238479614258, 30.685867309570312, 0.6175613403320312, 13.535881042480469, 44.6007080078125, 18.897972106933594, -2.4012374877929688, 1.2124862670898438, -16.357818603515625, -4.368034362792969, 50.081085205078125, 30.483177185058594, 25.88055419921875, 51.872344970703125, 6.920928955078125, 16.24751853942871, 38.6629753112793, 12.229232788085938, 11.1546630859375, 36.54508972167969, 3.884552001953125, 5.827465057373047, -0.10658073425292969, -0.0609283447265625, -5.49676513671875, 0.6478652954101562, -8.017684936523438, 48.47645568847656, -21.199005126953125, -12.435064315795898, 4.472282409667969, 13.06357192993164, 19.66327667236328, 6.4920196533203125, -0.36722564697265625, 13.279426574707031, -2.0715980529785156, -9.710464477539062, 12.864013671875, 13.149368286132812, -2.520872116088867, -3.4452896118164062, 35.91845703125, -1.6762847900390625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000553.npy"}
{"epoch": 0.8359788359788359, "step": 554, "batch_size": 64, "mean": 10.65607738494873, "std": 16.109352111816406, "min": -29.893936157226562, "p10": -4.928635406494139, "median": 7.148262977600098, "p90": 33.76729469299317, "max": 53.24139404296875, "pos_frac": 0.8125, "sample": [4.695030212402344, 33.03376388549805, 48.00446319580078, 30.10980224609375, 6.797920227050781, 5.459434509277344, 9.572458267211914, 7.932159423828125, 9.5418701171875, 14.531326293945312, 0.9600143432617188, -1.3074417114257812, -5.8505401611328125, 21.61151885986328, -25.620906829833984, 3.5126571655273438, 20.55984115600586, -11.931060791015625, 8.996894836425781, 2.9267826080322266, 0.27344512939453125, 17.00189971923828, 7.225282669067383, -5.629180908203125, 21.071426391601562, 20.346771240234375, 36.415199279785156, 3.507242202758789, -9.161748886108398, 14.746803283691406, 53.24139404296875, 2.6407546997070312, 15.52393913269043, 2.4183311462402344, -0.4460411071777344, -3.2940292358398438, 17.34380340576172, 0.9320621490478516, 23.72307586669922, 1.2794113159179688, -29.893936157226562, 34.0816650390625, 0.6117401123046875, 6.519359588623047, 13.806045532226562, 4.679782867431641, -0.2237987518310547, 11.384017944335938, 2.158113479614258, 0.57391357421875, 20.103309631347656, 9.965492248535156, -3.1617965698242188, 31.054458618164062, 45.16148376464844, 4.968620300292969, 11.004112243652344, 11.051063537597656, 7.0712432861328125, 26.122413635253906, 38.148094177246094, -12.123313903808594, 42.10893249511719, 4.122093200683594], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000554.npy"}
{"epoch": 0.8374905517762661, "step": 555, "batch_size": 64, "mean": 9.74759292602539, "std": 14.666908264160156, "min": -36.460662841796875, "p10": -7.218167877197265, "median": 8.929022789001465, "p90": 27.83626098632813, "max": 43.06028747558594, "pos_frac": 0.796875, "sample": [2.1877307891845703, -26.18010711669922, 3.49420166015625, 5.6061553955078125, 19.2108154296875, 3.2437286376953125, -14.701217651367188, 1.6914825439453125, -7.552310943603516, 2.5197067260742188, 24.282054901123047, -4.390190124511719, 10.578155517578125, 5.4445037841796875, 23.865421295166016, 26.410423278808594, 8.267234802246094, 1.1171226501464844, 5.5956878662109375, 28.44733428955078, 8.898048400878906, -3.505096435546875, 43.06028747558594, 4.535789489746094, 20.930118560791016, 11.227287292480469, 21.349185943603516, 26.04527473449707, 17.5452880859375, 22.72888946533203, 19.689725875854492, 12.031707763671875, 15.139404296875, 20.569732666015625, -10.324378967285156, 20.98785400390625, -36.460662841796875, 8.959997177124023, 3.8992691040039062, 7.529094696044922, 11.878311157226562, 11.752296447753906, 2.8570938110351562, 6.876747131347656, -4.5897369384765625, 7.970985412597656, 31.96752166748047, 10.442352294921875, 30.048824310302734, -6.757564544677734, 30.634174346923828, -2.936382293701172, -12.977005004882812, -3.3891525268554688, 24.159366607666016, 7.263603210449219, 23.62747573852539, 11.637153625488281, 10.392496109008789, -7.415569305419922, 40.1657829284668, 15.673147201538086, 29.02619171142578, 1.5630664825439453], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000555.npy"}
{"epoch": 0.8390022675736961, "step": 556, "batch_size": 64, "mean": 9.296749114990234, "std": 16.738374710083008, "min": -21.37879180908203, "p10": -7.865085983276367, "median": 5.782291412353516, "p90": 33.99609107971192, "max": 59.380157470703125, "pos_frac": 0.6875, "sample": [10.129158020019531, 7.7941436767578125, 10.57135009765625, -5.129764556884766, -11.026138305664062, 16.241439819335938, 33.90800094604492, 7.454471588134766, -3.3973388671875, -8.464096069335938, -3.973775863647461, 2.155029296875, 32.672828674316406, 28.374984741210938, 1.8592338562011719, -7.49847412109375, 7.194210052490234, 0.7588348388671875, 8.900745391845703, 30.146671295166016, -1.8315811157226562, 23.866668701171875, 34.033843994140625, 2.1367568969726562, -2.4429264068603516, 4.517311096191406, -15.93173599243164, 10.789804458618164, 16.163654327392578, -7.20025634765625, -1.8036441802978516, -21.37879180908203, 39.913490295410156, 2.0137100219726562, -5.7552032470703125, 3.5547542572021484, 19.862930297851562, 27.707992553710938, 1.0141067504882812, 1.3223819732666016, 29.187036514282227, 59.380157470703125, 2.9376564025878906, 17.63315200805664, 22.47911834716797, 19.275306701660156, 0.2928009033203125, 10.9639892578125, -16.901443481445312, -3.4494552612304688, 39.06927490234375, -8.022205352783203, -5.4264984130859375, 35.58245849609375, 4.652122497558594, -6.8074951171875, 6.9124603271484375, -18.09398651123047, 7.5670928955078125, 28.892364501953125, 34.054847717285156, 35.67131805419922, 11.490190505981445, -1.5731582641601562], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000556.npy"}
{"epoch": 0.8405139833711263, "step": 557, "batch_size": 64, "mean": 12.408753395080566, "std": 14.9137544631958, "min": -14.059600830078125, "p10": -3.3531906127929685, "median": 11.355794906616211, "p90": 33.98239212036133, "max": 52.18134307861328, "pos_frac": 0.78125, "sample": [34.917823791503906, 15.159950256347656, 11.367679595947266, 2.7973709106445312, -1.2857398986816406, -8.81204605102539, 49.596031188964844, 3.9962921142578125, 16.700504302978516, 10.513259887695312, 1.852691650390625, -3.4734039306640625, -3.07269287109375, 18.74068832397461, 2.113292694091797, 5.3651123046875, 14.943439483642578, 12.405597686767578, 43.551788330078125, 16.675209045410156, -1.1436576843261719, -2.3127517700195312, 33.11103057861328, -6.483306884765625, -2.0215415954589844, -14.059600830078125, 38.586647033691406, 38.99366760253906, 5.283771514892578, 1.1060791015625, 19.049346923828125, 8.064050674438477, 14.321632385253906, -9.328758239746094, 7.977775573730469, 33.75727081298828, 11.343910217285156, 31.101409912109375, 28.68621826171875, 52.18134307861328, 4.666412353515625, 2.365194320678711, -7.993019104003906, -2.6998214721679688, 0.8839969635009766, 7.6352691650390625, 19.446456909179688, 23.168304443359375, 18.042787551879883, 12.179450988769531, 0.837310791015625, 7.251279830932617, 15.67287826538086, 15.871681213378906, 14.828397750854492, 14.000518798828125, 30.060012817382812, 9.72390365600586, 14.479450225830078, 13.349258422851562, -1.048126220703125, -7.8074188232421875, 34.07887268066406, 22.89974594116211], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000557.npy"}
{"epoch": 0.8420256991685563, "step": 558, "batch_size": 64, "mean": 12.038671493530273, "std": 15.163728713989258, "min": -18.942832946777344, "p10": -4.313250732421874, "median": 9.624866485595703, "p90": 32.075263595581056, "max": 53.44947814941406, "pos_frac": 0.71875, "sample": [11.68674087524414, 29.894798278808594, 26.8272705078125, -2.3155136108398438, 29.786270141601562, 2.130228042602539, 33.76020050048828, -3.4379940032958984, 1.127267837524414, 16.852142333984375, -1.2842864990234375, 20.52910614013672, 13.742324829101562, 10.032264709472656, -1.700347900390625, 24.906570434570312, 17.199365615844727, 9.21746826171875, 7.208683013916016, 10.940078735351562, 31.292877197265625, 19.375106811523438, 8.922775268554688, 27.320110321044922, 15.869831085205078, -5.069404602050781, 45.598052978515625, 11.799751281738281, 17.717727661132812, 12.197662353515625, 32.41057205200195, 4.568687438964844, 3.9433746337890625, 47.15870666503906, -1.2368659973144531, -1.692331314086914, 8.841974258422852, 2.4116134643554688, 8.128211975097656, 53.44947814941406, 16.646198272705078, -3.9320831298828125, 36.85392761230469, -1.117431640625, 27.509824752807617, 42.3128662109375, -9.374763488769531, -4.4766082763671875, 24.307907104492188, 3.4273757934570312, 8.380477905273438, -18.942832946777344, 12.935077667236328, -4.576652526855469, -4.693939208984375, -1.8916854858398438, 17.114665985107422, -1.5454635620117188, -8.626708984375, -0.6230392456054688, 16.598655700683594, 4.520284652709961, 18.380332946777344, 1.178009033203125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000558.npy"}
{"epoch": 0.8435374149659864, "step": 559, "batch_size": 64, "mean": 11.499939918518066, "std": 16.84891700744629, "min": -22.80624008178711, "p10": -5.890732192993164, "median": 8.877201080322266, "p90": 33.01677131652832, "max": 63.431060791015625, "pos_frac": 0.71875, "sample": [-3.9701690673828125, -10.921882629394531, 5.182834625244141, -2.7598114013671875, 13.339057922363281, 63.431060791015625, 27.981300354003906, 30.064666748046875, 7.248283386230469, 2.5221328735351562, -20.163864135742188, 28.352081298828125, 6.0065765380859375, 14.22430419921875, -1.1654300689697266, 36.21519470214844, 15.562751770019531, 8.381278991699219, 26.175209045410156, 17.261943817138672, 27.511093139648438, 7.845832824707031, -2.294647216796875, 18.224365234375, -1.4128761291503906, 2.4395313262939453, -6.9416656494140625, 22.478515625, 5.938201904296875, 45.711395263671875, 5.185920715332031, 8.828994750976562, 13.038673400878906, 23.232070922851562, 27.86145782470703, -0.7186145782470703, 15.198686599731445, -4.737600326538086, 1.9059982299804688, -10.466773986816406, 1.6731033325195312, -5.894855499267578, -2.317218780517578, 8.925407409667969, 8.46038818359375, 33.35984802246094, 24.84065818786621, -5.180381774902344, 21.75399398803711, 13.696807861328125, 12.089933395385742, 43.93585205078125, -22.80624008178711, -5.881111145019531, 18.619918823242188, 43.4308967590332, 8.364620208740234, 34.85142517089844, 9.91085433959961, 8.992979049682617, 32.21625900268555, 14.960800170898438, -2.575885772705078, -21.22799301147461], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000559.npy"}
{"epoch": 0.8450491307634165, "step": 560, "batch_size": 64, "mean": 10.107817649841309, "std": 15.207623481750488, "min": -14.5880126953125, "p10": -8.688070297241211, "median": 6.542828559875488, "p90": 32.107414245605476, "max": 46.846431732177734, "pos_frac": 0.703125, "sample": [10.719406127929688, 6.602973937988281, 32.519622802734375, 12.339118957519531, -14.5880126953125, -11.924400329589844, 30.390350341796875, 5.5734100341796875, 35.929168701171875, 1.47576904296875, 18.38946533203125, -0.9497604370117188, 2.3658599853515625, 13.672576904296875, -0.8525543212890625, -7.579925537109375, -4.2492523193359375, -2.8786354064941406, 11.572734832763672, 19.285125732421875, -11.301025390625, 23.238685607910156, 4.545417785644531, 1.6888904571533203, 30.330345153808594, 16.237777709960938, -0.8968582153320312, 29.41127586364746, 38.078346252441406, 6.482683181762695, 22.74687957763672, -7.111228942871094, -11.237945556640625, 28.410194396972656, 32.490562438964844, -8.676040649414062, 2.6687545776367188, -7.341423034667969, 46.846431732177734, -10.411758422851562, 39.72882843017578, 18.739974975585938, 14.885536193847656, 17.841014862060547, 3.4442138671875, 10.52197265625, -3.3797988891601562, 27.262710571289062, 4.043682098388672, 0.6317558288574219, -2.4713211059570312, 21.3492431640625, -8.693225860595703, -1.6419143676757812, 10.054718017578125, 18.783653259277344, -10.538970947265625, 4.890037536621094, 31.213401794433594, 4.628894805908203, 35.07653045654297, 15.023090362548828, 3.7695579528808594, 7.723718643188477], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000560.npy"}
{"epoch": 0.8465608465608465, "step": 561, "batch_size": 64, "mean": 11.133149147033691, "std": 15.4675874710083, "min": -12.465339660644531, "p10": -6.071048736572266, "median": 7.450986862182617, "p90": 34.901023864746094, "max": 54.046531677246094, "pos_frac": 0.78125, "sample": [1.699554443359375, 3.1820220947265625, 3.13555908203125, 7.346487045288086, 4.252582550048828, 19.13189697265625, 11.07115364074707, -5.898735046386719, 24.34156036376953, 10.471176147460938, 7.999015808105469, 12.68402099609375, 0.3844413757324219, 21.662132263183594, 2.5828933715820312, -11.121938705444336, 3.0470962524414062, 35.133636474609375, 54.046531677246094, 0.09344482421875, -0.27526092529296875, 21.841339111328125, 16.104232788085938, -3.6090850830078125, 13.371383666992188, -3.715383529663086, 14.499137878417969, -8.333572387695312, -12.465339660644531, 43.287811279296875, -10.039867401123047, 7.555486679077148, 29.81298828125, 27.118995666503906, 4.3282928466796875, 4.578277587890625, 7.136871337890625, 9.181772232055664, 3.011199951171875, 47.32115173339844, 43.54600524902344, 34.35826110839844, 27.30809783935547, 38.90496826171875, 21.02536392211914, -6.94891357421875, -11.422203063964844, 13.595333099365234, 26.772842407226562, 12.756437301635742, 1.0981521606445312, 3.3862991333007812, -2.55908203125, 0.2130718231201172, 11.134994506835938, 17.697673797607422, 4.665212631225586, 13.677818298339844, 17.970474243164062, -4.643531799316406, -1.542001724243164, 4.474128723144531, 37.242061614990234, -6.1448974609375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000561.npy"}
{"epoch": 0.8480725623582767, "step": 562, "batch_size": 64, "mean": 12.541390419006348, "std": 14.736178398132324, "min": -18.664813995361328, "p10": -4.918732452392577, "median": 11.948356628417969, "p90": 35.214715576171876, "max": 42.20332336425781, "pos_frac": 0.734375, "sample": [5.141963958740234, 30.072439193725586, -8.383922576904297, 24.233001708984375, 16.589706420898438, 17.00720977783203, -7.095001220703125, 37.973876953125, 42.20332336425781, -7.024692535400391, 15.400802612304688, 20.618148803710938, -7.112730026245117, 3.364604949951172, 30.03997802734375, 0.21717453002929688, -5.2926177978515625, 11.757232666015625, 17.886306762695312, -2.916900634765625, 25.747154235839844, 24.648826599121094, -2.8793716430664062, 0.9072799682617188, -4.046333312988281, 32.21603012084961, -1.8197002410888672, -0.6573486328125, 22.118423461914062, 11.519004821777344, 22.563949584960938, -3.8466339111328125, 36.11992645263672, 29.739288330078125, 5.146575927734375, 35.56256103515625, -1.4216041564941406, 3.3488082885742188, -2.012115478515625, 34.403076171875, -1.0922775268554688, 0.5268020629882812, 22.15559196472168, 11.502777099609375, 14.645660400390625, 38.68476486206055, 18.189376831054688, 12.155181884765625, 23.957149505615234, 12.139480590820312, 36.014312744140625, 24.66766357421875, -1.0292835235595703, 7.100433349609375, 16.43603515625, 15.538515090942383, 19.651212692260742, 36.45782470703125, 6.726951599121094, -18.664813995361328, 3.0224609375, -5.70123291015625, 0.6258888244628906, 6.900840759277344], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000562.npy"}
{"epoch": 0.8495842781557067, "step": 563, "batch_size": 64, "mean": 10.990996360778809, "std": 16.457427978515625, "min": -24.83926010131836, "p10": -9.320093154907227, "median": 11.52324390411377, "p90": 29.242890930175783, "max": 61.16590118408203, "pos_frac": 0.734375, "sample": [2.0074844360351562, 14.83317756652832, 13.220630645751953, 1.5497188568115234, 23.177078247070312, 25.713134765625, -24.770925521850586, 25.602203369140625, 8.718856811523438, -0.2682952880859375, -1.2308578491210938, 10.1197509765625, 1.5828475952148438, -4.521369934082031, 30.010353088378906, 11.718069076538086, -8.696834564208984, 20.959495544433594, -11.130599975585938, 28.686904907226562, 11.803070068359375, 43.186737060546875, -3.909616470336914, 16.61174774169922, 21.77362060546875, 6.969478607177734, 23.38896942138672, 4.534751892089844, 13.284278869628906, -18.51280975341797, 0.21392059326171875, -2.415283203125, -5.9905242919921875, 5.814228057861328, -2.456024169921875, 38.32654571533203, 8.091419219970703, -24.83926010131836, 15.373041152954102, 6.475513458251953, 40.20996856689453, 30.527236938476562, 22.96889305114746, -9.797828674316406, -9.587203979492188, 9.965072631835938, 25.953468322753906, -0.008575439453125, 27.75873565673828, 11.962532043457031, -20.19500732421875, 14.642330169677734, 18.967456817626953, 29.481170654296875, 18.69466781616211, 9.414941787719727, 61.16590118408203, 24.364944458007812, 0.6749267578125, 28.395822525024414, 14.87948226928711, -1.8659172058105469, 18.517732620239258, 11.328418731689453], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000563.npy"}
{"epoch": 0.8510959939531368, "step": 564, "batch_size": 64, "mean": 12.945907592773438, "std": 14.958745956420898, "min": -18.846111297607422, "p10": -6.032719421386718, "median": 10.89182186126709, "p90": 32.476975250244145, "max": 48.56470489501953, "pos_frac": 0.828125, "sample": [27.748580932617188, -4.1918792724609375, 27.46596908569336, 17.734207153320312, 31.70092010498047, 34.340965270996094, 21.920120239257812, -18.846111297607422, 0.914337158203125, 2.9730224609375, 35.59423828125, 29.881526947021484, 11.906719207763672, -9.01591682434082, 6.0037689208984375, 2.6972808837890625, -2.901031494140625, 9.790542602539062, 13.389091491699219, 3.820873260498047, 4.669197082519531, 1.4148578643798828, 24.89788818359375, 17.126556396484375, 21.910003662109375, 11.027000427246094, 28.669418334960938, -9.461891174316406, 9.508295059204102, 14.88690185546875, 31.62335968017578, 34.78483581542969, 45.59546661376953, 10.16312026977539, 2.4558868408203125, -6.6160430908203125, 10.756643295288086, 9.266674041748047, 4.485267639160156, 5.7454681396484375, 39.05635070800781, 17.774658203125, 15.29693603515625, 2.2805538177490234, -8.238273620605469, 5.7912139892578125, 8.0997314453125, 14.49053955078125, 6.860740661621094, -17.502105712890625, 21.200374603271484, 12.050043106079102, 32.8095703125, -0.14591217041015625, 28.234573364257812, 6.2457122802734375, 25.12276840209961, 48.56470489501953, -12.487808227539062, 28.68988800048828, -4.671630859375, 16.206707000732422, 8.876237869262695, 18.096389770507812], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000564.npy"}
{"epoch": 0.8526077097505669, "step": 565, "batch_size": 64, "mean": 9.63563346862793, "std": 14.847885131835938, "min": -21.038917541503906, "p10": -8.70280532836914, "median": 10.511082649230957, "p90": 25.41222610473633, "max": 52.92290496826172, "pos_frac": 0.71875, "sample": [6.44306755065918, -9.204143524169922, -14.17529296875, 25.71965789794922, -4.295249938964844, 5.813385009765625, 11.700103759765625, 11.492134094238281, -2.6408615112304688, 2.4958648681640625, 17.296119689941406, 16.574234008789062, 27.64452362060547, -21.038917541503906, -1.4786911010742188, 12.396354675292969, 10.550363540649414, 24.52627182006836, 42.04212951660156, 40.47438049316406, 23.06938934326172, -18.493453979492188, -3.853199005126953, -4.846435546875, 16.42388343811035, 21.00218963623047, 8.40625, 12.009323120117188, 20.995567321777344, 16.111671447753906, 21.771732330322266, 2.201129913330078, -4.3124542236328125, -7.533016204833984, 18.73606300354004, -7.193359375, 10.045167922973633, 17.182186126708984, -10.985040664672852, 10.435302734375, 14.557174682617188, -13.16610336303711, -2.4860076904296875, 24.69488525390625, 10.905487060546875, 20.452991485595703, 2.2909469604492188, 18.356037139892578, 33.17226028442383, 7.1155242919921875, 19.437362670898438, -6.856563568115234, -18.355636596679688, -1.1655120849609375, 8.651841163635254, 6.121601104736328, 1.2076339721679688, 17.643390655517578, 5.979770660400391, 11.033096313476562, 52.92290496826172, 23.566631317138672, 10.4718017578125, 26.620702743530273], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000565.npy"}
{"epoch": 0.854119425547997, "step": 566, "batch_size": 64, "mean": 9.973814010620117, "std": 16.861936569213867, "min": -25.467998504638672, "p10": -11.550188064575194, "median": 9.823633193969727, "p90": 29.324612426757813, "max": 58.86842346191406, "pos_frac": 0.6875, "sample": [12.958793640136719, 6.845146179199219, -3.189361572265625, 12.018280029296875, 2.28289794921875, 20.091224670410156, 35.97031784057617, 11.853004455566406, 8.211139678955078, 23.591781616210938, 14.975139617919922, 13.004898071289062, 10.109174728393555, 9.846881866455078, 23.71221160888672, 38.78009033203125, -22.528518676757812, 16.8143310546875, 21.009449005126953, 20.368928909301758, 26.37814712524414, -3.1692886352539062, -8.069072723388672, 29.702484130859375, -0.12083625793457031, -9.243637084960938, 19.40915298461914, 25.07583999633789, 22.837936401367188, -10.084159851074219, 28.89633560180664, 23.036346435546875, 26.751880645751953, 8.01226806640625, -9.882965087890625, -25.467998504638672, -6.957878112792969, 10.48044204711914, 26.7076416015625, -12.178485870361328, 4.909553527832031, -12.5355224609375, 58.86842346191406, -9.273750305175781, 29.508159637451172, 5.467460632324219, -4.378726959228516, -14.549476623535156, 18.86156463623047, 5.356536865234375, -1.5803241729736328, -1.1739349365234375, 7.855377197265625, 18.990760803222656, -2.0825233459472656, 5.865055084228516, -18.446876525878906, 30.224029541015625, 6.003791809082031, -12.849184036254883, 15.590473175048828, 8.419147491455078, 9.800384521484375, 50.633705139160156], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000566.npy"}
{"epoch": 0.8556311413454271, "step": 567, "batch_size": 64, "mean": 11.62405014038086, "std": 13.366959571838379, "min": -11.367393493652344, "p10": -4.899188995361327, "median": 10.33731460571289, "p90": 31.706019973754884, "max": 41.69207763671875, "pos_frac": 0.78125, "sample": [8.896484375, 3.616130828857422, -3.6628799438476562, 34.38435363769531, 11.224212646484375, 9.669647216796875, 31.724853515625, 17.965087890625, 37.993614196777344, -0.09647369384765625, 20.28160858154297, 15.708887100219727, 6.333580017089844, -9.262939453125, 10.021568298339844, 5.171318054199219, -3.7065353393554688, 13.21026611328125, 9.967567443847656, 14.255393981933594, 18.55755615234375, 1.0862960815429688, -1.6499958038330078, 17.469528198242188, 3.6507797241210938, 1.2452831268310547, -6.660526275634766, -7.329977035522461, 15.134681701660156, -11.367393493652344, 40.08006286621094, 11.982208251953125, 31.66207504272461, 41.69207763671875, 31.76171112060547, 3.1432456970214844, 23.784265518188477, -1.1780509948730469, 10.069534301757812, 39.045257568359375, -11.309471130371094, 11.5152587890625, 13.774139404296875, 9.136734008789062, 21.262771606445312, 23.4827880859375, -4.2581329345703125, 7.082004547119141, 24.178848266601562, 10.605094909667969, 19.823570251464844, 1.2534408569335938, 29.4285888671875, 6.3946075439453125, 1.3866119384765625, -5.173927307128906, -6.765872955322266, 12.413534164428711, 31.088607788085938, -0.43804931640625, 10.664745330810547, 8.292793273925781, 18.83941650390625, 15.386756896972656], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000567.npy"}
{"epoch": 0.8571428571428571, "step": 568, "batch_size": 64, "mean": 12.78278923034668, "std": 14.10503101348877, "min": -12.379528045654297, "p10": -2.176889419555663, "median": 10.788741111755371, "p90": 33.10747375488282, "max": 58.97052001953125, "pos_frac": 0.828125, "sample": [12.11114501953125, 1.090219497680664, 16.109149932861328, 3.1729507446289062, 10.54355239868164, 16.048179626464844, 13.441055297851562, -12.379528045654297, 6.267360687255859, -2.913196563720703, -8.42469596862793, 27.1782169342041, 5.695207595825195, 31.286102294921875, -1.3053474426269531, -5.5240325927734375, 6.452392578125, 7.1540069580078125, 14.494026184082031, 12.004297256469727, -1.0112152099609375, 6.4345855712890625, 2.09523868560791, 17.48943328857422, 24.431137084960938, 43.06444549560547, 20.21809959411621, 34.53352355957031, -7.393383026123047, 7.80499267578125, 14.200698852539062, 4.49836540222168, 17.59238624572754, -2.5504074096679688, 16.600847244262695, 11.033929824829102, -0.7233352661132812, 24.06330108642578, -0.6111412048339844, 14.635154724121094, 17.843406677246094, 27.09076690673828, 3.155731201171875, 17.132308959960938, 23.294456481933594, 2.7198257446289062, 15.297393798828125, -7.808738708496094, 8.4954833984375, 4.678794860839844, 9.1136474609375, 15.16351318359375, 5.210479736328125, 33.8880615234375, 34.08679962158203, 6.677740097045898, 0.9370765686035156, 38.333160400390625, 30.204193115234375, 27.180404663085938, 3.355743408203125, 0.9116058349609375, 43.25843811035156, 58.97052001953125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000568.npy"}
{"epoch": 0.8586545729402872, "step": 569, "batch_size": 64, "mean": 10.228591918945312, "std": 14.01224136352539, "min": -21.831588745117188, "p10": -5.8469512939453105, "median": 8.598423957824707, "p90": 31.72637062072754, "max": 41.55260467529297, "pos_frac": 0.8125, "sample": [16.68083953857422, 12.1612548828125, 8.626565933227539, 32.86467742919922, 12.791152954101562, 36.291709899902344, -15.10595703125, 31.592987060546875, 20.272239685058594, 37.70660400390625, 18.113906860351562, 7.003227233886719, 0.7428627014160156, 9.008108139038086, -21.831588745117188, 15.770538330078125, 17.031673431396484, 27.149734497070312, 0.34499549865722656, 13.773134231567383, 31.78353500366211, 19.868255615234375, 6.191722869873047, 32.231285095214844, -0.932891845703125, 20.541446685791016, 19.618698120117188, 4.924785614013672, 8.057807922363281, -8.976455688476562, -18.251447677612305, 5.554698944091797, 1.9191265106201172, 6.7035675048828125, 3.1048965454101562, 0.6222076416015625, 41.55260467529297, 6.820043563842773, 14.012413024902344, 0.4496116638183594, 7.852766036987305, -2.5391483306884766, 2.0956878662109375, -0.6155357360839844, 5.5814361572265625, -20.347625732421875, 8.570281982421875, 0.9967498779296875, 21.809185028076172, 15.719156265258789, 10.368303298950195, 9.95751953125, 38.052978515625, -6.559638977050781, -7.064176559448242, 4.246301651000977, 18.000152587890625, 13.288131713867188, 19.105098724365234, -2.3645477294921875, -4.184013366699219, 13.103813171386719, 26.63531494140625, 6.137107849121094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000569.npy"}
{"epoch": 0.8601662887377173, "step": 570, "batch_size": 64, "mean": 14.748177528381348, "std": 15.03004264831543, "min": -11.270927429199219, "p10": -2.202098846435547, "median": 13.101895332336426, "p90": 34.014756011962895, "max": 56.289215087890625, "pos_frac": 0.828125, "sample": [24.91748809814453, -11.270927429199219, 15.335197448730469, 16.40835952758789, 13.667396545410156, 19.222434997558594, -0.5322647094726562, -6.104595184326172, 9.85540771484375, 1.8151168823242188, 4.206581115722656, 4.8890380859375, 8.411375045776367, 14.482078552246094, 9.984161376953125, 24.834453582763672, -2.2274703979492188, -1.6249885559082031, 3.1970977783203125, 17.404117584228516, 13.346908569335938, -6.809635162353516, 0.9570388793945312, 33.49943542480469, 46.5058479309082, 0.2755584716796875, 17.89864730834961, 17.218597412109375, 28.023643493652344, 4.693260192871094, 21.750335693359375, 34.114097595214844, 21.25635528564453, -8.718673706054688, 10.56112289428711, 12.372322082519531, 21.239952087402344, 12.301385879516602, 33.782958984375, 11.478309631347656, 29.416488647460938, 32.279388427734375, 28.994552612304688, -4.715803146362305, 7.559135437011719, 12.564971923828125, 38.392181396484375, 26.05651092529297, 7.890533447265625, -11.105056762695312, 22.00727653503418, -2.1428985595703125, 2.5980377197265625, 23.294614791870117, 51.84606170654297, 2.5305938720703125, 14.114801406860352, 12.856882095336914, 16.6492919921875, 39.772953033447266, 56.289215087890625, 8.241031646728516, 36.281585693359375, -0.4065399169921875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000570.npy"}
{"epoch": 0.8616780045351474, "step": 571, "batch_size": 64, "mean": 8.224736213684082, "std": 13.498366355895996, "min": -21.963768005371094, "p10": -8.14187831878662, "median": 8.345294952392578, "p90": 20.773464965820313, "max": 39.651241302490234, "pos_frac": 0.75, "sample": [14.10470199584961, 3.1157302856445312, 7.742103576660156, 2.83453369140625, -4.58447265625, 37.6314811706543, 14.7476806640625, 10.81561279296875, 20.02479362487793, 19.915252685546875, 10.20356559753418, 23.06127166748047, -3.8172874450683594, 4.32818603515625, -2.904825210571289, 13.424118041992188, 4.38507080078125, 13.344718933105469, -1.7911300659179688, 19.46120834350586, 26.845813751220703, 0.6174850463867188, 20.907333374023438, 6.301116943359375, 16.123031616210938, 20.37136459350586, 13.94989013671875, 17.335493087768555, -15.239433288574219, 15.202556610107422, 0.18857955932617188, -14.162862777709961, 3.4529953002929688, -7.862749099731445, 16.572525024414062, 4.657747268676758, -6.1056671142578125, -2.7130165100097656, -8.261505126953125, 1.7594528198242188, 15.955787658691406, 8.658599853515625, 13.509384155273438, 14.059577941894531, 18.34292221069336, 34.221923828125, 20.461105346679688, -15.467378616333008, 6.141883850097656, 4.767822265625, -18.661026000976562, -20.29241943359375, 16.279090881347656, 18.6619873046875, 18.652271270751953, -2.7184677124023438, 7.44915771484375, 32.709415435791016, 8.031990051269531, 15.680709838867188, -5.598125457763672, 1.8669776916503906, -21.963768005371094, 39.651241302490234], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000571.npy"}
{"epoch": 0.8631897203325775, "step": 572, "batch_size": 64, "mean": 10.044036865234375, "std": 13.902515411376953, "min": -13.225837707519531, "p10": -7.65177993774414, "median": 9.057140350341797, "p90": 27.74001998901367, "max": 52.53944396972656, "pos_frac": 0.734375, "sample": [-0.3133392333984375, -13.225837707519531, -1.1367950439453125, 14.424060821533203, 26.771316528320312, 10.441093444824219, 36.286376953125, 8.843315124511719, -6.778835296630859, 52.53944396972656, 9.270965576171875, 15.044084548950195, 19.6549072265625, 21.35387420654297, -4.400112152099609, 18.154098510742188, -8.722366333007812, 4.861356735229492, 1.4371623992919922, 27.660888671875, 21.46945571899414, 14.20361328125, 15.425735473632812, 33.03230667114258, 13.235427856445312, 6.288887023925781, 0.9902381896972656, 4.264747619628906, -7.997306823730469, -3.62164306640625, 29.04889678955078, 27.77393341064453, 37.68014144897461, -12.64617919921875, 14.49323844909668, 8.585796356201172, 8.267356872558594, -8.462074279785156, 3.719005584716797, 5.693286895751953, 14.879257202148438, 17.053590774536133, 15.716766357421875, 20.659969329833984, 10.026695251464844, 26.273536682128906, 0.6345405578613281, -9.356796264648438, 36.074180603027344, 21.011795043945312, 11.169471740722656, 19.780479431152344, 1.3826904296875, 15.344474792480469, -9.7452392578125, -5.7237701416015625, -0.01631927490234375, 7.519477844238281, -5.976585388183594, -6.845550537109375, 13.215316772460938, -4.340625762939453, 5.851591110229492, 4.618873596191406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000572.npy"}
{"epoch": 0.8647014361300076, "step": 573, "batch_size": 64, "mean": 11.156794548034668, "std": 18.701684951782227, "min": -27.115734100341797, "p10": -11.932479095458984, "median": 7.613151550292969, "p90": 40.11586608886719, "max": 59.228965759277344, "pos_frac": 0.75, "sample": [10.234542846679688, 12.870840072631836, -27.115734100341797, 15.007339477539062, -3.8226547241210938, 4.760009765625, -7.976783752441406, 27.620704650878906, 13.385177612304688, -7.5890350341796875, -26.1536922454834, -12.016883850097656, -2.3604564666748047, 1.2840118408203125, 4.8147430419921875, 12.558837890625, -0.7446517944335938, 10.422767639160156, 40.38249969482422, -18.887245178222656, 39.49372100830078, 7.760078430175781, 31.091812133789062, 7.949714660644531, -12.262298583984375, 0.3410606384277344, 51.67832946777344, 6.357273101806641, 29.36572265625, 3.3339004516601562, 26.336349487304688, 1.791351318359375, 6.866844177246094, 11.735084533691406, 52.2689208984375, 5.325775146484375, 4.007448196411133, 6.686738967895508, 7.0054473876953125, 59.228965759277344, 26.81853485107422, 14.957170486450195, 13.942577362060547, 2.695009231567383, -11.73553466796875, 7.466224670410156, 46.47145080566406, 10.056938171386719, 4.08111572265625, 12.042579650878906, 43.506431579589844, 31.725433349609375, 4.580841064453125, 18.57935333251953, -0.12906646728515625, -12.514945983886719, 41.63775634765625, -4.298774719238281, 13.84206771850586, 17.20204734802246, 17.655471801757812, -1.77447509765625, -12.71661376953125, 36.90667724609375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000573.npy"}
{"epoch": 0.8662131519274376, "step": 574, "batch_size": 64, "mean": 12.324638366699219, "std": 12.890103340148926, "min": -9.435028076171875, "p10": -2.1919677734375, "median": 12.186670303344727, "p90": 32.58058433532715, "max": 42.2489013671875, "pos_frac": 0.8125, "sample": [9.915771484375, -2.372100830078125, -1.970611572265625, -8.343067169189453, 5.781135559082031, 19.409698486328125, 2.6592750549316406, -1.0441970825195312, 5.476274490356445, -2.286834716796875, -6.14459228515625, 12.845733642578125, 7.413665771484375, 3.016742706298828, -1.2017021179199219, -5.496997833251953, -2.4645557403564453, 11.897750854492188, 12.893341064453125, 8.298721313476562, 29.240211486816406, 13.362991333007812, 7.18798828125, 14.60888671875, 0.38120269775390625, 41.84141540527344, 13.125541687011719, 40.89067077636719, 1.3103466033935547, 28.17993927001953, 13.83990478515625, 0.180938720703125, 12.212528228759766, 15.037811279296875, 9.523590087890625, 18.523540496826172, -9.435028076171875, 12.404731750488281, 26.869239807128906, 1.4716835021972656, 12.45556640625, 16.231582641601562, 40.15393829345703, 13.493499755859375, 20.009716033935547, 33.10963439941406, 15.067214965820312, 19.470596313476562, 8.631767272949219, 12.160812377929688, 4.6509246826171875, 36.32706069946289, 9.94354248046875, 14.894887924194336, 20.008392333984375, -1.4729118347167969, 7.640047073364258, 31.346134185791016, 22.22161102294922, 33.93153381347656, 0.6854705810546875, 17.615753173828125, 42.2489013671875, -1.0904006958007812], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000574.npy"}
{"epoch": 0.8677248677248677, "step": 575, "batch_size": 64, "mean": 8.858168601989746, "std": 16.031005859375, "min": -27.903057098388672, "p10": -8.72027473449707, "median": 7.646141052246094, "p90": 34.51325721740723, "max": 49.1690673828125, "pos_frac": 0.703125, "sample": [-2.0541000366210938, -12.403125762939453, -20.50375747680664, -4.28630256652832, 20.589508056640625, 5.88861083984375, 1.1454391479492188, 13.807380676269531, 0.6912307739257812, -7.247764587402344, 21.1995849609375, -4.881980895996094, 36.07313537597656, -9.007511138916016, -12.643409729003906, 18.537918090820312, -10.538265228271484, 12.863304138183594, 11.41378402709961, 7.497711181640625, 16.892311096191406, 36.148155212402344, 7.221473693847656, 11.958003997802734, -0.21738433837890625, 3.9062728881835938, -7.50457763671875, 15.100797653198242, 28.238189697265625, 0.4025421142578125, 3.2012081146240234, 3.2312355041503906, 34.67583084106445, 37.772979736328125, 25.370819091796875, -16.76372528076172, 46.241859436035156, -2.0294342041015625, 34.13391876220703, 28.7691707611084, -4.6829681396484375, 11.867027282714844, 17.092369079589844, 9.591110229492188, 3.7789306640625, 7.7945709228515625, 2.744112014770508, -27.903057098388672, 3.3629302978515625, 49.1690673828125, 19.459060668945312, 10.837814331054688, 10.30181884765625, 9.566242218017578, -7.605983734130859, 12.615509033203125, 6.491170883178711, -4.834827423095703, 8.494417190551758, 19.698223114013672, -8.050056457519531, 34.92853546142578, 13.3489990234375, -4.033267974853516], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000575.npy"}
{"epoch": 0.8692365835222978, "step": 576, "batch_size": 64, "mean": 11.62350082397461, "std": 14.529440879821777, "min": -15.862289428710938, "p10": -6.694554328918455, "median": 9.802497863769531, "p90": 31.59078750610352, "max": 44.12355041503906, "pos_frac": 0.71875, "sample": [25.811721801757812, 35.127349853515625, -1.5492019653320312, 13.634256362915039, 5.40924072265625, 15.209197998046875, 41.42629623413086, 8.029878616333008, 16.176910400390625, 8.2681884765625, -4.163972854614258, -7.863088607788086, 21.124698638916016, -9.774436950683594, 11.336807250976562, -7.251106262207031, -3.1282424926757812, 24.75358772277832, 44.12355041503906, 2.81298828125, 15.698919296264648, 38.228431701660156, 13.247142791748047, -0.741058349609375, 20.362762451171875, 16.868040084838867, 14.779460906982422, -7.4495086669921875, 35.026695251464844, -3.5725231170654297, 6.9580841064453125, 2.4300613403320312, 2.46966552734375, 24.285627365112305, 7.8156890869140625, 3.35748291015625, 31.970596313476562, -1.36407470703125, 3.4420547485351562, 2.480846405029297, 21.022380828857422, -1.1385612487792969, 30.125457763671875, -7.9384307861328125, 22.789291381835938, 6.3555145263671875, 42.29825973510742, 25.67011260986328, -1.8298969268798828, 21.523841857910156, 22.483993530273438, -15.862289428710938, 13.93341064453125, -2.1466522216796875, 17.82464599609375, 12.636116027832031, -8.112886428833008, -5.395933151245117, 24.92322540283203, 21.253067016601562, 3.361074447631836, 3.708597183227539, -0.093902587890625, 30.704566955566406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000576.npy"}
{"epoch": 0.8707482993197279, "step": 577, "batch_size": 64, "mean": 8.467338562011719, "std": 15.173977851867676, "min": -23.040740966796875, "p10": -10.125494384765625, "median": 6.542396545410156, "p90": 28.273232269287117, "max": 50.07032012939453, "pos_frac": 0.71875, "sample": [19.34502601623535, 2.319833755493164, -0.29297637939453125, 10.627159118652344, -3.331390380859375, 7.738067626953125, 2.942127227783203, -3.9518203735351562, 25.988571166992188, -2.2344512939453125, -4.1292572021484375, 18.89159393310547, 17.817386627197266, -2.6405868530273438, -10.266288757324219, 30.489974975585938, 22.8695068359375, 30.556201934814453, -23.040740966796875, 13.093114852905273, 24.654449462890625, 10.0289306640625, 5.998821258544922, 33.68207550048828, -1.1699905395507812, -16.617034912109375, -15.133167266845703, 22.070396423339844, 11.984376907348633, 2.0236053466796875, 7.071403503417969, 11.827388763427734, 5.762641906738281, 2.6474571228027344, 1.587738037109375, -2.1179428100585938, 6.1520233154296875, 6.7076416015625, 7.493507385253906, -2.791748046875, 9.846603393554688, 15.305206298828125, 35.80469512939453, 2.7145023345947266, 48.67281723022461, 29.25237274169922, -18.601463317871094, 10.784576416015625, 5.609127044677734, 23.01921844482422, -16.248924255371094, 12.987838745117188, 50.07032012939453, 25.56688690185547, 4.249244689941406, 3.28265380859375, -17.784378051757812, -9.796974182128906, -3.664154052734375, 20.576906204223633, 6.3771514892578125, 13.58835220336914, 13.809768676757812, 1.8336944580078125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000577.npy"}
{"epoch": 0.872260015117158, "step": 578, "batch_size": 64, "mean": 7.942363739013672, "std": 14.25981330871582, "min": -18.901901245117188, "p10": -7.638845443725586, "median": 6.8532257080078125, "p90": 27.749163055419924, "max": 46.1414794921875, "pos_frac": 0.640625, "sample": [8.209999084472656, -12.807975769042969, 0.8436660766601562, 1.9842758178710938, 13.516807556152344, -0.6441516876220703, 7.198421478271484, 8.233299255371094, 14.160324096679688, 3.6532230377197266, -2.323028564453125, 46.1414794921875, -3.5211868286132812, 9.20053482055664, 23.00655746459961, -18.901901245117188, 34.8800048828125, 18.4813175201416, 11.557052612304688, 23.545562744140625, 31.52735137939453, 10.256591796875, 3.785022735595703, 8.861042022705078, -2.979167938232422, 11.865785598754883, 18.140098571777344, -1.5201644897460938, 3.4928722381591797, -0.8523807525634766, 37.68830108642578, 14.794807434082031, 45.41566848754883, 10.237396240234375, -4.126182556152344, 7.1111297607421875, 9.154275894165039, 20.243003845214844, 7.369590759277344, 3.473512649536133, 27.56470489501953, -15.856170654296875, 9.274452209472656, -7.218341827392578, -7.819061279296875, -10.2470703125, 0.8760452270507812, 28.909860610961914, -4.808626174926758, 27.343067169189453, 11.420608520507812, -10.028007507324219, 27.828216552734375, -0.012964248657226562, 6.5953216552734375, -7.187690734863281, 2.3979263305664062, -0.9353485107421875, -2.4754562377929688, -0.16411972045898438, -2.2681541442871094, -1.9029388427734375, -9.41363525390625, 26.08580780029297], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000578.npy"}
{"epoch": 0.873771730914588, "step": 579, "batch_size": 64, "mean": 14.169384002685547, "std": 15.5214204788208, "min": -19.854217529296875, "p10": -3.8309749603271483, "median": 11.39997673034668, "p90": 37.95195770263674, "max": 51.87957763671875, "pos_frac": 0.828125, "sample": [17.637847900390625, 8.542518615722656, 10.913673400878906, 4.47052001953125, 2.329601287841797, -6.9791259765625, 16.81109619140625, 4.790458679199219, 16.118560791015625, 44.1353759765625, 32.718971252441406, 30.90168571472168, 9.925891876220703, 42.89686965942383, 16.16545295715332, 9.007667541503906, 32.95702362060547, -2.2921524047851562, -3.8456954956054688, 21.981430053710938, 51.87957763671875, -5.816173553466797, 46.66514205932617, 0.1300067901611328, 11.368820190429688, 11.431133270263672, 48.70790100097656, 23.427474975585938, 31.00092315673828, 13.145515441894531, -3.7966270446777344, 11.091760635375977, 30.197227478027344, 12.622125625610352, 18.53948211669922, 13.723358154296875, 22.127914428710938, 20.291040420532227, 5.4300384521484375, 45.40632247924805, 0.4922142028808594, 15.150493621826172, -0.08931732177734375, -9.379669189453125, 11.047252655029297, 5.831521987915039, -5.355659484863281, -19.854217529296875, 40.09264373779297, 6.707794189453125, 19.080230712890625, 3.9529876708984375, -3.9757080078125, 21.10352897644043, 16.975296020507812, 6.433891296386719, 24.29156494140625, 23.697189331054688, 5.170705795288086, 6.127349853515625, -3.3653106689453125, 6.056163787841797, 18.570106506347656, 1.3188133239746094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000579.npy"}
{"epoch": 0.8752834467120182, "step": 580, "batch_size": 64, "mean": 9.934186935424805, "std": 15.349852561950684, "min": -22.20928382873535, "p10": -8.128013229370117, "median": 8.483466148376465, "p90": 31.46255416870118, "max": 45.08287048339844, "pos_frac": 0.71875, "sample": [-12.069007873535156, -4.730079650878906, -1.3351364135742188, 7.17249870300293, -2.7707290649414062, 23.201284408569336, 39.355995178222656, 12.707149505615234, 9.79443359375, -0.7826919555664062, 19.26116943359375, 15.851638793945312, 4.99298095703125, 3.2546005249023438, -7.7210693359375, 6.864477157592773, 20.325176239013672, 12.487251281738281, -3.3829269409179688, 11.073902130126953, -0.8783760070800781, 17.61909294128418, 29.779701232910156, 3.598907470703125, 25.74635124206543, -3.38397216796875, 32.18377685546875, 0.3238983154296875, 0.9246425628662109, 41.0999755859375, 25.936317443847656, 27.330352783203125, 15.080108642578125, 17.081266403198242, -8.36898422241211, 45.08287048339844, -18.079208374023438, -22.20928382873535, 3.8724327087402344, 36.49542236328125, -21.058483123779297, 16.3231201171875, -8.302417755126953, 11.904754638671875, 33.69276428222656, -3.374919891357422, 2.6693859100341797, 2.338470458984375, 17.32086181640625, 10.538688659667969, 18.854625701904297, 24.678171157836914, 0.9708938598632812, -12.511726379394531, 0.1483001708984375, -3.6509437561035156, 13.602462768554688, 22.60792350769043, 11.82461929321289, 38.308929443359375, 7.113441467285156, -0.2349700927734375, 6.6696014404296875, 22.56818389892578], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000580.npy"}
{"epoch": 0.8767951625094482, "step": 581, "batch_size": 64, "mean": 9.075504302978516, "std": 15.439260482788086, "min": -22.61644172668457, "p10": -6.799765205383299, "median": 7.54132080078125, "p90": 27.195553588867202, "max": 51.41450500488281, "pos_frac": 0.75, "sample": [19.79806137084961, 18.386085510253906, 2.0024681091308594, 10.290044784545898, 7.64849853515625, 3.8003997802734375, -0.6145477294921875, -10.013328552246094, 19.2327880859375, 13.170204162597656, -21.585716247558594, 4.374950408935547, 1.4562435150146484, 10.382659912109375, 21.580421447753906, -7.586854934692383, 17.203292846679688, 4.355674743652344, 14.958675384521484, 0.8369274139404297, 38.80481719970703, 3.7408218383789062, 38.03181457519531, -2.85064697265625, 30.969131469726562, 51.41450500488281, 17.88383674621582, 4.562042236328125, 23.156112670898438, 17.00152587890625, -4.1117095947265625, 16.180992126464844, 3.1993331909179688, 21.558212280273438, 7.2113494873046875, 13.530826568603516, 7.43414306640625, -12.368217468261719, 50.722747802734375, 28.926742553710938, 2.4511795043945312, -20.061859130859375, 8.343597412109375, -2.4228515625, 12.508869171142578, -22.61644172668457, 39.347564697265625, 2.08624267578125, -4.914257049560547, -15.505626678466797, 15.378040313720703, -4.3712310791015625, 9.318641662597656, 21.251968383789062, 0.8258552551269531, 19.619232177734375, 0.6883316040039062, -4.963222503662109, -3.0408782958984375, 9.161121368408203, 0.29236602783203125, 19.524517059326172, 17.28055763244629, -4.0247650146484375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000581.npy"}
{"epoch": 0.8783068783068783, "step": 582, "batch_size": 64, "mean": 10.306524276733398, "std": 13.325193405151367, "min": -13.81887435913086, "p10": -7.601196289062499, "median": 9.478034973144531, "p90": 26.704394149780274, "max": 47.36456298828125, "pos_frac": 0.796875, "sample": [22.800827026367188, 46.27922058105469, 7.64459228515625, 15.621726989746094, 2.1008682250976562, 1.6351547241210938, 17.714202880859375, 9.304244995117188, 19.934932708740234, 25.81110382080078, 9.353897094726562, 10.595420837402344, 12.326934814453125, 47.36456298828125, -0.7262115478515625, 22.56587028503418, 11.186935424804688, -0.338958740234375, 6.914039611816406, 18.903396606445312, 5.546327590942383, 10.227638244628906, 9.892276763916016, 27.553482055664062, 18.373680114746094, 16.219520568847656, 12.023828506469727, 11.263982772827148, -12.083393096923828, -13.262941360473633, 1.779611587524414, 4.474151611328125, 23.24785614013672, -7.8633575439453125, -0.21654319763183594, 26.74856948852539, 34.86930847167969, 14.874805450439453, 24.461708068847656, 3.9384841918945312, 26.601318359375, 9.6021728515625, 6.379243850708008, 8.724649429321289, -4.610071182250977, 0.40808868408203125, -10.699165344238281, 8.002761840820312, 11.699317932128906, -13.81887435913086, 4.4317779541015625, 5.058502197265625, -0.654510498046875, 1.211761474609375, -11.14799690246582, 11.188232421875, 24.15319061279297, -10.196342468261719, 30.773813247680664, -6.9894866943359375, 2.7818603515625, 3.083812713623047, 16.01923370361328, 28.55250358581543], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000582.npy"}
{"epoch": 0.8798185941043084, "step": 583, "batch_size": 64, "mean": 10.24338150024414, "std": 16.199644088745117, "min": -29.17729377746582, "p10": -7.694673919677734, "median": 9.143747329711914, "p90": 32.9731378555298, "max": 51.16255569458008, "pos_frac": 0.71875, "sample": [15.54891586303711, 3.293537139892578, -3.7249603271484375, -6.393703460693359, 47.68425369262695, 8.276081085205078, -7.3007049560546875, -7.863517761230469, -13.466514587402344, -0.1524658203125, 14.713165283203125, -12.79305648803711, -0.5289955139160156, 17.416290283203125, 5.885673522949219, -6.601604461669922, 15.20730209350586, -3.226530075073242, 51.16255569458008, 34.159332275390625, 18.32428550720215, -23.098052978515625, 7.7209625244140625, -2.0817184448242188, 14.6898193359375, 15.288095474243164, 18.646209716796875, -13.642364501953125, 6.408878326416016, 24.09642791748047, 14.655956268310547, 10.649236679077148, 14.035720825195312, 3.9260101318359375, 14.131256103515625, 36.96968078613281, -8.113861083984375, -5.601951599121094, 14.860877990722656, 37.21590042114258, 5.240114212036133, 37.723175048828125, -29.17729377746582, 27.65362548828125, 4.859563827514648, 23.857666015625, 16.67435073852539, 1.5650100708007812, 12.74847412109375, -4.573390960693359, 19.591094970703125, 5.8133544921875, 14.601598739624023, 4.592355728149414, 16.543582916259766, 8.04693603515625, 1.4619522094726562, 18.811004638671875, -0.2187023162841797, 30.205350875854492, 46.473793029785156, 27.561805725097656, 10.01141357421875, 5.1331939697265625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000583.npy"}
{"epoch": 0.8813303099017384, "step": 584, "batch_size": 64, "mean": 10.626161575317383, "std": 16.814287185668945, "min": -27.08477020263672, "p10": -7.348778533935547, "median": 10.455429077148438, "p90": 29.364742469787597, "max": 54.70103454589844, "pos_frac": 0.703125, "sample": [44.365509033203125, 11.8294677734375, 15.271120071411133, -3.7656822204589844, 11.793052673339844, -0.11272621154785156, 38.888580322265625, 17.436004638671875, 18.02511215209961, 21.731521606445312, 12.895889282226562, -4.308357238769531, -5.39204216003418, -4.448518753051758, -7.387901306152344, 18.48371124267578, 23.41923713684082, 29.148698806762695, 23.923009872436523, -7.568988800048828, 23.167861938476562, 26.864898681640625, -2.457305908203125, 28.65399169921875, 23.759368896484375, 19.091659545898438, 4.560049057006836, 3.8761367797851562, 4.32708740234375, 4.5943603515625, 20.163543701171875, 2.7855987548828125, 16.46497344970703, 4.034759521484375, -6.411233901977539, 31.045570373535156, -7.2574920654296875, -5.2335357666015625, -0.26227569580078125, 2.7113208770751953, 10.34885025024414, 37.311119079589844, -10.812477111816406, 14.52407455444336, 26.242721557617188, 22.6473388671875, -0.9921455383300781, 12.048704147338867, 9.793037414550781, -11.834220886230469, -27.08477020263672, 10.562007904052734, -6.2909088134765625, 29.457332611083984, 2.2284469604492188, 54.70103454589844, 53.55750274658203, 2.0307254791259766, 17.75731086730957, 3.242218017578125, 6.033454895019531, 18.803707122802734, -20.663909912109375, -22.24285316467285], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000584.npy"}
{"epoch": 0.8828420256991686, "step": 585, "batch_size": 64, "mean": 9.082149505615234, "std": 14.285852432250977, "min": -34.454139709472656, "p10": -4.353546142578124, "median": 7.754067420959473, "p90": 27.58816833496094, "max": 47.28668212890625, "pos_frac": 0.8125, "sample": [9.719120025634766, 26.181930541992188, 2.9593353271484375, 11.195655822753906, 10.776878356933594, 5.803596496582031, 4.564796447753906, 24.40062713623047, 5.043922424316406, 3.077585220336914, -26.465892791748047, 2.376811981201172, 27.9658203125, 1.4696807861328125, 1.1738452911376953, 21.959747314453125, -15.608007431030273, 4.815116882324219, 20.34881591796875, 14.151603698730469, 30.246299743652344, 2.3881969451904297, 3.0331172943115234, -14.051544189453125, 10.762796401977539, 10.214109420776367, 2.8246841430664062, -4.891441345214844, 15.12396240234375, 5.460773468017578, -2.050922393798828, 2.111663818359375, 27.205799102783203, -0.7603244781494141, -10.670852661132812, 8.170822143554688, 2.93475341796875, 23.402225494384766, 24.32181167602539, 21.922008514404297, 6.756561279296875, 15.164520263671875, 6.881355285644531, -12.608587265014648, -34.454139709472656, 38.60390090942383, 32.29433822631836, 27.75204086303711, -0.7915191650390625, -1.2927398681640625, 7.337312698364258, 11.233161926269531, 20.838966369628906, 10.513345718383789, 2.225109100341797, -3.0984573364257812, 9.250244140625, 29.827590942382812, 47.28668212890625, 12.831001281738281, 4.566322326660156, 8.699455261230469, 13.480964660644531, 14.351234436035156], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000585.npy"}
{"epoch": 0.8843537414965986, "step": 586, "batch_size": 64, "mean": 11.758225440979004, "std": 13.110422134399414, "min": -16.494632720947266, "p10": -2.0999750137329096, "median": 11.530052185058594, "p90": 29.307632446289066, "max": 48.79499053955078, "pos_frac": 0.828125, "sample": [29.719436645507812, 8.786819458007812, 18.658065795898438, 10.260589599609375, 3.88861083984375, 31.07025909423828, 12.547662734985352, 1.4497222900390625, 7.228080749511719, 13.93297004699707, 18.098398208618164, -7.314796447753906, -0.2398223876953125, 36.137420654296875, 27.477149963378906, -2.2945995330810547, 8.63316535949707, 25.686904907226562, 5.60582160949707, 20.213171005249023, 2.934558868408203, 18.673595428466797, 11.105010986328125, 25.230472564697266, 2.8177719116210938, -1.3048171997070312, -1.6458511352539062, 6.260124206542969, 20.62722396850586, 18.341110229492188, 31.023902893066406, 12.413257598876953, 1.2304000854492188, 19.500293731689453, 13.716110229492188, 4.536800384521484, -16.494632720947266, 48.79499053955078, -2.9811744689941406, 4.88580322265625, 4.697324752807617, 8.034049987792969, 4.2357177734375, 18.98351287841797, 14.357498168945312, -1.1538887023925781, -4.987510681152344, 11.955093383789062, 0.28370094299316406, 17.192657470703125, 28.346755981445312, 6.7136688232421875, 14.228073120117188, 4.507598876953125, 35.7416877746582, 2.8971786499023438, -15.055900573730469, 38.440704345703125, -15.319137573242188, 23.070587158203125, 20.834388732910156, 14.527362823486328, 18.567447662353516, 12.217880249023438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000586.npy"}
{"epoch": 0.8858654572940288, "step": 587, "batch_size": 64, "mean": 10.757262229919434, "std": 17.104738235473633, "min": -40.16162109375, "p10": -8.35250473022461, "median": 9.765506744384766, "p90": 35.074856185913085, "max": 50.95374298095703, "pos_frac": 0.75, "sample": [3.6793785095214844, -22.804550170898438, 4.663078308105469, 10.641878128051758, 20.21471405029297, -9.487747192382812, 3.807849884033203, -7.591552734375, 24.25159454345703, 13.245025634765625, 3.5104141235351562, 13.111808776855469, -8.678627014160156, -5.861419677734375, 1.3942489624023438, 15.543380737304688, 15.361764907836914, -1.0594348907470703, 6.162384033203125, 50.95374298095703, 24.921157836914062, 11.241966247558594, -1.5141334533691406, 37.500755310058594, 10.66143798828125, 17.29092788696289, 45.349491119384766, 0.09717559814453125, -40.16162109375, -1.7758216857910156, 22.794498443603516, 9.736209869384766, 23.501815795898438, 38.29899597167969, 7.309648513793945, 6.317909240722656, 15.166946411132812, -23.3150634765625, -2.891265869140625, 22.85626220703125, 32.829689025878906, -11.152523040771484, 23.28386688232422, 13.305852890014648, -0.2883167266845703, 13.242027282714844, 8.72601318359375, 23.741924285888672, 19.871170043945312, 8.433269500732422, 34.850250244140625, -15.313739776611328, 26.067115783691406, 1.6523818969726562, 9.794803619384766, 40.763389587402344, -0.23642730712890625, -3.8858299255371094, 0.057666778564453125, 8.649032592773438, 16.701316833496094, 41.32102966308594, 35.17111587524414, 6.4344635009765625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000587.npy"}
{"epoch": 0.8873771730914588, "step": 588, "batch_size": 64, "mean": 10.493810653686523, "std": 14.082263946533203, "min": -20.518600463867188, "p10": -5.189046669006347, "median": 10.145820617675781, "p90": 27.828861236572273, "max": 45.9775390625, "pos_frac": 0.765625, "sample": [13.842185974121094, -5.49566650390625, -1.1861648559570312, 13.068260192871094, -3.4456329345703125, 18.228858947753906, 28.54039764404297, 11.151290893554688, 20.517051696777344, 26.168609619140625, 44.01924133300781, 6.401391983032227, 8.12276840209961, 6.220678329467773, 6.824394226074219, 15.980480194091797, 44.57579803466797, 45.9775390625, 8.315790176391602, 35.04399108886719, 11.598716735839844, 15.24397087097168, -9.097698211669922, 39.655975341796875, 14.571727752685547, 28.59711456298828, 10.071651458740234, -2.8527069091796875, -2.7641124725341797, -4.1309814453125, 2.753580093383789, 19.380416870117188, -16.392070770263672, 3.6614227294921875, 16.491127014160156, 19.58647918701172, -4.473600387573242, 10.153076171875, 13.217344284057617, 3.0385665893554688, -11.814125061035156, -3.965545654296875, 9.844146728515625, 25.278827667236328, 11.158031463623047, 10.138565063476562, 23.174360275268555, 2.373577117919922, -0.1532135009765625, 16.552350997924805, 10.807258605957031, 4.311553955078125, 2.063213348388672, 11.078117370605469, 4.1147003173828125, 5.230804443359375, -5.872699737548828, 13.442352294921875, 6.296348571777344, -20.518600463867188, 16.16840171813965, 21.8861083984375, 22.601608276367188, -13.773513793945312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000588.npy"}
{"epoch": 0.8888888888888888, "step": 589, "batch_size": 64, "mean": 13.7318696975708, "std": 14.992883682250977, "min": -22.284027099609375, "p10": -1.9399032592773433, "median": 11.011817932128906, "p90": 37.130777740478514, "max": 62.866943359375, "pos_frac": 0.859375, "sample": [2.200956344604492, 23.63158416748047, -10.002433776855469, 37.32427215576172, 12.269584655761719, 5.53150749206543, 6.347679138183594, 10.794387817382812, 4.1073455810546875, 1.7200584411621094, -2.0836639404296875, 1.424661636352539, 46.16075134277344, 11.42104721069336, 27.311111450195312, 13.030723571777344, 8.150470733642578, 25.153705596923828, 39.5638542175293, 22.181182861328125, 5.9415283203125, 19.509429931640625, 22.460519790649414, 0.9940338134765625, 36.679290771484375, 10.611495971679688, -5.493827819824219, 17.8165283203125, 28.1151065826416, 15.513261795043945, 15.33194351196289, -1.604461669921875, 37.357582092285156, 7.936767578125, -2.9814682006835938, 11.637977600097656, 8.894447326660156, 17.081764221191406, 10.591232299804688, 43.00021743774414, 15.564582824707031, 16.685705184936523, 22.523483276367188, 7.454311370849609, 14.24978256225586, 11.229248046875, 6.47784423828125, -6.255970001220703, -0.21071434020996094, 7.233695983886719, -22.284027099609375, 14.299690246582031, 44.78233337402344, -5.4741058349609375, 5.576637268066406, 21.435178756713867, 32.592369079589844, 6.835197448730469, 9.093765258789062, 4.094306945800781, 3.859527587890625, 8.144201278686523, 62.866943359375, 12.433557510375977], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000589.npy"}
{"epoch": 0.890400604686319, "step": 590, "batch_size": 64, "mean": 13.212362289428711, "std": 16.263023376464844, "min": -20.472946166992188, "p10": -4.500745010375976, "median": 11.997842788696289, "p90": 37.27719497680664, "max": 48.68756866455078, "pos_frac": 0.78125, "sample": [30.369125366210938, 6.077583312988281, 12.161949157714844, -3.1161117553710938, 17.727989196777344, 39.52583312988281, 11.09356689453125, -20.472946166992188, 3.2789134979248047, 1.603515625, 7.161163330078125, 3.1565723419189453, 0.3599700927734375, 3.3418960571289062, -18.639923095703125, 9.958860397338867, 18.273902893066406, 27.937744140625, -15.861438751220703, -1.9919281005859375, -3.1323585510253906, 13.80572509765625, 1.1111602783203125, 35.82499694824219, 38.99407958984375, -1.0382308959960938, 2.0052947998046875, 8.498764038085938, 46.33009338378906, 20.281600952148438, -2.0112686157226562, 24.897552490234375, 5.665643692016602, 22.657958984375, 25.13812255859375, 20.7469482421875, 20.50880241394043, 6.5151214599609375, 3.18841552734375, 18.58123016357422, 38.69022750854492, 12.208074569702148, 37.553192138671875, 20.7901611328125, 36.01239776611328, 11.833736419677734, 17.535186767578125, 18.95123291015625, -7.9961395263671875, 16.017166137695312, -9.54232406616211, 30.318918228149414, 28.30013656616211, -1.7227783203125, 7.2334136962890625, 36.633201599121094, 14.34504508972168, 10.314056396484375, 39.171478271484375, -4.924221038818359, 20.854633331298828, -3.51263427734375, 48.68756866455078, -12.676471710205078], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000590.npy"}
{"epoch": 0.891912320483749, "step": 591, "batch_size": 64, "mean": 12.337522506713867, "std": 16.69988250732422, "min": -27.773765563964844, "p10": -8.326468086242675, "median": 12.679130554199219, "p90": 33.19220771789551, "max": 48.85679626464844, "pos_frac": 0.734375, "sample": [-0.144317626953125, 40.37003707885742, 15.29005241394043, 26.67940330505371, 17.167930603027344, 12.8763427734375, 33.56135177612305, 11.439804077148438, 21.59063720703125, 23.795608520507812, 8.45745849609375, -2.17303466796875, 21.438278198242188, -13.324935913085938, 9.511661529541016, -4.876186370849609, 17.034652709960938, 6.0675201416015625, 14.807647705078125, 4.777444839477539, 18.564117431640625, 12.323591232299805, 31.86944580078125, -7.9857940673828125, 26.271148681640625, -27.773765563964844, 20.131118774414062, 12.481918334960938, 5.701763153076172, 6.234405517578125, -15.046829223632812, -10.118551254272461, 36.23542785644531, 48.85679626464844, 1.4220218658447266, 24.903228759765625, -17.086597442626953, 7.024879455566406, 29.32281494140625, 0.618255615234375, -2.1020431518554688, -6.657506942749023, 11.032249450683594, 25.365447998046875, -4.15643310546875, 43.26237487792969, -8.472471237182617, -18.451820373535156, -6.333595275878906, 32.33087158203125, -5.3023223876953125, -0.11347579956054688, 35.254669189453125, 17.876876831054688, 7.533123016357422, 26.809131622314453, 29.38762664794922, 15.447990417480469, 15.58319091796875, 18.919876098632812, 42.60505676269531, 15.84979248046875, 4.127756118774414, 31.508296966552734], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000591.npy"}
{"epoch": 0.8934240362811792, "step": 592, "batch_size": 64, "mean": 11.08197021484375, "std": 15.1878023147583, "min": -12.45005989074707, "p10": -4.468030548095703, "median": 7.23394775390625, "p90": 37.392023468017584, "max": 54.557464599609375, "pos_frac": 0.78125, "sample": [6.863006591796875, 42.80293273925781, 30.30093765258789, 9.572023391723633, 19.01227569580078, 43.791324615478516, 26.27672576904297, 4.014293670654297, 3.5913314819335938, 7.6713104248046875, -0.9950180053710938, 12.459049224853516, 7.9821319580078125, 2.1587448120117188, 10.6025390625, 25.42809295654297, 43.36859893798828, 11.73355484008789, 1.1888885498046875, 4.235767364501953, -7.3185882568359375, 25.447540283203125, 9.015296936035156, 36.738128662109375, 4.037166595458984, 13.530960083007812, 8.332511901855469, 37.672264099121094, 21.492660522460938, 7.228485107421875, -6.100669860839844, 12.857946395874023, 4.250480651855469, 9.266471862792969, 9.462989807128906, 5.603054046630859, -6.3914337158203125, 24.35913848876953, -12.45005989074707, -3.8518619537353516, -1.022226333618164, 18.299116134643555, -2.07354736328125, -0.6842117309570312, -10.673675537109375, 20.21385955810547, -4.5286865234375, 40.22294616699219, 43.844444274902344, -4.326499938964844, 54.557464599609375, -1.1210975646972656, 1.7093772888183594, 2.37628173828125, 4.549568176269531, 3.3332080841064453, 5.268707275390625, 5.517387390136719, -11.78548812866211, 8.329704284667969, 4.058502197265625, 4.082061767578125, 7.239410400390625, 16.61847686767578], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000592.npy"}
{"epoch": 0.8949357520786092, "step": 593, "batch_size": 64, "mean": 10.850542068481445, "std": 17.05246353149414, "min": -26.597904205322266, "p10": -10.11894073486328, "median": 10.401998519897461, "p90": 33.783425140380864, "max": 46.653343200683594, "pos_frac": 0.71875, "sample": [21.45916748046875, 4.389240264892578, 23.9798583984375, -6.8829803466796875, 31.171293258666992, -0.6988601684570312, 5.5348663330078125, -16.352203369140625, 42.82851028442383, 7.481559753417969, 37.575992584228516, -0.19923019409179688, 31.326705932617188, 26.42292022705078, 9.17578125, -11.11672592163086, 10.455390930175781, 19.2884521484375, 36.92460632324219, 15.33843994140625, 4.20051383972168, 17.915191650390625, 34.488067626953125, 25.066261291503906, 13.71173095703125, 10.34860610961914, -4.83843994140625, 24.987201690673828, 11.129974365234375, -26.597904205322266, 20.566551208496094, -14.444961547851562, -3.7113494873046875, 19.541698455810547, 14.569635391235352, 21.05864143371582, -3.6209182739257812, -4.782630920410156, 23.403244018554688, -4.76702880859375, -11.172737121582031, -4.4617919921875, 32.139259338378906, -21.41466522216797, 43.95512390136719, 19.341373443603516, 42.51963806152344, 10.597383499145508, -7.790775299072266, 46.653343200683594, 13.166229248046875, 0.10156059265136719, 9.480934143066406, 10.267879486083984, 13.348777770996094, -24.55316162109375, 4.593099594116211, 24.686904907226562, 1.2236061096191406, -4.526084899902344, 2.38604736328125, 3.96343994140625, 5.4239044189453125, 18.17853546142578], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000593.npy"}
{"epoch": 0.8964474678760394, "step": 594, "batch_size": 64, "mean": 8.84228515625, "std": 13.423866271972656, "min": -23.468887329101562, "p10": -6.987490844726563, "median": 6.889852523803711, "p90": 26.668374633789064, "max": 45.27313995361328, "pos_frac": 0.703125, "sample": [5.728435516357422, 15.554580688476562, -8.47921371459961, 18.15856170654297, 20.939922332763672, 21.714744567871094, 6.910266876220703, 5.8285980224609375, 20.19225311279297, -7.205436706542969, -2.18310546875, -1.5371589660644531, -7.091156005859375, 10.035964965820312, 6.716852188110352, -1.885223388671875, -5.832061767578125, -2.556621551513672, 20.277923583984375, 0.5226516723632812, 14.59531021118164, 7.566028594970703, 3.1537647247314453, 10.917675018310547, 37.25923156738281, 13.522705078125, 11.255332946777344, -7.83673095703125, 16.004703521728516, -2.1631698608398438, 45.27313995361328, 16.775772094726562, 33.53514099121094, 26.877002716064453, 0.4530487060546875, -2.805126190185547, -6.74560546875, 11.66946792602539, 26.181575775146484, -11.425308227539062, 15.281654357910156, 1.4458484649658203, 22.331666946411133, 32.37201690673828, -2.2425765991210938, 10.43510627746582, 5.809125900268555, 22.09625244140625, -2.4696712493896484, 20.187088012695312, 4.98675537109375, -23.468887329101562, 26.982574462890625, 33.476539611816406, 18.212493896484375, 14.093303680419922, 3.5598621368408203, 3.573486328125, 6.869438171386719, -2.0770797729492188, 8.764427185058594, -14.219314575195312, 1.4968032836914062, -1.46539306640625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000594.npy"}
{"epoch": 0.8979591836734694, "step": 595, "batch_size": 64, "mean": 14.363414764404297, "std": 17.364608764648438, "min": -36.73572540283203, "p10": -5.7735157012939435, "median": 13.025444030761719, "p90": 35.240104293823244, "max": 62.176910400390625, "pos_frac": 0.84375, "sample": [-1.8652267456054688, 23.942367553710938, 37.384307861328125, 12.851577758789062, 13.236915588378906, 10.904245376586914, 15.555381774902344, 8.859012603759766, 0.9769134521484375, 32.16048049926758, 21.41998291015625, 7.773406982421875, 5.925975799560547, -14.330921173095703, 36.17133331298828, 3.159637451171875, 7.8245391845703125, -3.6771926879882812, 26.363521575927734, 5.980829238891602, 32.01531219482422, 11.604537963867188, 33.52985763549805, 12.428936004638672, 39.70458984375, 3.7147369384765625, 7.826225280761719, 13.199310302734375, 13.91168212890625, 21.84274673461914, 6.3424835205078125, -9.688186645507812, 7.610561370849609, -11.363555908203125, 24.876564025878906, 33.928070068359375, 35.61614227294922, -8.416397094726562, -36.73572540283203, 54.283164978027344, 62.176910400390625, 34.3626823425293, 14.63409423828125, 10.783462524414062, -6.671939849853516, 32.92188262939453, 14.36627197265625, 22.36639404296875, 20.35681915283203, 21.325931549072266, -1.2052326202392578, 26.217857360839844, 2.9520702362060547, 4.460487365722656, 4.723121643066406, 18.019962310791016, 41.45158386230469, 19.500812530517578, 0.34832763671875, 0.0110931396484375, 0.9577541351318359, -17.217308044433594, 21.39700698852539, 34.17041778564453], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000595.npy"}
{"epoch": 0.8994708994708994, "step": 596, "batch_size": 64, "mean": 7.262219429016113, "std": 15.262001991271973, "min": -22.552839279174805, "p10": -10.281627655029297, "median": 5.050606727600098, "p90": 31.899967575073248, "max": 44.07817077636719, "pos_frac": 0.6875, "sample": [-10.380287170410156, 8.439010620117188, 35.66998291015625, 10.01620101928711, 12.781478881835938, -6.846134185791016, -1.7148590087890625, 5.031562805175781, 0.9113998413085938, 25.440628051757812, -11.744430541992188, 11.607837677001953, -10.904014587402344, -2.6578369140625, 32.555145263671875, 41.900413513183594, 15.791923522949219, 4.495494842529297, -12.048473358154297, 5.069650650024414, -10.051422119140625, 21.655349731445312, 4.4542999267578125, 12.256673812866211, 44.07817077636719, 30.371219635009766, 14.601646423339844, -10.85997200012207, 4.811241149902344, -4.2681884765625, 0.07892227172851562, 5.79730224609375, 5.1977996826171875, 0.4479389190673828, 7.347663879394531, -6.340442657470703, -8.334281921386719, 41.85710144042969, -1.0580940246582031, -17.17621612548828, 9.649085998535156, 9.549827575683594, 0.4574470520019531, -22.552839279174805, -2.403472900390625, 7.359104156494141, 0.5312271118164062, -6.67112922668457, 37.91566467285156, 27.248687744140625, 29.240243911743164, 2.6490097045898438, -1.46533203125, 6.634895324707031, 1.5458908081054688, 38.61791229248047, 0.2416534423828125, 13.696823120117188, -7.386323928833008, 5.331193923950195, 13.54775619506836, 6.3167572021484375, 7.9556884765625, -1.509134292602539], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000596.npy"}
{"epoch": 0.9009826152683296, "step": 597, "batch_size": 64, "mean": 10.611440658569336, "std": 14.756790161132812, "min": -19.169273376464844, "p10": -5.917714881896972, "median": 6.989833831787109, "p90": 30.518646240234375, "max": 48.813438415527344, "pos_frac": 0.765625, "sample": [-7.31134033203125, -19.169273376464844, -1.8165931701660156, 2.2005462646484375, 30.889175415039062, 7.284221649169922, 10.276386260986328, -9.297882080078125, -5.229024887084961, 39.6868896484375, 2.5390052795410156, 27.161026000976562, 4.275629043579102, 8.632835388183594, -4.29510498046875, 6.534051895141602, 47.616607666015625, 25.388851165771484, 17.709171295166016, -7.446786880493164, 14.911603927612305, 7.74005126953125, -10.658798217773438, 0.5995559692382812, 22.788833618164062, 14.321155548095703, 6.462850570678711, -3.0479373931884766, 17.32714080810547, 0.18059539794921875, 10.813070297241211, 10.023483276367188, 38.59349060058594, 17.468055725097656, 25.508010864257812, 6.695446014404297, -3.0805740356445312, 5.751104354858398, 13.434549331665039, 35.38771057128906, -0.6507682800292969, -6.212867736816406, 3.5160140991210938, 1.2647972106933594, -2.427217483520508, 3.804485321044922, 0.9527225494384766, 2.5186023712158203, 27.785194396972656, 10.493911743164062, 48.813438415527344, 8.756561279296875, 22.41217803955078, -3.4464263916015625, -9.844062805175781, 19.077781677246094, 3.414011001586914, 5.695526123046875, 30.50860595703125, 30.52294921875, 3.018697738647461, 27.13295555114746, 24.736434936523438, 20.440900802612305], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000597.npy"}
{"epoch": 0.9024943310657596, "step": 598, "batch_size": 64, "mean": 7.703800678253174, "std": 15.023340225219727, "min": -22.21297836303711, "p10": -11.245330810546873, "median": 4.597350120544434, "p90": 28.14963302612305, "max": 47.22474670410156, "pos_frac": 0.6875, "sample": [0.11139106750488281, -8.522071838378906, 13.005073547363281, 7.4738311767578125, 19.627029418945312, -0.2663612365722656, 9.617408752441406, 29.16573715209961, 14.507379531860352, -22.21297836303711, 0.2622203826904297, 4.749719619750977, 1.2169303894042969, -12.510988235473633, 47.22474670410156, -14.163543701171875, 39.693233489990234, -9.015228271484375, 21.820697784423828, 17.524429321289062, -4.288511276245117, -16.610698699951172, -12.142410278320312, 16.579483032226562, 1.7827835083007812, 5.872102737426758, 4.444980621337891, 19.32293701171875, -9.152145385742188, 0.5044746398925781, -5.1133270263671875, 0.16570281982421875, 10.316141128540039, 8.55609130859375, 32.259002685546875, -0.8564033508300781, 21.5816650390625, 0.11535263061523438, 28.35645294189453, -2.68280029296875, 21.39666748046875, 14.055381774902344, 0.9880905151367188, 43.14823913574219, -0.5290908813476562, 7.378562927246094, 10.033599853515625, -1.928152084350586, 15.659402847290039, 16.58177947998047, -1.986846923828125, -12.203697204589844, 2.2072372436523438, 8.71630859375, 3.9039783477783203, -13.395805358886719, 36.62169647216797, 13.735946655273438, -0.523101806640625, 3.2745132446289062, 22.9525146484375, 27.66705322265625, -7.376289367675781, 24.345748901367188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000598.npy"}
{"epoch": 0.9040060468631897, "step": 599, "batch_size": 64, "mean": 9.143360137939453, "std": 15.048900604248047, "min": -20.917625427246094, "p10": -7.063841819763183, "median": 6.364513397216797, "p90": 27.8472396850586, "max": 49.83259582519531, "pos_frac": 0.734375, "sample": [-1.490966796875, -1.998220443725586, 7.00468635559082, 10.424140930175781, -3.3482742309570312, 8.406946182250977, -1.0314674377441406, 35.05723571777344, -9.738616943359375, -2.003570556640625, -7.7890777587890625, -5.836771011352539, 0.6951408386230469, 13.677330017089844, 25.41730499267578, -3.6792984008789062, -20.272781372070312, -2.7212085723876953, 6.066596984863281, 1.8937835693359375, 25.63753318786621, 13.790529251098633, 5.226264953613281, -20.917625427246094, 3.742513656616211, 0.7651214599609375, 1.8097686767578125, 35.729766845703125, -14.51068115234375, 5.289039611816406, 38.22407531738281, 12.198974609375, -7.589729309082031, -4.6870574951171875, 6.396507263183594, 5.715599060058594, 2.574054718017578, 7.214851379394531, 4.10582160949707, 29.053619384765625, 25.35476303100586, 3.094715118408203, 1.0866889953613281, 11.645891189575195, 26.645572662353516, 28.362239837646484, 25.727333068847656, 1.1985321044921875, -5.469387054443359, 12.80160140991211, 47.926849365234375, 21.98583984375, 12.17047119140625, 11.037860870361328, 26.518997192382812, 13.038919448852539, 49.83259582519531, 6.3425750732421875, 23.196685791015625, 6.386451721191406, -12.978164672851562, 23.11900520324707, 19.705825805664062, 7.941307067871094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000599.npy"}
{"epoch": 0.9055177626606198, "step": 600, "batch_size": 64, "mean": 9.431896209716797, "std": 14.972805976867676, "min": -30.027790069580078, "p10": -4.807323265075683, "median": 7.480072021484375, "p90": 31.601720809936527, "max": 43.60912322998047, "pos_frac": 0.75, "sample": [16.220169067382812, 6.961429595947266, -7.625612258911133, 2.372344970703125, 30.413589477539062, 7.025657653808594, 11.802345275878906, -7.554744720458984, 34.324195861816406, 34.17011260986328, -4.022422790527344, 12.108631134033203, 24.375259399414062, 34.03269958496094, -4.1712493896484375, -2.8792171478271484, -4.1593475341796875, -4.19792366027832, 0.31463623046875, 24.567062377929688, 2.4688682556152344, 7.6563568115234375, 23.022445678710938, 16.16704559326172, -3.59893798828125, 23.081771850585938, 7.28741455078125, 35.62419891357422, -1.8934097290039062, 15.799232482910156, 28.937225341796875, 8.721057891845703, -19.56954574584961, 5.35919189453125, 12.569938659667969, -30.027790069580078, -0.9210968017578125, 10.45611572265625, 2.1814498901367188, 28.981842041015625, 1.1313934326171875, 13.489837646484375, 13.026508331298828, 17.836883544921875, -4.302469253540039, 14.519556045532227, -14.85736083984375, -5.023689270019531, -17.91724395751953, 32.11091995239258, 3.141021728515625, 8.560997009277344, 3.4077396392822266, 0.58697509765625, 9.34127426147461, 7.3037872314453125, 19.357513427734375, 4.302188873291016, 12.2861328125, 43.58759689331055, 5.4692230224609375, 5.856006622314453, 10.436447143554688, 43.60912322998047], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000600.npy"}
{"epoch": 0.9070294784580499, "step": 601, "batch_size": 64, "mean": 12.440435409545898, "std": 16.000980377197266, "min": -22.17723846435547, "p10": -6.296835708618164, "median": 9.875864028930664, "p90": 32.67057304382325, "max": 52.14064025878906, "pos_frac": 0.734375, "sample": [21.362991333007812, 7.4076385498046875, 21.033309936523438, -7.919841766357422, 30.789344787597656, 25.436969757080078, 50.67021179199219, -0.535369873046875, 5.4188232421875, -6.519252777099609, 52.14064025878906, -22.17723846435547, 29.85045623779297, 12.087821960449219, -0.6350879669189453, 7.4109039306640625, 26.044015884399414, 24.22894287109375, -9.115333557128906, 16.273056030273438, 34.679969787597656, 5.137336730957031, 0.39295196533203125, 28.99652099609375, 49.01628112792969, 2.5981903076171875, -1.7053184509277344, -0.8197116851806641, 21.763336181640625, -3.2353134155273438, 4.757240295410156, 22.310306549072266, 11.37530517578125, 1.9668502807617188, 0.7187938690185547, 30.828899383544922, 6.304176330566406, 14.425392150878906, 8.160293579101562, 10.45199966430664, -7.535614013671875, 10.415283203125, 3.144977569580078, 29.668651580810547, 5.62506103515625, 7.055082321166992, 10.837150573730469, -5.777862548828125, -4.116914749145508, 9.336444854736328, 17.889732360839844, 34.77851867675781, -3.344064712524414, -0.41594696044921875, 13.51397705078125, 24.579532623291016, 35.1566162109375, 20.222328186035156, 24.118118286132812, -1.315103530883789, -7.3926544189453125, 28.610912322998047, 33.459861755371094, -13.7027587890625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000601.npy"}
{"epoch": 0.90854119425548, "step": 602, "batch_size": 64, "mean": 11.356219291687012, "std": 13.638294219970703, "min": -17.085708618164062, "p10": -4.314105415344237, "median": 8.853797912597656, "p90": 31.777601623535162, "max": 42.032371520996094, "pos_frac": 0.78125, "sample": [-5.463005065917969, 24.227813720703125, 20.870834350585938, 38.624481201171875, 12.191802978515625, 10.337791442871094, 7.331146240234375, -0.1273326873779297, 16.451324462890625, 8.926197052001953, 39.11424255371094, -5.087451934814453, 25.275405883789062, 32.313926696777344, -7.251842498779297, 42.032371520996094, -17.085708618164062, 21.980735778808594, 12.128448486328125, 3.2702083587646484, 3.6115875244140625, -2.979921340942383, 0.31103515625, -4.885898590087891, 2.2773971557617188, 14.919044494628906, 4.4987945556640625, 18.490917205810547, 9.530426025390625, 9.357730865478516, -2.792705535888672, 7.556827545166016, 12.850212097167969, 7.328643798828125, 8.204730987548828, 1.8491325378417969, 8.327411651611328, 4.150470733642578, 29.3013858795166, 2.43597412109375, -1.2584495544433594, 15.250434875488281, 16.856355667114258, 32.80417251586914, -6.029624938964844, -14.897125244140625, 22.162391662597656, 5.650768280029297, 13.153465270996094, 2.581298828125, -1.3533897399902344, 7.8749542236328125, 40.77208709716797, 20.483726501464844, 6.674385070800781, 26.843952178955078, 21.3692626953125, 8.78139877319336, 37.333106994628906, 30.52617645263672, -2.3950881958007812, 20.504779815673828, 9.518653869628906, -0.8142242431640625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000602.npy"}
{"epoch": 0.91005291005291, "step": 603, "batch_size": 64, "mean": 11.105274200439453, "std": 14.364604949951172, "min": -19.026626586914062, "p10": -6.016345977783202, "median": 8.206415176391602, "p90": 30.297336959838873, "max": 54.45159912109375, "pos_frac": 0.796875, "sample": [14.7353515625, 29.221603393554688, 8.522537231445312, -2.4339752197265625, -2.8939647674560547, 16.246536254882812, 6.312705993652344, 15.281112670898438, -5.555961608886719, 0.2389068603515625, 30.758365631103516, 3.8956756591796875, 19.668190002441406, -1.9110240936279297, 5.540031433105469, 26.922401428222656, -7.632335662841797, 6.867988586425781, 2.6214828491210938, 54.45159912109375, 12.757911682128906, 6.077667236328125, 32.37066650390625, 6.366966247558594, -0.8184089660644531, 7.9429473876953125, 15.487386703491211, -6.213653564453125, 15.250839233398438, -9.846939086914062, 6.389492034912109, 7.1893463134765625, 8.962141036987305, 8.46988296508789, -14.170272827148438, 15.837821960449219, 16.05941390991211, 6.0224151611328125, 1.4309272766113281, 42.99237823486328, 9.909065246582031, 13.586761474609375, 6.234001159667969, 9.640174865722656, 23.694984436035156, 26.85138702392578, 14.79486083984375, 0.5078029632568359, 7.384864807128906, 3.021707534790039, -7.882530212402344, 4.014200210571289, 16.397825241088867, 23.393268585205078, 34.172508239746094, -0.07670211791992188, 14.358200073242188, -6.612892150878906, 28.959014892578125, 28.001785278320312, 34.236026763916016, 1.7663688659667969, 43.99537658691406, -19.026626586914062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000603.npy"}
{"epoch": 0.9115646258503401, "step": 604, "batch_size": 64, "mean": 10.216726303100586, "std": 16.70660400390625, "min": -36.71730041503906, "p10": -7.357246398925781, "median": 7.716991424560547, "p90": 37.98942794799807, "max": 46.77684783935547, "pos_frac": 0.703125, "sample": [14.784454345703125, 20.452735900878906, -1.127227783203125, -16.755477905273438, 10.96712875366211, 12.654569625854492, -0.13946151733398438, 8.766090393066406, 41.10236358642578, -8.374984741210938, -7.153076171875, -0.806976318359375, 3.4520320892333984, 18.71363067626953, -4.6356048583984375, 7.69757080078125, 25.063758850097656, 3.836538314819336, 16.64630889892578, 14.41461181640625, -23.04595184326172, -5.076446533203125, 4.06884765625, -7.4447479248046875, 15.095077514648438, -0.28658485412597656, 1.9346179962158203, 5.765495300292969, 11.817253112792969, -36.71730041503906, 11.83511734008789, -4.123161315917969, 7.736412048339844, -7.921905517578125, 40.72446823120117, -4.586112976074219, -1.53436279296875, 21.964447021484375, 46.77684783935547, 18.392562866210938, 13.040435791015625, 42.961570739746094, 46.39161682128906, 6.512607574462891, 17.534034729003906, 24.510360717773438, 5.565032958984375, 26.76811981201172, 46.385772705078125, -4.544887542724609, 33.44102478027344, 7.5846710205078125, 3.5039825439453125, 2.5659713745117188, 16.161773681640625, 18.898479461669922, 15.32735824584961, 1.608926773071289, 3.9804153442382812, 39.938743591308594, 19.04773712158203, -0.4855194091796875, -7.9438934326171875, 20.18256378173828], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000604.npy"}
{"epoch": 0.9130763416477702, "step": 605, "batch_size": 64, "mean": 9.645886421203613, "std": 14.445586204528809, "min": -19.6231689453125, "p10": -8.65339241027832, "median": 9.773843765258789, "p90": 30.409470367431645, "max": 45.71282958984375, "pos_frac": 0.75, "sample": [10.513168334960938, 1.4503250122070312, -9.708047866821289, 45.71282958984375, 19.199867248535156, 25.71337890625, -3.5892791748046875, 10.791656494140625, 14.487422943115234, -11.061286926269531, 25.139007568359375, -8.657394409179688, 4.559288024902344, -1.1794052124023438, 4.454761505126953, -8.644054412841797, 12.72393798828125, 24.8603515625, 14.064422607421875, 9.03451919555664, -3.4021453857421875, 0.2942962646484375, 14.417673110961914, -17.624969482421875, -0.274688720703125, 11.084770202636719, 34.223548889160156, 14.251426696777344, 5.432411193847656, 7.2355499267578125, 13.452468872070312, 6.556190490722656, -15.330032348632812, 35.427223205566406, 1.73431396484375, 20.128890991210938, 27.387409210205078, 20.123321533203125, 8.236801147460938, 36.94898223876953, -0.10111236572265625, 4.6882476806640625, 13.909507751464844, 12.97755241394043, 33.44313049316406, 6.362129211425781, 19.679168701171875, 29.37097930908203, 15.894294738769531, -5.485382080078125, 35.274070739746094, 12.001815795898438, 12.38327407836914, 1.5212631225585938, -19.6231689453125, 3.7347965240478516, 5.827156066894531, -18.35704803466797, 13.913528442382812, -2.8084564208984375, 6.0087738037109375, 30.854537963867188, -7.73736572265625, 13.436124801635742], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000605.npy"}
{"epoch": 0.9145880574452003, "step": 606, "batch_size": 64, "mean": 10.010250091552734, "std": 13.483731269836426, "min": -12.251136779785156, "p10": -5.565420532226562, "median": 9.091754913330078, "p90": 26.229475402832033, "max": 48.18408203125, "pos_frac": 0.6875, "sample": [-4.355827331542969, 25.446739196777344, 3.8034934997558594, 24.988685607910156, -0.7200851440429688, 0.694000244140625, 23.606712341308594, -8.42254638671875, -0.05802726745605469, 23.6355037689209, 33.21197509765625, -6.408782958984375, 21.751083374023438, 15.433212280273438, 18.34392547607422, -10.444190979003906, 12.570741653442383, 5.342903137207031, -3.1274566650390625, 9.121055603027344, 48.18408203125, -0.262786865234375, 0.4365425109863281, 10.2950439453125, -5.468547821044922, -2.7548751831054688, 14.135986328125, 0.7397327423095703, 9.038864135742188, 17.129371643066406, 7.7115325927734375, -0.24891090393066406, -12.251136779785156, 28.239791870117188, 5.2057647705078125, 3.6770572662353516, 10.1829833984375, -5.606937408447266, 13.052597045898438, 21.803115844726562, 24.02016830444336, 36.331382751464844, 13.717979431152344, 22.03018569946289, 9.062454223632812, 39.34608459472656, -2.5468292236328125, 12.556793212890625, -6.82438850402832, -1.7027320861816406, 24.414196014404297, -5.787086486816406, -5.163459777832031, 20.80731964111328, 2.3774871826171875, 17.003070831298828, 7.76153564453125, -5.340419769287109, 9.654644012451172, 33.95808410644531, 9.180122375488281, 26.56493377685547, 13.38372802734375, -1.8016395568847656], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000606.npy"}
{"epoch": 0.9160997732426304, "step": 607, "batch_size": 64, "mean": 11.56348991394043, "std": 15.677382469177246, "min": -21.30657958984375, "p10": -7.291674041748046, "median": 10.692768096923828, "p90": 29.490242576599126, "max": 57.257904052734375, "pos_frac": 0.71875, "sample": [13.008129119873047, 5.6154632568359375, -5.929365158081055, 6.1298370361328125, 14.903984069824219, 14.643867492675781, 21.801353454589844, 16.94438934326172, -21.30657958984375, -7.652091979980469, 7.659648895263672, 15.435127258300781, 49.49250793457031, 32.29170227050781, 7.410675048828125, -0.8680419921875, -6.4506988525390625, 3.8118114471435547, 20.80853271484375, 42.35327911376953, 6.66326904296875, 23.764301300048828, 24.948654174804688, 21.201000213623047, 28.07251739501953, 18.76464080810547, 5.418601989746094, 9.393096923828125, -4.103084564208984, -3.1654701232910156, 23.86028289794922, 20.20343017578125, 25.970901489257812, 12.585334777832031, -4.582355499267578, 35.04900360107422, 32.73020935058594, -2.6137733459472656, 26.52001953125, -3.8700103759765625, -9.257575988769531, 4.921686172485352, 1.7537403106689453, 13.127758026123047, 10.999267578125, -10.97955322265625, 5.670591354370117, 25.077417373657227, 57.257904052734375, 30.051677703857422, -1.8574256896972656, -9.01251220703125, 10.386268615722656, 27.96319580078125, -2.9492759704589844, 19.931140899658203, -0.666259765625, 1.4875717163085938, 11.29962158203125, 28.180227279663086, 9.991241455078125, -15.864456176757812, -9.656349182128906, 15.293354034423828], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000607.npy"}
{"epoch": 0.9176114890400605, "step": 608, "batch_size": 64, "mean": 8.535051345825195, "std": 13.496474266052246, "min": -26.49267578125, "p10": -4.974020385742188, "median": 6.127513885498047, "p90": 25.562808609008805, "max": 46.993343353271484, "pos_frac": 0.734375, "sample": [12.411767959594727, -3.0183563232421875, 3.8471946716308594, 21.96273422241211, -0.058879852294921875, 17.429595947265625, 18.8653564453125, -5.354454040527344, 7.469749450683594, -4.89068603515625, 6.9098663330078125, 5.834053039550781, 21.701705932617188, 12.99417495727539, 4.441671371459961, 14.932632446289062, 34.39741516113281, -19.531936645507812, -1.7367420196533203, -0.4355735778808594, 3.8000335693359375, -4.380916595458984, -0.4767322540283203, 46.993343353271484, 18.14022445678711, 1.5063591003417969, -7.850006103515625, 6.4209747314453125, 4.577350616455078, 32.123355865478516, 16.158233642578125, -12.783683776855469, 2.103179931640625, 12.128173828125, 0.6833343505859375, 14.435501098632812, 8.836837768554688, -0.5502700805664062, 7.3561859130859375, 3.8611526489257812, 4.291385650634766, 10.147857666015625, 5.337158203125, -0.5019187927246094, -6.7630767822265625, 15.123184204101562, -2.8730926513671875, 27.105697631835938, 44.392730712890625, 17.12153434753418, 13.68212890625, 9.1678466796875, 5.3271636962890625, 13.534503936767578, 2.866436004638672, 20.3216552734375, -26.49267578125, 37.16859436035156, 8.662918090820312, -5.009735107421875, 0.9545516967773438, 19.74383544921875, 3.8724899291992188, 27.808197021484375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000608.npy"}
{"epoch": 0.9191232048374905, "step": 609, "batch_size": 64, "mean": 12.957870483398438, "std": 13.61194896697998, "min": -17.93280029296875, "p10": -2.0078033447265624, "median": 11.766220092773438, "p90": 28.18872833251954, "max": 47.72038269042969, "pos_frac": 0.84375, "sample": [26.318405151367188, 12.874832153320312, 22.175376892089844, 6.190380096435547, 11.508514404296875, 3.4392471313476562, 17.830364227294922, 13.050537109375, -1.0115432739257812, -1.1494598388671875, -2.369211196899414, 10.197162628173828, -2.1204681396484375, 4.237581253051758, 45.77760314941406, 23.726852416992188, 32.86932373046875, 19.43680191040039, 6.70863151550293, 47.72038269042969, -11.720115661621094, 23.019332885742188, 9.589553833007812, 21.959304809570312, -1.7449188232421875, 23.34356689453125, 16.29918670654297, 15.036590576171875, -6.922719955444336, 12.02392578125, 36.07910919189453, 0.01998138427734375, 46.9693603515625, 22.58495330810547, 20.22503662109375, 20.136737823486328, 9.147991180419922, 1.669027328491211, 43.150634765625, 14.564285278320312, 7.269645690917969, -8.84743881225586, 4.768260955810547, 28.99029541015625, 21.79913330078125, -17.93280029296875, 10.693012237548828, 10.45098876953125, 16.405548095703125, 1.5612945556640625, 12.478897094726562, 10.530899047851562, 5.7308349609375, 9.189306259155273, 17.36272621154785, 1.7434730529785156, 17.395626068115234, 19.065208435058594, 18.46271514892578, -7.243442535400391, 18.974380493164062, 7.335319519042969, 9.749107360839844, 0.5285568237304688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000609.npy"}
{"epoch": 0.9206349206349206, "step": 610, "batch_size": 64, "mean": 11.85274600982666, "std": 17.29871368408203, "min": -27.178070068359375, "p10": -7.83476104736328, "median": 8.508859634399414, "p90": 34.64293556213379, "max": 62.54029846191406, "pos_frac": 0.765625, "sample": [8.034553527832031, 35.42060852050781, 29.52906036376953, 28.962066650390625, 4.298454284667969, 23.434917449951172, 4.2984619140625, 6.6441650390625, -9.20501708984375, 7.457313537597656, 6.2206268310546875, -13.158735275268555, 6.449192047119141, 32.96965026855469, 2.1295242309570312, 9.437850952148438, 34.922298431396484, -3.866535186767578, -27.178070068359375, -0.2917938232421875, 2.979888916015625, -2.4027328491210938, 9.159263610839844, 13.227733612060547, -8.62567138671875, -4.42803955078125, -8.712371826171875, 48.95659637451172, 19.39316749572754, 32.683570861816406, 5.450506210327148, 53.352294921875, 22.031219482421875, 28.633071899414062, 0.5345001220703125, 16.18037223815918, 24.582054138183594, 0.8756771087646484, 15.848751068115234, 10.023284912109375, 11.186347961425781, 9.167301177978516, 8.942726135253906, -9.532470703125, 40.43445587158203, -2.036907196044922, 2.6018333435058594, -4.137783050537109, 24.58195686340332, 33.9910888671875, 2.8627090454101562, 9.818353652954102, 9.005935668945312, 0.6025447845458984, -1.0596866607666016, 0.8940658569335938, -9.209770202636719, 19.588157653808594, 62.54029846191406, 8.41775131225586, 26.177154541015625, -5.9893035888671875, 44.877288818359375, 8.599967956542969], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000610.npy"}
{"epoch": 0.9221466364323507, "step": 611, "batch_size": 64, "mean": 7.915609359741211, "std": 13.60826587677002, "min": -19.125457763671875, "p10": -7.331627655029297, "median": 6.995113372802734, "p90": 23.083176803588874, "max": 53.47576904296875, "pos_frac": 0.71875, "sample": [21.133804321289062, 12.388898849487305, -13.753372192382812, 6.412528991699219, 32.21034240722656, 3.8750686645507812, -0.5589599609375, 15.60427474975586, -2.966930389404297, 5.459381103515625, -10.929302215576172, 25.050739288330078, 12.055143356323242, 10.98464584350586, -4.229209899902344, -1.6987133026123047, -6.590679168701172, 19.814285278320312, -18.572898864746094, 53.47576904296875, 11.75924301147461, 21.080917358398438, 8.443603515625, -7.305696487426758, -2.9379920959472656, 0.15107345581054688, 21.187301635742188, 6.856170654296875, 3.0028839111328125, 20.878326416015625, 6.703485488891602, -11.433151245117188, -6.29766845703125, 33.09435272216797, 2.1480331420898438, 6.416961669921875, -15.456302642822266, 36.61370849609375, 8.329185485839844, 9.253532409667969, 17.534637451171875, 6.393037796020508, -2.7964935302734375, 18.87371063232422, 2.0422210693359375, 4.991783142089844, 11.063850402832031, 13.444908142089844, 6.6460113525390625, 14.2666015625, 15.210227966308594, 32.10481262207031, 4.16407585144043, 13.112258911132812, 7.2763671875, 12.499748229980469, 23.895694732666016, -19.125457763671875, 11.055404663085938, 7.134056091308594, -7.342741012573242, -5.347923278808594, -3.0894412994384766, 10.938812255859375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000611.npy"}
{"epoch": 0.9236583522297808, "step": 612, "batch_size": 64, "mean": 12.111353874206543, "std": 17.129297256469727, "min": -35.91444396972656, "p10": -3.2405963897705066, "median": 11.06141185760498, "p90": 35.451517105102546, "max": 59.321632385253906, "pos_frac": 0.78125, "sample": [-1.1215934753417969, -16.182418823242188, 20.0035400390625, -2.2336654663085938, -1.5407257080078125, 5.011566162109375, 20.760581970214844, 17.341995239257812, -1.0165786743164062, 0.41635894775390625, 27.539566040039062, 23.7430419921875, 18.631183624267578, 5.3727264404296875, 8.348724365234375, 12.915534973144531, 13.075035095214844, 7.374519348144531, 4.8773345947265625, 13.362388610839844, 16.757404327392578, -22.525177001953125, 15.456596374511719, 4.517631530761719, -1.86083984375, 5.427608489990234, 44.63525390625, -5.450004577636719, -3.672138214111328, 4.111274719238281, 3.9444732666015625, 36.31169509887695, 0.4560737609863281, -0.3957386016845703, 0.16869354248046875, 33.51436996459961, 8.933738708496094, 27.13005828857422, 55.76301574707031, -19.40216064453125, 41.42743682861328, 6.807949066162109, 28.643966674804688, -1.052032470703125, 31.237808227539062, 36.28172302246094, 9.951957702636719, -3.931978225708008, 23.541046142578125, 59.321632385253906, 5.841865539550781, 6.711357116699219, 39.967193603515625, 17.044830322265625, 12.170866012573242, 23.611968994140625, 13.178779602050781, 1.1938152313232422, 14.966606140136719, -35.91444396972656, 14.952856063842773, 21.6258544921875, 13.799310684204102, 13.245346069335938], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000612.npy"}
{"epoch": 0.9251700680272109, "step": 613, "batch_size": 64, "mean": 14.422100067138672, "std": 16.680994033813477, "min": -21.61144256591797, "p10": -3.7262153625488272, "median": 11.86203384399414, "p90": 39.83320541381836, "max": 54.586448669433594, "pos_frac": 0.84375, "sample": [-4.2497406005859375, 11.612716674804688, 0.3635749816894531, 13.134239196777344, 17.05234718322754, 18.323020935058594, 18.525054931640625, 9.474311828613281, -8.285331726074219, -1.0917320251464844, 11.538589477539062, 15.502300262451172, -21.61144256591797, 12.111351013183594, 16.79129981994629, 24.835357666015625, 7.965169906616211, 9.389694213867188, 37.30951690673828, 8.55889892578125, 0.41557884216308594, -6.431175231933594, -14.497772216796875, 5.18010139465332, 44.45659637451172, 15.969289779663086, 44.07379913330078, 8.206281661987305, 9.255943298339844, 13.49139404296875, 38.28059387207031, 16.19886016845703, 16.102432250976562, -0.5819473266601562, 1.7775802612304688, 11.020801544189453, -4.085639953613281, 40.00359344482422, 15.501092910766602, 52.50926971435547, 29.752410888671875, 24.519672393798828, 5.828834533691406, 39.05810546875, 43.523162841796875, 1.0818099975585938, 39.65522003173828, 16.204694747924805, -2.8875579833984375, 22.774372100830078, 2.85888671875, 2.9128875732421875, 14.797290802001953, 39.90948486328125, 21.7086181640625, 10.88951301574707, 10.906192779541016, 31.656343460083008, 10.179849624633789, 54.586448669433594, 1.0767822265625, 16.212352752685547, 2.92205810546875, -21.208908081054688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000613.npy"}
{"epoch": 0.926681783824641, "step": 614, "batch_size": 64, "mean": 10.78100872039795, "std": 11.753384590148926, "min": -23.270931243896484, "p10": -3.560803604125976, "median": 10.415401458740234, "p90": 25.930926513671874, "max": 46.664459228515625, "pos_frac": 0.828125, "sample": [22.52146339416504, 28.080886840820312, 6.132781982421875, 16.78626251220703, 5.567626953125, 17.993080139160156, 6.600894927978516, 3.305675506591797, 10.014461517333984, 8.136882781982422, 21.194244384765625, 12.399702072143555, 23.321075439453125, 23.018142700195312, 15.141311645507812, 9.456697463989258, 14.186222076416016, 19.688758850097656, 33.099205017089844, 25.834625244140625, 8.386281967163086, 2.478168487548828, -4.470611572265625, -9.352386474609375, 46.664459228515625, -3.9316482543945312, 0.8458003997802734, -4.8672943115234375, 14.39539909362793, 12.862991333007812, 0.7094039916992188, 4.31964111328125, -2.6954994201660156, 29.626739501953125, 5.845510482788086, 7.494901657104492, 26.09678840637207, 14.168914794921875, -23.270931243896484, 13.761528015136719, -2.1769332885742188, 28.917312622070312, 14.511344909667969, 22.701459884643555, 8.487831115722656, 12.17854118347168, 10.047454833984375, 11.062942504882812, 8.945793151855469, -1.9575653076171875, 13.81161880493164, 8.055755615234375, -9.684928894042969, -0.8547439575195312, 2.258472442626953, 25.972198486328125, 16.259254455566406, 0.710723876953125, 16.540143966674805, 10.783348083496094, 3.9036216735839844, 14.180065155029297, -4.7303314208984375, 18.512985229492188], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000614.npy"}
{"epoch": 0.9281934996220711, "step": 615, "batch_size": 64, "mean": 12.26151180267334, "std": 16.501768112182617, "min": -23.719345092773438, "p10": -7.503333663940429, "median": 9.810115814208984, "p90": 37.92607421875, "max": 49.5465087890625, "pos_frac": 0.796875, "sample": [9.428184509277344, 11.946395874023438, 18.544790267944336, 7.029857635498047, 11.048561096191406, 28.659278869628906, 8.905677795410156, -23.719345092773438, 42.18441390991211, 13.463783264160156, -4.117034912109375, 13.744804382324219, -9.649887084960938, 16.310516357421875, 3.327150344848633, -10.158760070800781, 6.236427307128906, 37.8934326171875, 17.334747314453125, -4.8120880126953125, 28.850879669189453, 3.0077667236328125, 34.96770477294922, -3.864614486694336, 20.473987579345703, 11.39239501953125, -7.506214141845703, 11.785446166992188, 9.966232299804688, 38.414608001708984, 22.126388549804688, 2.893148422241211, -1.8723297119140625, -8.850711822509766, 14.760108947753906, 28.63622283935547, 36.01666259765625, 5.3280792236328125, 0.75927734375, 38.68932342529297, 27.37049102783203, 43.59686279296875, 23.731361389160156, 16.232894897460938, 21.70964813232422, 4.981544494628906, 7.2666015625, 0.423553466796875, 2.8347320556640625, 31.70553970336914, 5.4053955078125, 49.5465087890625, -7.496612548828125, 8.02718734741211, 2.5056610107421875, 38.665252685546875, 3.5817413330078125, 12.44371223449707, 37.9400634765625, 4.49028205871582, 9.653999328613281, -2.7538070678710938, -21.13471221923828, -15.566375732421875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000615.npy"}
{"epoch": 0.9297052154195011, "step": 616, "batch_size": 64, "mean": 8.698728561401367, "std": 14.213396072387695, "min": -19.423091888427734, "p10": -5.468685150146483, "median": 4.710036277770996, "p90": 29.637834930419928, "max": 50.506011962890625, "pos_frac": 0.734375, "sample": [33.71876525878906, 4.063816070556641, 2.8382568359375, -3.0624847412109375, 14.80917739868164, -3.925992965698242, 50.506011962890625, 10.140060424804688, -0.2892112731933594, 6.050590515136719, 10.26048469543457, 0.5950222015380859, 10.756195068359375, -19.423091888427734, 1.4374122619628906, -13.447002410888672, 4.1460113525390625, -7.88909912109375, 0.5840797424316406, 36.49751281738281, 2.2448196411132812, 44.77272033691406, -8.85828971862793, -1.1187610626220703, 44.96575164794922, -3.7475757598876953, 3.488492965698242, -8.223777770996094, -1.4894218444824219, 28.388198852539062, 15.183090209960938, 2.2845535278320312, 6.153423309326172, 11.859054565429688, 31.283374786376953, 18.887813568115234, -10.20931625366211, 8.698070526123047, -0.9669189453125, 22.708784103393555, 15.479721069335938, 14.160511016845703, 5.27406120300293, 13.234809875488281, 30.17339324951172, 11.238876342773438, 1.9716243743896484, 0.4365653991699219, -6.105560302734375, 2.683704376220703, 0.07056427001953125, -3.9826431274414062, 10.985000610351562, 3.1185150146484375, 10.388198852539062, 18.07678985595703, 7.6132659912109375, 18.706867218017578, 22.736072540283203, -2.7793960571289062, 22.22918701171875, 16.278709411621094, 3.3690338134765625, -3.309865951538086], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000616.npy"}
{"epoch": 0.9312169312169312, "step": 617, "batch_size": 64, "mean": 10.601092338562012, "std": 14.413009643554688, "min": -18.475303649902344, "p10": -5.167729759216308, "median": 6.716341018676758, "p90": 30.849295043945332, "max": 50.87507629394531, "pos_frac": 0.8125, "sample": [6.731636047363281, 6.701045989990234, -3.3612213134765625, 16.86383819580078, 32.83869934082031, 7.507160186767578, 19.705047607421875, 21.94854736328125, 1.4505729675292969, -7.151695251464844, 2.9134445190429688, 19.226768493652344, 5.5051116943359375, 11.242130279541016, 2.252593994140625, 17.410953521728516, 14.209068298339844, 17.91724395751953, 4.968193054199219, 10.081079483032227, 42.5726318359375, 2.189807891845703, 1.7237663269042969, 24.38469696044922, 9.005287170410156, -2.955038070678711, 1.3870353698730469, 18.452129364013672, 0.37494659423828125, 15.166255950927734, 9.20598030090332, -5.835941314697266, 4.286811828613281, 38.31924057006836, 19.732025146484375, 50.87507629394531, 3.9473724365234375, -6.881324768066406, -2.5908203125, 37.074851989746094, 21.344017028808594, 6.4277496337890625, -2.3975696563720703, 0.7114753723144531, 34.96149444580078, 13.88917350769043, -18.475303649902344, -11.930938720703125, -16.371055603027344, 3.8350830078125, 11.535209655761719, 25.164871215820312, 26.207351684570312, -4.467859268188477, 46.1771240234375, 13.369169235229492, 5.6425628662109375, 3.75860595703125, 19.662330627441406, 4.71881103515625, 6.583580017089844, 3.2380523681640625, 20.958656311035156, -5.467674255371094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000617.npy"}
{"epoch": 0.9327286470143613, "step": 618, "batch_size": 64, "mean": 10.869397163391113, "std": 13.95022964477539, "min": -24.478759765625, "p10": -3.4360282897949213, "median": 8.277908325195312, "p90": 30.617145538330078, "max": 46.02310562133789, "pos_frac": 0.796875, "sample": [32.96856689453125, 26.949134826660156, 16.15509796142578, 13.036575317382812, 0.6912002563476562, 1.3960018157958984, -3.769847869873047, 2.5987167358398438, 9.80804443359375, -0.5568218231201172, 5.453182220458984, 34.72076416015625, 23.929187774658203, 15.9512939453125, 46.02310562133789, 2.8666534423828125, 8.217124938964844, 2.543733596801758, 38.80364990234375, 20.429428100585938, 6.3444671630859375, 45.759979248046875, 4.083230972290039, 15.473747253417969, -1.1007308959960938, 30.65576934814453, 5.071126937866211, 12.090187072753906, 11.755470275878906, 3.5097179412841797, -0.160552978515625, 0.4242725372314453, 23.29376220703125, 8.141170501708984, 18.87549591064453, 12.85626220703125, 17.810653686523438, -4.404899597167969, 18.765033721923828, 8.037345886230469, -6.965188980102539, -0.7378368377685547, -2.657115936279297, -5.981620788574219, 3.8861312866210938, -12.462326049804688, 2.8568553924560547, -24.478759765625, 15.605941772460938, 17.625259399414062, -0.5233154296875, 25.222118377685547, 8.338691711425781, 8.626504898071289, 6.2137908935546875, 10.408966064453125, 1.5652618408203125, 11.223472595214844, 10.033992767333984, 25.789329528808594, 30.527023315429688, 40.595458984375, -9.101669311523438, 4.534152984619141], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000618.npy"}
{"epoch": 0.9342403628117913, "step": 619, "batch_size": 64, "mean": 10.657764434814453, "std": 14.088643074035645, "min": -23.774932861328125, "p10": -4.082064247131347, "median": 7.735625267028809, "p90": 29.929489135742188, "max": 46.47865295410156, "pos_frac": 0.78125, "sample": [11.480953216552734, -2.4544754028320312, 2.8495445251464844, 3.6436233520507812, 46.47865295410156, 33.39537048339844, 24.887710571289062, -5.1650543212890625, -12.1890869140625, 36.27659606933594, 18.50133514404297, 5.951229095458984, 23.165496826171875, 3.059846878051758, 30.50048828125, -5.40264892578125, 17.53272247314453, 11.498641967773438, -2.684497833251953, -3.0624542236328125, -4.2266998291015625, 29.66699981689453, 1.0588340759277344, -2.38140869140625, -3.4511051177978516, 13.874717712402344, 15.995306015014648, 4.4392852783203125, 0.8923625946044922, 0.2990531921386719, 8.987686157226562, -23.774932861328125, -5.461246490478516, 11.208953857421875, 24.26443862915039, 27.559398651123047, 8.098857879638672, 30.04198455810547, 44.058692932128906, -6.4769744873046875, 7.372392654418945, 27.068817138671875, 11.08053970336914, -3.7445812225341797, 38.94761657714844, 15.375862121582031, 5.243869781494141, 17.032569885253906, 14.021856307983398, 7.092044830322266, 4.141151428222656, 11.928974151611328, 20.38864517211914, 2.55743408203125, 26.24542236328125, 0.484283447265625, 2.1509246826171875, 23.499786376953125, 0.5195064544677734, 3.231395721435547, 3.4365615844726562, 14.467704772949219, -0.7012252807617188, 17.317134857177734], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000619.npy"}
{"epoch": 0.9357520786092215, "step": 620, "batch_size": 64, "mean": 9.638816833496094, "std": 14.521862983703613, "min": -13.52801513671875, "p10": -5.119942283630371, "median": 6.580702781677246, "p90": 32.91818542480469, "max": 49.829856872558594, "pos_frac": 0.75, "sample": [2.60247802734375, -5.145233154296875, -8.603626251220703, 49.829856872558594, -9.4927978515625, 1.1259727478027344, 9.200332641601562, 32.7294921875, 3.442169189453125, 13.819866180419922, 33.87652587890625, 35.27165603637695, 10.347137451171875, 8.417633056640625, 5.290130615234375, 30.17181396484375, 8.915031433105469, 6.261564254760742, -5.032958984375, 0.8723678588867188, -11.06951904296875, 0.7635536193847656, -13.52801513671875, 39.284263610839844, -11.350143432617188, 25.515975952148438, -0.7078208923339844, -0.9645919799804688, 45.23455810546875, -2.2046585083007812, 17.67604637145996, 19.125587463378906, 0.0072193145751953125, -3.1787357330322266, -3.998607635498047, 2.8560104370117188, 0.8144664764404297, 10.021987915039062, 23.406431198120117, 0.6369209289550781, 17.851213455200195, 4.4649658203125, 12.935615539550781, 7.146369934082031, 1.3853645324707031, -4.29010009765625, 33.87232971191406, 32.999053955078125, -0.5869960784912109, 18.104412078857422, 0.16815948486328125, 28.91533660888672, 9.54583740234375, -5.060930252075195, -7.679363250732422, 1.2541255950927734, 11.737625122070312, 2.6816368103027344, 15.134340286254883, 16.130523681640625, 17.429616928100586, 16.25531768798828, 17.349653244018555, 6.89984130859375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000620.npy"}
{"epoch": 0.9372637944066515, "step": 621, "batch_size": 64, "mean": 9.723562240600586, "std": 14.935873985290527, "min": -20.611557006835938, "p10": -6.575630187988281, "median": 7.567917823791504, "p90": 29.51068191528321, "max": 43.696815490722656, "pos_frac": 0.703125, "sample": [-0.3724842071533203, 1.5609054565429688, 6.2866363525390625, -2.2614173889160156, -12.093429565429688, -20.611557006835938, 15.000137329101562, 43.696815490722656, -6.654022216796875, -0.45288658142089844, 8.582645416259766, 30.203330993652344, 30.630699157714844, 2.479768753051758, 42.93879699707031, 9.822250366210938, 7.645265579223633, 9.1920166015625, 16.534072875976562, 20.523788452148438, 3.716094970703125, 4.6893463134765625, 27.153076171875, -17.908000946044922, 24.71973419189453, 38.040771484375, 20.06915283203125, -9.983154296875, 7.490570068359375, -2.867412567138672, 40.482147216796875, 20.22126007080078, 18.671852111816406, 16.419219970703125, 16.247846603393555, 3.8601150512695312, 9.778938293457031, 0.6179313659667969, 18.086917877197266, -5.050140380859375, -4.392679214477539, 4.467988967895508, 21.582603454589844, 11.826995849609375, 1.9222965240478516, -0.21531295776367188, 6.88671875, -1.8253936767578125, -6.3927154541015625, 11.474098205566406, -1.6275348663330078, 22.311233520507812, 21.55596923828125, 1.6498470306396484, 21.31024169921875, -3.9838619232177734, -2.449382781982422, 40.757049560546875, -16.91814422607422, 17.285388946533203, -7.276153564453125, 1.6026229858398438, 17.75402069091797, 27.894500732421875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000621.npy"}
{"epoch": 0.9387755102040817, "step": 622, "batch_size": 64, "mean": 9.982682228088379, "std": 12.812434196472168, "min": -10.910799026489258, "p10": -3.7143104553222654, "median": 7.422664642333984, "p90": 23.5262809753418, "max": 50.34864044189453, "pos_frac": 0.796875, "sample": [18.12060546875, 4.323421478271484, -1.8665084838867188, -9.997867584228516, 13.009124755859375, 4.6414337158203125, 5.224800109863281, 17.655487060546875, -3.8300113677978516, 4.8143157958984375, 50.34864044189453, 14.120319366455078, 0.3941917419433594, -9.48501205444336, 17.07868194580078, 9.070003509521484, 23.187068939208984, 10.20291519165039, 16.567245483398438, 36.18522644042969, 7.997642517089844, 18.355609893798828, -3.4443416595458984, 0.5729618072509766, 35.40843200683594, 1.6561508178710938, 11.962692260742188, 18.271804809570312, 1.3352813720703125, 12.53939437866211, 38.670494079589844, -7.8446044921875, 12.457075119018555, 2.0185585021972656, -1.9658737182617188, 3.6587467193603516, 29.137908935546875, 22.33806610107422, 18.78874969482422, 6.1986083984375, 21.827335357666016, 9.425880432128906, 12.151256561279297, -2.5495853424072266, 5.308071136474609, 14.898639678955078, 6.847686767578125, -7.446258544921875, 1.4612770080566406, 12.970394134521484, 23.67165756225586, 1.1652069091796875, -0.9902381896972656, -5.3986053466796875, 4.844720840454102, 2.66033935546875, 42.33518981933594, 5.9152069091796875, -10.910799026489258, -0.8882522583007812, 19.602001190185547, 4.15882682800293, 14.121925354003906, 15.832366943359375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000622.npy"}
{"epoch": 0.9402872260015117, "step": 623, "batch_size": 64, "mean": 11.253677368164062, "std": 13.09897518157959, "min": -13.864532470703125, "p10": -5.910240173339843, "median": 11.061380386352539, "p90": 26.92706336975098, "max": 42.82111358642578, "pos_frac": 0.765625, "sample": [14.38580322265625, 5.1893463134765625, -2.8331336975097656, -7.754730224609375, 7.646507263183594, 33.4853515625, 12.258468627929688, 14.481441497802734, 18.06231689453125, 26.992778778076172, -6.386138916015625, 4.6713104248046875, 22.546531677246094, 10.918601989746094, 8.780952453613281, 42.82111358642578, -6.635570526123047, 28.513748168945312, 31.863265991210938, 9.4656982421875, 14.069755554199219, 4.715021133422852, 16.589750289916992, -7.367774963378906, -4.930351257324219, 10.758722305297852, 20.40294647216797, 18.752822875976562, -13.864532470703125, 26.773727416992188, 17.804420471191406, 41.94954299926758, 16.71770477294922, 25.0467529296875, 2.648101806640625, -12.407636642456055, 11.204158782958984, -1.8198623657226562, 26.0814208984375, 20.85393524169922, 23.3489990234375, 36.27024841308594, 9.762458801269531, 0.615966796875, -1.6712722778320312, 2.751096725463867, -0.9317245483398438, 4.598306655883789, 1.0607070922851562, 13.782329559326172, 22.11944580078125, 21.178176879882812, 14.035903930664062, 25.611740112304688, -0.6618442535400391, 12.251129150390625, 14.538986206054688, 1.4384765625, 5.5683441162109375, 1.154439926147461, 21.367645263671875, -6.330192565917969, -2.0514068603515625, -2.0249252319335938], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000623.npy"}
{"epoch": 0.9417989417989417, "step": 624, "batch_size": 64, "mean": 10.991124153137207, "std": 15.009068489074707, "min": -17.282384872436523, "p10": -5.88018741607666, "median": 11.128121376037598, "p90": 31.311338806152346, "max": 48.77375793457031, "pos_frac": 0.71875, "sample": [-4.72967529296875, -7.673038482666016, 16.95579719543457, 11.647026062011719, 42.12845993041992, 41.26966094970703, 0.2972393035888672, 3.9745635986328125, -0.48384857177734375, -3.6357059478759766, 8.603248596191406, 12.85992431640625, 13.212947845458984, 10.979021072387695, 19.223182678222656, 11.2772216796875, 9.555278778076172, 19.324676513671875, 29.178878784179688, 3.9829483032226562, 0.10891342163085938, -2.0198936462402344, -12.626001358032227, 24.35803985595703, -0.4585380554199219, 13.102874755859375, 1.6894111633300781, -5.173728942871094, 3.2722091674804688, 4.759002685546875, 32.31059265136719, 29.44091796875, 17.316741943359375, 31.57806396484375, 12.90115737915039, -9.87704849243164, 36.61906433105469, 23.43215560913086, 48.77375793457031, 20.185585021972656, 15.653425216674805, 15.757003784179688, 17.614276885986328, 6.340980529785156, -7.002464294433594, 17.91149139404297, 37.74836730957031, -1.6641769409179688, -2.771028518676758, 18.05461883544922, 13.697898864746094, 3.1343116760253906, -6.036954879760742, 0.6334056854248047, -17.282384872436523, 30.39196014404297, -5.514396667480469, -1.1893692016601562, 26.741134643554688, -2.7439918518066406, -14.766279220581055, 3.863994598388672, 16.530059814453125, 30.688980102539062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000624.npy"}
{"epoch": 0.9433106575963719, "step": 625, "batch_size": 64, "mean": 8.711897850036621, "std": 14.47381591796875, "min": -19.493425369262695, "p10": -7.2884677886962885, "median": 6.212799072265625, "p90": 30.179033279418945, "max": 44.79796600341797, "pos_frac": 0.703125, "sample": [11.454019546508789, 24.73137664794922, 7.560100555419922, 3.094940185546875, -6.500469207763672, 14.969707489013672, 8.26711654663086, 4.516258239746094, 10.777700424194336, 36.40673828125, 4.2183685302734375, -1.7311058044433594, 21.616294860839844, 0.9745407104492188, 29.059783935546875, 2.2252349853515625, 39.557861328125, 7.061073303222656, -8.189701080322266, -0.39896392822265625, 9.025215148925781, 30.33224105834961, 0.7213058471679688, -15.416168212890625, -0.2400054931640625, 7.940448760986328, 7.9287567138671875, 22.100494384765625, -0.00054168701171875, 9.57516860961914, -3.455455780029297, 29.821548461914062, 27.8450927734375, -0.06274795532226562, 11.35598373413086, -7.169685363769531, -5.380548477172852, 34.28779602050781, 5.364524841308594, 4.862068176269531, 16.5164794921875, 44.79796600341797, 32.464805603027344, 12.020483016967773, 14.994789123535156, -0.78985595703125, -19.33196258544922, -6.383247375488281, -7.339374542236328, 13.040863037109375, 3.6172637939453125, -11.816192626953125, -19.493425369262695, 32.60956954956055, 11.862361907958984, 1.2439842224121094, 12.142974853515625, 5.297615051269531, 22.321908950805664, -12.035568237304688, 3.4464035034179688, 25.450347900390625, 4.422384262084961, -0.6054611206054688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000625.npy"}
{"epoch": 0.9448223733938019, "step": 626, "batch_size": 64, "mean": 12.506250381469727, "std": 17.841793060302734, "min": -19.487518310546875, "p10": -11.129969596862793, "median": 10.149324417114258, "p90": 35.19625015258789, "max": 55.467254638671875, "pos_frac": 0.765625, "sample": [19.735023498535156, 10.526779174804688, 28.678770065307617, -12.747745513916016, 55.467254638671875, 31.671167373657227, 35.293975830078125, -10.916433334350586, 13.644983291625977, 32.809425354003906, 1.2818241119384766, 1.75299072265625, 6.6334991455078125, 0.9602260589599609, -7.15447998046875, 51.77143478393555, 20.485633850097656, -14.131767272949219, 34.968223571777344, 2.405965805053711, -16.909807205200195, 16.297882080078125, 10.460142135620117, 47.421302795410156, 8.780038833618164, 6.97172737121582, 16.412382125854492, 26.191177368164062, -19.487518310546875, 49.541168212890625, 5.966255187988281, -5.728002548217773, 20.49627685546875, 20.779403686523438, 9.549701690673828, -14.599716186523438, -0.1696624755859375, 21.794281005859375, -0.1324138641357422, 12.578575134277344, 9.628368377685547, 5.133100509643555, 33.80042266845703, 9.838506698608398, -6.0431365966796875, 8.94797134399414, 2.5172176361083984, 32.24045944213867, 18.581066131591797, 13.347419738769531, 20.500770568847656, 4.886600494384766, -15.696634292602539, 14.09423828125, -11.221485137939453, 20.414169311523438, 5.021705627441406, 33.61383819580078, 35.94460678100586, 19.624298095703125, 37.24638366699219, -9.199935913085938, 5.857063293457031, -8.026969909667969], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000626.npy"}
{"epoch": 0.9463340891912321, "step": 627, "batch_size": 64, "mean": 14.53464412689209, "std": 16.819307327270508, "min": -18.927955627441406, "p10": -3.8784595489501945, "median": 10.519668579101562, "p90": 38.674499511718764, "max": 58.23252868652344, "pos_frac": 0.84375, "sample": [8.294315338134766, 12.767826080322266, 25.671234130859375, 24.521621704101562, 8.655948638916016, 17.391490936279297, 49.054466247558594, 40.18461608886719, 33.8232421875, 10.705657958984375, -7.291969299316406, 27.70372772216797, 51.3189697265625, 13.532302856445312, 10.274831771850586, 1.88616943359375, 6.033119201660156, 7.014228820800781, 46.18397521972656, 7.8142242431640625, 20.765960693359375, 11.913022994995117, 4.512607574462891, 12.5692138671875, 3.787261962890625, -12.816463470458984, 31.668201446533203, -7.76434326171875, 23.091079711914062, 0.5611076354980469, 2.2329349517822266, 47.756072998046875, 10.33367919921875, 0.18499755859375, 8.98358154296875, 22.43768310546875, -0.5283737182617188, 5.0218353271484375, 1.3053150177001953, 22.427616119384766, 58.23252868652344, 18.736251831054688, -3.004962921142578, -7.360578536987305, -4.252815246582031, 7.9040374755859375, -0.17082977294921875, 23.89705467224121, 6.798088073730469, 35.15089416503906, 13.29356575012207, -18.927955627441406, 7.302814483642578, 20.28220558166504, 32.478424072265625, 9.9881591796875, 2.30169677734375, -17.536495208740234, 17.310958862304688, 5.45428466796875, 22.464218139648438, 30.3197078704834, 20.394527435302734, 45.178443908691406], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000627.npy"}
{"epoch": 0.9478458049886621, "step": 628, "batch_size": 64, "mean": 13.757041931152344, "std": 17.495742797851562, "min": -35.858245849609375, "p10": -7.013595008850097, "median": 13.421600341796875, "p90": 35.15561981201172, "max": 56.567230224609375, "pos_frac": 0.796875, "sample": [56.567230224609375, 8.110443115234375, 17.293724060058594, 15.736387252807617, -7.110687255859375, -12.045150756835938, 22.275192260742188, 4.8528900146484375, 31.450166702270508, 23.500892639160156, 49.426902770996094, 44.26026916503906, 20.120697021484375, 9.747447967529297, 16.2729434967041, 6.4183502197265625, -35.858245849609375, -20.292760848999023, 29.749252319335938, 4.269588470458984, 26.1030330657959, 13.507400512695312, 3.8890113830566406, -7.99188232421875, 20.502426147460938, 22.938919067382812, 13.335800170898438, 12.381973266601562, 10.628364562988281, 35.53778839111328, 16.692007064819336, 16.67926025390625, 3.5198898315429688, 43.019874572753906, 9.849308013916016, 31.778228759765625, -3.5354156494140625, 0.9335517883300781, 20.046630859375, -1.010650634765625, 19.606475830078125, 11.984970092773438, 10.036458969116211, 25.06175994873047, 23.969085693359375, 17.767044067382812, 41.895957946777344, 30.213159561157227, 31.527202606201172, 24.744869232177734, 2.524984359741211, -0.0983123779296875, 2.406757354736328, -14.740678787231445, 11.048128128051758, 38.12904357910156, 5.233085632324219, 16.329116821289062, -2.4090118408203125, 9.142398834228516, -6.787046432495117, 34.263893127441406, -3.988006591796875, -20.961685180664062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000628.npy"}
{"epoch": 0.9493575207860923, "step": 629, "batch_size": 64, "mean": 11.235391616821289, "std": 12.188960075378418, "min": -10.957511901855469, "p10": -1.8277980804443343, "median": 7.742208480834961, "p90": 29.654981994628912, "max": 50.59660339355469, "pos_frac": 0.859375, "sample": [0.527801513671875, 4.059150695800781, -10.957511901855469, 14.523286819458008, 13.05230712890625, 7.3287353515625, 0.7812728881835938, 13.983085632324219, 6.756385803222656, 8.810302734375, 7.213294982910156, 12.687263488769531, 5.113256454467773, 50.59660339355469, 5.891767501831055, -2.503925323486328, 8.155681610107422, 13.668327331542969, 7.1198883056640625, 24.4915771484375, -0.2501678466796875, 0.655059814453125, -2.6532859802246094, 2.928508758544922, -4.942686080932617, 6.224525451660156, 28.508499145507812, 16.844818115234375, 1.007925033569336, 25.843833923339844, 20.479324340820312, -5.304588317871094, 13.7041015625, 15.592916488647461, 17.430252075195312, 14.17291259765625, 9.921274185180664, 1.4307403564453125, 5.713096618652344, -0.14282989501953125, 3.5791778564453125, 2.5123023986816406, 33.69267272949219, 6.797512054443359, 3.6062049865722656, 9.6617431640625, 7.10662841796875, 18.839092254638672, 10.968603134155273, 21.77161407470703, 25.066741943359375, 2.4780235290527344, 31.305641174316406, 4.623056411743164, 19.9771728515625, 22.948287963867188, -6.7811431884765625, 37.797828674316406, 34.13013458251953, 32.17579650878906, 4.35919189453125, 30.146331787109375, -5.59930419921875, 9.438911437988281], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000629.npy"}
{"epoch": 0.9508692365835223, "step": 630, "batch_size": 64, "mean": 11.56503963470459, "std": 14.158218383789062, "min": -9.88296127319336, "p10": -5.3297489166259755, "median": 10.464754104614258, "p90": 27.92954635620118, "max": 49.033233642578125, "pos_frac": 0.734375, "sample": [-9.70534896850586, 12.58905029296875, 7.727420806884766, -2.29974365234375, 23.54434585571289, 34.83720397949219, 14.851043701171875, -1.8944969177246094, 44.17426300048828, 24.341033935546875, -2.633150100708008, 18.147960662841797, -7.7763214111328125, 23.933807373046875, -0.4297943115234375, 24.771034240722656, -5.892112731933594, 10.547718048095703, 19.524208068847656, 18.39404296875, 1.86968994140625, -2.4047775268554688, 20.01616668701172, 26.104270935058594, 25.371612548828125, 25.649192810058594, -6.061004638671875, 11.378337860107422, -9.794769287109375, 12.412094116210938, 0.355682373046875, 18.867225646972656, -4.017566680908203, 30.62109375, 17.7276611328125, 3.5449295043945312, 24.794307708740234, 1.0287704467773438, 12.1123046875, 7.488578796386719, 49.033233642578125, 17.065637588500977, 33.503082275390625, -9.88296127319336, 48.37611389160156, 14.951461791992188, 10.381790161132812, 6.456443786621094, 16.767486572265625, -3.884521484375, 4.441276550292969, -6.649330139160156, -3.044574737548828, 28.711807250976562, 12.593673706054688, -2.994302749633789, 7.9026336669921875, 6.018318176269531, 10.084232330322266, 23.279739379882812, -1.2037353515625, 8.323974609375, 3.3054466247558594, 2.8096580505371094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000630.npy"}
{"epoch": 0.9523809523809523, "step": 631, "batch_size": 64, "mean": 13.726907730102539, "std": 17.76945686340332, "min": -23.228546142578125, "p10": -6.2059347152709945, "median": 10.79334831237793, "p90": 38.40172424316407, "max": 66.40908813476562, "pos_frac": 0.78125, "sample": [20.959251403808594, 8.078109741210938, 5.1180419921875, 25.176063537597656, 10.527175903320312, 33.621864318847656, 25.036163330078125, 19.50274658203125, -0.9452934265136719, 4.987335205078125, -23.228546142578125, -12.163286209106445, 10.390207290649414, -3.9717864990234375, -2.2707748413085938, 5.7580413818359375, -4.843843460083008, 7.722927093505859, 16.851898193359375, 37.416473388671875, 42.302093505859375, 0.634796142578125, 53.04969024658203, 23.392250061035156, 23.327667236328125, 21.539154052734375, 3.988433837890625, 38.823974609375, -1.9913482666015625, 6.522947311401367, 9.779666900634766, 8.668327331542969, -15.885292053222656, 17.206867218017578, 16.468036651611328, 26.187667846679688, 25.805282592773438, 15.840179443359375, 66.40908813476562, 27.8887939453125, 1.7220611572265625, -16.032752990722656, -3.7188873291015625, 12.353094100952148, 0.938812255859375, -16.592323303222656, 10.521795272827148, 11.558841705322266, -10.967079162597656, 19.1566162109375, -3.4230880737304688, 20.913101196289062, 5.77996826171875, 39.30585479736328, -6.7896881103515625, 21.21526336669922, 11.059520721435547, 41.413856506347656, 13.385540008544922, 32.85539245605469, 8.975448608398438, 49.306884765625, 9.581687927246094, 32.3211669921875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000631.npy"}
{"epoch": 0.9538926681783825, "step": 632, "batch_size": 64, "mean": 13.924145698547363, "std": 14.745100975036621, "min": -17.115612030029297, "p10": -5.804339981079101, "median": 12.723711967468262, "p90": 32.86839828491211, "max": 48.359291076660156, "pos_frac": 0.8125, "sample": [13.858596801757812, 23.043289184570312, 12.160736083984375, 4.250253677368164, 21.78457260131836, 33.33753967285156, -1.1724376678466797, 33.576438903808594, -2.0410118103027344, 25.990234375, 7.648895263671875, -12.954544067382812, 12.94049072265625, -0.8791160583496094, -6.478609085083008, 26.633296966552734, 30.114925384521484, 3.9763565063476562, 28.561771392822266, -15.17965316772461, 12.359954833984375, 11.486495971679688, -13.176570892333984, 16.60320281982422, 4.631675720214844, 25.107742309570312, 11.020843505859375, 23.57147216796875, 13.834400177001953, 11.14090347290039, 30.315322875976562, 38.994842529296875, 7.552757263183594, -5.908203125, 20.42211151123047, 18.53571891784668, 34.02739715576172, 12.506933212280273, 27.296218872070312, 15.435779571533203, 24.180770874023438, 24.365585327148438, 5.715293884277344, 2.1641464233398438, 3.2416133880615234, 31.77373504638672, 23.573272705078125, 6.017568588256836, 21.256423950195312, 24.37725257873535, 48.359291076660156, -13.046218872070312, 22.48504638671875, 40.17670822143555, -5.561992645263672, 2.8904342651367188, 37.37629318237305, 20.604286193847656, 11.255912780761719, 6.829469680786133, 9.832618713378906, 5.532360076904297, -17.115612030029297, -0.06999588012695312], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000632.npy"}
{"epoch": 0.9554043839758125, "step": 633, "batch_size": 64, "mean": 10.044373512268066, "std": 15.747550964355469, "min": -30.53618812561035, "p10": -7.715658187866209, "median": 8.325078964233398, "p90": 31.761713409423837, "max": 56.74064636230469, "pos_frac": 0.75, "sample": [-1.0455131530761719, 8.272869110107422, 9.506378173828125, 11.135887145996094, 18.300003051757812, 10.2637939453125, 10.518768310546875, 4.205753326416016, 0.6618366241455078, 10.293937683105469, 2.3721160888671875, -2.9203548431396484, 36.94535827636719, 5.887960433959961, 1.2513961791992188, 23.768508911132812, -2.9450302124023438, 1.7772674560546875, 8.377288818359375, -3.8447036743164062, 4.902677536010742, 0.3203315734863281, 25.63220977783203, 26.48188018798828, 15.679725646972656, 12.918582916259766, -1.0130653381347656, 0.28863525390625, 15.414426803588867, 12.51251220703125, 14.014404296875, 0.02459716796875, 23.627145767211914, -1.9147796630859375, -0.5370445251464844, 12.86678695678711, -1.5332489013671875, -11.044342041015625, -10.374359130859375, -12.850906372070312, -8.38275146484375, 1.4112129211425781, 3.209207534790039, 56.74064636230469, 19.189285278320312, 1.7196483612060547, 9.214519500732422, -10.420755386352539, 28.13518524169922, 33.05534362792969, 32.673301696777344, 14.234001159667969, 21.139488220214844, 29.634674072265625, 5.351726531982422, -9.08126449584961, 3.6737442016601562, -6.159107208251953, 27.772750854492188, 42.046939849853516, 43.256072998046875, -30.53618812561035, 40.985137939453125, 15.777374267578125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000633.npy"}
{"epoch": 0.9569160997732427, "step": 634, "batch_size": 64, "mean": 6.308449745178223, "std": 14.115734100341797, "min": -20.276344299316406, "p10": -8.443406295776366, "median": 3.2048721313476562, "p90": 28.951189041137706, "max": 42.337364196777344, "pos_frac": 0.578125, "sample": [2.909381866455078, -3.759674072265625, 6.827522277832031, 42.337364196777344, 20.5179443359375, -1.8666858673095703, -2.3428688049316406, -1.6220073699951172, -0.982879638671875, -10.000839233398438, 4.841302871704102, 3.374835968017578, -5.0782623291015625, -2.0861358642578125, -0.42376708984375, 4.995643615722656, 11.253597259521484, 6.9346466064453125, -4.2922821044921875, -3.8262386322021484, -10.247970581054688, 0.14891624450683594, 16.19200897216797, -20.276344299316406, 36.08595275878906, -2.257173538208008, 24.531417846679688, 18.15135955810547, 31.281173706054688, -7.219882965087891, -6.9008636474609375, 16.72442054748535, 4.390144348144531, -4.186840057373047, 26.228763580322266, -2.283140182495117, -14.685009002685547, -15.36614990234375, 35.28308868408203, 30.117942810058594, 15.736923217773438, -6.85675048828125, -8.9677734375, 10.857135772705078, -2.35479736328125, -13.775398254394531, 1.887613296508789, 8.268392562866211, 34.888816833496094, 5.436214447021484, -6.39434814453125, -4.175331115722656, 7.077690124511719, 0.806488037109375, 14.760908126831055, 3.0349082946777344, 22.218849182128906, 21.636409759521484, 16.87708282470703, 17.73310089111328, -0.416534423828125, 7.418827056884766, 4.295372009277344, 30.3245849609375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000634.npy"}
{"epoch": 0.9584278155706727, "step": 635, "batch_size": 64, "mean": 8.641317367553711, "std": 16.978967666625977, "min": -20.599430084228516, "p10": -9.941907501220703, "median": 5.196222305297852, "p90": 30.23452186584473, "max": 56.86640930175781, "pos_frac": 0.625, "sample": [-5.380241394042969, 19.57911491394043, -12.545782089233398, -7.459827423095703, 32.132232666015625, 3.0949020385742188, -9.465404510498047, -14.24886703491211, -7.8438873291015625, 35.65462875366211, 1.2288970947265625, 17.602569580078125, -2.4281845092773438, -7.266716003417969, 17.079788208007812, 28.695716857910156, 27.193925857543945, 7.621345520019531, 29.5986328125, 13.5716552734375, -9.705902099609375, -0.3014984130859375, -4.925086975097656, 49.471923828125, -3.3670196533203125, -10.043052673339844, 14.06805419921875, -0.4892311096191406, -4.00926399230957, 8.676036834716797, 3.36834716796875, -12.564704895019531, 26.567012786865234, 7.504524230957031, -3.96923828125, 4.712169647216797, 30.50704574584961, 0.9819831848144531, 20.38671875, -20.599430084228516, 17.24626922607422, 36.738914489746094, -2.6199874877929688, 46.86775207519531, 3.060333251953125, -9.256782531738281, 14.492668151855469, 14.411979675292969, 23.89458465576172, -3.726806640625, 10.11025619506836, 56.86640930175781, 9.812835693359375, 5.680274963378906, -15.65020751953125, 7.467193603515625, 29.380020141601562, 4.67449951171875, 4.6381683349609375, 19.326587677001953, -16.05322265625, 18.303361892700195, 17.10749053955078, -2.4121360778808594], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000635.npy"}
{"epoch": 0.9599395313681028, "step": 636, "batch_size": 64, "mean": 12.491098403930664, "std": 13.8200101852417, "min": -14.279884338378906, "p10": -1.8568691253662095, "median": 10.03131103515625, "p90": 30.587866210937502, "max": 50.9871826171875, "pos_frac": 0.84375, "sample": [19.643367767333984, -0.5877799987792969, 8.018945693969727, 2.7273406982421875, 19.507339477539062, 9.188179016113281, 16.569149017333984, 5.2060546875, -0.32738494873046875, 27.651752471923828, 13.582107543945312, -0.06532096862792969, 19.108009338378906, 12.468055725097656, 16.572708129882812, 7.17523193359375, 17.042999267578125, 11.050384521484375, 7.6447296142578125, 18.640304565429688, 11.351371765136719, 11.893409729003906, 1.3657112121582031, 17.05528450012207, 6.523807525634766, 14.584075927734375, 50.9871826171875, 0.1389923095703125, -2.4007644653320312, 17.06218719482422, 4.372734069824219, 6.197353363037109, 44.23626708984375, 8.865493774414062, -2.7651519775390625, 30.680503845214844, 7.960960388183594, 17.385042190551758, -2.5817718505859375, 39.92182922363281, 11.560880661010742, 30.37171173095703, 0.6870384216308594, 47.95635986328125, 3.1034393310546875, -13.328910827636719, 38.89131164550781, 9.709587097167969, 10.192239761352539, 14.810462951660156, 9.48974609375, 7.1068878173828125, 25.890914916992188, 41.36363983154297, 13.104965209960938, -12.853729248046875, -14.279884338378906, 3.369840621948242, 9.870382308959961, 26.716032028198242, 8.365547180175781, -5.493034362792969, 11.097126007080078, 8.0770263671875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000636.npy"}
{"epoch": 0.9614512471655329, "step": 637, "batch_size": 64, "mean": 9.565020561218262, "std": 17.230409622192383, "min": -30.929229736328125, "p10": -5.638515853881835, "median": 7.483664512634277, "p90": 41.14995727539065, "max": 54.937530517578125, "pos_frac": 0.703125, "sample": [10.731040954589844, 15.643333435058594, 11.834823608398438, 54.937530517578125, -0.12052154541015625, 10.40869140625, -4.537940979003906, 10.788000106811523, 26.60565185546875, 2.8452186584472656, -4.579093933105469, 3.5600433349609375, 9.781818389892578, -10.4219970703125, 7.5020599365234375, 15.764141082763672, 0.13063812255859375, 9.78323745727539, 47.234195709228516, 3.093687057495117, 0.8462982177734375, 7.9374847412109375, -0.4656982421875, 1.4536056518554688, -2.674165725708008, 13.849365234375, -3.9048500061035156, 43.5452880859375, 28.274398803710938, 51.25495910644531, 49.33259963989258, 4.778968811035156, -14.059555053710938, 14.016265869140625, -4.244575500488281, 10.836204528808594, -15.764533996582031, 7.928977966308594, -0.2611846923828125, 21.018692016601562, 1.6215667724609375, 24.942306518554688, -2.590169906616211, 7.148777008056641, 17.410400390625, 11.804607391357422, -11.555671691894531, 15.193572998046875, -5.523632049560547, -0.000202178955078125, 1.63323974609375, 44.16949462890625, 7.465269088745117, 8.443878173828125, 8.989288330078125, -11.847114562988281, 35.56085205078125, 3.1778945922851562, -2.3198318481445312, -30.929229736328125, 9.382129669189453, 43.887908935546875, -5.687751770019531, 7.1006317138671875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000637.npy"}
{"epoch": 0.9629629629629629, "step": 638, "batch_size": 64, "mean": 13.074190139770508, "std": 15.722932815551758, "min": -18.95488739013672, "p10": -4.042398071289062, "median": 9.065524101257324, "p90": 32.914628219604495, "max": 47.50390625, "pos_frac": 0.75, "sample": [-0.3630523681640625, 26.94561767578125, 22.73065185546875, -0.3914375305175781, -18.95488739013672, -0.01512908935546875, 33.66511535644531, 8.952056884765625, 4.784046173095703, -13.336597442626953, 4.623626708984375, -4.166866302490234, -0.31412506103515625, 8.073654174804688, 2.1780014038085938, 25.43548583984375, -9.389488220214844, -3.751972198486328, 25.98969268798828, 3.201019287109375, 18.091171264648438, 25.457252502441406, 33.90211486816406, 25.6669921875, -15.354972839355469, 44.99546813964844, 16.9813232421875, 33.08467483520508, 2.953052520751953, 9.023662567138672, 10.375307083129883, 30.337053298950195, 17.297256469726562, 47.38337326049805, 11.4627685546875, -1.7518463134765625, 3.98602294921875, 18.91363525390625, 9.496185302734375, 19.758586883544922, 23.42029571533203, 21.907089233398438, 29.740497589111328, -5.338321685791016, 32.517852783203125, 1.4583206176757812, 5.122798919677734, 31.431480407714844, 8.825332641601562, 27.745681762695312, -0.5422801971435547, 22.915584564208984, 45.63764953613281, 8.89813232421875, -5.185321807861328, 17.3397274017334, -1.0037994384765625, 47.50390625, 29.656652450561523, 4.293128967285156, 1.1225204467773438, 2.3350982666015625, -0.1156768798828125, 9.107385635375977], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000638.npy"}
{"epoch": 0.9644746787603931, "step": 639, "batch_size": 64, "mean": 11.806966781616211, "std": 15.718415260314941, "min": -14.619104385375977, "p10": -4.777453994750976, "median": 8.46633529663086, "p90": 33.967481231689455, "max": 56.684635162353516, "pos_frac": 0.765625, "sample": [17.098615646362305, 21.6328125, 6.5569305419921875, 17.771324157714844, 17.703994750976562, 13.368843078613281, 11.249504089355469, -4.829647064208984, -2.9825496673583984, 18.832042694091797, 0.817840576171875, 12.933000564575195, 56.28529357910156, 10.44158935546875, 15.126762390136719, 2.318115234375, 15.707796096801758, -3.4637451171875, -6.52093505859375, 8.593955993652344, 5.550750732421875, 38.90055465698242, 1.9287662506103516, 28.135223388671875, -7.979591369628906, 19.71355438232422, 6.549930572509766, 17.078651428222656, -2.052337646484375, 34.44822692871094, 41.22254180908203, -10.675970077514648, 3.1078052520751953, 15.879470825195312, -1.6886787414550781, 2.4525203704833984, 12.336692810058594, 5.389892578125, 56.684635162353516, 21.347976684570312, 29.41571044921875, 5.55975341796875, 37.30121612548828, 8.338714599609375, -4.655670166015625, 6.133571624755859, -2.365264892578125, -2.5461997985839844, 9.455497741699219, 0.31598663330078125, 32.845741271972656, 22.27689552307129, 1.8668136596679688, 6.622337341308594, 3.14593505859375, -14.619104385375977, 3.8856201171875, -3.556304931640625, -5.425422668457031, 47.59624481201172, -8.594825744628906, 18.508773803710938, 16.93014907836914, 30.237548828125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000639.npy"}
{"epoch": 0.9659863945578231, "step": 640, "batch_size": 64, "mean": 9.291372299194336, "std": 15.078871726989746, "min": -41.14456558227539, "p10": -7.575798416137695, "median": 10.141159057617188, "p90": 27.10679931640625, "max": 38.54255676269531, "pos_frac": 0.78125, "sample": [19.382183074951172, 18.140552520751953, 26.952178955078125, -19.90839195251465, 2.1281204223632812, 8.011795043945312, 21.013545989990234, 1.9533958435058594, 10.127471923828125, 33.25413513183594, 2.9834671020507812, 4.273857116699219, 22.69146156311035, 11.424041748046875, -12.621162414550781, 9.81024169921875, 13.047784805297852, 14.219593048095703, 1.6892948150634766, 35.45008087158203, -41.14456558227539, 3.966827392578125, 27.173065185546875, 6.9213104248046875, -7.429485321044922, -7.095808029174805, 14.898849487304688, -0.484527587890625, -9.23771858215332, 1.7705459594726562, 26.03846549987793, -2.8824005126953125, 5.262237548828125, 21.34597396850586, 18.216064453125, 0.14212799072265625, -0.17389678955078125, 10.87448501586914, 9.683113098144531, 12.071895599365234, 1.1548004150390625, 25.61345672607422, 2.561107635498047, 10.15484619140625, 24.36750030517578, 13.351821899414062, 14.284584045410156, -31.202980041503906, 11.543901443481445, 15.53985595703125, 15.656387329101562, 31.007293701171875, 11.872638702392578, 26.1607666015625, -2.325847625732422, -3.984651565551758, 38.54255676269531, -7.6385040283203125, -7.6605072021484375, 20.402360916137695, 1.2869071960449219, 33.167327880859375, 7.199281692504883, 29.652732849121094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000640.npy"}
{"epoch": 0.9674981103552532, "step": 641, "batch_size": 64, "mean": 9.97656536102295, "std": 13.373872756958008, "min": -25.010177612304688, "p10": -1.9970829010009756, "median": 8.561539649963379, "p90": 25.35069561004639, "max": 56.609375, "pos_frac": 0.828125, "sample": [-11.457283020019531, -11.290184020996094, 22.833080291748047, 14.805858612060547, 15.839492797851562, 0.5310287475585938, 15.89522933959961, 5.251396179199219, -2.388652801513672, 2.1647567749023438, 2.1536865234375, 4.736976623535156, 13.060997009277344, 33.39928436279297, 5.1197052001953125, 8.834779739379883, -1.0834197998046875, 30.82958221435547, 40.34690856933594, 6.538185119628906, 13.394248962402344, 0.4693794250488281, 0.4877471923828125, 4.1019287109375, 12.582267761230469, 21.287254333496094, 12.141891479492188, 7.9815216064453125, 13.144092559814453, 13.287572860717773, 0.5219268798828125, 24.55181884765625, 9.253211975097656, 10.430418014526367, -1.042938232421875, -12.110389709472656, 12.152305603027344, 14.116256713867188, -6.6150665283203125, 2.7953948974609375, -25.010177612304688, 33.6680793762207, 11.274734497070312, 6.970832824707031, 4.5992584228515625, 43.90531539916992, -0.11553192138671875, -0.3553428649902344, 12.763690948486328, 11.75054931640625, 56.609375, 14.74186897277832, 25.693071365356445, 14.139257431030273, -3.771820068359375, 8.288299560546875, 10.659759521484375, 7.321502685546875, 3.164440155029297, 5.011444091796875, 5.845722198486328, 5.6897735595703125, 14.089670181274414, 22.514127731323242], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000641.npy"}
{"epoch": 0.9690098261526833, "step": 642, "batch_size": 64, "mean": 8.783670425415039, "std": 15.442992210388184, "min": -34.87001037597656, "p10": -8.217436981201171, "median": 6.37310791015625, "p90": 31.059001922607422, "max": 41.106353759765625, "pos_frac": 0.6875, "sample": [26.920433044433594, 12.14599609375, 0.7742691040039062, -10.173418045043945, 3.608154296875, -6.517049789428711, 13.33245849609375, 30.79894256591797, 0.4098052978515625, -34.87001037597656, 31.170455932617188, 5.042999267578125, 5.065223693847656, -6.5345306396484375, 11.996368408203125, 0.8495941162109375, 24.958099365234375, 37.72486114501953, -3.3459091186523438, -11.499248504638672, 6.958961486816406, 13.315948486328125, 21.222091674804688, -0.5933418273925781, 38.78990936279297, 16.333423614501953, 41.106353759765625, 8.228912353515625, 29.659072875976562, 11.032318115234375, 10.63616943359375, 12.419921875, 15.226005554199219, 5.164501190185547, -4.242382049560547, 32.72919464111328, -15.745597839355469, 5.779266357421875, 5.787254333496094, -4.5490264892578125, 0.20409774780273438, 4.676780700683594, -3.904115676879883, -0.5348358154296875, 15.632532119750977, 7.791200637817383, -7.8910369873046875, 20.389480590820312, 30.636871337890625, 36.284584045410156, 0.0530853271484375, -6.130760192871094, 22.116443634033203, 29.644954681396484, 34.14134216308594, 7.558078765869141, -9.397026062011719, 13.927635192871094, -5.458337783813477, -8.357322692871094, -1.1821517944335938, -0.807373046875, -8.677047729492188, 10.321361541748047], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000642.npy"}
{"epoch": 0.9705215419501134, "step": 643, "batch_size": 64, "mean": 10.086037635803223, "std": 14.142683029174805, "min": -20.4158935546875, "p10": -7.206118392944335, "median": 8.86956787109375, "p90": 29.09925975799561, "max": 41.204551696777344, "pos_frac": 0.734375, "sample": [3.4485321044921875, 13.835042953491211, 8.519145965576172, -3.151580810546875, 3.934168815612793, 37.94120788574219, 6.108526229858398, 16.158729553222656, 25.97570037841797, -1.5510940551757812, 11.09619140625, 7.012571334838867, 9.219989776611328, 1.319000244140625, 26.42877960205078, 11.445587158203125, -1.779449462890625, 0.9544048309326172, 27.79758644104004, 41.204551696777344, 6.262504577636719, 11.578033447265625, 25.50772476196289, 5.733516693115234, -5.951591491699219, 34.352806091308594, 17.76544952392578, -13.358413696289062, 38.55023956298828, 21.001312255859375, 4.6411590576171875, 15.931665420532227, -0.3540191650390625, 15.915386199951172, 1.386831283569336, -1.1579170227050781, 4.951662063598633, 13.605209350585938, 29.657119750976562, 31.089391708374023, 24.6225643157959, -11.044708251953125, 22.359146118164062, -7.506870269775391, 14.119224548339844, -6.504364013671875, -13.687797546386719, -20.4158935546875, 5.146869659423828, 11.826366424560547, -3.9824066162109375, 11.997251510620117, 26.8902587890625, 13.57895278930664, 3.23748779296875, -12.946693420410156, -7.9316558837890625, 27.386856079101562, 15.103492736816406, 5.650993347167969, -1.7066078186035156, 32.96112060546875, 15.538223266601562, -2.2110748291015625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000643.npy"}
{"epoch": 0.9720332577475435, "step": 644, "batch_size": 64, "mean": 9.126594543457031, "std": 16.435949325561523, "min": -31.184940338134766, "p10": -7.424218368530274, "median": 7.918977737426758, "p90": 31.258107376098636, "max": 48.06245422363281, "pos_frac": 0.71875, "sample": [14.654396057128906, -4.582921981811523, -31.184940338134766, 18.756301879882812, -2.4138145446777344, -12.07625961303711, 12.242561340332031, 19.796483993530273, 35.39223861694336, 25.221511840820312, 13.312339782714844, -1.8562698364257812, 21.960678100585938, 30.24930191040039, 8.561935424804688, 4.566249847412109, 7.491733551025391, 6.109642028808594, -0.7920074462890625, 36.25586700439453, 0.9154567718505859, 21.248748779296875, -21.85234832763672, -0.8939971923828125, -0.41768646240234375, 16.276287078857422, -1.5708503723144531, 8.341571807861328, 0.43845367431640625, -6.741050720214844, 26.56194305419922, -12.19976806640625, 10.080144882202148, 17.631568908691406, 23.785423278808594, -7.507747650146484, 7.4963836669921875, 39.690120697021484, 35.8267822265625, 2.259105682373047, 11.923683166503906, 24.801513671875, 2.69744873046875, 8.674886703491211, 0.6116371154785156, -30.56432342529297, -2.3260498046875, 27.617992401123047, 6.862190246582031, 31.690452575683594, 4.1740264892578125, 2.8697967529296875, -0.2023334503173828, -7.229316711425781, 11.138427734375, 10.660026550292969, -18.378143310546875, 8.67193603515625, 17.33307647705078, 1.9244842529296875, 2.487579345703125, 48.06245422363281, 13.089439392089844, 46.477622985839844], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000644.npy"}
{"epoch": 0.9735449735449735, "step": 645, "batch_size": 64, "mean": 10.393871307373047, "std": 14.485919952392578, "min": -20.33221435546875, "p10": -6.665456390380859, "median": 8.713379859924316, "p90": 28.5603931427002, "max": 52.0159912109375, "pos_frac": 0.765625, "sample": [11.170196533203125, 27.458656311035156, 22.954662322998047, -10.3480224609375, 23.766983032226562, 52.0159912109375, 13.422439575195312, 8.88951301574707, 9.922172546386719, -0.44582366943359375, -10.00790786743164, 6.7179412841796875, 5.825918197631836, 0.4465675354003906, 32.01812744140625, 2.7994651794433594, 17.544336318969727, 2.6197052001953125, 3.8800811767578125, -0.21344947814941406, 35.61589813232422, -6.662406921386719, 6.515621185302734, 7.450050354003906, 14.640304565429688, -6.6667633056640625, 4.3790130615234375, 8.537246704101562, 25.673120498657227, 18.51385498046875, 3.6900863647460938, 2.0390243530273438, 17.15045928955078, 0.8850936889648438, 1.007781982421875, 19.204818725585938, -13.208206176757812, -2.2560882568359375, -8.672073364257812, 17.701202392578125, 29.954940795898438, 11.004585266113281, 48.85966491699219, -0.20351409912109375, 3.962169647216797, 14.403118133544922, 17.82868194580078, 8.148565292358398, 12.13470458984375, 16.135116577148438, -3.1412124633789062, 22.698204040527344, 10.012615203857422, 42.679847717285156, 17.061676025390625, -1.931854248046875, 27.298412322998047, -6.067325592041016, 10.74493408203125, -20.33221435546875, 2.3686752319335938, 29.03256607055664, 15.598905563354492, -9.019149780273438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000645.npy"}
{"epoch": 0.9750566893424036, "step": 646, "batch_size": 64, "mean": 11.388921737670898, "std": 14.883009910583496, "min": -26.89641761779785, "p10": -5.177449417114257, "median": 9.032869338989258, "p90": 29.16847095489502, "max": 49.796875, "pos_frac": 0.78125, "sample": [0.4279327392578125, 0.9764060974121094, 11.421302795410156, 18.881759643554688, 27.37583351135254, 17.94652557373047, -7.30767822265625, 24.93194580078125, 27.826637268066406, 25.016929626464844, -2.6883773803710938, 6.7065277099609375, 21.30936050415039, 40.0687255859375, 11.4075927734375, 1.47174072265625, 14.811927795410156, 7.98486328125, 39.09178924560547, 16.557409286499023, 23.68726348876953, 11.10736083984375, -14.917875289916992, -4.185600280761719, 0.21088027954101562, 25.299530029296875, 8.278079986572266, -1.3681488037109375, 5.947368621826172, 4.233795166015625, 8.638504028320312, 8.796646118164062, 19.383338928222656, 29.77988052368164, 25.222946166992188, -0.5086593627929688, 6.659183502197266, -5.602527618408203, 7.642210006713867, 17.062076568603516, 0.08817100524902344, -6.275535583496094, 12.09649658203125, 5.103485107421875, 17.95965576171875, 49.796875, 19.513381958007812, 29.231040954589844, 6.179071426391602, 41.34264373779297, 29.02247428894043, -1.70306396484375, 9.719039916992188, 12.423545837402344, -26.89641761779785, 1.7803268432617188, 23.334243774414062, -15.008184432983398, -3.7661895751953125, 39.303932189941406, -8.666637420654297, -0.9245986938476562, 6.382781982421875, 9.269092559814453], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000646.npy"}
{"epoch": 0.9765684051398337, "step": 647, "batch_size": 64, "mean": 13.488749504089355, "std": 15.688854217529297, "min": -13.8270263671875, "p10": -4.319994354248046, "median": 10.562971115112305, "p90": 34.96376266479495, "max": 52.8226318359375, "pos_frac": 0.796875, "sample": [1.7077598571777344, 12.386520385742188, 23.294189453125, 13.303924560546875, 20.318557739257812, 14.704391479492188, -4.17425537109375, 51.33314514160156, 5.82843017578125, 37.76564025878906, 9.767814636230469, 10.00897216796875, 10.810089111328125, -2.5241546630859375, 0.9386062622070312, 13.58599853515625, 50.55274963378906, -0.20409202575683594, -7.7568206787109375, -7.541534423828125, 25.954736709594727, 15.448995590209961, 26.34905242919922, 3.1399192810058594, 42.42377471923828, 25.84079360961914, 28.014183044433594, -5.507795333862305, 24.846790313720703, -0.8339080810546875, 9.505775451660156, -1.3288421630859375, 19.265655517578125, 2.6321258544921875, 10.315853118896484, -7.200839996337891, 9.055484771728516, 26.586238861083984, 4.391624450683594, -4.382453918457031, 19.643659591674805, 16.906822204589844, 8.126768112182617, 21.02996826171875, -13.021530151367188, 5.315317153930664, 12.561046600341797, 27.152240753173828, 2.7776336669921875, 7.398965835571289, -2.30535888671875, 40.558448791503906, 16.897144317626953, -13.8270263671875, 3.8452606201171875, 28.426048278808594, 2.187467575073242, 52.8226318359375, 4.467498779296875, 22.16119384765625, 45.03663635253906, 1.6077651977539062, 22.811717987060547, 22.076583862304688], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000647.npy"}
{"epoch": 0.9780801209372638, "step": 648, "batch_size": 64, "mean": 11.590964317321777, "std": 15.26468276977539, "min": -32.36396789550781, "p10": -3.002249336242675, "median": 10.872230529785156, "p90": 33.118228149414065, "max": 47.29829025268555, "pos_frac": 0.765625, "sample": [17.28333282470703, 21.825355529785156, 11.723472595214844, -32.36396789550781, 29.917999267578125, 0.7180328369140625, -17.830535888671875, 11.426937103271484, 24.255157470703125, 16.408781051635742, -1.3085212707519531, 10.317523956298828, 5.437614440917969, -1.9007720947265625, -3.4180068969726562, -14.425853729248047, 14.16879653930664, -0.3173370361328125, 42.33192443847656, 27.652191162109375, 47.29829025268555, 6.989650726318359, 20.4268798828125, 34.03593826293945, 24.789154052734375, 8.498088836669922, 4.350677490234375, 36.560752868652344, 6.837432861328125, 42.906463623046875, 7.368064880371094, -0.6198692321777344, 33.0916748046875, 33.129608154296875, 2.1803417205810547, -5.366035461425781, 9.5885009765625, 8.889881134033203, 15.12101936340332, 28.840972900390625, 12.19110107421875, 17.158092498779297, 36.952484130859375, -0.3002967834472656, -5.011054992675781, 3.0115280151367188, 14.038955688476562, 22.771230697631836, 2.5255126953125, 14.68560791015625, 12.55544662475586, 19.883041381835938, 16.115318298339844, -15.376708984375, 0.7039794921875, 18.663818359375, 22.9833984375, -0.5892715454101562, 3.6299667358398438, -1.4156684875488281, 3.0336380004882812, -2.0321483612060547, 6.7464141845703125, 12.077728271484375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000648.npy"}
{"epoch": 0.9795918367346939, "step": 649, "batch_size": 64, "mean": 10.998601913452148, "std": 15.761153221130371, "min": -23.196762084960938, "p10": -9.034126472473144, "median": 11.438446044921875, "p90": 32.99332733154297, "max": 42.18362808227539, "pos_frac": 0.75, "sample": [26.87274169921875, 18.2640323638916, -2.9727134704589844, 14.098518371582031, 25.268972396850586, 16.704147338867188, 22.484092712402344, 10.490385055541992, 4.4783782958984375, 29.879444122314453, 1.0737056732177734, 38.71430969238281, 11.146713256835938, -14.50244140625, 13.403846740722656, 11.730178833007812, 12.21200942993164, 24.282012939453125, 16.89459800720215, 12.573036193847656, 39.81041717529297, -4.589746475219727, -7.366729736328125, -14.086341857910156, -23.196762084960938, 21.712345123291016, 28.732757568359375, 3.4436206817626953, -3.8617019653320312, -7.8172454833984375, 25.43081283569336, 0.8665008544921875, 33.12039566040039, 1.5950336456298828, 21.51559066772461, 9.959762573242188, 13.846168518066406, 23.012935638427734, 33.52030944824219, 13.630167007446289, 2.8503952026367188, 42.18362808227539, 19.150291442871094, -11.056224822998047, -1.60418701171875, 32.696834564208984, 1.6100482940673828, 2.2496795654296875, 6.9441986083984375, 1.4981803894042969, -17.096275329589844, 35.926597595214844, -9.555646896362305, 4.2832794189453125, 41.03559875488281, -6.946987152099609, -12.35158920288086, -4.7287139892578125, 18.087039947509766, 19.319507598876953, 25.07750701904297, 10.660057067871094, 4.253211975097656, -2.95013427734375], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000649.npy"}
{"epoch": 0.981103552532124, "step": 650, "batch_size": 64, "mean": 10.888345718383789, "std": 16.859827041625977, "min": -23.0225830078125, "p10": -6.968121337890625, "median": 8.608724594116211, "p90": 31.982691001892093, "max": 49.20238494873047, "pos_frac": 0.734375, "sample": [-10.504161834716797, 29.139419555664062, -6.76007080078125, 31.054664611816406, 1.4458160400390625, 32.799034118652344, 10.879837036132812, 0.85614013671875, 6.221305847167969, 2.134246826171875, 25.782859802246094, 23.306400299072266, -6.223289489746094, 12.000255584716797, 0.8052539825439453, 29.67432403564453, 31.315942764282227, 12.683618545532227, 22.61591339111328, 1.1515064239501953, 8.703289031982422, 12.10920524597168, -4.337255477905273, -3.871967315673828, 40.52320861816406, 23.863372802734375, 49.20238494873047, 20.995410919189453, 22.96307373046875, -11.460319519042969, 2.5492820739746094, -22.553627014160156, 3.7801246643066406, 27.137245178222656, -23.0225830078125, 2.9486541748046875, 47.055328369140625, 23.90676498413086, 1.4117660522460938, -17.376792907714844, 16.409622192382812, 32.26844024658203, -13.974395751953125, 30.22503662109375, -4.651741027832031, -6.8573150634765625, 36.21467590332031, 19.89498519897461, 8.51416015625, -1.617898941040039, 10.235191345214844, 2.935394287109375, 19.91860580444336, -0.1639251708984375, 3.745279312133789, 15.010889053344727, -7.0156097412109375, 16.6494140625, 14.92519760131836, 45.35493469238281, -6.1802978515625, 5.5492706298828125, 7.561985015869141, -3.0033645629882812], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000650.npy"}
{"epoch": 0.982615268329554, "step": 651, "batch_size": 64, "mean": 10.358694076538086, "std": 14.641880989074707, "min": -19.724266052246094, "p10": -4.112140274047851, "median": 7.649138450622559, "p90": 29.57285079956055, "max": 51.83692169189453, "pos_frac": 0.734375, "sample": [1.1299686431884766, 3.512054443359375, 7.852745056152344, 4.8868560791015625, 16.545654296875, -19.491004943847656, -2.7844009399414062, 7.445531845092773, -2.0063323974609375, 24.019868850708008, 3.799480438232422, 3.022491455078125, -4.302028656005859, -3.6690673828125, 19.87759017944336, -19.724266052246094, -1.3471221923828125, 36.781524658203125, 5.485252380371094, 16.624649047851562, 13.836921691894531, -3.139507293701172, -4.9659423828125, 5.2430419921875, 5.846092224121094, 0.5729846954345703, 23.6680908203125, 21.394287109375, -2.1107330322265625, 29.742431640625, 15.262350082397461, 28.80988311767578, 12.30313491821289, 0.04586029052734375, 15.307662963867188, 1.8496570587158203, 26.186351776123047, 15.032073974609375, -5.896125793457031, 13.538135528564453, 32.352447509765625, 14.622978210449219, 24.259021759033203, 7.902103424072266, -1.9558296203613281, 46.26798629760742, 51.83692169189453, 31.11522674560547, -2.1768112182617188, 29.177162170410156, 5.294679641723633, -2.36724853515625, 1.597341537475586, 18.113754272460938, 11.664932250976562, 14.765007019042969, 19.518211364746094, 10.117572784423828, 13.564834594726562, -11.635345458984375, 5.916648864746094, -8.207244873046875, 41.502342224121094, -0.4763946533203125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000651.npy"}
{"epoch": 0.9841269841269841, "step": 652, "batch_size": 64, "mean": 10.115922927856445, "std": 14.549574851989746, "min": -16.493316650390625, "p10": -6.542061614990234, "median": 8.661823272705078, "p90": 31.45265731811524, "max": 43.34751892089844, "pos_frac": 0.671875, "sample": [-1.1664276123046875, 29.018266677856445, 36.79335403442383, 12.571182250976562, -0.16797637939453125, 32.070045471191406, 17.834285736083984, 9.22701644897461, 43.34751892089844, 2.3457870483398438, 22.151573181152344, -9.926994323730469, 1.2765998840332031, 17.002168655395508, 17.607959747314453, 16.21405029296875, 0.6396675109863281, 19.44426727294922, 12.201667785644531, -9.823028564453125, -6.681983947753906, 15.983261108398438, 13.3035888671875, -0.7594375610351562, 3.4070091247558594, 1.7662811279296875, -16.493316650390625, -2.520355224609375, 30.0120849609375, 10.878990173339844, 8.528915405273438, -3.2805252075195312, 11.070606231689453, 8.794731140136719, 18.993423461914062, -1.0480728149414062, 41.72582244873047, 4.230121612548828, -4.412361145019531, -8.313095092773438, -1.1027374267578125, 38.48158645629883, 6.7835540771484375, 10.597671508789062, -1.3987655639648438, -1.0893211364746094, 25.833091735839844, 11.416763305664062, -0.4220142364501953, 29.70397186279297, 39.02081298828125, 2.5827865600585938, 18.482345581054688, 5.778594970703125, -6.753780364990234, 18.468040466308594, -6.18280029296875, 8.04456901550293, -6.215576171875, 38.46263885498047, 17.386512756347656, -3.404022216796875, -10.873210906982422, 19.971710205078125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000652.npy"}
{"epoch": 0.9856386999244142, "step": 653, "batch_size": 64, "mean": 10.676427841186523, "std": 16.18169403076172, "min": -25.111221313476562, "p10": -4.747282409667968, "median": 6.487344741821289, "p90": 35.354410552978514, "max": 55.59747314453125, "pos_frac": 0.78125, "sample": [20.848825454711914, 36.446380615234375, -0.15614700317382812, 3.629913330078125, 5.251262664794922, 9.308242797851562, 11.456809997558594, 5.861663818359375, 42.14275360107422, 0.9121246337890625, -9.423454284667969, 3.22900390625, 35.375511169433594, -0.309661865234375, 32.51081848144531, 24.20543670654297, 13.979320526123047, 35.4237060546875, 4.769412994384766, 32.86952209472656, 25.57195281982422, 38.01155090332031, 5.266393661499023, -0.39301300048828125, 5.553504943847656, 0.43051719665527344, 20.697608947753906, 1.9320220947265625, -0.15710067749023438, 3.3819503784179688, 55.59747314453125, -14.4520263671875, -4.3507232666015625, 6.849079132080078, 18.176565170288086, -14.595436096191406, 8.214744567871094, 10.100898742675781, 11.423194885253906, 3.027698516845703, 4.381011962890625, -25.111221313476562, 19.770843505859375, 15.074745178222656, 20.905029296875, -24.151832580566406, 35.30517578125, -3.757171630859375, 43.42999267578125, 7.170654296875, 4.800285339355469, 28.273799896240234, -15.013961791992188, 1.7257919311523438, -1.5972843170166016, 0.6840076446533203, 14.44320297241211, 18.690406799316406, 4.253238677978516, 17.08544921875, 6.1256103515625, 16.431110382080078, -4.917236328125, 10.671455383300781], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000653.npy"}
{"epoch": 0.9871504157218443, "step": 654, "batch_size": 64, "mean": 10.721696853637695, "std": 15.151617050170898, "min": -17.903396606445312, "p10": -5.534330940246581, "median": 6.346099853515625, "p90": 32.109700775146486, "max": 54.81158447265625, "pos_frac": 0.8125, "sample": [5.291473388671875, 27.257293701171875, -13.149124145507812, 11.16848373413086, 4.078285217285156, -6.856903076171875, 22.181259155273438, 10.160823822021484, 10.686405181884766, 0.24225616455078125, 2.2509689331054688, -5.986412048339844, 31.290878295898438, 4.759498596191406, -0.6536941528320312, 11.416120529174805, 17.317646026611328, -6.942516326904297, 32.46062469482422, 2.2311935424804688, 2.56842041015625, 6.2956390380859375, 24.708892822265625, 47.21142578125, 0.3860950469970703, 4.580280303955078, -1.1168384552001953, -4.666435241699219, -1.18487548828125, 9.915473937988281, 2.0460376739501953, 18.354690551757812, 54.81158447265625, 7.187030792236328, 15.467460632324219, 18.800270080566406, 16.46319580078125, -4.086601257324219, 0.4124908447265625, 2.2010955810546875, 21.845956802368164, -17.903396606445312, 12.927978515625, -5.906286239624023, 4.828468322753906, 0.5386314392089844, 42.560428619384766, 41.68574523925781, 8.792158126831055, 9.094184875488281, 9.816268920898438, 42.317474365234375, 4.97509765625, 11.240119934082031, 48.756431579589844, -7.008358001708984, 6.257781982421875, 5.734569549560547, 3.381439208984375, 25.561973571777344, 11.707355499267578, 6.1406097412109375, 12.887496948242188, 6.3965606689453125], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000654.npy"}
{"epoch": 0.9886621315192744, "step": 655, "batch_size": 64, "mean": 11.356832504272461, "std": 12.724363327026367, "min": -10.456794738769531, "p10": -4.002120971679686, "median": 9.826579093933105, "p90": 23.81196746826172, "max": 62.843292236328125, "pos_frac": 0.828125, "sample": [19.96881675720215, 4.714866638183594, -0.04534912109375, 6.223978042602539, 9.740043640136719, 33.42737579345703, 23.8580322265625, 19.483890533447266, 4.864383697509766, 4.4748687744140625, 0.9230804443359375, 10.179832458496094, 4.961189270019531, 4.607460021972656, -7.447269439697266, -7.055461883544922, 9.913114547729492, 23.65918731689453, 35.88536834716797, 2.871805191040039, 28.860858917236328, -4.6817626953125, 23.700122833251953, 11.089569091796875, 11.761579513549805, -1.6088714599609375, 8.503101348876953, -10.456794738769531, 21.179550170898438, 2.7061691284179688, 21.177574157714844, 8.018150329589844, 22.137733459472656, 6.5493927001953125, 19.432693481445312, 4.058719635009766, 8.516286849975586, 16.25900650024414, 15.515457153320312, 62.843292236328125, 4.4084930419921875, -8.981277465820312, -2.416290283203125, 23.704483032226562, 10.414894104003906, -8.191734313964844, 8.933807373046875, 17.067474365234375, 31.578218460083008, 22.632293701171875, 4.028896331787109, 3.3270339965820312, 19.206695556640625, 6.268823623657227, 6.486202239990234, 12.356452941894531, 25.325336456298828, 11.05963134765625, 18.12635040283203, -2.1846160888671875, -9.296935081481934, 19.891386032104492, 17.23394775390625, 15.086669921875], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000655.npy"}
{"epoch": 0.9901738473167044, "step": 656, "batch_size": 64, "mean": 12.519877433776855, "std": 14.689604759216309, "min": -19.079971313476562, "p10": -3.654612731933593, "median": 8.66054916381836, "p90": 32.52769165039063, "max": 50.92334747314453, "pos_frac": 0.84375, "sample": [36.60600662231445, 3.2827377319335938, 36.58873748779297, 16.10626220703125, 2.931610107421875, 2.9068374633789062, 5.582157135009766, 4.4368896484375, 10.53704833984375, -3.0672454833984375, 6.194549560546875, 29.08935546875, 2.67791748046875, 2.04022216796875, 2.3240604400634766, 23.317230224609375, -7.2051849365234375, -5.520927429199219, 13.118675231933594, 23.807363510131836, 50.92334747314453, 13.771453857421875, -19.079971313476562, 20.924129486083984, 26.304168701171875, -1.6617774963378906, 9.626106262207031, 18.195693969726562, -3.906341552734375, 0.9693031311035156, 47.01538848876953, 0.5110855102539062, 5.007654190063477, 8.504127502441406, 7.907695770263672, 32.04859924316406, 23.93951416015625, 12.677925109863281, 38.47844696044922, 14.772125244140625, 3.9906463623046875, -3.9892425537109375, 18.143043518066406, 32.73301696777344, 6.880161285400391, 21.64636993408203, 4.439050674438477, 28.301620483398438, 4.199817657470703, 15.011617660522461, 30.871551513671875, -5.998451232910156, 10.842376708984375, 44.1815185546875, 16.283309936523438, 8.816970825195312, 1.7886085510253906, -8.419921875, 1.9201431274414062, 0.9562454223632812, -1.708770751953125, 7.125160217285156, 19.629362106323242, 30.94493865966797], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000656.npy"}
{"epoch": 0.9916855631141346, "step": 657, "batch_size": 64, "mean": 13.723871231079102, "std": 16.28316879272461, "min": -27.98269271850586, "p10": -2.8714551925659175, "median": 12.425362586975098, "p90": 38.461087036132824, "max": 49.57483673095703, "pos_frac": 0.796875, "sample": [39.96782684326172, 20.195693969726562, -3.0866355895996094, 26.267074584960938, 17.529085159301758, 10.575162887573242, 39.96710205078125, 41.271568298339844, 18.10472869873047, -27.98269271850586, -1.0857276916503906, -2.2640323638916016, 19.192829132080078, 5.157072067260742, 33.6717529296875, 9.54559326171875, 27.94194793701172, 47.49561309814453, -8.419544219970703, 28.321029663085938, -1.3746795654296875, 12.490983963012695, -9.518260955810547, 2.914102554321289, 13.957071304321289, 34.947052001953125, 3.756683349609375, 12.3597412109375, -1.0181350708007812, 6.308494567871094, 15.93914794921875, 24.263391494750977, 11.480484008789062, 22.61919403076172, 22.358701705932617, 15.449249267578125, 13.868576049804688, 42.945960998535156, 29.231002807617188, 6.716150283813477, 23.988311767578125, 8.12640380859375, 1.3726272583007812, 26.401397705078125, -1.889495849609375, 2.79833984375, 49.57483673095703, 3.977693557739258, -7.387702941894531, 7.925449371337891, 6.287986755371094, -18.46057891845703, 27.489479064941406, 0.32677650451660156, 0.25748252868652344, 3.095794677734375, -7.204109191894531, 15.326774597167969, 23.227787017822266, 43.77422332763672, -2.3693675994873047, 20.62579345703125, 3.833057403564453, 25.168432235717773], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000657.npy"}
{"epoch": 0.9931972789115646, "step": 658, "batch_size": 64, "mean": 9.801492691040039, "std": 17.029258728027344, "min": -28.380088806152344, "p10": -8.724024200439453, "median": 4.90389347076416, "p90": 31.86208419799805, "max": 49.096099853515625, "pos_frac": 0.71875, "sample": [11.445926666259766, 13.719898223876953, 19.957801818847656, -19.9478759765625, 2.9793701171875, 3.4791107177734375, -19.987220764160156, -0.20756912231445312, 16.979684829711914, 23.787368774414062, 1.9514541625976562, 5.280914306640625, -23.744951248168945, 30.55594253540039, 16.62197494506836, 0.14352798461914062, 0.8932037353515625, 13.696182250976562, 49.096099853515625, 1.1925506591796875, -2.5046939849853516, 1.74664306640625, 3.4095840454101562, 20.965354919433594, 37.99256896972656, 29.592605590820312, -8.833221435546875, 17.16480255126953, 20.686885833740234, 3.3907089233398438, 0.4248847961425781, -1.3486385345458984, 8.363569259643555, 29.511131286621094, -1.0218925476074219, 19.309791564941406, 30.860870361328125, -12.03079605102539, -1.1288986206054688, 11.1865234375, 45.45397186279297, 4.526872634887695, 9.332717895507812, 28.786956787109375, -2.2882003784179688, 20.527395248413086, -2.0063934326171875, 32.291175842285156, 32.95507049560547, 19.924396514892578, 4.27166748046875, -3.355752944946289, -2.93524169921875, -15.33012580871582, 9.69635009765625, 16.573570251464844, -5.836872100830078, -28.380088806152344, 42.28663635253906, -8.469230651855469, 43.199554443359375, 4.026668548583984, 24.365310668945312, 2.0479507446289062], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000658.npy"}
{"epoch": 0.9947089947089947, "step": 659, "batch_size": 64, "mean": 9.027709007263184, "std": 14.709617614746094, "min": -24.305831909179688, "p10": -8.003089141845702, "median": 7.227928161621094, "p90": 29.288668823242194, "max": 48.8019905090332, "pos_frac": 0.71875, "sample": [2.6070022583007812, 18.883197784423828, 30.96471405029297, 18.064172744750977, -1.1062889099121094, 29.90228271484375, 7.004096984863281, 17.60609245300293, 7.9207305908203125, 48.8019905090332, 24.137969970703125, -7.381034851074219, 6.916404724121094, -0.8367691040039062, 2.1572494506835938, 11.626953125, 14.921295166015625, -13.72683334350586, 4.628168106079102, -11.359130859375, 27.856903076171875, -3.9151668548583984, 0.6912689208984375, 10.416748046875, 1.6892871856689453, -2.0939483642578125, 5.754304885864258, 11.826971054077148, -2.9402999877929688, 11.858078002929688, 6.427116394042969, 7.828651428222656, 15.891056060791016, 7.2312469482421875, 6.866752624511719, 15.013633728027344, 35.80903625488281, 17.32541275024414, 44.03459167480469, 11.051025390625, 31.075851440429688, -1.178131103515625, -8.269683837890625, 21.526872634887695, -7.246702194213867, 2.115234375, 20.286094665527344, 17.828392028808594, 15.635940551757812, -10.719772338867188, 5.9394378662109375, 17.086917877197266, -24.305831909179688, -10.908977508544922, 23.47381591796875, 7.224609375, -1.6237754821777344, -22.303123474121094, 39.86096954345703, 10.6490478515625, -4.806060791015625, 15.364084243774414, 4.1953277587890625, -3.482086181640625], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000659.npy"}
{"epoch": 0.9962207105064248, "step": 660, "batch_size": 64, "mean": 12.517721176147461, "std": 14.261533737182617, "min": -17.60187530517578, "p10": -3.1585800170898435, "median": 10.104963302612305, "p90": 34.08629112243654, "max": 49.56462097167969, "pos_frac": 0.84375, "sample": [37.28645324707031, 4.416160583496094, 4.478862762451172, -3.10772705078125, 16.03066062927246, -4.674476623535156, 8.4884033203125, 24.96706771850586, -12.275924682617188, -17.60187530517578, 35.708377838134766, 49.56462097167969, 8.97719955444336, 13.793994903564453, 11.685905456542969, 38.83819580078125, 26.12456512451172, 12.601655960083008, 7.454519271850586, -6.6411285400390625, 6.2003021240234375, -12.957122802734375, 24.920753479003906, 1.9531803131103516, 14.310989379882812, 39.663612365722656, -11.561622619628906, 12.78179931640625, 15.80881118774414, 0.1894989013671875, 6.286769866943359, 30.301422119140625, 38.7088623046875, 5.244636535644531, 1.3935298919677734, -2.1231002807617188, 7.293449401855469, 6.2078399658203125, 24.98671531677246, 0.743377685546875, 10.177345275878906, 14.436517715454102, 14.439033508300781, 19.300952911376953, 28.263324737548828, 15.224414825439453, -3.1803741455078125, 3.6505508422851562, 21.295053482055664, 27.28215789794922, 10.032581329345703, 40.68328094482422, 6.233558654785156, -0.49500083923339844, 27.01871109008789, 10.433235168457031, 15.447603225708008, 22.776771545410156, 8.877685546875, 1.9787368774414062, 8.793031692504883, 9.841621398925781, 1.5516891479492188, 20.602439880371094], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000660.npy"}
{"epoch": 0.9977324263038548, "step": 661, "batch_size": 64, "mean": 10.303092956542969, "std": 15.598078727722168, "min": -36.94904327392578, "p10": -6.551709365844726, "median": 9.78125286102295, "p90": 29.612030410766604, "max": 45.17748260498047, "pos_frac": 0.765625, "sample": [2.1121292114257812, -6.000911712646484, -13.404136657714844, -10.738872528076172, 23.435466766357422, 9.901433944702148, 31.63475799560547, -3.6623077392578125, -26.922882080078125, 11.613504409790039, -11.352638244628906, 1.6510467529296875, 4.2908935546875, 39.50373077392578, 21.094749450683594, 29.942794799804688, 45.17748260498047, 10.070148468017578, 36.74766540527344, 7.217079162597656, -36.94904327392578, 14.777175903320312, 25.802715301513672, 0.22863388061523438, -5.580596923828125, 4.906944274902344, 6.5901031494140625, -4.974948883056641, 2.089000701904297, 4.20465087890625, 12.252338409423828, 19.33234214782715, -12.798242568969727, 28.55707550048828, 24.99951171875, 4.40217399597168, 14.808326721191406, 28.952648162841797, 3.910114288330078, 24.46845245361328, 23.755569458007812, 3.5364151000976562, 4.9715728759765625, 28.085216522216797, 22.778427124023438, 9.66107177734375, 27.97283935546875, 8.445688247680664, 4.652620315551758, 7.254444122314453, -6.7877655029296875, 21.485641479492188, -2.6683502197265625, -1.0751895904541016, 11.39926528930664, 30.94847869873047, 20.128433227539062, 14.1844482421875, 10.777740478515625, 29.894622802734375, 12.009994506835938, -3.2222747802734375, -0.8696441650390625, 19.788192749023438], "npy": "outputs/llama-3-8b-base-margin-dpo-hh-harmless/margin_logs/step_0000661.npy"}
3
margin_logs/step_0000001.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:22dddd9b4bf59a58ac9754704862dc0b60abca7f1f9941029f73d8f470387557
size 384
3
margin_logs/step_0000002.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9c4a0d8c26a315903fc2506660d8ac2eb82c1e4d9a761e6a7de89830e1a119f6
size 384
3
margin_logs/step_0000003.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6f9e8a5891dc854f1d6c92e0c3b2215ca5f2a54c541adf3ba87dc089408f1cca
size 384
3
margin_logs/step_0000004.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3a2b20f6ac34ce8ac7f657471dc77e5519e65d487305129f0d3203ab0f847c93
size 384
3
margin_logs/step_0000005.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ba9a1829598c56ad8b65eea14f56f162df250f5a91136c9be81396318d2ec099
size 384
3
margin_logs/step_0000006.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1ceabbb82c4d23118fb14d12fb4745d0027a810066e4a5f437334b1fd0495702
size 384
3
margin_logs/step_0000007.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:325f70103d858470b9306f860b2a1508bd759941b17334d6444ee80fceeeecf1
size 384
3
margin_logs/step_0000008.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e13f0dc694a820c6b25e973b2a4f9d28148e535c7f7fe2fd69641c2e5c813269
size 384
3
margin_logs/step_0000009.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c4738d4278baf3de80fa9ba2ee8f6deab49afd11c4e9ca22c07311199ca95c7d
size 384
3
margin_logs/step_0000010.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c2a00ba417a39c1c433c0342320e5c31d270925b2593b2e66546a3888e5f1006
size 384
3
margin_logs/step_0000011.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:16ea683a87f02494069a71286338f65a78974868713b474b52470ef10d4d931d
size 384
3
margin_logs/step_0000012.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c703f5073668afb05f8516f104c4549eb9bbe871f3bbcd6d508ab390b04dbe49
size 384
3
margin_logs/step_0000013.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9daeabd50965eb10d5a56e9a8718bc0430c40065b23009df94eb29bf4c205053
size 384
3
margin_logs/step_0000014.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dfbe188f39cda22bf90d90feebbe3447f8a4fb75c2e8576c841ee65d91aa5428
size 384
3
margin_logs/step_0000015.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e44325d22de130e173d98003e444d8b8c323411d5625eebcea52a089b8068ca5
size 384
3
margin_logs/step_0000016.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0003319671ca132521f9d22b774a3e2de2e030c885416e86757616bbc6bc363d
size 384
3
margin_logs/step_0000017.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6afe7b029bc7a6dc3740d75073ae6bd49972190518ce2d2c7b6fa80ad1fa573e
size 384
3
margin_logs/step_0000018.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:87bbdc38102b473529edf3c70d353ce443c0791ee784b28b6e79f44218ceb1d4
size 384
3
margin_logs/step_0000019.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:354cd53652135ea7e8f1de7fb82d7bd52b673c6471e2b40a748620d5ac4c9181
size 384
3
margin_logs/step_0000020.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d16cabf5ff37eaaa66ed9956686f2ab5c628f9a1e2f16cbb42fe772b41704e9b
size 384
3
margin_logs/step_0000021.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c1b1e9e40bec7b7bc713b2117a86e094efea6f2195c50d7820541546e9a1d389
size 384
3
margin_logs/step_0000022.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2199ddf71b20ddc785627f9bbc15ab9bc373a8d96e4a5bd682f420d63dddbc0d
size 384
3
margin_logs/step_0000023.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5f6a60588dd2f8e15c976a8e4e95ec399fc4dd3ae58b4a40b47640e328ac05ee
size 384
3
margin_logs/step_0000024.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:71f802564cd88f5310077d055989176f8b0d229c9433968712dbbf7cba7814c9
size 384
3
margin_logs/step_0000025.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d2747d3ad02f33eaa981bfebf9c8ba52ee5564ea2d0e746bbf4574f621da11b7
size 384
3
margin_logs/step_0000026.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1fb1c629cb9871027ed1b7913c8828b823c7221fc9ade8c55bc599b5048616b0
size 384
3
margin_logs/step_0000027.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1c445119d467f890f8d780c9a7bb48ce08761b87eda1f46d59b9afa4f8e4cbeb
size 384
3
margin_logs/step_0000028.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ee2c4144307fcd6754801418c7732d47a120e4553a14eb7ca26c5ee78bb56643
size 384
3
margin_logs/step_0000029.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f0d45db42042b545c3a1cdea8de0454bb3cd464d1409695c0f9e6717cfd715fb
size 384
3
margin_logs/step_0000030.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7e07b6585743476f2e37e5b9a54c993157d5a8b93b67e650aa8f2bf2e2bc3420
size 384
3
margin_logs/step_0000031.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0de2f727b104d1ef53a2cb16448cb1c51281587268a0283ecf8f98751f602ade
size 384
3
margin_logs/step_0000032.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c1117bc7b2bc1816e6a401a76779e35b9c65c1d6e2ad4b32f617ba3ada4516ff
size 384
3
margin_logs/step_0000033.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:37d1db7abac021dcde2e394c8bcd660cc47cf3175e5651f426e5dd41f6618db3
size 384
3
margin_logs/step_0000034.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:369a4dc068b86c3acc81d7bd3a70ff9f486f62c8c8051c7e3e146fc5c669e769
size 384
3
margin_logs/step_0000035.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:da8d85b96990b8591ea2e27f6153efe159bf0ffa26505db1789dea9854e60a6f
size 384
3
margin_logs/step_0000036.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5656d479202d333dfc796b7997b494ffec04349f11fe9fa9dfeff443fe026494
size 384
3
margin_logs/step_0000037.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e72b1d7f57b0eecef23f11873f79d17a57447432f027d56814d2a88431486179
size 384
3
margin_logs/step_0000038.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c497002b9315516a7edfed61afd44bbf01e120f5eff0607e5def401a417545eb
size 384
3
margin_logs/step_0000039.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:46ce728f404b3bf7ecc80b7828a6f38da52747626a9e6fbda80cea2f3af3c5aa
size 384
3
margin_logs/step_0000040.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a43188019d931831ffcf866c949a609e93dfb742a9caf240f3797ea876706930
size 384
3
margin_logs/step_0000041.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:687863a45fb2c67280a0c84cbcf35cc2d31d09d0fb0fcb6ee7a878b1a3b5c298
size 384
3
margin_logs/step_0000042.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5ba02d029559e28afd9ab2a049824a03280cd54e8c8bc449c331d85914b8c11b
size 384
3
margin_logs/step_0000043.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:90526f2b651bedac57c0ad4f03e836cf40c4e2e38e5e58d2fbf34f448b87087c
size 384
3
margin_logs/step_0000044.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a2fa3ff5cae6d16c693a2bd8c68e01beef25e60b6ae397c593bde72fa2cae29a
size 384
3
margin_logs/step_0000045.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:24b6254cbe98a8ba30827ca22a7c078e09c0701ee415cb754d0dfabc3ca4862b
size 384
3
margin_logs/step_0000046.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:17691ace4971e66775c66cc2042b9f59048759dcc803e160cab1c6eed1495b5b
size 384
3
margin_logs/step_0000047.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ee17123d881657764a378f7833bbb1e4ebb75082c0f27e1a5bb0e8b50bb09143
size 384
3
margin_logs/step_0000048.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6d8c74cadc880e4b29bd54e1ebca0612bda887631705c220e3afcfb38bd1a534
size 384
3
margin_logs/step_0000049.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f98882646ec534059f0c41765d9f67dcae054d4149cee4bad5cf3ea2ff52ccc5
size 384
3
margin_logs/step_0000050.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0f4f37391098a387a9b426263c88f20e0ee7e1d770e89ef5f10b8c4bb12ef8c1
size 384
3
margin_logs/step_0000051.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ba978534a354701b1aac42b78c9966799de66b1d2d8175a05dbea3085b13f6cd
size 384
3
margin_logs/step_0000052.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:91762b0a49a0258b860e6591b42b3291d275177d2c2396439baac79948ff6c71
size 384
3
margin_logs/step_0000053.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:48d8f5d1438ee34de0cba74bdbf3e87b54a37e114530bf2365e28ac6411217ce
size 384
3
margin_logs/step_0000054.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6e1ce87cad027a4578283e782d311fb9f225e13f8889937210bbe4b0ce309352
size 384
3
margin_logs/step_0000055.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7c4714798ccc943f5c94dc5a80bc2f3c44f8cf95a88ef2eaecf26354a41c0aac
size 384
3
margin_logs/step_0000056.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d33dc4d2e62c44b8f16ab5a21c24c728c9cd6eb2c4e3c65ae88238c1cc616c63
size 384
3
margin_logs/step_0000057.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:59f38b1ce51a98c399e1bf281589579702610cf6b289cdc71f3013b040613bac
size 384
3
margin_logs/step_0000058.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8c30b4771d92beac74f8c78b8d211cf8750d5469c570f186a8f404b2023ccc17
size 384
3
margin_logs/step_0000059.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cfcab22fdefa8f8950422f6f4136a8248ab2d534531cb4ab84894852fbf38e32
size 384
3
margin_logs/step_0000060.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ac6d2c735152d69b6940d8b44d2fdda435407277b8cc3e010e4d7ca827da613b
size 384
3
margin_logs/step_0000061.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7781e84c5d8a0a3730a834364a95affee908eeda9d511b70a7776695ffacda1e
size 384
3
margin_logs/step_0000062.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e49d9f85a9b26018603888659d88dc5209887f1058a917cf41e10b2cfb4326ff
size 384
3
margin_logs/step_0000063.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:349827351687e6a402cd22ff7ee9db2438f6de9ba70e11784a0710dcdbf0b858
size 384
3
margin_logs/step_0000064.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:49da5dac8aa10a9ec724f9e34ca43c160c59ea5d548e712bd77e8418ec3db627
size 384
3
margin_logs/step_0000065.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:76135e992d1fefa9f7507d68cc2799c777dce0ad36adb62d85e86d82b65a17ca
size 384
3
margin_logs/step_0000066.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a4e8eab75afb463fe61ace6004f82cdae4137b0bb050baa18630ae432080ca17
size 384
3
margin_logs/step_0000067.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b9d764c8969fe81a9f4eac255950fcb5f1f990ea25334aa4e83562e58ad5f26a
size 384
3
margin_logs/step_0000068.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:12d012555d293b9250fcb248516e544b210933a4b6060fe5e7c8a8ebf51f2be4
size 384
3
margin_logs/step_0000069.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2cba11f031e6b7b987f43e6800c4aa61b397f3de124ec82e56726fda30a6743a
size 384
3
margin_logs/step_0000070.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5adf5407d5a952d1026435be6de4360331ce30c40cce055fe7556725d4bf3734
size 384
3
margin_logs/step_0000071.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:52649384d42d69d74dd2dd8cbbaa0ea8be69ae409de94bfc4e84045ba42b9ba9
size 384
3
margin_logs/step_0000072.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bcb19ff723a4e804dc62623c4f64f41aa2ec1676f4f77bedd8fb0d0f18e8e5a8
size 384
3
margin_logs/step_0000073.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:675c54db089ef56742a5b9c5ceec4f0541482234f3c45cc0352f5a90c185dbff
size 384
3
margin_logs/step_0000074.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0cac3b4d5c16279182f1b24fd51df10547d69d43166bc9b267968d16922649c7
|
||||
size 384
|
||||
3
margin_logs/step_0000075.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bda62197d24e9810dd34f53905decce4d08bc3520554cd0c684baab3661577da
size 384
3
margin_logs/step_0000076.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b6b19d382bf6a4a61c2b8f8b37ba7b7f636515db9e7ca54994ec282403c86522
size 384
3
margin_logs/step_0000077.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:816b598a55827cbb3567639b1793062c7a8bd6dc3ddfb4fbe6479b97b8d97098
size 384
3
margin_logs/step_0000078.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:27f5cf552b74e21e0a400ed1a876258b8e292fc098d28010e60f9cfd41c765bd
size 384
3
margin_logs/step_0000079.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b4c480c51b2b4bbff5f953dff9c7fd79abe952ba387bbd11d6cb6814aeb4484d
size 384
3
margin_logs/step_0000080.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3d7cd5bb7347f3cd240520c94da1f6326986facdda5f3ebeb45d276d39faff60
size 384
3
margin_logs/step_0000081.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2d10bd18f1a6412f46b8ebd7002e99497ec89340934a894145fb9b3b9fe84dcd
size 384
3
margin_logs/step_0000082.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:de321ae4bcbc19be547ff5d88ce1bb9685e1ae5245b3d5f15c1ef8f76af76226
size 384
3
margin_logs/step_0000083.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:851e852ad5737f13490008c1d2297c990383e2ef4bb8320dcff796daa590eef5
size 384
3
margin_logs/step_0000084.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cd5a5eac3241dd34750c3c0e2f35870ecb3472357d24a528a8149e78102b5b30
size 384
3
margin_logs/step_0000085.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:63c1aa79efa0b72673013b3299e8604121530d9787ca310349b3e26e122d9a16
size 384
3
margin_logs/step_0000086.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f18cac7529e18854090dde2c37dcc3688306e6250c0d523941a0a98563f080eb
size 384
3
margin_logs/step_0000087.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f525438e2b2d03e4099dff40cacdbc4ec4593117ceafa234df86111ab99db4e4
size 384
3
margin_logs/step_0000088.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:85847d8b211ee745bbec7440a0f0b9e2f86a41ee12b9092cacd3117dcfcb55e3
size 384
3
margin_logs/step_0000089.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:329a8cdb534927267fa8cbd5144f60292e4b5c0bc345cc51e65d86e3292201a0
size 384
3
margin_logs/step_0000090.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2858e70c67c88322b9dc8ce53cbd16ec964356c9754c16d494dd63b8eb40c2e3
size 384
3
margin_logs/step_0000091.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bd1e531a4674afa2aa10fb7f1847fe72c2bb68d8ac9d4b931747e2cb9b5e5304
size 384
3
margin_logs/step_0000092.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7ebead11ddc60db6b2a59e9888dab17795777ebb4a696eb938f69fc20c1397cb
size 384
3
margin_logs/step_0000093.npy
Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:59775953c56c8f0457e6baf8a4df712e4eac5807389d5b9b4898d17e5d018490
size 384
Some files were not shown because too many files have changed in this diff.