Initialize project; model provided by the ModelHub XC community

Model: OpenLLM-France/Lucie-7B
Source: Original Platform
ModelHub XC
2026-06-01 07:45:17 +08:00
commit 7042ca32ca
27 changed files with 130759 additions and 0 deletions

36
.gitattributes vendored Normal file

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
figures/pie_dataset_composition_training.png filter=lfs diff=lfs merge=lfs -text

237
LICENSE.md Normal file

@@ -0,0 +1,237 @@
---
title: Apache License 2.0
spdx-id: Apache-2.0
redirect_from: /licenses/apache/
featured: true
hidden: false
description: A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
how: Create a text file (typically named LICENSE or LICENSE.txt) in the root of your source code and copy the text of the license into the file.
note: The Apache Software Foundation <a href="https://apache.org/foundation/license-faq.html#Apply-My-Software">recommends</a> taking the additional step of adding a boilerplate notice to the header of each source file. You can find the notice in the appendix at the very end of the license text.
using:
Kubernetes: https://github.com/kubernetes/kubernetes/blob/master/LICENSE
PDF.js: https://github.com/mozilla/pdf.js/blob/master/LICENSE
Swift: https://github.com/apple/swift/blob/main/LICENSE.txt
permissions:
- commercial-use
- modifications
- distribution
- patent-use
- private-use
conditions:
- include-copyright
- document-changes
limitations:
- trademark-use
- liability
- warranty
---
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

362
README.md Normal file

@@ -0,0 +1,362 @@
---
license: apache-2.0
pipeline_tag: text-generation
language:
- fr
- en
- it
- de
- es
tags:
- pretrained
- llama-3
- openllm-france
datasets:
- OpenLLM-France/Lucie-Training-Dataset
widget:
- text: |-
Quelle est la capitale de l'Espagne ? Madrid.
Quelle est la capitale de la France ?
example_title: Capital cities in French
group: 1-shot Question Answering
training_progress:
num_steps: 756291
num_tokens: 3131736326144
context_length: 32000
---
# Model Card for Lucie-7B
<!-- inspired from the following template:
https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1
-->
* [Model Description](#model-description)
<!-- * [Uses](#uses) -->
* [Example Code in Python](#example-code-in-python)
* [Load the model](#load-the-model)
* [Sentence completion](#sentence-completion)
* [Load a checkpoint](#load-a-checkpoint)
* [Training Details](#training-details)
* [Training Data](#training-data)
* [Training Procedure](#training-procedure)
* [Neural Network Architecture](#neural-network-architecture)
* [Training Hyperparameters](#training-hyperparameters)
1. [Main Pre-training](#1-main-pre-training)
2. [Context Length Extension](#2-context-length-extension)
3. [Annealing](#3-annealing)
* [Training Logs and Learning Curves](#training-logs-and-learning-curves)
<!-- * [Evaluation](#evaluation) -->
* [Disclaimer](#disclaimer)
* [Citation](#citation)
* [Acknowledgements](#acknowledgements)
* [Contact](#contact)
## Model Description
Lucie-7B is a pretrained 7B parameter causal language model built by [LINAGORA](https://labs.linagora.com/) and [OpenLLM-France](https://github.com/OpenLLM-France).
Lucie-7B was trained on 3 trillion tokens of multilingual data, including
English (33.2%),
French (32.4%),
German (6.9%),
Spanish (6.6%),
Italian (3.8%),
and parallel data from those languages (2.5%),
as well as several programming languages (14.7%).
## Example Code in Python
### Load the model
Load the model (quantized version on GPU if possible, for efficient inference):
```python
import transformers
model_name = "OpenLLM-France/Lucie-7B"
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_4bit=True,  # For efficient inference, if quantization is supported by the GPU card
)
```
### Sentence completion
Wrap the model in a text generation pipeline, and specify some generation parameters:
```python
pipeline = transformers.pipeline("text-generation", model=model, tokenizer=tokenizer)
generation_kwargs = dict(
    num_return_sequences=1,  # Number of variants to generate.
    return_full_text=False,  # Do not include the prompt in the generated text.
    do_sample=True,
    temperature=1.0, top_p=1, top_k=None,  # Sampling parameters.
    max_new_tokens=200,  # Maximum length for the output text (in number of tokens).
)
```
Try 1-shot question answering:
```python
prompt = """\
Quelle est la capitale de l'Espagne ? Madrid\n\
Quelle est la capitale de la France ?\
"""
completions = pipeline(prompt, **generation_kwargs)
for completion in completions:
    print(prompt + " […]" + completion['generated_text'])
```
This will print something like:
```
Quelle est la capitale de l'Espagne ? Madrid
Quelle est la capitale de la France ? […] Paris
Quelle est la capitale de l'Italie? Rome
Quelle est la capitale de la Grande-Bretagne? Londres
Quelle est la capitale de la Suisse? Berne
Quelle est la capitale du Portugal? Lisbonne
Quelle est la capitale de l'Algérie? Alger
...
```
If running on GPU (`cuda` device), you will need at least 6 GB of VRAM to run inference with 4-bit quantization (16 GB of VRAM without it).
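As a rough cross-check of these figures, the memory taken by the weights alone can be estimated from the parameter count given in the architecture section of this model card. This is only a back-of-the-envelope sketch: it assumes 2 bytes per parameter for `bfloat16` and 0.5 bytes per parameter for 4-bit quantization, and it ignores activations, the KV cache and quantization metadata, which account for the extra headroom in the figures above. The helper name `weight_memory_gib` is hypothetical.

```python
NUM_PARAMS = 6_706_958_336  # parameter count reported in this model card


def weight_memory_gib(num_params: int, bytes_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: bfloat16 stores each weight on 2 bytes.
bf16_gib = weight_memory_gib(NUM_PARAMS, 2.0)
# Assumption: 4-bit quantization stores each weight on half a byte (no overhead counted).
int4_gib = weight_memory_gib(NUM_PARAMS, 0.5)

print(f"bfloat16 weights: {bf16_gib:.1f} GiB")
print(f"4-bit weights:    {int4_gib:.1f} GiB")
```

Both estimates fall below the recommended VRAM, which is expected since inference needs memory beyond the weights themselves.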
### Load a checkpoint
Checkpoints at several training steps are available under revision tags,
every 5000 steps during the first 25000 steps, and then every 25000 steps.
Intermediate checkpoints can be loaded using the `revision` parameter:
```python
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_name,
    revision="step0753851",
    # ... other arguments as in "Load the model" above
)
```
where `revision` can be one of:
* "[`step0005000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0005000)", "[`step0010000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0010000)", "[`step0015000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0015000)", "[`step0020000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0020000)": every 5000 steps for the first pre-training steps (with a context length of 4096).
* "[`step0025000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0025000)", "[`step0050000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0050000)", "[`step0075000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0075000)", "[`step0100000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0100000)", ..., "[`step0750000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0750000)": every 25000 steps from 25k to 750k steps.
* "[`step0753851`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/step0753851)": last pre-training step before context length extension and annealing.
* "[`extension_step0000250`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/extension_step0000250)", "[`extension_step0000500`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/extension_step0000500)", "[`extension_step0000750`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/extension_step0000750)", "[`extension_step0001000`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/extension_step0001000)", "[`extension_step0001220`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/extension_step0001220)": several checkpoints during context length extension (with a context length of 32000).
## Training Details
### Training Data
The training dataset used for the pretraining of Lucie-7B is available
at [OpenLLM-France/Lucie-Training-Dataset](https://huggingface.co/datasets/OpenLLM-France/Lucie-Training-Dataset).
<!-- and described in ["The Lucie Training Dataset" (2024/12)](https://arxiv.org/abs/xxxx.xxxxx). -->
The initial composition of the training data is as follows:
![Initial Data Composition](figures/pie_dataset_composition.png)
Some of the data was upsampled to balance the training data distribution, yielding the following composition for training:
![Training Data Composition](figures/pie_dataset_composition_training.png)
### Training Procedure
Lucie-7B is a causal decoder-only model trained on a causal language modeling task (i.e., predict the next token).
It was pre-trained on 512 H100 80GB GPUs for about 550 000 GPU hours on the [Jean Zay supercomputer](http://www.idris.fr/eng/jean-zay/jean-zay-presentation-eng.html).
The training code is available at [https://github.com/OpenLLM-France/Lucie-Training](https://github.com/OpenLLM-France/Lucie-Training).
It is based on [this fork of Megatron-DeepSpeed](https://github.com/OpenLLM-France/Megatron-DeepSpeed).
Optimizer checkpoints are available at [OpenLLM-France/Lucie-7B-optimizer-states](https://huggingface.co/OpenLLM-France/Lucie-7B-optimizer-states).
#### Neural Network Architecture
Lucie-7B has the same neural network architecture as [Llama3.1](https://huggingface.co/meta-llama/Llama-3.1-8B).
It has exactly 6 706 958 336 free parameters,
with the following hyperparameters:
| **Hyperparameter** | **Value** |
|---------------------------|---------|
| Vocabulary size (\# tokens)| 65 024 |
| \# transformer blocks | 32 |
| \# attention heads | 32 |
| \# key-value heads | 8 |
| Hidden size | 4 096 |
| Feed-Forward hidden size | 12 288 |
| Activation | `silu` |
| RMS norm epsilon | 1e-5 |
The "theta" parameter of Rotary Positional Embedding (RoPE) was increased during the training process. Its values are indicated in the tables with training hyperparameters below.
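The parameter count quoted above can be recovered from the hyperparameter table. The sketch below assumes a standard Llama-style block: grouped-query attention without bias terms, a gated feed-forward network with three projection matrices, two RMS norms per block plus a final one, and untied input and output embeddings (consistent with `attention_bias: false` and `tie_word_embeddings: false` in `config.json`). The function name `llama_param_count` is hypothetical.

```python
def llama_param_count(vocab: int, hidden: int, layers: int,
                      heads: int, kv_heads: int, ffn: int) -> int:
    """Count the weights of a Llama-style decoder (no biases, untied embeddings)."""
    head_dim = hidden // heads
    kv_dim = kv_heads * head_dim           # width of the key and value projections
    attention = 2 * hidden * hidden        # query and output projections
    attention += 2 * hidden * kv_dim       # key and value projections
    feed_forward = 3 * hidden * ffn        # gate, up and down projections
    norms = 2 * hidden                     # two RMS norms per transformer block
    per_layer = attention + feed_forward + norms
    embeddings = 2 * vocab * hidden        # input embeddings + output head
    return layers * per_layer + embeddings + hidden  # + final RMS norm


total = llama_param_count(vocab=65_024, hidden=4_096, layers=32,
                          heads=32, kv_heads=8, ffn=12_288)
print(f"{total:,}")
```

Under these assumptions the result is 6,706,958,336, which matches the figure stated above.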
#### Training Hyperparameters
The training consisted of three main phases:
1. Main pre-training on 3.1T tokens, with a context length of 4096,
2. Context extension on 5B tokens, with a context length of 32000,
3. Annealing on 5B tokens of high quality data composed of a mixture of new data and data seen during training.
<!-- perhaps cite the dataset for annealing -->
The details of each phase are given below.
##### 1. Main Pre-training
Training hyperparameters in torch/Megatron-DeepSpeed were as follows:
| **Hyperparameter** | **Value** |
|------------------------|------------|
| Total \# samples| 762 144 586 (3.1T tokens) |
| Total \# steps | 753 851 |
| RoPE theta | 500 000 |
| Context length | 4 096 |
| Initial Batch size | 256 |
| Final Batch size | 1 024 |
| Batch size rampup | by steps of 64 over 10M samples |
| Learning rate schedule | warmup (2M samples) + cosine annealing |
| Maximum Learning rate | 3e-4 |
| Final Learning rate | 3e-5 |
| Weight decay | 0.1 |
| Dropout | _ |
| Gradient clipping | 1 |
| Initializer range | 0.009 |
| Optimizer | `AdamW` (β₁=0.9, β₂=0.95, ε=1e-5) |
| Precision | `bfloat16` |
| Tensor Parallelism (with 512 GPUs) | 4 |
| Pipeline Parallelism (with 512 GPUs) | 4 |
| Data Parallelism (with 512 GPUs) | 32 |
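To make the learning rate schedule in the table concrete, here is a minimal sketch of a linear warmup followed by cosine annealing, expressed in number of samples seen. It assumes that the warmup starts from zero and that the cosine decay spans all remaining samples of the main pre-training phase; the actual Megatron-DeepSpeed implementation may differ in details, and the function name `learning_rate` is hypothetical.

```python
import math

MAX_LR = 3e-4                 # maximum learning rate (from the table)
MIN_LR = 3e-5                 # final learning rate (from the table)
WARMUP_SAMPLES = 2_000_000    # warmup length (from the table)
TOTAL_SAMPLES = 762_144_586   # total samples of the main pre-training phase


def learning_rate(samples_seen: int) -> float:
    """Linear warmup, then cosine decay from MAX_LR to MIN_LR (assumed schedule)."""
    if samples_seen < WARMUP_SAMPLES:
        return MAX_LR * samples_seen / WARMUP_SAMPLES
    progress = (samples_seen - WARMUP_SAMPLES) / (TOTAL_SAMPLES - WARMUP_SAMPLES)
    progress = min(1.0, progress)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine


for samples in (477_440, WARMUP_SAMPLES, TOTAL_SAMPLES // 2, TOTAL_SAMPLES):
    print(f"{samples:>12,d} samples -> lr = {learning_rate(samples):.3e}")
```

The first value, 477 440 samples, corresponds to 1 955 594 240 tokens at a context length of 4096, so the sketch can be compared against the learning rate logged at that point of training.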
##### 2. Context Length Extension
Training hyperparameters are the same as above, with the following changes:
| **Hyperparameter** | **Value** |
|------------------------|------------|
| Total \# samples| 156 250 (5B tokens) |
| Total \# steps | 1 220 |
| RoPE theta | 20 000 000 |
| Context length | 32 000 |
| Batch size | 128 |
| Learning rate | 2e-5 |
| Learning rate schedule | constant |
| Tensor Parallelism (with 128 GPUs) | 4 |
| Pipeline Parallelism (with 128 GPUs) | 4 |
| Data Parallelism (with 128 GPUs) | 8 |
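The figures in the two tables above are related by simple arithmetic, sketched below. The sketch assumes that the data-parallel degree is the number of GPUs divided by the product of tensor and pipeline parallelism, and that the number of steps is the number of samples divided by the batch size, rounded down; the helper name `data_parallel_size` is hypothetical.

```python
def data_parallel_size(num_gpus: int, tensor_parallel: int, pipeline_parallel: int) -> int:
    """Number of model replicas when each replica spans TP x PP GPUs."""
    gpus_per_replica = tensor_parallel * pipeline_parallel
    if num_gpus % gpus_per_replica != 0:
        raise ValueError("GPU count must be a multiple of TP x PP")
    return num_gpus // gpus_per_replica


# Main pre-training: 512 GPUs, TP = 4, PP = 4.
dp_pretraining = data_parallel_size(512, 4, 4)
# Context length extension: 128 GPUs, TP = 4, PP = 4.
dp_extension = data_parallel_size(128, 4, 4)

# Context length extension: samples x context length, and samples / batch size.
extension_tokens = 156_250 * 32_000
extension_steps = 156_250 // 128

print(dp_pretraining, dp_extension, extension_tokens, extension_steps)
```

These values agree with the tables: data parallelism of 32 and 8, 5B tokens, and 1 220 steps.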
##### 3. Annealing
Training hyperparameters are the same as for context length extension, with the following changes:
| **Hyperparameter** | **Value** |
|------------------------|------------|
| Total \# samples| 156 250 (5B tokens) |
| Total \# steps | 1 220 |
| Learning rate schedule | linear annealing |
| Maximum Learning rate | 3e-5 |
| Final Learning rate | 0 |
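The three phases can be summed and compared with the `training_progress` values in the front matter of this model card (`num_steps: 756291`, `num_tokens: 3131736326144`). This is a sketch under the assumption that every sample is a full-length sequence, so the token total is only approximate.

```python
# (number of samples, context length, number of steps) per phase, from the tables above.
PHASES = {
    "main pre-training": (762_144_586, 4_096, 753_851),
    "context length extension": (156_250, 32_000, 1_220),
    "annealing": (156_250, 32_000, 1_220),
}

total_steps = sum(steps for _, _, steps in PHASES.values())
# Assumption: each sample fills the whole context window.
approx_tokens = sum(samples * context for samples, context, _ in PHASES.values())

REPORTED_STEPS = 756_291
REPORTED_TOKENS = 3_131_736_326_144

print(f"steps:  {total_steps:,} (reported: {REPORTED_STEPS:,})")
print(f"tokens: {approx_tokens:,} (reported: {REPORTED_TOKENS:,})")
```

The step counts add up exactly to the reported total, while the token estimate differs from the reported value by a few million tokens, which is negligible at this scale.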
### Training Logs and Learning Curves
#### Training loss
Training logs can be found in Tensorboard format in:
* [`metadata/training_logs/`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/main/metadata/training_logs)
<br> ├── [`1_pretraining.zip`](metadata/training_logs/1_pretraining.zip) training logs for the first pre-training phases,
in a zip file. Each file in the zip corresponds to a job of at most 20H of training (parallelized over 512 GPUs).
<br> ├── [`2_extension/`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/main/metadata/training_logs/2_extension) folder containing the training log for the context length extension phase,
<br> └── [`3_annealing/`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/main/metadata/training_logs/3_annealing) folder containing the training log for the annealing phase, which also took around 13H of training (parallelized over 128 GPUs).
The convergence curves of the three pre-training phases are the following:
![figures/convergence-curve-pretraining.png](figures/convergence-curve-pretraining.png)
Data corresponding to these plots were extracted from tensorboard logs and are available in the following CSV files:
* [`metadata/training_logs/`](https://huggingface.co/OpenLLM-France/Lucie-7B/tree/main/metadata/training_logs)
<br> ├── [`1_pretraining.csv`](metadata/training_logs/1_pretraining.csv)
<br> ├── [`2_extension.csv`](metadata/training_logs/2_extension.csv)
<br> └── [`3_annealing.csv`](metadata/training_logs/3_annealing.csv)
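These CSV files can be read with the Python standard library. The sketch below embeds a few rows rather than opening a file, so that it runs on its own; the column names and rows are copied from the CSV log included in this commit, which is assumed here to be `1_pretraining.csv`. The units of `walltime` and `gputime` are not documented in this card, so only their ratio is used.

```python
import csv
import io

# Rows copied from the training log (assumed to be 1_pretraining.csv).
SAMPLE_CSV = """\
training_steps,training_tokens,training_loss,walltime,gputime,learning_rate
1865,1955594240,5.540690021514893,0.7975208023282855,408.33065079208217,7.161599933169782e-05
67145,241488363520,1.8717713451385498,77.07916604916753,39464.53301717377,0.0002962769358418882
69418,251022016512,1.8673745584487915,80.13863182021647,41030.979491950835,0.00029596799868158996
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
previous, last = rows[-2], rows[-1]

# Tokens consumed per optimizer step between the last two logged points.
delta_tokens = int(last["training_tokens"]) - int(previous["training_tokens"])
delta_steps = int(last["training_steps"]) - int(previous["training_steps"])
tokens_per_step = delta_tokens // delta_steps

# Ratio of accumulated GPU time to wall time, i.e. the number of GPUs in use.
num_gpus = round(float(last["gputime"]) / float(last["walltime"]))

print(f"tokens per step: {tokens_per_step:,}")
print(f"GPUs in use:     {num_gpus}")
```

On these rows, the tokens per step equal the final batch size times the context length (1 024 × 4 096), and the time ratio matches the 512 GPUs used for the main pre-training phase.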
#### Evaluations
Multiple evaluations were conducted during Lucie-7B's training to assess its performance on standard benchmarks,
primarily in French and English, as well as in Spanish, German, and Italian.
Evaluation results on benchmark datasets of checkpoints of Lucie-7B throughout the training process are available at
[metadata/evaluation_learning_curve_lucie.csv](metadata/evaluation_learning_curve_lucie.csv).
Evaluation results of baseline models on the same benchmark datasets are available at
[metadata/evaluation_baselines.csv](metadata/evaluation_baselines.csv).
Main results are summarized in the following figures:
### French
![figures/learning-curve-evaluation-french-bench.png](figures/learning-curve-evaluation-french-bench.png)
### English
![figures/learning-curve-evaluation-benchmarks-in-english.png](figures/learning-curve-evaluation-benchmarks-in-english.png)
### Other
![figures/learning-curve-evaluation-multilingual-arc-benchmark.png](figures/learning-curve-evaluation-multilingual-arc-benchmark.png)
### Needle in a Haystack
#### Pretraining
![figures/needle-in-a-haystack/Lucie-7B-main.png](figures/needle-in-a-haystack/Lucie-7B-main.png)
#### Context Length Extension
![figures/needle-in-a-haystack/Lucie-7B-extension.png](figures/needle-in-a-haystack/Lucie-7B-extension.png)
#### Annealing
![figures/needle-in-a-haystack/Lucie-7B-annealing.png](figures/needle-in-a-haystack/Lucie-7B-annealing.png)
## Disclaimer
Lucie-7B is a language model trained solely to predict the most probable next word in a sequence. Despite efforts to filter the [Lucie Training Dataset](https://huggingface.co/datasets/OpenLLM-France/Lucie-Training-Dataset), it is possible that Lucie-7B encountered strings containing toxic or offensive language during its training and as a result, it may generate strings of similar quality. To limit such behavior, it is advised to fine-tune Lucie-7B through instruction and/or preference tuning (DPO, RLHF, etc.).
## Citation
When using the Lucie-7B model, please cite the following paper:
✍ Olivier Gouvert, Julie Hunter, Jérôme Louradour,
Christophe Cerisara, Evan Dufraisse, Yaya Sy,
Laura Rivière, Jean-Pierre Lorré (2025).
[The Lucie-7B LLM and the Lucie Training Dataset:
Open resources for multilingual language generation](https://arxiv.org/abs/2503.12294). arXiv:2503.12294.
```bibtex
@misc{openllm2025lucie,
title={The Lucie-7B LLM and the Lucie Training Dataset: Open resources for multilingual language generation},
author={Olivier Gouvert and Julie Hunter and Jérôme Louradour and Christophe Cerisara and Evan Dufraisse and Yaya Sy and Laura Rivière and Jean-Pierre Lorré and OpenLLM-France community},
year={2025},
eprint={2503.12294},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2503.12294},
}
```
## Acknowledgements
This work was performed using HPC resources from GENCI-IDRIS (Grant 2024-GC011015444). We gratefully acknowledge support from GENCI and IDRIS and from Pierre-François Lavallée (IDRIS) and Stephane Requena (GENCI) in particular.
Lucie-7B was created by members of [LINAGORA](https://labs.linagora.com/) and the [OpenLLM-France](https://www.openllm-france.fr/) community, including in alphabetical order:
Agustin Martin Picard (IRT),
Thibaut Boissin (IRT),
Christophe Cerisara (LORIA),
Evan Dufraisse (CEA List),
Julie Hunter (LINAGORA),
Jean-Pierre Lorré (LINAGORA),
Jérôme Louradour (LINAGORA),
Lucas Hervier (IRT),
Michel-Marie Maudet (LINAGORA),
Olivier Gouvert (LINAGORA), and
Yaya Sy (LORIA).
We thank
Clément Bénesse (Opsci),
Guokan Shang (MBZUAI),
Ismaïl Harrando (LINAGORA),
Joël Gombin (Opsci),
Jordan Ricker (Opsci),
Julien Tourille (EDF),
Manuel Faysse (ILLUIN Technology),
Olivier Ferret (CEA List), and
Rachel Bawden (INRIA),
for their helpful input.
We also thank the support teams from IDRIS, in particular Myriam Peyrounette and Hatim Bourfoune, and from Hugging Face, in particular Thomas Wolf, Guilherme Penedo, Elie Bakouch, Haojun Zhao, and Lucain Pouget for their technical guidance.
Finally, we thank the entire OpenLLM-France community, whose members have helped in diverse ways.
## Contact
contact@openllm-france.fr

29
config.json Normal file

@@ -0,0 +1,29 @@
{
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 0,
"eos_token_id": 1,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 12288,
"max_position_embeddings": 32000,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 20000000.0,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.36.1",
"use_cache": true,
"vocab_size": 65024,
"training_steps": 756291,
"training_tokens": 3131736326144
}

Binary files not shown (8 images: 205 KiB, 77 KiB, 48 KiB, 41 KiB, 137 KiB, 132 KiB, 116 KiB, 223 KiB).


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1fb3cc91f0c38ebc8aecba19e8d8402d9eea1f90bce4281909a942a1cd371dc6
size 266605

8
generation_config.json Normal file

@@ -0,0 +1,8 @@
{
"bos_token_id": 0,
"eos_token_id": 1,
"max_length": 32000,
"do_sample": true,
"temperature": 0.6,
"transformers_version": "4.36.1"
}


@@ -0,0 +1,387 @@
training_steps,training_tokens,training_loss,walltime,gputime,learning_rate
1865,1955594240,5.540690021514893,0.7975208023282855,408.33065079208217,7.161599933169782e-05
3790,4114087936,3.00810284614563,1.6655786928923269,852.7762907608713,0.00015066239575389773
5313,6110314496,2.6936181354522706,2.419871313692319,1238.9741126104673,0.0002237664011772722
7402,9252634624,2.525974779129028,3.525297917627946,1804.9525338255085,0.000299999926937744
9616,13150715904,2.3369024181365967,4.839082616515136,2477.6102996557497,0.00029999829712323844
11655,17390895104,2.276480207443237,6.211945974481024,3180.5163389342842,0.00029999419348314404
13968,23110877184,2.2075621032714845,7.9962738511337434,4094.0922117804766,0.00029998470563441515
15910,28752740352,2.1356527423858642,9.694892240850036,4963.7848273152185,0.0002999709395226091
17655,34566045696,2.187712240219116,11.403844424466975,5838.768345327091,0.000299952196655795
19561,41906601984,2.1478218460083007,13.512215933090056,6918.254557742109,0.0002999218995682895
21599,50454593536,2.108550395965576,15.963069546280945,8173.091607695844,0.00029987728339619935
22929,56033017856,2.08190354347229,17.555688533989233,8988.512529402487,0.0002998427371494472
24879,64211910656,2.063254547119141,20.013122328626352,10246.718632256692,0.00029978438396938145
26435,70738247680,2.0415384721755983,22.095803890509558,11313.051591940894,0.00029973124037496746
28408,79013609472,2.0200176572799684,24.742381052788318,12668.099099027619,0.0002996554540004581
30712,88677285888,2.009900689125061,27.834729681593487,14251.381596975865,0.00029955507488921285
32719,97095254016,1.997526035308838,30.529593802309225,15631.152026782323,0.00029945719870738685
34620,105068625920,1.9822303581237792,33.088940144262565,16941.537353862434,0.00029935556813143194
37346,116502298624,1.9698213863372802,36.75907196609288,18820.644846639556,0.0002991946239490062
39233,124416950272,1.9548299026489258,39.309065435078146,20126.24150276001,0.00029907276621088386
40846,131182362624,1.951894497871399,41.485595499098174,21240.624895538265,0.0002989618224091828
42956,140032344064,1.9350375270843505,44.33258508443755,22698.283563232024,0.0002988072519656271
44939,148349648896,1.9308405590057374,47.01033564593735,24069.291850719925,0.00029865227406844497
46768,156021030912,1.9222344255447388,49.47616130587861,25331.79458860985,0.00029850099235773087
48882,164887789568,1.9197535276412965,52.35843906614072,26807.520801864048,0.00029831615393050015
50706,172538200064,1.912511978149414,54.81750936548799,28066.564795129852,0.00029814810841344297
52358,179467190272,1.907254514694214,57.04858303080124,29208.874511770235,0.0002979890559799969
53857,185754451968,1.9000687837600707,59.087224691490626,30252.6590420432,0.00029783911304548383
56177,195485237248,1.897755184173584,62.22864813516615,31861.067845205067,0.0002975965035147965
58449,205014695936,1.8944290637969972,65.2858384233869,33426.349272774096,0.0002973465307150036
60122,212031766528,1.885751051902771,67.54159698311844,34581.29765535664,0.00029715464916080236
62532,222140039168,1.8851725578308105,70.83456301174779,36267.29626201487,0.0002968665794469416
64820,231736606720,1.875647120475769,73.90514303951494,37839.43323623165,0.0002965803723782301
67145,241488363520,1.8717713451385498,77.07916604916753,39464.53301717377,0.0002962769358418882
69418,251022016512,1.8673745584487915,80.13863182021647,41030.979491950835,0.00029596799868158996
70813,256873070592,1.8599135208129882,82.01500265421159,41991.68135895633,0.0002957723627332598
73257,267123949568,1.8653322410583497,85.29531497185086,43671.20126558764,0.00029541869298554957
74982,274359123968,1.8608362674713135,87.62793199229114,44865.50118005306,0.0002951606293208897
76878,282311524352,1.8511997079849243,90.18867519614679,46176.601700427156,0.00029486900893971324
79546,293501927424,1.8491973686218262,93.81005986021884,48030.75064843205,0.0002944445004686713
81690,302494515200,1.8464100503921508,96.70326836685862,49512.073403831615,0.00029409138369373977
83549,310291726336,1.847438826560974,99.1904628869521,50785.51699811948,0.0002937766257673502
86305,321851228160,1.8410191345214844,102.87963783826878,52674.374573193614,0.00029329530661925673
87821,328209793024,1.8423179483413696,104.95056398590805,53734.68876078492,0.0002930230984929949
89624,335772123136,1.8375876760482788,107.36518641720015,54970.97544560648,0.00029269250808283687
91624,344160731136,1.8310838747024536,110.08795571576191,56365.0333264701,0.00029231709777377546
94363,355648929792,1.8548078060150146,113.81864126870086,58275.14432957484,0.0002917881647590548
96400,364192727040,1.8301151275634766,116.56662833875006,59682.11370944003,0.000291383737931028
98285,372098990080,1.825421872138977,119.117614549062,60988.21864911974,0.0002910011389758438
99761,378289782784,1.828107204437256,121.11521686213217,62010.99103341167,0.0002906959562096745
101660,386254766080,1.8232884979248047,123.67062551266181,63319.36026248285,0.0002902960986830294
102797,391023689728,1.8284022760391236,125.20398345403945,64104.439528468196,0.0002900528197642416
104466,398023983104,1.8237161207199097,127.48315444020703,65271.375073386,0.0002896904479712248
106544,406739746816,1.8209293365478516,130.37456855220438,66751.77909872864,0.0002892306074500084
108654,415589728256,1.813926682472229,133.23329634167231,68215.44772693623,0.0002887538284994662
110464,423181418496,1.8184159755706788,135.6778559462546,69467.06224448235,0.00028833700343966484
113489,435869188096,1.8118912790502821,139.763474430842,71558.8989085911,0.0002876241924241185
115252,443263746048,1.8117296981811524,142.15046140346098,72781.03623857202,0.0002871994802262634
116687,449282572288,1.8083889770507813,144.07612540268525,73766.97620617485,0.00028684877906925976
119106,459428593664,1.8078338527679443,147.34195422902133,75439.08056525892,0.0002862474066205323
121073,467678789632,1.808600254058838,149.9825934684281,76791.08785583518,0.00028574903262779117
122958,475585052672,1.7992388534545898,152.52741438835127,78094.03616683585,0.0002852636098396033
124630,482597928960,1.8028169393539428,154.7699610268964,79242.22004577096,0.000284826586721465
126358,489845686272,1.8049570083618165,157.08673302915514,80428.40731092743,0.00028436866705305874
128236,497722589184,1.8046081829071046,159.62801352390318,81729.54292423843,0.00028386374469846487
130926,509005266944,1.7990764093399048,163.24389280245367,83580.87311485628,0.0002831274177879095
132497,515594518528,1.794449429512024,165.37276175601704,84670.85401908073,0.0002826903073582798
135020,526176747520,1.799844126701355,168.77996288452678,86415.34099687771,0.0002819774381350726
136777,533546139648,1.7945206785202026,171.14425289637555,87625.85748294428,0.0002814731269609183
139067,543151095808,1.7959566926956176,174.2254043554347,89203.40702998257,0.00028080615447834134
140573,549467717632,1.8007401327292125,176.28781320277358,90259.36035982007,0.00028036159346811473
142990,559605350400,1.790226879119873,179.52959129813544,91919.15074464535,0.00027963833417743444
144717,566848913408,1.7950389575958252,181.85608978862726,93110.31797177716,0.00027911417419090867
147235,577410170880,1.787166004180908,185.26090506606388,94853.58339382471,0.0002783390518743545
149646,587522637824,1.7874946737289428,188.52470080996864,96524.64681470394,0.00027758482610806823
152254,598461382656,1.7876266622543335,192.05541756899714,98332.37379532654,0.00027675574528984725
153497,603674902528,1.7870644330978394,193.73033408004696,99189.93104898404,0.0002763558004517108
155091,610360623104,1.7821037721633912,195.90790076383007,100304.845191081,0.00027583842165768147
157080,618703093760,1.7816897821426392,198.57830149684983,101672.09036638711,0.00027518573915585876
159083,627104284672,1.7820564126968383,201.2724561886386,103051.49756858297,0.00027452060021460056
160774,634196852736,1.7861653804779052,203.56257374424416,104224.03775705301,0.00027395293000154197
162912,643164274688,1.7875264167785645,206.45703154702142,105706.00015207496,0.00027322719688527286
164504,649841606656,1.7890281875928242,208.59348014120815,106799.86183229857,0.00027268106350675225
166091,656497967104,1.7810548543930054,210.73933679447862,107898.54043877305,0.0002721317869145423
168208,665377308672,1.7790726804733277,213.599654382709,109363.023043947,0.0002713915309868753
170798,676240556032,1.7792628765106202,217.10682849737182,111158.69619065437,0.0002704742655623704
173178,686222999552,1.7723922634124756,220.3199128641763,112803.79538645827,0.00026962021365761757
175477,695865704448,1.7773733282089232,223.42362898933956,114392.89804254186,0.0002687851374503225
177319,703591612416,1.7751788663864136,225.91816708412628,115670.10154707266,0.0002681089681573212
179285,711837614080,1.773937292098999,228.55287790967017,117019.07348975113,0.00026738038286566734
181269,720159113216,1.7757549047470094,231.22714584410738,118388.29867218298,0.00026663794415071607
182472,725204860928,1.7724720859527587,232.86696931923933,119227.88829145054,0.00026618424453772604
183903,731206909952,1.7717243003845216,234.78111301701074,120207.9298647095,0.0002656411670614034
186140,740589568000,1.7713909292221068,237.79312360069395,121750.0792835553,0.00026478481595404446
188681,751247294464,1.7675371074676514,241.22784125286,123508.65472146432,0.00026380125200375915
190733,759854006272,1.770745587348938,243.9752635411917,124915.33493309015,0.00026299862656742334
192470,767139512320,1.7692016410827636,246.32182628229677,126116.77505653595,0.00026231343508698046
194277,774718619648,1.7701459550857543,248.78614687347053,127378.50719921691,0.00026159503613598645
195792,781072990208,1.7644649791717528,250.8326970803534,128426.34090514094,0.00026098836679011583
198006,790359179264,1.7638875579833984,253.84096263601015,129966.5728696372,0.0002600947336759418
199191,795329429504,1.7611687517166137,255.44903973567233,130789.90834466423,0.0002596129779703915
201402,804603035648,1.7673775005340575,258.4134064139623,132307.6640839487,0.0002587077615316957
203281,812484132864,1.7651812601089478,260.93973858808965,133601.1461571019,0.00025793202803470194
204903,819287293952,1.7612399768829345,263.1386813809225,134727.00486703232,0.00025725766317918897
206833,827382300672,1.7639558458328246,265.7458362247564,136061.86814707526,0.00025644959532655776
209473,838455263232,1.7610173416137695,269.30228242161377,137882.76859986625,0.00025533439475111663
211312,846168588288,1.759269905090332,271.79195232891027,139157.47959240206,0.00025455086142756045
213227,854200680448,1.7630469226837158,274.3732711282936,140479.11481768632,0.00025372920208610594
215847,865189756928,1.7609153509140014,277.9089087475696,142289.36127875565,0.00025259560788981616
217356,871518961664,1.7579796981811524,279.9560044781967,143337.4742928367,0.0002519378322176635
219172,879135817728,1.7634607887268066,282.2283169674041,144500.8982873109,0.0002511414932087064
222150,891626455040,1.7575871229171753,285.76951404988944,146313.9911935434,0.000249824661295861
223853,898769354752,1.758929591178894,287.8028389246148,147355.05352940276,0.000249065546086058
225145,904188395520,1.7552893447875977,289.3465113940193,148145.41383373787,0.00024848670000210404
226928,911666839552,1.751872878074646,291.46764839194356,149231.4359766751,0.00024768381263129413
227290,913185177600,1.756528417269389,291.96401525144137,149485.57580873798,0.00024752022000029683
229251,921410207744,1.7513698720932007,294.6090092848358,150839.81275383593,0.0002466306905262172
231418,930499264512,1.7536880302429199,297.51891305445406,152329.68348388048,0.00024564118939451873
232756,936111243264,1.7567100238800049,299.3165955686522,153250.09693114992,0.0002450268075335771
234747,944462102528,1.7647992753982544,301.9957619307597,154621.83010854898,0.00024410788319073617
236656,952469028864,1.759487557411194,304.57088663315164,155940.29395617364,0.00024322151148226112
238174,958835982336,1.760659966468811,306.609707181298,156984.17007682458,0.00024251305148936808
240473,968478687232,1.754089126586914,309.6952503732213,158563.9681910893,0.00024143399787135422
242768,978104614912,1.7523606967926026,312.7955168953312,160151.30465040958,0.00024034960370045155
244243,984291213312,1.7513489294052125,314.7860534490736,161170.4593659257,0.00023964889987837523
245978,991568330752,1.7525734424591064,317.14563100169306,162378.56307286685,0.0002388209686614573
248018,1000124710912,1.7499562788009644,319.9244360278179,163801.31124624275,0.00023784241057001054
249886,1007959670784,1.7509011316299439,322.45674101591925,165097.85140015066,0.00023694158880971372
251357,1014129491968,1.7487824440002442,324.4791230457826,166133.3109994407,0.00023622905428055674
252908,1020634857472,1.7466645431518555,326.58318169927685,167210.58903002975,0.00023547477030660957
255231,1030378225664,1.7461054420471191,329.7483051645596,168831.13224425452,0.00023433937167283148
256784,1036891979776,1.7455737209320068,331.87937631953497,169922.2406756019,0.00023357658938039094
258875,1045662269440,1.7457578039169313,334.731770063507,171382.66627251558,0.00023254487314261496
260952,1054373838848,1.7445444059371948,337.5729135723391,172837.3317490376,0.00023151483037509024
262527,1060979867648,1.741570553779602,339.7600884215228,173957.1652718197,0.00023073032207321376
264811,1070559657984,1.740293960571289,342.88246244716737,175555.8207729497,0.00022958747285883874
266420,1077308293120,1.7429306316375732,345.0796831538715,176680.7977747822,0.00022877875017002225
268308,1085227139072,1.744356451034546,347.65395848590146,177998.82674478155,0.00022782600717619061
269559,1090474213376,1.7428239250183106,349.3460391367528,178865.17203801742,0.00022719251865055412
271809,1099911397376,1.7441408920288086,352.4030745393055,180430.37416412443,0.00022604875266551971
274026,1109210169344,1.7409614515304566,355.4137426253769,181971.836224193,0.00022491635172627866
276200,1118328586240,1.7383298921585082,358.35239492776185,183476.42620301407,0.00022380080190487206
279273,1131217682432,1.7371133943883383,362.5356638252912,185618.25987854908,0.00022221545805223286
281879,1142148038656,1.7415697383880615,366.08764050357314,187436.87193782945,0.00022086345416028053
283803,1150217879552,1.733839235305786,368.711490408947,188780.28308938086,0.0002198608999606222
286646,1162142285824,1.7431989669799806,372.57302363409656,190757.38810065744,0.00021837285021319985
288431,1169629118464,1.739035325050354,375.010250589488,192005.24830181786,0.00021743458637502044
289869,1175660527616,1.733289074897766,376.974500554174,193010.9442837371,0.00021667654800694436
292495,1186674769920,1.7315478420257568,380.5901660706151,194862.16502815494,0.00021528734941966832
295211,1198066499584,1.733106060028076,384.2868979963938,196754.89177415363,0.00021384400315582752
297195,1206387998720,1.7410508108139038,386.9830853080197,198135.3396777061,0.00021278555504977703
298925,1213644144640,1.734706358909607,389.3362156377362,199340.14240652093,0.00021185987861827016
300889,1221881757696,1.73500732421875,392.0144997620718,200711.42387818077,0.00021080594160594046
302662,1229318258688,1.7418826770782472,394.42366878365414,201944.91841723092,0.00020985178707633168
304473,1236914143232,1.7310642719268798,396.8901623202001,203207.76310794245,0.00020887458231300116
306476,1245315334144,1.7335577774047852,399.62016656875767,204605.52528320393,0.0002077907556667924
308518,1253880102912,1.7285248136520386,402.42136555782304,206039.7391656054,0.00020668267097789794
310574,1262503591936,1.733399453163147,405.2239282874955,207474.6512831977,0.00020556384697556496
312781,1271760420864,1.730285539627075,408.3160193839066,209057.8019245602,0.00020435944315977395
314612,1279440191488,1.733029899597168,410.81628584431604,210337.9383522898,0.0002033576020039618
316647,1287975600128,1.731144299507141,413.5669219080583,211746.26401692585,0.00020224145555403084
319268,1298968870912,1.729205231666565,416.9754043734229,213491.40703919253,0.00020079984096810222
320942,1305990135808,1.7270024967193605,418.99536520049196,214525.62698265188,0.00019987679843325168
323057,1314861088768,1.7282018089294433,421.5389839125796,215827.95976324077,0.0001987080613616854
324688,1321701998592,1.727188115119934,423.4974892227614,216830.71448205382,0.0001978049403987825
325782,1326290567168,1.7273711681365966,424.81380970657455,217504.67056976617,0.00019719830015674233
328381,1337191563264,1.7251214504241943,427.95426755280107,219112.58498703415,0.00019575434271246195
330172,1344703561728,1.7238556051254272,430.13040363881544,220226.7666630735,0.0001947571145137772
332386,1353989750784,1.7249505424499512,432.82640173550084,221607.11768857643,0.00019352202070876956
334457,1362676154368,1.7219355773925782,435.3433918957347,222895.81665061618,0.00019236441585235298
336042,1369324126208,1.72216224193573,437.23262586380747,223863.10444226942,0.0001914770546136424
337560,1375691079680,1.7201724815368653,439.0764706419921,224807.15296869996,0.00019062607316300273
338748,1380673912832,1.7237060013271512,440.51795167100903,225545.19125555662,0.00018995934806298465
340933,1389838467072,1.7192433309555053,443.16397790562996,226899.95668768254,0.00018873147200793028
342332,1395706298368,1.7205400276184082,444.8788870963792,227777.99019334614,0.00018794421339407563
344302,1403969077248,1.7203184127807618,447.26131719529656,228997.79440399184,0.00018683428061194718
345968,1410956787712,1.720102686882019,449.29559398623377,230039.3441209517,0.0001858944451669231
347502,1417390850048,1.720639853477478,451.16690631718046,230997.4560343964,0.0001850281551014632
349991,1427830472704,1.7197772884368896,454.2264308151358,232563.93257734954,0.00018362075206823647
352052,1436474933248,1.7142712354660035,456.81043543511663,233886.94294277971,0.00018245380488224328
354054,1444871929856,1.7160462188720702,459.453295596552,235240.08734543461,0.00018131898832507432
355452,1450735566848,1.7196543836593627,461.30180067648325,236186.52194635943,0.0001805258507374674
357345,1458675384320,1.7139066517353059,463.9481888459466,237541.47268912467,0.00017945100262295455
359270,1466749419520,1.7155957555770873,466.5414323894858,238869.21338341673,0.00017835704784374684
361118,1474500493312,1.7160774993896484,469.038111011795,240147.51283803905,0.0001773060066625476
363330,1483778293760,1.716701912879944,472.0381865977867,241683.5515380668,0.0001760469749569893
365934,1494700261376,1.7132270240783691,475.5838090708114,243498.91024425544,0.0001745635672705248
367689,1502061264896,1.7156493997573852,477.96177319155544,244716.42787407638,0.00017356315220240504
369392,1509204164608,1.7158074569702149,480.26144646743097,245893.86059132466,0.00017259192827623338
371001,1515952799744,1.715286192893982,482.4363151349443,247007.3933490915,0.00017167393525596708
373275,1525490647040,1.712519268989563,485.5007621892336,248576.39024088762,0.0001703760353848338
375882,1536425197568,1.7095552492141723,489.0597850280192,250398.60993434582,0.00016888746176846325
378308,1546600579072,1.7065624713897705,492.38097163360044,252099.05747640342,0.0001675017992965877
380320,1555039518720,1.7018836069107055,495.12656034960173,253504.7988989961,0.00016635240172035992
381423,1559665836032,1.712470350265503,496.61899996729517,254268.92798325513,0.00016572224558331072
384180,1571229532160,1.70383526802063,500.35769686916666,256183.14079701333,0.0001641471026232466
386393,1580511526912,1.7026185607910156,503.3778410258571,257729.45460523883,0.0001628828322282061
388342,1588686225408,1.703223738670349,506.02086480091833,259082.6827780702,0.00016176952340174466
389711,1594428227584,1.7003657007217408,507.8724601782208,260030.69961124906,0.00016098766354843974
391539,1602095415296,1.6955804538726806,510.35535180901985,261301.94012621816,0.000159943854669109
393541,1610492411904,1.6995060205459596,513.0894965861288,262701.82225209795,0.00015880104911047965
395741,1619719880704,1.7011328125,516.0827098217362,264234.34742872894,0.00015754574269521981
398543,1631472320512,1.7008357238769531,519.8783201021156,266177.6998922832,0.000155947869643569
400565,1639953203200,1.695354881286621,522.6083451644168,267575.4727241814,0.000154795590788126
402527,1648182427648,1.6946661138534547,525.2668895711776,268936.6474604429,0.00015367820742540061
403945,1654129950720,1.6960719108581543,527.1914807180993,269922.03812766686,0.00015287112910300493
405313,1659867758592,1.697601842880249,529.0370912977813,270866.99074446404,0.0001520929072285071
406788,1666054356992,1.6927887296676636,531.0349452814536,271889.89198410424,0.000151254324009642
409387,1676955353088,1.69403892993927,534.5653393185967,273697.4537311215,0.00014977800310589373
411155,1684370882560,1.691395869255066,536.9709613285441,274929.1322002146,0.0001487747795181349
413089,1692482666496,1.6922231245040893,539.5964738321247,276273.39460204786,0.00014767838001716882
414732,1699373907968,1.6947362184524537,541.7996070183738,277401.39879340737,0.00014674787234980613
416436,1706521001984,1.6899096250534058,544.0972045027258,278577.76870539563,0.0001457837497582659
417863,1712506273792,1.6881012392044068,545.9991046208406,279551.5415658704,0.0001449771225452423
419237,1718269247488,1.691229190826416,547.8411327792202,280494.65998296073,0.00014420114166568965
421285,1726859182080,1.6884182071685792,550.5797526560408,281896.8333598929,0.00014304582145996392
424030,1738372546560,1.6905427885055542,554.2562450989998,283779.1974906879,0.0001414999132975936
426068,1746920538112,1.6902626466751098,557.0390545870044,285203.9959485463,0.00014035419735591859
427364,1752356356096,1.6865596675872803,558.7964797184435,286103.79761584307,0.0001396265724906698
429281,1760396836864,1.6867563104629517,561.412200978866,287443.0469011794,0.00013855169527232647
432073,1772107333632,1.6869472122192384,565.1983350192647,289381.54752986354,0.0001369893434457481
433564,1778361040896,1.6836950588226318,567.304830058465,290460.07298993407,0.00013615658099297434
435451,1786275692544,1.682388458251953,569.8640171882282,291770.37680037285,0.0001351043174508959
436975,1792667811840,1.6846920680999755,571.9226006626227,292824.3715392628,0.00013425585348159075
439052,1801379381248,1.6854845952987672,574.7291045055265,294261.30150682956,0.00013310158101376146
440966,1809407279104,1.6832574605941772,577.334748714053,295595.39134159515,0.000132040077005513
443227,1818890600448,1.6824970388412475,580.0256552258438,296973.13547563204,0.00013078891788609326
445179,1827077881856,1.6792938709259033,582.2320406351785,298102.8048052114,0.00012971126125194132
447934,1838633189376,1.6817043399810792,585.2915628118471,299669.28015966574,0.00012819441326428205
449710,1846082273280,1.6782426357269287,587.2799727474821,300687.34604671085,0.00012721921666525304
452524,1857885044736,1.6820737409591675,590.4152128734677,302292.58899121545,0.00012567844532895833
454426,1865862610944,1.6787804031372071,592.544702035982,303382.8874424228,0.00012464018072932959
456289,1873676599296,1.681654314994812,594.6135522814485,304442.13876810163,0.00012362573761492968
458583,1883298332672,1.6730261087417602,597.1887328853452,305760.63123729674,0.00012238013732712716
460981,1893356273664,1.672509970664978,599.8846338122594,307140.9325118768,0.00012108237569918856
462659,1900394315776,1.6751681089401245,601.761152752987,308101.7102095293,0.00012017694825772196
464445,1907885342720,1.671654839515686,603.7396514160039,309114.701524994,0.0001192157287732698
465759,1913396658176,1.6746321821212768,605.2131596599303,309869.1377458843,0.00011851020099129528
467953,1922598961152,1.6700562286376952,607.6498691665898,311116.733013294,0.00011733539577107877
470480,1933197967360,1.6706900882720948,610.4716868327131,312561.5036583491,0.000115987379103899
472362,1941091647488,1.670096197128296,612.5532063371096,313627.2416446001,0.00011498706589918584
474479,1949970989056,1.672633442878723,614.9096566533404,314833.7442065103,0.00011386564437998459
475718,1955167731712,1.670849027633667,616.2887570926199,315539.8436314214,0.00011321121564833447
477678,1963388567552,1.6683212041854858,618.4896738461576,316666.7130092327,0.00011217887367820367
479405,1970632130560,1.6645479679107666,620.5010973832233,317696.5618602103,0.00011127226753160357
481490,1979377254400,1.6670279264450074,622.9267749258964,318938.508762059,0.00011018155055353418
483395,1987367403520,1.6700633478164673,625.1531309454637,320078.4030440774,0.00010918873158516362
484660,1992673198080,1.665715732574463,626.6322892833138,320835.73211305664,0.00010853145067812875
486388,1999920955392,1.6684691619873047,628.6401761626622,321863.77019528305,0.00010763623140519485
487760,2005675540480,1.6899535751342774,630.2428063296861,322684.3168407993,0.00010692761861719191
489679,2013724409856,1.6870677042007447,632.4661689389804,323822.678496758,0.00010593978367978707
491707,2022230458368,1.683250117301941,634.8101256682668,325022.7843421526,0.00010490007844055071
493795,2030988165120,1.6834172248840331,637.2431921455864,326268.5143785402,0.00010383423796156421
495562,2038399500288,1.6794629859924317,639.3029817152653,327323.12663821585,0.00010293597733834758
497249,2045475291136,1.6757073497772217,641.2957346101667,328343.41612040537,0.00010208162711933255
499270,2053951979520,1.6750809478759765,643.6868357490317,329567.65990350425,0.00010106235276907682
501917,2065054302208,1.6795844745635986,646.8346841992168,331179.358309999,9.973444684874266e-05
504501,2075892383744,1.6791791486740113,649.9504268523947,332774.6185484261,9.844604937825352e-05
505920,2081844101120,1.6794115686416626,651.6345685349747,333636.89908990706,9.774190402822569e-05
507735,2089456762880,1.6736989212036133,653.8060941808081,334748.72022057377,9.684478573035449e-05
509781,2098038308864,1.6739554023742675,656.2464802015245,335998.19786318054,9.583831706549972e-05
511352,2104627560448,1.6752780342102052,658.1126194182707,336953.6611421546,9.50690227909945e-05
512792,2110667358208,1.6734393405914307,659.8307014095916,337833.3191217109,9.436659456696361e-05
514516,2117898338304,1.6712262153625488,661.8950038467839,338890.24196955335,9.352908818982542e-05
516978,2128224714752,1.670324263572693,664.8669427431114,340411.87468447303,9.233966557076201e-05
519037,2136860786688,1.6687762260437011,667.3400292872751,341678.09499508486,9.135099389823154e-05
520968,2144959987712,1.6679233884811402,669.6824876479714,342877.43367576133,9.042886085808277e-05
523402,2155168923648,1.6661997795104981,672.6088669252829,344375.73986574484,8.927362796384841e-05
525489,2163922436096,1.667925977706909,675.1255832034733,345664.2986001783,8.828948193695396e-05
527861,2173871325184,1.6650946950912475,677.9682576375237,347119.74791041214,8.71782103786245e-05
529924,2182524174336,1.6656306838989259,680.4565219490987,348393.73923793854,8.621808228781447e-05
531732,2190107475968,1.6626214647293092,682.6502407960503,349516.92328757775,8.538155816495419e-05
533512,2197573337088,1.6671656608581542,684.8457100742603,350641.0035580213,8.456254727207124e-05
535979,2207920685056,1.6623902940750122,687.8088072102388,352158.10929164226,8.343499212060124e-05
538441,2218247061504,1.6625070142745972,690.8060061686901,353692.67515836935,8.231859101215377e-05
539611,2223154397184,1.6638607454299927,692.2126377288533,354412.8705171729,8.179118594853207e-05
541835,2232482529280,1.6632291555404664,694.9052978233159,355791.51248553774,8.07943069958128e-05
543513,2239520571392,1.6591477632522582,696.9350562256079,356830.74878751126,8.004709525266662e-05
544834,2245061246976,1.6582461643218993,698.5149932145702,357639.6765258599,7.946186815388501e-05
546541,2252220923904,1.6569991302490235,700.5529090210858,358683.08941879595,7.870959962019697e-05
548386,2259959414784,1.6534462308883666,702.7537166691092,359809.9029345839,7.790157542331144e-05
550107,2267177811968,1.6563579320907593,704.8354743227627,360875.7628532545,7.715264655416831e-05
551929,2274819833856,1.652633171081543,707.0277662998835,361998.21634554036,7.636484951945022e-05
553695,2282226974720,1.6558599662780762,709.1717543189402,363095.9382112974,7.560628728242591e-05
555203,2288551985152,1.6531555843353272,710.9832715057181,364023.4350109277,7.496249600080773e-05
557414,2297825591296,1.658251905441284,713.6318656351355,365379.51520518935,7.402522169286385e-05
559184,2305249509376,1.6528305721282959,715.7625381680682,366470.4195420509,7.328063657041639e-05
561554,2315190009856,1.6525710678100587,718.5529555138435,367899.11322308786,7.229171023936942e-05
562859,2320663576576,1.651809163093567,720.0850926347636,368683.56742899894,7.175114296842366e-05
565169,2330352418816,1.6541441106796264,722.8299401458767,370088.92935468885,7.080127397784963e-05
567134,2338594226176,1.6530265474319459,725.1825567879315,371293.46907542093,7.000035111559555e-05
569714,2349415530496,1.6502766132354736,728.2318701144516,372854.7174985992,6.895873957546428e-05
571324,2356168359936,1.651259379386902,730.1611943364807,373842.5315002781,6.831453356426209e-05
574248,2368432504832,1.6482793807983398,733.6468108414203,375627.1671508072,6.715607014484704e-05
576334,2377181822976,1.644599723815918,736.1349917772365,376901.11578994506,6.63387545500882e-05
578782,2387449479168,1.6505870151519775,739.0739024216011,378405.8380398598,6.538941670442e-05
580516,2394722402304,1.6470952558517455,741.1653197225369,379476.6436979389,6.472343375207856e-05
582668,2403748544512,1.6487390184402466,743.7736260008398,380812.09651243,6.390442285919562e-05
584669,2412141346816,1.6448141527175903,746.1684989620086,382038.2714685484,6.315039354376495e-05
586818,2421154906112,1.646544461250305,748.775306891994,383372.9571287009,6.234873580979183e-05
589001,2430311071744,1.6475654029846192,751.4020625109125,384717.8560055872,6.154309085104614e-05
591113,2439169441792,1.6450910377502441,753.9527754451661,386023.82102792506,6.077205034671351e-05
593228,2448040394752,1.6460276412963868,756.5354509950179,387346.15090944915,6.0008260334143415e-05
594605,2453815951360,1.6390350008010863,758.1939404388438,388195.29750468803,5.951549974270165e-05
597153,2464503037952,1.637695927619934,761.2579777614424,389764.0846138585,5.8613157307263464e-05
598759,2471239090176,1.6295817375183106,763.2197744939834,390768.5245409195,5.8050762163475156e-05
600863,2480063905792,1.6312317228317261,765.7391971130864,392058.46892190026,5.732145291403867e-05
602611,2487395549184,1.6288472127914428,767.8321921499497,393130.08238077426,5.67220376979094e-05
604453,2495121457152,1.6286871242523193,770.052846124808,394267.0572159017,5.609680010820739e-05
606186,2502390185984,1.626492328643799,772.1559839896788,395343.86380271555,5.5514599807793275e-05
606868,2505250701312,1.6294204493363698,772.9715416562736,395761.4293280121,5.528709516511299e-05
608527,2512209051648,1.6318567752838136,774.9540150309879,396776.4556958658,5.4737502068746835e-05
609912,2518018162688,1.6299163818359375,776.6102461643201,397624.4460361319,5.4282838391372934e-05
612696,2529695105024,1.6283214426040649,779.9288053733754,399323.5483511682,5.338044138625264e-05
614703,2538113073152,1.624042534828186,782.3509216785588,400563.67189942213,5.2739502280019224e-05
617059,2547994853376,1.64087806224823,785.185724072868,402015.0907253084,5.19974491908215e-05
618445,2553808158720,1.642133264541626,786.8536925417527,402869.0905813774,5.156615225132555e-05
620429,2562129657856,1.6360367107391358,789.3145963788388,404129.0733459655,5.0955564802279696e-05
622331,2570107224064,1.6323289918899535,791.575088748329,405286.44543914445,5.0377762818243355e-05
624509,2579242418176,1.6336705684661865,794.1814413488138,406620.89797059266,4.9725240387488157e-05
625869,2584946671616,1.63454927444458,795.8170756094358,407458.34271203115,4.932274896418676e-05
628359,2595390488576,1.6327994871139526,798.8134971637937,408992.5105478624,4.8595778935123235e-05
629947,2602051043328,1.634876675605774,800.7342204672283,409975.9208792209,4.8138899728655815e-05
631685,2609340743680,1.6314793634414673,802.8435091534811,411055.87668658234,4.764491313835606e-05
633406,2616559140864,1.630139570236206,804.9092638572018,412113.54309488734,4.716201510746032e-05
635185,2624020807680,1.6301208162307739,807.085845822091,413227.9530609106,4.666941094910726e-05
636953,2631436337152,1.6280076551437377,809.2236022954875,414322.4843752896,4.61865020042751e-05
638743,2638944141312,1.6293576526641846,811.3632367052068,415417.97719306586,4.570435703499243e-05
641116,2648897224704,1.6281150197982788,814.1943204899674,416867.4920908633,4.507573976297863e-05
643518,2658971942912,1.626808156967163,817.0790731657314,418344.4854608545,4.4451757275965065e-05
645200,2666026762240,1.6276088953018188,819.1059377729686,419382.2401397599,4.402222475619055e-05
646903,2673169661952,1.6250265979766845,821.1531966983101,420430.4367095348,4.3593576265266165e-05
649246,2682996916224,1.6259414958953857,823.9828252710962,421879.20653880126,4.3014148104703054e-05
652174,2695277838336,1.6242961645126344,827.4949100539828,423677.3939476392,4.2306914110668004e-05
654123,2703452536832,1.623875789642334,829.8569429315521,424886.7547809547,4.184658610029146e-05
656975,2715414691840,1.621957221031189,833.2863375114017,426642.6048058377,4.1188093746313825e-05
658694,2722624700416,1.626935167312622,835.3677633732899,427708.29484712443,4.07998995797243e-05
660347,2729557884928,1.6193980121612548,837.4232649987488,428760.7116793594,4.0432812966173515e-05
662381,2738089099264,1.6226060914993286,839.8879330915463,430022.6217428717,3.9989481592783704e-05
664665,2747668889600,1.6181461000442505,842.6497982943592,431436.69672671193,3.9502705476479605e-05
666543,2755545792512,1.6224317026138306,844.9222846863987,432600.20975943614,3.91112407669425e-05
668035,2761803694080,1.6199790287017821,846.7305949538775,433526.06461638527,3.8805901567684487e-05
669716,2768854319104,1.6205505228042603,848.7580324526521,434564.11261575785,3.8467911508632824e-05
671548,2776538284032,1.6194300413131715,850.955616959009,435689.2758830126,3.81068566639442e-05
673792,2785950302208,1.618673267364502,853.719618993503,437104.44492467353,3.767499583773315e-05
675542,2793290334208,1.6200080347061157,855.8460515671908,438193.1784024017,3.7346177123254165e-05
677331,2800793944064,1.6195425510406494,858.0345433676414,439313.6862042324,3.7017267459305e-05
679068,2808079450112,1.6178715658187866,860.1212525989619,440382.0813306685,3.670493606477976e-05
681041,2816354811904,1.619278564453125,862.4906129164967,441595.1938132463,3.6358578654471785e-05
682885,2824089108480,1.6154922294616698,864.7152048434652,442734.1848798542,3.6042976716998965e-05
684471,2830741274624,1.6204235363006592,866.6373480686542,443718.32221115095,3.577781535568647e-05
686548,2839452844032,1.6162836599349975,869.1610365124169,445010.45069435745,3.5439366911305115e-05
688512,2847690457088,1.6139143562316896,871.5226151134167,446219.57893806935,3.51285380020272e-05
690770,2857161195520,1.6140108585357666,874.2491468152524,447615.56316940923,3.47822715411894e-05
692653,2865059069952,1.612514958381653,876.5247202096293,448780.6567473302,3.450260192039423e-05
695067,2875184119808,1.6133524417877196,879.4364041243566,450271.4389116706,3.4156193578382954e-05
697014,2883350429696,1.613981342315674,881.7887764691106,451475.85355218465,3.3886746678035706e-05
698746,2890614964224,1.613626651763916,883.8767033670731,452544.87212394143,3.3654530852800235e-05
701452,2901964750848,1.6106493425369264,887.1577838441956,454224.78532822814,3.330586332594976e-05
703123,2908973432832,1.6123323392868043,889.1781213319924,455259.1981219801,3.309917883598246e-05
704756,2915822731264,1.6120843172073365,891.1229818776349,456254.96672134905,3.2903564715525135e-05
706984,2925167640576,1.6084727716445923,893.7894090776833,457620.17744777387,3.264685801696032e-05
708614,2932004356096,1.6117757987976074,895.7409758135146,458619.3796165195,3.2466501579619944e-05
710902,2941600923648,1.6067533922195434,898.5324710232566,460048.62516390736,3.222398299840279e-05
712655,2948953538560,1.6069003009796143,900.6844297494815,461150.42803173454,3.204659151379019e-05
714446,2956465537024,1.6114630460739137,902.828141202652,462248.0082957578,3.187291440553963e-05
716271,2964120141824,1.6145262241363525,905.0301348181574,463375.4290268966,3.170380659867078e-05
718439,2973213392896,1.6167470741271972,907.64100600771,464712.1950759475,3.1513252906734124e-05
720429,2981560057856,1.615721011161804,910.0234193265205,465931.9906951785,3.1348234188044444e-05
722346,2989600538624,1.6144047689437866,912.3105630227034,467103.0082676241,3.119822940789163e-05
724097,2996944764928,1.6113934993743897,914.3946314318039,468170.0512930836,3.106891381321475e-05
726062,3005186572288,1.6150265264511108,916.7757618370391,469389.190060564,3.093255145358853e-05
728431,3015122878464,1.6120684432983399,919.6071913458889,470838.8819690951,3.078047666349448e-05
730800,3025059184640,1.6116199398040771,922.4232348246237,472280.69623020734,3.064189513679594e-05
732477,3032093032448,1.60649507522583,924.4540804354708,473320.48918296106,3.0551960662705824e-05
734574,3040888487936,1.6101762390136718,927.0469926404139,474648.0602318919,3.0449025871348567e-05
736132,3047423213568,1.6150142908096314,929.0061162371555,475651.13151342364,3.0379411327885464e-05
737978,3055165898752,1.6060742330551148,931.2157941709505,476782.48661552666,3.0304503525258042e-05
740325,3065009930240,1.5952490282058716,934.0287519650691,478222.7210061154,3.0221137421904132e-05
741809,3071234277376,1.5957392265922146,935.8210044055174,479140.35425562493,3.0175287974998355e-05
744212,3081313189888,1.5954618740081787,938.745490208727,480637.6909868682,3.0112320018815808e-05
746818,3092243546112,1.5991949892044068,941.9034764933106,482254.579964575,3.0059802156756632e-05
748203,3098052657152,1.5962308692932128,943.6019803061257,483124.21391673636,3.0038570912438445e-05
749620,3103995985920,1.597407283782959,945.3358183610009,484011.93900083244,3.0021647035027854e-05
751627,3112413954048,1.5941282081604005,947.7966657645534,485271.89287145133,3.0005981898284517e-05
753851,3121742086144,1.5939428043365478,950.4965920289866,486654.25511884113,2.9999999242136255e-05
291 565169 2330352418816 1.6541441106796264 722.8299401458767 370088.92935468885 7.080127397784963e-05
292 567134 2338594226176 1.6530265474319459 725.1825567879315 371293.46907542093 7.000035111559555e-05
293 569714 2349415530496 1.6502766132354736 728.2318701144516 372854.7174985992 6.895873957546428e-05
294 571324 2356168359936 1.651259379386902 730.1611943364807 373842.5315002781 6.831453356426209e-05
295 574248 2368432504832 1.6482793807983398 733.6468108414203 375627.1671508072 6.715607014484704e-05
296 576334 2377181822976 1.644599723815918 736.1349917772365 376901.11578994506 6.63387545500882e-05
297 578782 2387449479168 1.6505870151519775 739.0739024216011 378405.8380398598 6.538941670442e-05
298 580516 2394722402304 1.6470952558517455 741.1653197225369 379476.6436979389 6.472343375207856e-05
299 582668 2403748544512 1.6487390184402466 743.7736260008398 380812.09651243 6.390442285919562e-05
300 584669 2412141346816 1.6448141527175903 746.1684989620086 382038.2714685484 6.315039354376495e-05
301 586818 2421154906112 1.646544461250305 748.775306891994 383372.9571287009 6.234873580979183e-05
302 589001 2430311071744 1.6475654029846192 751.4020625109125 384717.8560055872 6.154309085104614e-05
303 591113 2439169441792 1.6450910377502441 753.9527754451661 386023.82102792506 6.077205034671351e-05
304 593228 2448040394752 1.6460276412963868 756.5354509950179 387346.15090944915 6.0008260334143415e-05
305 594605 2453815951360 1.6390350008010863 758.1939404388438 388195.29750468803 5.951549974270165e-05
306 597153 2464503037952 1.637695927619934 761.2579777614424 389764.0846138585 5.8613157307263464e-05
307 598759 2471239090176 1.6295817375183106 763.2197744939834 390768.5245409195 5.8050762163475156e-05
308 600863 2480063905792 1.6312317228317261 765.7391971130864 392058.46892190026 5.732145291403867e-05
309 602611 2487395549184 1.6288472127914428 767.8321921499497 393130.08238077426 5.67220376979094e-05
310 604453 2495121457152 1.6286871242523193 770.052846124808 394267.0572159017 5.609680010820739e-05
311 606186 2502390185984 1.626492328643799 772.1559839896788 395343.86380271555 5.5514599807793275e-05
312 606868 2505250701312 1.6294204493363698 772.9715416562736 395761.4293280121 5.528709516511299e-05
313 608527 2512209051648 1.6318567752838136 774.9540150309879 396776.4556958658 5.4737502068746835e-05
314 609912 2518018162688 1.6299163818359375 776.6102461643201 397624.4460361319 5.4282838391372934e-05
315 612696 2529695105024 1.6283214426040649 779.9288053733754 399323.5483511682 5.338044138625264e-05
316 614703 2538113073152 1.624042534828186 782.3509216785588 400563.67189942213 5.2739502280019224e-05
317 617059 2547994853376 1.64087806224823 785.185724072868 402015.0907253084 5.19974491908215e-05
318 618445 2553808158720 1.642133264541626 786.8536925417527 402869.0905813774 5.156615225132555e-05
319 620429 2562129657856 1.6360367107391358 789.3145963788388 404129.0733459655 5.0955564802279696e-05
320 622331 2570107224064 1.6323289918899535 791.575088748329 405286.44543914445 5.0377762818243355e-05
321 624509 2579242418176 1.6336705684661865 794.1814413488138 406620.89797059266 4.9725240387488157e-05
322 625869 2584946671616 1.63454927444458 795.8170756094358 407458.34271203115 4.932274896418676e-05
323 628359 2595390488576 1.6327994871139526 798.8134971637937 408992.5105478624 4.8595778935123235e-05
324 629947 2602051043328 1.634876675605774 800.7342204672283 409975.9208792209 4.8138899728655815e-05
325 631685 2609340743680 1.6314793634414673 802.8435091534811 411055.87668658234 4.764491313835606e-05
326 633406 2616559140864 1.630139570236206 804.9092638572018 412113.54309488734 4.716201510746032e-05
327 635185 2624020807680 1.6301208162307739 807.085845822091 413227.9530609106 4.666941094910726e-05
328 636953 2631436337152 1.6280076551437377 809.2236022954875 414322.4843752896 4.61865020042751e-05
329 638743 2638944141312 1.6293576526641846 811.3632367052068 415417.97719306586 4.570435703499243e-05
330 641116 2648897224704 1.6281150197982788 814.1943204899674 416867.4920908633 4.507573976297863e-05
331 643518 2658971942912 1.626808156967163 817.0790731657314 418344.4854608545 4.4451757275965065e-05
332 645200 2666026762240 1.6276088953018188 819.1059377729686 419382.2401397599 4.402222475619055e-05
333 646903 2673169661952 1.6250265979766845 821.1531966983101 420430.4367095348 4.3593576265266165e-05
334 649246 2682996916224 1.6259414958953857 823.9828252710962 421879.20653880126 4.3014148104703054e-05
335 652174 2695277838336 1.6242961645126344 827.4949100539828 423677.3939476392 4.2306914110668004e-05
336 654123 2703452536832 1.623875789642334 829.8569429315521 424886.7547809547 4.184658610029146e-05
337 656975 2715414691840 1.621957221031189 833.2863375114017 426642.6048058377 4.1188093746313825e-05
338 658694 2722624700416 1.626935167312622 835.3677633732899 427708.29484712443 4.07998995797243e-05
339 660347 2729557884928 1.6193980121612548 837.4232649987488 428760.7116793594 4.0432812966173515e-05
340 662381 2738089099264 1.6226060914993286 839.8879330915463 430022.6217428717 3.9989481592783704e-05
341 664665 2747668889600 1.6181461000442505 842.6497982943592 431436.69672671193 3.9502705476479605e-05
342 666543 2755545792512 1.6224317026138306 844.9222846863987 432600.20975943614 3.91112407669425e-05
343 668035 2761803694080 1.6199790287017821 846.7305949538775 433526.06461638527 3.8805901567684487e-05
344 669716 2768854319104 1.6205505228042603 848.7580324526521 434564.11261575785 3.8467911508632824e-05
345 671548 2776538284032 1.6194300413131715 850.955616959009 435689.2758830126 3.81068566639442e-05
346 673792 2785950302208 1.618673267364502 853.719618993503 437104.44492467353 3.767499583773315e-05
347 675542 2793290334208 1.6200080347061157 855.8460515671908 438193.1784024017 3.7346177123254165e-05
348 677331 2800793944064 1.6195425510406494 858.0345433676414 439313.6862042324 3.7017267459305e-05
349 679068 2808079450112 1.6178715658187866 860.1212525989619 440382.0813306685 3.670493606477976e-05
350 681041 2816354811904 1.619278564453125 862.4906129164967 441595.1938132463 3.6358578654471785e-05
351 682885 2824089108480 1.6154922294616698 864.7152048434652 442734.1848798542 3.6042976716998965e-05
352 684471 2830741274624 1.6204235363006592 866.6373480686542 443718.32221115095 3.577781535568647e-05
353 686548 2839452844032 1.6162836599349975 869.1610365124169 445010.45069435745 3.5439366911305115e-05
354 688512 2847690457088 1.6139143562316896 871.5226151134167 446219.57893806935 3.51285380020272e-05
355 690770 2857161195520 1.6140108585357666 874.2491468152524 447615.56316940923 3.47822715411894e-05
356 692653 2865059069952 1.612514958381653 876.5247202096293 448780.6567473302 3.450260192039423e-05
357 695067 2875184119808 1.6133524417877196 879.4364041243566 450271.4389116706 3.4156193578382954e-05
358 697014 2883350429696 1.613981342315674 881.7887764691106 451475.85355218465 3.3886746678035706e-05
359 698746 2890614964224 1.613626651763916 883.8767033670731 452544.87212394143 3.3654530852800235e-05
360 701452 2901964750848 1.6106493425369264 887.1577838441956 454224.78532822814 3.330586332594976e-05
361 703123 2908973432832 1.6123323392868043 889.1781213319924 455259.1981219801 3.309917883598246e-05
362 704756 2915822731264 1.6120843172073365 891.1229818776349 456254.96672134905 3.2903564715525135e-05
363 706984 2925167640576 1.6084727716445923 893.7894090776833 457620.17744777387 3.264685801696032e-05
364 708614 2932004356096 1.6117757987976074 895.7409758135146 458619.3796165195 3.2466501579619944e-05
365 710902 2941600923648 1.6067533922195434 898.5324710232566 460048.62516390736 3.222398299840279e-05
366 712655 2948953538560 1.6069003009796143 900.6844297494815 461150.42803173454 3.204659151379019e-05
367 714446 2956465537024 1.6114630460739137 902.828141202652 462248.0082957578 3.187291440553963e-05
368 716271 2964120141824 1.6145262241363525 905.0301348181574 463375.4290268966 3.170380659867078e-05
369 718439 2973213392896 1.6167470741271972 907.64100600771 464712.1950759475 3.1513252906734124e-05
370 720429 2981560057856 1.615721011161804 910.0234193265205 465931.9906951785 3.1348234188044444e-05
371 722346 2989600538624 1.6144047689437866 912.3105630227034 467103.0082676241 3.119822940789163e-05
372 724097 2996944764928 1.6113934993743897 914.3946314318039 468170.0512930836 3.106891381321475e-05
373 726062 3005186572288 1.6150265264511108 916.7757618370391 469389.190060564 3.093255145358853e-05
374 728431 3015122878464 1.6120684432983399 919.6071913458889 470838.8819690951 3.078047666349448e-05
375 730800 3025059184640 1.6116199398040771 922.4232348246237 472280.69623020734 3.064189513679594e-05
376 732477 3032093032448 1.60649507522583 924.4540804354708 473320.48918296106 3.0551960662705824e-05
377 734574 3040888487936 1.6101762390136718 927.0469926404139 474648.0602318919 3.0449025871348567e-05
378 736132 3047423213568 1.6150142908096314 929.0061162371555 475651.13151342364 3.0379411327885464e-05
379 737978 3055165898752 1.6060742330551148 931.2157941709505 476782.48661552666 3.0304503525258042e-05
380 740325 3065009930240 1.5952490282058716 934.0287519650691 478222.7210061154 3.0221137421904132e-05
381 741809 3071234277376 1.5957392265922146 935.8210044055174 479140.35425562493 3.0175287974998355e-05
382 744212 3081313189888 1.5954618740081787 938.745490208727 480637.6909868682 3.0112320018815808e-05
383 746818 3092243546112 1.5991949892044068 941.9034764933106 482254.579964575 3.0059802156756632e-05
384 748203 3098052657152 1.5962308692932128 943.6019803061257 483124.21391673636 3.0038570912438445e-05
385 749620 3103995985920 1.597407283782959 945.3358183610009 484011.93900083244 3.0021647035027854e-05
386 751627 3112413954048 1.5941282081604005 947.7966657645534 485271.89287145133 3.0005981898284517e-05
387 753851 3121742086144 1.5939428043365478 950.4965920289866 486654.25511884113 2.9999999242136255e-05

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:debd5c63735b96a9e62fa5b44b0127c9452c341047ec2b919f82d8612674edce
size 418213162

@@ -0,0 +1,50 @@
training_steps,training_tokens,training_loss,walltime,gputime,learning_rate
25,102400000,4.309772815704346,0.26922979407836667,34.461413642030934,1.9999999494757503e-05
50,204800000,2.6095064067840577,0.5344728346386659,68.41252283374924,1.9999999494757503e-05
75,307200000,2.1840998935699463,0.7994251677144519,102.32642146744985,1.9999999494757503e-05
100,409600000,1.9669664239883422,1.066323601762864,136.48942102564658,1.9999999494757503e-05
125,512000000,1.8014992809295653,1.3312230805499679,170.39655431039589,1.9999999494757503e-05
150,614400000,1.7220078945159911,1.5958814102547383,204.2728205126065,1.9999999494757503e-05
175,716800000,1.6801378870010375,1.8606056269711062,238.1575202523016,1.9999999494757503e-05
200,819200000,1.6575293684005736,2.126106606112505,272.14164558240066,1.9999999494757503e-05
225,921600000,1.639773211479187,2.391510878373186,306.1133924317678,1.9999999494757503e-05
250,1024000000,1.6256405591964722,2.6564186745483864,340.02159034219346,1.9999999494757503e-05
275,1126400000,1.614159688949585,2.922717172069702,374.10779802492186,1.9999999494757503e-05
300,1228800000,1.6140153646469115,3.187462724955874,407.9952287943519,1.9999999494757503e-05
325,1331200000,1.6082664394378663,3.453434134056764,442.0395691592658,1.9999999494757503e-05
350,1433600000,1.5916787338256837,3.718426102696622,475.9585411451676,1.9999999494757503e-05
375,1536000000,1.5903722620010377,3.9834007768839617,509.8752994411471,1.9999999494757503e-05
400,1638400000,1.5937609910964965,4.248280933908213,543.7799595402513,1.9999999494757503e-05
425,1740800000,1.5833971071243287,4.514559538923331,577.8636209821864,1.9999999494757503e-05
450,1843200000,1.588483600616455,4.7794486881652904,611.7694320851572,1.9999999494757503e-05
475,1945600000,1.5811502838134766,5.044273908992154,645.6670603509957,1.9999999494757503e-05
500,2048000000,1.5776061773300172,5.308712950952829,679.5152577219621,1.9999999494757503e-05
525,2150400000,1.5762306451797485,5.575429370141868,713.6549593781591,1.9999999494757503e-05
550,2252800000,1.577330994606018,5.8403744682127785,747.5679319312356,1.9999999494757503e-05
575,2355200000,1.5774771738052369,6.105136559348899,781.457479596659,1.9999999494757503e-05
600,2457600000,1.571889362335205,6.370161146941192,815.3806268084726,1.9999999494757503e-05
625,2560000000,1.5669999837875366,6.636842231150017,849.5158055872022,1.9999999494757503e-05
650,2662400000,1.5683012199401856,6.901824726711475,883.4335650190689,1.9999999494757503e-05
675,2764800000,1.5606089782714845,7.166626843748192,917.3282359997686,1.9999999494757503e-05
700,2867200000,1.569625825881958,7.431464089033584,951.2274033962988,1.9999999494757503e-05
725,2969600000,1.5637955999374389,7.697967706009089,985.3398663691634,1.9999999494757503e-05
750,3072000000,1.5669568061828614,7.962892068477961,1019.250184765179,1.9999999494757503e-05
775,3174400000,1.578919801712036,8.229167067304108,1053.3333846149258,1.9999999494757503e-05
800,3276800000,1.5597226810455322,8.493772274765757,1087.2028511700169,1.9999999494757503e-05
825,3379200000,1.5684496641159058,8.760366790807014,1121.3269492232978,1.9999999494757503e-05
850,3481600000,1.555274577140808,9.025527741439927,1155.2675509043106,1.9999999494757503e-05
875,3584000000,1.5589488697052003,9.290825104582503,1189.2256133865603,1.9999999494757503e-05
900,3686400000,1.56228581905365,9.555587218982518,1223.1151640297624,1.9999999494757503e-05
925,3788800000,1.5693172216415405,9.821702597393193,1257.1779324663287,1.9999999494757503e-05
950,3891200000,1.547282567024231,10.086582040948787,1291.0825012414448,1.9999999494757503e-05
975,3993600000,1.552180905342102,10.35216691473301,1325.0773650858252,1.9999999494757503e-05
1000,4096000000,1.5544623231887817,10.617501541588314,1359.0401973233043,1.9999999494757503e-05
1025,4198400000,1.5621129417419433,10.884025346612145,1393.1552443663545,1.9999999494757503e-05
1050,4300800000,1.5600895547866822,11.148848565988725,1427.0526164465568,1.9999999494757503e-05
1075,4403200000,1.5528885984420777,11.413467440123783,1460.9238323358443,1.9999999494757503e-05
1100,4505600000,1.5599483346939087,11.678333667439569,1494.8267094322648,1.9999999494757503e-05
1125,4608000000,1.5648639726638793,11.943580217023005,1528.7782677789446,1.9999999494757503e-05
1150,4710400000,1.549267168045044,12.20822147437584,1562.6523487201075,1.9999999494757503e-05
1175,4812800000,1.5537393379211426,12.473067252154218,1596.5526082757399,1.9999999494757503e-05
1200,4915200000,1.5565906286239624,12.737777670184224,1630.4355417835807,1.9999999494757503e-05
1220,4997120000,1.549389386177063,12.94956159459549,1657.5438841082228,1.9999999494757503e-05
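
The log above is plain CSV, so it can be inspected with the standard library alone. The following is a minimal sketch, not part of the commit: the `SAMPLE` string and the `summarize` helper are hypothetical names, and only three rows are copied from the log for brevity. It derives the tokens consumed per step and the ratio of `gputime` to `walltime` for each row.

```python
import csv
import io

# Three rows copied verbatim from the log above (the full file has 49 data rows).
SAMPLE = """training_steps,training_tokens,training_loss,walltime,gputime,learning_rate
25,102400000,4.309772815704346,0.26922979407836667,34.461413642030934,1.9999999494757503e-05
500,2048000000,1.5776061773300172,5.308712950952829,679.5152577219621,1.9999999494757503e-05
1220,4997120000,1.549389386177063,12.94956159459549,1657.5438841082228,1.9999999494757503e-05
"""


def summarize(text):
    """Return per-row derived quantities from a training log in CSV form."""
    summary = []
    for row in csv.DictReader(io.StringIO(text)):
        steps = int(row["training_steps"])
        tokens = int(row["training_tokens"])
        summary.append({
            "steps": steps,
            # Cumulative tokens divided by cumulative steps.
            "tokens_per_step": tokens // steps,
            # How much GPU time accrues per unit of wall-clock time.
            "gpu_to_wall": float(row["gputime"]) / float(row["walltime"]),
        })
    return summary


summary = summarize(SAMPLE)
for entry in summary:
    print(entry)
```

For these rows the tokens-per-step value is constant and the `gputime`/`walltime` ratio is stable across the run. The ratio may correspond to the number of GPUs used, but the log itself does not state this.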

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e922e0c4112bf78d634ff506c400a651620f43e966b11e2a6fe98206c6e9a423
size 3379212

@@ -0,0 +1,50 @@
training_steps,training_tokens,training_loss,walltime,gputime,learning_rate
25,102400000,1.4206320714950562,0.26831979325136307,34.34493353617447,2.938559919130057e-05
50,204800000,1.3607354307174682,0.532874067107453,68.20788058975398,2.8771199140464887e-05
75,307200000,1.344892144203186,0.797669098578563,102.10164461805607,2.8156800908618607e-05
100,409600000,1.3339987087249756,1.0622922554949321,135.9734087033513,2.7542400857782923e-05
125,512000000,1.3272197246551514,1.3275567747589183,169.92726716914154,2.692800080694724e-05
150,614400000,1.3201901483535767,1.5918107887810053,203.75178096396868,2.6313600756111555e-05
175,716800000,1.3260539388656616,1.856745663885765,237.6634449773779,2.569920070527587e-05
200,819200000,1.3183754396438598,2.1223753731163395,271.66404775889146,2.5084800654440187e-05
225,921600000,1.3132305669784545,2.3882311399600185,305.69358591488236,2.4470400603604503e-05
250,1024000000,1.3129970407485962,2.65354331628931,339.65354448503166,2.385600055276882e-05
275,1126400000,1.3064435338973999,2.9190579786122703,373.6394212623706,2.3241600501933135e-05
300,1228800000,1.3097565698623657,3.1846308088092967,407.63274352759,2.262720045109745e-05
325,1331200000,1.2990926265716554,3.450655307807062,441.6838793993039,2.2012800400261767e-05
350,1433600000,1.3015284061431884,3.7156531358282527,475.60360138601635,2.1398400349426083e-05
375,1536000000,1.300846767425537,3.9803074113379786,509.47934865126126,2.07840002985904e-05
400,1638400000,1.2993078660964965,4.245298594229775,543.3982200614112,2.0169600247754715e-05
425,1740800000,1.2959114503860474,4.510613332994379,577.3585066232805,1.955520019691903e-05
450,1843200000,1.2931818628311158,4.775614008341895,611.2785930677626,1.8940800146083347e-05
475,1945600000,1.2935280084609986,5.040500104048112,645.1840133181583,1.8326400095247664e-05
500,2048000000,1.297581434249878,5.305721228775273,679.1323172832349,1.771200004441198e-05
525,2150400000,1.2973516607284545,5.571934178461597,713.2075748430844,1.7097599993576296e-05
550,2252800000,1.2920558738708496,5.836718603762894,747.0999812816505,1.6483199942740612e-05
575,2355200000,1.2921710443496703,6.101621121729093,781.0075035813239,1.5868799891904928e-05
600,2457600000,1.2922473382949828,6.3664986443806235,814.9118264807198,1.5254399841069244e-05
625,2560000000,1.286066074371338,6.632426781060178,848.9506279757028,1.463999979023356e-05
650,2662400000,1.2801355123519897,6.897544031576831,882.8856360418343,1.4025599739397876e-05
675,2764800000,1.2844274616241456,7.16281749378081,916.8406392039436,1.3411199688562192e-05
700,2867200000,1.2837993860244752,7.427888308341158,950.7697034676683,1.2796799637726508e-05
725,2969600000,1.277207851409912,7.694053462864512,984.8388432466576,1.2182399586890824e-05
750,3072000000,1.2725739479064941,7.9594902915458245,1018.8147573178655,1.1568000445549842e-05
775,3174400000,1.279445676803589,8.224737335718995,1052.7663789720314,1.0953600394714158e-05
800,3276800000,1.2785338878631591,8.4897940390995,1086.693637004736,1.0339200343878474e-05
825,3379200000,1.2763902473449706,8.754925038348853,1120.6304049086532,9.72480029304279e-06
850,3481600000,1.273976821899414,9.020435715932496,1154.6157716393595,9.110400242207106e-06
875,3584000000,1.2713893747329712,9.285925992178,1188.598526998784,8.496000191371422e-06
900,3686400000,1.2784578609466553,9.551624662507189,1222.6079568009202,7.881600140535738e-06
925,3788800000,1.2703066444396973,9.81712425603072,1256.5919047719321,7.267200089700054e-06
950,3891200000,1.271108751296997,10.082206830408738,1290.5224742923185,6.6528000388643704e-06
975,3993600000,1.2680458974838258,10.347317189690123,1324.4566002803358,6.0383999880286865e-06
1000,4096000000,1.2702019023895263,10.612797046162356,1358.4380219087816,5.4239999371930026e-06
1025,4198400000,1.2751475191116333,10.87931673718002,1392.5525423590425,4.809599886357319e-06
1050,4300800000,1.266355185508728,11.143856210205902,1426.4135949063555,4.195199835521635e-06
1075,4403200000,1.2681057167053222,11.408356219250548,1460.2695960640701,3.580800012059626e-06
1100,4505600000,1.271327452659607,11.672805533763803,1494.1191083217668,2.9663999612239422e-06
1125,4608000000,1.2650820541381835,11.938276334099923,1528.0993707647901,2.3519999103882583e-06
1150,4710400000,1.270536971092224,12.203680372205294,1562.0710876422777,1.737599973239412e-06
1175,4812800000,1.2631586408615112,12.468635068036095,1595.9852887086201,1.1232000360905658e-06
1200,4915200000,1.2647824430465697,12.734026105942206,1629.9553415606024,5.087999852548819e-07
1220,4997120000,1.2646446466445922,12.946103368596539,1657.101231180357,1.727999965339677e-08
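
The `learning_rate` column in the log above falls by the same amount every 25 steps, which is consistent with a linear decay over the token count. The sketch below checks that reading against three logged rows. `PEAK_LR` and `TOTAL_TOKENS` are assumptions inferred from the numbers, not values stated anywhere in the commit, and `linear_decay` is a hypothetical helper.

```python
# Assumed schedule parameters, inferred from the logged values (not from the commit).
PEAK_LR = 3e-5
TOTAL_TOKENS = 5_000_000_000

# (training_tokens, learning_rate) pairs copied from the log above.
LOGGED = [
    (102400000, 2.938559919130057e-05),
    (3072000000, 1.1568000445549842e-05),
    (4997120000, 1.727999965339677e-08),
]


def linear_decay(tokens):
    """Learning rate under a linear decay from PEAK_LR to zero at TOTAL_TOKENS."""
    return PEAK_LR * (1.0 - tokens / TOTAL_TOKENS)


# Largest relative gap between the assumed schedule and the logged values.
max_rel_err = max(abs(linear_decay(t) - lr) / lr for t, lr in LOGGED)
print(f"max relative error: {max_rel_err:.2e}")
```

The remaining gap is small enough to be explained by the rates having been stored in single precision, so the linear form is a plausible description of this log rather than a confirmed training configuration.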

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e67c420a23acc8902f5b1ef57e8baa6c7ffcd5c15b90783828bfaf156e1219a1
size 3357018

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cc8b337f3bd69430af103f927c1d838d29c158bd29bdfdc12f69405b37e49441
size 4924315872

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:efe1c9aaf6bb27991347d4f1fa47eb53ba71ef4163db6b1d5491c48690626b9a
size 4983047384

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5117b407cc109ad9d83616ae7cad460566421acc167668c2c992fc073ff4c113
size 3506598760

@@ -0,0 +1,330 @@
{
"metadata": {
"total_size": 13413924864
},
"weight_map": {
"model.embed_tokens.weight": "model-00001-of-00003.safetensors",
"model.norm.weight": "model-00001-of-00003.safetensors",
"lm_head.weight": "model-00001-of-00003.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
"model.layers.2.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
"model.layers.3.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
"model.layers.4.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
"model.layers.5.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
"model.layers.6.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
"model.layers.7.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
"model.layers.8.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
"model.layers.9.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.self_attn.rotary_emb.inv_freq": "model-00001-of-00003.safetensors",
"model.layers.10.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
"model.layers.11.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
"model.layers.12.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
"model.layers.13.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
"model.layers.14.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
"model.layers.15.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
"model.layers.16.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
"model.layers.17.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
"model.layers.18.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
"model.layers.19.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
"model.layers.20.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.20.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
"model.layers.21.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.self_attn.rotary_emb.inv_freq": "model-00002-of-00003.safetensors",
"model.layers.22.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.22.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
"model.layers.23.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.23.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
"model.layers.24.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
"model.layers.25.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
"model.layers.26.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
"model.layers.27.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
"model.layers.28.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
"model.layers.29.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
"model.layers.30.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.30.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
"model.layers.31.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.self_attn.rotary_emb.inv_freq": "model-00003-of-00003.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00003-of-00003.safetensors"
}
}

30
special_tokens_map.json Normal file
@@ -0,0 +1,30 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<pad>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

129170
tokenizer.json Normal file

File diff suppressed because it is too large

49
tokenizer_config.json Normal file
@@ -0,0 +1,49 @@
{
"add_bos_token": true,
"add_eos_token": false,
"added_tokens_decoder": {
"0": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"3": {
"content": "<pad>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [],
"bos_token": "<s>",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": true,
"model_max_length": 1000000000000000000000000000000,
"pad_token": "<pad>",
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>"
}