---
pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation
- metadata-localization
- metadata-ablation
- 1b
- with-metadata
- pretraining
---

# combined_only_country_with_metadata_1b

## Summary

This repo contains the country 1b model at the final 10k-step checkpoint for the metadata localization project. It was trained from scratch on the project corpus, using the Llama 3.2 tokenizer and vocabulary.

## Variant Metadata

- Stage: `pretrain`
- Family: `metadata_ablation`
- Size: `1b`
- Metadata condition: `with_metadata`
- Base model lineage: `Trained from scratch; tokenizer/vocabulary from meta-llama/Llama-3.2-1B`

## Weights & Biases Provenance

- Run name: `19/03/2026_15:29:19_combined_only_country_with_metadata_1b`
- Internal run URL: `https://wandb.ai/iamshnoo/nanotron/runs/asmtp179`
- Note: the Weights & Biases workspace is private; public readers should use the summarized metrics and configuration below.
- State: `finished`
- Runtime: `57h 1m 11s`

## Run Summary

- `KPI/train_lm_loss`: `2.1391`
- `KPI/train_perplexity`: `8.492`
- `KPI/val_loss`: `2.2036`
- `KPI/val_perplexity`: `9.0577`
- `KPI/consumed_tokens/train`: `41,943,040,000`
- `_step`: `10,000`

## Training Configuration

- `train_steps`: `10,000`
- `sequence_length`: `2,048`
- `micro_batch_size`: `8`
- `batch_accumulation_per_replica`: `32`
- `learning_rate`: `0.003`
- `min_decay_lr`: `0.0003`
- `checkpoint_interval`: `1,000`

## Training Curves

Static plots below were exported from the private Weights & Biases run and embedded here for public access.

### Train Loss

![Train Loss](assets/train_loss.png)

### Validation Perplexity

![Validation Perplexity](assets/val_perplexity.png)

### Throughput

![Throughput](assets/tokens_per_sec.png)

## Project Context

This model is part of the metadata localization release.
Related checkpoints and variants are grouped in the public Hugging Face collection [Metadata Conditioned LLMs](https://huggingface.co/collections/iamshnoo/metadata-conditioned-llms).

- Training data source: [News on the Web (NOW) Corpus](https://www.english-corpora.org/now/)
- Project repository: [https://github.com/iamshnoo/metadata_localization](https://github.com/iamshnoo/metadata_localization)
- Paper: [https://arxiv.org/abs/2601.15236](https://arxiv.org/abs/2601.15236)

Last synced: `2026-04-02 14:43:13 UTC`
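
The figures reported under Run Summary and Training Configuration can be cross-checked against each other with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that this card does not state: that perplexity is computed as `exp(loss)` with the loss in nats per token, and that the global batch is the per-replica batch multiplied by the number of data-parallel replicas. The variable names are chosen for this example and do not come from the training code.

```python
import math

# Values copied from the Run Summary and Training Configuration sections.
train_loss = 2.1391
val_loss = 2.2036
reported_train_ppl = 8.492
reported_val_ppl = 9.0577
consumed_tokens = 41_943_040_000
train_steps = 10_000
sequence_length = 2_048
micro_batch_size = 8
batch_accumulation_per_replica = 32

# Assumption: perplexity = exp(loss), with loss measured in nats per token.
train_ppl = math.exp(train_loss)
val_ppl = math.exp(val_loss)

# Tokens processed per optimizer step across the whole job.
tokens_per_step = consumed_tokens // train_steps

# Tokens one replica processes per optimizer step.
tokens_per_replica_step = (
    sequence_length * micro_batch_size * batch_accumulation_per_replica
)

# Assumption: global batch = per-replica batch x data-parallel replicas,
# so this ratio is the implied replica count (not stated in the card).
implied_replicas = tokens_per_step / tokens_per_replica_step

print(f"train perplexity from loss: {train_ppl:.3f} (reported {reported_train_ppl})")
print(f"val perplexity from loss:   {val_ppl:.3f} (reported {reported_val_ppl})")
print(f"tokens per step:            {tokens_per_step:,}")          # 4,194,304
print(f"tokens per replica step:    {tokens_per_replica_step:,}")  # 524,288
print(f"implied replicas:           {implied_replicas:g}")         # 8
```

Under these assumptions the reported perplexities agree with the reported losses to within rounding, and the token count is consistent with roughly 4.2M tokens per step. The replica count of 8 is an inference from the numbers, not a documented setting.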