Initialize project; model provided by the ModelHub XC community

Model: cglez/gpt2-ag_news Source: Original Platform
2026-05-30 19:21:46 +08:00
commit 5de9e6fb4d
16 changed files with 150540 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,97 @@
+---
+library_name: transformers
+language: en
+license: mit
+datasets:
+- fancyzhx/ag_news
+base_model:
+- openai-community/gpt2
+---
+
+# Model Card: GPT-2-AG-News
+
+An in-domain GPT-2, pre-trained from scratch on the AG-News dataset texts.
+
+## Model Details
+
+### Description
+
+This model uses the [GPT-2](https://huggingface.co/openai-community/gpt2)
+architecture and was pre-trained from scratch (in-domain) on the text of the AG-News dataset, excluding its test split.
+
+- **Developed by:** [Cesar Gonzalez-Gutierrez](https://ceguel.es)
+- **Funded by:** [ERC](https://erc.europa.eu)
+- **Architecture:** GPT-2
+- **Language:** English
+- **License:** MIT
+- **Base model:** [GPT-2](https://huggingface.co/openai-community/gpt2)
+
+### Checkpoints
+
+Intermediate checkpoints from the pre-training process are available and can be accessed using specific tags,
+which correspond to training epochs and steps:
+
+| Epoch | Step | Epoch tag | Step tag |
+|---|---|---|---|
+| 1 | 1125 | epoch-1 | step-1125 |
+| 5 | 5625 | epoch-5 | step-5625 |
+| 10 | 11250 | epoch-10 | step-11250 |
+| 20 | 22500 | epoch-20 | step-22500 |
+| 30 | 33750 | epoch-30 | step-33750 |
+| 40 | 45000 | epoch-40 | step-45000 |
+| 50 | 56250 | epoch-50 | step-56250 |
+| 60 | 67500 | epoch-60 | step-67500 |
+| 70 | 78750 | epoch-70 | step-78750 |
+| 80 | 90000 | epoch-80 | step-90000 |
+| 90 | 101250 | epoch-90 | step-101250 |
+| 100 | 112500 | epoch-100 | step-112500 |
+
+To load a model from a specific intermediate checkpoint, use the `revision` parameter with the corresponding tag:
+```python
+from transformers import AutoModelForCausalLM
+
+model = AutoModelForCausalLM.from_pretrained("<model-name>", revision="<checkpoint-tag>")
+```
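
The tag names in the checkpoint table follow a regular pattern, so they can be derived from the epoch number. The sketch below is illustrative only: `checkpoint_tags` is a hypothetical helper, and the 1125 steps per epoch is inferred from the table rather than stated by the author.

```python
# Sketch only: STEPS_PER_EPOCH is inferred from the checkpoint table
# (step 1125 at epoch 1); checkpoint_tags is a hypothetical helper name.
STEPS_PER_EPOCH = 1125
SAVED_EPOCHS = (1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100)


def checkpoint_tags(epoch: int) -> tuple[str, str]:
    """Return the (epoch tag, step tag) pair listed for a saved epoch."""
    if epoch not in SAVED_EPOCHS:
        raise ValueError(f"no checkpoint is listed for epoch {epoch}")
    return f"epoch-{epoch}", f"step-{epoch * STEPS_PER_EPOCH}"
```

Either string in the returned pair should work as the `revision` argument, since the table lists both tags for the same checkpoint.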
+
+### Sources
+
+- **Paper:** [Information pending]
+
+## Training Details
+
+For more details on the training procedure, please refer to the base model's documentation:
+[Training procedure](https://huggingface.co/openai-community/gpt2#training-procedure).
+
+### Training Data
+
+All texts from the AG-News dataset, excluding the test partition.
+
+#### Training Hyperparameters
+
+- **Precision:** fp16
+- **Batch size:** 8
+- **Gradient accumulation steps:** 12
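
As a rough worked example, the batch size and gradient accumulation steps combine into an effective batch size per optimizer update. The figures below assume a single device, that the listed batch size is per device, and that each step in the checkpoint table is one optimizer update. The card confirms none of these, so treat the result as an estimate.

```python
# Assumptions (not stated in the card): one device, batch size is per device,
# and one "step" in the checkpoint table equals one optimizer update.
per_device_batch_size = 8
gradient_accumulation_steps = 12
steps_at_epoch_1 = 1125  # from the checkpoint table

# Sequences contributing to each optimizer update.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

# Sequences seen in one epoch under the assumptions above.
sequences_per_epoch = effective_batch_size * steps_at_epoch_1

print(effective_batch_size)  # 96
print(sequences_per_epoch)   # 108000
```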
+
+## Uses
+
+For typical use cases and limitations, please refer to the base model's guidance:
+[Intended uses & limitations](https://huggingface.co/openai-community/gpt2#intended-uses--limitations).
+
+## Bias, Risks, and Limitations
+
+This model inherits potential risks and limitations from the base model. Refer to:
+[Limitations and bias](https://huggingface.co/openai-community/gpt2#limitations-and-bias).
+
+## Environmental Impact
+
+- **Hardware Type:** NVIDIA A100 PCIE 40GB
+- **Hours used:** 15
+- **Cluster Provider:** [Artemisa](https://artemisa.ific.uv.es/web/)
+- **Compute Region:** EU
+- **Carbon Emitted:** 1.62 kg CO2 eq.
+
+## Citation
+
+**BibTeX:**
+
+[More Information Needed]