Initialize project; model provided by the ModelHub XC community

Model: jjjjjvvvvv/business-news-generator Source: Original Platform
2026-05-10 00:10:57 +08:00
commit 6f2b6591a0
11 changed files with 294208 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,65 @@
+---
+library_name: transformers
+license: apache-2.0
+base_model: HuggingFaceTB/SmolLM-135M
+tags:
+- generated_from_trainer
+model-index:
+- name: business-news-generator
+  results: []
+---
+
+
+# business-news-generator
+
+This model is a fine-tuned version of [HuggingFaceTB/SmolLM-135M](https://huggingface.co/HuggingFaceTB/SmolLM-135M) on the Business category of the AG News dataset.
+It achieves the following results on the evaluation set:
+- Loss: 3.1695
+
+## Model description
+
+This model is a fine-tuned version of SmolLM-135M trained on the AG News dataset to generate business news-style text. The model learns patterns from financial and economic news articles and generates short business-related text based on prompts.
+
+## Intended uses & limitations
+
+This model is intended for educational purposes and experimentation with text generation. It can generate simple business news-style text based on prompts such as earnings reports, stock market updates, and merger announcements.
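A minimal sketch of prompting the model for a short continuation. It assumes the checkpoint can be loaded under the repository name shown in the commit header (`jjjjjvvvvv/business-news-generator`); the helper names and the sampling values are illustrative and not taken from the original training code.

```python
def build_generation_kwargs(max_new_tokens=60, temperature=0.8, top_p=0.95):
    """Return sampling settings for short continuations (illustrative values)."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
    }


def generate_news(prompt, model_id="jjjjjvvvvv/business-news-generator"):
    """Generate one continuation of `prompt` with the fine-tuned model."""
    # Imported lazily so build_generation_kwargs works without transformers.
    from transformers import pipeline

    # model_id is an assumption: it mirrors the name in the commit header.
    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, **build_generation_kwargs())
    return outputs[0]["generated_text"]
```

Calling `generate_news("Shares of the company rose after")` downloads the weights on first use. Given the limitations listed below, the output should be treated as stylistic imitation rather than factual reporting.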
+
+Limitations include occasional incoherent sentences, lack of factual accuracy, and reduced performance when using small training subsets or parameter-efficient fine-tuning methods like LoRA.
+
+## Training and evaluation data
+
+The model was trained on the AG News dataset, specifically filtered to include only business-related articles. The dataset contains labeled news text across categories such as World, Sports, Business, and Sci/Tech. Only the Business category was used for fine-tuning.
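The category filter described above could look roughly like the following. This is a sketch, not the author's preprocessing script: it assumes the Hugging Face Hub copy of AG News (`fancyzhx/ag_news`), whose integer labels follow the order World, Sports, Business, Sci/Tech, and the function names are hypothetical.

```python
# Label order as published with the Hub copy of AG News (assumed here).
AG_NEWS_LABELS = ["World", "Sports", "Business", "Sci/Tech"]
BUSINESS_LABEL = AG_NEWS_LABELS.index("Business")


def is_business(example):
    """True when a record carries the Business label."""
    return example["label"] == BUSINESS_LABEL


def load_business_split(split="train"):
    """Load one AG News split and keep only Business articles."""
    # Imported lazily so is_business can be used without datasets installed.
    from datasets import load_dataset

    # The dataset id is an assumption; the README only says "AG News".
    dataset = load_dataset("fancyzhx/ag_news", split=split)
    return dataset.filter(is_business)
```

The predicate is kept separate from the loader so it can be checked on a few hand-written records before the full dataset is downloaded.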
+
+## Training procedure
+
+The model was fine-tuned using the Hugging Face Transformers library. Training was performed for 2 epochs using a batch size of 8 and a cosine learning rate scheduler. Both full fine-tuning and LoRA-based fine-tuning approaches were implemented and compared.
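To make the cosine scheduler concrete, the sketch below computes the learning rate at a given step. It assumes no warmup (none is listed in this README), a single half-cosine cycle, and the peak learning rate of 0.0005 from the hyperparameter list; the total step count is left as a parameter.

```python
import math


def cosine_lr(step, total_steps, peak_lr=5e-4):
    """Learning rate at `step` for a half-cosine decay from peak_lr to 0.

    Assumes no warmup phase, which the README does not mention.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the rate starts at the peak, falls to half the peak at the midpoint of training, and reaches zero at the final step.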
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 0.0005
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- optimizer: AdamW (`OptimizerNames.ADAMW_TORCH_FUSED`) with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments
+- lr_scheduler_type: cosine
+- num_epochs: 2
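The hyperparameters above map onto `transformers.TrainingArguments` roughly as follows. This is a reconstruction rather than the original training script: it assumes a single device (so the listed batch sizes are per-device values), and `output_dir` is a placeholder name.

```python
# Values copied from the hyperparameter list; the keys are TrainingArguments fields.
HYPERPARAMETERS = {
    "learning_rate": 5e-4,
    "per_device_train_batch_size": 8,  # assumes a single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 2,
}


def make_training_arguments(output_dir="business-news-generator"):
    """Build TrainingArguments from the values above (output_dir is a placeholder)."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

Settings that this README does not list, such as warmup, weight decay, and gradient accumulation, are left at the library defaults here.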
+
+### Training results
+
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 3.1445        | 0.32  | 200  | 3.2506          |
+| 2.8323        | 0.64  | 400  | 3.1549          |
+| 2.6594        | 0.96  | 600  | 3.0463          |
+| 1.6890        | 1.28  | 800  | 3.1806          |
+| 1.5107        | 1.60  | 1000 | 3.1657          |
+| 1.4594        | 1.92  | 1200 | 3.1695          |
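The table can be read more easily with a little arithmetic. The sketch below copies the rows above and derives the steps per epoch, the checkpoint with the lowest validation loss, and the corresponding perplexity. It assumes the reported loss is the mean per-token cross-entropy in nats, so that perplexity is `exp(loss)`; the function names are illustrative.

```python
import math

# (step, epoch, training_loss, validation_loss), copied from the table above.
RESULTS = [
    (200, 0.32, 3.1445, 3.2506),
    (400, 0.64, 2.8323, 3.1549),
    (600, 0.96, 2.6594, 3.0463),
    (800, 1.28, 1.6890, 3.1806),
    (1000, 1.60, 1.5107, 3.1657),
    (1200, 1.92, 1.4594, 3.1695),
]


def steps_per_epoch(results):
    """Infer optimizer steps per epoch from the first logged row."""
    step, epoch = results[0][0], results[0][1]
    return round(step / epoch)


def best_checkpoint(results):
    """Return the row with the lowest validation loss."""
    return min(results, key=lambda row: row[3])


def perplexity(loss):
    """Perplexity, assuming `loss` is mean per-token cross-entropy in nats."""
    return math.exp(loss)
```

With these rows, `steps_per_epoch(RESULTS)` gives 625, which at a batch size of 8 corresponds to about 5,000 training examples per epoch if a single device and no gradient accumulation are assumed. The lowest validation loss is 3.0463 at step 600 (perplexity about 21.0), while the final value of 3.1695 corresponds to a perplexity of about 23.8. Training loss keeps falling during the second epoch while validation loss rises, which is consistent with overfitting.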
+
+
+### Framework versions
+
+- Transformers 4.57.6
+- PyTorch 2.10.0+cu128
+- Datasets 4.8.4
+- Tokenizers 0.22.2