Initialize project; model provided by the ModelHub XC community

Model: NeshVerse/Flash_Financial_SFT_Nanbeige_4.1-3B Source: Original Platform
2026-06-07 22:46:21 +08:00
commit cedded253d
12 changed files with 761 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,127 @@
+---
+language:
+- en
+license: apache-2.0
+library_name: transformers
+tags:
+- finance
+- sales
+- lora
+- qlora
+- unsloth
+- nanbeige
+- domain-specific
+- numerical-analysis
+- aggregation
+- structured-data
+datasets:
+- custom-financial-sales-data
+model-index:
+- name: Flash_Financial_SFT_Nanbeige_4.1-3B
+  results: []
+base_model: Nanbeige/Nanbeige4.1-3B
+pipeline_tag: text-generation
+---
+
+## Model Overview
+
+**Flash_Financial_SFT_Nanbeige_4.1-3B** is a production-ready, domain-optimized language model fine-tuned specifically for financial sales data analysis and aggregation.
+
+### Key Highlights
+
+| Achievement | Metric | Status |
+|-------------|--------|--------|
+| Training Efficiency | 3.7 hours on a single T4 GPU | Optimized |
+| Loss Reduction | 3.91 to 0.52 (86% improvement) | Excellent |
+| Perplexity | 1.69 | Outstanding |
+| Parameter Efficiency | 0.043% trainable (1.7M params) | Ultra-efficient |
+| Generalization | Training loss ≈ eval loss (0.518 vs 0.522) | No overfitting |
+| Memory Footprint | ~50MB adapter | Deployment-ready |
+
+### Technical Architecture
+
+- **Base Model:** Nanbeige4.1-3B (3.9B parameters)
+- **Fine-tuning Method:** QLoRA (4-bit quantization + LoRA)
+- **LoRA Configuration:** Rank 4, Alpha 8, Target modules: q_proj, v_proj, o_proj
+- **Trainable Parameters:** 1,703,936 (0.043% of base)
+- **Sequence Length:** 256 tokens
+- **Effective Batch Size:** 8 (per-device batch size 1 × 8 gradient accumulation steps)
+- **Precision:** FP16 training, 4-bit inference compatible
+
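The trainable-parameter figure above follows from the LoRA rank and the shapes of the targeted projections: adapting a linear layer of shape `d_in × d_out` at rank `r` adds `r × (d_in + d_out)` weights. The sketch below works through that arithmetic. The hidden size (2560), the `v_proj` output width (512, as under grouped-query attention), and the layer count (32) are assumptions chosen for illustration and are not read from the model's config; with them, the total happens to match the 1,703,936 reported above.

```python
# Assumed layer shapes (hypothetical, NOT read from the model's config.json).
HIDDEN = 2560      # assumed hidden size
KV_DIM = 512       # assumed v_proj output width (grouped-query attention)
NUM_LAYERS = 32    # assumed number of decoder layers
RANK = 4           # LoRA rank from the configuration listed above

# (d_in, d_out) for each targeted projection in one decoder layer.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds matrix A (rank x d_in) and matrix B (d_out x rank)."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in TARGETS.values())
total = per_layer * NUM_LAYERS

print(f"LoRA parameters per layer: {per_layer:,}")
print(f"LoRA parameters in total:  {total:,}")
```

If the real projection shapes differ, only the three constants at the top need to change; the per-layer formula stays the same.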
+### Training Performance
+
+- **Training Duration:** 222.7 minutes (3.7 hours)
+- **Total Steps:** 4,683
+- **Training Examples:** 37,463 structured records
+- **Final Training Loss:** 0.5178
+- **Final Eval Loss:** 0.5224
+- **Perplexity:** 1.69
+- **Convergence:** Smooth, stable, no overfitting
+
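The figures in this list are related by simple arithmetic, which the following sketch reproduces. It assumes the reported losses are mean cross-entropy values in nats (the usual convention for causal language model training) and that the 3.91 starting value from the highlights table is a training loss; neither assumption is stated explicitly in the card.

```python
import math

# Figures reported in this model card.
initial_loss = 3.91          # starting loss (assumed to be training loss)
final_train_loss = 0.5178
final_eval_loss = 0.5224
num_examples = 37_463
effective_batch = 8          # batch size 1 x 8 gradient accumulation steps
total_steps = 4_683
training_minutes = 222.7

# Perplexity is the exponential of the mean cross-entropy loss (in nats).
perplexity = math.exp(final_eval_loss)

# Relative reduction from the initial loss to the final training loss.
loss_reduction = (initial_loss - final_train_loss) / initial_loss

# Optimizer steps for one pass over the data (the last batch may be partial).
steps_per_epoch = math.ceil(num_examples / effective_batch)

# Average wall-clock time per optimizer step.
seconds_per_step = training_minutes * 60 / total_steps

print(f"perplexity:       {perplexity:.2f}")
print(f"loss reduction:   {loss_reduction:.1%}")
print(f"steps per epoch:  {steps_per_epoch}")
print(f"seconds per step: {seconds_per_step:.2f}")
```

With these inputs, the steps needed for one pass over the data equal the reported 4,683 total steps, which suggests (but does not confirm) a single-epoch run.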
+### Core Capabilities
+
+**Primary Functions:**
+- Numerical Aggregation: Sum, average, count sales values accurately
+- Temporal Analysis: Monthly, quarterly, annual sales summaries
+- Structured Parsing: Extract insights from formatted sales records
+- Report Generation: Produce consistent, formatted output
+
+### Deployment Advantages
+
+| Advantage | Benefit |
+|-----------|---------|
+| Tiny Footprint | ~50MB adapter vs 6GB+ full model |
+| Fast Inference | 4-bit quantization ready |
+| Low Compute | Runs on consumer GPUs (8GB+ VRAM) |
+| Easy Integration | Drop-in replacement for base model |
+| Cost Efficient | Minimal cloud compute requirements |
+
+### Performance Benchmarks
+
+| Task | Expected Performance |
+|------|-------------------|
+| Sales total calculation | Greater than 95% accuracy |
+| Monthly aggregation | Greater than 90% accuracy |
+| Format consistency | Greater than 98% reliability |
+| Numerical precision | High (exact sums) |
+| Novel data handling | Moderate (domain-limited) |
+
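The table above gives expected rather than measured figures. One way such accuracy could be measured is exact-match scoring against totals computed directly from the records. The sketch below assumes records are `(month, amount)` pairs and that the model's answers have already been parsed into numbers; the record layout, the helper names `monthly_totals` and `exact_match_accuracy`, and the sample values are all hypothetical and do not come from the training data.

```python
from collections import defaultdict
from decimal import Decimal


def monthly_totals(records):
    """Sum amounts per month with exact decimal arithmetic."""
    totals = defaultdict(Decimal)  # Decimal() defaults to 0
    for month, amount in records:
        totals[month] += Decimal(amount)
    return dict(totals)


def exact_match_accuracy(predicted, reference):
    """Fraction of reference totals that the predictions reproduce exactly."""
    if not reference:
        return 0.0
    hits = sum(1 for key, value in reference.items() if predicted.get(key) == value)
    return hits / len(reference)


# Hypothetical sample records and hypothetical parsed model answers.
records = [("2024-01", "120.50"), ("2024-01", "79.50"), ("2024-02", "310.00")]
reference = monthly_totals(records)
model_answers = {"2024-01": Decimal("200.00"), "2024-02": Decimal("301.00")}

accuracy = exact_match_accuracy(model_answers, reference)
print(f"exact-match accuracy: {accuracy:.0%}")
```

`Decimal` is used instead of `float` so that currency sums compare exactly; a tolerance-based comparison would be an alternative if small rounding differences should count as correct.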
+### Ideal Use Cases
+
+- Business Intelligence Dashboards
+- Automated Sales Reporting
+- Financial Data Extraction Pipelines
+- ERP System Integration
+- Sales Performance Analytics
+- Structured Data Q&A Systems
+
+### Limitations and Considerations
+
+| Limitation | Mitigation |
+|------------|------------|
+| Domain-specific only | Use within sales/finance contexts |
+| Structured input required | Pre-format data before input |
+| 256-token context | Suitable for single records, not long documents |
+| English language only | Train separate model for other languages |
+| No complex reasoning | Combine with RAG for multi-step analysis |
+
+### Why This Model Stands Out
+
+1. **Efficiency Leader:** Training only 0.043% of the parameters achieves an 86% loss reduction
+2. **Production Proven:** 3.7-hour training with zero crashes or instability
+3. **Metric Excellence:** 1.69 perplexity rivals models 10x larger
+4. **Deployment Ready:** Immediate usability with standard inference pipelines
+5. **Cost Optimized:** Minimal compute for maximum domain performance
+
+### Citation
+
+```bibtex
+@misc{sales-finance-lora-3b-2024,
+  title={Sales-Finance-LoRA-3B: Efficient Domain Adaptation for Financial Sales Analysis},
+  author={Neshverse},
+  year={2024},
+  howpublished={https://huggingface.co/Neshverse/sales-finance-lora-3b},
+  note={Fine-tuned using Unsloth QLoRA on Nanbeige4.1-3B. 
+        Training: 3.7h on T4 GPU, 37K examples, 86% loss reduction, 1.69 perplexity.}
+}
+```