Initialize project; model provided by the ModelHub XC community

Model: ik-ram28/BioMistral-CPT-SFT-7B Source: Original Platform
2026-05-26 00:58:17 +08:00
commit 3f6c4b64dc
18 changed files with 268600 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,84 @@
+---
+library_name: transformers
+tags:
+- medical
+license: apache-2.0
+language:
+- fr
+- en
+base_model:
+- ik-ram28/BioMistral-CPT-7B
+- BioMistral/BioMistral-7B
+---
+
+## Model Description
+
+BioMistral-CPT-SFT-7B is a French medical language model built on BioMistral-7B. It was adapted to the French medical domain in two stages: Continual Pre-Training (CPT), then Supervised Fine-Tuning (SFT).
+
+## Model Details
+
+- **Model Type**: Causal Language Model
+- **Base Model**: BioMistral-7B 
+- **Language**: French (adapted from an English medical model)
+- **Domain**: Medical/Healthcare
+- **Parameters**: 7 billion
+- **License**: Apache 2.0
+- **Paper**: [Adaptation des connaissances médicales pour les grands modèles de langue : Stratégies et analyse comparative](https://github.com/ikram28/medllm-strategies)
+
+## Training Details
+
+### Continual Pre-Training (CPT)
+- **Dataset**: NACHOS corpus (opeN crAwled frenCh Healthcare cOrpuS)
+  - **Size**: 7.4 GB of French medical texts
+  - **Word Count**: Over 1 billion words
+  - **Sources**: 24 French medical websites
+- **Training Duration**: 2.8 epochs
+- **Hardware**: 32 NVIDIA H100 80GB GPUs
+- **Training Time**: 11 hours
+- **Optimizer**: AdamW
+- **Learning Rate**: 2e-5
+- **Weight Decay**: 0.01
+- **Batch Size**: 16, with 2 gradient accumulation steps
+
+### Supervised Fine-Tuning (SFT)
+- **Dataset**: 30K French medical question-answer pairs
+  - 10K native French medical questions
+  - 10K translated medical questions from English resources
+  - 10K generated questions from French medical texts
+- **Method**: DoRA (Weight-Decomposed Low-Rank Adaptation)
+- **Training Duration**: 10 epochs
+- **Hardware**: 1 NVIDIA H100 80GB GPU
+- **Training Time**: 42 hours
+- **Rank**: 16
+- **Alpha**: 16
+- **Learning Rate**: 2e-5
+- **Batch Size**: 4
+
+
+
+
+## Computational Impact
+
+- **Total Training Time**: 53 hours (11h CPT + 42h SFT)
+- **Hardware**: 32 NVIDIA H100 80GB GPUs (CPT) + 1 NVIDIA H100 80GB GPU (SFT)
+- **Carbon Emissions**: 10.11 kgCO2e (9.04 + 1.07)
+
+
+
+## Ethical Considerations
+
+- **Medical Accuracy**: This model is intended for research and educational purposes only. Its performance limitations make it unsuitable for critical medical applications.
+- **Bias**: The model may reflect biases present in both English and French medical literature.
+
+
+## Citation
+
+If you use this model, please cite:
+
+```bibtex
+
+```
+
+## Contact
+
+For questions about this model, please contact: ikram.belmadani@lis-lab.fr
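
The figures in the README's Computational Impact section can be cross-checked with simple arithmetic. The sketch below is illustrative only: the `Stage` class and the helper names are hypothetical, and it assumes that the emission terms "9.04 + 1.07" follow the same CPT-then-SFT order as the time terms, which the README does not state explicitly. It also assumes the CPT batch size of 16 is a per-GPU value when estimating the effective batch size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One training stage, using figures reported in the README."""
    name: str
    gpus: int        # number of H100 GPUs
    hours: int       # wall-clock training time, in hours
    kg_co2e: float   # emissions attributed to this stage (assumed mapping)


# Assumption: the emission terms "9.04 + 1.07" are listed in the same order
# as the time terms "11h CPT + 42h SFT"; the README does not label them.
STAGES = [
    Stage("CPT", gpus=32, hours=11, kg_co2e=9.04),
    Stage("SFT", gpus=1, hours=42, kg_co2e=1.07),
]


def gpu_hours(stage: Stage) -> int:
    """GPU-hours consumed by a stage: GPU count times wall-clock hours."""
    return stage.gpus * stage.hours


def summarize(stages: list[Stage]) -> dict[str, float]:
    """Aggregate wall-clock hours, GPU-hours, and emissions over all stages."""
    return {
        "wall_clock_hours": sum(s.hours for s in stages),
        "gpu_hours": sum(gpu_hours(s) for s in stages),
        "kg_co2e": round(sum(s.kg_co2e for s in stages), 2),
    }


def effective_batch(per_device_batch: int, grad_accum: int, gpus: int) -> int:
    """Sequences per optimizer step, assuming the batch size is per GPU."""
    return per_device_batch * grad_accum * gpus


if __name__ == "__main__":
    print(summarize(STAGES))
    for stage in STAGES:
        # Emission intensity per GPU-hour, to compare the two stages.
        print(stage.name, round(stage.kg_co2e / gpu_hours(stage), 4))
    print(effective_batch(per_device_batch=16, grad_accum=2, gpus=32))
```

Under these assumptions the totals match the README (53 hours, 10.11 kgCO2e), the two stages add up to 394 GPU-hours, and both work out to roughly 0.025 to 0.026 kgCO2e per GPU-hour. If the CPT batch size of 16 is a global figure rather than a per-GPU one, the effective batch would be 32 instead of 1,024.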