Initialize project; model provided by the ModelHub XC community
Model: VTSNLP/Llama3-ViettelSolutions-8B Source: Original Platform
141
README.md
Normal file
@@ -0,0 +1,141 @@
---
library_name: transformers
license: llama3
datasets:
- VTSNLP/vietnamese_curated_dataset
language:
- vi
- en
base_model:
- meta-llama/Meta-Llama-3-8B
pipeline_tag: text-generation
---

# Model Information

<!-- Provide a quick summary of what the model is/does. -->

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

Llama3-ViettelSolutions-8B is a variant of the Meta Llama-3-8B model that was further pre-trained on the [Vietnamese curated dataset](https://huggingface.co/datasets/VTSNLP/vietnamese_curated_dataset) and then supervised fine-tuned on 5 million samples of Vietnamese instruction data.

- **Developed by:** Viettel Solutions
- **Funded by:** NVIDIA
- **Model type:** Autoregressive transformer model
- **Language(s) (NLP):** Vietnamese, English
- **License:** Llama 3 Community License
- **Finetuned from model:** meta-llama/Meta-Llama-3-8B

## Uses

Example snippet for usage with Transformers:

```python
import torch
import transformers

model_id = "VTSNLP/Llama3-ViettelSolutions-8B"

# Load the weights in bfloat16 and let device_map="auto" place them on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# "Xin chào!" is Vietnamese for "Hello!".
print(pipeline("Xin chào!"))
```
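
The text-generation pipeline returns a list with one dict per generated sequence, and by default the `generated_text` field repeats the prompt before the continuation. A minimal sketch of keeping only the continuation is shown here; `extract_completion` is a hypothetical helper (not part of Transformers), and the sample output is a made-up stand-in rather than real output from this model.

```python
from typing import Dict, List


def extract_completion(prompt: str, outputs: List[Dict[str, str]]) -> str:
    """Return the first generated sequence with the leading prompt removed."""
    text = outputs[0]["generated_text"]
    # By default the pipeline echoes the prompt at the start of the output.
    if text.startswith(prompt):
        text = text[len(prompt):]
    return text.strip()


# Stand-in for the result of `pipeline("Xin chào!")`.
# This string is illustrative only; it was not produced by the model.
fake_outputs = [{"generated_text": "Xin chào! Tôi có thể giúp gì cho bạn?"}]
completion = extract_completion("Xin chào!", fake_outputs)
print(completion)
```

With the real pipeline, the same helper could be called as `extract_completion(prompt, pipeline(prompt, max_new_tokens=64))`, where the value 64 is an arbitrary choice for illustration.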

## Training Details

### Training Data

- Dataset for continued pre-training: [Vietnamese curated dataset](https://huggingface.co/datasets/VTSNLP/vietnamese_curated_dataset)
- Dataset for supervised fine-tuning: [Instruct general dataset](https://huggingface.co/datasets/VTSNLP/instruct_general_dataset)

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** bf16 mixed precision <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
- **Data sequence length:** 8192
- **Tensor model parallel size:** 4
- **Pipeline model parallel size:** 1
- **Context parallel size:** 1
- **Micro batch size:** 1
- **Global batch size:** 512
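
The hyperparameters above imply a data-parallel size and a gradient-accumulation count once a GPU count is fixed. The sketch below works through that arithmetic using the usual Megatron-style relations (data parallel = GPUs / (tensor × pipeline × context); accumulation steps = global batch / (micro batch × data parallel)). It assumes training ran on exactly the 4 GPUs listed under Technical Specifications, which the card does not state explicitly; `ParallelConfig` is a hypothetical helper, not a NeMo class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelConfig:
    num_gpus: int  # assumption: the 4 x A100 from Technical Specifications
    tensor_parallel: int
    pipeline_parallel: int
    context_parallel: int
    micro_batch_size: int
    global_batch_size: int
    seq_length: int

    @property
    def data_parallel(self) -> int:
        """Number of model replicas that process different data."""
        model_parallel = (
            self.tensor_parallel * self.pipeline_parallel * self.context_parallel
        )
        if self.num_gpus % model_parallel != 0:
            raise ValueError("num_gpus must be divisible by TP * PP * CP")
        return self.num_gpus // model_parallel

    @property
    def grad_accum_steps(self) -> int:
        """Micro-batches each replica accumulates per optimizer step."""
        per_pass = self.micro_batch_size * self.data_parallel
        if self.global_batch_size % per_pass != 0:
            raise ValueError("global batch must be divisible by micro batch * DP")
        return self.global_batch_size // per_pass

    @property
    def max_tokens_per_step(self) -> int:
        """Upper bound on tokens per optimizer step (full-length sequences)."""
        return self.global_batch_size * self.seq_length


cfg = ParallelConfig(
    num_gpus=4,
    tensor_parallel=4,
    pipeline_parallel=1,
    context_parallel=1,
    micro_batch_size=1,
    global_batch_size=512,
    seq_length=8192,
)
print("data parallel size:", cfg.data_parallel)
print("gradient accumulation steps:", cfg.grad_accum_steps)
print("max tokens per optimizer step:", cfg.max_tokens_per_step)
```

Under these assumptions the four GPUs form a single tensor-parallel group, so the global batch of 512 is reached purely through gradient accumulation, and one optimizer step covers up to 512 × 8192 tokens.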

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

<!-- This should link to a Dataset Card if possible. -->

[More Information Needed]

#### Factors

<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

[More Information Needed]

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

[More Information Needed]

### Results

[More Information Needed]

#### Summary

[More Information Needed]

## Technical Specifications

- Compute Infrastructure: NVIDIA DGX
- Hardware: 4 x A100 80GB
- Software: [NeMo Framework](https://github.com/NVIDIA/NeMo)
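
As a rough sanity check on the hardware above, the following back-of-the-envelope sketch estimates memory for weights and optimizer state. Everything in it is an assumption rather than a reported figure: the parameter count is the nominal 8B from the model name, the 16 bytes per parameter follow the common rule of thumb for bf16 mixed-precision training with Adam, the state is treated as split evenly across a tensor-parallel group of 4, and activations are ignored.

```python
NOMINAL_PARAMS = 8.0e9  # assumption: "8B" taken from the model name
TENSOR_PARALLEL = 4     # tensor model parallel size from the hyperparameters
GPU_MEMORY_GB = 80      # A100 80GB

# Rule-of-thumb bytes per parameter for bf16 mixed precision with Adam.
# These are assumptions; the actual NeMo configuration is not given in the card.
TRAINING_BYTES_PER_PARAM = {
    "bf16 weights": 2,
    "bf16 gradients": 2,
    "fp32 master weights": 4,
    "fp32 Adam first moment": 4,
    "fp32 Adam second moment": 4,
}


def state_gb(params: float, bytes_per_param: int, shards: int = 1) -> float:
    """Memory in GB (10^9 bytes) per shard, excluding activations."""
    return params * bytes_per_param / shards / 1e9


inference_gb = state_gb(NOMINAL_PARAMS, TRAINING_BYTES_PER_PARAM["bf16 weights"])
training_gb_per_gpu = state_gb(
    NOMINAL_PARAMS, sum(TRAINING_BYTES_PER_PARAM.values()), TENSOR_PARALLEL
)

print(f"bf16 weights for inference: ~{inference_gb:.0f} GB")
print(f"training state per GPU: ~{training_gb_per_gpu:.0f} GB of {GPU_MEMORY_GB} GB")
```

On these assumptions the bf16 weights alone need about 16 GB for inference, and the training state comes to roughly 32 GB per GPU, leaving the remainder of each 80 GB device for activations and framework overhead.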

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## More Information

[More Information Needed]

## Model Card Authors

[More Information Needed]

## Model Card Contact

[More Information Needed]