Auto Sync from git://github.com/01-ai/Yi.git/commit/704d5c148e087e9d1c83fb51e02790b197ce1aba

This commit is contained in:
ai-modelscope
2024-03-21 18:09:31 +08:00
parent 3b6072188a
commit a850b459d1


@@ -773,11 +773,11 @@ pip install torch==2.0.1 deepspeed==0.10 tensorboard transformers datasets sente
 #### Hardware Setup
-For the Yi-6B model, a node with 4 GPUs, each has GPU mem larger than 60GB is recommended.
+For the Yi-6B model, a node with 4 GPUs, each with GPU memory larger than 60GB, is recommended.
-For the Yi-34B model, because the usage of zero-offload technique takes a lot CPU memory, please be careful to limit the GPU numbers in 34B finetune training. Please use CUDA_VISIBLE_DEVICES to limit the GPU number (as shown in scripts/run_sft_Yi_34b.sh).
+For the Yi-34B model, because the usage of the zero-offload technique consumes a lot of CPU memory, please be careful to limit the number of GPUs in the 34B finetune training. Please use CUDA_VISIBLE_DEVICES to limit the number of GPUs (as shown in scripts/run_sft_Yi_34b.sh).
-A typical hardware setup for finetuning 34B model is a node with 8GPUS (limit to 4 in running by CUDA_VISIBLE_DEVICES=0,1,2,3), each has GPU mem larger than 80GB, with total CPU mem larger than 900GB.
+A typical hardware setup for finetuning the 34B model is a node with 8 GPUs (limited to 4 in running by CUDA_VISIBLE_DEVICES=0,1,2,3), each with GPU memory larger than 80GB, and total CPU memory larger than 900GB.
 #### Quick Start
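The CUDA_VISIBLE_DEVICES guidance in the hunk above can be sketched as follows; `scripts/run_sft_Yi_34b.sh` comes from the source, while the surrounding lines are illustrative only.

```shell
# Limit the 34B finetune run to 4 of the node's 8 GPUs, as the diff above advises.
# scripts/run_sft_Yi_34b.sh is the repo's own script; running it needs the Yi repo
# and suitable hardware, so the call is left commented out in this sketch.
export CUDA_VISIBLE_DEVICES=0,1,2,3
# bash scripts/run_sft_Yi_34b.sh
echo "Visible GPUs: $CUDA_VISIBLE_DEVICES"
```

Processes launched from this shell will see only devices 0-3, which keeps the zero-offload CPU-memory footprint within the ~900GB budget described above.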
@@ -865,7 +865,7 @@ python quantization/gptq/eval_quantized_model.py \
 #### GPT-Q quantization
 [GPT-Q](https://github.com/IST-DASLab/gptq) is a PTQ (Post-Training Quantization)
-method. It's memory saving and provides potential speedups while retaining the accuracy
+method. It saves memory and provides potential speedups while retaining the accuracy
 of the model.
 Yi models can be GPT-Q quantized without a lot of effort.
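The evaluation command referenced in this hunk's header has the following shape, assembled from the diff's own context lines. It is echoed rather than executed in this sketch, since the script lives in the Yi repo and needs a GPU; drop the `echo` to actually run it.

```shell
# Invocation shape for evaluating a GPT-Q quantized Yi model; the script path and
# flags come from the source diff, and /quantized_model is the placeholder output
# path used by the source snippet.
MODEL_DIR=/quantized_model
echo python quantization/gptq/eval_quantized_model.py \
  --model "$MODEL_DIR" \
  --trust_remote_code
```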
@@ -911,7 +911,7 @@ python quantization/awq/eval_quantized_model.py \
 --model /quantized_model \
 --trust_remote_code
 ```
-<details style="display: inline;"><summary>For detailed explanations, see the explanations below. ⬇️</summary> <ul>
+<details style="display: inline;"><summary>For details, see the explanations below. ⬇️</summary> <ul>
 #### AWQ quantization