Auto Sync from git://github.com/01-ai/Yi.git/commit/704d5c148e087e9d1c83fb51e02790b197ce1aba
README.md: 10 additions, 10 deletions
@@ -773,11 +773,11 @@ pip install torch==2.0.1 deepspeed==0.10 tensorboard transformers datasets sente
 
 #### Hardware Setup
 
-For the Yi-6B model, a node with 4 GPUs, each has GPU mem larger than 60GB is recommended.
+For the Yi-6B model, a node with 4 GPUs, each with GPU memory larger than 60GB, is recommended.
 
-For the Yi-34B model, because the usage of zero-offload technique takes a lot CPU memory, please be careful to limit the GPU numbers in 34B finetune training. Please use CUDA_VISIBLE_DEVICES to limit the GPU number (as shown in scripts/run_sft_Yi_34b.sh).
+For the Yi-34B model, because the usage of the zero-offload technique consumes a lot of CPU memory, please be careful to limit the number of GPUs in the 34B finetune training. Please use CUDA_VISIBLE_DEVICES to limit the number of GPUs (as shown in scripts/run_sft_Yi_34b.sh).
 
-A typical hardware setup for finetuning 34B model is a node with 8GPUS (limit to 4 in running by CUDA_VISIBLE_DEVICES=0,1,2,3), each has GPU mem larger than 80GB, with total CPU mem larger than 900GB.
+A typical hardware setup for finetuning the 34B model is a node with 8 GPUs (limited to 4 in running by CUDA_VISIBLE_DEVICES=0,1,2,3), each with GPU memory larger than 80GB, and total CPU memory larger than 900GB.
 
 #### Quick Start
 
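The GPU-limiting step in the hunk above can be sketched in Python. This is a minimal illustration only; the repo's scripts/run_sft_Yi_34b.sh sets the variable in shell rather than in Python.

```python
import os

# Restrict the process to the first 4 GPUs. This must happen before any
# CUDA-using library initializes; frameworks such as PyTorch/DeepSpeed
# will then only see these devices. (Python equivalent of the shell
# `CUDA_VISIBLE_DEVICES=0,1,2,3` shown in the diff.)
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"

visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
print(f"Finetuning limited to {len(visible)} GPUs: {visible}")
```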
@@ -865,7 +865,7 @@ python quantization/gptq/eval_quantized_model.py \
 #### GPT-Q quantization
 
 [GPT-Q](https://github.com/IST-DASLab/gptq) is a PTQ (Post-Training Quantization)
-method. It's memory saving and provides potential speedups while retaining the accuracy
+method. It saves memory and provides potential speedups while retaining the accuracy
 of the model.
 
 Yi models can be GPT-Q quantized without a lot of efforts.
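As a rough illustration of what post-training quantization trades off, here is a plain round-to-nearest 4-bit quantizer. Note this is only a conceptual sketch: GPT-Q itself uses second-order information to update remaining weights as it quantizes, which this toy example does not do.

```python
def quantize_4bit(weights):
    # Per-tensor symmetric quantization: map floats to ints in [-8, 7],
    # so each weight needs 4 bits instead of 32.
    scale = max(abs(w) for w in weights) / 7.0
    return [max(-8, min(7, round(w / scale))) for w in weights], scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.7, -0.33, 0.1, 0.04]          # toy weight tensor
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)
# Round-to-nearest bounds the per-weight error by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_err <= scale / 2)
```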
@@ -911,7 +911,7 @@ python quantization/awq/eval_quantized_model.py \
 --model /quantized_model \
 --trust_remote_code
 ```
-<details style="display: inline;"><summary>For detailed explanations, see the explanations below. ⬇️</summary> <ul>
+<details style="display: inline;"><summary>For details, see the explanations below. ⬇️</summary> <ul>
 
 #### AWQ quantization
 