Initialize project; model provided by the ModelHub XC community
Model: RDson/Phi-3-mini-code-finetune-128k-instruct-v1 Source: Original Platform
37
README.md
Normal file
@@ -0,0 +1,37 @@
---
license: other
license_name: phi-3
license_link: https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/raw/main/LICENSE
datasets:
- m-a-p/CodeFeedback-Filtered-Instruction
tags:
- phi
- phi-3
- '3'
- code
---
Fine-tune of [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) on [m-a-p/CodeFeedback-Filtered-Instruction](https://huggingface.co/datasets/m-a-p/CodeFeedback-Filtered-Instruction), trained for roughly 9-10 hours on a single RTX 3090 (24 GB).

Due to limited resources and time, training covered only about half of one epoch (0.5136).
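
As a rough back-of-the-envelope check, the figures above (about 9-10 hours for 0.5136 of an epoch) can be extrapolated to a full epoch. The sketch below assumes constant throughput over the whole run, which is not confirmed here, and the helper name `full_epoch_hours` is made up for illustration.

```python
def full_epoch_hours(elapsed_hours: float, epoch_fraction: float) -> float:
    """Extrapolate wall-clock time for one full epoch.

    Assumes throughput stays constant, so time scales linearly
    with the fraction of the epoch that was processed.
    """
    if not 0.0 < epoch_fraction <= 1.0:
        raise ValueError("epoch_fraction must be in (0, 1]")
    return elapsed_hours / epoch_fraction


EPOCH_FRACTION = 0.5136  # fraction of the epoch reported above

# The reported training time is a range, so extrapolate both ends.
for elapsed in (9.0, 10.0):
    estimate = full_epoch_hours(elapsed, EPOCH_FRACTION)
    print(f"{elapsed:.0f} h for {EPOCH_FRACTION} epoch -> ~{estimate:.1f} h per full epoch")
```

Under that assumption, a complete epoch on the same hardware would take somewhat under twice the reported time.
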
```
train_loss: 0.43311
```

```
learning_rate=5e-5,
lr_scheduler_type="cosine",
max_length=1024,
max_prompt_length=512,
overwrite_output_dir=True,
beta=0.1,
gradient_accumulation_steps=8,
optim="adamw_torch",
num_train_epochs=1,
evaluation_strategy="steps",
eval_steps=0.2,
logging_steps=1,
warmup_steps=50,
fp16=True,
save_steps=50
```
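
The settings above combine `learning_rate=5e-5`, `lr_scheduler_type="cosine"` and `warmup_steps=50`. To give a feel for what such a schedule looks like, here is a minimal plain-Python sketch of the common linear-warmup-then-cosine-decay rule. The function name `cosine_lr` and the total of 1000 optimizer steps are assumptions for illustration only; the real step count of this run is not stated.

```python
import math

BASE_LR = 5e-5      # learning_rate from the settings above
WARMUP_STEPS = 50   # warmup_steps from the settings above


def cosine_lr(step: int, total_steps: int) -> float:
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR over the warmup steps.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


total_steps = 1000                        # hypothetical, not from the source
stop_step = round(0.5136 * total_steps)   # where a run halted at 0.5136 epoch would be

for step in (0, 25, WARMUP_STEPS, stop_step, total_steps):
    print(f"step {step:4d}: lr = {cosine_lr(step, total_steps):.3e}")
```

If the schedule was planned for the full `num_train_epochs=1`, a run stopped at 0.5136 of the epoch would end with the learning rate still near half of its peak rather than near zero.
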