Initialize project; model provided by the ModelHub XC community
Model: BeaverAI/Cream-Phi-3-14B-v1a Source: Original Platform
README.md
tl;dr: This is Phi 3 Medium finetuned for (mainly SFW) roleplaying.

It was a promising release candidate that fell flat when things got moist.

I'm publishing all the details for anyone else interested in finetuning Phi 3.

Training Details:

- 8x H100 80GB SXM GPUs
- 1 hour training time

Results for Roleplay Mode (i.e., not Instruct format):

- Strong RP formatting.
- Tends to output short, straightforward replies to the player character.
- Starts to break down when things get moist.
- Important: My testing is lazy and flawed. Take it with a grain of salt and test the GGUFs before taking notes.
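
For context on the Roleplay-vs-Instruct distinction above: Phi 3's instruct models use the `<|user|>`/`<|assistant|>`/`<|end|>` chat markup, whereas "Roleplay Mode" here presumably means feeding plain story-style text instead. A minimal sketch of building the instruct-format prompt — the helper name and the sample turn are mine, not from this card:

```python
# Sketch: assemble a Phi 3 Instruct-format prompt string.
# The <|user|>/<|assistant|>/<|end|> markup is Phi 3's chat template;
# the example turn below is made up for illustration.

def phi3_instruct_prompt(turns):
    """turns: list of (role, text) pairs, role in {"user", "assistant"}."""
    out = []
    for role, text in turns:
        out.append(f"<|{role}|>\n{text}<|end|>\n")
    out.append("<|assistant|>\n")  # cue the model to produce the next reply
    return "".join(out)

prompt = phi3_instruct_prompt([("user", "Stay in character as a grumpy innkeeper.")])
print(prompt)
```

Roleplay-mode testing would skip this template entirely and continue raw prose, which is why formatting quality is judged separately from instruct behavior.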
Axolotl Config (some fields omitted)

```yaml
base_model: failspy/Phi-3-medium-4k-instruct-abliterated-v3
load_in_4bit: true
bf16: auto
fp16:
tf32: false
flash_attention: true

sequence_len: 4096
datasets:
  - path: Undi95/andrijdavid_roleplay-conversation-sharegpt
    type: customphi3

num_epochs: 2
warmup_steps: 30
weight_decay: 0.1

adapter: lora
lora_r: 128
lora_alpha: 16
lora_dropout: 0.1
lora_target_linear: true

gradient_accumulation_steps: 2
micro_batch_size: 2
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: true

sample_packing: true
pad_to_sequence_len: true

optimizer: paged_adamw_8bit
lr_scheduler: cosine
learning_rate: 0.0001
max_grad_norm: 1.0

val_set_size: 0.01
evals_per_epoch: 3
eval_max_new_tokens: 128
eval_batch_size: 1
```
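
From the config above, the effective global batch size works out to micro_batch_size × gradient_accumulation_steps × number of GPUs. A quick sanity check in Python — the variable names are mine; the values are copied from the config and the 8x H100 setup listed under Training Details:

```python
# Values copied from the Axolotl config above; num_gpus from the
# Training Details section (8x H100).
micro_batch_size = 2
gradient_accumulation_steps = 2
num_gpus = 8
sequence_len = 4096

effective_batch = micro_batch_size * gradient_accumulation_steps * num_gpus
# Upper bound on tokens per optimizer step when sample packing fills
# every sequence to sequence_len.
tokens_per_step = effective_batch * sequence_len

print(effective_batch)   # 32
print(tokens_per_step)   # 131072
```

With sample packing enabled, most sequences are filled close to 4096 tokens, so roughly 131k tokens per optimizer step is what made the 1-hour run feasible.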