Cream-Phi-3-14B-v1a/README.md

TL;DR: This is Phi 3 Medium finetuned for (mainly SFW) roleplaying.

It was a promising release candidate that fell flat when things got moist.

I'm publishing all the details for anyone else interested in finetuning Phi 3.

Training Details:
- 8x H100 80GB SXM GPUs
- 1 hour training time

Results for Roleplay Mode (i.e., not Instruct format):
- Strong RP formatting.
- Tends to output short, straightforward replies to the player character.
- Starts to break down when things get moist.
- Important: My testing is lazy and flawed. Take it with a grain of salt and test the GGUFs before taking notes.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/WxdrI9dDHm4nuFHAe8FzZ.png)

Axolotl Config (some fields omitted):
```yaml
base_model: failspy/Phi-3-medium-4k-instruct-abliterated-v3
load_in_4bit: true
bf16: auto
fp16:
tf32: false
flash_attention: true

sequence_len: 4096
datasets:
  - path: Undi95/andrijdavid_roleplay-conversation-sharegpt
    type: customphi3

num_epochs: 2
warmup_steps: 30
weight_decay: 0.1

adapter: lora
lora_r: 128
lora_alpha: 16
lora_dropout: 0.1
lora_target_linear: true

gradient_accumulation_steps: 2
micro_batch_size: 2
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: true

sample_packing: true
pad_to_sequence_len: true

optimizer: paged_adamw_8bit
lr_scheduler: cosine
learning_rate: 0.0001
max_grad_norm: 1.0

val_set_size: 0.01
evals_per_epoch: 3
eval_max_new_tokens: 128
eval_batch_size: 1
```
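
A few numbers fall out of the config above and are handy when adapting it. The sketch below is only back-of-the-envelope arithmetic, not part of the training run. It assumes plain data parallelism with one process per GPU across all 8 H100s, the standard (non-rsLoRA) `alpha / r` LoRA scaling, and a linear-warmup-then-cosine-to-zero schedule. The `total_steps` argument is a hypothetical placeholder, since the real step count is not recorded here.

```python
import math

# Values copied from the Axolotl config above.
MICRO_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 2
SEQUENCE_LEN = 4096
LORA_R = 128
LORA_ALPHA = 16
LEARNING_RATE = 1e-4
WARMUP_STEPS = 30

# Assumption: plain data parallelism, one process per GPU (8 total).
NUM_GPUS = 8


def effective_batch_size(micro_batch, grad_accum, num_gpus):
    """Packed sequences consumed per optimizer step."""
    return micro_batch * grad_accum * num_gpus


def lora_scale(alpha, r):
    """Standard (non-rsLoRA) factor applied to the low-rank update."""
    return alpha / r


def lr_at(step, total_steps, peak=LEARNING_RATE, warmup=WARMUP_STEPS):
    """Linear warmup to `peak`, then cosine decay to zero (assumed shape)."""
    if step < warmup:
        return peak * step / warmup
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


batch = effective_batch_size(MICRO_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_GPUS)
# With sample_packing and pad_to_sequence_len, every sequence is 4096 tokens long.
tokens_per_step = batch * SEQUENCE_LEN
scale = lora_scale(LORA_ALPHA, LORA_R)

print(f"sequences per optimizer step: {batch}")
print(f"tokens per optimizer step (incl. padding): {tokens_per_step}")
print(f"LoRA scale (alpha / r): {scale}")
```

Under those assumptions, each optimizer step sees 32 packed sequences (131072 tokens including padding), and the LoRA update is scaled by 0.125. That scale is on the low side compared with the common habit of setting `lora_alpha` equal to or above `lora_r`, so it may be worth experimenting with if you reuse this config.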

Initial project commit; model provided by the ModelHub XC community. Model: BeaverAI/Cream-Phi-3-14B-v1a. Source: Original Platform. 2026-05-08 07:31:58 +08:00
