ModelHub XC f8c4d08454 Initialize project; model provided by the ModelHub XC community
Model: staeiou/bartleby-qwen3-1.7b_v5
Source: Original Platform
2026-05-27 08:10:15 +08:00


================================================================================
BARTLEBY FULL FINETUNE _ 16-BIT _ AUTO TEMPLATE+MASK DETECT _ LAST-ANSWER MULTITURN
================================================================================
MODEL : unsloth/Qwen3-1.7B
DATA : data/training_data_v2_filtered.jsonl
GOLD : data/gold_seed_training_data_sosts.jsonl
OUTPUT : staeiou/bartleby-qwen3-1.7b_v5
CACHE_DIR : /workspace/.cache/huggingface/datasets
SEQ : 1024
PACKING : False
LOAD_4BIT : False (forced 16-bit base)
FULL_FT : True
REMOTE_CODE: False
FAMILY : qwen
TRL_COMPAT : ConstantLengthDataset patched=True
ADAPTERS : disabled
TRAIN : bs=2 grad_accum=16 eff_bs=32
EPOCHS : 2.0
LR : 1e-05 scheduler=cosine warmup=0.03 weight_decay=0.05 max_grad_norm=1.0
MULTITURN : num=0 max_turns=5 (only last assistant supervised)
GOLD_REPEAT: 5
GPU : Single GPU (CUDA_VISIBLE_DEVICES=0)
================================================================================
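Note on the TRAIN line above: eff_bs=32 is the per-device batch size (bs=2) multiplied by the gradient accumulation steps (16) and the GPU count (1). The sketch below reproduces that arithmetic. The function names and the example dataset size of 3200 are hypothetical and do not come from the training script; the step count is only a rough estimate, since trainers differ in how they round partial batches.

```python
import math


def effective_batch_size(per_device_bs: int, grad_accum: int, num_gpus: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return per_device_bs * grad_accum * num_gpus


def estimated_optimizer_steps(num_examples: int, eff_bs: int, epochs: float) -> int:
    """Rough optimizer-step estimate; real trainers may round differently."""
    return math.ceil(num_examples * epochs / eff_bs)


# Values taken from the TRAIN, GPU and EPOCHS lines of the log.
eff_bs = effective_batch_size(per_device_bs=2, grad_accum=16, num_gpus=1)
print(eff_bs)  # 32

# Hypothetical dataset size, for illustration only (not reported in the log).
print(estimated_optimizer_steps(num_examples=3200, eff_bs=eff_bs, epochs=2.0))  # 200
```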
[1/7] Loading base model...
==((====))==  Unsloth 2026.3.5: Fast Qwen3 patching. Transformers: 5.3.0. vLLM: 0.13.0.
   \\   /|    NVIDIA RTX 5000 Ada Generation. Num GPUs = 1. Max memory: 31.475 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.9.0+cu128. CUDA: 8.9. CUDA Toolkit: 12.8. Triton: 3.5.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.33.post1. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth