baseline-qwen3-4b-grounded_table/README.md at b7300314ac6df02f6696579af75a7b80088b91b9

ModelHub XC b7300314ac Initialize project; model provided by the ModelHub XC community

Model: boradorish/baseline-qwen3-4b-grounded_table
Source: Original Platform

2026-05-28 16:10:18 +08:00

library_name, license, base_model, tags, model-index

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 4e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
num_devices: 2
gradient_accumulation_steps: 32
total_train_batch_size: 64
total_eval_batch_size: 2
optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 3.0
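
The relationship between the batch-size settings and the learning-rate schedule above can be checked with a short sketch. It is illustrative only: the function names are hypothetical, the schedule follows the common "linear warmup, then one half-cycle of cosine decay" shape rather than the exact training code, and `assumed_total_steps` is an assumption (about 3 epochs at roughly 271 optimizer steps per epoch, estimated from the step and epoch columns of the results table), since the card does not state the total step count.

```python
import math

# Values copied from the hyperparameter list above.
per_device_train_batch_size = 1
num_devices = 2
gradient_accumulation_steps = 32
learning_rate = 4e-05
warmup_ratio = 0.1

# Assumption: not stated in the card; estimated as 3 epochs * ~271 steps/epoch.
assumed_total_steps = 812


def effective_batch_size(per_device: int, devices: int, accumulation: int) -> int:
    """Number of samples that contribute to a single optimizer step."""
    return per_device * devices * accumulation


def cosine_lr_with_warmup(step: int, total_steps: int, peak_lr: float, ratio: float) -> float:
    """Linear warmup to peak_lr, then a single half-cycle cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_train_batch_size = effective_batch_size(
    per_device_train_batch_size, num_devices, gradient_accumulation_steps
)
print("total_train_batch_size:", total_train_batch_size)

# Print the learning rate at a few points along the assumed schedule.
for step in (0, 41, 82, 400, assumed_total_steps):
    lr = cosine_lr_with_warmup(step, assumed_total_steps, learning_rate, warmup_ratio)
    print(f"step {step:>4}: lr = {lr:.3e}")
```

The product 1 × 2 × 32 reproduces the reported total_train_batch_size of 64. Under the assumed step count, warmup would cover roughly the first 82 optimizer steps before the cosine decay begins.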

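| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 0.007         | 0.3397 | 92   | 0.0101          |
| 0.0078        | 0.6794 | 184  | 0.0117          |
| 0.0027        | 1.0185 | 276  | 0.0084          |
| 0.0084        | 1.3581 | 368  | 0.0087          |
| 0.0061        | 1.6978 | 460  | 0.0078          |
| 0.0037        | 2.0369 | 552  | 0.0077          |
| 0.0027        | 2.3766 | 644  | 0.0085          |
| 0.0041        | 2.7163 | 736  | 0.0085          |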
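
As a quick way to read the results table above, the following sketch picks the evaluation point with the lowest validation loss and estimates the number of optimizer steps per epoch from the step and epoch columns. The variable names are hypothetical, and the steps-per-epoch figure is an estimate derived from the logged values, not a number stated in the card.

```python
# (training_loss, epoch, step, validation_loss), copied from the table above.
rows = [
    (0.0070, 0.3397, 92, 0.0101),
    (0.0078, 0.6794, 184, 0.0117),
    (0.0027, 1.0185, 276, 0.0084),
    (0.0084, 1.3581, 368, 0.0087),
    (0.0061, 1.6978, 460, 0.0078),
    (0.0037, 2.0369, 552, 0.0077),
    (0.0027, 2.3766, 644, 0.0085),
    (0.0041, 2.7163, 736, 0.0085),
]

# Evaluation point with the lowest validation loss.
best = min(rows, key=lambda row: row[3])
print(f"lowest validation loss {best[3]} at step {best[2]} (epoch {best[1]})")

# Each row gives an independent estimate of optimizer steps per epoch.
steps_per_epoch = [step / epoch for _, epoch, step, _ in rows]
mean_steps_per_epoch = sum(steps_per_epoch) / len(steps_per_epoch)
print(f"estimated steps per epoch: {mean_steps_per_epoch:.1f}")

# Spacing between consecutive evaluations, in optimizer steps.
eval_intervals = {b[2] - a[2] for a, b in zip(rows, rows[1:])}
print("evaluation interval(s):", sorted(eval_intervals))
```

In this log, validation loss is lowest at step 552 and is slightly higher at the two later evaluations, so an intermediate checkpoint may be worth comparing against the final one if checkpoints were saved.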