DAPO_E2H-math-gaussian_0p5_0p5/.hydra/overrides.yaml

- mode=train
- task=math
- algorithm=grpo
- algorithm.training.curriculum_schedule=gaussian
- model=qwen15
- algorithm.training.max_steps=1600
- algorithm.training.vllm_mode=colocate
- algorithm.training.vllm_gpu_memory_utilization=0.25