Initialize project; model provided by the ModelHub XC community
Model: divelab/DAPO_E2H-math-gaussian_0p5_0p5 Source: Original Platform
8
.hydra/overrides.yaml
Normal file
@@ -0,0 +1,8 @@
- mode=train
- task=math
- algorithm=grpo
- algorithm.training.curriculum_schedule=gaussian
- model=qwen15
- algorithm.training.max_steps=1600
- algorithm.training.vllm_mode=colocate
- algorithm.training.vllm_gpu_memory_utilization=0.25
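
For readers who want to inspect these overrides programmatically, the following is a minimal sketch using only the Python standard library. It is not Hydra's own parser. The helper names `parse_value` and `split_overrides` are hypothetical, and treating keys without a dot as config-group selections is an assumption: whether `mode`, `task`, `algorithm`, and `model` are config groups or plain values depends on the project's config tree, which this commit does not show.

```python
import ast

# The eight overrides recorded in .hydra/overrides.yaml, copied verbatim.
OVERRIDES = [
    "mode=train",
    "task=math",
    "algorithm=grpo",
    "algorithm.training.curriculum_schedule=gaussian",
    "model=qwen15",
    "algorithm.training.max_steps=1600",
    "algorithm.training.vllm_mode=colocate",
    "algorithm.training.vllm_gpu_memory_utilization=0.25",
]


def parse_value(raw):
    """Convert numeric literals to int/float; leave bare words as strings."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw


def split_overrides(overrides):
    """Split overrides into dot-free keys and dotted keys.

    Assumption: dot-free keys are treated as group selections and
    dotted keys as value overrides. This is a heuristic for reading
    the file, not Hydra's resolution logic.
    """
    selections, values = {}, {}
    for item in overrides:
        key, _, raw = item.partition("=")
        if "." in key:
            values[key] = parse_value(raw)
        else:
            selections[key] = raw
    return selections, values


selections, values = split_overrides(OVERRIDES)
for key, value in values.items():
    print(f"{key} = {value!r}")
```

Keeping the dotted keys flat, rather than nesting them under `algorithm`, avoids a collision with the dot-free `algorithm=grpo` entry, which shares the same top-level name.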