counsel-env-qwen3-0.6b-grpo/training_summary.json
ModelHub XC 2ea3d793ee Initialize project; model provided by the ModelHub XC community
Model: heavycoderhh/counsel-env-qwen3-0.6b-grpo
Source: Original Platform
2026-06-16 07:21:17 +08:00


{
  "artifact_repo": "heavycoderhh/counsel-env-qwen3-0.6b-grpo",
  "dataset_size": 256,
  "env_url": "https://heavycoderhh-counsel-env.hf.space",
  "max_completion_length": 512,
  "max_steps": 200,
  "metrics": {
    "total_flos": 0.0,
    "train_loss": -0.0162161529250443,
    "train_runtime": 4111.2914,
    "train_samples_per_second": 0.195,
    "train_steps_per_second": 0.049
  },
  "model": "Qwen/Qwen3-0.6B",
  "num_generations": 4,
  "space_repo": "heavycoderhh/counsel-env",
  "use_vllm": false
}
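
The throughput figures in this summary can be cross-checked against each other. The following is a minimal sketch, not part of the original repository: it embeds the values shown above and derives per-step timing from them. It assumes that all `max_steps` steps actually ran (the file records the configured limit, not a completed-step count) and that both rate metrics are measured over `train_runtime` in seconds. The helper name `derived_throughput` is hypothetical.

```python
import json

# Values copied verbatim from training_summary.json above.
SUMMARY_JSON = """
{
  "artifact_repo": "heavycoderhh/counsel-env-qwen3-0.6b-grpo",
  "dataset_size": 256,
  "env_url": "https://heavycoderhh-counsel-env.hf.space",
  "max_completion_length": 512,
  "max_steps": 200,
  "metrics": {
    "total_flos": 0.0,
    "train_loss": -0.0162161529250443,
    "train_runtime": 4111.2914,
    "train_samples_per_second": 0.195,
    "train_steps_per_second": 0.049
  },
  "model": "Qwen/Qwen3-0.6B",
  "num_generations": 4,
  "space_repo": "heavycoderhh/counsel-env",
  "use_vllm": false
}
"""


def derived_throughput(summary):
    """Hypothetical helper: derive per-step figures from the summary.

    Assumption: max_steps steps were completed and train_runtime is in seconds.
    """
    metrics = summary["metrics"]
    runtime = metrics["train_runtime"]
    steps = summary["max_steps"]
    # Ratio of the two reported rates approximates samples consumed per step.
    samples_per_step = (
        metrics["train_samples_per_second"] / metrics["train_steps_per_second"]
    )
    return {
        "steps_per_second": round(steps / runtime, 3),
        "seconds_per_step": round(runtime / steps, 2),
        "approx_samples_per_step": round(samples_per_step),
    }


summary = json.loads(SUMMARY_JSON)
derived = derived_throughput(summary)
print(derived)
```

Under these assumptions the recomputed step rate agrees with the reported `train_steps_per_second` to three decimals, and the ratio of the two reported rates comes out at roughly 4 samples per step, the same value as `num_generations`. Whether that equality reflects the actual batch configuration is not stated in the summary.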