---
license: apache-2.0
---
ewqr2130/alignment-handbook-zephyr-7b_ppo_5e7step_51
Running the SFT model with PPO for 51 steps.