Suggest an alternative to chain-of-hindsight

Chain-of-Hindsight, A Scalable RLHF Method

Why do you think that https://github.com/PKU-Alignment/safe-rlhf is a good alternative to chain-of-hindsight?