Suggest an alternative to

pbo

Policy-based optimization: single-step policy gradient seen as an evolution strategy

Why do you think that https://github.com/kaifishr/RocketLander is a good alternative to pbo?