Proximal policy optimization algorithms.” arXiv preprint arXiv:1707.06347 (2017).
2. Naderi, Kourosh, Joose