A Max-Min Entropy Framework for Reinforcement Learning

Authors: Seungyul Han, Youngchul Sung

NeurIPS 2021

Reproducibility assessment (each entry lists the variable, the assessed result, and the supporting LLM response):
Research Type: Experimental
LLM Response: "Numerical results show that the proposed algorithm yields drastic performance improvement over the current state-of-the-art RL algorithms." and, from Section 6 (Experiments), "We provide numerical results to show the performance of the proposed MME and DE-MME in pure exploration and various control tasks."

Researcher Affiliation: Academia
LLM Response: Seungyul Han, Graduate School of Artificial Intelligence, UNIST, Ulsan, South Korea 44919, syhan@unist.ac.kr; Youngchul Sung, School of Electrical Engineering, KAIST, Daejeon, South Korea 34141, ycsung@kaist.ac.kr

Pseudocode: Yes
LLM Response: "The detailed implementation and algorithm of MME are provided in Appendix A."

Open Source Code: Yes
LLM Response: "We provide source code for the proposed method at http://github.com/seungyulhan/mme/ that requires Python Tensorflow."

Open Datasets: Yes
LLM Response: "Sparse Mujoco [27, 35] is a sparse version of Mujoco [52] in Open AI Gym [8]" (see the sparse-reward wrapper sketch below this list)

Dataset Splits: No
LLM Response: No explicit training/validation/test dataset splits (e.g., percentages or sample counts) are mentioned. The paper discusses using random seeds for averaging results over multiple runs (see the multi-seed averaging sketch below this list).

Hardware Specification: Yes
LLM Response: "All the algorithms are run on a machine with Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz and an NVIDIA TITAN Xp GPU."

Software Dependencies: Yes
LLM Response: "We used TensorFlow 1.15.0 and Python 3.6.9 for our implementation." (see the version-check sketch below this list)

Experiment Setup: Yes
LLM Response: "Detailed experimental setup is provided in Appendix B."
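
The Open Datasets entry refers to Sparse Mujoco, a sparse-reward variant of the standard MuJoCo tasks in OpenAI Gym. As a rough illustration only, the minimal sketch below wraps a Gym MuJoCo task with a binary reward; the environment ID, the thresholding rule, and the threshold value are assumptions for illustration, not the task definitions from [27, 35] or the released code.

```python
import gym


class SparseRewardWrapper(gym.Wrapper):
    """Illustrative wrapper that replaces the dense MuJoCo reward with a binary signal.

    The success criterion (dense reward above a fixed threshold) is a placeholder
    assumption; the paper's Sparse Mujoco tasks define their own per-task criteria.
    """

    def __init__(self, env, threshold=1.0):
        super().__init__(env)
        self.threshold = threshold

    def step(self, action):
        obs, dense_reward, done, info = self.env.step(action)
        # Emit reward 1 only when the dense reward crosses the (assumed) threshold.
        sparse_reward = 1.0 if dense_reward > self.threshold else 0.0
        return obs, sparse_reward, done, info


# Example usage with a standard Gym MuJoCo task (environment ID is illustrative).
env = SparseRewardWrapper(gym.make("HalfCheetah-v2"))
obs = env.reset()
```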
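The Dataset Splits entry notes that results are averaged over multiple runs with different random seeds rather than over dataset splits. The sketch below shows what such a multi-seed evaluation loop might look like; the seed values and the placeholder run_training function are hypothetical and not taken from the released code.

```python
import numpy as np

# Seed values are illustrative; the paper only states that results are
# averaged over multiple runs with different random seeds.
SEEDS = [0, 1, 2, 3, 4]


def run_training(seed):
    """Placeholder for a full training run; returns a final average return."""
    rng = np.random.RandomState(seed)
    return float(rng.normal(loc=1000.0, scale=50.0))  # stand-in for a real result


returns = [run_training(seed) for seed in SEEDS]
print("mean return: %.1f +/- %.1f" % (np.mean(returns), np.std(returns)))
```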
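The Software Dependencies entry pins TensorFlow 1.15.0 and Python 3.6.9. The small sketch below simply reports the local versions against those stated in the paper; the warning logic is an illustrative addition, not part of the released code.

```python
import sys

import tensorflow as tf

# Versions stated in the paper: TensorFlow 1.15.0 and Python 3.6.9.
print("Python:", ".".join(str(v) for v in sys.version_info[:3]))
print("TensorFlow:", tf.__version__)

if tf.__version__ != "1.15.0":
    print("Warning: the paper's experiments used TensorFlow 1.15.0.")
```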