EMI: Exploration with Mutual Information

Authors: Hyoungseok Kim, Jaekyeom Kim, Yeonwoo Jeong, Sergey Levine, Hyun Oh Song

Venue: ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show competitive results on challenging locomotion tasks with continuous control and on image-based exploration tasks with discrete actions on Atari. The source code is available at https://github.com/snu-mllab/EMI.
Researcher Affiliation | Academia | Hyoungseok Kim*, Jaekyeom Kim*, Yeonwoo Jeong, and Hyun Oh Song (Seoul National University, Department of Computer Science and Engineering; Neural Processing Research Center); Sergey Levine (UC Berkeley, Department of Electrical Engineering and Computer Sciences). Correspondence to: Hyun Oh Song <hyunoh@snu.ac.kr>.
Pseudocode | Yes | Algorithm 1 shows the complete procedure in detail.
Open Source Code | Yes | The source code is available at https://github.com/snu-mllab/EMI.
Open Datasets | Yes | We compare the experimental performance of EMI to recent prior works on both low-dimensional locomotion tasks with continuous control from the rllab benchmark (Duan et al., 2016) and the complex vision-based tasks with discrete control from the Arcade Learning Environment (Bellemare et al., 2013). A hedged environment-setup sketch follows the table.
Dataset Splits | No | The paper names the environments it uses (the rllab benchmark and Atari) but does not describe any training/validation/test splits (e.g., percentages, counts, or explicit standard splits) beyond naming the environments themselves.
Hardware Specification | No | The paper does not mention the specific hardware (GPU models, CPU types, memory, etc.) used to run the experiments.
Software Dependencies | No | The paper mentions TRPO (Schulman et al., 2015), the Adam optimizer (Kingma & Ba, 2015), and general neural network architectures, but it does not give version numbers for any software (e.g., Python, PyTorch, or TensorFlow).
Experiment Setup | Yes | In the locomotion experiments, we use a 2-layer fully connected neural network as the policy network. In the Atari experiments, we use a 2-layer convolutional neural network followed by a single-layer fully connected neural network. We convert the 84 x 84 input RGB frames to grayscale images and resize them to 52 x 52, following the practice in Tang et al. (2017). The embedding dimensionality is set to d = 2 and the intrinsic reward coefficient to η = 0.001 in all of the environments. We use the Adam optimizer (Kingma & Ba, 2015) to train the embedding networks. A hedged architecture sketch follows the table.
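
Since the Open Datasets row names the two benchmark suites without showing how they are set up, here is a minimal Python sketch of instantiating them. The specific Atari game is a hypothetical choice (this section does not list the exact tasks used), and the rllab environments are noted only in a comment because rllab ships its own environment classes rather than Gym IDs.

```python
import gym  # OpenAI Gym exposes the Arcade Learning Environment games

# Vision-based exploration task with discrete actions. The game below is a
# hypothetical example; this section does not say which Atari games were used.
atari_env = gym.make("MontezumaRevengeNoFrameskip-v4")
print(atari_env.action_space)       # Discrete(18): the ALE joystick actions
print(atari_env.observation_space)  # Box(0, 255, (210, 160, 3), uint8) raw RGB

# The continuous-control locomotion tasks come from rllab (Duan et al., 2016),
# which provides its own environment classes instead of Gym environment IDs.
```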
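
The Experiment Setup row fixes only a few facts: the layer counts, the 84 x 84 RGB to 52 x 52 grayscale preprocessing, d = 2, η = 0.001, and the Adam optimizer. The sketch below implements just those constraints in PyTorch; the framework choice, kernel sizes, strides, and channel counts are all assumptions, not the paper's values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

D = 2        # embedding dimensionality (from the paper)
ETA = 0.001  # intrinsic reward coefficient (from the paper; unused in this sketch)

def preprocess(rgb_frames: torch.Tensor) -> torch.Tensor:
    """Convert a batch of 84x84 RGB frames to 52x52 grayscale images."""
    # rgb_frames: (N, 3, 84, 84), values in [0, 1]; ITU-R BT.601 luma weights.
    gray = (0.299 * rgb_frames[:, 0] + 0.587 * rgb_frames[:, 1]
            + 0.114 * rgb_frames[:, 2]).unsqueeze(1)          # (N, 1, 84, 84)
    return F.interpolate(gray, size=(52, 52), mode="bilinear",
                         align_corners=False)                 # (N, 1, 52, 52)

class EmbeddingNet(nn.Module):
    """Two conv layers followed by one fully connected layer, as described."""
    def __init__(self, d: int = D):
        super().__init__()
        # Kernel sizes, strides, and channel counts below are hypothetical.
        self.conv1 = nn.Conv2d(1, 16, kernel_size=8, stride=4)   # -> (16, 12, 12)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=4, stride=2)  # -> (32, 5, 5)
        self.fc = nn.Linear(32 * 5 * 5, d)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        return self.fc(x.flatten(start_dim=1))                   # (N, d)

net = EmbeddingNet()
optimizer = torch.optim.Adam(net.parameters())           # Adam, per the paper
embeddings = net(preprocess(torch.rand(4, 3, 84, 84)))   # -> shape (4, 2)
```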