Fast And Slow Learning Of Recurrent Independent Mechanisms

Authors: Kanika Madan, Nan Rosemary Ke, Anirudh Goyal, Bernhard Schölkopf, Yoshua Bengio

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments in the domain of grounded language learning, in which poor data efficiency is one of the major limitations for agents to learn efficiently and generalize well (Hermann et al., 2017; Chaplot et al., 2017; Wu et al., 2018; Yu et al., 2018; Chevalier-Boisvert et al., 2018). We show, empirically, how the proposed learning agent can generalize better not only on the seen data, but also is more sample efficient, faster to train and adapt, and has better transfer capabilities in the face of changes in distributions.
Researcher Affiliation | Academia | (1) Mila, University of Montreal; (2) Mila, Polytechnique Montréal; (3) Max Planck Institute for Intelligent Systems.
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology, nor does it include any links to a code repository.
Open Datasets | Yes | The experiments are based on a large variety of environments from the MiniGrid and BabyAI suites (Chevalier-Boisvert et al., 2018), which provide an egocentric and partially observed view of the environment.
Dataset Splits | No | The paper does not provide specific percentages or counts for training, validation, or test dataset splits. It discusses training and evaluation across different environments rather than fixed data partitions.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using the Proximal Policy Optimization algorithm but does not specify any software dependencies with their version numbers (e.g., Python, PyTorch, TensorFlow versions or specific library versions).
Experiment Setup | Yes | For generalized advantage function, we used λ = 0.99, and discounted future rewards by a factor of γ = 0.99. For all of our environments, we used n = 5 total modules, with only k = 3 of them active at any given time.
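The Experiment Setup row reports only a handful of hyperparameters: GAE λ = 0.99, discount γ = 0.99, and n = 5 recurrent modules with k = 3 active at a time. Since the paper releases no code, the snippet below is a minimal sketch of how those reported values could be wired into a standard generalized-advantage-estimation step of a PPO-style pipeline; the names (`CONFIG`, `compute_gae`) and the dummy rollout data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Values reported in the paper's experiment setup; everything else in this
# sketch (names, structure, dummy data) is an illustrative assumption.
CONFIG = {
    "gae_lambda": 0.99,   # lambda for generalized advantage estimation
    "gamma": 0.99,        # discount factor for future rewards
    "num_modules": 5,     # n: total recurrent modules
    "num_active": 3,      # k: modules active at any given time step
}

def compute_gae(rewards, values, dones, gamma, lam):
    """Standard GAE(lambda) advantages for a single rollout.

    rewards, dones: arrays of length T; values: array of length T + 1
    (bootstrap value appended at the end).
    """
    T = len(rewards)
    advantages = np.zeros(T, dtype=np.float64)
    last_gae = 0.0
    for t in reversed(range(T)):
        nonterminal = 1.0 - float(dones[t])
        delta = rewards[t] + gamma * values[t + 1] * nonterminal - values[t]
        last_gae = delta + gamma * lam * nonterminal * last_gae
        advantages[t] = last_gae
    return advantages

# Tiny usage example on dummy rollout data (not from the paper).
rewards = np.array([0.0, 0.0, 1.0])
values = np.array([0.1, 0.2, 0.5, 0.0])   # includes bootstrap value
dones = np.array([0, 0, 1])
adv = compute_gae(rewards, values, dones, CONFIG["gamma"], CONFIG["gae_lambda"])
print(adv)
```

In a full PPO pipeline these advantages would feed the clipped surrogate objective, while the module counts (n = 5, k = 3) would configure the recurrent-independent-mechanisms cell itself rather than the advantage computation.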