Flexible Option Learning

Authors: Martin Klissarov, Doina Precup

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We empirically verify the merits of our approach on a wide variety of domains, ranging from gridworlds using tabular representations [Sutton et al., 1999b, Bacon et al., 2016], control with linear function approximation [Moore, 1991], continuous control [Todorov et al., 2012, Brockman et al., 2016] and vision-based navigation [Chevalier-Boisvert, 2018]." (See the environment sketch after the table.) |
| Researcher Affiliation | Collaboration | Martin Klissarov, Mila, McGill University (martin.klissarov@mail.mcgill.ca); Doina Precup, Mila, McGill University and DeepMind (dprecup@cs.mcgill.ca) |
| Pseudocode | Yes | "We do so by instantiating these update rules within the option-critic (OC) algorithm which we present in Algorithm 1 in appendix A.1 and name it Multi-updates Option Critic (MOC) algorithm." (See the update-rule sketch after the table.) |
| Open Source Code | Yes | "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]" |
| Open Datasets | Yes | "We first evaluate our approach in the classic Four Rooms domain [Sutton et al., 1999b] [...] classic Mountain Car domain [Moore, 1991, Sutton and Barto, 2018] [...] MuJoCo domain [Todorov et al., 2012] [...] MiniWorld [Chevalier-Boisvert, 2018]." |
| Dataset Splits | No | The paper notes that hyperparameters are listed in the appendix, implying some tuning, but does not describe train/validation/test splits or any data-partitioning methodology. |
| Hardware Specification | No | The paper does not specify the hardware used for the experiments, such as CPU or GPU models, memory, or computing clusters. |
| Software Dependencies | No | The paper mentions building on algorithms such as A2C and PPO and references a PyTorch implementation, but provides no version numbers for any software dependencies. |
| Experiment Setup | Yes | "All hyperparameters are available in the appendix." |
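
The evaluation domains cited in the "Research Type" and "Open Datasets" rows are all public benchmarks exposed through the standard Gym API. The snippet below is a minimal sketch assuming an era-appropriate Gym release (circa 0.21); the specific environment IDs and third-party packages are our assumptions and are not confirmed by the paper or its released code.

```python
import gym

# Classic control with linear function approximation (ships with Gym).
mountain_car = gym.make("MountainCar-v0")

# Continuous control benchmark; requires mujoco-py and a MuJoCo install.
half_cheetah = gym.make("HalfCheetah-v2")

# Assumed IDs for the remaining domains; each needs a third-party package
# that registers the environment on import.
# import gym_minigrid   # tabular-style Four Rooms gridworld
# four_rooms = gym.make("MiniGrid-FourRooms-v0")
# import gym_miniworld  # vision-based 3D navigation
# one_room = gym.make("MiniWorld-OneRoom-v0")

obs = mountain_car.reset()  # old Gym API: reset() returns only the observation
```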
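
The multi-update idea behind MOC, propagating each transition's learning signal to every option that could plausibly have generated it rather than only the executing option, can be illustrated with a short sketch. The PyTorch snippet below is our own rough illustration, not the authors' method: the function name, argument names, and the interpolation coefficient `eta` are hypothetical, and the actual procedure is Algorithm 1 in appendix A.1 of the paper.

```python
import torch

def moc_policy_loss(option_logps, advantages, option_probs, executed_option, eta):
    """Illustrative blend of the standard option-critic update (only the
    executed option) with a multi-update over all options, weighted by how
    likely each option was to be active at the current state.

    option_logps:    [num_options] log pi_o(a_t | s_t) for the sampled action
    advantages:      [num_options] advantage estimates at (s_t, a_t)
    option_probs:    [num_options] probability each option is active at s_t
    executed_option: index of the option that actually produced a_t
    eta:             interpolation in [0, 1]
    """
    # One-hot weighting: only the option that actually produced the action.
    one_hot = torch.zeros_like(option_probs)
    one_hot[executed_option] = 1.0
    # eta = 0 recovers the single-option update; eta = 1 updates every option.
    weights = (1.0 - eta) * one_hot + eta * option_probs
    # Policy-gradient loss summed over options; weights and advantages are
    # treated as constants so gradients flow only through the log-probs.
    return -(weights.detach() * advantages.detach() * option_logps).sum()
```

Under this sketch, intermediate values of `eta` share part of each transition's learning signal across all options while still favoring the one that executed.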