Options of Interest: Temporal Abstraction with Interest Functions

Authors: Khimya Khetarpal, Martin Klissarov, Maxime Chevalier-Boisvert, Pierre-Luc Bacon, Doina Precup

AAAI 2020, pp. 4444-4451

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the efficacy of the proposed approach through quantitative and qualitative results, in both discrete and continuous environments."
Researcher Affiliation | Collaboration | "Khimya Khetarpal (1,2), Martin Klissarov (1,2), Maxime Chevalier-Boisvert (2,3), Pierre-Luc Bacon (4), Doina Precup (1,2,5); 1 McGill University, 2 Mila, 3 Université de Montréal, 4 Stanford University, 5 Google DeepMind"
Pseudocode | Yes | "Pseudo-code of the interest-option-critic (IOC) algorithm using intra-option Q-learning is shown in Algorithm 1."
Open Source Code | Yes | "A link to the source code for all experiments is provided on the project page: https://sites.google.com/view/optionsofinterest"
Open Datasets | Yes | "We first consider the classic FR domain (1999) (Fig. 3a). We use simple continuous control tasks implemented in MuJoCo (Todorov, Erez, and Tassa 2012). We use the OneRoom task where the agent has to navigate to a randomly placed red block in a closed room (Fig. 4f) from the MiniWorld framework (Chevalier-Boisvert 2018). The initial configuration of this environment follows the standard HalfCheetah-v1 from OpenAI's Gym."
Dataset Splits | No | The paper describes experimental setups but does not provide specific training/validation/test dataset splits (e.g., percentages or sample counts) for reproducibility.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments.
Software Dependencies | No | The paper mentions software frameworks such as MuJoCo and MiniWorld, and algorithms such as PPOC and DQN, but does not specify version numbers for these or other software dependencies.
Experiment Setup | Yes | "Additional details are provided in Appendix A.3.1. Complete details about the implementation and hyper-parameter search are provided in Appendix A.3.2. See Appendix A.3.3 for details about implementation and hyper-parameters."
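As context for the Pseudocode row: the paper's Algorithm 1 combines interest functions with intra-option Q-learning. The sketch below is not a reproduction of that algorithm, only a minimal tabular illustration of the two ingredients it names: a policy over options reweighted by a per-option interest function, and the one-step intra-option Q-learning update. All names (`interest`, `termination`, the sigmoid interest shape, the fixed termination probability) are hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_options = 5, 3
Q = np.zeros((n_states, n_options))  # option-value table Q(s, omega)
alpha, gamma = 0.1, 0.99

def interest(s):
    # Hypothetical smooth interest function I_omega(s) in (0, 1), one value
    # per option; in the paper this would be a learned, differentiable function.
    return 1.0 / (1.0 + np.exp(-(np.arange(n_options) - s)))

def policy_over_options(s):
    # Interest-weighted softmax over option values: options with low
    # interest in state s become proportionally less likely to be chosen.
    prefs = interest(s) * np.exp(Q[s] - Q[s].max())
    return prefs / prefs.sum()

def termination(s, omega):
    # Hypothetical fixed termination probability beta_omega(s).
    return 0.5

def intra_option_update(s, omega, r, s_next):
    # One-step intra-option Q-learning target: continue with omega
    # with probability (1 - beta), otherwise switch to the greedy option.
    beta = termination(s_next, omega)
    U = (1 - beta) * Q[s_next, omega] + beta * Q[s_next].max()
    Q[s, omega] += alpha * (r + gamma * U - Q[s, omega])

# One interaction step: sample an option under the interest-weighted
# policy, then apply the update for an assumed transition and reward.
s = 0
omega = rng.choice(n_options, p=policy_over_options(s))
intra_option_update(s, omega, r=1.0, s_next=1)
print(Q[s, omega])
```

With all Q-values initialized to zero, the first update moves the chosen entry to `alpha * r`; the interest weighting only changes how often each option is selected, not the update itself, which matches the separation of concerns the algorithm's name suggests.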