reproducibilityindex.ai

Information-theoretic Task Selection for Meta-Reinforcement Learning

Authors: Ricardo Luna Gutierrez, Matteo Leonetti

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We reproduce different meta-RL experiments from the literature and show that ITTS improves the final performance in all of them. [...] 5 Experimental Evaluation The main aim of this evaluation is twofold: to demonstrate that task selection is indeed beneficial for meta-RL, and show that applying ITTS to existing meta-RL algorithms consistently results in better performance on test tasks.
Researcher Affiliation	Academia	Ricardo Luna Gutierrez, School of Computing, University of Leeds, Leeds, UK, scrlg@leeds.ac.uk; Matteo Leonetti, School of Computing, University of Leeds, Leeds, UK, M.Leonetti@leeds.ac.uk
Pseudocode	Yes	Algorithm 1 Information-Theoretic Task Selection [...] Algorithm 2 Relevance Evaluation
Open Source Code	Yes	All the parameters and implementation details for every experiment are available in the supplementary material, as well as the source code. For training individual tasks and meta-RL agents, garage [10] was used.
Open Datasets	Yes	CartPole, from OpenAI Gym [3], is a classic control task [...] MiniGrid is an open-source grid world package proposed as an RL benchmark [5].
Dataset Splits	Yes	In every domain we used K = 5 validation tasks.
Hardware Specification	Yes	We limited the number of training tasks in each domain so that the generation and training until convergence repeated for 5 times would not exceed 72 hours of computation on an 8-core machine at 1.8GHz and 32GB of RAM.
Software Dependencies	No	The paper mentions 'garage [10] was used' but does not provide a specific version number for this toolkit or any other software dependencies.
Experiment Setup	Yes	The threshold determines when a task is considered different enough from another task, that is, their difference measured as in Equation 1 is greater than or equal to ϵ. [...] All the parameters and implementation details for every experiment are available in the supplementary material, as well as the source code. [...] 20 rollouts per gradient were used.
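
The threshold rule quoted in the Experiment Setup row can be illustrated with a minimal sketch. It assumes only what the excerpt states: two tasks count as different enough when their difference is greater than or equal to ϵ. Equation 1 is not reproduced on this page, so the difference measure is passed in as a placeholder function. The names `is_different_enough`, `filter_by_difference`, and `difference_fn` are hypothetical, and the greedy loop is an illustrative assumption, not the paper's Algorithm 1; it also leaves out the relevance evaluation of Algorithm 2.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def is_different_enough(difference: float, epsilon: float) -> bool:
    """Threshold test from the excerpt: difference >= epsilon."""
    return difference >= epsilon


def filter_by_difference(
    candidates: Sequence[T],
    difference_fn: Callable[[T, T], float],  # placeholder for Equation 1
    epsilon: float,
) -> List[T]:
    """Greedily keep a candidate only if it is different enough
    from every task kept so far (an illustrative assumption)."""
    selected: List[T] = []
    for task in candidates:
        if all(
            is_different_enough(difference_fn(task, kept), epsilon)
            for kept in selected
        ):
            selected.append(task)
    return selected


# Toy usage: tasks are plain numbers and the "difference" is their
# absolute gap. This stands in for the real measure, which is not shown here.
toy_tasks = [0.0, 0.05, 0.5, 0.52, 1.0]
selected = filter_by_difference(toy_tasks, lambda a, b: abs(a - b), epsilon=0.1)
print(selected)
```

With the toy measure, near-duplicate tasks (0.05 and 0.52) are dropped because their gap to an already kept task falls below ϵ. A larger ϵ would prune more aggressively.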