Combined Reinforcement Learning via Abstract Representations
Authors: Vincent François-Lavet, Yoshua Bengio, Doina Precup, Joelle Pineau (pp. 3582-3589)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the experimental section, we show for two contrasting domains that the CRAR agent is able to build an interpretable low-dimensional representation of the task and that it can use it for efficient planning. We also show that the CRAR agent leads to effective multi-task generalization and that it can efficiently be used for transfer learning. |
| Researcher Affiliation | Collaboration | Vincent François-Lavet (McGill University, Mila) vincent.francois-lavet@mcgill.ca; Doina Precup (McGill University, Mila, DeepMind) dprecup@cs.mcgill.ca; Yoshua Bengio (Université de Montréal, Mila) yoshua.bengio@mila.quebec; Joelle Pineau (McGill University, Mila, Facebook AI Research) jpineau@cs.mcgill.ca |
| Pseudocode | No | No explicit pseudocode or algorithm block labeled as such was found in the paper. |
| Open Source Code | Yes | The source code for all experiments is available at https://github.com/VinF/deer/ |
| Open Datasets | No | The paper describes datasets it generated for its experiments (e.g., “5000 transitions obtained with a purely random policy” for Labyrinth task, and “2 * 10^5 steps” for meta-learning labyrinths), but it does not provide concrete access information (link, DOI, formal citation) for these datasets to be publicly available. |
| Dataset Splits | No | The paper mentions “training set” and evaluation “at test time” but does not provide specific details on dataset splits (e.g., percentages, exact counts, or citations to predefined splits) for training, validation, and testing. It refers to data gathered offline on a training set and then evaluates on new tasks from the distribution. |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, or cloud computing instance specifications) used for running the experiments were provided in the paper. |
| Software Dependencies | No | The paper mentions general techniques like “RMSprop” for optimization and “DQN” algorithms, but does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | Details and hyper-parameters, along with an ablation study, are provided in Appendix B: α = 5·10⁻⁴, β = 0.2, with α decreased by 10% every 2000 training steps. |
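
The learning-rate schedule quoted from Appendix B (α = 5·10⁻⁴, reduced by 10% every 2000 training steps) can be sketched as below; this is a minimal illustration, not the paper's code, and the step-wise (piecewise-constant) interpretation of the decay is an assumption.

```python
def learning_rate(step: int, alpha0: float = 5e-4,
                  decay: float = 0.9, interval: int = 2000) -> float:
    """Learning rate after `step` training steps, assuming the paper's
    '10% decrease every 2000 steps' is applied as a step-wise decay.
    Names and signature are illustrative, not from the paper."""
    return alpha0 * decay ** (step // interval)

# Initial rate, then after each full 2000-step interval.
print(learning_rate(0))     # 5e-4
print(learning_rate(1999))  # still 5e-4 (first interval not yet complete)
print(learning_rate(2000))  # 4.5e-4
```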