reproducibilityindex.ai

Composing Entropic Policies using Divergence Correction

Authors: Jonathan Hunt, Andre Barreto, Timothy Lillicrap, Nicolas Heess

ICML 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We study this approach in the tabular case and on non-trivial continuous control problems with compositional structure and show that it outperforms or matches existing methods across all tasks considered.
Researcher Affiliation	Industry	Jonathan J Hunt 1, Andre Barreto 1, Timothy P Lillicrap 1, Nicolas Heess 1; 1 DeepMind. Correspondence to: Jonathan J Hunt <jjhunt@google.com>.
Pseudocode	Yes	Algorithm 1 AISBP training algorithm
Open Source Code	No	Videos of the tasks and supplementary information at https://tinyurl.com/yaplfwaq.
Open Datasets	No	The paper describes using simulated environments (e.g., 8x8 tabular world, point mass, planar manipulator, jumping ball, ant) and generating experience within them. It does not refer to a specific public dataset with access information (link, DOI, citation).
Dataset Splits	No	The paper describes training using a replay buffer and evaluating performance, but it does not specify explicit training, validation, or test dataset splits with percentages or sample counts.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies	No	The paper refers to tools like DeepMind Control Suite and MuJoCo in the references, but it does not provide specific software dependencies with version numbers for replication (e.g., 'Python 3.8, PyTorch 1.9, and CUDA 11.1' or 'TensorFlow 2.x').
Experiment Setup	No	The paper describes its algorithms and theoretical components and states that full details are in Appendix C, but the main text gives no specific hyperparameter values or concrete training configurations.