Imitation by Predicting Observations

Authors: Andrew Jaegle, Yury Sulsky, Arun Ahuja, Jake Bruce, Rob Fergus, Greg Wayne

ICML 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We show that FORM performs comparably to a strong baseline IRL method (GAIL) on the DeepMind Control Suite benchmark, while outperforming GAIL in the presence of task-irrelevant features." |
| Researcher Affiliation | Industry | "DeepMind. Correspondence to: Andrew Jaegle <drewjaegle@deepmind.com>." |
| Pseudocode | Yes | "Algorithm 1 Imitation learning with FORM" (a hedged sketch of the reward computation this algorithm uses appears below the table) |
| Open Source Code | No | The paper does not provide an explicit statement about open-sourcing its code or a link to a code repository. |
| Open Datasets | Yes | "We evaluate FORM against strong baselines on 13 tasks from six domains from the DeepMind Control Suite (DCS) (Tassa et al., 2018), a set of benchmarks for continuous control domains..." |
| Dataset Splits | No | The paper mentions generating demonstration trajectories for training and evaluating performance, but does not define explicit train/validation/test splits or a separate validation set. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions JAX and NumPy (in its references) but does not give version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | "Architecture. We use simple feedforward architectures to parameterize the density models (3-layer MLPs with 256 units, and tanh and ELU (Clevert et al., 2016) nonlinearities). We model the density as a mixture of 4 Gaussian components... We tuned the ℓ2 weight (sweeping values of [0.0, 0.01, 0.1, 1.0]) and the fraction of each batch generated by agent rollouts (sweeping values of [0.0, 0.01, 0.1, 1.0]) per domain, but otherwise use identical hyperparameters for all FORM models." (a sketch of such a mixture density model follows the table) |
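
The Algorithm 1 referenced in the Pseudocode row trains one next-observation density model on demonstrations and another on the agent's own rollouts, and rewards the agent with their log-density ratio. The sketch below illustrates that reward computation only; the fixed-Gaussian stand-in models, the `drift` parameter, and all function names are illustrative assumptions, not the paper's implementation.

```python
import jax
import jax.numpy as jnp

# Stand-in density models: in the paper these are learned mixture-density MLPs
# (see the architecture sketch further below); fixed unit-variance Gaussians
# are used here only so the example runs end to end.
def make_gaussian_logpdf(drift):
    def log_prob(s_t, s_next):
        # log N(s_next; s_t + drift, I), summed over observation dimensions.
        d = s_next - (s_t + drift)
        return -0.5 * jnp.sum(d ** 2 + jnp.log(2.0 * jnp.pi), axis=-1)
    return log_prob

log_p_demo = make_gaussian_logpdf(drift=0.1)   # model fit to demonstrations
log_p_agent = make_gaussian_logpdf(drift=0.0)  # model fit to agent rollouts

def form_reward(s_t, s_next):
    """FORM reward: log p_demo(s_{t+1} | s_t) - log p_agent(s_{t+1} | s_t)."""
    return log_p_demo(s_t, s_next) - log_p_agent(s_t, s_next)

# Toy batch of transitions: moves that resemble the "demonstrator" drift
# receive higher reward than moves that do not.
s_t = jax.random.normal(jax.random.PRNGKey(0), (4, 8))
print(form_reward(s_t, s_t + 0.1))  # demonstrator-like: positive reward
print(form_reward(s_t, s_t))        # agent-like: lower reward
```

In Algorithm 1 itself, the two density models are learned rather than fixed, and the policy is trained against this reward with reinforcement learning.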
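
The architecture quoted in the Experiment Setup row (3-layer MLPs with 256 units, tanh and ELU nonlinearities, and a 4-component Gaussian mixture output) can be rendered as a mixture density network. The JAX sketch below is one plausible reading of that description; the observation dimensionality, the activation ordering, the diagonal covariance, and the initialization scheme are assumptions, not the paper's exact configuration.

```python
import jax
import jax.numpy as jnp

K, HIDDEN, OBS_DIM = 4, 256, 8  # OBS_DIM is illustrative, not from the paper

def init_params(key, obs_dim=OBS_DIM, hidden=HIDDEN, k=K):
    # Three hidden layers of 256 units; the head emits mixture logits plus
    # per-component means and log standard deviations for s_{t+1}.
    sizes = [obs_dim, hidden, hidden, hidden, k * (1 + 2 * obs_dim)]
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(kk, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for kk, m, n in zip(keys, sizes[:-1], sizes[1:])]

def mdn_log_prob(params, s_t, s_next, k=K):
    """log p(s_next | s_t) under a k-component diagonal-Gaussian mixture."""
    h = s_t
    # The paper names tanh and ELU nonlinearities; this ordering is a guess.
    for (W, b), act in zip(params[:-1], (jnp.tanh, jax.nn.elu, jax.nn.elu)):
        h = act(h @ W + b)
    W, b = params[-1]
    out = h @ W + b
    d = s_next.shape[-1]
    logits = out[..., :k]
    mu = out[..., k:k + k * d].reshape(*out.shape[:-1], k, d)
    log_sigma = out[..., k + k * d:].reshape(*out.shape[:-1], k, d)
    z = (s_next[..., None, :] - mu) / jnp.exp(log_sigma)
    # Per-component diagonal-Gaussian log-density, mixed via log-sum-exp.
    comp = -0.5 * jnp.sum(z ** 2 + 2.0 * log_sigma + jnp.log(2.0 * jnp.pi),
                          axis=-1)
    return jax.scipy.special.logsumexp(jax.nn.log_softmax(logits) + comp,
                                       axis=-1)

params = init_params(jax.random.PRNGKey(0))
s_t = jax.random.normal(jax.random.PRNGKey(1), (32, OBS_DIM))
s_next = jax.random.normal(jax.random.PRNGKey(2), (32, OBS_DIM))
print(mdn_log_prob(params, s_t, s_next).shape)  # (32,) log-density per pair
```

Training would minimize the negative of this conditional log-density over (s_t, s_{t+1}) pairs, with the ℓ2 penalty weight and the agent-rollout batch fraction swept as described in the Experiment Setup row.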