Imitation by Predicting Observations

Authors: Andrew Jaegle, Yury Sulsky, Arun Ahuja, Jake Bruce, Rob Fergus, Greg Wayne

ICML 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We show that FORM performs comparably to a strong baseline IRL method (GAIL) on the DeepMind Control Suite benchmark, while outperforming GAIL in the presence of task-irrelevant features." |
| Researcher Affiliation | Industry | "DeepMind. Correspondence to: Andrew Jaegle <drewjaegle@deepmind.com>." |
| Pseudocode | Yes | "Algorithm 1 Imitation learning with FORM" (a hedged sketch of the reward computation this algorithm uses appears below the table) |
| Open Source Code | No | The paper does not provide an explicit statement about open-sourcing its code or a link to a code repository. |
| Open Datasets | Yes | "We evaluate FORM against strong baselines on 13 tasks from six domains from the DeepMind Control Suite (DCS) (Tassa et al., 2018), a set of benchmarks for continuous control domains..." |
| Dataset Splits | No | The paper mentions generating demonstration trajectories for training and evaluating performance, but does not define explicit train/validation/test splits or a separate validation set. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions JAX and NumPy (in its references) but does not give version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | "Architecture. We use simple feedforward architectures to parameterize the density models (3-layer MLPs with 256 units, and tanh and ELU (Clevert et al., 2016) nonlinearities). We model the density as a mixture of 4 Gaussian components... We tuned the ℓ2 weight (sweeping values of [0.0, 0.01, 0.1, 1.0]) and the fraction of each batch generated by agent rollouts (sweeping values of [0.0, 0.01, 0.1, 1.0]) per domain, but otherwise use identical hyperparameters for all FORM models." (a sketch of such a mixture density model follows the table) |
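
The Algorithm 1 referenced in the Pseudocode row trains one next-observation density model on demonstrations and another on the agent's own rollouts, and rewards the agent with their log-density ratio. The sketch below illustrates that reward computation only; the fixed-Gaussian stand-in models, the `drift` parameter, and all function names are illustrative assumptions, not the paper's implementation.

```python
import jax
import jax.numpy as jnp

# Stand-in density models: in the paper these are learned mixture-density MLPs
# (see the architecture sketch further below); fixed unit-variance Gaussians
# are used here only so the example runs end to end.
def make_gaussian_logpdf(drift):
    def log_prob(s_t, s_next):
        # log N(s_next; s_t + drift, I), summed over observation dimensions.
        d = s_next - (s_t + drift)
        return -0.5 * jnp.sum(d ** 2 + jnp.log(2.0 * jnp.pi), axis=-1)
    return log_prob

log_p_demo = make_gaussian_logpdf(drift=0.1)   # model fit to demonstrations
log_p_agent = make_gaussian_logpdf(drift=0.0)  # model fit to agent rollouts

def form_reward(s_t, s_next):
    """FORM reward: log p_demo(s_{t+1} | s_t) - log p_agent(s_{t+1} | s_t)."""
    return log_p_demo(s_t, s_next) - log_p_agent(s_t, s_next)

# Toy batch of transitions: moves that resemble the "demonstrator" drift
# receive higher reward than moves that do not.
s_t = jax.random.normal(jax.random.PRNGKey(0), (4, 8))
print(form_reward(s_t, s_t + 0.1))  # demonstrator-like: positive reward
print(form_reward(s_t, s_t))        # agent-like: lower reward
```

In Algorithm 1 itself, the two density models are learned rather than fixed, and the policy is trained against this reward with reinforcement learning.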
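
The architecture quoted in the Experiment Setup row (3-layer MLPs with 256 units, tanh and ELU nonlinearities, and a 4-component Gaussian mixture output) can be rendered as a mixture density network. The JAX sketch below is one plausible reading of that description; the observation dimensionality, the activation ordering, the diagonal covariance, and the initialization scheme are assumptions, not the paper's exact configuration.

```python
import jax
import jax.numpy as jnp

K, HIDDEN, OBS_DIM = 4, 256, 8  # OBS_DIM is illustrative, not from the paper

def init_params(key, obs_dim=OBS_DIM, hidden=HIDDEN, k=K):
    # Three hidden layers of 256 units; the head emits mixture logits plus
    # per-component means and log standard deviations for s_{t+1}.
    sizes = [obs_dim, hidden, hidden, hidden, k * (1 + 2 * obs_dim)]
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(kk, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for kk, m, n in zip(keys, sizes[:-1], sizes[1:])]

def mdn_log_prob(params, s_t, s_next, k=K):
    """log p(s_next | s_t) under a k-component diagonal-Gaussian mixture."""
    h = s_t
    # The paper names tanh and ELU nonlinearities; this ordering is a guess.
    for (W, b), act in zip(params[:-1], (jnp.tanh, jax.nn.elu, jax.nn.elu)):
        h = act(h @ W + b)
    W, b = params[-1]
    out = h @ W + b
    d = s_next.shape[-1]
    logits = out[..., :k]
    mu = out[..., k:k + k * d].reshape(*out.shape[:-1], k, d)
    log_sigma = out[..., k + k * d:].reshape(*out.shape[:-1], k, d)
    z = (s_next[..., None, :] - mu) / jnp.exp(log_sigma)
    # Per-component diagonal-Gaussian log-density, mixed via log-sum-exp.
    comp = -0.5 * jnp.sum(z ** 2 + 2.0 * log_sigma + jnp.log(2.0 * jnp.pi),
                          axis=-1)
    return jax.scipy.special.logsumexp(jax.nn.log_softmax(logits) + comp,
                                       axis=-1)

params = init_params(jax.random.PRNGKey(0))
s_t = jax.random.normal(jax.random.PRNGKey(1), (32, OBS_DIM))
s_next = jax.random.normal(jax.random.PRNGKey(2), (32, OBS_DIM))
print(mdn_log_prob(params, s_t, s_next).shape)  # (32,) log-density per pair
```

Training would minimize the negative of this conditional log-density over (s_t, s_{t+1}) pairs, with the ℓ2 penalty weight and the agent-rollout batch fraction swept as described in the Experiment Setup row.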