Imitation by Predicting Observations
Authors: Andrew Jaegle, Yury Sulsky, Arun Ahuja, Jake Bruce, Rob Fergus, Greg Wayne
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that FORM performs comparably to a strong baseline IRL method (GAIL) on the DeepMind Control Suite benchmark, while outperforming GAIL in the presence of task-irrelevant features. |
| Researcher Affiliation | Industry | ¹DeepMind. Correspondence to: Andrew Jaegle <drewjaegle@deepmind.com>. |
| Pseudocode | Yes | Algorithm 1 Imitation learning with FORM |
| Open Source Code | No | The paper does not provide an explicit statement about open-sourcing its code or a link to a code repository. |
| Open Datasets | Yes | We evaluate FORM against strong baselines on 13 tasks from six domains from the DeepMind Control Suite (DCS) (Tassa et al., 2018), a set of benchmarks for continuous control domains... |
| Dataset Splits | No | The paper mentions generating demonstration trajectories for training and evaluating performance, but does not define train/validation/test splits or specify a separate validation set. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper cites JAX and NumPy in its references but does not provide version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | Architecture. We use simple feedforward architectures to parameterize the density models (3-layer MLPs with 256 units, and tanh and ELU (Clevert et al., 2016) nonlinearities). We model the density as a mixture of 4 Gaussian components... We tuned ℓ2 weight (sweeping values of [0.0, 0.01, 0.1, 1.0]) and the fraction of each batch generated by agent rollouts (sweeping values of [0.0, 0.01, 0.1, 1.0]) per domain, but otherwise use identical hyperparameters for all FORM models. |
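
The Experiment Setup row describes density models whose output is a mixture of 4 Gaussian components. As a rough illustration of what evaluating such a density involves, the sketch below computes the log-density of an observation under a Gaussian mixture using only the Python standard library. It is not the paper's code: the diagonal covariance, the parameterization by logits and log standard deviations, and all function names (`log_sum_exp`, `diag_gaussian_log_prob`, `mixture_log_prob`) are assumptions made for this example. The MLP that would produce these parameters is omitted.

```python
import math

NUM_COMPONENTS = 4  # from the quoted setup: a mixture of 4 Gaussian components


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def diag_gaussian_log_prob(x, mean, log_std):
    """Log-density of x under a Gaussian with diagonal covariance (an assumption)."""
    total = 0.0
    for xi, mu, ls in zip(x, mean, log_std):
        z = (xi - mu) / math.exp(ls)
        total += -0.5 * z * z - ls - 0.5 * math.log(2.0 * math.pi)
    return total


def mixture_log_prob(x, logits, means, log_stds):
    """Log-density of x under a Gaussian mixture.

    logits:   unnormalized log mixture weights, one per component
    means:    one mean vector per component, each the same length as x
    log_stds: one log-standard-deviation vector per component
    """
    log_norm = log_sum_exp(logits)  # normalizes the mixture weights
    terms = [
        (logit - log_norm) + diag_gaussian_log_prob(x, mean, log_std)
        for logit, mean, log_std in zip(logits, means, log_stds)
    ]
    return log_sum_exp(terms)


# Hypothetical parameters: four identical standard-normal components in 1-D.
example_log_prob = mixture_log_prob(
    [0.0],
    [0.0] * NUM_COMPONENTS,
    [[0.0]] * NUM_COMPONENTS,
    [[0.0]] * NUM_COMPONENTS,
)
```

With four identical components the mixture collapses to a single standard normal, so `example_log_prob` equals the standard normal log-density at zero. Working in log space with `log_sum_exp` avoids underflow when individual component densities are very small.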