Provably Efficient Imitation Learning from Observation Alone

Authors: Wen Sun, Anirudh Vemula, Byron Boots, Drew Bagnell

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also demonstrate the efficacy of FAIL on multiple OpenAI Gym control tasks.
Researcher Affiliation | Collaboration | 1Robotics Institute, Carnegie Mellon University, USA; 2College of Computing, Georgia Institute of Technology, USA; 3Aurora Innovation, USA.
Pseudocode | Yes | Algorithm 1 Min-Max Game (D*, D, Π, F, T)
Open Source Code | Yes | Implementation and scripts for reproducing results can be found at https://github.com/wensun/Imitation-Learning-from-Observation
Open Datasets | Yes | We test FAIL on three simulations from OpenAI Gym (Brockman et al., 2016): Swimmer, Reacher, and the Fetch Robot Reach task (Fetch Reach).
Dataset Splits | No | The paper mentions 'training samples' but does not provide specific percentages, counts, or methodologies for dataset splits (train/validation/test).
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running experiments are provided.
Software Dependencies | No | The paper mentions software components like ADAM, TRPO, and OpenAI Baselines, but does not provide specific version numbers for any of them.
Experiment Setup | Yes | For Swimmer we set H to 100, while for Reacher and Fetch Reach, H is 50 by default. For all three tasks, we discretize the action space by splitting each dimension into 5 values and applying a categorical distribution independently for each dimension. The total number of iterations T in Algorithm 1 is set to 1000 without any further tuning. All policies are parameterized by one-layer neural networks.
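
For readers unfamiliar with the min-max structure named in the Pseudocode row, below is a minimal sketch of a two-player zero-sum game solved with a no-regret (multiplicative-weights) policy player against a best-response discriminator, the general pattern Algorithm 1 follows over its T iterations. The payoff matrix `u`, the finite class sizes, and the learning rate are illustrative assumptions, not the authors' implementation; in the paper the payoff is, roughly, an empirical estimate of how well a discriminator in F separates the learner's next-state distribution from the expert's.

```python
import numpy as np

def min_max_game(u, T=1000, lr=0.1):
    """Illustrative min-max game over finite classes (an assumption).

    u: payoff matrix of shape (n_policies, n_discriminators); u[i, j] stands
    for an estimated gap between discriminator j's average value on policy
    i's next-state distribution and on the expert's.
    The policy player minimizes u via multiplicative weights; the
    discriminator player best-responds each round. Returns the average
    policy mixture over T rounds.
    """
    n_pi, _ = u.shape
    w = np.ones(n_pi)
    avg = np.zeros(n_pi)
    for _ in range(T):
        p = w / w.sum()              # current policy mixture
        j = np.argmax(p @ u)         # discriminator best response
        w *= np.exp(-lr * u[:, j])   # no-regret (multiplicative-weights) update
        avg += p
    return avg / T
```

The averaged mixture is the standard output of no-regret dynamics in zero-sum games; the paper's T = 1000 plays the same role as T here.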
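
The Experiment Setup row also describes discretizing each continuous action dimension into 5 values and sampling each dimension from an independent categorical distribution. A minimal sketch of that scheme, assuming evenly spaced bins and hypothetical helper names (`make_bins`, `sample_action`), is below; it is not the authors' code.

```python
import numpy as np

N_BINS = 5  # 5 discrete values per action dimension, per the paper

def make_bins(low, high, n_bins=N_BINS):
    """Evenly spaced candidate values for one action dimension (assumed spacing)."""
    return np.linspace(low, high, n_bins)

def sample_action(logits, action_low, action_high):
    """Sample a continuous action from per-dimension categorical logits.

    logits: array of shape (action_dim, N_BINS), one categorical per dimension.
    """
    action = np.empty(len(action_low))
    for d, (lo, hi) in enumerate(zip(action_low, action_high)):
        probs = np.exp(logits[d] - logits[d].max())  # softmax over this dimension
        probs /= probs.sum()
        idx = np.random.choice(N_BINS, p=probs)      # sampled independently per dimension
        action[d] = make_bins(lo, hi)[idx]
    return action
```

Factorizing the distribution per dimension keeps the policy's output size linear in the action dimension (5 x action_dim logits) rather than exponential (5^action_dim joint classes).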