Provably Efficient Imitation Learning from Observation Alone
Authors: Wen Sun, Anirudh Vemula, Byron Boots, Drew Bagnell
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also demonstrate the efficacy of FAIL on multiple OpenAI Gym control tasks. |
| Researcher Affiliation | Collaboration | Robotics Institute, Carnegie Mellon University, USA; College of Computing, Georgia Institute of Technology, USA; Aurora Innovation, USA. |
| Pseudocode | Yes | Algorithm 1 Min-Max Game (D⋆, D, Π, F, T); a toy schematic of this min-max loop follows the table. |
| Open Source Code | Yes | Implementation and scripts for reproducing results can be found at https://github.com/wensun/Imitation-Learning-from-Observation |
| Open Datasets | Yes | We test FAIL on three simulations from OpenAI Gym (Brockman et al., 2016): Swimmer, Reacher, and the Fetch Robot Reach task (Fetch Reach). |
| Dataset Splits | No | The paper mentions 'training samples' but does not provide specific percentages, counts, or methodologies for dataset splits (train/validation/test). |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running experiments are provided. |
| Software Dependencies | No | The paper mentions software components like ADAM, TRPO, and OpenAI Baselines, but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | For Swimmer we set H to 100, while for Reacher and Fetch Reach H is 50 by default. For all three tasks, we discretize the action space by splitting each dimension into 5 values and applying a categorical distribution independently for each dimension (see the sketch after this table). The total number of iterations T in Algorithm 1 is set to 1000 without any further tuning. All policies are parameterized by one-layer neural networks. |
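
The Pseudocode row quotes only the header of Algorithm 1. The following is a rough, runnable schematic of a min-max game of that general shape, not the authors' implementation: the 1-D Gaussian stand-ins for expert and learner state distributions, the two-element discriminator class `F`, and the gradient-based no-regret update are all illustrative assumptions. Only T = 1000 is taken from the setup row.

```python
# Toy schematic of a min-max game between a policy player and a
# discriminator player (NOT the authors' FAIL code; everything here
# except T = 1000 is an illustrative assumption).
import numpy as np

rng = np.random.default_rng(0)
expert_states = rng.normal(loc=1.0, scale=0.1, size=256)  # stand-in for expert data D*
F = [lambda s: s, lambda s: -s]                           # tiny discriminator class

theta, lr, T = 0.0, 0.05, 1000  # learner "policy" parameter, step size, rounds
for _ in range(T):
    learner_states = rng.normal(loc=theta, scale=0.1, size=256)
    # Max player: best-response discriminator, i.e. the f in F with the largest
    # gap between learner and expert state averages (an IPM-style objective).
    gaps = [np.mean(f(learner_states)) - np.mean(f(expert_states)) for f in F]
    i = int(np.argmax(np.abs(gaps)))
    # Min player: gradient (no-regret-style) update shrinking that gap;
    # d(gap)/d(theta) is +1 for f(s) = s and -1 for f(s) = -s.
    theta -= lr * gaps[i] * (1.0 if i == 0 else -1.0)

print(f"learned state mean {theta:.3f} vs expert mean 1.0")
```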
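The Experiment Setup row describes discretizing each action dimension into 5 values and sampling from an independent categorical distribution per dimension. A minimal sketch of that scheme, assuming a Box-style action space with `low`/`high` bounds; the helper names (`make_action_grid`, `sample_action`) are hypothetical and the logits here stand in for a one-layer policy network's output:

```python
# Minimal sketch (not the authors' code) of per-dimension action
# discretization with independent categorical sampling.
import numpy as np

def make_action_grid(low, high, bins_per_dim=5):
    # One row per action dimension: bins_per_dim evenly spaced candidate values.
    return np.stack([np.linspace(l, h, bins_per_dim) for l, h in zip(low, high)])

def sample_action(logits, grid, rng):
    # Softmax over each row of logits, then an independent draw per dimension.
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    idx = [rng.choice(grid.shape[1], p=p) for p in probs]
    return grid[np.arange(grid.shape[0]), idx]

rng = np.random.default_rng(0)
grid = make_action_grid(low=[-1.0, -1.0], high=[1.0, 1.0])  # e.g. a 2-D action space
logits = rng.normal(size=grid.shape)  # stand-in for a one-layer policy net's output
print(sample_action(logits, grid, rng))
```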