Generative Adversarial Imitation Learning

Authors: Jonathan Ho, Stefano Ermon

NeurIPS 2016

Reproducibility assessment (each variable is listed with its result, followed by the LLM's supporting response):
Research Type: Experimental. "We test our algorithm in Section 6, where we find that it outperforms competing methods by a wide margin in training policies for complex, high-dimensional physics-based control tasks over various amounts of expert data." We evaluated GAIL against baselines on 9 physics-based control tasks...
Researcher Affiliation: Collaboration. Jonathan Ho (OpenAI, hoj@openai.com); Stefano Ermon (Stanford University, ermon@cs.stanford.edu).
Pseudocode: Yes. Algorithm 1, "Generative adversarial imitation learning".
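Algorithm 1 alternates an Adam step on the discriminator with a TRPO step on the policy against the surrogate cost log D(s, a). Below is a heavily simplified PyTorch sketch of that alternation, not the authors' code: synthetic data stands in for environment rollouts, and a plain Adam step replaces the paper's TRPO step.

```python
# Simplified sketch of Algorithm 1's alternation (not the authors' code).
# D is trained to output 1 on policy pairs and 0 on expert pairs; the policy
# then descends the surrogate cost log D(s, a).
import torch
import torch.nn as nn
import torch.nn.functional as F

obs_dim, act_dim = 11, 3  # placeholder sizes; they vary by task

def make_net(in_dim, out_dim):
    # Same two-hidden-layer tanh architecture the paper uses for all networks.
    return nn.Sequential(nn.Linear(in_dim, 100), nn.Tanh(),
                         nn.Linear(100, 100), nn.Tanh(),
                         nn.Linear(100, out_dim))

policy = make_net(obs_dim, act_dim)    # deterministic mean action, for brevity
disc = make_net(obs_dim + act_dim, 1)  # outputs a logit for D(s, a)
d_opt = torch.optim.Adam(disc.parameters(), lr=1e-3)
pi_opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

expert_sa = torch.randn(256, obs_dim + act_dim)  # stand-in for expert (s, a) pairs

for _ in range(100):
    # Sample (s, a) pairs from the current policy (synthetic states here).
    s = torch.randn(256, obs_dim)
    a = policy(s) + 0.1 * torch.randn(256, act_dim)
    policy_sa = torch.cat([s, a], dim=1)

    # Discriminator step: maximize E_pi[log D] + E_expert[log(1 - D)].
    d_loss = bce(disc(policy_sa.detach()), torch.ones(256, 1)) \
           + bce(disc(expert_sa), torch.zeros(256, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Policy step: minimize the surrogate cost E_pi[log D(s, a)]
    # (a TRPO step in the paper; plain Adam here for brevity).
    cost = F.logsigmoid(disc(policy_sa)).mean()
    pi_opt.zero_grad(); cost.backward(); pi_opt.step()
```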
Open Source Code: No. The paper does not provide an explicit statement about releasing source code for the methodology described, nor does it include a link to a code repository.
Open Datasets: Yes. "Each task comes with a true cost function, defined in the OpenAI Gym [5]."
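For reference, a minimal sketch of stepping one such task with the classic Gym API of that era; the environment ID and version are assumptions (the paper does not state them), and MuJoCo-backed tasks require a local MuJoCo installation.

```python
# Minimal sketch using the classic Gym API (env ID assumed; versions are
# not stated in the paper).
import gym

env = gym.make("Hopper-v1")  # MuJoCo-backed locomotion task
obs = env.reset()
true_cost = 0.0
for _ in range(1000):
    obs, reward, done, _ = env.step(env.action_space.sample())
    true_cost -= reward      # each task's true cost is exposed via its reward
    if done:
        break
env.close()
```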
Dataset Splits: Yes. "Behavioral cloning: a given dataset of state-action pairs is split into 70% training data and 30% validation data."
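A minimal sketch of that 70/30 split, assuming the demonstrations are stored as a flat array of state-action pairs (the paper's actual data-handling code is not available):

```python
# Minimal sketch of the 70/30 train/validation split for behavioral cloning.
import numpy as np

rng = np.random.default_rng(0)
pairs = rng.standard_normal((1000, 14))  # stand-in for (state, action) pairs
perm = rng.permutation(len(pairs))
cut = int(0.7 * len(pairs))
train, valid = pairs[perm[:cut]], pairs[perm[cut:]]
```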
Hardware Specification: No. The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies: No. The paper mentions software components such as OpenAI Gym, Adam, TRPO, and MuJoCo, but it does not provide version numbers for any of them.
Experiment Setup: Yes. "We used all algorithms to train policies of the same neural network architecture for all tasks: two hidden layers of 100 units each, with tanh nonlinearities in between. The discriminator networks for GAIL also used the same architecture. All networks were always initialized randomly at the start of each trial. For each task, we gave FEM, GTAL, and GAIL exactly the same amount of environment interaction for training. We ran all algorithms 5-7 times over different random seeds in all environments except Humanoid, due to time restrictions."
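Since the paper does not name a framework, here is a hedged PyTorch rendering of the quoted setup: the shared two-hidden-layer tanh architecture and fresh random initialization per seeded trial. The layer sizes follow the quote; the observation/action dimensions and the seed count are illustrative.

```python
# Sketch of the quoted setup (framework assumed; the paper does not name one).
import torch
import torch.nn as nn

def make_net(in_dim: int, out_dim: int) -> nn.Module:
    # "two hidden layers of 100 units each, with tanh nonlinearities in between"
    return nn.Sequential(nn.Linear(in_dim, 100), nn.Tanh(),
                         nn.Linear(100, 100), nn.Tanh(),
                         nn.Linear(100, out_dim))

for seed in range(5):                    # 5-7 seeds per environment in the paper
    torch.manual_seed(seed)
    policy = make_net(11, 3)             # fresh random init at the start of each trial
    discriminator = make_net(11 + 3, 1)  # same architecture for GAIL's discriminator
    # Training with a fixed environment-interaction budget would follow here.
```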