A Bayesian Approach to Generative Adversarial Imitation Learning

Authors: Wonseok Jeon, Seokin Seo, Kee-Eung Kim

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluated our BGAIL on five continuous control tasks (Hopper-v1, Walker2d-v1, HalfCheetah-v1, Ant-v1, Humanoid-v1) from OpenAI Gym, implemented with the MuJoCo physics simulator [Todorov et al., 2012]. The imitation performances of vanilla GAIL, tuned GAIL and our algorithm are summarized in Table 1."
Researcher Affiliation | Collaboration | Wonseok Jeon (1), Seokin Seo (1), Kee-Eung Kim (1,2); (1) School of Computing, KAIST, Republic of Korea; (2) PROWLER.io; {wsjeon, siseo}@ai.kaist.ac.kr, kekim@cs.kaist.ac.kr
Pseudocode | Yes | Algorithm 1: Bayesian Generative Adversarial Imitation Learning (BGAIL)
Open Source Code | No | The paper states "our code was built on the GAIL implementation in OpenAI Baselines" and "Our SVGD was implemented using the code released by the authors2", with footnote 2 linking to an SVGD repository. However, it does not explicitly state that the BGAIL code developed in this paper is publicly available.
Open Datasets | Yes | "Expert trajectories were collected from the expert policy released by the authors of the original GAIL1, but our code was built on the GAIL implementation in OpenAI Baselines [Dhariwal et al., 2017] which uses TensorFlow [Abadi et al., 2016]. For the policy, Gaussian policy was used with both mean and variance dependent on the observation." Footnote 1: https://github.com/openai/imitation
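The observation-dependent Gaussian policy quoted above can be sketched as a small network whose head emits both a mean and a log-standard-deviation per action dimension. This is a minimal pure-Python sketch, not the authors' code: the layer sizes, random initialization, and helper names (`make_layer`, `forward`) are illustrative assumptions (the paper uses two hidden layers of 100 tanh units).

```python
import math
import random

random.seed(0)

def make_layer(n_in, n_out):
    # Hypothetical stand-in for trained parameters: small random weights, zero biases.
    return ([[random.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_out)],
            [0.0] * n_out)

def forward(layer, x, activation=None):
    w, b = layer
    out = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    return [activation(v) for v in out] if activation else out

class GaussianPolicy:
    """Two hidden tanh layers (100 units in the paper; 8 here for brevity).
    Both the mean and the log-std heads read the hidden features, so the
    variance depends on the observation, as the quoted setup describes."""
    def __init__(self, obs_dim, act_dim, hidden=8):
        self.h1 = make_layer(obs_dim, hidden)
        self.h2 = make_layer(hidden, hidden)
        self.mean_head = make_layer(hidden, act_dim)
        self.logstd_head = make_layer(hidden, act_dim)

    def __call__(self, obs):
        h = forward(self.h1, obs, math.tanh)
        h = forward(self.h2, h, math.tanh)
        return forward(self.mean_head, h), forward(self.logstd_head, h)

    def sample(self, obs):
        # Reparameterized draw: action = mean + std * standard normal noise.
        mean, logstd = self(obs)
        return [m + math.exp(s) * random.gauss(0, 1) for m, s in zip(mean, logstd)]

policy = GaussianPolicy(obs_dim=3, act_dim=2)
action = policy.sample([0.1, -0.2, 0.3])
```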
Dataset Splits | No | The paper describes evaluation procedures using "50 independent trajectories" and training for "5 trained policies" and "number of training iterations," but it does not specify explicit dataset splits (e.g., percentages or counts) for training, validation, or testing.
Hardware Specification | No | The paper mentions using the MuJoCo physics simulator and TensorFlow but does not provide any specific hardware details such as GPU or CPU models used for the experiments.
Software Dependencies | No | The paper mentions software like OpenAI Gym, MuJoCo, OpenAI Baselines, TensorFlow, the Adam optimizer, and SVGD, but it does not provide specific version numbers for these software components.
Experiment Setup | Yes | "For all tasks, neural networks with 2 hidden layers were used for all policy and discriminator networks, where 100 hidden units for each hidden layer and tanh activations are used. ... For the discriminator, the number of particles K was chosen to be 5. ... For training, we used uninformative prior and SVGD along with the Adam optimizer [Kingma and Ba, 2014] ... In addition, 5 inner loops were used for updating discriminator parameters, which corresponds to the inner loop from line 6 to line 11 in Algorithm 1."
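The SVGD update referenced in the setup moves K particles jointly: an attractive kernel-weighted gradient term pulls them toward high posterior density while a repulsive kernel-gradient term keeps the K = 5 particles diverse. This is a toy 1-D sketch under stated assumptions (standard SVGD with an RBF kernel of fixed bandwidth, a standard-normal stand-in target, and plain gradient steps rather than Adam); in BGAIL the particles would be discriminator parameter vectors and `grad_logp` the gradient of the unnormalized posterior.

```python
import math

def svgd_step(particles, grad_logp, eps=0.1, h=1.0):
    """One SVGD update over 1-D particles with RBF kernel k(x,y)=exp(-(x-y)^2/(2h)).
    phi(x_i) = (1/K) * sum_j [ k(x_j, x_i) * grad_logp(x_j) + d/dx_j k(x_j, x_i) ]
    The second (repulsive) term prevents the K particles from collapsing."""
    K = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = math.exp(-(xj - xi) ** 2 / (2 * h))
            dk = -(xj - xi) / h * k  # derivative of the kernel w.r.t. x_j
            phi += k * grad_logp(xj) + dk
        updated.append(xi + eps * phi / K)
    return updated

# Toy target: standard normal, so grad log p(x) = -x. Five particles, matching
# the paper's choice of K = 5 (the target here is purely illustrative).
particles = [-3.0, -1.0, 0.5, 2.0, 4.0]
for _ in range(200):
    particles = svgd_step(particles, lambda x: -x)
```

After the loop, the particles approximate the target: their mean drifts toward 0 while the repulsive term keeps them spread out rather than all collapsing to the mode.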