Model-Free Imitation Learning with Policy Optimization
Authors: Jonathan Ho, Jayesh Gupta, Stefano Ermon
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated our approach in a variety of scenarios: finite gridworlds of varying sizes, the continuous planar navigation task of Levine and Koltun (2012), a family of continuous environments of varying numbers of observation features (Karpathy, 2015), and a variation of Levine & Koltun's highway driving simulation, in which the agent receives high-dimensional egocentric observation features. |
| Researcher Affiliation | Academia | Jonathan Ho HOJ@CS.STANFORD.EDU Jayesh K. Gupta JKG@CS.STANFORD.EDU Stefano Ermon ERMON@CS.STANFORD.EDU Stanford University |
| Pseudocode | Yes | Algorithm 1 IM-REINFORCE, Algorithm 2 IM-TRPO |
| Open Source Code | No | The paper does not provide any specific statements or links regarding the availability of its source code. |
| Open Datasets | Yes | We evaluated our approach in a variety of scenarios: finite gridworlds of varying sizes, the continuous planar navigation task of Levine and Koltun (2012), a family of continuous environments of varying numbers of observation features (Karpathy, 2015)... Karpathy, Andrej. Reinforcejs: Waterworld demo, 2015. URL http://cs.stanford.edu/people/karpathy/reinforcejs/waterworld.html. |
| Dataset Splits | No | The paper does not provide specific training/validation/test dataset splits; it only describes how expert data was generated and used in the experiments. |
| Hardware Specification | No | The paper mentions 'On our system' but provides no specific hardware details such as CPU or GPU models, or memory specifications, which are necessary for reproducibility. |
| Software Dependencies | No | The paper refers to algorithms and model types but does not name the specific software libraries or versions used to implement them. |
| Experiment Setup | No | The paper states that 'Details on the environments and training methodology are in the supplement' and describes general policy construction (Gaussian action distributions, multi-layer perceptron) but does not provide specific hyperparameters like learning rate, batch size, or optimizer settings within the main text. |
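
Because the authors' code is not available (see the Open Source Code row) and the hyperparameters are deferred to the supplement, the snippet below is only a minimal, hypothetical sketch of the kind of Gaussian-MLP policy construction the Experiment Setup row describes. The class name, layer sizes, and fixed log standard deviation are illustrative assumptions, not values reported in the paper.

```python
import numpy as np


class GaussianMLPPolicy:
    """Sketch of a Gaussian policy whose mean is a one-hidden-layer MLP.

    The hidden width and state-independent log-std are placeholder choices
    for illustration only; the paper's actual settings are in its supplement.
    """

    def __init__(self, obs_dim, act_dim, hidden=64, log_std=-0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.1, size=(obs_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(scale=0.1, size=(hidden, act_dim))
        self.b2 = np.zeros(act_dim)
        self.log_std = np.full(act_dim, log_std)  # state-independent std

    def mean(self, obs):
        h = np.tanh(obs @ self.W1 + self.b1)  # hidden layer
        return h @ self.W2 + self.b2          # Gaussian mean

    def act(self, obs, rng=np.random.default_rng()):
        mu, std = self.mean(obs), np.exp(self.log_std)
        return mu + std * rng.standard_normal(mu.shape)  # sample an action

    def log_prob(self, obs, act):
        # Diagonal-Gaussian log-density, used by policy-gradient updates.
        mu, std = self.mean(obs), np.exp(self.log_std)
        z = (act - mu) / std
        return -0.5 * np.sum(z ** 2 + 2.0 * self.log_std + np.log(2.0 * np.pi))


# Usage example with made-up dimensions (10-D observations, 2-D actions).
policy = GaussianMLPPolicy(obs_dim=10, act_dim=2)
obs = np.zeros(10)
action = policy.act(obs)
print(action, policy.log_prob(obs, action))
```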