Imitation Learning from Video by Leveraging Proprioception

Authors: Faraz Torabi, Garrett Warnell, Peter Stone

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally test the proposed technique on several MuJoCo domains and show that it outperforms other imitation from observation algorithms by a large margin. In this section, we describe the experimental procedure by which we evaluated this hypothesis, and discuss the results.
Researcher Affiliation | Collaboration | Faraz Torabi¹, Garrett Warnell², and Peter Stone¹; ¹The University of Texas at Austin, ²Army Research Laboratory; {faraztrb, pstone}@cs.utexas.edu, garrett.a.warnell.civ@mail.mil
Pseudocode | Yes | Pseudocode and a diagrammatic representation of our proposed algorithm are presented in Algorithm 1 and Figure 1, respectively.
Open Source Code | No | The paper states: 'The considered domains, methods, and implementations are presented in more detail in the longer version of the paper on arXiv [Torabi et al., 2019c].' This refers to an arXiv preprint of this very paper, not a source code repository.
Open Datasets | Yes | We evaluated our method on a subset of the continuous control tasks available via OpenAI Gym [Brockman et al., 2016] and the MuJoCo simulator [Todorov et al., 2012]: MountainCarContinuous, InvertedPendulum, InvertedDoublePendulum, Hopper, Walker2d, HalfCheetah. After the expert agents were trained, we recorded 64×64, 30-fps video demonstrations of their behavior. (A hedged sketch of how such demonstrations could be recorded appears after the table.)
Dataset Splits | No | The paper mentions generating results using 'ten independent trials' and measuring performance over '1000 trajectories', but it does not specify explicit training, validation, or test splits for the demonstration data used in the imitation learning process.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions using 'OpenAI Gym', the 'MuJoCo simulator', and 'PPO', but does not specify version numbers for these or any other software dependencies.
Experiment Setup | Yes | To generate the demonstration data, we first trained expert agents using pure reinforcement learning (i.e., not from imitation). More specifically, we used proximal policy optimization (PPO) [Schulman et al., 2017] and the ground-truth reward function provided by OpenAI Gym. After the expert agents were trained, we recorded 64×64, 30-fps video demonstrations of their behavior. The results shown here were generated using ten independent trials, where each trial used a different random seed to initialize the environments, model parameters, etc. (A hedged sketch of this setup follows the table.)
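
The following is a minimal sketch of how the listed environments might be instantiated and how 64×64 video demonstrations could be recorded from a trained expert. The classic Gym API, the environment version suffixes (e.g., Hopper-v2), and the use of OpenCV for resizing are assumptions; the paper does not state these implementation details, and the effective frame rate depends on the simulator's control timestep.

```python
# Hedged sketch (not from the paper): instantiating the evaluation environments
# and recording 64x64 RGB video demonstrations from a trained expert.
# Assumptions: classic Gym API (gym<=0.25), -v0/-v2 environment suffixes,
# and OpenCV for resizing.
import gym
import cv2
import numpy as np

ENV_IDS = [
    "MountainCarContinuous-v0",
    "InvertedPendulum-v2",
    "InvertedDoublePendulum-v2",
    "Hopper-v2",
    "Walker2d-v2",
    "HalfCheetah-v2",
]

def record_video_demo(env_id, expert_policy, max_steps=1000):
    """Roll out an expert and return a (T, 64, 64, 3) array of video frames."""
    env = gym.make(env_id)
    obs = env.reset()
    frames = []
    for _ in range(max_steps):
        frame = env.render(mode="rgb_array")        # full-resolution RGB frame
        frames.append(cv2.resize(frame, (64, 64)))  # downsample to 64x64
        action = expert_policy(obs)                 # expert maps state -> action
        obs, _, done, _ = env.step(action)
        if done:
            break
    env.close()
    return np.asarray(frames)
```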
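
Similarly, a minimal sketch of the expert-training and ten-seed protocol described in the Experiment Setup row. The use of stable-baselines3 as the PPO implementation and the one-million-timestep budget are assumptions chosen for illustration; the paper only states that PPO was trained on the ground-truth Gym reward and that ten independently seeded trials were run.

```python
# Hedged sketch (not from the paper): training a PPO expert on the ground-truth
# Gym reward and repeating the run over ten independent random seeds.
# Assumptions: stable-baselines3 as the PPO implementation, classic Gym seeding
# API, and a 1M-timestep budget.
import gym
from stable_baselines3 import PPO

def run_trial(env_id, seed, total_timesteps=1_000_000):
    """One independent trial: seed the environment and model, then train PPO."""
    env = gym.make(env_id)
    env.seed(seed)                                   # classic Gym seeding API
    model = PPO("MlpPolicy", env, seed=seed, verbose=0)
    model.learn(total_timesteps=total_timesteps)
    return model

# Ten independent trials, each with a different random seed for the
# environment, model parameters, etc.
experts = [run_trial("Hopper-v2", seed) for seed in range(10)]
```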