Generative predecessor models for sample-efficient imitation learning

Authors: Yannick Schroecker, Mel Vecerik, Jonathan Scholz

ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We compare our approach to a state-of-the-art imitation learning method, showing that it outperforms or matches its performance on two simulated robot manipulation tasks and demonstrate significantly higher sample efficiency by applying the algorithm on a real robot."
Researcher Affiliation | Collaboration | Yannick Schroecker, College of Computing, Georgia Institute of Technology, Atlanta, USA (yannickschroecker@gatech.edu); Mel Vecerik & Jonathan Scholz, DeepMind, London, United Kingdom ({vec,jscholz}@google.com)
Pseudocode | Yes | Algorithm 1: Generative Predecessor Models for Imitation Learning (GPRIL). A hedged sketch of such a training loop appears after this table.
Open Source Code | No | The paper references an 'open source implementation of GAIL' and provides a link to a YouTube video. However, it does not explicitly state that the source code for the proposed GPRIL method is made available or provide a link to it.
Open Datasets | No | The paper states that expert demonstrations were recorded using tele-operation and kinesthetic teaching for both simulated and physical robot tasks, e.g., 'We record expert demonstrations using tele-operation...' and 'For each scenario, we provide 20 demonstrations using kinesthetic teaching...'. However, it does not provide any links, DOIs, repository names, or citations indicating public availability of these collected demonstration datasets.
Dataset Splits | No | The paper describes collecting 'training data using self-supervised roll-outs' and using 'expert demonstrations'. It mentions evaluating success rates over '100 roll-outs', but it does not provide explicit details on how the data is partitioned into training, validation, and test sets, nor does it specify exact percentages or sample counts for these splits.
Hardware Specification | No | The paper mentions using 'a single asynchronous simulation' and '16 parallel simulations', and conducting experiments on a 'physical pan-tilt unit' robot. However, it does not provide specific details about the hardware used for computation, such as CPU models, GPU models, or memory specifications.
Software Dependencies | No | The paper mentions using 'Adam' as an optimizer and comparing against 'GAIL' and 'behavioral cloning'. However, it does not provide specific version numbers for any software libraries, frameworks (e.g., TensorFlow, PyTorch), or the optimizer itself. (A hypothetical version-logging snippet follows the table.)
Experiment Setup | Yes | Appendix B provides detailed 'HYPERPARAMETERS' for GPRIL, GAIL, and Behavioral Cloning across different tasks. This includes specific values for 'Total iterations', 'Batch size B', 'γ', 'Replay memory size', 'N_B', 'N_π', 'Hidden layers', 'Optimizer', 'Learning rate', 'L2-regularization', 'min(σ_i)', 'Gradient clip', 'KL step size', 'Entropy regularization', and 'σ bounds'. (A hypothetical config layout follows the table.)
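
Since the paper's Algorithm 1 is not reproduced on this page, the following is a minimal, hypothetical sketch of a GPRIL-style training loop, assuming a simple (state, action) demonstration format. The environment interface, the `PredecessorModel` stub, and the `bc_update` placeholder are illustrative assumptions, not the authors' implementation; in the paper, the predecessor distributions are fit with autoregressive density models rather than a replay buffer.

```python
# Hypothetical sketch of a GPRIL-style loop: fit a model of predecessor
# state-action pairs on self-supervised roll-outs, then behaviorally clone
# both demonstrated pairs and model-sampled predecessor pairs.
import numpy as np

rng = np.random.default_rng(0)


def sample_rollout(policy, horizon=50, state_dim=4):
    """Self-supervised roll-out; stands in for the real environment."""
    states = rng.normal(size=(horizon, state_dim))
    actions = np.array([policy(s) for s in states])
    return states, actions


class PredecessorModel:
    """Stub for the generative model of predecessor state-action pairs.

    The paper fits autoregressive density models; this stub only stores
    observed pairs and resamples them so the sketch stays runnable.
    """

    def __init__(self):
        self.pairs = []

    def fit(self, states, actions):
        self.pairs.extend(zip(states, actions))

    def sample_predecessors(self, target_state, n):
        # Real GPRIL would sample (s, a) pairs likely to precede target_state.
        idx = rng.integers(len(self.pairs), size=n)
        return [self.pairs[i] for i in idx]


def bc_update(policy_params, batch):
    """Placeholder for a behavioral-cloning log-likelihood gradient step."""
    return policy_params


def gpril(demos, iterations=10, n_model_samples=8):
    policy_params = np.zeros(1)

    def policy(state):
        # Placeholder stochastic policy; a real one would be a neural network.
        return rng.normal(size=2)

    model = PredecessorModel()
    for _ in range(iterations):
        # (1) Fit the predecessor model on roll-outs from the current policy.
        states, actions = sample_rollout(policy)
        model.fit(states, actions)
        # (2) Clone demonstrated pairs plus sampled predecessor pairs.
        batch = list(demos)
        for demo_state, _ in demos:
            batch.extend(model.sample_predecessors(demo_state, n_model_samples))
        policy_params = bc_update(policy_params, batch)
    return policy_params


# Example: two dummy demonstration pairs.
demos = [(rng.normal(size=4), rng.normal(size=2)) for _ in range(2)]
gpril(demos)
```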
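
Because no software versions are pinned in the paper, a reproduction attempt would need to record its own environment. As a small, assumed-for-illustration aid (not anything the paper provides), a snippet like this logs the interpreter and library versions actually in use at run time:

```python
# Hypothetical reproducibility aid: log the interpreter and library versions
# in use, since the paper does not pin any software dependencies.
import platform
import sys

import numpy as np


def log_environment():
    print("python :", sys.version.split()[0])
    print("os     :", platform.platform())
    print("numpy  :", np.__version__)


log_environment()
```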
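
To show how the Appendix B hyperparameters could be organized for a reproduction attempt, here is a hypothetical config layout mirroring the names listed in the table row above. Every value is a placeholder, not a number reported in the paper.

```python
# Hypothetical configuration skeleton; all values are placeholders,
# not the paper's reported numbers.
GPRIL_CONFIG = {
    "total_iterations": 100_000,      # placeholder
    "batch_size_B": 64,               # placeholder
    "gamma": 0.99,                    # discount factor, placeholder
    "replay_memory_size": 1_000_000,  # placeholder
    "N_B": 16,                        # per-iteration model updates (assumed meaning)
    "N_pi": 4,                        # per-iteration policy updates (assumed meaning)
    "hidden_layers": (256, 256),      # placeholder
    "optimizer": "Adam",              # named in the paper, no version given
    "learning_rate": 1e-4,            # placeholder
    "l2_regularization": 1e-5,        # placeholder
    "min_sigma_i": 1e-3,              # lower bound on policy std, placeholder
    "gradient_clip": 10.0,            # placeholder
}

# GAIL-specific entries (KL step size, entropy regularization, sigma bounds)
# would live in a separate baseline config written in the same style.
```

Pinning a config like this alongside logged library versions is exactly the information the 'No' rows in the table above flag as missing.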