One-Shot Imitation Learning

Authors: Yan Duan, Marcin Andrychowicz, Bradly Stadie, Jonathan Ho, Jonas Schneider, Ilya Sutskever, Pieter Abbeel, Wojciech Zaremba

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that the use of soft attention allows the model to generalize to conditions and tasks unseen in the training data. We conduct experiments with the block stacking tasks described in Section 3.2. Fig. 3 shows the performance of various architectures. (An illustrative sketch of soft attention follows the table.)
Researcher Affiliation | Collaboration | Berkeley AI Research Lab and OpenAI (work done while at OpenAI); {rockyduan, jonathanho, pabbeel}@eecs.berkeley.edu; {marcin, bstadie, jonas, ilyasu, woj}@openai.com
Pseudocode | No | The paper describes the algorithm and architecture components but does not provide a formal pseudocode block or algorithm listing.
Open Source Code | No | The paper mentions "Videos of our experiments are available at http://bit.ly/nips2017-oneshot." This link points to videos, not to source code for the method, and no other statement about code release is present.
Open Datasets | No | The paper states: "Concretely, we collect 140 training tasks, and 43 test tasks, each with a different desired layout of the blocks." It describes its own collected dataset but does not state that the data are publicly available or provide any access information (link, citation, repository).
Dataset Splits | No | The paper states: "Concretely, we collect 140 training tasks, and 43 test tasks, each with a different desired layout of the blocks. We collect 1000 trajectories per task for training, and maintain a separate set of trajectories and initial configurations to be used for evaluation." It reports the counts of training and test tasks, but it specifies no validation set and gives no percentages or counts describing how trajectories are divided between training and evaluation, which limits reproducibility of the split.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running the experiments.
Software Dependencies | No | The paper mentions: "Across all experiments, we use Adamax [25] to perform the optimization with a learning rate of 0.001." While it names an optimizer, it does not provide version numbers for any programming languages, libraries, or frameworks (e.g., Python, TensorFlow, PyTorch, scikit-learn).
Experiment Setup | Yes | Across all experiments, we use Adamax [25] to perform the optimization with a learning rate of 0.001. In our experiments, we use p = 0.95, which reduces the length of demonstrations by a factor of 20. (A hedged configuration sketch follows the table.)
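
Because the paper itself includes no pseudocode (see the Pseudocode row), the snippet below is only a minimal sketch of generic dot-product soft attention over the timesteps of a demonstration, the mechanism the Research Type row credits with generalization to unseen tasks. The function name, tensor shapes, and scaling factor are assumptions for illustration, not the authors' exact architecture.

```python
import numpy as np

def soft_attention(query, keys, values):
    """Generic dot-product soft attention (illustrative only).

    query:  (d,)    current context vector
    keys:   (T, d)  one key per demonstration timestep
    values: (T, v)  one value per demonstration timestep
    Returns a (v,) weighted average of the values, with weights given
    by a softmax over the query-key similarity scores.
    """
    scores = keys @ query / np.sqrt(query.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values
```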
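
The Experiment Setup row pins down only the optimizer (Adamax), its learning rate (0.001), and the demonstration subsampling probability p = 0.95. The sketch below shows how those stated settings might be wired up; since the paper names no framework (see Software Dependencies), PyTorch, the stand-in policy network, and the subsampling helper are all assumptions.

```python
import torch


def temporal_subsample(demo, p=0.95):
    """Drop each demonstration timestep with probability p.
    With p = 0.95 the expected length shrinks by a factor of 20,
    matching the reduction quoted above. (Illustrative helper, not
    the authors' exact preprocessing code.)"""
    keep = torch.rand(len(demo)) >= p
    return [step for step, k in zip(demo, keep) if k]


# Hypothetical stand-in for the policy network; only the optimizer
# choice and the learning rate of 0.001 come from the paper.
policy = torch.nn.Linear(64, 8)
optimizer = torch.optim.Adamax(policy.parameters(), lr=0.001)
```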