One-Shot Imitation Learning
Authors: Yan Duan, Marcin Andrychowicz, Bradly Stadie, Jonathan Ho, Jonas Schneider, Ilya Sutskever, Pieter Abbeel, Wojciech Zaremba
NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that the use of soft attention allows the model to generalize to conditions and tasks unseen in the training data. We conduct experiments with the block stacking tasks described in Section 3.2. Fig. 3 shows the performance of various architectures. (A minimal soft-attention sketch is given after the table.) |
| Researcher Affiliation | Collaboration | Berkeley AI Research Lab, OpenAI ("Work done while at OpenAI"); {rockyduan, jonathanho, pabbeel}@eecs.berkeley.edu; {marcin, bstadie, jonas, ilyasu, woj}@openai.com |
| Pseudocode | No | The paper describes the algorithm and architecture components but does not provide a formal pseudocode block or algorithm listing. |
| Open Source Code | No | The paper mentions "Videos of our experiments are available at http://bit.ly/nips2017-oneshot." This link is for videos, not for source code of the methodology. No other statements about code release are present. |
| Open Datasets | No | The paper states: "Concretely, we collect 140 training tasks, and 43 test tasks, each with a different desired layout of the blocks." It describes its own collected dataset but does not state that it is publicly available or provide any access information (link, citation, repository). |
| Dataset Splits | No | The paper states: "Concretely, we collect 140 training tasks, and 43 test tasks, each with a different desired layout of the blocks. We collect 1000 trajectories per task for training, and maintain a separate set of trajectories and initial configurations to be used for evaluation." It gives task-level counts for training and test but mentions no validation set and reports no split percentages or trajectory counts beyond these task numbers. (The counts implied by these figures are worked out in a sketch after the table.) |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running the experiments. |
| Software Dependencies | No | The paper mentions: "Across all experiments, we use Adamax [25] to perform the optimization with a learning rate of 0.001." While it names an optimizer, it does not provide specific version numbers for any programming languages, libraries, or frameworks (e.g., Python, TensorFlow, PyTorch, scikit-learn). |
| Experiment Setup | Yes | Across all experiments, we use Adamax [25] to perform the optimization with a learning rate of 0.001. In our experiments, we use p = 0.95, which reduces the length of demonstrations by a factor of 20. (A hedged configuration sketch follows the table.) |
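The Research Type row cites the paper's finding that soft attention is what lets the model generalize to unseen conditions and tasks. The sketch below is a generic dot-product soft attention in NumPy, intended only to illustrate the mechanism; the function name `soft_attention` and the toy shapes are choices made here, and the paper's actual architecture (which attends over both the demonstration and the current state) is considerably richer.

```python
import numpy as np

def soft_attention(query, memory):
    """Generic dot-product soft attention over a memory of context vectors.

    query:  (d,)   -- e.g. an embedding of the current observation
    memory: (T, d) -- e.g. embeddings of demonstration timesteps

    Returns a convex combination of the memory rows, so the "read
    location" is differentiable and learned rather than hard-coded.
    """
    scores = memory @ query                  # (T,) similarity scores
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ memory                  # (d,) attended context vector

# Toy usage: attend over a 5-step "demonstration" of 4-dimensional embeddings.
rng = np.random.default_rng(0)
demo = rng.normal(size=(5, 4))
state_embedding = rng.normal(size=4)
context = soft_attention(state_embedding, demo)
print(context.shape)  # (4,)
```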
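The Open Datasets and Dataset Splits rows quote the only split information the paper reports: 140 training tasks, 43 test tasks, and 1000 trajectories per task for training. The short sketch below restates those numbers and derives the task-level split fraction and total trajectory count; the derived figures are arithmetic from the quoted counts, not values stated in the paper.

```python
# Split information as reported in the paper (task level only; no validation set).
train_tasks, test_tasks = 140, 43
trajectories_per_train_task = 1000

total_tasks = train_tasks + test_tasks                          # 183 tasks overall
train_fraction = train_tasks / total_tasks                      # ~0.765 of tasks are training tasks
train_trajectories = train_tasks * trajectories_per_train_task  # 140,000 training demonstrations

print(f"{train_fraction:.1%} train / {1 - train_fraction:.1%} test (by task)")
print(f"{train_trajectories} training trajectories in total")
```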
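The Experiment Setup row quotes two concrete hyperparameters: Adamax with a learning rate of 0.001, and demonstration downsampling with p = 0.95, which shortens demonstrations by a factor of 1 / (1 - p) = 20. The sketch below shows one plausible reading of that setup; the paper names no framework, so the use of PyTorch, the stand-in `policy` network, and its layer sizes are all assumptions made here for illustration.

```python
import random
import torch
import torch.nn as nn

# Hypothetical stand-in policy; the paper's actual architecture is far more involved.
policy = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 8))

# Optimizer as stated in the paper: Adamax with a learning rate of 0.001.
optimizer = torch.optim.Adamax(policy.parameters(), lr=0.001)

def downsample_demo(demo, p=0.95):
    """Randomly discard each timestep with probability p (keep with probability 1 - p).

    With p = 0.95 the expected kept fraction is 0.05, i.e. demonstrations shrink
    by a factor of 1 / (1 - p) = 20, matching the quoted statement. This is one
    plausible reading of the paper's downsampling, not its exact scheme.
    """
    kept = [step for step in demo if random.random() > p]
    return kept if kept else demo[:1]  # never return an empty demonstration

demo = list(range(1000))           # toy 1000-step demonstration
print(len(downsample_demo(demo)))  # ~50 timesteps on average
```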