Learning to Interactively Learn and Assist
Authors: Mark Woodward, Chelsea Finn, Karol Hausman (pp. 2535-2543)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through a series of experiments, we demonstrate the emergence of a variety of interactive learning behaviors, including information-sharing, information-seeking, and question-answering. |
| Researcher Affiliation | Industry | Mark Woodward, Chelsea Finn, Karol Hausman Google Brain, Mountain View {markwoodward, chelseaf, karolhausman}@google.com |
| Pseudocode | No | The paper describes the training procedure and models in prose and mathematical equations but does not include formal pseudocode blocks or algorithm listings. |
| Open Source Code | No | An interactive game and videos for all experiments are available at: https://interactive-learning.github.io. This link leads to a project website that states the code will be open-sourced later, rather than providing direct access to the code for the described methodology. |
| Open Datasets | No | The paper uses custom simulated grid-world environments and object gathering task domains, but does not mention the use of any publicly available datasets or provide access information for the data generated. |
| Dataset Splits | No | The paper describes using 100 test tasks 'not seen during training' but does not explicitly mention a separate validation set or provide details on how the data was split into train/validation/test sets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, TensorFlow x.x, PyTorch x.x). |
| Experiment Setup | Yes | The training batch size was 100 episodes, and the models were trained for 150,000 gradient steps (Experiments 1-3) or 40,000 gradient steps (Experiment 4). Table 1 gives the setup for each experiment, including GRID SHAPE, NUM. OBJECTS, OBSERVATIONS, and OBSERVATION WINDOW. Actions were chosen ϵ-greedily, and the hidden states of the recurrent LSTM cells were reset to 0 at the start of each episode. |
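The ϵ-greedy action selection and per-episode hidden-state reset noted in the setup row can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function and class names are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon):
    """With probability epsilon take a uniformly random action,
    otherwise take the action with the highest Q-value."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


class EpisodeRunner:
    """Sketch of the per-episode bookkeeping: the recurrent hidden
    state is zeroed at the start of every episode."""

    def __init__(self, hidden_size):
        self.hidden_size = hidden_size

    def reset_hidden(self):
        # LSTM cells carry a pair (h, c); both are reset to zero vectors
        # at episode start, so no memory leaks across episodes.
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        return h, c
```

With ϵ = 0 the selection is purely greedy; raising ϵ trades exploitation for exploration during training.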