Inverse Reinforcement Learning with Natural Language Goals

Authors: Li Zhou, Kevin Small | Pages: 11116-11124

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our algorithm outperforms multiple baselines by a large margin on a vision-based NL instruction following dataset (Room-2-Room), demonstrating a promising advance in enabling the use of NL instructions in specifying agent goals. We evaluate our model on the Room-2-Room (R2R) dataset (Anderson et al. 2018), a visually-grounded NL navigation task in realistic 3D indoor environments. We evaluate the model performance based on the trajectory success rate. The performance of our algorithm and baselines are shown in Figure 1, Table 1, and Table 2.
Researcher Affiliation | Industry | Li Zhou, Kevin Small Amazon Alexa {lizhouml, smakevin}@amazon.com
Pseudocode | Yes | Algorithm 1 Inverse Reinforcement Learning with Natural Language Goals (LangGoalIRL)
Open Source Code | No | The paper does not provide any statements about releasing code or links to a code repository.
Open Datasets | Yes | We evaluate our model on the Room-2-Room (R2R) dataset (Anderson et al. 2018), a visually-grounded NL navigation task in realistic 3D indoor environments. The dataset contains 7,189 routes sampled from 90 real world indoor environments.
Dataset Splits | Yes | The dataset is split into train (61 environments and 14,025 instructions), seen validation (61 environments same as train set, and 1,020 instructions), unseen validation (11 new environments and 2,349 instructions), and test (18 new environments and 4,173 instructions).
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., CPU/GPU models, memory).
Software Dependencies | No | The paper mentions using Soft Actor-Critic (SAC) and various network components like MLP, LSTM, and attention mechanisms, but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup | No | The main text states: "Appendix B contains details about model architecture and optimization." and "For implementation details of our algorithms and the baselines, please refer to Appendix B." Since Appendix B is not part of the provided text, the specific experimental setup details are not present in the main content.
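
The split figures quoted above can be cross-checked against the route and environment totals. The following is a minimal sketch of that arithmetic, not code from the paper: the names `SPLITS`, `total_instructions`, and `distinct_environments` are hypothetical, and the figure of three instructions per route is an assumption taken from the R2R dataset description (Anderson et al. 2018) rather than from the text above.

```python
# Consistency check on the R2R figures quoted in the summary above.
# All identifiers here are illustrative; none come from the paper.

TOTAL_ROUTES = 7_189        # "7,189 routes"
TOTAL_ENVIRONMENTS = 90     # "90 real world indoor environments"

# Assumption (not stated in the summary): each R2R route is annotated
# with three natural-language instructions.
INSTRUCTIONS_PER_ROUTE = 3

# split name -> (environments, instructions, reuses the train environments?)
SPLITS = {
    "train":      (61, 14_025, False),
    "val_seen":   (61, 1_020, True),   # same 61 environments as train
    "val_unseen": (11, 2_349, False),
    "test":       (18, 4_173, False),
}


def total_instructions(splits):
    """Sum the instruction counts over all splits."""
    return sum(n_instr for _, n_instr, _ in splits.values())


def distinct_environments(splits):
    """Count environments, skipping splits that reuse the train set's."""
    return sum(n_env for n_env, _, shared in splits.values() if not shared)


instructions = total_instructions(SPLITS)
environments = distinct_environments(SPLITS)

print("instructions across splits:", instructions)
print("routes x instructions per route:", TOTAL_ROUTES * INSTRUCTIONS_PER_ROUTE)
print("distinct environments:", environments)
```

Under these assumptions the four splits sum to 21,567 instructions, which equals 7,189 routes times three, and the train, unseen validation, and test environments sum to 90, so the quoted numbers are mutually consistent.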