Multi-Level Compositional Reasoning for Interactive Instruction Following

Authors: Suvaansh Bhambri, Byeonghwi Kim, Jonghyun Choi

AAAI 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "In our empirical evaluations with a long horizon instruction following task with the condition of not requiring additional depth supervision and perfect egomotion assumption, usually not available for real world deployment, we observe that MCR-Agent outperforms most prior arts in literature by large margins." |
| Researcher Affiliation | Academia | Yonsei University (suvaanshbhambri@gmail.com, byeonghwikim@yonsei.ac.kr, jc@yonsei.ac.kr) |
| Pseudocode | No | The paper describes the model architecture and components with equations but does not contain a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | "The code is available at https://github.com/yonseivnl/mcr-agent." |
| Open Datasets | Yes | "To evaluate our approach in challenging scenarios, we focus on the problem of interactive instruction following in the ALFRED benchmark (Shridhar et al. 2020), which poses numerous challenges including long-term planning, partial observability, and irreversible state changes." "The dataset is divided into three splits: train, validation, and test set." |
| Dataset Splits | Yes | "The dataset is divided into three splits: train, validation, and test set. To evaluate the generalisation ability of an embodied agent to novel environments, the benchmark further divides validation and test trajectories into seen and unseen splits." |
| Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory specifications, or cloud instance types) used for running the experiments are mentioned in the paper. |
| Software Dependencies | No | The paper mentions various models and architectures (e.g., Bi-LSTM, ResNet, a pretrained object detector) but does not provide version numbers for any software dependencies or libraries (e.g., PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | No | The paper describes the overall training process and some architectural decisions but does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed system-level training configurations. |