Online Continual Learning for Interactive Instruction Following Agents
Authors: Byeonghwi Kim, Minhyuk Seo, Jonghyun Choi
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the proposed Behavior-IL and Environment-IL setups, our simple CAMA outperforms prior state of the art in our empirical validations by noticeable margins. |
| Researcher Affiliation | Academia | ¹Yonsei University, ²Seoul National University |
| Pseudocode | Yes | For better understanding, we outline the high-level flow of our CAMA in Algorithm 1 in the appendix. |
| Open Source Code | Yes | The project page including codes is https://github.com/snumprlab/cl-alfred. |
| Open Datasets | Yes | As the ALFRED dataset (Shridhar et al., 2020) requires a comprehensive understanding of natural language and visual environments for intelligent agents, we build our continual benchmark on top of it. |
| Dataset Splits | Yes | For the train split, we subsample 3,141 episodes, leading to 12,564 episodes in total. For the validation seen split, we subsample 106 episodes, leading to 424 episodes in total. Finally, for the validation unseen split, we subsample 87 episodes, leading to 348 episodes in total. |
| Hardware Specification | No | The paper does not provide specific details such as GPU/CPU models, processor types, or memory amounts used for running the experiments. It only generally refers to 'edge devices' in the context of memory limitations. |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer' and specific schedulers ('Exponential LR', 'Reset LR') along with their parameters, but does not provide specific version numbers for any software libraries or frameworks (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | For our CAMA, we empirically set α_a = 0.99 and α_c = 0.99. We provide more implementation details such as hyperparameters in Sec. D.4 for space's sake. ... we use the Adam optimizer with an initial learning rate of 0.001 and a batch size of 32 per streamed sample. We utilize the Exponential LR (Li & Arora, 2019) and Reset LR (Loshchilov & Hutter, 2016) schedulers with γ = 0.95 and m = 10 for our CAMA and the baselines except CLIB with γ = 0.9999. |
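
The Experiment Setup row pins down the optimizer and learning-rate schedule used for CAMA and the baselines. The snippet below is a minimal PyTorch sketch of that configuration only; the placeholder model, the dummy loss, and the reading of "Reset LR" as restoring the initial learning rate every m = 10 steps are illustrative assumptions, not details taken from the paper, and CAMA's α_a and α_c coefficients are omitted since their use is described in the paper's Algorithm 1.

```python
# Minimal sketch of the quoted training configuration (Adam, lr 0.001, batch size 32,
# ExponentialLR with gamma = 0.95, and an LR reset every m = 10 steps).
# Assumptions not stated in the excerpt: the placeholder model, the dummy loss, and the
# interpretation of "Reset LR" as periodically restoring the initial learning rate.
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import ExponentialLR

INITIAL_LR = 1e-3        # "initial learning rate of 0.001"
BATCH_SIZE = 32          # "batch size of 32 per streamed sample"
GAMMA = 0.95             # exponential decay (the excerpt uses 0.9999 for CLIB)
RESET_PERIOD_M = 10      # m = 10 in the quoted setup

model = nn.Linear(512, 12)                       # placeholder network, not the actual agent
optimizer = optim.Adam(model.parameters(), lr=INITIAL_LR)
scheduler = ExponentialLR(optimizer, gamma=GAMMA)

for step in range(100):                          # stand-in for the streamed-sample loop
    batch = torch.randn(BATCH_SIZE, 512)         # dummy batch of replayed + streamed samples
    loss = model(batch).square().mean()          # placeholder loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                             # exponential decay each step

    if (step + 1) % RESET_PERIOD_M == 0:
        # Assumed "Reset LR" behavior: restore the initial rate and restart the decay.
        for group in optimizer.param_groups:
            group["lr"] = INITIAL_LR
        scheduler = ExponentialLR(optimizer, gamma=GAMMA)
```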