Online Continual Learning for Interactive Instruction Following Agents
Authors: Byeonghwi Kim, Minhyuk Seo, Jonghyun Choi
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the proposed Behavior-IL and Environment-IL setups, our simple CAMA outperforms prior state of the art in our empirical validations by noticeable margins. |
| Researcher Affiliation | Academia | ¹Yonsei University, ²Seoul National University |
| Pseudocode | Yes | For better understanding, we outline the high-level flow of our CAMA in Algorithm 1 in the appendix. |
| Open Source Code | Yes | The project page including codes is https://github.com/snumprlab/cl-alfred. |
| Open Datasets | Yes | As the ALFRED dataset (Shridhar et al., 2020) requires a comprehensive understanding of natural language and visual environments for intelligent agents, we build our continual benchmark on top of it. |
| Dataset Splits | Yes | For the train split, we subsample 3,141 episodes, leading to 12,564 episodes in total. For the validation seen split, we subsample 106 episodes, leading to 424 episodes in total. Finally, for the validation unseen split, we subsample 87 episodes, leading to 348 episodes in total. |
| Hardware Specification | No | The paper does not provide specific details such as GPU/CPU models, processor types, or memory amounts used for running the experiments. It only generally refers to 'edge devices' in the context of memory limitations. |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer' and specific schedulers ('Exponential LR', 'Reset LR') along with their parameters, but does not provide specific version numbers for any software libraries or frameworks (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | For our CAMA, we empirically set α_a = 0.99 and α_c = 0.99. We provide more implementation details such as hyperparameters in Sec. D.4 for space's sake. ... we use the Adam optimizer with an initial learning rate of 0.001 and a batch size of 32 per streamed sample. We utilize the Exponential LR (Li & Arora, 2019) and Reset LR (Loshchilov & Hutter, 2016) schedulers with γ = 0.95 and m = 10 for our CAMA and the baselines except CLIB with γ = 0.9999. |
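
The Experiment Setup row pins down the optimizer and learning-rate schedule used for CAMA and the baselines. The snippet below is a minimal PyTorch sketch of that configuration only; the placeholder model, the dummy loss, and the reading of "Reset LR" as restoring the initial learning rate every m = 10 steps are illustrative assumptions, not details taken from the paper, and CAMA's α_a and α_c coefficients are omitted since their use is described in the paper's Algorithm 1.

```python
# Minimal sketch of the quoted training configuration (Adam, lr 0.001, batch size 32,
# ExponentialLR with gamma = 0.95, and an LR reset every m = 10 steps).
# Assumptions not stated in the excerpt: the placeholder model, the dummy loss, and the
# interpretation of "Reset LR" as periodically restoring the initial learning rate.
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import ExponentialLR

INITIAL_LR = 1e-3        # "initial learning rate of 0.001"
BATCH_SIZE = 32          # "batch size of 32 per streamed sample"
GAMMA = 0.95             # exponential decay (the excerpt uses 0.9999 for CLIB)
RESET_PERIOD_M = 10      # m = 10 in the quoted setup

model = nn.Linear(512, 12)                       # placeholder network, not the actual agent
optimizer = optim.Adam(model.parameters(), lr=INITIAL_LR)
scheduler = ExponentialLR(optimizer, gamma=GAMMA)

for step in range(100):                          # stand-in for the streamed-sample loop
    batch = torch.randn(BATCH_SIZE, 512)         # dummy batch of replayed + streamed samples
    loss = model(batch).square().mean()          # placeholder loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                             # exponential decay each step

    if (step + 1) % RESET_PERIOD_M == 0:
        # Assumed "Reset LR" behavior: restore the initial rate and restart the decay.
        for group in optimizer.param_groups:
            group["lr"] = INITIAL_LR
        scheduler = ExponentialLR(optimizer, gamma=GAMMA)
```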