AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers

Authors: Jake Grigsby, Justin Sasek, Samyak Parajuli, Ikechukwu D. Adebi, Amy Zhang, Yuke Zhu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Large-scale comparisons in Meta-World ML45, Multi-Game Procgen, Multi-Task POPGym, Multi-Game Atari, and BabyAI find that this design unlocks significant progress in online multi-task adaptation and memory problems without explicit task labels."
Researcher Affiliation | Academia | Jake Grigsby, Justin Sasek, Samyak Parajuli, Daniel Adebi, Amy Zhang, Yuke Zhu; The University of Texas at Austin; equal contribution; {grigsby,yukez}@cs.utexas.edu
Pseudocode | No | The paper describes the methods using equations but does not include any clearly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code | Yes | "Code for the agent and multi-task environments used in our experiments is available on GitHub at UT-Austin-RPL/amago."
Open Datasets | Yes | "Comparisons on Meta-World ML45 [17], Multi-Task POPGym [27], Multi-Game Procgen [28], Multi-Game Atari [29], and Multi-Task BabyAI [30] evaluate the importance of scale-resistant updates."
Dataset Splits | No | The paper mentions generating train/test splits for datasets like BabyAI and Meta-World, but it does not provide the specific percentages or sample counts for training, validation, and test splits needed to reproduce the data partitioning rigorously.
Hardware Specification | Yes | "All of the results in this paper were completed on NVIDIA A5000 GPUs. We train each agent on one GPU whenever possible but add a second GPU for Procgen Memory-Hard (Figure 8) where model size and context length use all available memory."
Software Dependencies | No | The paper mentions various software components and techniques used, such as the AdamW optimizer [104], NormFormer [105], σReparam [106], the IMPALA CNN [107], DrQV2 [109], and LayerNorm [110]. However, it does not provide specific version numbers for any of these software dependencies.
Experiment Setup | Yes | "Table 1: Learning Hyperparameter Details" and "Table 2: Agent Architecture Details" in Appendix A provide specific values for hyperparameters and architectural configurations.