reproducibilityindex.ai

AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers

Authors: Jake Grigsby, Justin Sasek, Samyak Parajuli, Ikechukwu D. Adebi, Amy Zhang, Yuke Zhu

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Large-scale comparisons in Meta-World ML45, Multi-Game Procgen, Multi-Task POPGym, Multi-Game Atari, and BabyAI find that this design unlocks significant progress in online multi-task adaptation and memory problems without explicit task labels.
Researcher Affiliation	Academia	Jake Grigsby, Justin Sasek, Samyak Parajuli, Daniel Adebi, Amy Zhang, Yuke Zhu; The University of Texas at Austin; Equal contribution; {grigsby,yukez}@cs.utexas.edu
Pseudocode	No	The paper describes the methods using equations but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code	Yes	Code for the agent and multi-task environments used in our experiments is available on GitHub at UT-Austin-RPL/amago.
Open Datasets	Yes	Comparisons on Meta-World ML45 [17], Multi-Task POPGym [27], Multi-Game Procgen [28], Multi-Game Atari [29], and Multi-Task BabyAI [30] evaluate the importance of scale-resistant updates.
Dataset Splits	No	The paper mentions generating 'train/test' splits for datasets like BabyAI and Meta-World, but it does not provide the specific percentages or sample counts for training, validation, and test splits needed to reproduce the data partitioning rigorously.
Hardware Specification	Yes	All of the results in this paper were completed on NVIDIA A5000 GPUs. We train each agent on one GPU whenever possible but add a second GPU for Procgen Memory-Hard (Figure 8) where model size and context length use all available memory.
Software Dependencies	No	The paper mentions various software components and techniques, such as the AdamW optimizer [104], Normformer [105], σReparam [106], IMPALA CNN [107], DrQ-v2 [109], and LayerNorm [110]. However, it does not provide version numbers for any of these dependencies.
Experiment Setup	Yes	"Table 1: Learning Hyperparameter Details" and "Table 2: Agent Architecture Details" in Appendix A provide specific values for hyperparameters and architectural configurations.