AMAGO: Scalable In-Context Reinforcement Learning for Adaptive Agents

Authors: Jake Grigsby, Linxi Fan, Yuke Zhu

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our agent is scalable and applicable to a wide range of problems, and we demonstrate its strong performance empirically in meta-RL and long-term memory domains. AMAGO's focus on sparse rewards and off-policy data also allows in-context learning to extend to goal-conditioned problems with challenging exploration. Our experiments are divided into two parts.
Researcher Affiliation | Collaboration | Jake Grigsby (1), Linxi "Jim" Fan (2), Yuke Zhu (1); 1: The University of Texas at Austin, 2: NVIDIA Research
Pseudocode | Yes | Algorithm 1: Simplified Hindsight Instruction Relabeling (a hedged sketch of the general relabeling idea appears after this table).
Open Source Code | Yes | Our agent is open-source and specifically designed to be efficient, stable, and applicable to new environments with little tuning. Code is available here: https://ut-austin-rpl.github.io/amago/
Open Datasets | Yes | We empirically demonstrate its power and flexibility in existing meta-RL and memory benchmarks, including state-of-the-art results in the POPGym suite [30]... We evaluate AMAGO on two new benchmarks... before applying it to instruction-following tasks in the procedurally generated worlds of Crafter [33].
Dataset Splits | No | The paper mentions tuning AMAGO on one POPGym environment before applying it to the others, and evaluating on 'held-out test tasks' for Meta-World, but it does not specify explicit train/validation splits (e.g., 80/10/10 percentages) for any of the datasets needed to reproduce the experiments.
Hardware Specification | Yes | Each AMAGO agent is trained on one A5000 GPU. We compare training throughput in a common locomotion benchmark [97]... on a single NVIDIA A5000 GPU, with more details in Appendix D.
Software Dependencies | No | The paper notes that it is open-source and links to its code (which implicitly relies on Python and PyTorch), but it does not explicitly list software names with version numbers in the text for reproducibility.
Experiment Setup | Yes | AMAGO Hyperparameter Information: network architecture details for our main experimental domains are provided in Table 3, and Table 4 lists the hyperparameters for our RL training process. Many of AMAGO's details are designed to reduce hyperparameter sensitivity, and this allows us to use a consistent configuration across most experiments.
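
The Pseudocode row above cites Algorithm 1, a simplified hindsight instruction relabeling routine. As a rough, non-authoritative illustration of the general idea (not the paper's actual algorithm), the sketch below relabels a collected trajectory with an instruction built from goals the agent actually reached, so sparse rewards become informative even on failed attempts; the `Step` container, `relabel_trajectory` function, and its parameters are hypothetical names, not identifiers from the AMAGO codebase.

import random
from dataclasses import dataclass, replace
from typing import Any, List, Optional

# Hypothetical per-timestep record; field names are illustrative only.
@dataclass(frozen=True)
class Step:
    obs: Any
    action: Any
    achieved_goal: Optional[Any]   # goal (if any) completed at this timestep
    instruction: List[Any]         # sequence of goals the agent was asked to complete
    reward: float
    done: bool


def relabel_trajectory(traj: List[Step], max_goals: int = 2) -> List[Step]:
    """Hindsight-style relabeling sketch: swap the original instruction for a
    short sequence of goals the agent actually achieved, then recompute the
    sparse reward so the relabeled episode contains successes."""
    achieved = [s.achieved_goal for s in traj if s.achieved_goal is not None]
    if not achieved:
        return traj  # nothing was achieved; keep the original (failed) episode

    # Pick up to `max_goals` achieved goals, preserving their temporal order.
    k = min(max_goals, len(achieved))
    picked = sorted(random.sample(range(len(achieved)), k))
    new_instruction = [achieved[i] for i in picked]

    relabeled, remaining = [], list(new_instruction)
    for step in traj:
        hit = bool(remaining) and step.achieved_goal == remaining[0]
        if hit:
            remaining.pop(0)
        relabeled.append(
            replace(
                step,
                instruction=new_instruction,
                reward=1.0 if hit else 0.0,        # sparse reward per completed subgoal
                done=step.done or not remaining,   # episode ends once the new instruction is finished
            )
        )
        if not remaining:
            break
    return relabeled

Because this relabeling only rewrites stored trajectories, both the original and relabeled episodes can sit in an off-policy replay buffer, which is what makes hindsight relabeling a natural fit for the sparse-reward, goal-conditioned setting the paper targets.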