Learning to Cooperate with Humans using Generative Agents
Authors: Yancheng Liang, Daphne Chen, Abhishek Gupta, Simon S. Du, Natasha Jaques
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method Generative Agent Modeling for Multi-agent Adaptation (GAMMA) on Overcooked, a challenging cooperative cooking game that has become a standard benchmark for zero-shot coordination. We conduct an evaluation with real human teammates, and the results show that GAMMA consistently improves performance, whether the generative model is trained on simulated populations or human datasets. |
| Researcher Affiliation | Academia | Yancheng Liang, Daphne Chen, Abhishek Gupta, Simon S. Du*, Natasha Jaques* University of Washington {yancheng, daphc, abhgupta, ssdu, nj}@cs.washington.edu |
| Pseudocode | No | The paper describes the methodology textually and provides an overview diagram (Figure 2), but it does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | See our website for human-AI study videos and an interactive demo. The training code is also available. Our demo website is https://sites.google.com/view/human-ai-gamma-2024/ and contains the code and more experiment results. |
| Open Datasets | Yes | We evaluate GAMMA using the Overcooked environment [1] as a popular benchmark for prior work on human-AI cooperation [1, 24, 29, 36, 37]. For the human dataset in the original Overcooked paper [1], their open-sourced dataset contains 16 joint human-human trajectories for the Cramped Room environment, 17 for Asymmetric Advantages, 16 for Coordination Ring, 12 for Forced Coordination, and 15 for Counter Circuit, each with length T = 1200. |
| Dataset Splits | Yes | To train a VAE on it, the dataset is split into a training set with 70% of the data and a validation set with the remaining 30%. |
| Hardware Specification | Yes | We conducted our main experiments on clusters of AMD EPYC 64-Core Processor and NVIDIA A40/L40. |
| Software Dependencies | No | The paper mentions using PPO [25] and MAPPO [38] and provides hyperparameters in tables, but it does not specify software library names with their exact version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We provide information about the implementation details (Appendix B) and the hyperparameters used in our experiments (Appendix D) to help reproduce our results. Table 1: Hyperparameters for policy models and Table 2: Hyperparameters for VAE models (these tables list specific values for learning rate, batch size, epoch, etc.). Also, reward shaping for dish and soup pick-up is used for the first 100M steps to encourage exploration. |
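The 70/30 train/validation split reported for the VAE could be realized as a simple seeded shuffled partition of the joint trajectories. This is an illustrative sketch only; the function name, seeding scheme, and shuffle-before-split choice are assumptions, not details from the paper:

```python
import random

def split_trajectories(trajectories, train_frac=0.7, seed=0):
    """Partition a list of joint trajectories into train/validation subsets.

    Shuffles indices with a fixed seed (assumed here for reproducibility),
    then takes the first `train_frac` fraction as the training set.
    """
    rng = random.Random(seed)
    idx = list(range(len(trajectories)))
    rng.shuffle(idx)
    cut = int(train_frac * len(idx))
    train = [trajectories[i] for i in idx[:cut]]
    val = [trajectories[i] for i in idx[cut:]]
    return train, val

# e.g., the 16 Cramped Room trajectories would yield 11 train / 5 validation
train, val = split_trajectories(list(range(16)))
```

With 16 trajectories, `int(0.7 * 16)` gives 11 training and 5 validation trajectories; the actual split procedure used by the authors may differ (e.g., splitting at the timestep rather than trajectory level).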