Learning to Assist Humans without Inferring Rewards

Authors: Vivek Myers, Evan Ellis, Sergey Levine, Benjamin Eysenbach, Anca Dragan

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide a theoretical framework connecting our objective to prior work on empowerment and goal inference, and empirically show that agents trained with this objective can assist humans in the Overcooked environment [10] as well as more complex versions of the obstacle gridworld assistance benchmark proposed by Du et al. [6]. Section 5 (Experiments): We seek to answer two questions with our experiments. First, does our approach enable assistance in standard cooperation benchmarks? Second, does our approach scale to harder benchmarks where prior methods fail?
Researcher Affiliation | Academia | Vivek Myers¹, Evan Ellis¹, Sergey Levine¹, Benjamin Eysenbach², Anca Dragan¹; ¹UC Berkeley, ²Princeton University
Pseudocode | Yes | Algorithm 1: Empowerment via Successor Representations (ESR)
Open Source Code | Yes | Code: https://github.com/vivekmyers/empowerment_successor_representations
Open Datasets | Yes | Our experiments will use two benchmarks designed by prior work to study assistance: the obstacle gridworld [6] and Overcooked [10].
Dataset Splits | No | The paper uses reinforcement learning environments (obstacle gridworld, Overcooked) rather than traditional datasets with explicit train/validation/test splits. Performance is evaluated after training, but no specific validation split for data is mentioned.
Hardware Specification | Yes | We ran all our experiments on NVIDIA RTX A6000 GPUs with 48GB of memory within an internal cluster.
Software Dependencies | No | Our losses (Eqs. 10 and 13) were computed and optimized in JAX with Adam [61]. We used a hardware-accelerated version of the Overcooked environment from the JaxMARL package [62].
Experiment Setup | No | Specific hyperparameter values can be found in our code, which is available at https://github.com/vivekmyers/empowerment_successor_representations.