Learning to Assist Humans without Inferring Rewards
Authors: Vivek Myers, Evan Ellis, Sergey Levine, Benjamin Eysenbach, Anca Dragan
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide a theoretical framework connecting our objective to prior work on empowerment and goal inference, and empirically show that agents trained with this objective can assist humans in the Overcooked environment [10] as well as more complex versions of the obstacle gridworld assistance benchmark proposed by Du et al. [6]. (Section 5, Experiments) We seek to answer two questions with our experiments. First, does our approach enable assistance in standard cooperation benchmarks? Second, does our approach scale to harder benchmarks where prior methods fail? |
| Researcher Affiliation | Academia | Vivek Myers¹, Evan Ellis¹, Sergey Levine¹, Benjamin Eysenbach², Anca Dragan¹; ¹UC Berkeley, ²Princeton University |
| Pseudocode | Yes | Algorithm 1: Empowerment via Successor Representations (ESR) |
| Open Source Code | Yes | Code: https://github.com/vivekmyers/empowerment_successor_representations |
| Open Datasets | Yes | Our experiments will use two benchmarks designed by prior work to study assistance: the obstacle gridworld [6] and Overcooked [10]. |
| Dataset Splits | No | The paper uses reinforcement learning environments (obstacle gridworld, Overcooked) rather than traditional datasets with explicit train/validation/test splits. Performance is evaluated after training, but no specific validation split for data is mentioned. |
| Hardware Specification | Yes | We ran all our experiments on NVIDIA RTX A6000 GPUs with 48GB of memory within an internal cluster. |
| Software Dependencies | No | Our losses (Eqs. 10 and 13) were computed and optimized in JAX with Adam [61]. We used a hardware-accelerated version of the Overcooked environment from the JaxMARL package [62]. |
| Experiment Setup | No | Specific hyperparameter values can be found in our code, which is available at https://github.com/vivekmyers/empowerment_successor_representations. |