AvE: Assistance via Empowerment

Authors: Yuqing Du, Stas Tiomkin, Emre Kiciman, Daniel Polani, Pieter Abbeel, Anca Dragan

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test our approach against assistance based on goal inference, highlighting scenarios where our method overcomes failure modes stemming from goal ambiguity or misspecification. As existing methods for estimating empowerment in continuous domains are computationally hard, precluding its use in real-time learned assistance, we also propose an efficient empowerment-inspired proxy metric. Using this, we are able to successfully demonstrate our method in a shared autonomy user study for a challenging simulated teleoperation task with human-in-the-loop training.
Researcher Affiliation | Collaboration | Yuqing Du (UC Berkeley, yuqing_du@berkeley.edu); Stas Tiomkin (UC Berkeley, stas@berkeley.edu); Emre Kıcıman (Microsoft Research, emrek@microsoft.com); Daniel Polani (University of Hertfordshire, d.polani@herts.ac.uk); Pieter Abbeel (UC Berkeley, pabbeel@berkeley.edu); Anca Dragan (UC Berkeley, anca@berkeley.edu)
Pseudocode | Yes | Algorithm 1: Empowerment-inspired Diversity Bonus (a sketch of such a bonus appears after this table)
Open Source Code | Yes | For accompanying code, see https://github.com/yuqingd/ave.
Open Datasets | Yes | The main simulation used is Lunar Lander from OpenAI Gym [8] (a minimal usage sketch follows the table).
Dataset Splits | No | The paper describes training on '500 episodes', evaluating on '100 evaluation episodes', and '50 episodes' for human participants, but does not provide specific train/validation/test dataset splits with percentages or sample counts in the main text.
Hardware Specification | No | The paper states 'The copilots were trained on 500 episodes of max length 1000 steps on AWS EC2', but does not specify exact GPU/CPU models, processor types, or detailed machine specifications beyond the cloud service.
Software Dependencies | No | The paper mentions using 'DQN' and 'OpenAI Gym [8]' but does not provide version numbers for software dependencies such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | For the empowered copilots, we perform a hyperparameter sweep over c_emp from 0.00001 to 10.0 and find the best performance with c_emp = 0.001 (a sketch of such a sweep follows the table).
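
The Pseudocode entry above refers to the paper's Algorithm 1, an empowerment-inspired diversity bonus. Below is a minimal Python sketch of that idea, assuming the bonus is obtained by rolling out several random action sequences from the current state in a simulator and scoring how spread out the resulting final states are. The `reset_to` helper, the rollout parameters, and the mean-pairwise-distance scoring are illustrative assumptions, not the authors' exact implementation; see the linked repository for the real code.

```python
import numpy as np


def diversity_bonus(simulator, state, num_sequences=10, horizon=5):
    """Empowerment-inspired diversity bonus (illustrative sketch).

    Rolls out `num_sequences` random action sequences of length `horizon`
    from `state` and scores how spread out the resulting final states are.
    A larger spread suggests the human retains more control over future
    outcomes.
    """
    final_states = []
    for _ in range(num_sequences):
        # Hypothetical helper: clone/reset a simulator to the given state.
        sim = simulator.reset_to(state)
        s = np.asarray(state, dtype=float)
        for _ in range(horizon):
            action = sim.action_space.sample()  # random action sequence
            s, _, done, _ = sim.step(action)
            if done:
                break
        final_states.append(np.asarray(s, dtype=float))
    final_states = np.stack(final_states)
    # Mean pairwise Euclidean distance between reachable final states.
    diffs = final_states[:, None, :] - final_states[None, :, :]
    return float(np.mean(np.linalg.norm(diffs, axis=-1)))
```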
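The Open Datasets entry notes that the main simulation is Lunar Lander from OpenAI Gym. The snippet below is a minimal usage sketch with the classic (pre-0.26) Gym API current at the time of the paper; the paper's shared-autonomy experiments use a modified variant of this environment, so the `LunarLander-v2` ID and the random placeholder pilot are illustrative only.

```python
import gym

# Standard Lunar Lander from OpenAI Gym; the paper's shared-autonomy
# experiments use a modified variant, so this ID is only illustrative.
env = gym.make("LunarLander-v2")

obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # placeholder pilot: random actions
    obs, reward, done, info = env.step(action)
env.close()
```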
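The Experiment Setup entry reports a sweep over the bonus coefficient c_emp from 0.00001 to 10.0, with 0.001 performing best. The sketch below shows how such a sweep might be organized, assuming the copilot's reward is the environment reward plus c_emp times the diversity bonus; `train_and_evaluate` is a hypothetical placeholder, not the authors' training code.

```python
# Coefficients spanning the sweep range reported in the paper (0.00001 to 10.0).
C_EMP_VALUES = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0, 10.0]


def shaped_reward(env_reward, bonus, c_emp):
    """Task reward augmented by the empowerment-inspired bonus, weighted by c_emp."""
    return env_reward + c_emp * bonus


def train_and_evaluate(c_emp):
    """Placeholder: train a DQN copilot using `shaped_reward` for 500 episodes
    and return its mean return over 100 evaluation episodes."""
    raise NotImplementedError("training loop omitted in this sketch")


if __name__ == "__main__":
    scores = {c: train_and_evaluate(c) for c in C_EMP_VALUES}
    best = max(scores, key=scores.get)
    print("best c_emp:", best)  # the paper reports 0.001 as the best setting
```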