AvE: Assistance via Empowerment

Authors: Yuqing Du, Stas Tiomkin, Emre Kiciman, Daniel Polani, Pieter Abbeel, Anca Dragan

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test our approach against assistance based on goal inference, highlighting scenarios where our method overcomes failure modes stemming from goal ambiguity or misspecification. As existing methods for estimating empowerment in continuous domains are computationally hard, precluding its use in real-time learned assistance, we also propose an efficient empowerment-inspired proxy metric. Using this, we are able to successfully demonstrate our method in a shared autonomy user study for a challenging simulated teleoperation task with human-in-the-loop training.
Researcher Affiliation | Collaboration | Yuqing Du (UC Berkeley, yuqing_du@berkeley.edu); Stas Tiomkin (UC Berkeley, stas@berkeley.edu); Emre Kıcıman (Microsoft Research, emrek@microsoft.com); Daniel Polani (University of Hertfordshire, d.polani@herts.ac.uk); Pieter Abbeel (UC Berkeley, pabbeel@berkeley.edu); Anca Dragan (UC Berkeley, anca@berkeley.edu)
Pseudocode | Yes | Algorithm 1: Empowerment-inspired Diversity Bonus (a sketch of such a bonus appears after this table)
Open Source Code | Yes | For accompanying code, see https://github.com/yuqingd/ave.
Open Datasets | Yes | The main simulation used is Lunar Lander from OpenAI Gym [8] (a minimal usage sketch follows the table).
Dataset Splits | No | The paper describes training on '500 episodes', evaluating on '100 evaluation episodes', and '50 episodes' for human participants, but does not provide specific train/validation/test dataset splits with percentages or sample counts in the main text.
Hardware Specification | No | The paper states 'The copilots were trained on 500 episodes of max length 1000 steps on AWS EC2', but does not specify exact GPU/CPU models, processor types, or detailed machine specifications beyond the cloud service.
Software Dependencies | No | The paper mentions using 'DQN' and 'OpenAI Gym [8]' but does not provide version numbers for software dependencies such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | For the empowered copilots, we perform a hyperparameter sweep over c_emp from 0.00001 to 10.0 and find the best performance with c_emp = 0.001 (a sketch of such a sweep follows the table).
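
The Pseudocode entry above refers to the paper's Algorithm 1, an empowerment-inspired diversity bonus. Below is a minimal Python sketch of that idea, assuming the bonus is obtained by rolling out several random action sequences from the current state in a simulator and scoring how spread out the resulting final states are. The `reset_to` helper, the rollout parameters, and the mean-pairwise-distance scoring are illustrative assumptions, not the authors' exact implementation; see the linked repository for the real code.

```python
import numpy as np


def diversity_bonus(simulator, state, num_sequences=10, horizon=5):
    """Empowerment-inspired diversity bonus (illustrative sketch).

    Rolls out `num_sequences` random action sequences of length `horizon`
    from `state` and scores how spread out the resulting final states are.
    A larger spread suggests the human retains more control over future
    outcomes.
    """
    final_states = []
    for _ in range(num_sequences):
        # Hypothetical helper: clone/reset a simulator to the given state.
        sim = simulator.reset_to(state)
        s = np.asarray(state, dtype=float)
        for _ in range(horizon):
            action = sim.action_space.sample()  # random action sequence
            s, _, done, _ = sim.step(action)
            if done:
                break
        final_states.append(np.asarray(s, dtype=float))
    final_states = np.stack(final_states)
    # Mean pairwise Euclidean distance between reachable final states.
    diffs = final_states[:, None, :] - final_states[None, :, :]
    return float(np.mean(np.linalg.norm(diffs, axis=-1)))
```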
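The Open Datasets entry notes that the main simulation is Lunar Lander from OpenAI Gym. The snippet below is a minimal usage sketch with the classic (pre-0.26) Gym API current at the time of the paper; the paper's shared-autonomy experiments use a modified variant of this environment, so the `LunarLander-v2` ID and the random placeholder pilot are illustrative only.

```python
import gym

# Standard Lunar Lander from OpenAI Gym; the paper's shared-autonomy
# experiments use a modified variant, so this ID is only illustrative.
env = gym.make("LunarLander-v2")

obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # placeholder pilot: random actions
    obs, reward, done, info = env.step(action)
env.close()
```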
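The Experiment Setup entry reports a sweep over the bonus coefficient c_emp from 0.00001 to 10.0, with 0.001 performing best. The sketch below shows how such a sweep might be organized, assuming the copilot's reward is the environment reward plus c_emp times the diversity bonus; `train_and_evaluate` is a hypothetical placeholder, not the authors' training code.

```python
# Coefficients spanning the sweep range reported in the paper (0.00001 to 10.0).
C_EMP_VALUES = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0, 10.0]


def shaped_reward(env_reward, bonus, c_emp):
    """Task reward augmented by the empowerment-inspired bonus, weighted by c_emp."""
    return env_reward + c_emp * bonus


def train_and_evaluate(c_emp):
    """Placeholder: train a DQN copilot using `shaped_reward` for 500 episodes
    and return its mean return over 100 evaluation episodes."""
    raise NotImplementedError("training loop omitted in this sketch")


if __name__ == "__main__":
    scores = {c: train_and_evaluate(c) for c in C_EMP_VALUES}
    best = max(scores, key=scores.get)
    print("best c_emp:", best)  # the paper reports 0.001 as the best setting
```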