Explaining Reinforcement Learning to Mere Mortals: An Empirical Study

Authors: Andrew Anderson, Jonathan Dodge, Amrita Sadarangani, Zoe Juozapaitis, Evan Newman, Jed Irvine, Souti Chattopadhyay, Alan Fern, Margaret Burnett

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present a user study to investigate the impact of explanations on non-experts' understanding of reinforcement learning (RL) agents. We investigate both a common RL visualization, saliency maps (the focus of attention), and a more recent explanation type, reward-decomposition bars (predictions of future types of rewards). We designed a 124 participant, four-treatment experiment to compare participants' mental models of an RL agent in a simple Real-Time Strategy (RTS) game. Our results show that the combination of both saliency and reward bars were needed to achieve a statistically significant improvement in mental model score over the control.
Researcher Affiliation | Academia | Andrew Anderson, Jonathan Dodge, Amrita Sadarangani, Zoe Juozapaitis, Evan Newman, Jed Irvine, Souti Chattopadhyay, Alan Fern and Margaret Burnett, Oregon State University. {anderan2, dodgej, sadarana, juozapaz, newmanev, irvine, chattops, Alan.Fern, burnett}@eecs.oregonstate.edu
Pseudocode | No | The paper describes the SARSA algorithm and neural network representations, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The game and study materials/code are here (footnote 1): https://ir.library.oregonstate.edu/concern/datasets/tt44ps61c
Open Datasets | Yes | The paper mentions building their own game environment and provides a URL to what appears to be an institutional repository specifically for datasets: 'https://ir.library.oregonstate.edu/concern/datasets/tt44ps61c'. While the text refers to 'game and study materials/code', the URL path '/datasets/' suggests that the data from the user study, which serves as the dataset for their experiments, is likely accessible there.
Dataset Splits | No | The paper describes a user study with a 'between-subjects design' and 'four treatments' (Control, Saliency, Rewards, Everything). This is an experimental design for a user study, not a train/validation/test split of data for machine learning model evaluation, which is what the question implies.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU models, CPU types, memory) used for training the RL agent or running the user study experiments.
Software Dependencies | No | The paper mentions using a 'neural network representation' and the 'decomposed SARSA learning algorithm' but does not provide specific software names with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | The agent trained using the decomposed SARSA learning algorithm using a discount factor of 0.9, a learning rate of 0.1, with ϵ-greedy exploration (ϵ decayed from 0.9 to 0.1). It trained for 30,000 games, at which point it demonstrated high-quality actions.
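
To make the Experiment Setup row concrete, the following is a minimal sketch of a decomposed SARSA update using the reported hyperparameters (discount factor 0.9, learning rate 0.1, ϵ from 0.9 to 0.1 over 30,000 games). It is not the authors' implementation: the paper uses a neural network representation, whereas this sketch swaps in a lookup table for brevity. The linear shape of the ϵ decay, the class and function names, and the reward-type labels in the usage note are all assumptions made for illustration.

```python
import random
from collections import defaultdict

GAMMA = 0.9                     # discount factor (reported in the paper)
ALPHA = 0.1                     # learning rate (reported in the paper)
EPS_START, EPS_END = 0.9, 0.1   # exploration range (reported in the paper)
NUM_GAMES = 30_000              # number of training games (reported in the paper)


def epsilon_at(game, num_games=NUM_GAMES):
    """Exploration rate for a given game index.

    The paper only says epsilon decayed from 0.9 to 0.1; the linear
    schedule used here is an assumption.
    """
    frac = min(game / max(num_games - 1, 1), 1.0)
    return EPS_START + frac * (EPS_END - EPS_START)


class DecomposedSarsa:
    """Tabular stand-in (assumption) for the paper's neural-network agent."""

    def __init__(self, actions, reward_types):
        self.actions = list(actions)
        self.reward_types = list(reward_types)
        # One Q-table per reward type: q[c][(state, action)] -> value.
        self.q = {c: defaultdict(float) for c in self.reward_types}

    def total_q(self, state, action):
        # The overall action value is the sum of the per-type values.
        return sum(self.q[c][(state, action)] for c in self.reward_types)

    def act(self, state, epsilon, rng=random):
        # Epsilon-greedy over the summed Q-values.
        if rng.random() < epsilon:
            return rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.total_q(state, a))

    def update(self, s, a, rewards, s_next, a_next, done):
        # SARSA update applied to each reward type separately, all
        # bootstrapping from the same next action a_next.
        for c in self.reward_types:
            bootstrap = 0.0 if done else self.q[c][(s_next, a_next)]
            target = rewards.get(c, 0.0) + GAMMA * bootstrap
            self.q[c][(s, a)] += ALPHA * (target - self.q[c][(s, a)])
```

A usage sketch with hypothetical reward types: `agent = DecomposedSarsa(["attack_left", "attack_right"], ["damage", "kill"])`, then after each step call `agent.update(s, a, {"damage": 1.0}, s_next, a_next, done)`. The per-type values `agent.q[c][(s, a)]` are the kind of quantities that reward-decomposition bars would display, while `total_q` drives action selection.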