Explaining Reinforcement Learning to Mere Mortals: An Empirical Study
Authors: Andrew Anderson, Jonathan Dodge, Amrita Sadarangani, Zoe Juozapaitis, Evan Newman, Jed Irvine, Souti Chattopadhyay, Alan Fern, Margaret Burnett
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present a user study to investigate the impact of explanations on non-experts' understanding of reinforcement learning (RL) agents. We investigate both a common RL visualization, saliency maps (the focus of attention), and a more recent explanation type, reward-decomposition bars (predictions of future types of rewards). We designed a 124-participant, four-treatment experiment to compare participants' mental models of an RL agent in a simple Real-Time Strategy (RTS) game. Our results show that the combination of both saliency and reward bars was needed to achieve a statistically significant improvement in mental model score over the control. |
| Researcher Affiliation | Academia | Andrew Anderson, Jonathan Dodge, Amrita Sadarangani, Zoe Juozapaitis, Evan Newman, Jed Irvine, Souti Chattopadhyay, Alan Fern and Margaret Burnett. Oregon State University. {anderan2, dodgej, sadarana, juozapaz, newmanev, irvine, chattops, Alan.Fern, burnett}@eecs.oregonstate.edu |
| Pseudocode | No | The paper describes the SARSA algorithm and neural network representations, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The game and study materials/code are available at https://ir.library.oregonstate.edu/concern/datasets/tt44ps61c |
| Open Datasets | Yes | The paper mentions building their own game environment and provides a URL to what appears to be an institutional repository specifically for datasets: 'https://ir.library.oregonstate.edu/concern/datasets/tt44ps61c'. While the text refers to 'game and study materials/code', the URL path '/datasets/' suggests that the data from the user study, which serves as the dataset for their experiments, is likely accessible there. |
| Dataset Splits | No | The paper describes a user study with a 'between-subjects design' and 'four treatments' (Control, Saliency, Rewards, Everything). This is an experimental design for a user study, not a train/validation/test split of data for machine learning model evaluation, which is what the question implies. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU models, CPU types, memory) used for training the RL agent or running the user study experiments. |
| Software Dependencies | No | The paper mentions using a 'neural network representation' and the 'decomposed SARSA learning algorithm' but does not provide specific software names with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | The agent was trained with the decomposed SARSA learning algorithm, using a discount factor of 0.9, a learning rate of 0.1, and ϵ-greedy exploration (ϵ decayed from 0.9 to 0.1). It trained for 30,000 games, at which point it demonstrated high-quality actions. (An illustrative sketch of this setup follows the table.) |
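The paper reports these hyperparameters but includes no pseudocode (see the Pseudocode row above). For orientation, here is a minimal Python sketch of a reward-decomposed SARSA update under the reported settings. It assumes a tabular Q-function, a linear ϵ decay, and made-up reward component names; the paper's agent uses a neural network representation, so this illustrates the update rule, not the authors' implementation.

```python
import random
from collections import defaultdict

# Hyperparameters reported in the paper's Experiment Setup.
GAMMA = 0.9                    # discount factor
ALPHA = 0.1                    # learning rate
EPS_START, EPS_END = 0.9, 0.1  # epsilon-greedy decay endpoints
NUM_GAMES = 30_000             # training duration

# Hypothetical reward component names; the paper decomposes reward into
# types but these labels are illustrative, not taken from the paper.
COMPONENTS = ["enemy_destroyed", "damage_taken", "town_damage"]

# One Q-table per reward component: Q[c][(state, action)] -> value.
# (The paper uses a neural network here; tabular is an assumption.)
Q = {c: defaultdict(float) for c in COMPONENTS}

def total_q(state, action):
    """Overall action value is the sum of the per-component values."""
    return sum(Q[c][(state, action)] for c in COMPONENTS)

def epsilon(game_index):
    """Linear decay from 0.9 to 0.1 over training; the paper states the
    endpoints but not the schedule shape, so linear is an assumption."""
    frac = min(game_index / NUM_GAMES, 1.0)
    return EPS_START + frac * (EPS_END - EPS_START)

def epsilon_greedy(state, actions, eps):
    """Pick a random action with probability eps, else the greedy one
    with respect to the summed (total) Q-value."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: total_q(state, a))

def sarsa_update(state, action, rewards, next_state, next_action, done):
    """Decomposed SARSA: each component's Q-table is updated on its own
    reward channel with the standard on-policy SARSA target."""
    for c in COMPONENTS:
        bootstrap = 0.0 if done else GAMMA * Q[c][(next_state, next_action)]
        Q[c][(state, action)] += ALPHA * (rewards[c] + bootstrap - Q[c][(state, action)])
```

The per-component sum in `total_q` is what makes the reward-decomposition bars possible: each bar shown to participants corresponds to one component's predicted future reward, and the agent's greedy action maximizes their sum.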