Explaining Reinforcement Learning with Shapley Values

Authors: Daniel Beechey, Thomas M. S. Smith, Özgür Şimşek

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present a theoretical analysis of explaining reinforcement learning using Shapley values, following a principled approach from game theory for identifying the contribution of individual players to the outcome of a cooperative game. We call this general framework Shapley Values for Explaining Reinforcement Learning (SVERL). Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning. We then develop an approach that uses Shapley values to explain agent performance. In a variety of domains, SVERL produces meaningful explanations that match and supplement human intuition." and "We present experimental results in a variety of domains. We contrast SVERL-P with applying Shapley values to policies and to value functions, demonstrating the limitations of the latter approaches." (An illustrative Shapley value computation appears after this table.)
Researcher Affiliation | Academia | Daniel Beechey, Thomas M. S. Smith, Özgür Şimşek; Department of Computer Science, University of Bath, UK. Correspondence to: Daniel Beechey <djeb20@bath.ac.uk>.
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | Code is available at https://github.com/bath-reinforcement-learning-lab/SVERL_icml_2023.
Open Datasets | Yes | "Taxi is a classic reinforcement learning domain by Dietterich (1998). We used the implementation by OpenAI Gym (Brockman et al., 2016)."
Dataset Splits | No | The paper describes experiments in reinforcement learning environments but provides no train/validation/test splits; such splits are typical of supervised learning, whereas reinforcement learning agents interact directly with the environment rather than with static datasets.
Hardware Specification | No | The paper only acknowledges compute resources ("This research made use of Hex, the GPU Cloud in the Department of Computer Science at the University of Bath") without giving specific hardware details such as GPU or CPU models.
Software Dependencies | No | The paper mentions using OpenAI Gym for the Taxi domain, but provides no version numbers for Gym or any other software dependencies.
Experiment Setup | Yes | "Gridworld-A, shown in Figure 1a, is a deterministic gridworld... The reward is -1 for every action taken and an additional +10 for transitioning into a goal state... The discount factor γ is 1." "Taxi is a classic reinforcement learning domain by Dietterich (1998)... Rewards are -1 for all actions, an additional +20 for dropping a passenger at the correct destination, and an additional -10 for attempting to pick up or drop off the passenger at an inappropriate location." "Minesweeper... There is only one reward signal: -20 whenever the agent reveals a mine." (A minimal Gym interaction sketch appears after this table.)
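
The Research Type row describes SVERL's game-theoretic approach: state features are treated as players whose Shapley values attribute an agent's behaviour or performance to individual features. As a hedged illustration only (not the paper's implementation), the sketch below enumerates all feature coalitions with the standard Shapley weighting; the name `characteristic_fn` and the toy payoff table are assumptions, since SVERL defines its own characteristic functions for policies, value functions, and performance.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, characteristic_fn):
    """Exact Shapley values over a small set of state features.

    `characteristic_fn(coalition)` should return the payoff obtained when
    only the features in `coalition` are known (for example, the agent's
    expected return at a state with the remaining features marginalised).
    """
    n = len(features)
    contributions = {i: 0.0 for i in features}
    for i in features:
        others = [f for f in features if f != i]
        for size in range(n):  # coalitions of the other features, size 0..n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                with_i = characteristic_fn(frozenset(coalition) | {i})
                without_i = characteristic_fn(frozenset(coalition))
                contributions[i] += weight * (with_i - without_i)
    return contributions


# Hypothetical usage: two features and a toy payoff table.
payoffs = {frozenset(): 0.0, frozenset({"row"}): 1.0,
           frozenset({"col"}): 2.0, frozenset({"row", "col"}): 5.0}
print(shapley_values(["row", "col"], lambda c: payoffs[frozenset(c)]))
# -> {'row': 2.0, 'col': 3.0}; the contributions sum to the full-coalition payoff.
```

Exact enumeration is exponential in the number of features, which is practical only for small state descriptions such as those in the paper's gridworld and Taxi experiments.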
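
The Experiment Setup row quotes the Taxi reward structure as implemented in OpenAI Gym. The paper does not state its Gym version or environment ID, so the "Taxi-v3" identifier and the 4-tuple `step` signature in this minimal sketch are assumptions tied to older Gym releases.

```python
import gym  # OpenAI Gym (Brockman et al., 2016); API details vary by release

env = gym.make("Taxi-v3")  # Dietterich's Taxi domain as packaged by Gym
state = env.reset()        # newer Gym releases return (state, info) instead

done, total_reward = False, 0
while not done:
    action = env.action_space.sample()  # stand-in for the trained policy
    # Older Gym releases return a 4-tuple; newer ones add a `truncated` flag.
    state, reward, done, info = env.step(action)
    total_reward += reward  # -1 per step, +20 correct drop-off, -10 illegal pick-up/drop-off
print("Episode return:", total_reward)
```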