Explaining Reinforcement Learning with Shapley Values
Authors: Daniel Beechey, Thomas M. S. Smith, Özgür Şimşek
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We present a theoretical analysis of explaining reinforcement learning using Shapley values, following a principled approach from game theory for identifying the contribution of individual players to the outcome of a cooperative game. We call this general framework Shapley Values for Explaining Reinforcement Learning (SVERL). Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning. We then develop an approach that uses Shapley values to explain agent performance. In a variety of domains, SVERL produces meaningful explanations that match and supplement human intuition." and "We present experimental results in a variety of domains. We contrast SVERL-P with applying Shapley values to policies and to value functions, demonstrating the limitations of the latter approaches." |
| Researcher Affiliation | Academia | Daniel Beechey¹, Thomas M. S. Smith¹, Özgür Şimşek¹. ¹Department of Computer Science, University of Bath, UK. Correspondence to: Daniel Beechey <djeb20@bath.ac.uk>. |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Code is available at https://github.com/bath-reinforcement-learning-lab/SVERL_icml_2023. |
| Open Datasets | Yes | Taxi is a classic reinforcement learning domain by Dietterich (1998). We used the implementation by OpenAI Gym (Brockman et al., 2016). |
| Dataset Splits | No | The paper describes experiments in reinforcement learning environments but does not provide specific train/validation/test dataset splits, which are more typical for supervised learning. Reinforcement learning experiments typically involve agents interacting directly with environments rather than static data splits. |
| Hardware Specification | No | The paper acknowledges compute resources ("This research made use of Hex, the GPU Cloud in the Department of Computer Science at the University of Bath") but does not specify the hardware itself, such as GPU or CPU models or memory. |
| Software Dependencies | No | The paper mentions using OpenAI Gym for the Taxi domain, but no specific version number for OpenAI Gym or any other software dependency is provided. |
| Experiment Setup | Yes | Gridworld-A, shown in Figure 1a, is a deterministic gridworld... The reward is −1 for every action taken and an additional +10 for transitioning into a goal state... The discount factor γ is 1. Taxi is a classic reinforcement learning domain by Dietterich (1998)... Rewards are −1 for all actions, an additional +20 for dropping a passenger at the correct destination, and an additional −10 for attempting to pick up or drop off the passenger at an inappropriate location. Minesweeper... There is only one reward signal: −20 whenever the agent reveals a mine. |
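The Research Type row describes the core idea of SVERL: treating state features as players in a cooperative game and attributing the outcome to them with Shapley values. The following is a minimal sketch of exact Shapley-value computation for a toy characteristic function; it is not the authors' released implementation, and the characteristic function `v` is a hypothetical stand-in (in SVERL, `v(S)` would be derived from agent behaviour or performance when only the features in `S` are observed).

```python
# Minimal sketch of exact Shapley values for a toy cooperative game.
# Not the SVERL implementation; v below is a hypothetical stand-in.
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """phi_i = sum over coalitions S not containing i of
    |S|! (n-|S|-1)! / n! * (v(S ∪ {i}) - v(S))."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (v(frozenset(S) | {i}) - v(frozenset(S)))
        phi[i] = total
    return phi

# Hypothetical characteristic function over three "features".
def v(coalition):
    table = {frozenset(): 0.0, frozenset({"x"}): 4.0, frozenset({"y"}): 2.0,
             frozenset({"z"}): 1.0, frozenset({"x", "y"}): 7.0,
             frozenset({"x", "z"}): 5.0, frozenset({"y", "z"}): 3.0,
             frozenset({"x", "y", "z"}): 9.0}
    return table[frozenset(coalition)]

print(shapley_values(["x", "y", "z"], v))  # contributions sum to v(N) - v(∅) = 9.0
```

By the efficiency property of Shapley values, the printed contributions sum to the value of the full coalition, which is what lets SVERL-style explanations decompose agent performance across features.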
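The Open Datasets and Software Dependencies rows note that the Taxi domain comes from OpenAI Gym without a pinned version. A minimal sketch of loading that environment is shown below; it assumes the classic (pre-0.26) Gym step interface and the `Taxi-v3` environment ID, neither of which is confirmed by this report, and the random policy is a placeholder for a trained agent.

```python
# Minimal sketch, assuming classic OpenAI Gym (pre-0.26 API) and the Taxi-v3 ID.
import gym

env = gym.make("Taxi-v3")                 # classic Taxi domain (Dietterich)
state = env.reset()
done = False
total_reward = 0
while not done:
    action = env.action_space.sample()    # placeholder for the trained policy
    state, reward, done, info = env.step(action)
    total_reward += reward                # -1 per step, +20 correct drop-off, -10 illegal pickup/drop-off
print(total_reward)
```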