Explainable Reinforcement Learning through a Causal Lens

Authors: Prashan Madumal, Tim Miller, Liz Sonenberg, Frank Vetere

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We computationally evaluate the model in 6 domains and measure performance and task prediction accuracy. We report on a study with 120 participants who observe agents playing a real-time strategy game (Starcraft II) and then receive explanations of the agents' behaviour.
Researcher Affiliation | Academia | Prashan Madumal, Tim Miller, Liz Sonenberg, Frank Vetere (University of Melbourne, Victoria, Australia); pmathugama@student.unimelb.edu.au, {tmiller, l.sonenberg, f.vetere}@unimelb.edu.au
Pseudocode | Yes | Algorithm 1 Task Prediction: Action Influence Model. Input: trained regression models L, current state St. Output: predicted action a. (A runnable sketch of this algorithm appears after the table.)
Open Source Code | No | No explicit statement or link providing access to the authors' open-source code for the methodology was found in the paper.
Open Datasets | Yes | We evaluate action influence models in 5 OpenAI RL benchmark domains (Brockman et al. 2016) and in the Starcraft II domain. (See the environment-loading sketch below.)
Dataset Splits | No | The paper mentions training phases for RL agents and structural equations ('training phase of the RL agent', 'time taken to train the structural equations'), but does not specify explicit train/validation/test dataset splits (e.g., percentages or counts) for the data used in these processes.
Hardware Specification | No | No specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running the experiments were provided in the paper. General computing environments were not specified either.
Software Dependencies | No | The paper mentions types of regression learners (linear SGD regression, decision tree regression, multilayer perceptron regression) and various RL algorithms (PG, DQN, SARSA, DDQN, PPO, A2C), but does not provide specific software dependencies with version numbers. (The three learner types are illustrated in a sketch after the table.)
Experiment Setup | No | The paper describes the general process of learning structural equations, including using experience replay and updating equations with regression learners on mini-batches, but it does not provide concrete experimental setup details such as hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or optimizer settings for these models. (A replay-buffer training sketch follows.)
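
The Pseudocode row quotes only the header of the paper's Algorithm 1. Below is a minimal Python sketch of the task-prediction idea, under stated assumptions: each (action, variable) pair has a trained regression model serving as a structural equation, the causal graph is iterated in an order where parents precede children, and the predicted action is the one whose simulated effects maximise a goal (reward) variable. The function name, data structures, and scoring rule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def predict_action(models, causal_graph, state, actions, goal_var):
    """Hypothetical task prediction with an action influence model.

    models: dict mapping (action, variable) -> trained regressor whose
            .predict takes the variable's parent values and returns the
            variable's next value (the learned structural equation).
    causal_graph: dict mapping variable -> list of parent variables,
            assumed ordered so that parents precede children.
    state: dict mapping variable -> current value.
    actions: iterable of candidate actions.
    goal_var: reward/goal node used to score outcomes (an assumption
            about the scoring rule, not the paper's exact criterion).
    """
    best_action, best_score = None, -np.inf
    for action in actions:
        sim = dict(state)
        # Evaluate each structural equation this action influences,
        # propagating predicted values through the causal graph.
        for var, parents in causal_graph.items():
            if (action, var) in models:
                x = np.array([[sim[p] for p in parents]])
                sim[var] = float(models[(action, var)].predict(x)[0])
        score = sim.get(goal_var, -np.inf)
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```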
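
The Open Datasets row refers to 5 OpenAI Gym benchmark domains. As a sketch of what such a setup typically looks like with the classic (pre-0.26) Gym API, the snippet below loads and steps a few environments; the environment IDs are examples, not necessarily the five domains the paper used.

```python
import gym

# Illustrative environment IDs only; the paper evaluates in 5 OpenAI
# Gym benchmark domains but this snippet does not claim to list them.
for env_id in ["CartPole-v1", "MountainCar-v0", "Taxi-v3"]:
    env = gym.make(env_id)
    state = env.reset()
    done, steps = False, 0
    while not done and steps < 10:
        # Random actions, just to exercise the step interface.
        state, reward, done, info = env.step(env.action_space.sample())
        steps += 1
    env.close()
```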
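
The Software Dependencies row names three regression learner families without versions or settings. The sketch below instantiates those families with scikit-learn; the constructor arguments are placeholders, since the paper reports no hyperparameter values.

```python
from sklearn.linear_model import SGDRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.neural_network import MLPRegressor

# One candidate learner per family named in the paper. All settings
# shown here are assumptions, not the authors' reported choices.
learners = {
    "linear_sgd": SGDRegressor(max_iter=1000, tol=1e-3),
    "decision_tree": DecisionTreeRegressor(max_depth=5),
    "mlp": MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=500),
}
```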
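
The Experiment Setup row notes that structural equations are updated from experience replay with mini-batches, without concrete values. A hedged sketch of that training loop for a single structural equation follows; the buffer size, batch size, and regressor settings are assumptions, as the paper does not report them.

```python
import random
from collections import deque
import numpy as np
from sklearn.linear_model import SGDRegressor

# Replay buffer of (parent values, child value) pairs for one equation.
replay = deque(maxlen=10_000)  # capacity is an assumed placeholder
equation = SGDRegressor()

def store(parent_values, child_value):
    """Record one observed transition for this structural equation."""
    replay.append((np.asarray(parent_values, dtype=float), child_value))

def update(batch_size=32):  # batch size is an assumed placeholder
    """Incrementally refit the equation on a sampled mini-batch."""
    if len(replay) < batch_size:
        return
    batch = random.sample(list(replay), batch_size)
    X = np.stack([x for x, _ in batch])
    y = np.array([y for _, y in batch])
    equation.partial_fit(X, y)  # online mini-batch update
```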