Counterfactual Explanations in Sequential Decision Making Under Uncertainty
Authors: Stratis Tsirtsis, Abir De, Manuel Gomez-Rodriguez
NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our algorithm using both synthetic and real data from cognitive behavioral therapy and show that the counterfactual explanations our algorithm finds can provide valuable insights to enhance sequential decision making under uncertainty. |
| Researcher Affiliation | Academia | Stratis Tsirtsis, MPI-SWS, stsirtsis@mpi-sws.org; Abir De, IIT Bombay, abir@cse.iitb.ac.in; Manuel Gomez-Rodriguez, MPI-SWS, manuelgr@mpi-sws.org |
| Pseudocode | Yes | ALGORITHM 1: It samples a counterfactual explanation from the counterfactual policy π and ALGORITHM 2: It returns the optimal counterfactual policy and its average counterfactual outcome |
| Open Source Code | Yes | Our code is accessible at https://github.com/Networks-Learning/counterfactual-explanations-mdp. |
| Open Datasets | Yes | We use anonymized data from a clinical trial comparing the efficacy of hypnotherapy and cognitive behavioral therapy [25] for the treatment of patients with mild to moderate symptoms of major depression. ... [25] Kristina Fuhr, Cornelie Schweizer, Christoph Meisner, and Anil Batra. Efficacy of hypnotherapy compared to cognitive-behavioural therapy for mild-to-moderate depression: study protocol of a randomised-controlled rater-blind trial (wiki-d). BMJ open, 7(11):e016978, 2017. |
| Dataset Splits | No | The paper describes using "realizations" of synthetic data and "patient data" for evaluation, but does not specify explicit training, validation, or test dataset splits in the conventional machine learning sense for model training/evaluation. |
| Hardware Specification | Yes | All experiments were performed on a machine equipped with 48 Intel(R) Xeon(R) 3.00GHz CPU cores and 1.5TB memory. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | Experimental setup. We characterize the synthetic decision making process using an MDP with states S = {0, ..., n − 1} and actions A = {0, ..., m − 1}, where n = 20 and m = 10, and time horizon T = 20. ... To compute the counterfactual transition probability P_{τ,t} for each observed realization τ, we follow the procedure described in Section 2 with d = 1,000 samples for each noise posterior distribution. For real data: To derive the counterfactual transition probability for each patient, we start by creating an MDP with n = 5 states and m = 11 actions. ... Finally, to compute the counterfactual transition probability P_{τ,t} for each realization τ ∈ T, we follow the procedure described in Section 2 with d = 1,000 samples for each noise posterior distribution. |
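
The synthetic setup quoted above (n = 20 states, m = 10 actions, horizon T = 20, d = 1,000 noise posterior samples) can be illustrated with a minimal sketch. Several parts are assumptions rather than details taken from this summary: the transition probabilities are drawn from a placeholder Dirichlet distribution, actions in the observed realization are chosen uniformly at random, and the noise posterior is sampled under a Gumbel-Max structural causal model, which may or may not match the procedure in Section 2 of the paper. The function names `make_mdp`, `sample_realization`, and `counterfactual_transition` are hypothetical and do not come from the authors' repository.

```python
import numpy as np

def make_mdp(n=20, m=10, seed=0):
    """Placeholder transition tensor P[s, a, s'] (assumption: Dirichlet rows)."""
    rng = np.random.default_rng(seed)
    return rng.dirichlet(np.ones(n), size=(n, m))  # shape (n, m, n)

def sample_realization(P, T=20, seed=0):
    """Sample T transitions; actions are uniform at random (assumption)."""
    rng = np.random.default_rng(seed)
    n, m, _ = P.shape
    s = int(rng.integers(n))
    states, actions = [s], []
    for _ in range(T):
        a = int(rng.integers(m))
        s = int(rng.choice(n, p=P[s, a]))
        actions.append(a)
        states.append(s)
    return states, actions

def counterfactual_transition(P, s, a, s_next, d=1000, seed=0):
    """Monte Carlo estimate of the counterfactual next-state distribution for
    every alternative action, given the observed transition (s, a, s_next).
    Assumes a Gumbel-Max SCM: s_next = argmax_i(log P[s, a, i] + noise_i)."""
    rng = np.random.default_rng(seed)
    n, m, _ = P.shape
    logp = np.log(P[s, a])
    # The maximum of the perturbed log-probabilities is Gumbel(0) distributed
    # because the probabilities sum to one.
    top = rng.gumbel(size=d)
    # Remaining entries are Gumbel(logp_i) truncated to lie below the maximum.
    g = rng.gumbel(loc=logp, size=(d, n))
    g = -np.log(np.exp(-g) + np.exp(-top[:, None]))
    g[:, s_next] = top
    noise = g - logp  # d posterior noise samples, shape (d, n)
    cf = np.zeros((m, n))
    for a_alt in range(m):
        nxt = np.argmax(np.log(P[s, a_alt]) + noise, axis=1)
        cf[a_alt] = np.bincount(nxt, minlength=n) / d
    return cf

P = make_mdp()
states, actions = sample_realization(P)
# Counterfactual transition estimate for the first observed time step.
cf = counterfactual_transition(P, states[0], actions[0], states[1], d=1000)
```

Under these assumptions, the row of `cf` for the action that was actually taken puts all of its mass on the observed next state, while rows for alternative actions spread mass according to the posterior noise samples.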