Finding Counterfactually Optimal Action Sequences in Continuous State Spaces

Authors: Stratis Tsirtsis, Manuel Gomez-Rodriguez

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we evaluate the performance and the qualitative insights of our method by performing a series of experiments using real patient data from critical care."
Researcher Affiliation | Academia | "Stratis Tsirtsis, Max Planck Institute for Software Systems, Kaiserslautern, Germany, stsirtsis@mpi-sws.org; Manuel Gomez-Rodriguez, Max Planck Institute for Software Systems, Kaiserslautern, Germany, manuelgr@mpi-sws.org"
Pseudocode | Yes | "Algorithm 2: Graph search via A*" (a generic A* sketch follows the table)
Open Source Code | Yes | "Our code is accessible at https://github.com/Networks-Learning/counterfactual-continuous-mdp."
Open Datasets | Yes | "To evaluate our method, we use real patient data from MIMIC-III [54], a freely accessible critical care dataset commonly used in reinforcement learning for healthcare [6, 55–57]."
Dataset Splits | Yes | "Specifically, for each configuration of L_h and L_φ, we randomly split the dataset into a training and a validation set (with a size ratio 4-to-1), we train the corresponding SCM using the training set, and we evaluate the log-likelihood of the validation set based on the trained SCM." (a split-selection sketch follows the table)
Hardware Specification | Yes | "All experiments were performed using an internal cluster of machines equipped with 16 Intel(R) Xeon(R) 3.20GHz CPU cores, 512GBs of memory and 2 NVIDIA A40 48GB GPUs."
Software Dependencies | No | The paper mentions using neural networks and the Adam optimizer but does not specify software dependencies such as the programming language (e.g., Python), deep learning framework (e.g., PyTorch, TensorFlow), or their version numbers.
Experiment Setup | Yes | "We use an SCM with Lipschitz constants L_h = 1.0, L_φ = 0.1... We jointly train the weights of the networks h and φ and the covariance matrix of the noise prior on the observed patient transitions using stochastic gradient descent with the negative log-likelihood of each transition as a loss. Subsequently, we optimize those parameters using the Adam optimizer with a learning rate of 0.001, a batch size of 256, and we train the model for 100 epochs." (a training-loop sketch follows the table)
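The Pseudocode row points to Algorithm 2, a graph search via A*. The paper constructs its search graph, edge costs, and admissible heuristic from the counterfactual SCM, none of which is reproduced here; the sketch below is only a generic Python A* skeleton, with illustrative function names and priority-queue bookkeeping, showing the structure such a search follows.

```python
import heapq
import itertools

def a_star(start, is_goal, successors, heuristic):
    """Generic A* over an implicit graph (not the paper's Algorithm 2).

    successors(n) yields (step_cost, child) pairs;
    heuristic(n) must be an admissible lower bound on the cost-to-go.
    Returns a least-cost path as a list of nodes, or None if no goal is reachable.
    """
    tie = itertools.count()  # breaks ties so nodes themselves are never compared
    frontier = [(heuristic(start), next(tie), 0.0, start, [start])]
    best_g = {start: 0.0}
    while frontier:
        _, _, g, node, path = heapq.heappop(frontier)
        if is_goal(node):
            return path                              # least-cost path found
        if g > best_g.get(node, float("inf")):
            continue                                 # stale queue entry
        for cost, child in successors(node):
            g_child = g + cost
            if g_child < best_g.get(child, float("inf")):
                best_g[child] = g_child
                heapq.heappush(frontier, (g_child + heuristic(child), next(tie),
                                          g_child, child, path + [child]))
    return None
```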
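The Dataset Splits row describes how each (L_h, L_φ) configuration is scored: a random 4-to-1 train/validation split, an SCM trained on the training part, and the validation log-likelihood used for selection. A minimal sketch of such a loop, assuming hypothetical `train_fn` and `loglik_fn` callables in place of the paper's actual SCM training and evaluation code:

```python
import numpy as np

def select_lipschitz_constants(transitions, configs, train_fn, loglik_fn, seed=0):
    """Pick the (L_h, L_phi) pair with the best validation log-likelihood.

    transitions: sequence of observed patient transitions
    configs:     iterable of candidate (L_h, L_phi) pairs
    train_fn:    callable(train_set, L_h, L_phi) -> fitted SCM        (stand-in)
    loglik_fn:   callable(scm, validation_set) -> avg log-likelihood  (stand-in)
    """
    idx = np.random.default_rng(seed).permutation(len(transitions))
    n_train = int(0.8 * len(transitions))            # 4-to-1 train/validation ratio
    train = [transitions[i] for i in idx[:n_train]]
    valid = [transitions[i] for i in idx[n_train:]]

    scores = {(L_h, L_phi): loglik_fn(train_fn(train, L_h, L_phi), valid)
              for L_h, L_phi in configs}
    best = max(scores, key=scores.get)               # highest validation log-likelihood
    return best, scores
```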
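The Experiment Setup row quotes the optimization recipe: negative log-likelihood per transition, Adam with learning rate 0.001, batch size 256, 100 epochs. Because the paper does not state a framework (see the Software Dependencies row), the PyTorch sketch below is an assumption throughout; the class name, network widths, and the way h, φ, and the learnable noise covariance combine into a transition distribution are illustrative stand-ins, not the authors' implementation.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

class GaussianSCM(nn.Module):
    """Illustrative transition model with networks h and phi and a learnable
    noise covariance; the composition below is a simplifying assumption."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.h = nn.Sequential(nn.Linear(state_dim + action_dim, hidden),
                               nn.ReLU(), nn.Linear(hidden, state_dim))
        self.phi = nn.Sequential(nn.Linear(state_dim + action_dim, hidden),
                                 nn.ReLU(), nn.Linear(hidden, state_dim))
        # Square-root factor of the noise covariance (Sigma = F F^T), trained jointly.
        self.noise_factor = nn.Parameter(torch.eye(state_dim))

    def forward(self, s, a):
        x = torch.cat([s, a], dim=-1)
        mean = self.h(x) + self.phi(x)               # simplified stand-in for the SCM mean
        cov = (self.noise_factor @ self.noise_factor.T
               + 1e-5 * torch.eye(s.shape[-1]))      # jitter keeps cov positive definite
        return torch.distributions.MultivariateNormal(mean, covariance_matrix=cov)

def train_scm(model, states, actions, next_states, epochs=100, lr=1e-3, batch_size=256):
    """Adam, lr 0.001, batch size 256, 100 epochs, per-transition NLL loss."""
    loader = DataLoader(TensorDataset(states, actions, next_states),
                        batch_size=batch_size, shuffle=True)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for s, a, s_next in loader:
            loss = -model(s, a).log_prob(s_next).mean()   # negative log-likelihood
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```

Parameterizing the covariance through a learnable square-root factor keeps it positive semidefinite while it is optimized jointly with h and φ, which matches the quoted description at a high level without claiming the authors' exact parameterization.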