Towards Safe Policy Learning under Partial Identifiability: A Causal Approach

Authors: Shalmali Joshi, Junzhe Zhang, Elias Bareinboim

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments: We evaluate the proposed method on 1) Synthetic data, and 2) the International Stroke Trial (IST) data (Group et al. 1997; Sandercock, Niewada, and Członkowska 2011) and learn four policies.
Researcher Affiliation | Academia | ¹Department of Biomedical Informatics, ²Department of Computer Science, Columbia University, New York, NY
Pseudocode | Yes | Algorithm 1: Safe Policy Learning
Open Source Code | No | The paper does not explicitly state that the source code for the methodology is available or provide a link.
Open Datasets | Yes | International Stroke Trial (IST) data (Group et al. 1997; Sandercock, Niewada, and Członkowska 2011)
Dataset Splits | Yes | Fig. 3 shows the mean outcome for varying thresholds averaged over 5-fold cross-validation (standard errors not visible due to low variability). Also: We set aside 30% data as a held-out test set.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or types of processors used for experiments.
Software Dependencies | No | The paper mentions using a Multi-layer Perceptron (MLP) and GELU activations, but does not specify software versions for libraries, frameworks, or programming languages.
Experiment Setup | Yes | The function family Π (see Eq. (26)) corresponds to a two-layer Multi-layer Perceptron (MLP) with 5 hidden units and the GELU activations (Hendrycks and Gimpel 2016). Also from Algorithm 1: 'Input: ... learning rate λ > 0'.
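
Since no source code is released, the following is a minimal sketch of what the reported setup might look like, not the authors' implementation. It uses only the Python standard library and assumes that "two-layer" means one hidden layer of 5 GELU units followed by an output layer, that the action is binary, and that the output is passed through a sigmoid. The weight initialisation, the function names (`init_policy`, `policy_prob`, `holdout_and_folds`), and the combination of the 30% hold-out with 5-fold cross-validation in a single helper are all hypothetical; the paper may apply the two splits in different experiments.

```python
import math
import random


def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def init_policy(n_features, n_hidden=5, seed=0):
    """Randomly initialise a two-layer MLP (assumed: one hidden layer)."""
    rng = random.Random(seed)
    return {
        # Hidden-layer weights: one row of n_features weights per hidden unit.
        "W1": [[rng.gauss(0.0, 0.1) for _ in range(n_features)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        # Output-layer weights: one weight per hidden unit.
        "w2": [rng.gauss(0.0, 0.1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def policy_prob(params, x):
    """Probability of choosing treatment for covariates x (sigmoid is assumed)."""
    hidden = [gelu(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(params["W1"], params["b1"])]
    logit = sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]
    return 1.0 / (1.0 + math.exp(-logit))


def holdout_and_folds(n, test_frac=0.3, k=5, seed=0):
    """Shuffle n indices, hold out test_frac of them, split the rest into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(test_frac * n))
    test, rest = idx[:n_test], idx[n_test:]
    folds = [rest[i::k] for i in range(k)]
    return test, folds
```

For example, `holdout_and_folds(100)` yields 30 test indices and five folds of 14 indices each, and `policy_prob(init_policy(3), [0.2, -1.0, 0.5])` returns a value strictly between 0 and 1. Training with the learning rate λ from Algorithm 1 is omitted because the table does not report its value or the objective.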