Towards Safe Policy Learning under Partial Identifiability: A Causal Approach
Authors: Shalmali Joshi, Junzhe Zhang, Elias Bareinboim
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments: We evaluate the proposed method on 1) Synthetic data, and 2) the International Stroke Trial (IST) data (Group et al. 1997; Sandercock, Niewada, and Członkowska 2011) and learn four policies. |
| Researcher Affiliation | Academia | ¹Department of Biomedical Informatics, ²Department of Computer Science, Columbia University, New York, NY |
| Pseudocode | Yes | Algorithm 1: Safe Policy Learning |
| Open Source Code | No | The paper does not explicitly state that the source code for the methodology is available or provide a link. |
| Open Datasets | Yes | International Stroke Trial (IST) data (Group et al. 1997; Sandercock, Niewada, and Członkowska 2011) |
| Dataset Splits | Yes | Fig. 3 shows the mean outcome for varying thresholds averaged over 5-fold cross-validation (standard errors not visible due to low variability). Also: We set aside 30% data as a held-out test set. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or types of processors used for experiments. |
| Software Dependencies | No | The paper mentions using a Multi-layer Perceptron (MLP) and GELU activations, but does not specify software versions for libraries, frameworks, or programming languages. |
| Experiment Setup | Yes | The function family Π (see Eq. (26)) corresponds to a two-layer Multi-layer Perceptron (MLP) with 5 hidden units and the GELU activations (Hendrycks and Gimpel 2016). Also from Algorithm 1: 'Input: ... learning rate λ > 0'. |
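
The setup reported in the table (a two-layer MLP with 5 hidden units and GELU activations, plus a 30% held-out test set) can be illustrated with a minimal sketch. This is not the authors' code, which the table notes is not released. Several details are assumptions made only for illustration: "two-layer" is read as one hidden layer followed by an output layer, the output is squashed with a sigmoid to give a treatment probability for a binary action, the weights are drawn from a small Gaussian, and the names `init_policy`, `policy_prob`, and `holdout_split` are hypothetical. Because the paper names no framework, the sketch uses only the Python standard library.

```python
import math
import random


def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def init_policy(n_features, n_hidden=5, seed=0):
    """Random parameters for a one-hidden-layer MLP (initialisation is an assumption)."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.gauss(0.0, 0.1) for _ in range(n_features)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.gauss(0.0, 0.1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def policy_prob(params, x):
    """Probability of choosing the treatment action for covariates x.

    The sigmoid output for a binary action is an assumption of this sketch.
    """
    hidden = [
        gelu(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(params["W1"], params["b1"])
    ]
    logit = sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]
    return 1.0 / (1.0 + math.exp(-logit))


def holdout_split(n_samples, test_fraction=0.3, seed=0):
    """Shuffle indices and set aside a fraction as a held-out test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


if __name__ == "__main__":
    params = init_policy(n_features=3)
    train_idx, test_idx = holdout_split(100)
    print(len(train_idx), len(test_idx))
    print(policy_prob(params, [0.5, -1.0, 2.0]))
```

The learning rate λ listed as an input to Algorithm 1 would enter in a training loop, which is omitted here because the table does not report its value or the optimizer used.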