Learning Robust Decision Policies from Observational Data
Authors: Muhammad Osama, Dave Zachariah, Peter Stoica
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The performance and statistical properties of the proposed method are illustrated using both real and synthetic data. |
| Researcher Affiliation | Academia | Muhammad Osama (muhammad.osama@it.uu.se), Dave Zachariah (dave.zachariah@it.uu.se), Peter Stoica (peter.stoica@it.uu.se), Division of Systems and Control, Department of Information Technology, Uppsala University, Sweden. |
| Pseudocode | Yes | Algorithm 1 Robust policy |
| Open Source Code | Yes | The code for the experiments can be found here. |
| Open Datasets | Yes | We use data from the Infant Health and Development program (IHDP) [3], which investigated the effect of personalized home visits and intensive high-quality child care on the health of low birth-weight and premature infants [8]. |
| Dataset Splits | No | The IHDP data contains 747 data points and we randomly select a subset of n = 600 training points that form Dn. The remaining 147 points are used to evaluate learned policies. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers (e.g., Python 3.8, TensorFlow 2.x, PyTorch 1.x). |
| Experiment Setup | Yes | We create a synthetic dataset, drawing n = 200 data points from the training distribution (1)... We let α = 20%. The IHDP data contains 747 data points and we randomly select a subset of n = 600 training points that form D_n. The remaining 147 points are used to evaluate learned policies. To learn the weights (8) for the robust policy, we first reduce the 25-dimensional covariates z̃ into 4-dimensional features z = enc(z̃) using an autoencoder [2, Sec. 7.1]. Then p̂(z\|x) is a learned Gaussian mixture model with four mixture components and p̂(x) is a learned Bernoulli model. Together the models define (8) and a robust policy π_α(z) is learned for the target probability α = 20%. The probability that the cost y exceeds y_α(z) is 18.6%, estimated using 500 Monte Carlo runs... |
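
The weight-learning pipeline in the setup above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the 4-dimensional features z are already available (standing in for the autoencoder output), that x is a binary treatment, and that the weights in Eq. (8) are inverse-propensity ratios obtained from p̂(z|x) and p̂(x) via Bayes' rule. All variable names and the stand-in data are hypothetical.

```python
# Hedged sketch of the weight-learning step described in the setup row.
# Assumptions (not taken from the paper's code): z is the 4-d encoded
# feature, x is binary, and the weights in Eq. (8) are inverse-propensity
# ratios derived from p_hat(z|x) and p_hat(x) via Bayes' rule.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stand-in data: n points with 4-d features z and a binary treatment x.
n = 600
z = rng.normal(size=(n, 4))
x = rng.integers(0, 2, size=n)

# p_hat(z|x): one 4-component Gaussian mixture per treatment arm.
gmm = {a: GaussianMixture(n_components=4, random_state=0).fit(z[x == a])
       for a in (0, 1)}

# p_hat(x): a Bernoulli model, i.e. the empirical treatment frequency.
p_x1 = x.mean()

# Bayes' rule gives p_hat(x|z) from the two fitted conditional densities.
dens = np.column_stack([np.exp(gmm[a].score_samples(z)) for a in (0, 1)])
joint = dens * np.array([1.0 - p_x1, p_x1])   # p_hat(z|x) * p_hat(x)
p_x_given_z = joint[np.arange(n), x] / joint.sum(axis=1)

# Inverse-propensity weights (one plausible reading of Eq. (8)).
w = 1.0 / np.clip(p_x_given_z, 1e-3, None)
print(w[:5])
```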
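
Likewise, the reported 18.6% exceedance estimate comes from averaging over repeated random train/evaluation splits. A minimal sketch of that Monte Carlo loop follows, with a hypothetical cost model and learned limit y_alpha standing in for the paper's method, chosen so the true exceedance probability equals the target α = 20%.

```python
# Hedged sketch of the Monte Carlo evaluation: estimate how often the
# held-out cost y exceeds the policy's cost limit y_alpha(z), averaged
# over 500 runs as in the paper. The cost model and limit below are
# hypothetical stand-ins, not the paper's learned models.
import numpy as np

rng = np.random.default_rng(1)
Z08 = 0.8416  # standard-normal 80% quantile, so P(y > y_alpha) = 20%

def mu(z):
    # Placeholder conditional-mean cost; purely illustrative.
    return 1.0 + z.sum(axis=1)

def y_alpha(z):
    # Hypothetical learned cost limit: the 80% quantile of y given z.
    return mu(z) + Z08

exceed = []
for _ in range(500):                      # 500 Monte Carlo runs
    z_eval = rng.normal(size=(147, 4))    # 147 held-out points
    y_eval = mu(z_eval) + rng.normal(size=147)
    exceed.append(np.mean(y_eval > y_alpha(z_eval)))

print(f"estimated exceedance probability: {np.mean(exceed):.1%}")  # ~20%
```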