Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].
Learning Robust Decision Policies from Observational Data
Authors: Muhammad Osama, Dave Zachariah, Peter Stoica
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The performance and statistical properties of the proposed method are illustrated using both real and synthetic data. |
| Researcher Affiliation | Academia | Muhammad Osama EMAIL Dave Zachariah EMAIL Peter Stoica EMAIL Division of System and Control Department of Information Technology Uppsala University, Sweden. |
| Pseudocode | Yes | Algorithm 1 Robust policy |
| Open Source Code | Yes | The code for the experiments can be found here. |
| Open Datasets | Yes | We use data from the Infant Health and Development program (IHDP) [3], which investigated the effect of personalized home visits and intensive high-quality child care on the health of low birth-weight and premature infants [8]. |
| Dataset Splits | No | The IHDP data contains 747 data points and we randomly select a subset of n = 600 training points that form $D_n$. The remaining 147 points are used to evaluate learned policies. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers (e.g., Python 3.8, TensorFlow 2.x, PyTorch 1.x). |
| Experiment Setup | Yes | We create a synthetic dataset, drawing n = 200 data points from the training distribution (1)... We let α = 20%. The IHDP data contains 747 data points and we randomly select a subset of n = 600 training points that form $D_n$. The remaining 147 points are used to evaluate learned policies. To learn the weights (8) for the robust policy, we first reduce the 25-dimensional covariates $\tilde{z}$ into 4-dimensional features $z = \mathrm{enc}(\tilde{z})$ using an autoencoder [2, sec.7.1]. Then $\hat{p}(z \mid x)$ is a learned Gaussian mixture model with four mixture components and $\hat{p}(x)$ is a learned Bernoulli model. Together the models define (8) and a robust policy $\pi_\alpha(z)$ is learned for the target probability α = 20%. The probability that the cost y exceeds $y_\alpha(z)$ is 18.6%, estimated using 500 Monte Carlo runs... |
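
The IHDP split quoted in the table (747 points, 600 for training, 147 for evaluation) and the Bernoulli model for the binary decision x can be sketched as below. This is a minimal illustration, not the authors' code: the function names `split_indices` and `fit_bernoulli`, the random seeds, and the placeholder decision values are all assumptions, since the paper's seed and data loading are not described here. The autoencoder and the four-component Gaussian mixture are omitted.

```python
import random

N_TOTAL = 747  # IHDP data points, as quoted in the table
N_TRAIN = 600  # training subset size, as quoted in the table


def split_indices(n_total, n_train, seed=0):
    """Randomly partition range(n_total) into train and evaluation indices.

    The seed is an assumption; the seed used in the paper is not stated.
    """
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def fit_bernoulli(x_values):
    """Maximum-likelihood estimate of P(x = 1) for binary decisions x."""
    if not x_values:
        raise ValueError("need at least one observation")
    return sum(x_values) / len(x_values)


train_idx, eval_idx = split_indices(N_TOTAL, N_TRAIN)

# Placeholder binary decisions, generated only to exercise the code.
# These are NOT the IHDP treatment assignments.
placeholder_rng = random.Random(1)
x_all = [placeholder_rng.randint(0, 1) for _ in range(N_TOTAL)]

# Fit the Bernoulli model on the training subset only.
p_x1 = fit_bernoulli([x_all[i] for i in train_idx])

print(len(train_idx), len(eval_idx))  # prints: 600 147
```

Fitting on the training indices alone keeps the 147 evaluation points untouched, which matches the description that they are used only to evaluate learned policies.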