Active Learning for Decision-Making from Imbalanced Observational Data

Authors: Iiris Sundin, Peter Schulam, Eero Siivola, Aki Vehtari, Suchi Saria, Samuel Kaski

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the effectiveness of this decision-making aware active learning in two decision-making tasks: in simulated data with binary outcomes and in a medical dataset with synthetic and continuous treatment outcomes."
Researcher Affiliation | Academia | "Department of Computer Science, Aalto University, Espoo, Finland; Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA."
Pseudocode | No | The paper does not contain any explicit pseudocode blocks or algorithms labeled as such.
Open Source Code | No | The paper refers and links to third-party toolboxes (GPy and Stan), but does not state that the authors' own implementation of the described methodology is available.
Open Datasets | Yes | "The IHDP data set. We use the Infant Health and Development Program (IHDP) dataset from Hill (2011), also used e.g. by Shalit et al. (2017) and Alaa & van der Schaar (2017), including synthetic outcomes, containing 747 observations of 25 features."
Dataset Splits | Yes | "We evaluate the performance in leave-one-out cross-validation, but in order to make the problem even more realistically hard, for each of the 747 target units we choose randomly 100 observations as training examples." (A sketch of this split construction follows the table.)
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | Yes | "We fit separate GPs to the outcomes of each treatment with GPy (version 1.9.2). ... The model is fit using a probabilistic programming language Stan (Stan Development Team, 2017; Carpenter et al., 2017)." (A minimal GPy fitting sketch follows the table.)
Experiment Setup | Yes | "We use an exponentiated quadratic kernel with a separate length-scale parameter for each variable, and optimize the hyperparameters using marginal likelihood. ... We use Gauss-Hermite quadrature of order 32 to approximate the expectations in D-M aware, Targeted-IG, and EIG. ... Training sample size is 30. ... We assume that the RBF centers and length-scale are known, so that only w0 and w1 need to be learned." (A quadrature sketch follows the table.)
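The Dataset Splits row quotes a leave-one-out protocol over the 747 IHDP units, with 100 randomly drawn training observations per held-out target. A minimal sketch of how such splits could be constructed is below; the function name and seed are our own, not the authors' code.

```python
import numpy as np

def loo_splits(n_units=747, n_train=100, seed=0):
    """For each held-out target unit, sample a 100-observation training
    set from the remaining units (sketch of the quoted protocol)."""
    rng = np.random.default_rng(seed)
    all_idx = np.arange(n_units)
    for target in all_idx:
        pool = np.delete(all_idx, target)  # every unit except the target
        train = rng.choice(pool, size=n_train, replace=False)
        yield target, train

# Example: training indices for the first target unit
target, train = next(loo_splits())
```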
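The Software Dependencies row quotes GPy 1.9.2 being used to fit separate GPs to the outcomes of each treatment. A minimal sketch of that setup using the public GPy API follows; the function name and data layout are assumptions, and the Stan-based model component is not shown.

```python
import numpy as np
import GPy

def fit_per_treatment_gps(X, y, t):
    """Fit one GP regression per treatment arm, with an exponentiated
    quadratic (RBF) kernel using a separate (ARD) length-scale per
    variable, hyperparameters set by maximizing the marginal likelihood."""
    models = {}
    for arm in np.unique(t):
        mask = t == arm
        kernel = GPy.kern.RBF(input_dim=X.shape[1], ARD=True)
        model = GPy.models.GPRegression(X[mask], y[mask, None], kernel)
        model.optimize()  # marginal-likelihood maximization
        models[arm] = model
    return models
```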
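The Experiment Setup row mentions Gauss-Hermite quadrature of order 32 for approximating Gaussian expectations in the acquisition functions (D-M aware, Targeted-IG, EIG). The change of variables involved is easy to get wrong, so a small self-contained sketch is given below; it is our own illustration with a hypothetical integrand, not the paper's implementation.

```python
import numpy as np

def gauss_hermite_expectation(f, mu, sigma, order=32):
    """Approximate E[f(x)] for x ~ N(mu, sigma^2).
    With x = mu + sqrt(2)*sigma*z, the Hermite weight exp(-z^2)
    matches the Gaussian density up to a 1/sqrt(pi) factor."""
    nodes, weights = np.polynomial.hermite.hermgauss(order)
    values = f(mu + np.sqrt(2.0) * sigma * nodes)
    return weights @ values / np.sqrt(np.pi)

# Sanity check: E[x^2] under N(0, 1) is 1
assert abs(gauss_hermite_expectation(lambda x: x**2, 0.0, 1.0) - 1.0) < 1e-8
```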