Causal Effect Identification by Adjustment under Confounding and Selection Biases

Authors: Juan Correa, Elias Bareinboim

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we introduce a generalized version of covariate adjustment that simultaneously controls for both confounding and selection biases. We first derive a sufficient and necessary condition for recovering causal effects using covariate adjustment from an observational distribution collected under preferential selection. We then relax this setting to consider cases when additional, unbiased measurements over a set of covariates are available for use (e.g., the age and gender distribution obtained from census data). Finally, we present a complete algorithm with polynomial delay to find all sets of admissible covariates for adjustment when confounding and selection biases are simultaneously present and unbiased data is available. Specifically, we solved the following problems: 1. Identification and recoverability without external data: The data is collected under selection bias, P(v | S=1). When does a set of covariates Z allow P(y | do(x)) to be estimated by adjusting for Z? 2. Identification and recoverability with external data: The data is collected under selection bias, P(v | S=1), and unbiased samples of P(t), T ⊆ V, are available. When does a set of covariates Z ⊆ T license the estimation of P(y | do(x)) by adjusting for Z? 3. Finding admissible adjustment sets with external data: How can we list all admissible sets Z capable of identifying and recovering P(y | do(x)), for Z ⊆ T ⊆ V?
Researcher Affiliation | Academia | Juan D. Correa, Purdue University, correagr@purdue.edu; Elias Bareinboim, Purdue University, eb@purdue.edu
Pseudocode | No | No structured pseudocode or algorithm blocks found. The paper refers to the LISTSEP procedure from an external source but does not provide its implementation details.
Open Source Code | No | The paper contains no statement regarding the release of source code for the described methodology.
Open Datasets | No | The paper is theoretical and does not use or reference any specific public dataset for empirical training or evaluation.
Dataset Splits | No | The paper is theoretical and does not describe empirical experiments or dataset splits for training, validation, or testing.
Hardware Specification | No | The paper is theoretical and does not describe experimental hardware specifications.
Software Dependencies | No | The paper names no software dependencies or version numbers needed to replicate the theoretical derivations or the proposed algorithm.
Experiment Setup | No | The paper is theoretical and does not describe any experimental setup or hyperparameters.
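
Illustrative sketch (not from the paper): the second problem in the Research Type row, adjustment with external data, can be made concrete on a toy model. The code below assumes the adjustment estimand takes the form sum over z of P(y | x, z, S=1) P(z), where P(z) comes from unbiased external data. The graph (Z -> X, Z -> Y, X -> Y, with selection S depending only on X and Z), every probability value, and every function name are hypothetical choices made for this example.

```python
from itertools import product

# Hypothetical binary model; all numbers are invented for illustration.
p_z = {0: 0.7, 1: 0.3}                      # unbiased P(z), e.g. from census data
p_x1_given_z = {0: 0.2, 1: 0.8}             # P(X=1 | z)
p_y1_given_xz = {(0, 0): 0.1, (0, 1): 0.5,  # P(Y=1 | x, z), keyed by (x, z)
                 (1, 0): 0.4, (1, 1): 0.9}
p_s1_given_xz = {(0, 0): 0.9, (0, 1): 0.2,  # P(S=1 | x, z): selection mechanism
                 (1, 0): 0.5, (1, 1): 0.6}


def joint(x, y, z):
    """Unbiased joint P(x, y, z) of the toy model."""
    px = p_x1_given_z[z] if x == 1 else 1 - p_x1_given_z[z]
    py = p_y1_given_xz[(x, z)] if y == 1 else 1 - p_y1_given_xz[(x, z)]
    return p_z[z] * px * py


def selected(x, y, z):
    """P(x, y, z, S=1): mass that survives the selection mechanism."""
    return joint(x, y, z) * p_s1_given_xz[(x, z)]


def p_y1_given_xz_s1(x, z):
    """P(Y=1 | x, z, S=1), computable from selection-biased data alone."""
    return selected(x, 1, z) / (selected(x, 0, z) + selected(x, 1, z))


def p_z_given_s1(z):
    """P(z | S=1): the covariate distribution as seen in the biased sample."""
    num = sum(selected(x, y, z) for x, y in product((0, 1), repeat=2))
    den = sum(selected(x, y, zz) for x, y, zz in product((0, 1), repeat=3))
    return num / den


def adjust(x, z_weight):
    """Adjustment sum over z, weighting each stratum by z_weight(z)."""
    return sum(p_y1_given_xz_s1(x, z) * z_weight(z) for z in (0, 1))


effect_external = adjust(1, lambda z: p_z[z])   # weights from unbiased P(z)
effect_biased = adjust(1, p_z_given_s1)         # weights from biased P(z | S=1)
print(effect_external, effect_biased)
```

In this toy model the true value of P(Y=1 | do(X=1)) is 0.4 * 0.7 + 0.9 * 0.3 = 0.55. Weighting by the external P(z) reproduces it, while weighting by P(z | S=1) does not, because Z itself influences selection here.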