Multiply Robust Federated Estimation of Targeted Average Treatment Effects

Authors: Larry Han, Zhu Shen, José R. Zubizarreta

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the finite sample properties of five different estimators: (i) an augmented inverse probability weighted (AIPW) estimator using data from the target site only (Target), (ii) an AIPW estimator that weights each site proportionally to its sample size (SS), (iii) an AIPW estimator that weights each site inverse-proportionally to its variance (IVW), (iv) an AIPW estimator that weights each site with the L1 weights defined in (9) (AIPW-L1), and (v) a multiply robust estimator with the L1 weights defined in (9) (MR-L1). Across different settings, we examine the performance of each estimator in terms of mean absolute error, root mean square error, and coverage and length of 95% confidence intervals (CI) across 500 simulations.
Researcher Affiliation | Academia | Larry Han, Department of Health Sciences, Northeastern University, Boston, MA 02115, lar.han@northeastern.edu; Zhu Shen, Department of Biostatistics, Harvard University, Boston, MA 02115, zhushen@g.harvard.edu; José R. Zubizarreta, Departments of Health Care Policy, Biostatistics, and Statistics, Harvard University, Boston, MA 02115, zubizarreta@hcp.med.harvard.edu
Pseudocode | No | The paper describes its methods through narrative text and mathematical formulations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | All experiments in this study were performed using the statistical programming language R (version 4.2.2). The package sn (version 2.1.0) was employed for generating covariates that follow a skewed normal distribution within each site. To solve the density ratio estimating equations, we utilized the rootSolve package (version 1.8.2.3). For the estimation of adaptive ensemble weights based on penalized regression of site-specific influence functions, we employed the glmnet package (version 4.1-4).
Open Datasets | No | We apply our approach to study the treatment effect of percutaneous coronary intervention (PCI) on the duration of hospitalization for patients experiencing acute myocardial infarction (AMI) with data from the Centers for Medicare & Medicaid Services (CMS). No concrete access information like a link, DOI, or specific citation for public access is provided for the CMS data.
Dataset Splits | Yes | First, we randomly partition the data within each site into a training set $D_k^{\text{train}}$ of units indexed by $\{1, \dots, n_k^{\text{train}}\}$ and a validation set $D_k^{\text{val}}$ of units indexed by $\{n_k^{\text{train}} + 1, \dots, n_k\}$. Then, each candidate treatment model is fit on $D_k^{\text{train}}$ to obtain $\hat{\pi}^j_{a, n_k^{\text{train}}}$ for $j \in \mathcal{J}$.
Hardware Specification | No | The replication of experiments was carried out using ten CPU cores, while the implementation of the model-mixing algorithm utilized five CPU cores.
Software Dependencies | Yes | All experiments in this study were performed using the statistical programming language R (version 4.2.2). The package sn (version 2.1.0) was employed for generating covariates that follow a skewed normal distribution within each site. To solve the density ratio estimating equations, we utilized the rootSolve package (version 1.8.2.3). For the estimation of adaptive ensemble weights based on penalized regression of site-specific influence functions, we employed the glmnet package (version 4.1-4). Specifically, the packages foreach (version 1.5.2) and doParallel were employed to facilitate the replication of experiments. For the purpose of model-mixing in multiply robust estimation, we employed the parallel package (version 4.2.2).
Experiment Setup | Yes | To compute the optimal ensemble weights, we solve the adaptive LASSO problem (9). The tuning parameter λ is selected through cross-validation using a grid of values $\{0, 10^{-3}, 10^{-2}, 0.1, 0.5, 1, 2, 5, 10\}$. To perform cross-validation, the simulated datasets in each site are split into two equally sized training and validation datasets.
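
The tuning procedure quoted in the Experiment Setup row can be sketched roughly as follows, in Python rather than the R used by the authors. The sketch splits one site's data into two halves and picks λ from the stated grid by validation loss. The function names (`split_half`, `fit_toy`, `select_lambda`), the seed, and the squared-error validation loss are assumptions made for illustration. In particular, `fit_toy` is a toy L1-penalised estimate of a mean that stands in for problem (9), which is not reproduced here, and the cross-validation performed by glmnet may differ in detail.

```python
import random

# Grid of candidate tuning parameters quoted in the Experiment Setup row.
LAMBDA_GRID = [0, 1e-3, 1e-2, 0.1, 0.5, 1, 2, 5, 10]


def split_half(data, seed=0):
    """Randomly split data into a training half and a validation half.

    The halves have equal size when len(data) is even.
    """
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def soft_threshold(x, lam):
    """Closed-form minimiser over b of 0.5 * (x - b)**2 + lam * abs(b)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def fit_toy(train, lam):
    """Hypothetical stand-in for problem (9): L1-penalised estimate of a mean."""
    return soft_threshold(sum(train) / len(train), lam)


def validation_loss(estimate, val):
    """Mean squared error of the fitted value on the held-out half."""
    return sum((y - estimate) ** 2 for y in val) / len(val)


def select_lambda(data, grid=LAMBDA_GRID, seed=0):
    """Return (best lambda, per-lambda losses); ties go to the earlier grid value."""
    train, val = split_half(data, seed)
    losses = [validation_loss(fit_toy(train, lam), val) for lam in grid]
    best = min(range(len(grid)), key=losses.__getitem__)
    return grid[best], losses
```

Calling `select_lambda(values)` on a list of numbers returns the chosen λ together with the validation loss at every grid point, so the whole loss profile can be inspected rather than only the winner.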