DeepMed: Semiparametric Causal Mediation Analysis with Debiased Deep Learning

Authors: Siqi Xu, Lin Liu, Zhonghua Liu

NeurIPS 2022

Reproducibility assessment (variable, result, LLM response):
Research Type: Experimental. "Extensive synthetic experiments are conducted to support our findings and also expose the gap between theory and practice. As a proof of concept, we apply DeepMed to analyze two real datasets on machine learning fairness and reach conclusions consistent with previous findings."
Researcher Affiliation: Academia. Siqi Xu, Department of Statistics and Actuarial Sciences, The University of Hong Kong, Hong Kong SAR, China (sqxu@hku.hk); Lin Liu, Institute of Natural Sciences, MOE-LSC, School of Mathematical Sciences, CMA-Shanghai, and SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University and Shanghai Artificial Intelligence Laboratory, Shanghai, China (linliu@sjtu.edu.cn); Zhonghua Liu, Department of Biostatistics, Columbia University, New York, NY, USA (zl2509@cumc.columbia.edu).
Pseudocode: Yes. "Algorithm 1: DeepMed with V-fold cross-fitting"
Open Source Code: Yes. "Finally, a user-friendly R package can be found at https://github.com/siqixu/DeepMed."
Open Datasets: No. The paper uses synthetic data, which is generated, and analyzes real data from the COMPAS algorithm, citing Dressel and Farid (2018). However, it provides no direct link, DOI, or repository name for the COMPAS dataset, nor does it describe how to access the data or generate the synthetic data.
Dataset Splits: Yes. "We adopt a 3-fold cross-validation to choose the hyperparameters for DNNs (depth L, width K, L1-regularization parameter λ and epochs), RF (number of trees and maximum number of nodes) and GBM (number of trees and depth). We use a completely independent sample for the hyperparameter selection."
Hardware Specification: No. The paper states only: "The authors would also like to thank Department of Statistics and Actuarial Sciences at The University of Hong Kong for providing high-performance computing servers that supported the numerical experiments in this paper." This statement is too general and does not include specific hardware models (e.g., GPU or CPU models, or memory details).
Software Dependencies: No. "The Lasso is implemented using the R package hdm with a data-driven penalty. The DNN, RF and GBM are implemented using the R packages keras, randomForest and gbm, respectively." No version numbers are provided for these R packages (hdm, keras, randomForest, gbm).
Experiment Setup: Yes. "We adopt a 3-fold cross-validation to choose the hyperparameters for DNNs (depth L, width K, L1-regularization parameter λ and epochs), RF (number of trees and maximum number of nodes) and GBM (number of trees and depth)... We use the cross-entropy loss for the binary response and the mean-squared loss for the continuous response. We fix the batch size at 100, and the hyperparameters for the other methods are set to the default values in their R packages."
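The V-fold cross-fitting referenced in Algorithm 1 follows the standard debiased-machine-learning recipe: split the sample into V folds, fit the nuisance functions on the complement of each fold, and average the held-out evaluations. A minimal, hypothetical sketch in Python/numpy (not the authors' R implementation, and simplified to cross-fitting a single outcome regression rather than the full mediation functional):

```python
import numpy as np

def cross_fit_mean(X, y, fit, V=5, seed=0):
    """Generic V-fold cross-fitting: fit a nuisance regression on V-1 folds,
    evaluate it on the held-out fold, and average the out-of-fold predictions."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), V)
    preds = np.empty(n)
    for v in range(V):
        test = folds[v]
        train = np.concatenate([folds[u] for u in range(V) if u != v])
        model = fit(X[train], y[train])      # nuisance fit on the other folds
        preds[test] = model(X[test])         # evaluation only on held-out data
    return preds.mean()                      # cross-fitted estimate

# Toy nuisance learner (ordinary least squares); DeepMed would use a DNN here.
def ols(X, y):
    beta, *_ = np.linalg.lstsq(np.c_[np.ones(len(X)), X], y, rcond=None)
    return lambda Xnew: np.c_[np.ones(len(Xnew)), Xnew] @ beta

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(size=500)
est = cross_fit_mean(X, y, ols, V=5)
```

Because each observation is only ever predicted by a model that never saw it during training, the estimator avoids the own-observation bias that motivates cross-fitting in semiparametric inference.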
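The hyperparameter selection described above (3-fold cross-validation over DNN depth, width, L1 penalty, and so on) can also be sketched generically. The ridge learner and penalty grid below are illustrative stand-ins, not the paper's keras/randomForest/gbm setup; the paper scores candidates with mean-squared loss for continuous responses and cross-entropy for binary ones:

```python
import numpy as np

def ridge_fit(X, y, lam):
    # Closed-form ridge regression: an illustrative stand-in for a DNN/RF/GBM.
    p = X.shape[1]
    beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
    return lambda Xnew: Xnew @ beta

def cv_select(X, y, lams, folds=3, seed=0):
    """3-fold CV: pick the penalty with the smallest mean held-out MSE."""
    n = len(y)
    idx = np.array_split(np.random.default_rng(seed).permutation(n), folds)
    def cv_mse(lam):
        errs = []
        for v in range(folds):
            test = idx[v]
            train = np.concatenate([idx[u] for u in range(folds) if u != v])
            pred = ridge_fit(X[train], y[train], lam)(X[test])
            errs.append(np.mean((y[test] - pred) ** 2))
        return np.mean(errs)
    return min(lams, key=cv_mse)

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 10))
y = X[:, 0] + 0.1 * rng.normal(size=300)
best_lam = cv_select(X, y, [0.01, 1.0, 100.0])
```

Selecting hyperparameters on held-out folds (or, as the paper notes, on a completely independent sample) keeps the tuning step from leaking information into the estimation sample.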