DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation

Authors: Qinshuo Liu, Zixin Wang, Xi’An Li, Xinyao Ji, Lei Zhang, Lin Liu, Zhonghua Liu

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental
LLM Response: "Through extensive numerical experiments and a real data analysis, we demonstrate the numerical and statistical advantages of DNA-SE over traditional methods."
Researcher Affiliation: Academia
LLM Response: "(1) Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong SAR, China; (2) School of Life Sciences, Shanghai Jiao Tong University, Shanghai, China; (3) Institute of Natural Sciences, MOE-LSC, School of Mathematical Sciences, CMA-Shanghai, Shanghai Jiao Tong University, Shanghai, China; (4) SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, China; (5) Department of Biostatistics, Columbia University, New York, NY, USA."
Pseudocode: Yes
LLM Response: "Algorithm 1: Pseudocode for DNA-SE"
Open Source Code: Yes
LLM Response: "We have wrapped DNA-SE into a python package, which is accessible via this link."
Open Datasets: Yes
LLM Response: "First, we apply the DNA-SE algorithm to the problem of parameter estimation in regression models with the response variable Y being subject to MNAR, as described in (3) in Example 3.1. Here we are interested in estimating the regression coefficient β = (β1, β2)ᵀ, whose true value is set to β = (0.25, 0.5) in the simulation. The sample size N is 500. ... Next, we apply the DNA-SE algorithm to the problem of sensitivity analyses in causal inference with unmeasured confounding, described in (5) in Example 3.2. ... The sample size N is 1000. ... Next, we apply DNA-SE to the transfer learning problem described in (7) in Example 3.3. ... The sample size N is 10000. ... In this section, we further demonstrate the empirical performance of DNA-SE by reanalyzing a real dataset that was previously analyzed in Zhao & Ma (2022). ... This dataset comprises 2486 subjects."
Dataset Splits: Yes
LLM Response: "The hyperparameters of the proposed DNA-SE model include the alternating frequency γ; the Monte Carlo sizes J1 and J2 for numerically approximating the integrals in the integral equations and the related loss L_K; and other common DNN tuning parameters such as depth, width, step sizes, and batch sizes. These hyperparameters are tuned via grid search using cross-validation. We employ the hyperbolic tangent (tanh) function as the activation function (Lu et al., 2022), and set the learning rate to 10⁻⁴. Detailed information regarding hyperparameter tuning can be found in Appendix D."
Hardware Specification: No
LLM Response: "The paper does not provide specific details about the hardware used (e.g., CPU or GPU models, memory)."
Software Dependencies: No
LLM Response: "The paper mentions that DNA-SE is a 'python package' and cites works that may imply certain software (e.g., PyTorch or TensorFlow by context), but it does not specify any software dependencies with version numbers."
Experiment Setup: Yes
LLM Response: "The hyperparameters of the proposed DNA-SE model include the alternating frequency γ; the Monte Carlo sizes J1 and J2 for numerically approximating the integrals in the integral equations and the related loss L_K; and other common DNN tuning parameters such as depth, width, step sizes, and batch sizes. These hyperparameters are tuned via grid search using cross-validation. We employ the hyperbolic tangent (tanh) function as the activation function (Lu et al., 2022), and set the learning rate to 10⁻⁴. Detailed information regarding hyperparameter tuning can be found in Appendix D."