Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach

Authors: Luofeng Liao, You-Lin Chen, Zhuoran Yang, Bo Dai, Mladen Kolar, Zhaoran Wang

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We consider both 2-layer and multi-layer NNs with ReLU activation functions and prove global convergence in an overparametrized regime, where the number of neurons is diverging. The results are established using techniques from online learning and local linearization of NNs, and improve on the current state of the art in several aspects. For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
Researcher Affiliation | Collaboration | Luofeng Liao, The University of Chicago, luofengl@uchicago.edu; You-Lin Chen, The University of Chicago, youlinchen@uchicago.edu; Zhuoran Yang, Princeton University, zy6@princeton.edu; Bo Dai, Google Research, Brain Team, bodai@google.com; Zhaoran Wang, Northwestern University, zhaoranwang@gmail.com; Mladen Kolar, The University of Chicago, mkolar@chicagobooth.edu
Pseudocode | Yes | Algorithm 1 is the proposed stochastic primal-dual algorithm for solving the game (8). Given initial weights θ_1 and ω_1, stepsize η, and i.i.d. samples {X_{1,t}, X_{2,t}}, for t = 1, . . . , T − 1: θ_{t+1} = Π_{S_B}(θ_t − η ∇_θ F(θ_t, ω_t; X_{1,t}, X_{2,t})), ω_{t+1} = Π_{S_B}(ω_t + η ∇_ω F(θ_t, ω_t; X_{1,t}, X_{2,t})). (Algorithm 1)
Open Source Code | No | The paper does not provide an explicit statement about the release of its source code or a link to a code repository for the methodology described.
Open Datasets | No | The paper does not provide concrete access information (link, DOI, repository, or formal citation with authors/year) for any publicly available or open dataset used for training. It discusses examples of models but does not describe empirical evaluation on specific datasets.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing. The paper is theoretical in nature and focuses on convergence proofs.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running experiments. The paper is theoretical and does not detail an experimental setup requiring such information.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup | No | The paper does not contain specific experimental setup details such as concrete hyperparameter values, training configurations, or system-level settings. The paper is theoretical and focuses on algorithm design and convergence proofs.
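The update in the Pseudocode row above is a projected stochastic gradient descent-ascent loop. The sketch below is a minimal numpy illustration of that general scheme, not the authors' code: the gradient-oracle interface, the function names, and the Euclidean-ball projection standing in for Π_{S_B} are all illustrative assumptions.

```python
import numpy as np

def project_ball(w, radius):
    """Project w onto the Euclidean ball of the given radius
    (a stand-in for the projection onto the constraint set S_B)."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

def primal_dual_sgda(grad_theta, grad_omega, theta, omega, sample_stream,
                     eta=0.01, radius=1.0, T=1000):
    """Projected stochastic gradient descent-ascent on a min-max objective F.

    grad_theta / grad_omega are stochastic gradient oracles for F w.r.t.
    theta / omega, each called as oracle(theta, omega, sample). The loop
    descends in theta, ascends in omega, and projects both iterates back
    onto a ball of the given radius after every step.
    """
    for _ in range(T):
        x = next(sample_stream)                # i.i.d. sample (X_{1,t}, X_{2,t})
        g_theta = grad_theta(theta, omega, x)
        g_omega = grad_omega(theta, omega, x)
        theta = project_ball(theta - eta * g_theta, radius)
        omega = project_ball(omega + eta * g_omega, radius)
    return theta, omega
```

On a simple convex-concave objective such as F(θ, ω) = (θ − 1)² − (ω − 2)², the loop drives (θ, ω) toward the saddle point (1, 2), mirroring the convergence-to-saddle behavior Algorithm 1 is proved to have in the paper's setting.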