Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach
Authors: Luofeng Liao, You-Lin Chen, Zhuoran Yang, Bo Dai, Mladen Kolar, Zhaoran Wang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We consider both 2-layer and multi-layer NNs with ReLU activation functions and prove global convergence in an overparametrized regime, where the number of neurons is diverging. The results are established using techniques from online learning and local linearization of NNs, and improve on the current state of the art in several aspects. For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting. |
| Researcher Affiliation | Collaboration | Luofeng Liao, The University of Chicago (luofengl@uchicago.edu); You-Lin Chen, The University of Chicago (youlinchen@uchicago.edu); Zhuoran Yang, Princeton University (zy6@princeton.edu); Bo Dai, Google Research, Brain Team (bodai@google.com); Zhaoran Wang, Northwestern University (zhaoranwang@gmail.com); Mladen Kolar, The University of Chicago (mkolar@chicagobooth.edu) |
| Pseudocode | Yes | Algorithm 1 is the proposed stochastic primal-dual algorithm for solving the game (8). Given initial weights θ_1 and ω_1, stepsize η, and i.i.d. samples {X_{1,t}, X_{2,t}}, for t = 1, ..., T − 1: θ_{t+1} = Π_{S_B}(θ_t − η ∇_θ F(θ_t, ω_t; X_{1,t}, X_{2,t})), ω_{t+1} = Π_{S_B}(ω_t + η ∇_ω F(θ_t, ω_t; X_{1,t}, X_{2,t})). (Algorithm 1) |
| Open Source Code | No | The paper does not provide an explicit statement about the release of its source code or a link to a code repository for the methodology described. |
| Open Datasets | No | The paper does not provide concrete access information (link, DOI, repository, or formal citation with authors/year) for any publicly available or open dataset used for training. It discusses examples of models but does not describe empirical evaluation on specific datasets. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing. The paper is theoretical in nature and focuses on convergence proofs. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running experiments. The paper is theoretical and does not detail an experimental setup requiring such information. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | No | The paper does not contain specific experimental setup details such as concrete hyperparameter values, training configurations, or system-level settings. The paper is theoretical and focuses on algorithm design and convergence proofs. |
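The pseudocode row above describes a projected stochastic primal-dual (descent-ascent) scheme. The following is a minimal sketch under stated assumptions: Π_{S_B} is taken to be Euclidean-ball projection, and a hypothetical quadratic toy objective F(θ, ω; X) = (θ − X)·ω − ½‖ω‖² stands in for the paper's neural min-max objective (8). The names `project_ball` and `primal_dual_sgd`, and the toy F itself, are illustrative, not the authors' code.

```python
import numpy as np

def project_ball(w, B):
    """Euclidean projection onto {v : ||v|| <= B}, one concrete choice for Pi_{S_B}."""
    norm = np.linalg.norm(w)
    return w if norm <= B else w * (B / norm)

def primal_dual_sgd(grad_theta, grad_omega, theta, omega, samples, eta, B):
    """Algorithm-1-style updates: projected descent in theta, projected ascent in omega."""
    for x in samples:
        g_th = grad_theta(theta, omega, x)
        g_om = grad_omega(theta, omega, x)
        theta = project_ball(theta - eta * g_th, B)  # primal (minimizing) player
        omega = project_ball(omega + eta * g_om, B)  # dual (maximizing) player
    return theta, omega

# Toy saddle-point instance (hypothetical, not the paper's neural F):
# F(theta, omega; X) = (theta - X) . omega - 0.5 ||omega||^2,
# whose saddle point satisfies theta* = E[X].
rng = np.random.default_rng(0)
mu = np.array([1.0, -1.0])
samples = mu + 0.1 * rng.standard_normal((4000, 2))

theta_hat, omega_hat = primal_dual_sgd(
    grad_theta=lambda th, om, x: om,
    grad_omega=lambda th, om, x: (th - x) - om,
    theta=np.zeros(2), omega=np.zeros(2),
    samples=samples, eta=0.05, B=10.0,
)
# theta_hat should be close to mu = [1, -1]
```

On this toy problem the inner maximization over ω recovers ω* = θ − E[X], so the primal player effectively minimizes ½‖θ − E[X]‖², mirroring how the paper's minimax reformulation turns a conditional-moment objective into a two-player game.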