Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Parallel Deep Neural Networks Have Zero Duality Gap
Authors: Yifei Wang, Tolga Ergen, Mert Pilanci
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we prove that the duality gap for deeper linear networks with vector outputs is non-zero. In contrast, we show that the zero duality gap can be obtained by stacking standard deep networks in parallel, which we call a parallel architecture, and modifying the regularization. Therefore, we prove the strong duality and existence of equivalent convex problems that enable globally optimal training of deep networks. As a by-product of our analysis, we demonstrate that the weight decay regularization on the network parameters explicitly encourages low-rank solutions via closed-form expressions. In addition, we show that strong duality holds for three-layer standard ReLU networks given rank-1 data matrices. |
| Researcher Affiliation | Academia | Yifei Wang, Tolga Ergen & Mert Pilanci, Department of Electrical Engineering, Stanford University, EMAIL |
| Pseudocode | No | The paper provides mathematical derivations and proofs but no pseudocode or algorithm blocks. |
| Open Source Code | No | The paper is theoretical and does not mention releasing any source code for its methods. |
| Open Datasets | No | The paper is theoretical and does not use or reference specific datasets for training or evaluation. |
| Dataset Splits | No | The paper is theoretical and does not describe any dataset splits for validation. |
| Hardware Specification | No | The paper is theoretical and does not describe any hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not list any software dependencies with version numbers for experimental reproducibility. |
| Experiment Setup | No | The paper is theoretical and does not provide details about an experimental setup, such as hyperparameters or training settings. |