Parallel Deep Neural Networks Have Zero Duality Gap

Authors: Yifei Wang, Tolga Ergen, Mert Pilanci

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we prove that the duality gap for deeper linear networks with vector outputs is non-zero. In contrast, we show that zero duality gap can be obtained by stacking standard deep networks in parallel, which we call a parallel architecture, and modifying the regularization. Therefore, we prove strong duality and the existence of equivalent convex problems that enable globally optimal training of deep networks. As a by-product of our analysis, we demonstrate that weight decay regularization on the network parameters explicitly encourages low-rank solutions via closed-form expressions. In addition, we show that strong duality holds for three-layer standard ReLU networks given rank-1 data matrices. (An illustrative sketch of such a parallel architecture is given after the table.)
Researcher Affiliation | Academia | Yifei Wang, Tolga Ergen & Mert Pilanci, Department of Electrical Engineering, Stanford University, {wangyf18,ergen,pilanci}@stanford.edu
Pseudocode | No | The paper provides mathematical derivations and proofs but no pseudocode or algorithm blocks.
Open Source Code | No | The paper is theoretical and does not mention releasing any source code for its methods.
Open Datasets | No | The paper is theoretical and does not use or reference specific datasets for training or evaluation.
Dataset Splits | No | The paper is theoretical and does not describe any dataset splits for validation.
Hardware Specification | No | The paper is theoretical and does not describe any hardware used for experiments.
Software Dependencies | No | The paper is theoretical and does not list any software dependencies with version numbers for experimental reproducibility.
Experiment Setup | No | The paper is theoretical and does not provide details about an experimental setup, such as hyperparameters or training settings.
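The following is a minimal sketch, not the authors' code, of the "parallel architecture" described in the abstract: several standard deep ReLU sub-networks are stacked in parallel and their outputs summed, trained here with plain weight decay in PyTorch. The class name ParallelNet, the branch count, and all hyperparameters are illustrative assumptions; the paper's modified regularization and its equivalent convex reformulation are not reproduced.

```python
# Illustrative sketch only (assumed names and hyperparameters): K standard
# L-layer ReLU sub-networks stacked in parallel, outputs summed, trained with
# ordinary weight decay. The paper's modified regularization that yields zero
# duality gap is NOT implemented here.
import torch
import torch.nn as nn


class ParallelNet(nn.Module):
    def __init__(self, d_in: int, d_hidden: int, d_out: int,
                 depth: int = 3, k_branches: int = 8):
        super().__init__()

        def branch() -> nn.Sequential:
            # One standard depth-layer ReLU sub-network.
            layers = [nn.Linear(d_in, d_hidden), nn.ReLU()]
            for _ in range(depth - 2):
                layers += [nn.Linear(d_hidden, d_hidden), nn.ReLU()]
            layers += [nn.Linear(d_hidden, d_out)]
            return nn.Sequential(*layers)

        self.branches = nn.ModuleList([branch() for _ in range(k_branches)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The parallel architecture outputs the sum of its sub-network outputs.
        return torch.stack([b(x) for b in self.branches], dim=0).sum(dim=0)


if __name__ == "__main__":
    net = ParallelNet(d_in=10, d_hidden=32, d_out=1)
    x, y = torch.randn(64, 10), torch.randn(64, 1)
    # weight_decay here is plain squared-l2 regularization on all parameters;
    # the paper modifies this regularization to obtain strong duality.
    opt = torch.optim.SGD(net.parameters(), lr=1e-2, weight_decay=1e-3)
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(x), y)
    loss.backward()
    opt.step()
    print(float(loss))
```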