Implicit Bias of Gradient Descent on Linear Convolutional Networks
Authors: Suriya Gunasekar, Jason D. Lee, Daniel Soudry, Nati Srebro
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We show that gradient descent on full-width linear convolutional networks of depth L converges to a linear predictor related to the ℓ2/L bridge penalty in the frequency domain. This is in contrast to fully connected linear networks, where, regardless of depth, gradient descent converges to the ℓ2 maximum-margin solution. Finally, in this paper we focus on characterizing which global minimum gradient descent converges to on over-parameterized linear models, while assuming that, for an appropriate choice of step sizes, the gradient descent iterates asymptotically minimize the optimization objective. (An illustrative sketch of this frequency-domain bias follows the table.) |
| Researcher Affiliation | Collaboration | Suriya Gunasekar, TTI at Chicago, USA, suriya@ttic.edu; Jason D. Lee, USC Los Angeles, USA, jasonlee@marshall.usc.edu; Daniel Soudry, Technion, Israel, daniel.soudry@gmail.com; Nathan Srebro, TTI at Chicago, USA, nati@ttic.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement about releasing open-source code or provide a link to a code repository for the described methodology. |
| Open Datasets | No | The paper discusses a 'separable linear classification dataset {(x_n, y_n) : n = 1, 2, ..., N}' as a general setting for its theoretical analysis, but does not mention or provide access information for a specific, publicly available dataset. |
| Dataset Splits | No | The paper is theoretical and does not describe empirical experiments; it therefore does not provide details of training, validation, or test dataset splits. |
| Hardware Specification | No | The paper focuses on theoretical analysis and does not describe any experiments that would require hardware, thus no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe any implementation details that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe any empirical experimental setup, including hyperparameters or system-level training settings. |
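Paraphrasing the result quoted in the Research Type row: for a depth-L full-width linear convolutional network trained by gradient descent on the exponential loss over linearly separable data, the paper characterizes the limit direction of the end-to-end linear predictor β via an ℓ2/L-bridge-penalized maximum-margin problem in the frequency domain,

$$\min_{\beta}\ \big\|\widehat{\beta}\big\|_{2/L} \quad \text{s.t.} \quad y_n \langle x_n, \beta \rangle \ge 1 \ \ \forall n,$$

where $\widehat{\beta}$ is the discrete Fourier transform of β (a global minimizer for L = 2, where the penalty is ℓ1; a first-order stationary point for L ≥ 3). Since no code accompanies the paper, the snippet below is a minimal illustrative sketch of this bias, not the authors' implementation. It assumes PyTorch; the dimensions, depth, learning rate, step count, and planted-predictor data generator are arbitrary choices, and composing the layers as plain circular convolutions (rather than the paper's cross-correlations followed by a final fully connected layer) preserves the same frequency-domain product structure up to conjugation.

```python
import torch

torch.manual_seed(0)
D, N, L = 16, 30, 2                       # input dim, samples, depth (all arbitrary)

# Linearly separable data: labels from a planted linear predictor.
X = torch.randn(N, D)
beta_star = torch.randn(D)
y = torch.sign(X @ beta_star)

# One full-width circular-convolution filter per layer, small init.
W = [(0.1 * torch.randn(D)).requires_grad_(True) for _ in range(L)]

def circ_conv(a, b):
    # Circular convolution via FFT; the DFT diagonalizes every conv layer.
    return torch.fft.ifft(torch.fft.fft(a) * torch.fft.fft(b)).real

def effective_beta(W):
    # Collapse the linear layers into one end-to-end linear predictor beta.
    beta = W[0]
    for w in W[1:]:
        beta = circ_conv(beta, w)
    return beta

opt = torch.optim.SGD(W, lr=1e-2)
for step in range(20000):
    opt.zero_grad()
    margins = y * (X @ effective_beta(W))
    loss = torch.exp(-margins).mean()     # exponential loss, as in the paper
    loss.backward()
    opt.step()

beta = effective_beta(W).detach()
print("final loss:", torch.exp(-y * (X @ beta)).mean().item())
# For L = 2 the implied penalty is l1 in frequency, so |DFT(beta)| should be
# biased toward a few dominant frequencies.
print("|DFT(beta)|:", torch.fft.fft(beta).abs().round(decimals=3))
```

Per the paper, replacing the convolutional filters in this sketch with dense weight matrices should instead recover the ℓ2 maximum-margin direction regardless of depth, which is one way to probe the contrast the abstract draws.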