Training Linear Neural Networks: Non-Local Convergence and Complexity Results

Authors: Armin Eftekhari

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This paper identifies conditions under which gradient flow provably trains a linear network, despite the non-strict saddle points present in the optimization landscape. It also establishes the computational complexity of training linear networks with gradient flow. To achieve these results, the work develops machinery to provably identify the stable set of gradient flow, which in turn improves on the state of the art in the linear-network literature (Bah et al., 2019; Arora et al., 2018a).
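For context, the gradient flow referred to above is the continuous-time limit of gradient descent. For a depth-N linear network with weights W_1, …, W_N and training loss L, it can be sketched as follows (a standard formulation in this literature; the paper's own induced flow, its equation (17), is not reproduced here):

```latex
\dot{W}_j(t) = -\nabla_{W_j}\, L\big(W_N(t) \cdots W_1(t)\big),
\qquad j = 1, \dots, N .
```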
Researcher Affiliation | Academia | Department of Mathematics and Mathematical Statistics, Umeå University, Sweden. AE is indebted to Holger Rauhut, Ulrich Terstiege, and Gongguo Tang for insightful discussions. Correspondence to: Armin Eftekhari <armin.eftekhari@umu.se>.
Pseudocode | No | No pseudocode or algorithm blocks are provided in the paper.
Open Source Code | No | The paper does not provide any links or explicit statements about the availability of source code for the described methodology.
Open Datasets | No | The paper describes a "randomly-generated whitened training dataset" for a numerical example, but this is not a publicly available dataset with concrete access information (link, citation, or repository).
Dataset Splits | No | The paper does not specify training, validation, or test dataset splits.
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory) used for running the numerical example.
Software Dependencies | No | The paper mentions implementing the "discretization of (17) obtained from the explicit (or forward) Euler method" but does not name any software packages or version numbers.
Experiment Setup | Yes | Suppose that the sample size is m = 50, and consider a randomly-generated whitened training dataset ... with d_x = 5 and d_y = 1. ... We also set ‖W_0‖_2 = 10‖Z‖_2. Instead of the induced flow (17), we implemented the discretization of (17) obtained from the explicit (or forward) Euler method with a step size of 10^-6 for 10^5 steps.
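The setup above can be sketched in code. Since the paper's induced flow (17) is not available here, the sketch below substitutes plain gradient flow on the squared loss of a two-layer linear network, discretized with the explicit (forward) Euler method at the stated step size and iteration count; the data dimensions (m = 50, d_x = 5, d_y = 1), the whitening of X, and the initialization scale are assumptions reconstructed from the quoted description.

```python
import numpy as np

rng = np.random.default_rng(0)

# Problem sizes quoted from the paper's numerical example.
m, dx, dy = 50, 5, 1

# Randomly generated, then whitened, training inputs: enforce X @ X.T / m = I.
X = rng.standard_normal((dx, m))
U, _, Vt = np.linalg.svd(X, full_matrices=False)
X = np.sqrt(m) * U @ Vt
Y = rng.standard_normal((dy, m))

# Two-layer linear network f(x) = W2 @ W1 @ x (hypothetical choice of depth;
# the 0.1 initialization scale is also an assumption).
W1 = 0.1 * rng.standard_normal((dx, dx))
W2 = 0.1 * rng.standard_normal((dy, dx))

step, n_steps = 1e-6, 10**5  # step size and iteration count from the paper

def loss(W1, W2):
    """Squared loss 0.5 * ||W2 @ W1 @ X - Y||_F^2."""
    R = W2 @ W1 @ X - Y
    return 0.5 * np.sum(R**2)

loss0 = loss(W1, W2)
for _ in range(n_steps):
    R = W2 @ W1 @ X - Y       # residual
    G2 = R @ (W1 @ X).T       # gradient w.r.t. W2
    G1 = W2.T @ R @ X.T       # gradient w.r.t. W1
    W1 -= step * G1           # one explicit (forward) Euler step
    W2 -= step * G2
```

With such a small step size, the loss decreases slowly but monotonically, which is the regime in which the Euler discretization tracks the underlying flow.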