Escaping Saddle-Points Faster under Interpolation-like Conditions

Authors: Abhishek Roy, Krishnakumar Balasubramanian, Saeed Ghadimi, Prasant Mohapatra

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we show that under over-parametrization several standard stochastic optimization algorithms escape saddle-points and converge to local-minimizers much faster. One of the fundamental aspects of over-parametrized models is that they are capable of interpolating the training data. We show that, under interpolation-like assumptions... the first-order oracle complexity of the Perturbed Stochastic Gradient Descent (PSGD) algorithm to reach an ε-local-minimizer matches the corresponding deterministic rate of O(1/ε²).
Researcher Affiliation | Academia | Abhishek Roy, Department of Statistics, University of California, Davis (abroy@ucdavis.edu); Krishnakumar Balasubramanian, Department of Statistics, University of California, Davis (kbala@ucdavis.edu); Saeed Ghadimi, Department of Management Sciences, University of Waterloo (sghadimi@uwaterloo.ca); Prasant Mohapatra, Department of Computer Science, University of California, Davis (pmohapatra@ucdavis.edu)
Pseudocode | Yes | Algorithm 1: Perturbed Stochastic Gradient Descent (PSGD); an illustrative sketch of this template follows the table.
Open Source Code | No | The paper is theoretical and does not mention releasing any source code or providing links to a code repository for the described methodology.
Open Datasets | No | The paper is theoretical and does not involve empirical training on datasets. No public dataset information is provided.
Dataset Splits | No | The paper is theoretical and does not involve empirical validation on datasets. No information on training, validation, or test splits is provided.
Hardware Specification | No | The paper is theoretical and does not describe any experimental hardware specifications.
Software Dependencies | No | The paper is theoretical and does not mention any software dependencies with specific version numbers.
Experiment Setup | No | The paper is theoretical and does not describe any empirical experiment setup details such as hyperparameters or training configurations.
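
The Pseudocode row above refers to Algorithm 1 (Perturbed Stochastic Gradient Descent) in the paper. As a rough illustration of the generic perturbed-SGD template, here is a minimal Python sketch, assuming the usual recipe of injecting ball-shaped noise when the observed gradient is small; the function name, step size, perturbation radius, gradient threshold, and toy objective are illustrative assumptions, not the paper's exact procedure or constants.

import numpy as np

def perturbed_sgd(grad_fn, x0, n_iters=5000, step_size=0.05,
                  perturb_radius=0.1, grad_threshold=1e-3, seed=0):
    """Sketch of perturbed SGD: follow stochastic gradients and, whenever the
    observed gradient is small (a possible saddle region), add noise drawn
    uniformly from a small ball so the iterate can escape."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        g = grad_fn(x)  # (stochastic) first-order oracle call
        if np.linalg.norm(g) <= grad_threshold:
            # Perturbation step: uniform sample from a ball of radius
            # perturb_radius (Gaussian direction, radius scaled by U^(1/d)).
            direction = rng.normal(size=x.shape)
            direction /= np.linalg.norm(direction)
            x = x + perturb_radius * rng.uniform() ** (1.0 / x.size) * direction
            g = grad_fn(x)
        x = x - step_size * g  # standard (S)GD update
    return x

# Toy usage: f(x) = (x1^2 - 1)^2 + x2^2 has a strict saddle at the origin and
# local minimizers at (+-1, 0); started at the origin, unperturbed gradient
# descent stalls, while the perturbation lets the iterate reach a minimizer.
toy_grad = lambda x: np.array([4.0 * x[0] * (x[0] ** 2 - 1.0), 2.0 * x[1]])
print(perturbed_sgd(toy_grad, x0=[0.0, 0.0]))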