Gradient Descent Learns One-hidden-layer CNN: Don’t be Afraid of Spurious Local Minima

Authors: Simon Du, Jason Lee, Yuandong Tian, Aarti Singh, Barnabas Poczos

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 'In Section 6, we use simulations to verify our theories. We also illustrate our theoretical results with numerical experiments.'
Researcher Affiliation | Collaboration | (1) Machine Learning Department, Carnegie Mellon University; (2) Department of Data Sciences and Operations, University of Southern California; (3) Facebook Artificial Intelligence Research.
Pseudocode | Yes | Algorithm 1: Gradient Descent for Learning One-Hidden Layer CNN with Weight Normalization.
Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository.
Open Datasets | No | The paper describes the input as 'Gaussian input Z' where 'every entry of Z is sampled from a Gaussian distribution with mean 0 and variance 1', but it does not provide concrete access information (link, DOI, or specific citation) for a publicly available dataset.
Dataset Splits | No | The paper does not specify training, validation, or test splits for experimental reproduction.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers needed for replication.
Experiment Setup | No | The paper provides some model parameters (k, p) for the experimental section and mentions a learning rate in Algorithm 1, but it does not state concrete hyperparameter values such as the learning rate used, batch size, number of epochs, or optimizer settings for the numerical experiments.
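
The sketch below illustrates the kind of setup the table describes: Gaussian inputs with zero mean and unit variance, a one-hidden-layer CNN that applies a shared filter to k patches of size p, and training by gradient descent on a squared loss against labels from a teacher network. It is not the authors' code: the values of k, p, the sample size, the learning rate, and the number of steps are illustrative assumptions (the paper does not report concrete values), the patches are taken as independent non-overlapping Gaussian blocks, and plain gradient descent is used rather than the weight-normalized parameterization of Algorithm 1.

```python
# Minimal sketch of the reported experimental setting (assumed values throughout).
import numpy as np

rng = np.random.default_rng(0)

k, p = 8, 6            # number of patches and patch size (assumed)
n = 2000               # number of Gaussian samples (assumed)
lr, steps = 0.05, 500  # learning rate and iteration count (assumed)

# Teacher (ground-truth) filter and output weights used to generate labels.
w_star = rng.standard_normal(p)
a_star = rng.standard_normal(k)

def relu(x):
    return np.maximum(x, 0.0)

def predict(Z, w, a):
    # Z has shape (n, k, p): one p-dimensional patch per location.
    # f(Z, w, a) = sum_j a_j * relu(w^T Z_j)
    return relu(Z @ w) @ a

# Gaussian input: every entry sampled from N(0, 1), as quoted in the table.
Z = rng.standard_normal((n, k, p))
y = predict(Z, w_star, a_star)

# Student parameters, randomly initialized.
w = rng.standard_normal(p)
a = rng.standard_normal(k)

for t in range(steps):
    h = Z @ w                         # pre-activations, shape (n, k)
    act = relu(h)
    resid = act @ a - y               # prediction error, shape (n,)
    # Gradients of the empirical squared loss 0.5 * mean((f - y)^2).
    grad_a = act.T @ resid / n
    grad_w = np.einsum('n,nk,nkp->p', resid, (h > 0) * a, Z) / n
    a -= lr * grad_a
    w -= lr * grad_w

print("final loss:", 0.5 * np.mean((predict(Z, w, a) - y) ** 2))
```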