Gradient Descent Learns One-hidden-layer CNN: Don’t be Afraid of Spurious Local Minima
Authors: Simon Du, Jason Lee, Yuandong Tian, Aarti Singh, Barnabas Poczos
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 6, we use simulations to verify our theories. We also illustrate our theoretical results with numerical experiments. |
| Researcher Affiliation | Collaboration | Machine Learning Department, Carnegie Mellon University; Department of Data Sciences and Operations, University of Southern California; Facebook Artificial Intelligence Research. |
| Pseudocode | Yes | Algorithm 1: Gradient Descent for Learning One-Hidden Layer CNN with Weight Normalization (a hedged re-implementation sketch of this setup appears after the table). |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | No | The paper describes the input only as 'Gaussian input Z' in which 'every entry of Z is sampled from a Gaussian distribution with mean 0 and variance 1'; it does not provide concrete access information (link, DOI, or specific citation to a dataset resource) for a publicly available dataset. |
| Dataset Splits | No | The paper does not specify dataset splits (training, validation, or test) for experimental reproduction. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers needed for replication. |
| Experiment Setup | No | The paper reports some model parameters (k, p) for the experimental section and refers to a 'learning rate' in Algorithm 1, but it does not give concrete hyperparameter values for the numerical experiments, such as the learning rate used, batch size, number of epochs, or optimizer settings. |
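
For readers who want a starting point for filling in the unreported details, the sketch below shows one way to regenerate the synthetic setup described in the table: the input Z is drawn entrywise from N(0, 1) as the paper states, a planted one-hidden-layer CNN produces the labels, and a student filter is trained by gradient descent under weight normalization in the spirit of Algorithm 1. All concrete choices here (the patch layout, the values of k, p, and n, the initialization, the learning rate, and the number of steps) are assumptions made for illustration; the paper does not report them, and the exact parameterization used in Algorithm 1 may differ.

```python
import numpy as np

# Hypothetical sizes: p = filter dimension, k = number of non-overlapping
# patches, n = number of Gaussian samples. The paper mentions k and p in
# Section 6 but does not report the values assumed here.
p, k, n = 10, 5, 1000
rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def predict(Z, v, a):
    """One-hidden-layer CNN with weight normalization (assumed form):
    the effective filter w = v / ||v||_2 is shared across the k patches,
    and the patch activations are combined with output weights a."""
    w = v / np.linalg.norm(v)
    # Z has shape (n, k, p); apply the shared filter to every patch.
    return relu(Z @ w) @ a / k

# Every entry of the input Z is sampled from N(0, 1), as stated in the paper.
Z = rng.standard_normal((n, k, p))

# Planted teacher parameters (hypothetical scale) and the resulting labels.
w_star = rng.standard_normal(p)
w_star /= np.linalg.norm(w_star)
a_star = rng.standard_normal(k)
y = relu(Z @ w_star) @ a_star / k

# Student parameters; gradient descent on the squared loss 0.5 * mean (f - y)^2.
v = rng.standard_normal(p)
a = rng.standard_normal(k)
lr = 0.1  # learning rate not reported in the paper; chosen arbitrarily
for step in range(2000):
    w = v / np.linalg.norm(v)
    H = relu(Z @ w)                      # (n, k) patch activations
    resid = H @ a / k - y                # (n,) residuals
    grad_a = H.T @ resid / (n * k)
    # Gradient with respect to the normalized filter w, then pulled back to v
    # through the map w = v / ||v||, whose Jacobian is (I - w w^T) / ||v||.
    mask = (Z @ w > 0).astype(float)     # ReLU derivative per patch
    grad_w = np.einsum('n,nk,nkp->p', resid, mask * a / k, Z) / n
    grad_v = (np.eye(p) - np.outer(w, w)) @ grad_w / np.linalg.norm(v)
    v -= lr * grad_v
    a -= lr * grad_a

print("final training loss:", 0.5 * np.mean((predict(Z, v, a) - y) ** 2))
```

The analytic gradients above follow directly from the squared loss of the assumed model; substituting the paper's exact parameterization of Algorithm 1 would only change the `predict` function and the corresponding gradient lines.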