Uniform Convergence of Gradients for Non-Convex Learning and Optimization
Authors: Dylan J. Foster, Ayush Sekhari, Karthik Sridharan
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | The goal of the present work is to introduce learning-theoretic tools to, in a general sense, improve understanding of when and why gradient-based methods succeed for non-convex learning problems. Our precise technical contributions are as follows: We bring vector-valued Rademacher complexities [30] and associated vector-valued contraction principles to bear on the analysis of uniform convergence for gradients. (A brief sketch of these tools appears after this table.) |
| Researcher Affiliation | Academia | Dylan J. Foster (Cornell University, djfoster@cornell.edu); Ayush Sekhari (Cornell University, sekhari@cs.cornell.edu); Karthik Sridharan (Cornell University, sridharan@cs.cornell.edu) |
| Pseudocode | No | The paper describes a 'meta-algorithm' and references other algorithms, but it does not include any formal pseudocode blocks or algorithm listings with labels such as 'Algorithm 1'. |
| Open Source Code | No | The paper does not contain any statement about releasing open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper is theoretical and focuses on mathematical analysis of learning problems. It does not conduct empirical experiments using specific datasets, and therefore does not provide access information for a publicly available dataset. |
| Dataset Splits | No | The paper is theoretical and does not describe empirical experiments or dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is purely theoretical and does not describe any computational experiments, so there is no mention of specific hardware specifications used for running experiments. |
| Software Dependencies | No | The paper is purely theoretical and does not describe any computational experiments. It does not list specific software dependencies with version numbers that would be required for replication. |
| Experiment Setup | No | The paper is purely theoretical and does not describe any experimental setup details, hyperparameters, or training configurations. |
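
For readers unfamiliar with the tools named in the Research Type row, the following is a minimal sketch of the central quantity the paper studies and of the vector-valued contraction principle it imports from [30]. The notation here is illustrative of the standard setup, not the paper's exact definitions; the displayed inequality is Maurer's vector contraction inequality, which [30] refers to.

```latex
% Setup (illustrative): loss f(w; z), population risk L(w) = E_z[f(w; z)],
% and empirical risk over a sample S = (z_1, ..., z_n):
\[
  \widehat{L}_S(w) \;=\; \frac{1}{n}\sum_{i=1}^{n} f(w; z_i).
\]
% "Uniform convergence of gradients" refers to bounding the worst-case
% deviation of empirical gradients from population gradients over the
% hypothesis class W:
\[
  \sup_{w \in \mathcal{W}} \bigl\lVert \nabla L(w) - \nabla \widehat{L}_S(w) \bigr\rVert .
\]
% Vector-valued contraction (Maurer, [30]): for a class F of functions
% mapping into R^K and maps h_i : R^K -> R that are 1-Lipschitz w.r.t.
% the Euclidean norm,
\[
  \mathbb{E}_{\epsilon}\, \sup_{f \in \mathcal{F}} \sum_{i=1}^{n} \epsilon_i\, h_i\bigl(f(x_i)\bigr)
  \;\le\; \sqrt{2}\;
  \mathbb{E}_{\epsilon}\, \sup_{f \in \mathcal{F}} \sum_{i=1}^{n}\sum_{k=1}^{K} \epsilon_{ik}\, f_k(x_i),
\]
% where the \epsilon_i and \epsilon_{ik} are i.i.d. Rademacher (+/-1)
% random variables. Applying this coordinate-wise to gradient components
% is the route by which vector-valued Rademacher complexities control the
% uniform gradient deviation above.
```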