Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs

Authors: Alon Brutzkus, Amir Globerson

ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper states: "We provide an empirical demonstration of this in Section 6 where gradient descent is shown to succeed in the Gaussian case and fail for a different distribution" and "Here we empirically demonstrate both the easy and hard cases."
Researcher Affiliation | Academia | "Tel Aviv University, Blavatnik School of Computer Science. Correspondence to: Alon Brutzkus <alonbrutzkus@mail.tau.ac.il>, Amir Globerson <gamir@cs.tau.ac.il>."
Pseudocode | No | The paper presents its derivations and algorithms through equations and narrative text, but it does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper does not provide any explicit statement or link indicating that source code for its methodology is publicly available.
Open Datasets | No | The paper describes generating synthetic training data for its empirical illustration ('To generate the hard case, we begin with a set splitting problem...', 'we use a Gaussian distribution G as defined earlier and generate a training set...'), but it does not reference a publicly available dataset with concrete access information such as a link, DOI, or formal citation (see the data-generation sketch below the table).
Dataset Splits | No | The paper's empirical section illustrates the easy and hard cases but does not report training, validation, or test splits, such as percentages or sample counts.
Hardware Specification | No | The paper does not report the hardware used for its experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions using 'AdaGrad (Duchi et al., 2011)' for optimization, but it does not give version numbers for this or any other software dependency.
Experiment Setup | No | The paper mentions a 'random normal initializer' and choosing the 'best performing learning rate schedule' for AdaGrad, but it does not report specific hyperparameter values or detailed training configurations (see the training-loop sketch below the table).
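For readers attempting a reproduction, the sketch below shows one way the synthetic Gaussian training data quoted in the Open Datasets row could be generated. It assumes the paper's no-overlap setting, in which an input is split into non-overlapping patches and labels come from a planted filter passed through ReLU units and average pooling; the function name generate_gaussian_dataset, the sample count, the number of patches, and the patch dimension are illustrative choices rather than values taken from the paper.

    import numpy as np

    def generate_gaussian_dataset(n_samples=1000, n_patches=8, patch_dim=5, seed=0):
        """Draw i.i.d. standard-Gaussian inputs and label them with a planted
        no-overlap ConvNet: f(x; w) = mean_i ReLU(w . x_i).
        All sizes here are illustrative; the paper does not report them."""
        rng = np.random.default_rng(seed)
        # Planted ground-truth filter w* (normalized for convenience).
        w_star = rng.standard_normal(patch_dim)
        w_star /= np.linalg.norm(w_star)
        # Each sample consists of n_patches non-overlapping Gaussian patches.
        X = rng.standard_normal((n_samples, n_patches, patch_dim))
        # Targets from the planted filter: ReLU units followed by average pooling.
        y = np.maximum(X @ w_star, 0.0).mean(axis=1)
        return X, y, w_star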
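Along the same lines, since the Experiment Setup row notes that AdaGrad and a random normal initializer are named without hyperparameters, the following sketch shows how those choices might be wired together. AdaGrad is written out directly (per Duchi et al., 2011: accumulate squared gradients and divide the step by their square root); the learning rate, step count, epsilon, and the squared-loss gradient for the assumed no-overlap ConvNet are illustrative assumptions, not settings reported by the authors.

    import numpy as np

    def train_adagrad(X, y, lr=0.1, n_steps=2000, eps=1e-8, seed=1):
        """Fit the single filter w of the assumed no-overlap ConvNet by minimizing
        squared loss with AdaGrad from a random normal initialization.
        Learning rate, step count, and eps are illustrative defaults."""
        rng = np.random.default_rng(seed)
        n, k, d = X.shape
        w = rng.standard_normal(d)      # "random normal initializer"
        grad_sq_sum = np.zeros(d)       # AdaGrad accumulator of squared gradients
        for _ in range(n_steps):
            pre = X @ w                                  # (n, k) pre-activations
            pred = np.maximum(pre, 0.0).mean(axis=1)     # ReLU + average pooling
            err = pred - y
            # Gradient of 0.5 * mean((pred - y)^2) with respect to w.
            relu_grad = (pre > 0).astype(float) / k
            grad = np.einsum('n,nk,nkd->d', err, relu_grad, X) / n
            # AdaGrad step: per-coordinate rate lr / sqrt(accumulated squares).
            grad_sq_sum += grad ** 2
            w -= lr * grad / (np.sqrt(grad_sq_sum) + eps)
        return w

Applied to the dataset sketch above, the recovered filter w can be compared against the planted w_star to mirror the easy (Gaussian) case the paper describes.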