reproducibilityindex.ai

Learning One Convolutional Layer with Overlapping Patches

Authors: Surbhi Goel, Adam Klivans, Raghu Meka

ICML 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	6. Experiments: SGD vs Convotron. To further support our theoretical findings, we empirically compare the performance of SGD (Algorithm 3) with our algorithm Convotron. We measure performance based on the failure probability, that is, the fraction of runs the algorithm fails to converge on randomly initialized runs (the randomness is over both the choice of initialization for SGD and the draws from the distribution).
Researcher Affiliation	Academia	(1) Department of Computer Science, University of Texas at Austin; (2) Department of Computer Science, UCLA.
Pseudocode	Yes	Algorithm 1 Convotron; Algorithm 2 Convotron-No-Overlap; Algorithm 3 SGD
Open Source Code	No	The paper does not provide concrete access to source code for the methodology described.
Open Datasets	No	The paper mentions generating synthetic data (
Dataset Splits	No	The paper does not provide specific dataset split information for training, validation, or testing.
Hardware Specification	No	The paper does not provide specific hardware details used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details with version numbers.
Experiment Setup	Yes	In the experiments, given a fixed true weight vector, for varying learning rates (increments of 0.01), we choose 50 random initializations and run the two algorithms with them as starting points. We plot the failure probability (θ = 0.1) with varying learning rate. Note that the lowest learning rate we use is 0.01 as making the learning rate too small requires high number of iterations for convergence for both algorithms. We first test the performance on a simple 1D convolution case with (n, k, d, T) = (8, 4, 1, 6000) and 2D case with (n1, n2, k1, k2, d1, d2, T) = (5, 5, 3, 3, 1, 1, 15000) on inputs drawn from a normalized (l2 norm 1) Gaussian distribution with identity covariance matrix.
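
Since the paper does not release code, the failure-probability measurement in the Experiment Setup row can only be sketched. The harness below is one possible reading, not the authors' implementation. It assumes a run counts as a failure when the final weight vector is farther than θ from the true one in l2 distance, and it draws initializations from a standard Gaussian. The learner `toy_linear_sgd` is a hypothetical stand-in (plain SGD on a linear model); it is neither Convotron nor the paper's Algorithm 3, and all function names here are illustrative.

```python
import math
import random


def sample_unit_gaussian(n, rng):
    """Draw x ~ N(0, I_n) and rescale it to l2 norm 1."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


def toy_linear_sgd(w_true, w_init, lr, steps, rng):
    """Hypothetical stand-in learner: SGD on y = <w, x>.

    This is NOT Convotron and NOT the paper's SGD; it only gives the
    harness something runnable to evaluate.
    """
    w = list(w_init)
    n = len(w)
    for _ in range(steps):
        x = sample_unit_gaussian(n, rng)
        y = sum(a * b for a, b in zip(w_true, x))   # noiseless label
        pred = sum(a * b for a, b in zip(w, x))
        w = [wi + lr * (y - pred) * xi for wi, xi in zip(w, x)]
    return w


def failure_probability(learner, w_true, lr, steps,
                        num_inits=50, theta=0.1, seed=0):
    """Fraction of random initializations that end farther than theta
    from w_true (assumed failure criterion)."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(num_inits):
        # Assumed initialization scheme: standard Gaussian entries.
        w_init = [rng.gauss(0.0, 1.0) for _ in w_true]
        w = learner(w_true, w_init, lr, steps, rng)
        dist = math.dist(w, w_true)
        if not dist <= theta:  # also counts NaN as a failure
            failures += 1
    return failures / num_inits


if __name__ == "__main__":
    w_true = [1.0] * 8  # n = 8, matching the 1D case's input dimension
    # Learning rates in increments of 0.01, starting at 0.01.
    for i in range(1, 4):
        lr = round(0.01 * i, 2)
        p = failure_probability(toy_linear_sgd, w_true, lr, steps=6000)
        print(f"lr={lr:.2f} failure probability={p:.2f}")
```

Swapping `toy_linear_sgd` for an implementation of Algorithm 1 or Algorithm 3 with the same call signature would give a curve comparable in shape to the one the paper describes, though the exact numbers depend on initialization and convergence details the summary above does not specify.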