How to Characterize The Landscape of Overparameterized Convolutional Neural Networks

Authors: Yihong Gu, Weizhong Zhang, Cong Fang, Jason D. Lee, Tong Zhang

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We start by using our proposed technique, NNG, to show that two overparameterized CNNs with the same architecture and initialization method, though initialized differently at the beginning of training, follow the same solution path throughout training, which can be explained by our theory in Section 5.1. We then give empirical evidence for the convexity of the overparameterized CNN by showing the uniqueness of its optimal solution and visualizing its loss landscape in Section 6.2. All our empirical findings are consistent across a range of architectures and datasets; we present only the results on CIFAR-10 with VGG-16 below and defer the results on other architectures and datasets to the appendix.
Researcher Affiliation | Academia | Princeton University; Hong Kong University of Science and Technology
Pseudocode | Yes | Detailed steps are given in Alg. 1 in the Appendix. An optimization-based algorithm, Alg. 2 in the Appendix, is designed to construct θγ (an illustrative interpolation sketch follows the table below).
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, such as a repository link or an explicit statement of code release.
Open Datasets | Yes | All our empirical findings are consistent across a range of architectures and datasets; we present only the results on CIFAR-10 with VGG-16 below and defer the results on other architectures and datasets to the appendix.
Dataset Splits | No | The paper mentions "validation error" but does not specify the dataset splits (e.g., percentages, sample counts, or a citation to a predefined split) needed to reproduce the data partitioning for validation.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running its experiments (e.g., exact CPU/GPU models, memory details, or specific cluster configurations).
Software Dependencies | No | The paper does not provide specific ancillary software dependencies with version numbers needed to replicate the experiment.
Experiment Setup | Yes | We use the ℓ1,2 regularizer and save intermediate checkpoints of the NN parameters at time-steps t ∈ {1, 2, 5, 8} ∪ {10k : k ∈ ℕ+} throughout this section. We let θ1 and θ2 be VGG-16 networks trained for 2 epochs using the he_uniform and he_normal initializers, attaining 19% and 57% accuracy respectively...
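
The Experiment Setup row pins down two implementable details: the checkpoint schedule t ∈ {1, 2, 5, 8} ∪ {10k : k ∈ ℕ+} and an ℓ1,2 regularizer applied while training two differently initialized VGG-16 networks. The following is a minimal sketch of that setup, assuming PyTorch; the per-filter grouping in l12_penalty and the mapping of Keras's he_uniform/he_normal to PyTorch's Kaiming initializers are our assumptions, not details confirmed by the paper.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

def he_init(model: nn.Module, mode: str) -> None:
    """Apply He (Kaiming) initialization to all conv/linear weights.
    mode is "uniform" or "normal", mirroring Keras's he_uniform / he_normal."""
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            init = nn.init.kaiming_uniform_ if mode == "uniform" else nn.init.kaiming_normal_
            init(m.weight, nonlinearity="relu")
            if m.bias is not None:
                nn.init.zeros_(m.bias)

def l12_penalty(model: nn.Module) -> torch.Tensor:
    """ell_{1,2} penalty: sum over groups of each group's L2 norm.
    Grouping by output filter of each conv layer is an assumption;
    the paper's appendix defines the exact grouping."""
    return sum(p.flatten(1).norm(dim=1).sum()
               for p in model.parameters() if p.dim() == 4)

def is_checkpoint_step(t: int) -> bool:
    """Checkpoint schedule quoted above: t in {1, 2, 5, 8} or a positive multiple of 10."""
    return t in {1, 2, 5, 8} or (t > 0 and t % 10 == 0)

# Two VGG-16 networks sharing an architecture and initialization method
# family but starting from different random initializations.
theta1 = vgg16(num_classes=10)  # CIFAR-10 has 10 classes
theta2 = vgg16(num_classes=10)
he_init(theta1, "uniform")  # he_uniform
he_init(theta2, "normal")   # he_normal
```

In a training loop, one would add lam * l12_penalty(model) to the data loss and call torch.save(model.state_dict(), ...) whenever is_checkpoint_step(t) is true; lam is a hypothetical regularization weight that the quoted text does not specify.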
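The Pseudocode row's θγ is constructed by an optimization-based procedure (Alg. 2) that this report does not reproduce. As a generic illustration only, and assuming θγ denotes a point on a path between two solutions θ1 and θ2, the simplest such construction is linear interpolation of parameters; the paper's actual algorithm is optimization-based and may differ substantially.

```python
import torch

@torch.no_grad()
def interpolate_state(sd1: dict, sd2: dict, gamma: float) -> dict:
    """Entrywise theta_gamma = (1 - gamma) * theta1 + gamma * theta2 over two
    state_dicts with matching keys. An illustrative stand-in, not the paper's Alg. 2."""
    out = {}
    for k, v in sd1.items():
        if torch.is_floating_point(v):
            out[k] = (1.0 - gamma) * v + gamma * sd2[k]
        else:
            out[k] = v  # non-float buffers (e.g., counters) copied as-is
    return out

# Hypothetical usage: sweep gamma and record validation loss at each theta_gamma.
# for gamma in [i / 10 for i in range(11)]:
#     model.load_state_dict(interpolate_state(sd1, sd2, gamma))
```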