Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Convex Geometry and Duality of Over-parameterized Neural Networks

Authors: Tolga Ergen, Mert Pilanci

JMLR 2021

Reproducibility variables, each with its classified result and the supporting LLM response:
Research Type: Experimental
LLM response: We then run our approach in Theorem 11, i.e., denoted as Theory, and GD on these datasets. In Figure 10, we plot the mean test accuracy (solid lines) of each algorithm along with a one standard deviation confidence band (shaded regions). As illustrated in this example, our approach achieves slightly better generalization performance compared to GD. We also visualize the sample data distributions and the corresponding function fits in Figure 10a, where we provide an example to show the agreement between the solutions found by our approach and GD. We then consider classification tasks and report the performance of the algorithms on MNIST (Le Cun) and CIFAR-10 (Krizhevsky et al., 2014). In order to verify our results in Theorem 15, we run 5 SGD trials with independent initializations for the network parameters, where we use subsampled versions of the datasets. As illustrated in Figures 11 and 12, the network constructed using the closed-form solution achieves the lowest training objective and highest test accuracy for both datasets.
Researcher Affiliation: Academia
LLM response: Tolga Ergen, EMAIL, Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA; Mert Pilanci, EMAIL, Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA
Pseudocode: Yes
LLM response: We also provide the full algorithm in Algorithm 1. Algorithm 1: Cutting Plane based Training Algorithm for Two-Layer NNs (without bias) ... The complete algorithm is also presented in Algorithm 2. Algorithm 2: Convex-RF ... Algorithm 3: Cutting Plane based Training Algorithm for Two-Layer NNs (with bias)
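The paper's Algorithms 1-3 are only named in the evidence above, not reproduced. As a generic illustration of the cutting-plane family those algorithms belong to, the sketch below runs Kelley's method on a one-dimensional convex problem; the function names, tolerances, and objective are hypothetical and are not taken from the paper:

```python
# Generic Kelley cutting-plane method for minimizing a 1-D convex
# function f on [lo, hi], given a (sub)gradient oracle df.
# Illustrative only: this is NOT the paper's Algorithm 1, 2, or 3.

def cutting_plane_min(f, df, lo, hi, tol=1e-8, max_iter=200):
    pts = [lo, hi]          # points where cuts have been generated
    x_best = lo
    for _ in range(max_iter):
        # Each cut is the tangent line a*x + b with slope a = df(x0).
        cuts = [(df(x0), f(x0) - df(x0) * x0) for x0 in pts]

        def model(x):
            # Piecewise-linear lower bound on f built from all cuts.
            return max(a * x + b for a, b in cuts)

        # The model's minimum over [lo, hi] lies at an endpoint or at an
        # intersection of two cuts, so it suffices to check those points.
        cands = [lo, hi]
        for i in range(len(cuts)):
            for j in range(i + 1, len(cuts)):
                a1, b1 = cuts[i]
                a2, b2 = cuts[j]
                if abs(a1 - a2) > 1e-12:
                    x = (b2 - b1) / (a1 - a2)
                    if lo <= x <= hi:
                        cands.append(x)
        x_best = min(cands, key=model)
        # model(x_best) is a lower bound on min f, so a small gap
        # certifies near-optimality.
        if f(x_best) - model(x_best) < tol:
            break
        pts.append(x_best)
    return x_best
```

Each iteration adds one cut at the current candidate, tightening the lower-bound model until it matches the objective value to within the tolerance.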
Open Source Code: No
LLM response: The paper does not contain any explicit statement about releasing its code, nor does it provide a link to a code repository for the methodology described.
Open Datasets: Yes
LLM response: We then consider classification tasks and report the performance of the algorithms on MNIST (Le Cun) and CIFAR-10 (Krizhevsky et al., 2014). We also evaluate the performances on several regression datasets, namely Bank, Boston Housing, California Housing, Elevators, Stock (Torgo), and the Twenty Newsgroups text classification dataset (Mitchell and Learning, 1997). We also remark that all the datasets we use are publicly available and further information, e.g., training and test sizes, can be obtained through the provided references (Le Cun; Krizhevsky et al., 2014; Torgo; new).
Dataset Splits: No
LLM response: For the synthetic dataset, the paper states: 'we generate multiple datasets with nonoverlapping training and test splits.' This is a general statement without specific percentages or counts. For MNIST and CIFAR-10, the captions for Figures 11 and 12 mention: 'where (n, d) = (200, 250), K = 10, β = 10^-3, m = 100' and 'where (n, d) = (60, 60), K = 10, β = 10^-3, m = 100' respectively, indicating subsampled versions, but not the train/test/validation split ratios. The paper also refers to external references for 'training and test sizes' for some datasets, but does not state the specific splits used in its experiments in the main text.
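The paper describes nonoverlapping training and test splits without giving ratios. A minimal sketch of such a split is below; the 80/20 fraction, the seed, and the function name are illustrative assumptions, not values from the paper:

```python
import random

def split_indices(n, test_frac=0.2, seed=0):
    # Nonoverlapping train/test split over n sample indices.
    # test_frac and seed are illustrative defaults, NOT from the paper.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(test_frac * n)
    return idx[n_test:], idx[:n_test]   # (train, test)

# n = 200 matches the subsampled MNIST row count (n, d) = (200, 250).
train_idx, test_idx = split_indices(200)
```

Because the two index lists partition a single permutation, the train and test sets cannot overlap by construction.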
Hardware Specification: No
LLM response: The paper does not mention any specific hardware (e.g., GPU models, CPU types, memory specifications) used for running its experiments.
Software Dependencies: No
LLM response: In order to solve the convex optimization problems in our approach, we use CVX (Grant and Boyd, 2014). However, notice that when dealing with large datasets, e.g., CIFAR-10, plain CVX solvers might need a significant amount of memory. In order to circumvent these issues, we use SPGL1 (van den Berg and Friedlander, 2007) and Super SCS (Themelis and Patrinos, 2019) for large datasets. While these tools are named, specific version numbers are not provided.
Experiment Setup: Yes
LLM response: Figure 11: Training and test performance of 5 independent SGD trials on whitened and sampled MNIST, where (n, d) = (200, 250), K = 10, β = 10^-3, m = 100, and we use squared loss with one-hot encoding. For the method denoted as Theory, we use the layer weights in Theorem 15. Figure 12: Training and test performance of 5 independent SGD trials on whitened and sampled CIFAR-10, where (n, d) = (60, 60), K = 10, β = 10^-3, m = 100, and we use squared loss with one-hot encoding. For Theory, we use the layer weights in Theorem 15. For all the experiments, we use the regularization term (also known as weight decay) to let the algorithms generalize well on unseen data (Krogh and Hertz, 1992).
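The reported setup combines squared loss, one-hot label encoding, and weight decay with coefficient β. A minimal sketch of that objective is below; the linear predictor `X @ W` is a simplifying stand-in (the paper trains two-layer networks, which are not reproduced here), and the function names are hypothetical:

```python
import numpy as np

def one_hot(labels, K):
    # labels: length-n integer class labels in [0, K); returns an (n, K) matrix.
    Y = np.zeros((len(labels), K))
    Y[np.arange(len(labels)), labels] = 1.0
    return Y

def objective(W, X, Y, beta):
    # Squared loss plus weight decay with coefficient beta, as in the
    # reported setup. The linear predictor X @ W is a simplification;
    # the paper's two-layer ReLU networks are not reproduced here.
    residual = X @ W - Y
    return 0.5 * np.sum(residual ** 2) + 0.5 * beta * np.sum(W ** 2)
```

For example, with β = 10^-3 (the value quoted in the figure captions) the weight-decay term contributes 0.5 * 1e-3 * ||W||² to the training objective.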