Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

The Nuclear Route: Sharp Asymptotics of ERM in Overparameterized Quadratic Networks

Authors: Vittorio Erba, Emanuele Troiani, Lenka Zdeborová, Florent Krzakala

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Figure 1 we compare the asymptotic results of Theorem 1 with numerical experiments at finite size run directly on the equivalent convex loss (9) (using the solver CVXPY [Diamond and Boyd, 2016, Agrawal et al., 2018]). We notice that despite the theoretical results being valid in the high-dimensional limit, they are in excellent agreement with simulation at sizes as moderate as d = 50. Details on the numerical experiments are given in Appendix F. The mapping onto a matrix problem implies immediately that Theorem 1 describes also the performance of gradient descent at convergence.
Researcher Affiliation | Academia | Vittorio Erba, Statistical Physics of Computation Laboratory, EPFL, Switzerland; Emanuele Troiani, Statistical Physics of Computation Laboratory, EPFL, Switzerland; Lenka Zdeborová, Statistical Physics of Computation Laboratory, EPFL, Switzerland; Florent Krzakala, Information, Learning & Physics Laboratory, EPFL, Switzerland
Pseudocode | Yes | The generic form of the "rectangular" GAMP algorithm reads [Javanmard and Montanari, 2013]: $u^{t+1} = A^{\top} g_t(v^t) + d_t e_t(u^t), \quad v^t = A e_t(u^t) - b_t g_{t-1}(v^{t-1})$ (25)
Open Source Code | Yes | We provide the code for the numerical implementation of the equations in Theorem 1 and the experiments at https://github.com/SPOC-group/Overparametrised_Net.
Open Datasets | No | In this paper, we thus study learning by empirical risk minimization with quadratic networks from data that is also generated by a quadratic network. Consider a dataset $D = \{x_\mu, y_\mu\}_{\mu=1}^{n}$ where the data $x_\mu \in \mathbb{R}^d$ are standard Gaussian $x_\mu \sim \mathcal{N}(0, I_d)$ (though our results allow for some universality) for $\mu = 1, \dots, n$, and the labels $y_\mu \in \mathbb{R}$ are generated by an unknown target function $f(x)$.
Dataset Splits | No | The paper uses synthetic data which is generated on the fly, not split from a pre-existing dataset. It mentions "training and test errors" and evaluating on "a new test sample x with the same distribution as the training samples", which implies a conceptual separation rather than a fixed partition.
Hardware Specification | No | All our numerical procedure can be run on common laptops.
Software Dependencies | No | For Figure 1 right and d = 100 we used the CVXPY package in Python. All the other experiments are realized in PyTorch.
Experiment Setup | Yes | We instantiate the student weights randomly as centered Gaussian variables with standard deviation 10e-3 and the target ones with variance 1, appropriately changing the functional form of the target. In Figure 1 right we optimize using LBFGS for the sake of efficiency. For Figure 1 left we optimize using the GD iteration $w^{t+1}_k = w^t_k - 2\lambda\eta\, w^t_k$ with $\eta = 20$.
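
As a rough illustration of the setup quoted in the table, the sketch below generates Gaussian inputs, labels them with a random target quadratic network, and trains a student by gradient descent with an L2 penalty, which gives the 2λη w shrinkage term in the quoted GD iteration. It is not the authors' code (their repository is linked above). Everything not quoted in the table is an assumption made for illustration: the model form f(x; W) = ||W x||^2 / d, the square loss and its normalization, the function names, and the sizes and hyperparameters. In particular the step size is 0.2 rather than the quoted η = 20, because the scaling assumed here differs from the paper's.

```python
import numpy as np


def quadratic_net(W, X):
    """Assumed model form: f(x; W) = ||W x||^2 / d for each row x of X."""
    d = X.shape[1]
    return np.sum((X @ W.T) ** 2, axis=1) / d


def make_data(n, d, m_target, rng):
    """Draw x ~ N(0, I_d) and label it with a random target quadratic network."""
    W_target = rng.standard_normal((m_target, d))  # target weights, variance 1
    X = rng.standard_normal((n, d))
    return X, quadratic_net(W_target, X), W_target


def loss_and_grad(W, X, y, lam):
    """Square loss plus lam * ||W||_F^2, and its gradient with respect to W."""
    n, d = X.shape
    Z = X @ W.T                          # (n, m) pre-activations w_k . x
    r = np.sum(Z ** 2, axis=1) / d - y   # residuals f(x; W) - y
    loss = 0.5 * np.mean(r ** 2) + lam * np.sum(W ** 2)
    grad = (2.0 / (n * d)) * (Z * r[:, None]).T @ X + 2.0 * lam * W
    return loss, grad


def run_gd(X, y, m, lam, eta, steps, rng):
    """Plain gradient descent from a small random initialization."""
    # The table quotes a standard deviation of "10e-3", read literally as 0.01.
    W = 1e-2 * rng.standard_normal((m, X.shape[1]))
    losses = []
    for _ in range(steps):
        loss, grad = loss_and_grad(W, X, y, lam)
        losses.append(loss)
        # The penalty gradient contributes the -2 * lam * eta * W shrinkage.
        W = W - eta * grad
    return W, losses


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y, W_target = make_data(n=200, d=20, m_target=1, rng=rng)
    W, losses = run_gd(X, y, m=20, lam=1e-3, eta=0.2, steps=300, rng=rng)
    # Test error on fresh samples drawn from the same distribution.
    X_test = rng.standard_normal((1000, 20))
    y_test = quadratic_net(W_target, X_test)
    test_mse = np.mean((quadratic_net(W, X_test) - y_test) ** 2)
    print(f"train loss {losses[0]:.3f} -> {losses[-1]:.3f}, test MSE {test_mse:.3f}")
```

Because fresh samples are drawn from the same Gaussian distribution, the test error is estimated without any fixed train/test partition, which matches the "Dataset Splits" entry.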