reproducibilityindex.ai

Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problems

Authors: Stefano Sarao Mannelli, Pierfrancesco Urbani

NeurIPS 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this work, we use dynamical mean field theory techniques to describe analytically the average dynamics of these methods in a prototypical non-convex model: the (spiked) matrix-tensor model. We derive a closed set of equations that describe the behaviour of heavy-ball momentum and Nesterov acceleration in the infinite dimensional limit. By numerical integration of these equations, we observe that these methods speed up the dynamics but do not improve the algorithmic threshold with respect to gradient descent in the spiked model. Our non-rigorous results are checked against extensive numerical simulations.
Researcher Affiliation	Academia	Stefano Sarao Mannelli, Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom, stefano.saraomannelli@psy.ox.ac.uk; Pierfrancesco Urbani, Université Paris-Saclay, CNRS, CEA, Institut de physique théorique, Gif-sur-Yvette, France, pierfrancesco.urbani@ipht.fr
Pseudocode	No	The paper defines the algorithms (Nesterov acceleration, Polyak's or heavy ball momentum, gradient descent) using mathematical equations (3-7), but does not present them in a pseudocode or algorithm block format.
Open Source Code	No	Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] We use standard algorithms and provide details on the parameters. They can be easily reproduced on any computer.
Open Datasets	No	The paper defines and uses generative models (mixed p-spin model and spiked matrix-tensor model) for its analysis and simulations. It does not refer to or provide access to pre-existing, publicly available datasets.
Dataset Splits	No	The paper conducts numerical simulations on generative models and provides simulation parameters. However, it does not specify explicit training, validation, or test dataset splits in the conventional sense of fixed datasets.
Hardware Specification	No	Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] The data for each figure can be obtained in at most one day of laptop simulation.
Software Dependencies	No	The paper provides model and algorithm parameters for the simulations (e.g., p=3, alpha=0.01, beta=0.9) but does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or specific solvers).
Experiment Setup	Yes	All the figures resulting from the experiments contain details to reproduce them. For example, Figure 1 specifies: 'The simulations in the figures have parameters p = 3, σ3 = 2/p, σ2 = 1, ridge parameter µ = 10 and input dimension N = 1024. ... heavy ball momentum in blue with α = 0.01 and β = 0.9;'
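
Since the table notes that the paper gives the algorithms only as equations, the following is a minimal sketch of the three update rules in their common textbook forms. It is not taken from the paper's equations (3-7), which are not reproduced on this page, and may differ from them in detail. The quadratic objective is a toy stand-in chosen for illustration, not the spiked matrix-tensor model; the names `curvatures`, `steps` and the function names are hypothetical. Only α = 0.01, β = 0.9 and N = 1024 come from the Figure 1 quote above.

```python
import numpy as np


def gradient_descent(grad, x0, alpha, steps):
    """Plain gradient descent: x <- x - alpha * grad(x)."""
    x = x0.copy()
    for _ in range(steps):
        x = x - alpha * grad(x)
    return x


def heavy_ball(grad, x0, alpha, beta, steps):
    """Polyak heavy-ball momentum (textbook form, gradient taken at x)."""
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(steps):
        x_next = x + beta * (x - x_prev) - alpha * grad(x)
        x_prev, x = x, x_next
    return x


def nesterov(grad, x0, alpha, beta, steps):
    """Nesterov acceleration (textbook form, gradient taken at the look-ahead point y)."""
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(steps):
        y = x + beta * (x - x_prev)
        x_next = y - alpha * grad(y)
        x_prev, x = x, x_next
    return x


# Parameters quoted in the table; `steps` is an arbitrary illustrative choice.
N, alpha, beta, steps = 1024, 0.01, 0.9, 200

# ASSUMPTION: toy convex quadratic f(x) = 0.5 * sum(curvatures * x**2),
# used only to exercise the update rules. Its minimiser is x = 0.
curvatures = np.linspace(1.0, 10.0, N)


def grad(x):
    return curvatures * x


rng = np.random.default_rng(0)
x0 = rng.standard_normal(N)

# Distance to the minimiser after the same number of steps for each method.
errors = {
    "gradient_descent": np.linalg.norm(gradient_descent(grad, x0, alpha, steps)),
    "heavy_ball": np.linalg.norm(heavy_ball(grad, x0, alpha, beta, steps)),
    "nesterov": np.linalg.norm(nesterov(grad, x0, alpha, beta, steps)),
}

for name, err in errors.items():
    print(f"{name}: distance to minimiser = {err:.3e}")
```

On this toy problem both momentum variants end up closer to the minimiser than gradient descent after the same number of steps, which only illustrates the speed-up. It says nothing about the algorithmic threshold studied in the paper, which depends on the non-convex model itself.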