Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Large-width asymptotics and training dynamics of $\alpha$-Stable ReLU neural networks
Authors: Stefano Favaro, Sandra Fortini, Stefano Peluchetti
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate numerically Theorem 2.1, we sample random neural networks according to (3) for various values of width $m$ and stability index $\alpha$. We evaluate these networks on a fine uniform grid of points in $[0, 1]^2$. Figure 2 displays the results, which show that the function samples remain well-behaved as $m$ grows larger. |
| Researcher Affiliation | Collaboration | Stefano Favaro (Department of Economics and Statistics, University of Torino and Collegio Carlo Alberto); Sandra Fortini (Department of Decision Sciences, Bocconi University); Stefano Peluchetti (Cogent Labs, Tokyo) |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. The methods are described through mathematical formulations and proofs. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code, nor does it include links to a code repository. The OpenReview link refers to the paper's review forum, not a code base. |
| Open Datasets | No | The paper uses a 'fine uniform grid of points in $[0, 1]^2$' for its numerical illustrations, which is a synthetic data generation method, not a publicly available dataset requiring access information. |
| Dataset Splits | No | The paper performs numerical illustrations on a 'fine uniform grid of points in $[0, 1]^2$' but does not describe any training, test, or validation dataset splits, as it is not a machine learning experiment with conventional data partitioning. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run its numerical illustrations or experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers used for the experiments or numerical illustrations. |
| Experiment Setup | Yes | By assuming the learning rate $\eta_m = (\log m)^{2/\alpha}$, we show that: i) if $m \to +\infty$ then $(\log m)^{2/\alpha} H_m(W(0), X; \alpha)$ converges weakly to an $(\alpha/2)$-Stable (almost surely) positive definite random matrix $H(X, X; \alpha)$; ii) for every $\delta > 0$ the gradient descent achieves zero training error at linear rate, for $m$ sufficiently large, with probability $1 - \delta$. |
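The numerical illustration quoted above (sampling random $\alpha$-Stable ReLU networks of width $m$ and evaluating them on a uniform grid in $[0, 1]^2$) can be sketched as follows. This is a minimal sketch, not the paper's code: it assumes a one-hidden-layer ReLU network with i.i.d. symmetric $\alpha$-stable weights and an $m^{-1/\alpha}$ output scaling, and uses the Chambers-Mallows-Stuck sampler; the width, grid resolution, and function names are illustrative choices.

```python
import numpy as np

def sample_symmetric_stable(alpha, size, rng):
    """Chambers-Mallows-Stuck sampler for symmetric alpha-stable
    random variables (assumes alpha != 1)."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    return (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
            * (np.cos(u * (1 - alpha)) / w) ** ((1 - alpha) / alpha))

def stable_relu_network(x, m, alpha, rng):
    """One sample of a one-hidden-layer ReLU network with i.i.d.
    symmetric alpha-stable weights and m**(-1/alpha) output scaling."""
    d = x.shape[1]
    W1 = sample_symmetric_stable(alpha, (d, m), rng)   # input-to-hidden weights
    b1 = sample_symmetric_stable(alpha, (m,), rng)     # hidden biases
    W2 = sample_symmetric_stable(alpha, (m,), rng)     # hidden-to-output weights
    h = np.maximum(x @ W1 + b1, 0.0)                   # ReLU activations
    return m ** (-1.0 / alpha) * (h @ W2)

# Evaluate one network sample on a uniform grid over [0, 1]^2.
rng = np.random.default_rng(0)
g = np.linspace(0.0, 1.0, 50)
grid = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)  # shape (2500, 2)
values = stable_relu_network(grid, m=1000, alpha=1.5, rng=rng)
```

Repeating the last call for several widths $m$ and stability indices $\alpha$ reproduces the kind of comparison the quoted passage describes: the sampled functions stay finite and well-behaved as $m$ grows.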