Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics

Authors: Alireza Mousavi-Hosseini, Denny Wu, Murat A Erdogdu

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we perform numerical simulations to verify the intuitions from Theorem 3. Specifically, we train a two-layer neural network with width m = 50 and ReLU activation, where the first layer weights are initialized uniformly on the sphere, and we fix the first half of the second layer coordinates at +1/m and the second half at −1/m. The input follows the distribution x ~ N(0, Σ) (with an extra 1 appended for bias), where Σ = diag(σ²) with σ²_1 = 1 and σ²_i = (d_eff − 1)/(d − 1) for i ≥ 2, for input dimension d = 50. The labels are generated by a single-index model of the form y = g(⟨e_1, x⟩) = ⟨e_1, x⟩² − 1. Therefore, the effective dimension from Definition 1 is exactly equal to d_eff. We train the neural network using the squared loss with MFLA, with a step size of 0.1, weight decay parameter 0.01, and temperature 0.001. Figure 1b shows the test loss at the end of 200 iterations of MFLA for different numbers of training samples n and effective dimensions d_eff. For each value of n and d_eff, we average the test loss over 5 independent runs with different realizations of the data and initialization. In Figure 2 we measure the generalization gap, i.e. the average loss difference between the training set of n samples and a test set of 100000 samples, at the end of 3000 iterations of training with MFLA. For this experiment, we try n = 100, n = 200, and n = 500. As seen from both figures, d_eff controls the generalization gap and test loss, both of which decay with larger n.
Researcher Affiliation Academia Alireza Mousavi-Hosseini1,2, Denny Wu3,4, Murat A. Erdogdu1,2 1University of Toronto, 2Vector Institute, 3New York University, 4Flatiron Institute EMAIL,edu, EMAIL
Pseudocode No The paper describes methods using mathematical equations and descriptive text, but does not contain a clearly labeled pseudocode or algorithm block.
Open Source Code Yes The code to reproduce the experimental results is provided at: https://github.com/mousavih/MFLD-Learnability.
Open Datasets No The paper uses synthetic data generated according to specific statistical distributions and models (e.g., 'The input follows the distribution x ~ N(0, Σ)', 'The labels are generated by a single-index model') rather than relying on a publicly available or open dataset with concrete access information.
Dataset Splits Yes In Figure 2 we measure the generalization gap, i.e. the average loss difference on the training set of n samples, and a test set of 100000 samples, at the end of 3000 iterations of training with MFLA. For this experiment, we try n = 100, n = 200, and n = 500.
Hardware Specification No The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies No The paper describes the methodology and training process but does not specify any software dependencies with version numbers.
Experiment Setup Yes Specifically, we train a two-layer neural network with width m = 50 and ReLU activation, where the first layer weights are initialized uniformly on the sphere, and we fix the first half of the second layer coordinates at +1/m and the second half at −1/m. We train the neural network using the squared loss with MFLA, with a step size of 0.1, weight decay parameter 0.01, and temperature 0.001.
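The quoted setup can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code (which is at the linked repository): it assumes σ²_i = (d_eff − 1)/(d − 1) for i ≥ 2, picks an arbitrary d_eff = 10, and approximates one MFLA step by noisy (Langevin) gradient descent with weight decay on the first-layer weights, using the quoted step size, weight decay, and temperature.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions per the quoted description (d = 50, width m = 50); d_eff is our choice.
d, m, d_eff = 50, 50, 10
n, n_test = 200, 100_000

# Anisotropic Gaussian input: sigma_1^2 = 1, sigma_i^2 = (d_eff - 1)/(d - 1) for i >= 2
# (our reading of the paper's covariance), so the effective dimension equals d_eff.
sigma2 = np.full(d, (d_eff - 1) / (d - 1))
sigma2[0] = 1.0

def sample(k):
    x = rng.normal(size=(k, d)) * np.sqrt(sigma2)
    y = x[:, 0] ** 2 - 1.0                       # single-index label y = <e1, x>^2 - 1
    return np.hstack([x, np.ones((k, 1))]), y    # append a constant 1 for the bias

X, y = sample(n)
X_test, y_test = sample(n_test)

# Two-layer ReLU net: first-layer weights uniform on the sphere; second layer
# fixed at +1/m for the first half of the neurons and -1/m for the second half.
W = rng.normal(size=(m, d + 1))
W /= np.linalg.norm(W, axis=1, keepdims=True)
a = np.concatenate([np.full(m // 2, 1 / m), np.full(m - m // 2, -1 / m)])

def predict(W, X):
    return np.maximum(X @ W.T, 0.0) @ a

# Noisy gradient step with weight decay as a stand-in for MFLA, using the
# quoted hyperparameters: step size 0.1, weight decay 0.01, temperature 0.001.
eta, lam, temp = 0.1, 0.01, 0.001
for _ in range(200):                             # 200 iterations, as in Figure 1b
    resid = predict(W, X) - y                    # residual of the squared loss
    act = (X @ W.T > 0).astype(float)            # ReLU derivative at each neuron
    grad = (a[None, :] * act * resid[:, None]).T @ X / n + lam * W
    W += -eta * grad + np.sqrt(2 * eta * temp) * rng.normal(size=W.shape)

test_loss = np.mean((predict(W, X_test) - y_test) ** 2)
```

Rerunning this sketch while sweeping n and d_eff would reproduce the qualitative trend reported above: the test loss and generalization gap grow with d_eff and shrink with larger n.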