Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width

Authors: Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "With carefully designed experiments and a large computation cost, for both synthetic datasets and real datasets, we find that the dynamics of each layer also could be divided into a linear regime and a condensed regime, separated by a critical regime."
Researcher Affiliation | Academia | Hanxu Zhou (1), Qixuan Zhou (1), Zhenyuan Jin (1), Tao Luo (1,2), Yaoyu Zhang (1,3), Zhi-Qin John Xu (1); (1) School of Mathematical Sciences, Institute of Natural Sciences, MOE-LSC and Qing Yuan Research Institute, Shanghai Jiao Tong University; (2) CMA-Shanghai, Shanghai Artificial Intelligence Laboratory; (3) Shanghai Center for Brain Science and Brain-Inspired Technology
Pseudocode | No | The paper contains no sections or figures labeled "Pseudocode" or "Algorithm", and no structured code-like blocks.
Open Source Code | Yes | "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] In supplementary."
Open Datasets | Yes | "The input dimension d is determined by the training data, i.e., d = 1 for synthetic data and d = 28 × 28 for MNIST."
Dataset Splits | No | The paper does not specify train/validation/test splits; it mentions synthetic data and MNIST, but describes only the synthetic set (4 training points) and training with full-batch gradient descent.
Hardware Specification | No | The paper acknowledges the "HPC of School of Mathematical Sciences and the Student Innovation Center, and the Siyuan-1 cluster supported by the Center for High Performance Computing at Shanghai Jiao Tong University", but does not specify GPU models, CPU models, or other hardware details.
Software Dependencies | No | The paper does not provide any specific software dependencies, libraries, or their version numbers used in the experiments.
Experiment Setup | Yes | "Throughout this section, we use three-layer fully-connected neural networks with size d-m-m-dout. The input dimension d is determined by the training data, i.e., d = 1 for synthetic data and d = 28 × 28 for MNIST. The output dimension is dout = 1 for synthetic data and dout = 10 for MNIST. The number of hidden neurons m is specified in each experiment. All parameters are initialized by a Gaussian distribution N(0, var), where var depends on β1, β2 and β3. The total data size is n. The training method is gradient descent with full batch, learning rate lr and MSE loss."
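
The setup described above translates directly into code. The following is a minimal sketch, not the authors' implementation: PyTorch is an assumed framework (the paper lists no software dependencies), the tanh activation, the target function, the learning rate, the step count, and the m^(-β) form of the initialization variance are all illustrative assumptions; the paper defines the actual dependence of var on β1, β2, β3.

```python
# Sketch of the paper's setup: a three-layer fully-connected network of
# size d-m-m-dout, Gaussian initialization N(0, var), full-batch gradient
# descent with MSE loss on n = 4 synthetic points with d = dout = 1.
import torch
import torch.nn as nn

d, m, d_out = 1, 100, 1        # synthetic-data dimensions from the paper
lr, n_steps = 1e-3, 1000       # illustrative values, not from the paper

def init_var(beta):
    # Placeholder: the paper specifies how var depends on beta1-beta3;
    # an m**(-beta) scaling is assumed here for illustration only.
    return m ** (-beta)

class ThreeLayerNet(nn.Module):
    def __init__(self, betas=(1.0, 1.0, 1.0)):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Linear(d, m), nn.Linear(m, m), nn.Linear(m, d_out)])
        for layer, beta in zip(self.layers, betas):
            std = init_var(beta) ** 0.5          # N(0, var) per layer
            nn.init.normal_(layer.weight, 0.0, std)
            nn.init.normal_(layer.bias, 0.0, std)

    def forward(self, x):
        x = torch.tanh(self.layers[0](x))        # activation is an assumption
        x = torch.tanh(self.layers[1](x))
        return self.layers[2](x)

# Synthetic training data: n = 4 points, d = 1 (target is illustrative).
x = torch.linspace(-1.0, 1.0, 4).unsqueeze(1)
y = torch.sin(torch.pi * x)

net = ThreeLayerNet()
opt = torch.optim.SGD(net.parameters(), lr=lr)   # full-batch GD
for _ in range(n_steps):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(x), y)
    loss.backward()
    opt.step()
```

For the MNIST experiments the same skeleton applies with d = 784, dout = 10, and the paper's per-experiment choice of m; sweeping (β1, β2, β3) over a grid and recording each layer's dynamics is what produces the phase diagram the paper reports.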