Neural Network Architecture Beyond Width and Depth

Authors: Shijun Zhang, Zuowei Shen, Haizhao Yang

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we use numerical experimentation to show the advantages of the super-approximation power of ReLU NestNets."
Researcher Affiliation | Academia | Zuowei Shen, Department of Mathematics, National University of Singapore (matzuows@nus.edu.sg); Haizhao Yang, Department of Mathematics, University of Maryland, College Park (hzyang@umd.edu); Shijun Zhang, Department of Mathematics, National University of Singapore (zhangshijun@u.nus.edu)
Pseudocode | No | The paper describes the network architecture and mathematical definitions but does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not mention releasing source code for the described methodology or provide any links to a code repository.
Open Datasets | Yes | "We will design convolutional neural network (CNN) architectures activated by ReLU or the subnetwork activation function ϱ given in Equation (4) to classify image samples in Fashion-MNIST [47]."
Dataset Splits | No | The paper specifies training and test sample counts: "For each i ∈ {0, 1}, we randomly choose 3 × 10^5 training samples and 3 × 10^4 test samples in S_i with label i." For Fashion-MNIST, it states: "This dataset consists of a training set of 6 × 10^4 samples and a test set of 10^4 samples." However, it does not explicitly mention a validation split. (See the loading sketch after this table.)
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory, or cloud instances) used for running the experiments.
Software Dependencies | No | The paper mentions using RAdam [23] as the optimization method, but it does not specify any software names with version numbers for libraries, frameworks, or environments.
Experiment Setup | Yes | "The number of epochs and the batch size are set to 500 and 512, respectively. We adopt RAdam [23] as the optimization method. In epochs 5(i-1)+1 to 5i for i = 1, 2, ..., 100, the learning rate is 0.2 × 0.002 × 0.9^(i-1) for the parameters in ϱ and 0.002 × 0.9^(i-1) for all other parameters." (See the schedule sketch after this table.)
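
As a quick check on the quoted Fashion-MNIST split sizes, the torchvision loader below reproduces the 6 × 10^4 / 10^4 train/test counts; the paper's synthetic sets S_0 and S_1 are not publicly specified here, so they are not sketched. This is a minimal illustration, not the authors' code.

```python
# Minimal sketch: load Fashion-MNIST with torchvision and confirm the
# train/test counts quoted in the paper (60,000 / 10,000 samples).
from torchvision import datasets, transforms

transform = transforms.ToTensor()
train_set = datasets.FashionMNIST(root="data", train=True,
                                  download=True, transform=transform)
test_set = datasets.FashionMNIST(root="data", train=False,
                                 download=True, transform=transform)

assert len(train_set) == 60_000  # training split: 6 x 10^4 samples
assert len(test_set) == 10_000   # test split: 10^4 samples
# No validation split is defined, matching the paper's description.
```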
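The quoted schedule can be read as a staircase decay: the learning rate is held constant within each 5-epoch block (i = 1, ..., 100 covers all 500 epochs), with the ϱ parameters using a rate scaled by 0.2. The sketch below implements that reading with PyTorch's RAdam. The TinyNet module, its rho_weight parameter, and the name-based parameter split are illustrative assumptions, not the paper's CNN architecture or the activation ϱ of Equation (4).

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Placeholder classifier; the paper's CNN and its subnetwork
    activation ϱ from Equation (4) are not reproduced here."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.fc2 = nn.Linear(128, 10)
        # Stand-in learnable activation parameters (hypothetical "rho" group).
        self.rho_weight = nn.Parameter(torch.ones(128))

    def forward(self, x):
        h = self.fc1(x.flatten(1))
        h = torch.relu(self.rho_weight * h)  # placeholder for ϱ
        return self.fc2(h)

def block_lr(epoch: int, base: float) -> float:
    # Epochs 5(i-1)+1 .. 5i share one rate; i runs from 1 to 100.
    i = (epoch - 1) // 5 + 1
    return base * 0.9 ** (i - 1)

model = TinyNet()
rho_params = [p for n, p in model.named_parameters() if "rho" in n]
other_params = [p for n, p in model.named_parameters() if "rho" not in n]

# Two parameter groups: the ϱ parameters use a rate scaled by 0.2.
optimizer = torch.optim.RAdam([
    {"params": rho_params, "lr": 0.2 * 0.002},
    {"params": other_params, "lr": 0.002},
])

for epoch in range(1, 501):  # 500 epochs; the paper uses batch size 512
    optimizer.param_groups[0]["lr"] = block_lr(epoch, 0.2 * 0.002)
    optimizer.param_groups[1]["lr"] = block_lr(epoch, 0.002)
    # ... one training pass over the data would go here ...
```

Per-group rates are set manually at the top of each epoch rather than via a scheduler, since the two groups decay from different bases; a pair of LambdaLR schedulers would be an equivalent choice.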