Is Deeper Better only when Shallow is Good?

Authors: Eran Malach, Shai Shalev-Shwartz

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform experiments on learning fractal distributions with deep networks trained with SGD and assert that the approximation curve has a crucial effect on whether a depth efficiency is observed or not. ... In this section we present our experimental results on learning deep networks with Adam optimizer ([7]). ... We perform the same experiments with different fractal structures ... Finally, we want to show that the results given in this paper are interesting beyond the scope of our admittedly synthetic fractal distributions. ... To address this concern, we performed similar experiments on the CIFAR-10 data, studying the effect of width and depth on the performance of neural-networks on real data.
Researcher Affiliation | Academia | Eran Malach, School of Computer Science, The Hebrew University, Jerusalem, Israel (eran.malach@mail.huji.ac.il); Shai Shalev-Shwartz, School of Computer Science, The Hebrew University, Jerusalem, Israel (shais@cs.huji.ac.il)
Pseudocode | No | The paper does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statements or links indicating the availability of open-source code for the described methodology.
Open Datasets | Yes | Finally, we analyze the behavior of networks of growing depth on CIFAR-10. ... We performed similar experiments on the CIFAR-10 data, studying the effect of width and depth on the performance of neural-networks on real data.
Dataset Splits | No | The paper states "We sample 50K examples for a train dataset and 5K examples for a test dataset" but does not mention a validation split.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions "Adam optimizer ([7])" and "Tensorflow. Cifar-10 tensorflow tutorial, models/tutorials/image/cifar10. 2018." but does not specify version numbers for these software dependencies.
Experiment Setup | Yes | We train feed-forward networks of varying depth and width on a 2D Cantor distribution of depth 5. We sample 50K examples for a train dataset and 5K examples for a test dataset. We train the networks on this dataset with Adam optimizer for 10^6 iterations, with batch size of 100 and different learning rates. We observe the best performance of each configuration (depth and width) on the test data along the runs.
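To make the reported experiment setup concrete, the sketch below generates a depth-5 2D Cantor-dust dataset (50K train / 5K test examples), builds fully-connected ReLU networks of varying depth and width, and trains them with Adam at several learning rates and batch size 100. This is a minimal sketch under stated assumptions, not the authors' code: the exact fractal distribution (here, positives inside the depth-5 Cantor dust, negatives uniform outside it), the depth/width/learning-rate grid, the activation function, and the use of tf.keras are all assumptions, and the training length is shortened from the paper's roughly 10^6 iterations.

```python
import numpy as np
import tensorflow as tf

DEPTH = 5  # depth of the Cantor construction, as in the paper's 2D Cantor distribution

def sample_cantor_points(n, depth=DEPTH, rng=None):
    """Sample n points uniformly from the cells retained at the given Cantor depth."""
    rng = rng if rng is not None else np.random.default_rng(0)
    # A retained cell is picked by choosing the left or right third of each coordinate
    # at every level, i.e. base-3 digits restricted to {0, 2}.
    digits = 2 * rng.integers(0, 2, size=(n, 2, depth))
    scales = 3.0 ** -np.arange(1, depth + 1)
    corners = (digits * scales).sum(axis=2)                       # lower-left corner of the cell
    return corners + rng.uniform(0.0, 3.0 ** -depth, size=(n, 2))

def in_cantor(points, depth=DEPTH):
    """True for points inside the depth-level approximation of the 2D Cantor dust."""
    x = points.copy()
    inside = np.ones(len(points), dtype=bool)
    for _ in range(depth):
        x *= 3.0
        digit = np.clip(np.floor(x), 0, 2)
        inside &= (digit != 1).all(axis=1)   # points in a removed middle third are outside
        x -= digit
    return inside

def make_dataset(n, depth=DEPTH, seed=0):
    """Binary task: positives lie in the fractal set, negatives are uniform points outside it."""
    rng = np.random.default_rng(seed)
    pos = sample_cantor_points(n // 2, depth, rng)
    cand = rng.uniform(0.0, 1.0, size=(4 * n, 2))
    neg = cand[~in_cantor(cand, depth)][: n - n // 2]
    X = np.concatenate([pos, neg]).astype("float32")
    y = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))]).astype("float32")
    perm = rng.permutation(len(X))
    return X[perm], y[perm]

def build_mlp(n_layers, width):
    """Fully-connected ReLU network of the given depth and width, sigmoid output."""
    hidden = [tf.keras.layers.Dense(width, activation="relu") for _ in range(n_layers)]
    return tf.keras.Sequential(hidden + [tf.keras.layers.Dense(1, activation="sigmoid")])

X_train, y_train = make_dataset(50_000)        # "50K examples for a train dataset"
X_test, y_test = make_dataset(5_000, seed=1)   # "5K examples for a test dataset"

results = {}
for n_layers in [2, 4, 6]:                 # illustrative grid; the paper sweeps depth and width
    for width in [16, 64]:
        for lr in [1e-3, 1e-4]:            # "different learning rates"
            model = build_mlp(n_layers, width)
            model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                          loss="binary_crossentropy", metrics=["accuracy"])
            # The paper trains for ~10^6 iterations with batch size 100; a few epochs
            # are used here only to keep the sketch cheap to run.
            hist = model.fit(X_train, y_train, batch_size=100, epochs=5,
                             validation_data=(X_test, y_test), verbose=0)
            results[(n_layers, width, lr)] = max(hist.history["val_accuracy"])

best = max(results, key=results.get)
print("best (depth, width, lr):", best, "test accuracy:", results[best])
```

Tracking the maximum of val_accuracy per configuration mirrors the paper's protocol of recording the best test performance of each (depth, width) pair along the runs; the results dictionary and the particular grid of depths, widths, and learning rates above are illustrative choices, not values taken from the paper.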