Optimization Landscape and Expressivity of Deep CNNs

Authors: Quynh Nguyen, Matthias Hein

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | To illustrate Theorem 3.5 we plot the rank of the feature matrices of the network in Figure 1. We use the MNIST dataset with N = 60000 training and 10000 test samples. ... In Table 2 we show the smallest singular value of the feature matrices together with the corresponding training loss, training and test error. |
| Researcher Affiliation | Academia | 1: Department of Mathematics and Computer Science, Saarland University, Germany; 2: University of Tübingen, Germany. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (a specific repository link, an explicit code-release statement, or code in supplementary materials) for the described methodology. |
| Open Datasets | Yes | We use the MNIST dataset with N = 60000 training and 10000 test samples. |
| Dataset Splits | No | The paper mentions training and test samples but does not specify a separate validation split or describe how validation was performed. |
| Hardware Specification | No | The paper does not state the hardware used to run its experiments. |
| Software Dependencies | No | The paper names Adam (Kingma & Ba, 2015) as the optimizer and a sigmoid activation function, but does not give version numbers for any software components or libraries. |
| Experiment Setup | Yes | We then vary the number of convolutional filters T1 of the first layer from 10 to 100 and train the corresponding network with squared loss and sigmoid activation function using Adam (Kingma & Ba, 2015) and decaying learning rate for 2000 epochs. |
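
The "Experiment Setup" row quotes enough detail to sketch the training loop: squared loss, sigmoid activations, Adam with a decaying learning rate, 2000 epochs, and the first-layer filter count T1 swept from 10 to 100. A minimal PyTorch sketch follows; the kernel size, batch size, decay schedule, and everything after the first convolution are assumptions, since the excerpt does not specify them.

```python
# Minimal sketch of the setup quoted in the "Experiment Setup" row.
# Kernel size, batch size, learning-rate decay factor, and the layers after the
# first convolution are assumptions; the paper excerpt only fixes the loss,
# activation, optimizer, epoch count, and the sweep of T1 from 10 to 100.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, transforms


def make_net(t1_filters: int) -> nn.Sequential:
    # One convolutional layer with T1 filters and sigmoid activation,
    # followed by a dense layer onto the 10 MNIST classes.
    return nn.Sequential(
        nn.Conv2d(1, t1_filters, kernel_size=5, padding=2),  # assumed kernel size
        nn.Sigmoid(),
        nn.Flatten(),
        nn.Linear(t1_filters * 28 * 28, 10),
    )


def train(t1_filters: int, epochs: int = 2000, lr: float = 1e-3) -> nn.Module:
    data = datasets.MNIST("data", train=True, download=True,
                          transform=transforms.ToTensor())
    loader = torch.utils.data.DataLoader(data, batch_size=512, shuffle=True)
    net = make_net(t1_filters)
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    # "Decaying learning rate": an exponential decay is one possible choice.
    sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.999)
    loss_fn = nn.MSELoss()  # squared loss, as stated in the paper
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(net(x), F.one_hot(y, 10).float())
            loss.backward()
            opt.step()
        sched.step()
    return net


# Sweep the number of first-layer filters T1 from 10 to 100.
nets = {t1: train(t1) for t1 in range(10, 101, 10)}
```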
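The "Research Type" row refers to plotting the rank of the feature matrices (Figure 1) and reporting their smallest singular value (Table 2). A small sketch of how those quantities could be computed from a layer's activations is below; the helper names and the numerical-rank tolerance are illustrative assumptions, not taken from the paper.

```python
# Sketch of the rank / smallest-singular-value computation mentioned in the
# "Research Type" row. Helper names and the rank tolerance are assumptions.
import torch
import torch.nn as nn


def feature_matrix(feature_extractor: nn.Module, inputs: torch.Tensor) -> torch.Tensor:
    # Rows are the layer's feature vectors for the N inputs, giving an N x n_k matrix.
    with torch.no_grad():
        feats = feature_extractor(inputs)
    return feats.flatten(start_dim=1)


def rank_and_smallest_singular_value(M: torch.Tensor):
    # Numerical rank (singular values above a standard tolerance) and sigma_min.
    s = torch.linalg.svdvals(M)
    tol = s.max() * max(M.shape) * torch.finfo(M.dtype).eps
    return int((s > tol).sum().item()), float(s.min().item())
```

With the trained networks from the sweep above, the feature matrix of a layer would be built from all N = 60000 training images, and its smallest singular value compared against the training loss and the training and test errors, as described for the paper's Table 2.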