On the Expressive Power of Overlapping Architectures of Deep Learning

Authors: Or Sharir, Amnon Shashua

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To further ground our theoretical results, we demonstrate our findings through experiments with standard ConvNets on the CIFAR10 image classification dataset." and, from Section 5 (Experiments): "In this section we show that the theoretical results of sec. 4.2 indeed hold in practice. We train each of these networks for classification over the CIFAR-10 dataset."
Researcher Affiliation | Academia | "Or Sharir & Amnon Shashua, The Hebrew University of Jerusalem, {or.sharir,shashua}@cs.huji.ac.il"
Pseudocode | No | The paper uses mathematical notation and descriptions of network architectures but does not include explicit pseudocode or algorithm blocks.
Open Source Code | Yes | "The source code for reproducing the above experiments and plots can be found at https://github.com/HUJI-Deep/OverlapsAndExpressiveness."
Open Datasets | Yes | "We train each of these networks for classification over the CIFAR-10 dataset."
Dataset Splits | No | The paper mentions training on CIFAR-10 with data augmentation and reports training accuracy, but it does not give specific train/validation/test splits (e.g., percentages or sample counts) needed to reproduce the data partitioning.
Hardware Specification | No | The paper does not specify the hardware (GPU/CPU models, processor types, or memory) used to run the experiments.
Software Dependencies | No | The paper names ADAM (Kingma and Ba, 2015) as the optimizer but does not list any software libraries or frameworks with version numbers (e.g., Python, PyTorch, TensorFlow) that would be needed for replication.
Experiment Setup | Yes | "More specifically, the network has 5 blocks, each starting with a B×B convolution with C channels, stride 1×1, and ReLU activation, and then followed by a 2×2 max-pooling layer. After the fifth conv-pool, there is a final dense layer with 10 outputs and softmax activations." and "The training itself is carried out for 300 epochs with ADAM (Kingma and Ba, 2015) using its standard hyper-parameters."
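
The architecture quoted in the Experiment Setup row is concrete enough to sketch. Below is a minimal, hypothetical reconstruction, not the authors' released code: the framework (TensorFlow/Keras), 'same' padding, the batch size, and the example values B = 3 and C = 64 are assumptions, and the paper's data augmentation is omitted for brevity.

```python
# Hypothetical sketch of the ConvNet described in the Experiment Setup quote:
# five conv-pool blocks (B x B convolution with C channels, stride 1, ReLU,
# then 2 x 2 max-pooling), a final dense softmax layer with 10 outputs,
# trained for 300 epochs with ADAM on CIFAR-10.
# Framework, padding, batch size, and the B/C values are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

B, C = 3, 64  # illustrative receptive-field size and channel count

def build_convnet(num_blocks=5, num_classes=10):
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(32, 32, 3)))  # CIFAR-10 images
    for _ in range(num_blocks):
        model.add(layers.Conv2D(C, (B, B), strides=1, padding="same",
                                activation="relu"))
        model.add(layers.MaxPooling2D(pool_size=(2, 2)))  # 32->16->8->4->2->1
    model.add(layers.Flatten())
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model

model = build_convnet()
model.compile(optimizer=tf.keras.optimizers.Adam(),  # ADAM with its default hyper-parameters
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

(x_train, y_train), _ = tf.keras.datasets.cifar10.load_data()
x_train = x_train.astype("float32") / 255.0
model.fit(x_train, y_train, epochs=300, batch_size=128)  # batch size not given in the paper
```

Note that with five 2×2 poolings the 32×32 CIFAR-10 input is reduced to a 1×1×C feature map before the final dense layer, which is why no additional downsampling is needed in the sketch.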