On the Expressive Power of Overlapping Architectures of Deep Learning
Authors: Or Sharir, Amnon Shashua
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "To further ground our theoretical results, we demonstrate our findings through experiments with standard ConvNets on the CIFAR-10 image classification dataset." and (Section 5, Experiments) "In this section we show that the theoretical results of sec. 4.2 indeed hold in practice. We train each of these networks for classification over the CIFAR-10 dataset" |
| Researcher Affiliation | Academia | Or Sharir & Amnon Shashua The Hebrew University of Jerusalem {or.sharir,shashua}@cs.huji.ac.il |
| Pseudocode | No | The paper uses mathematical notation and descriptions of network architectures but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code for reproducing the above experiments and plots can be found at https://github.com/HUJI-Deep/OverlapsAndExpressiveness. |
| Open Datasets | Yes | We train each of these networks for classification over the CIFAR-10 dataset |
| Dataset Splits | No | The paper mentions training on CIFAR-10 with data augmentation and reporting 'training accuracy'. It does not provide specific details on train/validation/test splits (e.g., percentages or sample counts) for reproducibility of data partitioning. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, processor types, or memory amounts) used for running experiments are mentioned in the paper. |
| Software Dependencies | No | The paper mentions 'ADAM (Kingma and Ba, 2015)' as an optimizer, but does not list any specific software libraries or frameworks with version numbers (e.g., Python, PyTorch, TensorFlow, etc.) that would be necessary for replication. |
| Experiment Setup | Yes | "More specifically, the network has 5 blocks, each starting with a B × B convolution with C channels, stride 1 × 1, and ReLU activation, and then followed by a 2 × 2 max-pooling layer. After the fifth conv-pool, there is a final dense layer with 10 outputs and softmax activations." and "The training itself is carried out for 300 epochs with ADAM (Kingma and Ba, 2015) using its standard hyper-parameters" |
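
To make the Experiment Setup row concrete, below is a minimal PyTorch sketch of the described architecture. The paper sweeps the kernel size B and channel count C, so the defaults here (B = 3, C = 64) are placeholder assumptions, not the authors' exact configuration, and the "same" padding choice is likewise an assumption for odd B.

```python
# Hypothetical sketch of the ConvNet described in the Experiment Setup row.
# B (kernel size) and C (channels) are assumed values; the paper varies them.
import torch
import torch.nn as nn

def make_convnet(B: int = 3, C: int = 64, num_classes: int = 10) -> nn.Sequential:
    """Five conv-pool blocks: B x B conv (stride 1, ReLU) then 2 x 2 max-pooling,
    followed by a dense layer with `num_classes` outputs."""
    layers = []
    in_channels = 3  # CIFAR-10 images are 3 x 32 x 32
    for _ in range(5):
        layers += [
            nn.Conv2d(in_channels, C, kernel_size=B, stride=1, padding=B // 2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),  # 2 x 2 pooling, stride 2
        ]
        in_channels = C
    # After five 2 x 2 poolings, a 32 x 32 input is reduced to 1 x 1 spatially.
    layers += [nn.Flatten(), nn.Linear(C, num_classes)]
    return nn.Sequential(*layers)

model = make_convnet()
# ADAM with its standard (default) hyper-parameters, as stated in the paper.
optimizer = torch.optim.Adam(model.parameters())
# The paper's final softmax is folded into the loss here, as is standard in PyTorch.
criterion = nn.CrossEntropyLoss()
```

The sketch only covers the model and optimizer; the 300-epoch training loop, data augmentation, and the train/test handling are not specified precisely enough in the quoted material to reconstruct here.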