FractalNet: Ultra-Deep Neural Networks without Residuals

Authors: Gustav Larsson, Michael Maire, Gregory Shakhnarovich

ICLR 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments, fractal networks match the excellent performance of standard residual networks on both CIFAR and ImageNet classification tasks.
Researcher Affiliation | Academia | Gustav Larsson, University of Chicago, larsson@cs.uchicago.edu; Michael Maire, TTI Chicago, mmaire@ttic.edu; Gregory Shakhnarovich, TTI Chicago, greg@ttic.edu
Pseudocode | No | The paper describes the fractal network expansion rule and block structure using diagrams and mathematical formulas but does not provide structured pseudocode or algorithm blocks (an illustrative sketch of the expansion rule is given after this table).
Open Source Code | No | The paper does not provide a specific repository link or an explicit statement about the release of the source code for the methodology described.
Open Datasets | Yes | Section 4 provides experimental comparisons to residual networks across the CIFAR-10, CIFAR-100 (Krizhevsky, 2009), SVHN (Netzer et al., 2011), and ImageNet (Deng et al., 2009) datasets.
Dataset Splits | Yes | Table 2: ImageNet (validation set, 10-crop).
Hardware Specification | No | We gratefully acknowledge the support of NVIDIA Corporation with the donation of GPUs used for this research.
Software Dependencies | No | We implement FractalNet using Caffe (Jia et al., 2014).
Experiment Setup | Yes | For experiments using dropout, we fix drop rate per block at (0%, 10%, 20%, 30%, 40%), similar to Clevert et al. (2016). Local drop-path uses 15% drop rate across the entire network. We run for 400 epochs on CIFAR, 20 epochs on SVHN, and 70 epochs on ImageNet. Our learning rate starts at 0.02 (for ImageNet, 0.001) and we train using stochastic gradient descent with batch size 100 (for ImageNet, 32) and momentum 0.9. (These hyperparameters are summarized in a sketch after this table.)
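
As a reading aid for the Pseudocode row above, the following is a minimal sketch of the paper's fractal expansion rule, f_1(z) = conv(z) and f_{C+1}(z) = join(conv(z), f_C(f_C(z))). It is written in PyTorch rather than the authors' Caffe setup, and the layer widths, kernel size, and element-wise-mean join are illustrative assumptions; drop-path regularization is omitted.

```python
import torch
import torch.nn as nn


class FractalBlock(nn.Module):
    """Sketch of f_1(z) = conv(z); f_{C+1}(z) = join(conv(z), f_C(f_C(z)))."""

    def __init__(self, channels: int, columns: int):
        super().__init__()
        # Shallowest path: a single conv-BN-ReLU unit (details assumed).
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        if columns > 1:
            # Deeper path: two copies of the previous fractal, composed in sequence.
            self.sub1 = FractalBlock(channels, columns - 1)
            self.sub2 = FractalBlock(channels, columns - 1)
        else:
            self.sub1 = None
            self.sub2 = None

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        shallow = self.conv(z)
        if self.sub1 is None:
            return shallow
        deep = self.sub2(self.sub1(z))
        # Join the two paths by an element-wise mean; the paper's join layer
        # averages its active inputs, and drop-path is not modeled here.
        return (shallow + deep) / 2.0
```

A full FractalNet stacks several such blocks with pooling between them; this sketch covers only the within-block expansion rule.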
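The hyperparameters quoted in the Experiment Setup row are collected below as a plain Python dictionary for quick reference. The key names and the default/ImageNet grouping are my own organization, not a configuration file released by the authors.

```python
# Hyperparameters quoted in the Experiment Setup row; key names are assumed.
FRACTALNET_TRAINING = {
    "dropout_per_block": [0.0, 0.1, 0.2, 0.3, 0.4],  # fixed drop rate per block
    "local_drop_path_rate": 0.15,                    # across the entire network
    "epochs": {"cifar": 400, "svhn": 20, "imagenet": 70},
    "initial_learning_rate": {"default": 0.02, "imagenet": 0.001},
    "optimizer": "SGD",
    "momentum": 0.9,
    "batch_size": {"default": 100, "imagenet": 32},
}
```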