Divisive Feature Normalization Improves Image Recognition Performance in AlexNet

Authors: Michelle Miller, SueYeon Chung, Kenneth D. Miller

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Divisive normalization always improved performance for models with batch, group, or no normalization, generally by 1-2 percentage points, on both the CIFAR-100 and ImageNet databases.
Researcher Affiliation | Academia | Michelle Miller (1), Sue Yeon Chung (1,2,3), Ken D. Miller (1,4); affiliations: (1) Center for Theoretical Neuroscience, Columbia University; (2) Center for Neural Science, New York University; (3) Flatiron Institute, Simons Foundation; (4) Swartz Program in Theoretical Neuroscience, Kavli Institute for Brain Science, Department of Neuroscience, College of Physicians and Surgeons, Zuckerman Mind Brain Behavior Institute, Columbia University
Pseudocode | No | The paper provides mathematical formalisms (e.g., Eq. 1 for divisive normalization) but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | "All software used in this project will be deposited in a publicly accessible github repository no later than the time of the 2022 ICLR meeting."
Open Datasets | Yes | The CIFAR training and validation images were resized to 32×32×3 and horizontally flipped; ImageNet training images were resized to 224×224×3 and horizontally flipped; ImageNet validation images were resized to 256×256×3 and center cropped. Each color channel was always standardized.
Dataset Splits | Yes | The CIFAR training and validation images were resized to 32×32×3 and horizontally flipped; ImageNet training images were resized to 224×224×3 and horizontally flipped; ImageNet validation images were resized to 256×256×3 and center cropped. Each color channel was always standardized.
Hardware Specification | No | No specific hardware details such as GPU or CPU models, memory, or cloud instance types are mentioned in the paper.
Software Dependencies | No | The paper mentions software such as PyTorch's local response normalization (LRN) and external packages such as Foolbox and the texture-vs-shape package, but does not specify their version numbers.
Experiment Setup | Yes | Unless otherwise specified, the learning rate used in the models was 0.01. Batch sizes were 128. The initial normalization parameters were λ = 10, α = 0.1, β = 1, k = 10, except for the Divisive model with no other normalizations, for which initial λ = 1 and k = 0.5 to make learning reliable (further discussed in Results). The weight initialization method followed that of He et al. (2015)...
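
For readers who want a concrete starting point, the sketch below shows a minimal PyTorch divisive-normalization layer in the spirit of the paper's Eq. 1, using trainable scalars named after the λ, α, β, and k values quoted in the Experiment Setup row. The class name DivisiveNorm2d, the channel-wise pooling, and the exact placement of the exponent are assumptions for illustration; the paper's released code (see the Open Source Code row) is the authoritative reference. PyTorch's built-in nn.LocalResponseNorm is the LRN mentioned in the Software Dependencies row.

```python
import torch
import torch.nn as nn


class DivisiveNorm2d(nn.Module):
    """Illustrative divisive normalization: each rectified activation is divided
    by a learned affine function of the pooled activity across channels.
    A sketch in the spirit of the paper's Eq. 1, not its exact form."""

    def __init__(self, k=10.0, lam=10.0, alpha=0.1, beta=1.0):
        super().__init__()
        # Trainable scalars, initialized to the values quoted in the paper.
        self.k = nn.Parameter(torch.tensor(k))
        self.lam = nn.Parameter(torch.tensor(lam))
        self.alpha = nn.Parameter(torch.tensor(alpha))
        self.beta = nn.Parameter(torch.tensor(beta))

    def forward(self, x):
        # x: (batch, channels, height, width); rectify so the pool is non-negative.
        u = torch.relu(x)
        # Pool activity across channels at each spatial location.
        pool = u.sum(dim=1, keepdim=True)
        # Divide each unit by a learned function of the pooled activity.
        denom = (self.k + self.lam * self.alpha * pool).clamp(min=1e-6) ** self.beta
        return u / denom
```

In an AlexNet-style network this layer would typically sit after a convolution, e.g. `nn.Sequential(nn.Conv2d(3, 64, 11, stride=4), DivisiveNorm2d(), ...)`, replacing or accompanying the usual ReLU plus normalization stage.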
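The Open Datasets and Dataset Splits rows describe the preprocessing pipeline; one plausible torchvision reading of that description follows. The per-channel mean/std values and the 224-pixel center-crop size are assumptions, since the quoted text only says each color channel was standardized and that validation images were center cropped.

```python
from torchvision import transforms

# CIFAR-100: 32x32x3 images with horizontal flips, plus per-channel standardization.
cifar_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    # Commonly used CIFAR-100 statistics; the exact values are an assumption.
    transforms.Normalize(mean=(0.507, 0.487, 0.441), std=(0.267, 0.256, 0.276)),
])

# ImageNet training: 224x224x3 with horizontal flips.
imagenet_train_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    # Standard ImageNet statistics; the paper only says channels were standardized.
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

# ImageNet validation: resize to 256, then center crop (crop size 224 assumed).
imagenet_val_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
```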
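Similarly, the Experiment Setup row translates into a small training-configuration sketch. The learning rate of 0.01, batch size of 128, and He et al. (2015) initialization come from the quoted text; the choice of SGD, its momentum value, and the use of torchvision's AlexNet as a stand-in are assumptions, since the paper trains its own AlexNet variants.

```python
import torch
import torch.nn as nn
from torchvision.models import alexnet


def init_weights_he(module):
    # He et al. (2015) initialization, as cited in the Experiment Setup row.
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.kaiming_normal_(module.weight, nonlinearity='relu')
        if module.bias is not None:
            nn.init.zeros_(module.bias)


# torchvision AlexNet as a stand-in; the paper uses its own AlexNet variants
# with divisive normalization layers inserted.
model = alexnet(num_classes=1000)
model.apply(init_weights_he)

# Learning rate 0.01 and batch size 128, per the quoted setup.
# SGD and its momentum value are assumptions; the optimizer is not quoted here.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
batch_size = 128
```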