Kalman Normalization: Normalizing Internal Representations Across Network Layers

Authors: Guangrun Wang, Jiefeng Peng, Ping Luo, Xinjiang Wang, Liang Lin

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "4 Experiments." We first evaluate KN on the ImageNet 2012 classification dataset. Table 1 compares the top-1 validation accuracies. KN achieves 76.1% top-1 accuracy, outperforming BN and BRN by a large margin (3.4% and 3.4%, respectively).
Researcher Affiliation | Collaboration | Guangrun Wang, Sun Yat-sen University, wanggrun@mail2.sysu.edu.cn; Jiefeng Peng, Sun Yat-sen University, jiefengpeng@gmail.com; Ping Luo, The Chinese University of Hong Kong, pluo.lhi@gmail.com; Xinjiang Wang, SenseTime Group Ltd.; Liang Lin, Sun Yat-sen University, linliang@ieee.org
Pseudocode | No | The paper describes the approach using mathematical equations (e.g., Eqn. 7) and descriptive text, but it does not present structured pseudocode or an algorithm block. (An illustrative sketch of the statistics update is given after this table.)
Open Source Code | No | The paper does not provide any concrete access information for source code, such as a repository link or an explicit statement of code release.
Open Datasets | Yes | We first evaluate KN on the ImageNet 2012 classification dataset [24], which consists of 1,000 categories. To investigate the application of micro-batch training, we use the COCO 2017 detection & segmentation benchmark [6]. We conducted more studies on the CIFAR-10 and CIFAR-100 datasets [15]. We also conduct experiments on the SVHN dataset [20]. (A minimal loading sketch for the smaller benchmarks follows the table.)
Dataset Splits | Yes | The models are trained on the 1.28M training images and evaluated on the 50k validation images. The models are trained on the COCO train2017 set and evaluated on the COCO val2017 set. CIFAR-10 and CIFAR-100 each consist of 50k training images and 10k testing images.
Hardware Specification | Yes | For a fair comparison, both methods are trained on the same computing machine with four Titan X GPUs.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or CUDA versions).
Experiment Setup | Yes | Our baseline models are three representative networks: Inception-v2 [27], ResNet-50, and ResNet-101 [8]. We employ the baseline of a typical batch size (i.e., 32) for comparison. We use a schedule of 280k training steps. Specifically, the resolution is set to (800, 1333), and we sample 256 boxes for each image. We use a batch size of only 2. (The stated values are collected in the configuration sketch after this table.)
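Since the paper gives the method as equations rather than pseudocode, the following is only a rough Python sketch of the general idea behind Kalman Normalization: the statistics used to normalize a layer are estimated by combining that layer's mini-batch statistics with the estimates propagated from the preceding layer, in the spirit of a Kalman filter update. The gain q, the identity transition between layers, the two-dimensional activation shape, and the function name kalman_normalize are simplifying assumptions for illustration, not the paper's exact Eqn. 7.

```python
import torch


def kalman_normalize(x, prev_mean, prev_var, q=0.5, eps=1e-5):
    """Illustrative (not the paper's exact) Kalman Normalization step.

    x:         activations of the current layer, shape (N, C)
    prev_mean: estimated mean propagated from the previous layer, shape (C,)
    prev_var:  estimated variance propagated from the previous layer, shape (C,)
    q:         assumed scalar gain blending observed and propagated statistics
    """
    # Observed mini-batch statistics of the current layer.
    batch_mean = x.mean(dim=0)
    batch_var = x.var(dim=0, unbiased=False)

    # Kalman-style update: combine the observation with the estimate carried
    # over from the preceding layer (identity transition assumed here).
    est_mean = q * batch_mean + (1.0 - q) * prev_mean
    est_var = q * batch_var + (1.0 - q) * prev_var

    # Normalize with the combined estimates, as in standard batch normalization.
    x_hat = (x - est_mean) / torch.sqrt(est_var + eps)
    return x_hat, est_mean.detach(), est_var.detach()


# Usage (shapes are illustrative): feed each layer's estimates into the next;
# for the first layer, the batch statistics themselves serve as the prior.
x1 = torch.randn(32, 64)
y1, m1, v1 = kalman_normalize(x1, x1.mean(0), x1.var(0, unbiased=False))
x2 = torch.randn(32, 64)
y2, m2, v2 = kalman_normalize(x2, m1, v1)
```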
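For the smaller public benchmarks listed above (CIFAR-10, CIFAR-100, SVHN), a minimal torchvision loading sketch is shown below; the root path and the bare ToTensor transform are placeholders, and ImageNet and COCO are obtained separately.

```python
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()  # placeholder preprocessing

# CIFAR-10 and CIFAR-100: 50k training and 10k test images each.
cifar10_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=to_tensor)
cifar100_train = datasets.CIFAR100(root="./data", train=True, download=True, transform=to_tensor)

# SVHN uses a "split" argument instead of a train flag.
svhn_train = datasets.SVHN(root="./data", split="train", download=True, transform=to_tensor)
```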
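The quoted experiment setup can be collected into a small configuration sketch. Only the numeric values come from the quoted text; the dictionary names and keys are assumptions made for readability.

```python
# Hypothetical summary of the quoted setup; key names are assumptions.
IMAGENET_CLASSIFICATION = {
    "backbones": ["Inception-v2", "ResNet-50", "ResNet-101"],
    "batch_size": 32,            # "typical batch size" baseline
}

COCO_DETECTION = {
    "training_steps": 280_000,   # "schedule of 280k training steps"
    "image_resolution": (800, 1333),
    "boxes_per_image": 256,
    "batch_size": 2,             # micro-batch setting
}
```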