Training CNNs With Normalized Kernels

Authors: Mete Ozay, Takayuki Okatani

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experimental results show that the proposed method can successfully train popular CNN models using several different types of kernel normalization methods. Moreover, they show that the proposed method improves the classification performance of baseline CNNs and provides state-of-the-art performance on major image classification benchmarks."
Researcher Affiliation | Academia | (1) Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi, Japan; (2) RIKEN Center for AIP, Tokyo, Japan
Pseudocode | Yes | Algorithm 1: "Training CNNs using SGD on Riemannian submanifolds of normalized kernels." A hedged sketch of one such update step is given after this table.
Open Source Code | No | No explicit statement about, or link to, open-source code for the described methodology was found.
Open Datasets | Yes | "The proposed algorithm is used to train various different types of CNNs on different image classification datasets. Experimental results show that normalized kernels can be used to boost the performance of baseline CNNs. ... The results given in Table 1 are obtained by training CNNs on the Cifar-10 dataset ... We first employ our methods for training residual networks (Res) with constant depth (RCD) and stochastic depth (RSD) consisting of 110 layers (Huang, Liu, and Weinberger 2016; Huang et al. 2016). In order to explore how the proposed methods enable us to learn invariance properties as discussed above, we analyze the results for Cifar and Imagenet datasets that are augmented using standard DA methods."
Dataset Splits | No | No specific dataset split information (exact percentages, sample counts, or a detailed splitting methodology for all datasets) needed to reproduce the data partitioning was explicitly provided in the main text. While the paper mentions using "ILSVRC 2012 validation data", it does not detail the splits for other datasets such as Cifar-10 or Cifar-100.
Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or other machine specifications) used for running the experiments were provided.
Software Dependencies | No | No specific ancillary software details (e.g., library or solver names with version numbers, such as Python 3.8 or CPLEX 12.4) needed to replicate the experiments were provided.
Experiment Setup | No | The paper states, "For a fair comparison with state-of-the-art methods, we used the same code and hyperparameters provided by the authors," but does not explicitly provide specific hyperparameter values or detailed training configurations in its main text.
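For context on the Pseudocode row above: the paper's Algorithm 1 trains CNN kernels with SGD constrained to Riemannian submanifolds of normalized kernels (for unit-norm normalization, the unit sphere). The snippet below is a minimal sketch of one such step under that assumption, using the standard tangent-space gradient projection and a renormalization retraction. It illustrates the general technique only, not the authors' exact Algorithm 1, and the function name riemannian_sgd_step is our own.

```python
import numpy as np

def riemannian_sgd_step(w, grad, lr):
    """One SGD step on the unit sphere (hypothetical sketch, not the
    authors' Algorithm 1). Assumes ||w|| = 1 on entry."""
    # Project the Euclidean gradient onto the tangent space at w:
    # grad_t = grad - <grad, w> w, which is orthogonal to w.
    grad_t = grad - np.dot(grad, w) * w
    # Take the step in the tangent direction, then retract back onto
    # the sphere by renormalization (a common retraction choice).
    w_new = w - lr * grad_t
    return w_new / np.linalg.norm(w_new)

# Usage: keep a flattened convolution kernel on the unit sphere.
rng = np.random.default_rng(0)
w = rng.standard_normal(27)      # e.g., a 3x3x3 kernel, flattened
w /= np.linalg.norm(w)           # start on the sphere
grad = rng.standard_normal(27)   # stand-in for a backprop gradient
w = riemannian_sgd_step(w, grad, lr=0.1)
assert abs(np.linalg.norm(w) - 1.0) < 1e-12
```

Renormalizing after the tangent step is one common retraction for the sphere; an exponential-map update would equally keep the kernel on the manifold, and other normalization methods from the paper would correspond to other submanifolds and retractions.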