Training CNNs With Normalized Kernels
Authors: Mete Ozay, Takayuki Okatani
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results show that the proposed method can successfully train popular CNN models using several different types of kernel normalization methods. Moreover, they show that the proposed method improves classification performance of baseline CNNs, and provides state-of-the-art performance for major image classification benchmarks. |
| Researcher Affiliation | Academia | 1Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi, Japan 2RIKEN Center for AIP, Tokyo, Japan |
| Pseudocode | Yes | Algorithm 1: Training CNNs using SGD on Riemannian submanifolds of normalized kernels. |
| Open Source Code | No | No explicit statement or link to the open-source code for the described methodology was found. |
| Open Datasets | Yes | The proposed algorithm is used to train various different types of CNNs on different image classification datasets. Experimental results show that normalized kernels can be used to boost performance of baseline CNNs. ... The results given in Table 1 are obtained by training CNNs on the Cifar-10 dataset ... We first employ our methods for training of residual networks (Res) with constant depth (RCD) and stochastic depth (RSD) consisting of 110 layers (Huang, Liu, and Weinberger 2016; Huang et al. 2016). In order to explore how the proposed methods enable us to learn invariance properties as discussed above, we analyze the results for Cifar and Imagenet datasets that are augmented using standard DA methods. |
| Dataset Splits | No | No specific dataset split information (exact percentages, sample counts, or detailed splitting methodology for all datasets) needed to reproduce the data partitioning was explicitly provided in the main text. While it mentions using 'ILSVRC 2012 validation data', it doesn't detail the splits for other datasets like Cifar-10 or Cifar-100. |
| Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments were provided. |
| Software Dependencies | No | No specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment were provided. |
| Experiment Setup | No | The paper states, 'For a fair comparison with state-of-the-art methods, we used the same code and hyperparameters provided by the authors,' but does not explicitly provide specific hyperparameter values or detailed training configurations within the main text of this paper. |
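The pseudocode row above refers to Algorithm 1, which trains CNNs by SGD on Riemannian submanifolds of normalized kernels. The paper's code is not available, but the core update for a unit-norm kernel constraint can be sketched generically: project the Euclidean gradient onto the tangent space of the unit sphere, take a step, and retract by renormalizing. This is a minimal illustration of Riemannian SGD on the sphere, not the authors' implementation; the function name and toy objective are hypothetical.

```python
import numpy as np

def riemannian_sgd_step(w, grad, lr=0.1):
    """One SGD step on the unit sphere S^{d-1} (hedged sketch).

    Removes the radial component of the Euclidean gradient (tangent
    projection), takes a Euclidean step, then retracts to the sphere
    by renormalization. Not the authors' Algorithm 1 verbatim.
    """
    # Tangent-space projection at w: subtract the component along w.
    tangent_grad = grad - np.dot(grad, w) * w
    # Euclidean step followed by a normalization retraction.
    w_new = w - lr * tangent_grad
    return w_new / np.linalg.norm(w_new)

# Toy usage: minimize f(w) = -w.v over the unit sphere; the minimizer
# is w* = v / ||v||, so the iterate should converge toward it.
rng = np.random.default_rng(0)
v = rng.normal(size=5)
w = rng.normal(size=5)
w /= np.linalg.norm(w)
for _ in range(500):
    grad = -v  # Euclidean gradient of f(w) = -w.v
    w = riemannian_sgd_step(w, grad, lr=0.1)
```

The renormalization retraction keeps every kernel exactly unit-norm after each step, which is the invariance property the paper's normalized-kernel training exploits.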