Breaking Inter-Layer Co-Adaptation by Classifier Anonymization
Authors: Ikuro Sato, Kohta Ishikawa, Guoqing Liu, Masayuki Tanaka
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Real-data experiments under more general conditions provide supporting evidence. We use the CIFAR-10 dataset, a 10-class image classification dataset having 5 × 10^4 training samples, and the CIFAR-100 dataset, a 100-class image classification dataset having the same number of samples (Krizhevsky & Hinton, 2009). |
| Researcher Affiliation | Collaboration | Ikuro Sato 1 Kohta Ishikawa 1 Guoqing Liu 1 Masayuki Tanaka 2 1Denso IT Laboratory, Inc., Japan 2National Institute of Advanced Industrial Science and Technology, Japan. |
| Pseudocode | Yes | Algorithm 1 Approximate minimization in Eq. (2) |
| Open Source Code | No | No statement regarding the release of source code or a link to a code repository was found. |
| Open Datasets | Yes | We use the CIFAR-10 dataset, a 10-class image classification dataset having 5 × 10^4 training samples, and the CIFAR-100 dataset, a 100-class image classification dataset having the same number of samples (Krizhevsky & Hinton, 2009). |
| Dataset Splits | No | In each training, we tested a couple of different initial learning rates and chose the best-performing one in the validation. |
| Hardware Specification | No | No specific hardware details (such as GPU or CPU models, memory, or cluster specifications) used for running experiments were mentioned in the paper. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | Training details. SGD with momentum is used in each baseline experiment. In each FOCA experiment, the feature-extractor part uses SGD with momentum, and the classifier part uses gradient descent with momentum. In each training, we tested a couple of different initial learning rates and chose the best-performing one in the validation. A manual learning rate scheduling is adopted; the learning rate is dropped by a fixed factor 1-3 times. The weak classifiers are randomly initialized each time by a zero-mean Gaussian distribution with standard deviation 0.1 for both CIFAR-10 and -100. Cross entropy loss with softmax normalization and ReLU activation (Nair & Hinton, 2010) are used in every case. No data augmentation is adopted. The batch size b used in the weak-classifier training is 100 for the CIFAR-10 and 1000 for the CIFAR-100 experiments. The number of updates to generate θ is 32 for the CIFAR-10 and 64 for the CIFAR-100 experiments. Max-norm regularization (Srivastava et al., 2014) is used for the FOCA training, to stabilize the training. We found that the FOCA training can be made even more stable when updating the feature-extractor parameters u times for given weak-classifier parameters. We used this trick with u = 8 in the CIFAR-100 experiments. (A training-loop sketch based on these details follows the table.) |
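
The experiment-setup cell above is detailed enough to reconstruct one outer iteration of FOCA training. The sketch below is a hedged PyTorch reading of those details, not the authors' implementation: the tiny `SmallConvNet`, the learning rates, and the data iterator are assumptions, while the batch size b, the N(0, 0.1²) weak-classifier initialization, the number of classifier updates, and the u feature-extractor updates per classifier come from the quoted text. Max-norm regularization and learning-rate scheduling are omitted.

```python
# Hedged sketch of one FOCA outer step, reconstructed from the training
# details quoted above. Placeholder names (SmallConvNet, foca_outer_step)
# and all learning rates are assumptions, not the paper's values.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 10   # CIFAR-10 (100 for CIFAR-100)
B_WEAK = 100       # batch size b for weak-classifier training (1000 for CIFAR-100)
T_WEAK = 32        # updates used to generate each weak classifier (64 for CIFAR-100)
U_FEAT = 8         # feature-extractor updates per weak classifier (u = 8 on CIFAR-100)


class SmallConvNet(nn.Module):
    """Placeholder feature extractor; the paper's architectures differ."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # -> 64-dim features
        )

    def forward(self, x):
        return self.body(x)


def sample_weak_classifier(feature_dim):
    """Fresh linear classifier with weights drawn from N(0, 0.1^2), per the quoted setup."""
    clf = nn.Linear(feature_dim, NUM_CLASSES)
    nn.init.normal_(clf.weight, mean=0.0, std=0.1)
    nn.init.zeros_(clf.bias)
    return clf


def foca_outer_step(loader_iter, feature_extractor, opt_feat):
    # 1) Draw one mini-batch of size b and fit a disposable weak classifier to
    #    the current, frozen features with gradient-descent-with-momentum steps.
    x, y = next(loader_iter)                      # batch of size B_WEAK (assumed iterator)
    with torch.no_grad():
        z = feature_extractor(x)                  # features are fixed while fitting the classifier
    clf = sample_weak_classifier(z.shape[1])
    opt_clf = torch.optim.SGD(clf.parameters(), lr=0.1, momentum=0.9)
    for _ in range(T_WEAK):
        loss = F.cross_entropy(clf(z), y)
        opt_clf.zero_grad(); loss.backward(); opt_clf.step()

    # 2) Update the feature extractor u times against this now-frozen weak classifier.
    for p in clf.parameters():
        p.requires_grad_(False)
    for _ in range(U_FEAT):
        x, y = next(loader_iter)
        loss = F.cross_entropy(clf(feature_extractor(x)), y)
        opt_feat.zero_grad(); loss.backward(); opt_feat.step()
```

Each outer step discards its weak classifier and draws a fresh one next time; only the feature extractor and its optimizer state persist. Under this reading, that per-step resampling is the "classifier anonymization" that is meant to stop the features from co-adapting with any single classifier.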