A Kernel Perspective for Regularizing Deep Neural Networks
Authors: Alberto Bietti, Grégoire Mialon, Dexiong Chen, Julien Mairal
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We tested the regularization strategies presented in Section 2 in the context of improving generalization on small datasets and training robust models. Our goal is to take common architectures designed for large datasets and improve their performance in different settings through regularization. Our PyTorch implementation of the various strategies is available at https://github.com/albietz/kernel_reg. ... Table 1. Regularization on CIFAR10 with 1 000 examples for VGG-11 and ResNet-18. Each entry shows the test accuracy with/without data augmentation when all hyper-parameters are optimized on a validation set. (A minimal sketch of one such penalty follows the table.) |
| Researcher Affiliation | Academia | (1) Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, 38000 Grenoble, France; (2) Département d'informatique de l'ENS, ENS, CNRS, Inria, PSL, 75005 Paris, France. |
| Pseudocode | No | The paper describes its methods and approaches using mathematical formulations and descriptive text, but it does not include any formally presented pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our PyTorch implementation of the various strategies is available at https://github.com/albietz/kernel_reg. |
| Open Datasets | Yes | We consider the datasets CIFAR10 and MNIST when using a small number of training examples, as well as 102 datasets of biological sequences that suffer from small sample size. ... We consider the Structural Classification Of Proteins (SCOP) version 1.67 dataset (Murzin et al., 1995) |
| Dataset Splits | Yes | In order to study the potential effectiveness of each method, we assume that a reasonably large validation set is available to select hyper-parameters; thus, we keep 10 000 annotated examples for this purpose. ... This allows us to use the first 51 datasets as a validation set for hyper-parameter tuning, and we report average performance with these fixed choices on the remaining 51 datasets. (A split sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions 'Our PyTorch implementation' but does not specify a version number for PyTorch or for any other software dependency. |
| Experiment Setup | Yes | Each strategy derived in Section 2 is trained for 500 epochs using SGD with momentum and batch size 128, halving the step-size every 40 epochs. ... Training was done using Adam with a learning rate fixed to 0.01, and a weight decay parameter tuned for each method. (A training-schedule sketch follows the table.) |
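For readers assessing the "Research Type" row: the paper's Section 2 derives regularization penalties such as a squared penalty on the gradient of the loss with respect to the input. The sketch below is a minimal illustration of such a penalty in PyTorch, not the authors' implementation (that lives at https://github.com/albietz/kernel_reg); the function name and the weight `lambda_grad` are hypothetical.

```python
import torch
import torch.nn.functional as F

def loss_with_grad_penalty(model, x, y, lambda_grad=0.1):
    """Cross-entropy plus a squared input-gradient penalty.

    A hedged sketch in the spirit of the paper's Section 2 penalties;
    `lambda_grad` is a hypothetical hyper-parameter name and value.
    """
    x = x.clone().requires_grad_(True)
    ce = F.cross_entropy(model(x), y)
    # Gradient of the loss w.r.t. the input; create_graph=True lets the
    # penalty itself be back-propagated through the model parameters.
    (grad_x,) = torch.autograd.grad(ce, x, create_graph=True)
    penalty = grad_x.flatten(1).pow(2).sum(dim=1).mean()
    return ce + lambda_grad * penalty
```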
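The split protocol quoted in the "Dataset Splits" row (10 000 held-out validation examples, small training subsets) can be reproduced along the following lines. The random seed and the 1 000-example training size (matching Table 1) are choices of this sketch, not values mandated by the paper.

```python
import torch
from torchvision import datasets, transforms

full = datasets.CIFAR10(root="./data", train=True, download=True,
                        transform=transforms.ToTensor())

# Hold out 10 000 examples for hyper-parameter selection, as quoted above;
# train on a small subset (1 000 here, as in Table 1). The seed is an
# arbitrary choice made for repeatability.
gen = torch.Generator().manual_seed(0)
perm = torch.randperm(len(full), generator=gen)
val_set = torch.utils.data.Subset(full, perm[:10_000].tolist())
train_set = torch.utils.data.Subset(full, perm[10_000:11_000].tolist())
```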
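Finally, the quoted CIFAR10/MNIST setup (SGD with momentum, batch size 128, 500 epochs, step-size halved every 40 epochs) maps directly onto a `StepLR` schedule. The initial learning rate and momentum below are assumptions, since the quote does not fix them; `model`, `train_set`, and `loss_with_grad_penalty` come from the sketches above.

```python
import torch

# lr=0.1 and momentum=0.9 are assumed values, not taken from the paper.
loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=40, gamma=0.5)

for epoch in range(500):
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_with_grad_penalty(model, x, y)
        loss.backward()
        optimizer.step()
    scheduler.step()  # halves the learning rate every 40 epochs
```

For the protein (SCOP) experiments, the quote instead specifies Adam with a fixed learning rate of 0.01 and a tuned weight decay, i.e. something like `torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=wd)` with `wd` selected per method.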