Efficient Training of Low-Curvature Neural Networks
Authors: Suraj Srinivas, Kyle Matoba, Himabindu Lakkaraju, François Fleuret
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we perform experiments to (1) evaluate the effectiveness of our proposed method in training models with low curvature as originally intended, (2) evaluate whether low curvature models have robust gradients in practice, and (3) evaluate the effectiveness of low-curvature models for adversarial robustness. Our experiments are primarily conducted on a base ResNet-18 architecture ([28]) using the CIFAR10 and CIFAR100 datasets ([29]), and using the PyTorch [30] framework. |
| Researcher Affiliation | Academia | Suraj Srinivas, Harvard University, ssrinivas@seas.harvard.edu; Kyle Matoba, Idiap Research Institute & EPFL, kyle.matoba@epfl.ch; Himabindu Lakkaraju, Harvard University, hlakkaraju@hbs.edu; François Fleuret, University of Geneva, francois.fleuret@unige.ch |
| Pseudocode | No | The paper mentions providing a "PyTorch-style code snippet in the appendix" but does not explicitly label it as "Pseudocode" or an "Algorithm" block within the main text. |
| Open Source Code | Yes | Code to implement our method and replicate our experiments is available at https://github.com/kylematoba/lcnn. |
| Open Datasets | Yes | Our experiments are primarily conducted on a base ResNet-18 architecture ([28]) using the CIFAR10 and CIFAR100 datasets ([29]). |
| Dataset Splits | No | The paper specifies training details but does not explicitly mention a validation dataset split or how it was used for model selection or hyperparameter tuning. |
| Hardware Specification | Yes | Our methods entailed fairly modest computation: our most involved computations can be completed in under three GPU days, and all experimental results could be computed in less than 60 GPU-days. We used a mixture of GPUs, primarily NVIDIA GeForce GTX 1080 Tis, on an internal compute cluster. |
| Software Dependencies | No | Our experiments are primarily conducted on a base ResNet-18 architecture ([28]) using the CIFAR10 and CIFAR100 datasets ([29]), and using the PyTorch [30] framework. ... We use the Cleverhans library [33] to implement PGD. The paper mentions software such as PyTorch and CleverHans but does not specify their version numbers (a hedged PGD sketch appears after the table). |
| Experiment Setup | Yes | All our models are trained for 200 epochs with an SGD + momentum optimizer, with a momentum of 0.9 and an initial learning rate of 0.1 which decays by a factor of 10 at epochs 150 and 175, and a weight decay of 5 × 10⁻⁴ (see the training sketch after the table). |
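
For concreteness, the training recipe quoted in the "Experiment Setup" row can be assembled in PyTorch roughly as follows. This is a minimal sketch based only on the hyperparameters reported above; the batch size, data augmentation, and the exact ResNet-18 variant are assumptions, and the authors' actual implementation is in the repository linked in the "Open Source Code" row.

```python
# Minimal sketch of the reported training setup; NOT the authors' code
# (see https://github.com/kylematoba/lcnn for the official implementation).
import torch
import torchvision
import torchvision.transforms as transforms

# Standard CIFAR-10 augmentation (assumed; the excerpt does not specify it).
transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128,  # batch size assumed
                                           shuffle=True, num_workers=4)

# torchvision's resnet18 is the ImageNet variant; CIFAR experiments usually
# swap in a 3x3 first conv and drop the initial max-pool, omitted here.
model = torchvision.models.resnet18(num_classes=10).cuda()

# Hyperparameters quoted in the "Experiment Setup" row: SGD with momentum 0.9,
# initial LR 0.1 decayed by 10x at epochs 150 and 175, weight decay 5e-4.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                            weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[150, 175], gamma=0.1)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(200):  # "trained for 200 epochs"
    for inputs, targets in train_loader:
        inputs, targets = inputs.cuda(), targets.cuda()
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()
```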
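
Similarly, the "Software Dependencies" row notes that PGD is implemented with the CleverHans library but gives no version. Below is a hedged sketch of such a robustness evaluation, assuming the CleverHans 4.x PyTorch interface and a common L∞ attack budget (eps = 8/255 with 10 iterations, neither of which is stated in the excerpt).

```python
# Hedged PGD evaluation via CleverHans; the library version and the attack
# budget (eps, step size, iterations) are assumptions, not from the paper.
import numpy as np
import torch
import torchvision
import torchvision.transforms as transforms
from cleverhans.torch.attacks.projected_gradient_descent import (
    projected_gradient_descent,
)

test_set = torchvision.datasets.CIFAR10(root="./data", train=False,
                                        download=True,
                                        transform=transforms.ToTensor())
test_loader = torch.utils.data.DataLoader(test_set, batch_size=128)

model.eval()  # `model` as trained in the previous sketch
correct = total = 0
for inputs, targets in test_loader:
    inputs, targets = inputs.cuda(), targets.cuda()
    # Craft L-inf PGD adversarial examples (gradients are needed, so this
    # call must run outside torch.no_grad()).
    x_adv = projected_gradient_descent(model, inputs, eps=8 / 255,
                                       eps_iter=2 / 255, nb_iter=10,
                                       norm=np.inf, clip_min=0.0, clip_max=1.0)
    with torch.no_grad():
        preds = model(x_adv).argmax(dim=1)
    correct += (preds == targets).sum().item()
    total += targets.numel()
print(f"PGD robust accuracy: {correct / total:.3f}")
```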