Unlocking Deterministic Robustness Certification on ImageNet
Authors: Kai Hu, Andy Zou, Zifan Wang, Klas Leino, Matt Fredrikson
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we provide an empirical evaluation of LiResNet and EMMA loss in comparison to certifiably robust training approaches in prior works. We begin by comparing the best VRAs we achieve against the best VRAs reported in the literature in Section 5.1. Next, in Section 5.2, we run head-to-head comparisons between EMMA and the standard GloRo losses as an ablation study to measure the empirical benefits of EMMA loss. Section 5.3 presents experiments on networks of different depths, to shed light on the unique depth scalability of the LiResNet architecture. |
| Researcher Affiliation | Collaboration | Kai Hu, Carnegie Mellon University, Pittsburgh, PA 15213, kaihu@andrew.cmu.edu; Andy Zou, Carnegie Mellon University, Pittsburgh, PA 15213, andyzou@cmu.edu; Zifan Wang, Center for AI Safety, San Francisco, CA 94111, zifan@safe.ai; Klas Leino, Carnegie Mellon University, Pittsburgh, PA 15213, kleino@cs.cmu.edu; Matt Fredrikson, Carnegie Mellon University, Pittsburgh, PA 15213, mfredrik@cs.cmu.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks (e.g., labeled as 'Algorithm' or 'Pseudocode'). |
| Open Source Code | Yes | Our code is publicly available on GitHub: https://github.com/hukkai/liresnet. |
| Open Datasets | Yes | We experiment on CIFAR-10/100 and Tiny-ImageNet using ℓ2 perturbations with ϵ = 36/255, the standard datasets and radii used in prior work. Additionally, we demonstrate the scalability of GloRo LiResNets by training on and certifying ImageNet. |
| Dataset Splits | Yes | Even in academic research, methods on certifying ImageNet with RS only report results using 1% of the validation images (i.e. 500 images) [25, 39, 40]; however, with the Lipschitz-based approach employed in our work, we can certify the entire validation set (50,000 images) in less than one minute. |
| Hardware Specification | Yes | Our experiments were conducted on an 8-GPU (Nvidia A100) machine with 64 CPUs (Intel Xeon Gold 6248R). |
| Software Dependencies | No | The paper mentions that its implementation is 'based on PyTorch [38]', but it does not specify a version number for PyTorch or any other software libraries or dependencies. The cited reference itself does not indicate the version used. |
| Experiment Setup | Yes | On the first 3 datasets, all models are trained with the NAdam optimizer [15] wrapped in Lookahead [61], with a batch size of 256 and a learning rate of 10⁻³ for 800 epochs. We use a cosine learning rate decay [34] with linear warmup [21] in the first 20 epochs. On ImageNet, we only change the batch size to 1024 and the number of training epochs to 400. |
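The learning-rate schedule described in the Experiment Setup row (linear warmup for the first 20 epochs, then cosine decay, base rate 10⁻³, 800 epochs) can be sketched as a small function. This is a minimal sketch under common-practice assumptions; the paper does not spell out the exact warmup start value or whether the cosine decays to zero, so those details here are assumptions.

```python
import math

def lr_schedule(epoch, base_lr=1e-3, warmup_epochs=20, total_epochs=800):
    """Cosine learning-rate decay with linear warmup.

    Defaults match the paper's CIFAR/Tiny-ImageNet settings; the warmup
    starting point and final decay target (zero) are assumptions.
    """
    if epoch < warmup_epochs:
        # Linear warmup: ramp from base_lr/warmup_epochs up to base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay from base_lr down to 0 over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))
```

For the ImageNet configuration, only `total_epochs=400` (and the batch size, which is outside this function) would change.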