Relaxing Local Robustness

Authors: Klas Leino, Matt Fredrikson

NeurIPS 2021

Reproducibility assessment (variable, result, and LLM response):

Research Type: Experimental
    "In this work, we introduce two relaxed safety properties for classifiers that address this observation: (1) relaxed top-k robustness, which serves as the analogue of top-k accuracy; and (2) affinity robustness, which specifies which sets of labels must be separated by a robustness margin, and which can be ϵ-close in ℓp space. We show how to construct models that can be efficiently certified against each relaxed robustness property, and trained with very little overhead relative to standard gradient descent. Finally, we demonstrate experimentally that these relaxed variants of robustness are well-suited to several significant classification problems, leading to lower rejection rates and higher certified accuracies than can be obtained when certifying standard local robustness."

Researcher Affiliation: Academia
    Klas Leino, Carnegie Mellon University, kleino@cs.cmu.edu; Matt Fredrikson, Carnegie Mellon University, mfredrik@cs.cmu.edu

Pseudocode: No
    The paper describes the proposed methods in text and uses mathematical equations (e.g., Equation 1) but does not include structured pseudocode or algorithm blocks.

Open Source Code: Yes
    "Our code is publicly available at https://github.com/klasleino/gloro." "For reproducibility and full transparency of our implementation, we have made our code publicly available on GitHub at https://github.com/klasleino/gloro."

Open Datasets: Yes
    "Our evaluation focuses on datasets for which our relaxed robustness variants are appropriate. Namely, we select datasets with large numbers of fine-grain classes, or classes with a large degree of feature-overlap: EuroSAT [11], CIFAR-100 [17], and Tiny-ImageNet [19]."

Dataset Splits: No
    The paper mentions training procedures and evaluation metrics (VRA) but does not specify explicit training/validation/test splits (e.g., percentages or counts) in the main text. It refers to Appendix F for training details, but the provided text does not include split information there either.

Hardware Specification: No
    The paper does not provide details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.

Software Dependencies: No
    The paper does not list software dependencies with version numbers needed to reproduce the experiments (e.g., Python 3.x, PyTorch 1.x, or specific library versions).

Experiment Setup: No
    The paper states that "the details of the architecture, training procedure, and hyperparameters are provided in Appendix F in the supplementary material"; these hyperparameter values and training configurations do not appear in the main text itself.
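The two relaxed properties quoted in the abstract above can be illustrated with a simple certification check. The sketch below is not the paper's gloro implementation; it is a minimal illustration assuming per-logit Lipschitz bounds `lip` are available, so that each logit moves by at most `eps * lip[i]` under an ℓp perturbation of size `eps`. The names `candidate_top_labels`, `certify_top_k`, and `certify_affinity` are hypothetical.

```python
import numpy as np

def candidate_top_labels(logits, lip, eps):
    """Sound over-approximation of the labels that could become the top
    prediction under any perturbation of l_p-norm at most eps, given a
    per-logit Lipschitz bound (logit i moves by at most eps * lip[i])."""
    upper = logits + eps * lip
    lower = logits - eps * lip
    # A label is a candidate if its best case beats every rival's worst case.
    return {i for i in range(len(logits)) if upper[i] >= lower.max()}

def certify_top_k(logits, lip, eps, k):
    """Relaxed top-k robustness: every label that could take the top spot
    must already lie in the clean top-k prediction set."""
    top_k = set(np.argsort(logits)[-k:])
    return candidate_top_labels(logits, lip, eps) <= top_k

def certify_affinity(logits, lip, eps, affinity_sets):
    """Affinity robustness: labels in the same affinity set as the predicted
    label need not be separated by a margin; only labels outside that set
    must be excluded under perturbation."""
    predicted = int(np.argmax(logits))
    home = next(s for s in affinity_sets if predicted in s)
    return candidate_top_labels(logits, lip, eps) <= home
```

For instance, logits (5.0, 2.0, 1.9, -3.0) with unit Lipschitz bounds certify as top-1 robust at eps = 0.5, but at eps = 2.0 only top-3 robustness (or affinity robustness with labels 0-2 grouped together) still holds, which mirrors how the relaxed properties admit points that standard local robustness would reject.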