Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Certified Training: Small Boxes are All You Need
Authors: Mark Niklas Mueller, Franziska Eckert, Marc Fischer, Martin Vechev
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show in an extensive empirical evaluation that SABR outperforms existing certified defenses in terms of both standard and certifiable accuracies across perturbation magnitudes and datasets, pointing to a new class of certified training methods promising to alleviate the robustness-accuracy trade-off. |
| Researcher Affiliation | Academia | Mark Niklas Müller, Franziska Eckert, Marc Fischer & Martin Vechev, Department of Computer Science, ETH Zurich, Switzerland |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code released at https://github.com/eth-sri/sabr |
| Open Datasets | Yes | We conduct experiments on MNIST (LeCun et al., 2010), CIFAR-10 (Krizhevsky et al., 2009), and TINYIMAGENET (Le & Yang, 2015) for the challenging ℓ∞ perturbations |
| Dataset Splits | No | The paper mentions training and test sets but does not provide specific details about validation dataset splits (percentages, sample counts, or methodology). |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments. |
| Software Dependencies | No | The paper mentions "PyTorch" and "MN-BAB" but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | No | The paper states "We chose similar training hyperparameters as prior work (Shi et al., 2021) and provide more detailed information in App. C.", but the main text does not contain specific hyperparameter values or detailed system-level training settings. |